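# Preloading files into your file system

You can optionally preload the contents of individual files or directories into your file system.

## Importing files using HSM commands

Amazon FSx copies data from your Amazon S3 data repository when a file is first accessed. Because of this approach, the initial read or write to a file incurs a small amount of latency. If your application is sensitive to this latency, and you know which files or directories it needs to access, you can preload the contents of individual files or directories. You do so using the `hsm_restore` command, as follows.

You can use the `hsm_action` command (issued with the `lfs` user utility) to verify that a file's contents have finished loading into the file system. A return value of `NOOP` indicates that the file has been loaded successfully. Run the following commands from a compute instance with the file system mounted. Replace {{path/to/file}} with the path of the file you're preloading into your file system.

```
sudo lfs hsm_restore {{path/to/file}}
sudo lfs hsm_action {{path/to/file}}
```

You can preload your whole file system or an entire directory within it by using the following command. (The trailing ampersand makes the command run as a background process.) If you request the preloading of multiple files simultaneously, Amazon FSx loads your files from your Amazon S3 data repository in parallel. If a file has already been loaded into the file system, the `hsm_restore` command doesn't reload it.

```
nohup find {{local/directory}} -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_restore &
```

**Note**
If your linked S3 bucket is larger than your file system, you should be able to import all of the file metadata into your file system. However, you can load only as much actual file data as fits into the file system's remaining storage space. You'll receive an error if you attempt to access file data when there is no storage left on the file system. If this occurs, you can increase the storage capacity as needed. For more information, see [Managing storage capacity](managing-storage-capacity.md).

## Validation step

You can run the following bash script to help you discover how many files or objects are in an archived (released) state.

To improve the script's performance, especially on file systems with a large number of files, the number of CPU threads is determined automatically with `nproc`, falling back to the `/proc/cpuinfo` file. That is, the script runs faster on an Amazon EC2 instance with a higher vCPU count. The script requires GNU `parallel`.

1. Set up the bash script, for example by saving it as `find_lustre_released_files.sh`.

   ```
   #!/bin/bash

   # Check if a directory argument is provided
   if [ $# -ne 1 ]; then
       echo "Usage: $0 /path/to/lustre/mount"
       exit 1
   fi

   # Set the root directory from the argument
   ROOT_DIR="$1"

   # Check if the provided directory exists
   if [ ! -d "$ROOT_DIR" ]; then
       echo "Error: Directory $ROOT_DIR does not exist."
       exit 1
   fi

   # Automatically detect number of CPUs and set threads
   if command -v nproc &> /dev/null; then
       THREADS=$(nproc)
   elif [ -f /proc/cpuinfo ]; then
       THREADS=$(grep -c ^processor /proc/cpuinfo)
   else
       echo "Unable to determine number of CPUs. Defaulting to 1 thread."
       THREADS=1
   fi

   # Output file
   OUTPUT_FILE="released_objects_$(date +%Y%m%d_%H%M%S).txt"

   echo "Searching in $ROOT_DIR for all released objects using $THREADS threads"
   echo "This may take a while depending on the size of the filesystem..."

   # Find all released files in the specified lustre directory using parallel
   # If you get false positives for file names/paths that include the word 'released',
   # you can grep 'released exists archived' instead of just 'released'
   time sudo lfs find "$ROOT_DIR" -type f | \
   parallel --will-cite -j "$THREADS" -n 1000 "sudo lfs hsm_state {} | grep released" > "$OUTPUT_FILE"

   echo "Search complete. Released objects are listed in $OUTPUT_FILE"
   echo "Total number of released objects: $(wc -l <"$OUTPUT_FILE")"
   ```

2. Make the script executable:

   ```
   $ chmod +x find_lustre_released_files.sh
   ```

3. Run the script, as in the following example:

   ```
   $ ./find_lustre_released_files.sh /fsxl/sample
   Searching in /fsxl/sample for all released objects using 16 threads
   This may take a while depending on the size of the filesystem...

   real 0m9.906s
   user 0m1.502s
   sys 0m5.653s
   Search complete. Released objects are listed in released_objects_20241121_184537.txt
   Total number of released objects: 30000
   ```

If released objects are present, perform a bulk restore on the desired directories to bring the files from S3 into FSx for Lustre, as in the following example:

```
$ DIR=/path/to/lustre/mount
$ nohup find "$DIR" -type f -print0 | xargs -0 -n 1 -P 8 sudo lfs hsm_restore &
```

Note that `hsm_restore` takes some time when there are millions of files.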
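Because a large restore runs in the background, it can help to check when a particular file has finished loading. The following is a minimal sketch, not an official tool, that polls `lfs hsm_action` until it reports `NOOP`. The helper names `lfs_cmd`, `is_loaded`, and `wait_until_loaded`, the retry defaults, and the assumption that the `hsm_action` output line ends in `: NOOP` are illustrative assumptions; adjust them to match what you see on your own instance.

```shell
# Sketch only: wait for a file to finish loading by polling `lfs hsm_action`.
# All function names below are hypothetical helpers, not part of lfs.

# Thin wrapper around lfs so the invocation can be changed in one place.
lfs_cmd() {
    sudo lfs "$@"
}

# Succeeds (exit status 0) when hsm_action reports NOOP for the file.
# Assumes the output looks like "<path>: NOOP".
is_loaded() {
    lfs_cmd hsm_action "$1" | grep -q ': NOOP$'
}

# Poll a file until it is loaded or the attempts run out.
#   $1 = file path
#   $2 = maximum number of checks (default 60, an arbitrary choice)
#   $3 = seconds to sleep between checks (default 5, an arbitrary choice)
wait_until_loaded() {
    file="$1"
    max_attempts="${2:-60}"
    delay="${3:-5}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if is_loaded "$file"; then
            echo "loaded: $file"
            return 0
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    echo "timed out: $file" >&2
    return 1
}
```

After sourcing these functions, a call such as `wait_until_loaded {{path/to/file}} 120 10` checks the file up to 120 times at 10-second intervals. Keeping the `lfs` call inside `lfs_cmd` makes it easy to drop `sudo` or substitute a stub when trying the logic on a machine without Lustre.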