# Supervised fine-tuning (SFT)
<a name="nova-fine-tune"></a>

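The SFT training process consists of two main stages:
+ **Data preparation**: Follow the established guidelines to create, clean, or reformat datasets into the required structure. Make sure that inputs, outputs, and auxiliary information (such as reasoning traces or metadata) are correctly aligned and formatted.
+ **Training configuration**: Define how the model is trained. In practice, this configuration is written in a YAML recipe file and includes:
  + Data source paths (training and validation datasets)
  + Key hyperparameters (epochs, learning rate, batch size)
  + Optional components (such as distributed training parameters)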

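## Nova model comparison and selection
<a name="nova-model-comparison"></a>

Amazon Nova 2.0 models are trained on larger and more diverse datasets than Nova 1.0 models. Key improvements include:
+ **Enhanced reasoning capabilities**, with support for an explicit reasoning mode
+ **Broader multilingual performance** across more languages
+ **Better performance on complex tasks** such as code generation and tool calling
+ **Extended context handling**, with higher accuracy and stability over longer contexts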

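## When to choose Nova 1.0 or Nova 2.0
<a name="nova-model-selection"></a>

Choose Amazon Nova 2.0 when you need:
+ Superior performance with advanced reasoning capabilities
+ Multilingual support or complex task handling
+ Better results on code generation, tool calling, or analytical tasks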

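# SFT on Nova 2.0
<a name="nova-sft-2-fine-tune"></a>

Amazon Nova Lite 2.0 offers enhanced supervised fine-tuning capabilities, including an advanced reasoning mode, improved multimodal understanding, and extended context handling. SFT on Nova 2.0 lets you adapt these capabilities to your specific use cases while maintaining the model's strong performance on complex tasks.

Key features of SFT on Nova 2.0 include:
+ **Reasoning mode support**: Train the model to generate explicit reasoning traces before the final answer, which strengthens its analytical capabilities.
+ **Advanced multimodal training**: Fine-tune for document understanding (PDF), video understanding, and image-based tasks with higher accuracy.
+ **Tool calling**: Train the model to use external tools and function calls effectively in complex workflows.
+ **Extended context support**: Use longer context windows for better stability and accuracy in document-intensive applications.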

**Note**  
For more information about applicable container images or example recipes, see [Amazon Nova recipes](nova-model-recipes.md).

**Topics**
+ [Reasoning mode selection (Nova 2.0 only)](#nova-sft-2-reasoning-mode)
+ [Tool calling data format](#nova-sft-2-tool-calling)
+ [Document understanding data format](#nova-sft-2-document-understanding)
+ [Video understanding for SFT](#nova-sft-2-video-understanding)
+ [Data upload instructions](#nova-sft-2-data-upload)
+ [Creating a fine-tuning job](#nova-sft-2-creating-job)
+ [SFT tuning parameters](#nova-sft-2-tuning-parameters)
+ [Hyperparameter guidance](#nova-sft-2-hyperparameters)

## Sample SFT recipe
<a name="nova-sft-2-sample-recipe"></a>

The following is a sample SFT recipe. You can find this recipe and others in the [recipes](https://github.com/aws/sagemaker-hyperpod-recipes/tree/main/recipes_collection/recipes/fine-tuning/nova) repository.

```
run:
  name: my-full-rank-sft-run
  model_type: amazon.nova-2-lite-v1:0:256k
  model_name_or_path: nova-lite-2/prod
  data_s3_path: s3://my-bucket-name/train.jsonl  # HyperPod only; not compatible with SageMaker Training Jobs
  replicas: 4                                     # Number of compute instances for training, allowed values are 4, 8, 16, 32
  output_s3_path: s3://my-bucket-name/outputs/    # Output artifact path (HyperPod job-specific; not compatible with standard SageMaker Training Jobs)
  mlflow_tracking_uri: ""                         # Required for MLFlow
  mlflow_experiment_name: "my-full-rank-sft-experiment"  # Optional for MLFlow. Note: leave this field non-empty
  mlflow_run_name: "my-full-rank-sft-run"         # Optional for MLFlow. Note: leave this field non-empty

training_config:
  max_steps: 100                    # Maximum number of training steps. Minimum is 4.
  save_steps: ${oc.select:training_config.max_steps}  # Save a checkpoint every save_steps training steps
  save_top_k: 5                     # Keep the top K best checkpoints. Supported only for HyperPod jobs. Minimum is 1.
  max_length: 32768                 # Sequence length (options: 8192, 16384, 32768 [default], 65536)
  global_batch_size: 32             # Global batch size (options: 32, 64, 128)
  reasoning_enabled: true           # Set to true if the data contains reasoningContent; otherwise false

  lr_scheduler:
    warmup_steps: 15                # Learning rate warmup steps. Recommend 15% of max_steps
    min_lr: 1e-6                    # Minimum learning rate, must be between 0.0 and 1.0

  optim_config:                     # Optimizer settings
    lr: 1e-5                        # Learning rate, must be between 0.0 and 1.0
    weight_decay: 0.0               # L2 regularization strength, must be between 0.0 and 1.0
    adam_beta1: 0.9                  # Exponential decay rate for first-moment estimates
    adam_beta2: 0.95                 # Exponential decay rate for second-moment estimates

  peft:                             # Parameter-efficient fine-tuning (LoRA)
    peft_scheme: "null"             # Disable LoRA for PEFT
```
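
The value ranges noted in the recipe comments can be checked before a job is submitted. The following is a minimal sketch, not part of the Nova tooling: `validate_recipe` is a hypothetical helper, and it assumes the recipe has already been loaded into a plain Python dictionary (for example with a YAML parser). The allowed values are copied from the comments in the sample recipe, plus the rule that `min_lr` must be less than `lr`.

```python
# Hypothetical pre-submission check; not part of any AWS SDK or the recipes repository.
ALLOWED_REPLICAS = {4, 8, 16, 32}
ALLOWED_MAX_LENGTH = {8192, 16384, 32768, 65536}
ALLOWED_GLOBAL_BATCH_SIZE = {32, 64, 128}


def validate_recipe(recipe: dict) -> list:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    run = recipe.get("run", {})
    cfg = recipe.get("training_config", {})

    if run.get("replicas") not in ALLOWED_REPLICAS:
        problems.append("replicas must be one of 4, 8, 16, 32")
    if cfg.get("max_steps", 0) < 4:
        problems.append("max_steps must be at least 4")
    if cfg.get("max_length") not in ALLOWED_MAX_LENGTH:
        problems.append("max_length must be 8192, 16384, 32768, or 65536")
    if cfg.get("global_batch_size") not in ALLOWED_GLOBAL_BATCH_SIZE:
        problems.append("global_batch_size must be 32, 64, or 128")

    lr = cfg.get("optim_config", {}).get("lr")
    min_lr = cfg.get("lr_scheduler", {}).get("min_lr")
    if lr is None or not 0.0 <= lr <= 1.0:
        problems.append("lr must be between 0.0 and 1.0")
    elif min_lr is None or not 0.0 <= min_lr < lr:
        problems.append("min_lr must be at least 0.0 and less than lr")
    return problems


# Values copied from the sample recipe.
sample_recipe = {
    "run": {"replicas": 4},
    "training_config": {
        "max_steps": 100,
        "max_length": 32768,
        "global_batch_size": 32,
        "lr_scheduler": {"warmup_steps": 15, "min_lr": 1e-6},
        "optim_config": {"lr": 1e-5},
    },
}
print(validate_recipe(sample_recipe))  # prints []
```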

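## Reasoning mode selection (Nova 2.0 only)
<a name="nova-sft-2-reasoning-mode"></a>

Amazon Nova 2.0 supports a reasoning mode that further improves analytical capabilities:
+ **Reasoning mode (enabled)**:
  + Set `reasoning_enabled: true` in the training configuration
  + The model is trained to generate reasoning traces before the final answer
  + Delivers better performance on complex reasoning tasks
+ **Non-reasoning mode (disabled)**:
  + Set `reasoning_enabled: false` or omit the parameter (default)
  + Standard SFT without explicit reasoning
  + Suited to tasks that don't require step-by-step reasoning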

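**Note**  
When reasoning is enabled, the model operates at high reasoning effort. SFT does not offer a low-reasoning option.
Multimodal reasoning content is not supported for SFT. Reasoning mode applies to text-only inputs.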

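### Using reasoning mode with non-reasoning datasets
<a name="nova-sft-2-reasoning-non-reasoning-data"></a>

You can train Amazon Nova on a non-reasoning dataset with `reasoning_enabled: true`. However, doing so may cause the model to lose its reasoning capabilities, because Amazon Nova primarily learns to generate the responses in the data directly, without reasoning.

If you train Amazon Nova on a non-reasoning dataset but still want to use reasoning mode during inference:

1. Disable reasoning mode during training (`reasoning_enabled: false`)

1. Enable reasoning mode later during inference

Although this approach lets you enable reasoning mode at inference time, it is not guaranteed to perform better than inference without reasoning mode.

**Best practice:** Enable reasoning mode for both training and inference when using reasoning datasets, and disable it for both when using non-reasoning datasets.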
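
One way to apply this best practice is to inspect the dataset before choosing the flag. The sketch below is illustrative only: `has_reasoning_content` and `suggest_reasoning_enabled` are hypothetical helpers, and the sketch assumes that a dataset counts as a reasoning dataset only when every sample carries a `reasoningContent` block in an assistant turn. How mixed datasets should be handled is not stated here, so treat that rule as an assumption.

```python
def has_reasoning_content(sample: dict) -> bool:
    """True if any assistant turn in the sample contains a reasoningContent block."""
    for message in sample.get("messages", []):
        if message.get("role") != "assistant":
            continue
        for block in message.get("content", []):
            if "reasoningContent" in block:
                return True
    return False


def suggest_reasoning_enabled(samples: list) -> bool:
    """Assumption: suggest True only when every sample carries a reasoning trace."""
    return bool(samples) and all(has_reasoning_content(s) for s in samples)


with_reasoning = {
    "messages": [
        {"role": "user", "content": [{"text": "What is 2 + 2?"}]},
        {"role": "assistant", "content": [
            {"reasoningContent": {"reasoningText": {"text": "Add the two numbers."}}},
            {"text": "4"},
        ]},
    ]
}
without_reasoning = {
    "messages": [
        {"role": "user", "content": [{"text": "What is 2 + 2?"}]},
        {"role": "assistant", "content": [{"text": "4"}]},
    ]
}

print(suggest_reasoning_enabled([with_reasoning]))                     # prints True
print(suggest_reasoning_enabled([with_reasoning, without_reasoning]))  # prints False
```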


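## Tool calling data format
<a name="nova-sft-2-tool-calling"></a>

SFT supports training models to use tools (function calling). The following is a sample input format for tool calling:

**Sample input:**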

```
{
  "schemaVersion": "bedrock-conversation-2024",
  "system": [
    {
      "text": "You are an expert in composing function calls."
    }
  ],
  "toolConfig": {
    "tools": [
      {
        "toolSpec": {
          "name": "getItemCost",
          "description": "Retrieve the cost of an item from the catalog",
          "inputSchema": {
            "json": {
              "type": "object",
              "properties": {
                "item_name": {
                  "type": "string",
                  "description": "The name of the item to retrieve cost for"
                },
                "item_id": {
                  "type": "string",
                  "description": "The ASIN of item to retrieve cost for"
                }
              },
              "required": [
                "item_id"
              ]
            }
          }
        }
      },
      {
        "toolSpec": {
          "name": "getItemAvailability",
          "description": "Retrieve whether an item is available in a given location",
          "inputSchema": {
            "json": {
              "type": "object",
              "properties": {
                "zipcode": {
                  "type": "string",
                  "description": "The zipcode of the location to check in"
                },
                "quantity": {
                  "type": "integer",
                  "description": "The number of items to check availability for"
                },
                "item_id": {
                  "type": "string",
                  "description": "The ASIN of item to check availability for"
                }
              },
              "required": [
                "item_id", "zipcode"
              ]
            }
          }
        }
      }
    ]
  },
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "I need to check whether there are twenty pieces of the following item available. Here is the item ASIN on Amazon: id-123. Please check for the zipcode 94086"
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "reasoningContent": {
            "reasoningText": {
              "text": "The user wants to check how many pieces of the item with ASIN id-123 are available in the zipcode 94086"
            }
          }
        },
        {
          "toolUse": {
            "toolUseId": "getItemAvailability_0",
            "name": "getItemAvailability",
            "input": {
              "zipcode": "94086",
              "quantity": 20,
              "item_id": "id-123"
            }
          }
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "toolResult": {
            "toolUseId": "getItemAvailability_0",
            "content": [
              {
                "text": "[{\"name\": \"getItemAvailability\", \"results\": {\"availability\": true}}]"
              }
            ]
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "text": "Yes, there are twenty pieces of item id-123 available at 94086. Would you like to place an order or know the total cost?"
        }
      ]
    }
  ]
}
```

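Important considerations for tool calling data:
+ ToolUse must appear only in assistant turns
+ ToolResult must appear only in user turns
+ ToolResult must be text or JSON only; Amazon Nova models do not currently support other modalities
+ The inputSchema within a toolSpec must be a valid JSON Schema object
+ Each ToolResult must reference a valid toolUseId from a ToolUse in a preceding assistant turn, and each toolUseId may be used only once per conversation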
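
The turn and ID rules above lend themselves to an automated check over each training sample. The following is a minimal sketch under stated assumptions: `check_tool_turns` is a hypothetical helper, and it reads "used only once" as meaning that a `toolUseId` is issued by one `toolUse` block and answered by one `toolResult` block. It does not validate `inputSchema` or the result modality.

```python
def check_tool_turns(sample: dict) -> list:
    """Return a list of rule violations found in one conversation sample."""
    problems = []
    seen_ids = set()  # every toolUseId issued so far in this conversation
    open_ids = set()  # issued by the assistant and not yet answered
    for i, message in enumerate(sample.get("messages", [])):
        role = message.get("role")
        for block in message.get("content", []):
            if "toolUse" in block:
                tool_id = block["toolUse"].get("toolUseId")
                if role != "assistant":
                    problems.append(f"message {i}: toolUse outside an assistant turn")
                if tool_id in seen_ids:
                    problems.append(f"message {i}: toolUseId {tool_id!r} reused")
                seen_ids.add(tool_id)
                open_ids.add(tool_id)
            if "toolResult" in block:
                tool_id = block["toolResult"].get("toolUseId")
                if role != "user":
                    problems.append(f"message {i}: toolResult outside a user turn")
                if tool_id not in open_ids:
                    problems.append(
                        f"message {i}: toolResult references unknown or "
                        f"already answered toolUseId {tool_id!r}"
                    )
                open_ids.discard(tool_id)
    return problems


conversation = {
    "messages": [
        {"role": "user", "content": [{"text": "Is item id-123 available in 94086?"}]},
        {"role": "assistant", "content": [
            {"toolUse": {"toolUseId": "getItemAvailability_0",
                         "name": "getItemAvailability",
                         "input": {"item_id": "id-123", "zipcode": "94086"}}}]},
        {"role": "user", "content": [
            {"toolResult": {"toolUseId": "getItemAvailability_0",
                            "content": [{"text": "{\"availability\": true}"}]}}]},
        {"role": "assistant", "content": [{"text": "Yes, it is available."}]},
    ]
}
print(check_tool_turns(conversation))  # prints []
```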


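## Document understanding data format
<a name="nova-sft-2-document-understanding"></a>

SFT supports training models on document understanding tasks. The following is a sample input format:

**Sample input**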

```
{
  "schemaVersion": "bedrock-conversation-2024",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "What are the ways in which a customer can experience issues during checkout on Amazon?"
        },
        {
          "document": {
            "format": "pdf",
            "source": {
              "s3Location": {
                "uri": "s3://my-bucket-name/path/to/documents/customer_service_debugging.pdf",
                "bucketOwner": "123456789012"
              }
            }
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "text": "Customers can experience issues with 1. Data entry, 2. Payment methods, 3. Connectivity while placing the order. Which one would you like to dive into?"
        }
      ],
      "reasoning_content": [
        {
          "text": "I need to find the relevant section in the document to answer the question.",
          "type": "text"
        }
      ]
    }
  ]
}
```

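Important considerations for document understanding:
+ Only PDF files are supported
+ Maximum document size is 10 MB
+ A sample can contain a document and text, but documents cannot be mixed with other modalities (such as images or video)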
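
When samples are generated programmatically, a small builder keeps the structure consistent. The following is a sketch only: `build_document_sample` is a hypothetical helper, and rejecting any URI that does not end in `.pdf` is a simple heuristic rather than a real file-type check. It emits only the text and document blocks from the sample input and does not check the 10 MB size limit.

```python
import json


def build_document_sample(question: str, answer: str, s3_uri: str, bucket_owner: str) -> dict:
    """Build one document-understanding sample with a single PDF and text."""
    if not s3_uri.lower().endswith(".pdf"):
        # Heuristic based on the file extension; only PDF files are supported.
        raise ValueError(f"only PDF documents are supported: {s3_uri}")
    return {
        "schemaVersion": "bedrock-conversation-2024",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"text": question},
                    {
                        "document": {
                            "format": "pdf",
                            "source": {
                                "s3Location": {"uri": s3_uri, "bucketOwner": bucket_owner}
                            },
                        }
                    },
                ],
            },
            {"role": "assistant", "content": [{"text": answer}]},
        ],
    }


sample = build_document_sample(
    question="What issues can a customer experience during checkout?",
    answer="Data entry, payment methods, and connectivity issues.",
    s3_uri="s3://my-bucket-name/path/to/documents/customer_service_debugging.pdf",
    bucket_owner="123456789012",
)
print(json.dumps(sample)[:60])
```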


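## Video understanding for SFT
<a name="nova-sft-2-video-understanding"></a>

SFT supports fine-tuning models for video understanding tasks. The following is a sample input format:

**Sample input**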

```
{
  "schemaVersion": "bedrock-conversation-2024",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "What are the ways in which a customer can experience issues during checkout on Amazon?"
        },
        {
          "video": {
            "format": "mp4",
            "source": {
              "s3Location": {
                "uri": "s3://my-bucket-name/path/to/videos/customer_service_debugging.mp4",
                "bucketOwner": "123456789012"
              }
            }
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "text": "Customers can experience issues with 1. Data entry, 2. Payment methods, 3. Connectivity while placing the order. Which one would you like to dive into?"
        }
      ],
      "reasoning_content": [
        {
          "text": "I need to find the relevant section in the video to answer the question.",
          "type": "text"
        }
      ]
    }
  ]
}
```

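Important considerations for video understanding:
+ Maximum video size is 50 MB
+ Maximum video duration is 15 minutes
+ Only one video is allowed per sample; multiple videos in the same sample are not supported
+ A sample can contain a video and text, but videos cannot be mixed with other modalities (such as images or documents)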
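
These limits can be checked per sample once the size and duration of the video are known. The sketch below makes several assumptions: `VideoInfo` and `check_video_sample` are hypothetical names, the file metadata is obtained elsewhere (for example from object metadata or a media probe), 1 MB is taken as 1024 × 1024 bytes, and image blocks are assumed to use an `image` key by analogy with `video` and `document`.

```python
from dataclasses import dataclass

MAX_VIDEO_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
MAX_VIDEO_SECONDS = 15 * 60


@dataclass
class VideoInfo:
    """Metadata gathered outside this sketch."""
    size_bytes: int
    duration_seconds: float


def check_video_sample(sample: dict, info: VideoInfo) -> list:
    """Return a list of limit violations for one video-understanding sample."""
    problems = []
    blocks = [b for m in sample.get("messages", []) for b in m.get("content", [])]
    if sum(1 for b in blocks if "video" in b) > 1:
        problems.append("only one video is allowed per sample")
    if any("image" in b or "document" in b for b in blocks):
        problems.append("video cannot be mixed with images or documents")
    if info.size_bytes > MAX_VIDEO_BYTES:
        problems.append("video exceeds 50 MB")
    if info.duration_seconds > MAX_VIDEO_SECONDS:
        problems.append("video exceeds 15 minutes")
    return problems


video_sample = {
    "messages": [
        {"role": "user", "content": [
            {"text": "Summarize the video."},
            {"video": {"format": "mp4", "source": {"s3Location": {
                "uri": "s3://my-bucket-name/path/to/videos/customer_service_debugging.mp4",
                "bucketOwner": "123456789012"}}}},
        ]},
        {"role": "assistant", "content": [{"text": "The video covers checkout issues."}]},
    ]
}
print(check_video_sample(video_sample, VideoInfo(size_bytes=20_000_000, duration_seconds=300)))
```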


## Data upload instructions
<a name="nova-sft-2-data-upload"></a>

Upload your training and validation datasets to an S3 bucket. Specify the following path in the `run` block of the recipe:

```
## Run config
run:
  ...
  data_s3_path: "s3://<bucket-name>/<training-directory>/<training-file>.jsonl"
```

**Note**: Replace `<bucket-name>`, `<training-directory>`, and `<training-file>` with the values for your actual S3 path.

**Note**: Amazon Nova 2.0 SFT does not currently support validation datasets. If you provide a validation dataset, it is ignored.
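
Before uploading, it can help to confirm that the training file is well-formed JSONL, meaning one JSON object per line. The following sketch uses only the Python standard library; `write_jsonl` and `read_jsonl` are hypothetical helpers, and the upload step itself is left out.

```python
import json
import os
import tempfile


def write_jsonl(samples, path):
    """Write one JSON object per line, which is the layout of a JSONL file."""
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse the file back, failing loudly if any line is not valid JSON."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


samples = [
    {
        "schemaVersion": "bedrock-conversation-2024",
        "messages": [
            {"role": "user", "content": [{"text": "Hello"}]},
            {"role": "assistant", "content": [{"text": "Hi, how can I help?"}]},
        ],
    },
]

with tempfile.TemporaryDirectory() as tmp:
    local_path = os.path.join(tmp, "train.jsonl")
    write_jsonl(samples, local_path)
    round_trip = read_jsonl(local_path)

print(len(round_trip), "record(s) written and re-read")
```

The resulting file can then be copied to the location named in `data_s3_path`, for example with `aws s3 cp`.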

## Creating a fine-tuning job
<a name="nova-sft-2-creating-job"></a>

Define the base model with the `model_type` and `model_name_or_path` fields in the `run` block:

```
## Run config
run:
  ...
  model_type: amazon.nova-2-lite-v1:0:256k
  model_name_or_path: nova-lite-2/prod
  ...
```

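## SFT tuning parameters
<a name="nova-sft-2-tuning-parameters"></a>

The parameters available for SFT tuning include:

**Run configuration**  

+ **name**: A descriptive name for your training job. This helps identify the job in the AWS Management Console.
+ **model_type**: The Amazon Nova model variant to use. The available option is `amazon.nova-2-lite-v1:0:256k`.
+ **model_name_or_path**: The path to the base model to use for training. The available options are `nova-lite-2/prod` or the S3 path of a post-training checkpoint (`s3://customer-escrow-bucket-unique_id/training_run_name`).
+ **replicas**: The number of compute instances to use for distributed training. Available values vary by model. Amazon Nova Lite 2.0 supports 4, 8, 16, or 32 replicas.
+ **data_s3_path**: The S3 location of the training dataset, which is a JSONL file. This file must reside in the same AWS account and Region as the cluster. All S3 locations provided must be in the same account and Region.
+ **validation_data_s3_path**: (Optional) The S3 location of the validation dataset, which is a JSONL file. This file must reside in the same account and Region as the cluster. All S3 locations provided must be in the same account and Region.
+ **output_s3_path**: The S3 location where the manifest and TensorBoard logs are stored. All S3 locations provided must be in the same AWS account and AWS Region.
+ **mlflow_tracking_uri**: The ARN of the MLflow app to use for MLflow logging.
+ **mlflow_experiment_name**: The MLflow experiment name.
+ **mlflow_run_name**: The MLflow run name.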

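**Training configuration**  

+ **max_steps**: The number of training iterations. Each step trains the model on `global_batch_size` samples.
+ **save_steps**: How often, in steps, to save model checkpoints during training.
+ **save_top_k**: The maximum number of best checkpoints to retain, based on validation metrics.
+ **max_length**: The maximum sequence length in tokens. This determines the context window size for training. The maximum supported value for SFT is 32768 tokens.

  Longer sequences improve training efficiency, but at the cost of increased memory requirements. We recommend matching the max_length parameter to your data distribution.
+ **global_batch_size**: The total number of training samples processed together in one forward and backward pass across all devices and workers.

  This value is the product of the per-device batch size and the number of devices. It affects training stability and throughput. We recommend starting with a batch size that fits in memory and scaling up from there. For domain-specific data, larger batch sizes might over-smooth the gradients.
+ **reasoning_enabled**: A Boolean flag that enables reasoning during training.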

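**Learning rate scheduler**  

+ **warmup_steps**: The number of steps over which the learning rate is gradually increased. This improves training stability.
+ **min_lr**: The minimum learning rate at the end of decay. Valid values are between 0 and 1, inclusive, but must be less than the learning rate.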
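
The recipe exposes `warmup_steps` and `min_lr`, but the shape of the schedule between them is not specified here. The sketch below assumes linear warmup followed by cosine decay, which is a common choice and not a confirmed description of the Nova trainer. `lr_at_step` is a hypothetical helper meant only to show how the two parameters interact with `lr` and `max_steps`.

```python
import math


def lr_at_step(step: int, max_steps: int, warmup_steps: int, lr: float, min_lr: float) -> float:
    """Assumed schedule: linear warmup to lr, then cosine decay down to min_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly so the last warmup step reaches the full learning rate.
        return lr * (step + 1) / warmup_steps
    decay_steps = max(1, max_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Values taken from the sample recipe: max_steps=100, warmup_steps=15, lr=1e-5, min_lr=1e-6.
for step in (0, 14, 15, 57, 100):
    print(step, lr_at_step(step, max_steps=100, warmup_steps=15, lr=1e-5, min_lr=1e-6))
```

Under this assumed schedule the rate peaks at `lr` when warmup ends and reaches `min_lr` at `max_steps`, which is why `min_lr` must be less than `lr`.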

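**Optimizer configuration**  

+ **lr**: The learning rate, which controls the step size during optimization. We recommend values between 1e-6 and 1e-4 for good performance. Valid values are between 0 and 1, inclusive.
+ **weight_decay**: The L2 regularization strength. Higher values (between 0.01 and 0.1) increase regularization.
+ **adam_beta1**: The exponential decay rate for the first-moment estimates in the Adam optimizer. The default is 0.9.
+ **adam_beta2**: The exponential decay rate for the second-moment estimates in the Adam optimizer. The default is 0.95.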
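
To show what `adam_beta1` and `adam_beta2` control, the following sketch applies one standard Adam update to a single scalar parameter. It is an illustration of the textbook algorithm, not the Nova trainer's implementation: `adam_step` is a hypothetical helper, `eps` is an assumed stability constant that does not appear in the recipe, and weight decay is left out.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-5, beta1=0.9, beta2=0.95, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction for the zero-initialized moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
for t, grad in enumerate([2.0, 1.5, 1.8], start=1):
    param, m, v = adam_step(param, grad, m, v, t)
    print(t, param)
```

A larger `adam_beta1` averages gradients over more steps, and a larger `adam_beta2` does the same for squared gradients, so both trade responsiveness for smoothness.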

**PEFT configuration**  

+ **peft_scheme**: The parameter-efficient fine-tuning scheme to use. Use `'null'` for full-rank fine-tuning or `lora` for LoRA-based fine-tuning.

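**LoRA tuning (when peft_scheme is lora)**  

+ **alpha**: The LoRA scaling parameter. Controls the magnitude of the low-rank adaptation. Typical values range from 8 to 128.
+ **lora_plus_lr_ratio**: The learning rate ratio for LoRA+ optimization. This multiplier adjusts the learning rate specifically for LoRA parameters.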
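
The following sketch shows how these two parameters are commonly interpreted in the LoRA and LoRA+ literature; how the Nova trainer applies them internally is not described here. It assumes the usual conventions: the low-rank update is scaled by `alpha / rank`, and LoRA+ trains the B matrix at `lora_plus_lr_ratio` times the learning rate of the A matrix. `rank` and `lora_settings` are hypothetical names introduced for illustration and are not recipe fields.

```python
def lora_settings(lr: float, alpha: float, rank: int, lora_plus_lr_ratio: float) -> dict:
    """Derive the effective LoRA scaling and per-matrix learning rates (assumed conventions)."""
    return {
        "scaling": alpha / rank,           # multiplier applied to the low-rank update B @ A
        "lr_A": lr,                        # base learning rate for the A matrix
        "lr_B": lr * lora_plus_lr_ratio,   # LoRA+ raises the learning rate of the B matrix
    }


# Example values chosen for illustration; only lr comes from the guidance for LoRA.
settings = lora_settings(lr=5e-5, alpha=32, rank=16, lora_plus_lr_ratio=16.0)
print(settings)
```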

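## Hyperparameter guidance
<a name="nova-sft-2-hyperparameters"></a>

Use the following recommended hyperparameters based on your training approach:

**Full-rank training**
+ **Epochs**: 1
+ **Learning rate (lr)**: 1e-5
+ **Minimum learning rate (min_lr)**: 1e-6

**LoRA (Low-Rank Adaptation)**
+ **Epochs**: 2
+ **Learning rate (lr)**: 5e-5
+ **Minimum learning rate (min_lr)**: 1e-6

**Note**: Adjust these values based on your dataset size and validation performance. Monitor training metrics to prevent overfitting.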
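
The guidance above is expressed in epochs, while the recipe takes `max_steps`. Because each step consumes `global_batch_size` samples, the conversion is simple arithmetic. The sketch below is illustrative: `steps_for_epochs` and `suggested_warmup_steps` are hypothetical helpers, the floor of 4 steps and the 15% warmup fraction come from the comments in the sample recipe, and the dataset size of 3,200 samples is made up for the example.

```python
import math


def steps_for_epochs(num_samples: int, epochs: int, global_batch_size: int) -> int:
    """Number of steps needed to see the dataset `epochs` times (minimum of 4 steps)."""
    return max(4, math.ceil(num_samples * epochs / global_batch_size))


def suggested_warmup_steps(max_steps: int, fraction: float = 0.15) -> int:
    """Warmup length as a fraction of max_steps, with at least one warmup step."""
    return max(1, round(max_steps * fraction))


# Full-rank guidance: 1 epoch. LoRA guidance: 2 epochs.
full_rank_steps = steps_for_epochs(num_samples=3200, epochs=1, global_batch_size=32)
lora_steps = steps_for_epochs(num_samples=3200, epochs=2, global_batch_size=32)

print(full_rank_steps, suggested_warmup_steps(full_rank_steps))
print(lora_steps, suggested_warmup_steps(lora_steps))
```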