# 微调 Nova 2.0
<a name="nova-fine-tune-2"></a>

## 先决条件
<a name="nova-model-training-jobs-prerequisites2"></a>

在开始训练作业之前，请注意具备以下内容：
+ Amazon S3 存储桶，用于存储您的输入数据和训练作业的输出。您可以为这两者使用一个存储桶，也可以为每种类型的数据使用不同的存储桶。确保您的存储桶位于您创建所有其他训练资源所用的 AWS 区域。有关更多信息，请参阅[创建通用存储桶](https://docs.aws.amazon.com//AmazonS3/latest/userguide/create-bucket-overview.html)。
+ 具有运行训练作业权限的 IAM 角色。请务必为 IAM 策略附加 `AmazonSageMakerFullAccess`。有关更多信息，请参阅[如何使用 SageMaker AI 执行角色](https://docs.aws.amazon.com//sagemaker/latest/dg/sagemaker-roles.html)。
+ 基本 Amazon Nova 配方，请参阅[获取 Amazon Nova 配方](nova-model-recipes.md#nova-model-get-recipes)。

## 什么是 SFT？
<a name="nova-2-what-is-sft"></a>

监督式微调（SFT）使用带标注的输入-输出对训练语言模型。模型从包含提示和响应的示范示例中学习，优化能力以适配特定任务、指令或预期行为。

## 数据准备
<a name="nova-2-data-preparation"></a>

### 概述
<a name="nova-2-data-overview"></a>

Nova 2.0 SFT 数据采用与 Nova 1.0 相同的 Converse API 格式，并新增了可选的推理内容字段。如需完整格式规范，请参阅：
+ 推理内容：[ReasoningContentBlock](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ReasoningContentBlock.html)
+ Converse API 架构：[Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-call.html)
+ 数据集约束：[数据集约束](https://docs.aws.amazon.com/nova/latest/userguide/fine-tune-prepare-data-understanding.html)

### 支持的功能
<a name="nova-2-supported-features"></a>
+ **输入类型**：用户内容块中的文本、图像或视频
+ **助手内容**：纯文本响应和推理内容
+ **数据集构成**：必须为同构数据。选择如下选项之一：
  + 纯文本轮次
  + 文本 \$1 图像轮次
  + 文本 \$1 视频轮次（支持文档理解）

**重要**  
不得在同一数据集中或不同对话轮次间混合使用图像和视频。

### 目前的局限性
<a name="nova-2-current-limitations"></a>
+ **多模态推理内容**：虽然 Converse 格式支持基于图像的推理内容，但 Nova 2.0 SFT 仅支持 reasoningText 字段中基于文本的推理内容。
+ **验证集**：Nova 2.0 的 SFT 训练不支持提供验证数据集。若传入验证数据集，训练过程中将被忽略。该限制同时适用于基于 UI 提交和程序化提交的训练作业。

### 支持的媒体格式
<a name="nova-2-supported-media"></a>
+ **图像**：PNG、JPEG、GIF
+ **视频**：MOV、MKV、MP4

### 数据格式示例
<a name="nova-2-data-examples"></a>

------
#### [ Text-only (Nova 1.0 compatible) ]

```
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "system": [  
    {  
      "text": "You are a digital assistant with a friendly personality"  
    }  
  ],  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "text": "What country is right next to Australia?"  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "text": "The closest country is New Zealand"  
        }  
      ]  
    }  
  ]  
}
```

------
#### [ Text with reasoning (Nova 2.0) ]

```
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "system": [  
    {  
      "text": "You are a digital assistant with a friendly personality"  
    }  
  ],  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "text": "What country is right next to Australia?"  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "reasoningContent": {  
            "reasoningText": {  
              "text": "I need to use my world knowledge of geography to answer this question"  
            }  
          }  
        },  
        {  
          "text": "The closest country to Australia is New Zealand, located to the southeast across the Tasman Sea."  
        }  
      ]  
    }  
  ]  
}
```

------
#### [ Image \$1 text input ]

```
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "system": [  
    {  
      "text": "You are a helpful assistant."  
    }  
  ],  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "image": {  
            "format": "jpeg",  
            "source": {  
              "s3Location": {  
                "uri": "s3://your-bucket/your-path/your-image.jpg",  
                "bucketOwner": "your-aws-account-id"  
              }  
            }  
          }  
        },  
        {  
          "text": "Which country is highlighted in the image?"  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "reasoningContent": {  
            "reasoningText": {  
              "text": "I will determine the highlighted country by examining its location on the map and using my geographical knowledge"  
            }  
          }  
        },  
        {  
          "text": "The highlighted country is New Zealand"  
        }  
      ]  
    }  
  ]  
}
```

------
#### [ Video \$1 text input ]

```
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "system": [  
    {  
      "text": "You are a helpful assistant."  
    }  
  ],  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "video": {  
            "format": "mp4",  
            "source": {  
              "s3Location": {  
                "uri": "s3://your-bucket/your-path/your-video.mp4",  
                "bucketOwner": "your-aws-account-id"  
              }  
            }  
          }  
        },  
        {  
          "text": "What is shown in this video?"  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "reasoningContent": {  
            "reasoningText": {  
              "text": "I will analyze the video content to identify key elements"  
            }  
          }  
        },  
        {  
          "text": "The video shows a map with New Zealand highlighted"  
        }  
      ]  
    }  
  ]  
}
```

------

## 工具调用
<a name="nova-2-tool-calling"></a>

Nova 2.0 SFT 支持基于工具调用模式进行模型训练，使模型能够学习何时以及如何调用外部工具或函数。

### 工具调用的数据格式
<a name="nova-2-tool-calling-format"></a>

工具调用训练数据包含用于定义可用工具的 `toolConfig` 部分，以及展示工具使用模式的对话轮次内容。

**示例输入**

```
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "system": [  
    {  
      "text": "You are an expert in composing function calls."  
    }  
  ],  
  "toolConfig": {  
    "tools": [  
      {  
        "toolSpec": {  
          "name": "getItemCost",  
          "description": "Retrieve the cost of an item from the catalog",  
          "inputSchema": {  
            "json": {  
              "type": "object",  
              "properties": {  
                "item_name": {  
                  "type": "string",  
                  "description": "The name of the item to retrieve cost for"  
                },  
                "item_id": {  
                  "type": "string",  
                  "description": "The ASIN of item to retrieve cost for"  
                }  
              },  
              "required": [  
                "item_id"  
              ]  
            }  
          }  
        }  
      },  
      {  
        "toolSpec": {  
          "name": "getItemAvailability",  
          "description": "Retrieve whether an item is available in a given location",  
          "inputSchema": {  
            "json": {  
              "type": "object",  
              "properties": {  
                "zipcode": {  
                  "type": "string",  
                  "description": "The zipcode of the location to check in"  
                },  
                "quantity": {  
                  "type": "integer",  
                  "description": "The number of items to check availability for"  
                },  
                "item_id": {  
                  "type": "string",  
                  "description": "The ASIN of item to check availability for"  
                }  
              },  
              "required": [  
                "item_id", "zipcode"  
              ]  
            }  
          }  
        }  
      }  
    ]  
  },  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "text": "I need to check whether there are twenty pieces of the following item available. Here is the item ASIN on Amazon: id-123. Please check for the zipcode 94086"  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "reasoningContent": {  
            "reasoningText": {  
              "text": "The user wants to check how many pieces of the item with ASIN id-123 are available in the zipcode 94086"  
            }  
          }  
        },  
        {  
          "toolUse": {  
            "toolUseId": "getItemAvailability_0",  
            "name": "getItemAvailability",  
            "input": {  
              "zipcode": "94086",  
              "quantity": 20,  
              "item_id": "id-123"  
            }  
          }  
        }  
      ]  
    },  
    {  
      "role": "user",  
      "content": [  
        {  
          "toolResult": {  
            "toolUseId": "getItemAvailability_0",  
            "content": [  
              {  
                "text": "[{\"name\": \"getItemAvailability\", \"results\": {\"availability\": true}}]"  
              }  
            ]  
          }  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {  
          "text": "Yes, there are twenty pieces of item id-123 available at 94086. Would you like to place an order or know the total cost?"  
        }  
      ]  
    }  
  ]  
}
```

### 工具调用要求
<a name="nova-2-tool-calling-requirements"></a>

创建工具调用训练数据时，请遵循以下要求：


| 要求 | 说明 | 
| --- | --- | 
| ToolUse 放置 | ToolUse 只能出现在助手轮次中 | 
| ToolResult 放置 | ToolResult 只能出现在用户轮次中 | 
| ToolResult 格式 | ToolResult 只能为文本或 JSON。Nova 模型不支持其他模态 | 
| inputSchema 格式 | toolSpec 中的 inputSchema 必须是有效的 JSON 架构对象 | 
| toolUseId 匹配 | 每个 ToolResult 必须引用前序助手轮次 ToolUse 中的有效 toolUseId，且每个 toolUseId 在单次对话中仅可使用一次 | 

### 重要提示
<a name="nova-2-tool-calling-notes"></a>
+ 确保工具定义在所有训练样本中保持一致
+ 模型将从所提供的示例中学习工具调用模式
+ 包含各类场景示例，明确各工具的适用与不适用情况

## 文档理解
<a name="nova-2-document-understanding"></a>

Nova 2.0 SFT 支持基于文档任务的训练，使模型能够学习如何分析并回答与 PDF 文档相关的问题。

### 文档理解数据格式
<a name="nova-2-document-format"></a>

文档理解训练数据在用户内容块中包含文档引用，模型将学习对文档内容进行提取与推理。

**示例输入**

```
{  
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "text": "What are the ways in which a customer can experience issues during checkout on Amazon?"  
        },  
        {  
          "document": {  
            "format": "pdf",  
            "source": {  
              "s3Location": {  
                "uri": "s3://my-bucket-name/path/to/documents/customer_service_debugging.pdf",  
                "bucketOwner": "123456789012"  
              }  
            }  
          }  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {
          "reasoningContent": {  
            "reasoningText": {  
              "text": "I need to find the relevant section in the document to answer the question."  
            }  
          }
        },
        {  
          "text": "Customers can experience issues with 1. Data entry, 2. Payment methods, 3. Connectivity while placing the order. Which one would you like to dive into?"  
        }   
      ]
    }  
  ]  
}
}
```

### 文档理解限制
<a name="nova-2-document-limitations"></a>


| 限制 | Details | 
| --- | --- | 
| 支持的格式 | 仅支持 PDF 文件 | 
| 最大文档大小 | 10 MB | 
| 模态混合 | 单个样本可包含文档与文本，但不可将文档与其他模态（图像、视频）混用 | 

### 文档理解最佳实践
<a name="nova-2-document-best-practices"></a>
+ 确保文档格式清晰，文本可正常提取
+ 提供覆盖不同文档类型与问题格式的多样化样本
+ 包含推理内容，帮助模型学习文档分析模式

## 视频理解
<a name="nova-2-video-understanding"></a>

Nova 2.0 SFT 支持基于视频任务的训练，使模型能够学习如何分析并回答与视频内容相关的问题。

### 视频理解数据格式
<a name="nova-2-video-format"></a>

视频理解训练数据在用户内容块中包含视频引用，模型将学习从视频内容中提取信息并进行推理。

**示例输入**

```
  
{  
  "schemaVersion": "bedrock-conversation-2024",  
  "messages": [  
    {  
      "role": "user",  
      "content": [  
        {  
          "text": "What are the ways in which a customer can experience issues during checkout on Amazon?"  
        },  
        {  
          "video": {  
            "format": "mp4",  
            "source": {  
              "s3Location": {  
                "uri": "s3://my-bucket-name/path/to/videos/customer_service_debugging.mp4",  
                "bucketOwner": "123456789012"  
              }  
            }  
          }  
        }  
      ]  
    },  
    {  
      "role": "assistant",  
      "content": [  
        {
          "reasoningContent": {  
            "reasoningText": {  
              "text": "I need to find the relevant section in the video to answer the question."  
            }  
          }
        },
        {  
          "text": "Customers can experience issues with 1. Data entry, 2. Payment methods, 3. Connectivity while placing the order. Which one would you like to dive into?"  
        }   
      ]  
    }  
  ]  
}
```

### 视频理解限制
<a name="nova-2-video-limitations"></a>


| 限制 | Details | 
| --- | --- | 
| 最大视频大小 | 50 MB | 
| 视频最长时长 | 15 分钟 | 
| 单个样本视频数量 | 每个样本仅允许包含一个视频，不支持在同一样本中使用多个视频 | 
| 模态混合 | 单个样本可包含视频与文本，但不可将视频与其他模态（图像、文档）混用 | 

### 支持的视频格式
<a name="nova-2-video-formats"></a>
+ MOV
+ MKV
+ MP4

### 视频理解最佳实践
<a name="nova-2-video-best-practices"></a>
+ 保持视频简洁明了，聚焦与任务相关的内容
+ 确保视频画质清晰，便于模型提取有效信息
+ 提出的问题应明确指向视频中的特定内容
+ 提供覆盖不同视频类型与问题格式的多样化样本

## 推理模式与非推理模式
<a name="nova-2-reasoning-modes"></a>

### 理解推理内容
<a name="nova-2-understanding-reasoning"></a>

推理内容（亦称思维链）会记录模型在生成最终答案前的中间思考步骤。在 `assistant` 轮次中，可通过 `reasoningContent` 字段加入这些推理轨迹。

**损失计算方式**
+ **包含推理内容**：训练损失同时计入推理词元和最终输出词元
+ **不含推理内容**：训练损失仅基于最终输出词元计算

您可在多轮对话的多个助手轮次中添加 `reasoningContent`。

**格式规范**
+ 推理内容使用纯文本
+ 除非任务明确要求，否则避免使用 `<thinking>` 和 `</thinking>` 等标记标签
+ 确保推理内容清晰，并与问题求解过程相关

### 何时启用推理模式
<a name="nova-2-when-enable-reasoning"></a>

在以下场景中，在训练配置里设置 `reasoning_enabled: true`：
+ 训练数据包含推理词元
+ 希望模型在生成最终输出前先产生思考词元
+ 需要在复杂推理任务上获得更优性能

允许在 `reasoning_enabled = true` 的情况下，使用非推理数据集训练 Nova。但是，这样做可能会导致模型丧失推理能力，因为 Nova 主要学习数据中呈现的应答方式，而非执行推理过程。如果希望使用非推理数据集训练，同时在推理阶段保留推理能力，则可在训练时关闭推理 (`reasoning_enabled = false`)，并在推理时启用。这种方式虽能在推理时使用推理，但无法保证效果优于不启用推理的推理方式。一般建议：使用推理数据集时，训练与推理均启用推理；使用非推理数据集时，两者均关闭推理。

在以下场景中，设置 `reasoning_enabled: false`：
+ 训练数据不包含推理词元
+ 训练任务较为简单，无需显式推理步骤
+ 希望优化速度并减少词元使用量

### 生成推理数据
<a name="nova-2-generating-reasoning"></a>

如果数据集缺失推理轨迹，可借助 Nova Premier 等具备推理能力的模型来生成。将输入-输出对提供给模型，记录其推理过程，从而构建包含推理内容的增强型数据集。

### 使用推理词元进行训练
<a name="nova-2-using-reasoning-training"></a>

启用推理模式进行训练时，模型会学习将内部推理过程与最终答案分离开。训练过程：
+ 将数据组织为三元组结构：输入、推理和答案
+ 基于推理词元和答案词元的标准下一词元预测损失进行优化
+ 引导模型在生成回复前先进行内部逻辑推理

### 优质推理内容的要素
<a name="nova-2-effective-reasoning"></a>

高质量的推理内容应包含以下要素：
+ 中间思考与分析
+ 逻辑推导与推理步骤
+ 分步解决问题的方法
+ 推理步骤与结论之间的明确关联

这些要素有助于模型掌握“先思考，后作答”的能力。

## 数据集准备指南
<a name="nova-2-dataset-preparation"></a>

### 规模与质量
<a name="nova-2-size-quality"></a>
+ **推荐样本量**：2000 – 10000 个样本
+ **最小样本量**：200
+ **优先级**：质量胜于数量。确保样本准确、标注规范
+ **应用贴合度**：数据集应尽可能贴近实际使用案例

### 多样性
<a name="nova-2-diversity"></a>

样本需具备多样性，满足以下要求：
+ 覆盖所有预期输入类型
+ 包含不同难度级别的样本
+ 纳入边界情况与各类变体
+ 避免模型过拟合于单一模式

### 输出格式
<a name="nova-2-output-formatting"></a>

在助手响应中明确指定所需输出格式：
+ JSON 结构
+ 表
+ CSV 格式
+ 应用程序专属的自定义格式

### 多回合对话
<a name="nova-2-multi-turn"></a>

使用多轮对话数据集时请注意：
+ 损失仅基于助手轮次计算，而非用户轮次
+ 每个助手响应都应格式正确
+ 保持各轮对话风格与格式一致

### 质量检查清单
<a name="nova-2-quality-checklist"></a>
+ 数据集规模充足（2000 – 10000 个样本）
+ 涵盖所有使用案例的多样化样本
+ 输出格式清晰统一
+ 标签与标注准确无误
+ 贴合实际生产场景
+ 无矛盾或歧义内容

### 上传数据
<a name="nova-2-uploading-data"></a>

数据集需上传到 SageMaker 训练作业可访问的存储桶。有关设置相应权限的信息，请参阅[先决条件](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-general-prerequisites.html)。

## 启动训练作业
<a name="nova-2-starting-training"></a>

### 选择超参数并更新配方
<a name="nova-2-selecting-hyperparameters"></a>

Nova 2.0 的设置方式与 Nova 1.0 基本一致。将输入数据上传到 S3 后，使用 [SageMaker HyperPod Recipes](https://github.com/aws/sagemaker-hyperpod-recipes/tree/main/recipes_collection/recipes/fine-tuning/nova) 中 fine-tuning 文件夹下的配方。对于 Nova 2.0，以下是一些可以根据使用案例更新的关键超参数。以下是 Nova 2.0 SFT PEFT 配方的样本。对于容器映像 URI，请使用 `708977205387.dkr.ecr.us-east-1.amazonaws.com/nova-fine-tune-repo:SM-TJ-SFT-V2-latest` 来运行 SFT 微调作业。

请使用 v2.254.1 版本的 SageMaker AI PySDK，确保与 Nova 训练严格兼容。将 SDK 升级到 v3.0 版本会导致破坏性变更。对 SageMaker AI PySDK v3 的支持即将推出。

**示例输入**

```
!pip install sagemaker==2.254.1
```

```
run:  
  name: {peft_recipe_job_name}  
  model_type: amazon.nova-2-lite-v1:0:256k  
  model_name_or_path: {peft_model_name_or_path}  
  data_s3_path: {train_dataset_s3_path} # SageMaker HyperPod (SMHP) only and not compatible with SageMaker Training jobs. Note replace my-bucket-name with your real bucket name for SMHP job  
  replicas: 4                      # Number of compute instances for training, allowed values are 4, 8, 16, 32  
  output_s3_path: ""               # Output artifact path (Hyperpod job-specific; not compatible with standard SageMaker Training jobs). Note replace my-bucket-name with your real bucket name for SMHP job  
  
training_config:  
  max_steps: 10                   # Maximum training steps. Minimal is 4.  
  save_steps: 10                      # How many training steps the checkpoint will be saved. Should be less than or equal to max_steps  
  save_top_k: 1                    # Keep top K best checkpoints. Note supported only for SageMaker HyperPod jobs. Minimal is 1.  
  max_length: 32768                # Sequence length (options: 8192, 16384, 32768 [default], 65536)  
  global_batch_size: 32            # Global batch size (options: 32, 64, 128)  
  reasoning_enabled: true          # If data has reasoningContent, set to true; otherwise False  
  
  lr_scheduler:  
    warmup_steps: 15               # Learning rate warmup steps. Recommend 15% of max_steps  
    min_lr: 1e-6                   # Minimum learning rate, must be between 0.0 and 1.0  
  
  optim_config:                    # Optimizer settings  
    lr: 1e-5                       # Learning rate, must be between 0.0 and 1.0  
    weight_decay: 0.0              # L2 regularization strength, must be between 0.0 and 1.0  
    adam_beta1: 0.9                # Exponential decay rate for first-moment estimates, must be between 0.0 and 1.0  
    adam_beta2: 0.95               # Exponential decay rate for second-moment estimates, must be between 0.0 and 1.0  
  
  peft:                            # Parameter-efficient fine-tuning (LoRA)  
    peft_scheme: "lora"            # Enable LoRA for PEFT  
    lora_tuning:  
      alpha: 64                    # Scaling factor for LoRA weights ( options: 32, 64, 96, 128, 160, 192),  
      lora_plus_lr_ratio: 64.0
```

该配方包含的超参数与 Nova 1.0 基本一致。核心超参数如下：
+ `max_steps`：希望作业运行的步数。通常，1 轮 epoch（遍历完整数据集一次）的计算方式为：步数 = 数据样本数/全局批次大小。步数越大、全局批次越小，任务运行耗时越长。
+ `reasoning_enabled`：控制数据集的推理模式。选项：
  + `true`：启用推理模式（相当于高强度推理）
  + `false`：禁用推理模式

  注意：在 SFT 中，无法精细控制推理强度。设置 `reasoning_enabled: true` 即启用完整推理能力。
+ `peft.peft_scheme`：设置为“lora”即启用基于 PEFT 的微调。设置为 null（不带引号）即启用全秩微调。

### 启动训练作业
<a name="nova-2-start-job"></a>

```
from sagemaker.pytorch import PyTorch  
  
# define OutputDataConfig path  
if default_prefix:  
    output_path = f"s3://{bucket_name}/{default_prefix}/{sm_training_job_name}"  
else:  
    output_path = f"s3://{bucket_name}/{sm_training_job_name}"  

output_kms_key = "<KMS key arn to encrypt trained model in Amazon-owned S3 bucket>" # optional, leave blank for Amazon managed encryption
  
recipe_overrides = {  
    "run": {  
        "replicas": instance_count,  # Required  
        "output_s3_path": output_path  
    },  
}  
  
estimator = PyTorch(  
    output_path=output_path,  
    base_job_name=sm_training_job_name,  
    role=role,  
    disable_profiler=True,  
    debugger_hook_config=False,  
    instance_count=instance_count,  
    instance_type=instance_type,  
    training_recipe=training_recipe,  
    recipe_overrides=recipe_overrides,  
    max_run=432000,  
    sagemaker_session=sagemaker_session,  
    image_uri=image_uri,
    output_kms_key=output_kms_key,
    tags=[  
        {'Key': 'model_name_or_path', 'Value': model_name_or_path},  
    ]  
)  
  
print(f"\nsm_training_job_name:\n{sm_training_job_name}\n")  
print(f"output_path:\n{output_path}")
```

```
from sagemaker.inputs import TrainingInput  
  
train_input = TrainingInput(  
    s3_data=train_dataset_s3_path,  
    distribution="FullyReplicated",  
    s3_data_type="Converse",  
)  
  
estimator.fit(inputs={"validation": val_input}, wait=False)
```

**注意**  
Nova 2.0 的监督式微调不支持传递验证数据集。

要启动作业，请执行以下操作：
+ 更新配方中的数据集路径与超参数
+ 运行笔记本中指定的代码单元，提交训练作业

笔记本将自动处理作业提交，并提供状态跟踪功能。