Fine-tuning Qwen2.5-7B-Instruct on the Chinese DeepSeek-R1-Distill-data-110k distillation dataset
-
Download the model and data
Model download:
Hugging Face (via HF Mirror):
https://hf-mirror.com/Qwen/Qwen2.5-7B-Instruct
ModelScope:
https://www.modelscope.cn/models/Qwen/Qwen2.5-7B-Instruct
Dataset download:
https://huggingface.co/datasets/Congliu/Chinese-DeepSeek-R1-Distill-data-110k
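The links above can also be fetched from the command line. The following is a minimal sketch, assuming the `huggingface_hub` package is installed (it provides `huggingface-cli`) and that the HF Mirror endpoint is selected through the `HF_ENDPOINT` environment variable. The local directories and the `run` helper are hypothetical names chosen for illustration, not something from the original post. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Repository IDs taken from the links above.
MODEL_ID="Qwen/Qwen2.5-7B-Instruct"
DATASET_ID="Congliu/Chinese-DeepSeek-R1-Distill-data-110k"

# Hypothetical local target directories; adjust to your own layout.
MODEL_DIR="${MODEL_DIR:-./models/Qwen2.5-7B-Instruct}"
DATA_DIR="${DATA_DIR:-./data/Chinese-DeepSeek-R1-Distill-data-110k}"

# Helper (illustrative): print the command unless RUN=1, then execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Route Hugging Face Hub traffic through the mirror used in the post.
export HF_ENDPOINT="https://hf-mirror.com"

# Download the base model and the dataset repository.
run huggingface-cli download "$MODEL_ID" --local-dir "$MODEL_DIR"
run huggingface-cli download --repo-type dataset "$DATASET_ID" --local-dir "$DATA_DIR"
```

After downloading, point `--model` and `--dataset` in the fine-tuning command at wherever the files actually live on your machine.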
-
Install swift
Install with pip:
pip install ms-swift -U
Install from source:
# pip install git+https://github.com/modelscope/ms-swift.git
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
-
Fine-tuning
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model /home/models/pretrained_models/llm/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset /home/data/Chinese-DeepSeek-R1-Distill-data-110k-SFT/new_distill_r1_110k_sft.json \
    --torch_dtype bfloat16 \
    --num_train_epochs 6 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --gradient_accumulation_steps 16 \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 5 \
    --logging_steps 5 \
    --output_dir output \
    --system 'You are a deep thinking assistant.' \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --model_author Q \
    --model_name Q-AILab-Qwen2.5-7B-Instruct-R1-Distill
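To get a feel for what these batch settings imply, the arithmetic can be sketched as below. This is a rough estimate, not a measurement: the post does not state how many data-parallel processes were used or the exact number of training samples, so `world_size` and `num_samples` are assumptions (the command sets no launcher variable, so one process is assumed, and the sample count is read off the "110k" in the dataset name).

```shell
#!/bin/sh
# Values copied from the swift sft command above.
per_device_batch=1   # --per_device_train_batch_size
grad_accum=16        # --gradient_accumulation_steps
epochs=6             # --num_train_epochs
warmup_percent=5     # --warmup_ratio 0.05, expressed as a percentage

# Assumptions (not stated in the post):
world_size=1         # data-parallel processes; use 2 if both GPUs run their own process
num_samples=110000   # approximate, from the "110k" in the dataset name

# Samples consumed per optimizer step.
effective_batch=$((per_device_batch * grad_accum * world_size))

# Optimizer steps per epoch, rounded up.
steps_per_epoch=$(( (num_samples + effective_batch - 1) / effective_batch ))

# Totals over the whole run (integer arithmetic, warmup rounded down).
total_steps=$((steps_per_epoch * epochs))
warmup_steps=$((total_steps * warmup_percent / 100))

echo "effective batch size: $effective_batch"
echo "steps per epoch:      $steps_per_epoch"
echo "total steps:          $total_steps"
echo "warmup steps:         $warmup_steps"
```

Changing `world_size` or `num_samples` to match your own setup shows how the step counts, and therefore the spacing of the `--save_steps 50` checkpoints relative to an epoch, shift.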
-
Training process
Training ran on 2 A800 GPUs for 6 epochs and took 5 days.
-
Inference results
Inference:
CUDA_VISIBLE_DEVICES=0,1 \
swift infer \
    --adapters /home/model/swift/output/v6-20250217-075043/checkpoint-50 \
    --stream true \
    --temperature 0 \
    --max_new_tokens 8192
Inference test:
Qwen2.5-7B-Instruct-DeepSeek-R1-Distill-data-110K training is complete!
-
For follow-up steps such as merging the LoRA weights, resuming training from a checkpoint, or pushing the model, see the ms-swift GitHub repository:
https://github.com/modelscope/ms-swift