ChatGLM2-6B is an open-source Chinese-English bilingual dialogue model from Tsinghua University's Knowledge Engineering Group and Data Mining team (THUDM). It is easy to deploy with modest resources, converses fluently, and is convenient for research and for exploring downstream application scenarios. For the specifics, we quote the official introduction:
- Stronger performance: Building on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of GLM, and has been pre-trained on 1.4T Chinese and English tokens and aligned with human preferences. Evaluation results show that, compared with the first-generation model, ChatGLM2-6B achieves large performance gains on datasets such as MMLU (+23%), CEval (+33%), GSM8K (+571%), and BBH (+60%), making it strongly competitive among open-source models of the same size.
- Longer context: Based on FlashAttention, we extended the base model's context length from ChatGLM-6B's 2K to 32K, and trained with an 8K context length during the dialogue stage, allowing more rounds of conversation. The current version of ChatGLM2-6B still has limited ability to understand single-turn ultra-long documents, which we will focus on optimizing in future iterations.
- More efficient inference: Based on Multi-Query Attention, ChatGLM2-6B offers faster inference and lower GPU memory usage: with the official implementation, inference is 42% faster than the first generation, and under INT4 quantization, 6 GB of GPU memory now supports dialogues of length 8K instead of 1K.
- More open license: The ChatGLM2-6B weights are fully open for academic research, and free commercial use is also permitted after completing a registration questionnaire.
1 Environment Preparation
- CentOS 7.6 (64-bit)
- Anaconda3-2023.07-1-Linux-x86_64
- Python 3.11.3
- GPU: Tesla P40 (24 GB VRAM / 1 card)
- CPU: 6 vCores / 56 GB RAM
2 Installation and Deployment
First, install Anaconda. Reference commands:
wget https://repo.anaconda.com/archive/Anaconda3-2023.07-1-Linux-x86_64.sh
sudo sh Anaconda3-2023.07-1-Linux-x86_64.sh
We use Conda's default base environment directly.
If Git is not installed, you can run:
sudo yum install git
Because the base environment that ships with Anaconda does not include PyTorch, installing ChatGLM2-6B directly with the defaults may pull in a PyTorch build that was not compiled with CUDA support, so we install a CUDA-enabled PyTorch ahead of time:
conda install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia
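After the installation finishes, a quick sanity check (our own addition, not part of the official steps) confirms that the CUDA-enabled build is actually in use:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
If the environment is set up correctly, the second value printed should be True.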
Now we can install ChatGLM2-6B by running:
mkdir ChatGLM
cd ChatGLM
git clone https://github.com/THUDM/ChatGLM2-6B.git
cd ChatGLM2-6B
pip install -r requirements.txt
If the installation completes without errors, the basic environment is in good shape. Next, we can download and load the ChatGLM2-6B model in a Python session. We use the default setup, which downloads the model from the internet (note: loading the model below fetches 7 weight shard files, roughly 10+ GB in total, so the time required depends on your network):
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='cuda')
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "厄尔尼诺现象会带来哪些影响呢?", history=[])
>>> print(response)
厄尔尼诺现象是指太平洋赤道上的海洋表面温度异常升高现象,通常会导致全球范围内的气候变化。以下是厄尔尼诺现象可能带来的一些影响:
1. 太平洋地区和整个地球的气候都会受到影响。厄尔尼诺现象可能导致全球气温上升,尤其是在赤道附近的海域。
2. 厄尔尼诺现象会对全球范围内的农业生产产生影响。通常情况下,厄尔尼诺现象会导致南美洲的降雨量减少,而北美和非洲的降雨量增加。这可能会对全球范围内的粮食生产产生影响。
3. 厄尔尼诺现象可能会导致全球范围内的海洋生态系统发生变化。海洋表面温度的升高可能导致海洋流和海洋环流的变化,从而影响到海洋生态系统中的生物。
4. 厄尔尼诺现象可能会对全球范围内的海洋生物产生致命影响。海洋表面的温度升高可能导致海洋生物种群数量的减少和物种灭绝。
5. 厄尔尼诺现象可能会对全球范围内的气候模式产生影响。厄尔尼诺现象可能导致全球气温上升和气候变化,从而对全球范围内的生态系统产生影响。
总之,厄尔尼诺现象可能会对全球范围内的气候、环境、农业生产、海洋生态系统和人类健康产生不同程度的影响。
>>>
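If downloading from the Hugging Face Hub is slow or unreliable, a common alternative is to fetch the weights once (for example with git clone https://huggingface.co/THUDM/chatglm2-6b) and load them from disk. A minimal sketch, assuming the weights live in the hypothetical local directory /root/ChatGLM/chatglm2-6b:
>>> from transformers import AutoTokenizer, AutoModel
>>> # Point from_pretrained at the local copy instead of the Hub model id.
>>> local_path = "/root/ChatGLM/chatglm2-6b"
>>> tokenizer = AutoTokenizer.from_pretrained(local_path, trust_remote_code=True)
>>> model = AutoModel.from_pretrained(local_path, trust_remote_code=True, device='cuda')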
3 Fine-Tuning Practice
Unless otherwise noted, all of the fine-tuning work below is done in the local directory /root/ChatGLM/ChatGLM2-6B/ptuning/, so that the code we run can find the files it references.
We proceed in the following steps:
3.1 Prepare the Dataset
Prepare our own dataset. We can download the AdvertiseGen dataset from https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1; I simply extracted a portion of it to produce a training file and a validation file, placed under the directory ChatGLM2-6B/ptuning/myDataset/ (the record format is shown after the file list below).
Training set file: train_file.json
Validation set file: val_file.json
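Each line of these files is a JSON object with a content field (the keyword-style prompt) and a summary field (the reference description); the field names match the --prompt_column content and --response_column summary options used in train.sh below. One record, reconstructed from the example that appears later in the training log (the summary is shortened here), looks like:
{"content": "类型#裙*版型#显瘦*颜色#深色*风格#复古*风格#性感*图案#复古*图案#线条*图案#印花*裙下摆#垂坠*裙长#连衣裙*裙领型#v领", "summary": "中长款式的时尚印花连衣裙设计,甄选柔软舒适的面料,垂坠感强,上身给予丝滑的美好享受。"}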
3.2 Check the GPU
Check the NVIDIA GPU:
(base) [root@VM-16-3-centos ptuning]# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:00:08.0 Off |                    0 |
| N/A   24C    P8     9W / 250W |      0MiB / 22919MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
3.3 Fine-Tune and Train a New Model
Modify the train.sh script. The configuration after modification is:
PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=1

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_train \
    --train_file myDataset/train_file.json \
    --validation_file myDataset/val_file.json \
    --preprocessing_num_workers 6 \
    --prompt_column content \
    --response_column summary \
    --overwrite_cache \
    --model_name_or_path THUDM/chatglm2-6b \
    --output_dir output/my-chatglm2-6b-checkpoint \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 128 \
    --per_device_train_batch_size 6 \
    --per_device_eval_batch_size 6 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 20 \
    --logging_steps 5 \
    --save_steps 5 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 8
You can adjust these options to your needs. Note that the effective batch size is per_device_train_batch_size × gradient_accumulation_steps = 6 × 16 = 96, which matches the "Total train batch size = 96" line in the training log below, and that --quantization_bit 8 quantizes the base model to INT8 so the fine-tuning run fits within the P40's 24 GB of VRAM.
The fine-tuning run depends on a few extra Python modules; install them first (these packages are published on PyPI, so we install them with pip):
pip install rouge_chinese nltk jieba datasets
Then we can start the fine-tuning run:
cd /root/ChatGLM/ChatGLM2-6B/ptuning/
bash train.sh
If no errors are reported during training, the run completed normally.
Reference output from the training run (abridged) follows:
master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
07/20/2023 22:04:37 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
07/20/2023 22:04:37 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
  _n_gpu=1,
  do_train=True,
  gradient_accumulation_steps=16,
  learning_rate=0.02,
  logging_steps=5,
  max_steps=20,
  optim=OptimizerNames.ADAMW_HF,
  output_dir=output/my-chatglm2-6b-checkpoint,
  per_device_eval_batch_size=6,
  per_device_train_batch_size=6,
  predict_with_generate=True,
  save_steps=5,
  seed=42,
  ...(remaining arguments omitted)...
)
07/20/2023 22:04:38 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-47a404743bdc4e4e/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
[INFO|configuration_utils.py:668] 2023-07-20 22:04:48,365 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/config.json
[WARNING|configuration_auto.py:905] 2023-07-20 22:04:48,365 >> Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
[INFO|configuration_utils.py:720] 2023-07-20 22:04:50,088 >> Model config ChatGLMConfig {
  "_name_or_path": "THUDM/chatglm2-6b",
  "hidden_size": 4096,
  "kv_channels": 128,
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 28,
  "padded_vocab_size": 65024,
  "pre_seq_len": null,
  "seq_length": 32768,
  "torch_dtype": "float16",
  "transformers_version": "4.27.1",
  ...(remaining entries omitted)...
}
[INFO|tokenization_utils_base.py:1802] 2023-07-20 22:04:51,823 >> loading file tokenizer.model from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/tokenizer.model
[INFO|modeling_utils.py:2403] 2023-07-20 22:04:53,431 >> loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/pytorch_model.bin.index.json
Loading checkpoint shards: 100%|██████████| 7/7 [00:06<00:00, 1.03it/s]
[INFO|modeling_utils.py:3032] 2023-07-20 22:05:00,454 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[WARNING|modeling_utils.py:3034] 2023-07-20 22:05:00,454 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at THUDM/chatglm2-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Quantized to 8 bit
input_ids [64790, 64792, 790, 30951, 517, 30910, 30939, 30996, 13, 13, 54761, 31211, 33467, 31010, 56778, ...(token ids omitted)...]
inputs [Round 1] 问:类型#裙*版型#显瘦*颜色#深色*风格#复古*风格#性感*图案#复古*图案#线条*图案#印花*裙下摆#垂坠*裙长#连衣裙*裙领型#v领 中长款式的时尚印花连衣裙设计,甄选柔软舒适的面料,垂坠感强,上身给予丝滑的美好享受。印花元素彰显出女性优雅柔美的气息,结合深色系的底布,带来了浓郁的复古情怀。裙型不挑身材,且能够轻松遮肉显瘦。大v领的设计,凸显女性优美的颈部线条,性感又迷人。
label_ids [-100, -100, ..., 34001, 54625, 55115, 34172, ...(token ids omitted)..., 2, -100, ...]
labels 中长款式的时尚印花连衣裙设计,甄选柔软舒适的面料,垂坠感强,上身给予丝滑的美好享受。印花元素彰显出女性优雅柔美的气息,结合深色系的底布,带来了浓郁的复古情怀。裙型不挑身材,且能够轻松遮肉显瘦。大v领的设计,凸显女性优美的颈部线条,性感又迷人。
[INFO|trainer.py:543] 2023-07-20 22:05:07,589 >> max_steps is given, it will override any value given in num_train_epochs
[INFO|trainer.py:1740] 2023-07-20 22:05:07,674 >> ***** Running training *****
[INFO|trainer.py:1741] 2023-07-20 22:05:07,674 >>   Num examples = 52
[INFO|trainer.py:1742] 2023-07-20 22:05:07,674 >>   Num Epochs = 20
[INFO|trainer.py:1743] 2023-07-20 22:05:07,674 >>   Instantaneous batch size per device = 6
[INFO|trainer.py:1744] 2023-07-20 22:05:07,674 >>   Total train batch size (w. parallel, distributed & accumulation) = 96
[INFO|trainer.py:1745] 2023-07-20 22:05:07,674 >>   Gradient Accumulation steps = 16
[INFO|trainer.py:1746] 2023-07-20 22:05:07,674 >>   Total optimization steps = 20
[INFO|trainer.py:1747] 2023-07-20 22:05:07,675 >>   Number of trainable parameters = 1835008
{'loss': 2.02, 'learning_rate': 0.015, 'epoch': 3.56}
 25%|██████████                              | 5/20 [04:33<13:00, 52.02s/it]
Saving PrefixEncoder
[INFO|modeling_utils.py:1762] 2023-07-20 22:09:41,556 >> Model weights saved in output/my-chatglm2-6b-checkpoint/checkpoint-5/pytorch_model.bin
{'loss': 1.6124, 'learning_rate': 0.01, 'epoch': 7.0}
 50%|████████████████████                    | 10/20 [08:55<09:23, 56.38s/it]
Saving PrefixEncoder
{'loss': 1.3611, 'learning_rate': 0.005, 'epoch': 10.0}
 75%|██████████████████████████████          | 15/20 [12:44<04:17, 51.59s/it]
Saving PrefixEncoder
{'loss': 1.3431, 'learning_rate': 0.0, 'epoch': 13.0}
100%|████████████████████████████████████████| 20/20 [16:34<00:00, 45.91s/it]
Saving PrefixEncoder
[INFO|modeling_utils.py:1762] 2023-07-20 22:21:41,847 >> Model weights saved in output/my-chatglm2-6b-checkpoint/checkpoint-20/pytorch_model.bin
[INFO|trainer.py:2012] 2023-07-20 22:21:41,869 >> Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 994.1944, 'train_samples_per_second': 1.931, 'train_steps_per_second': 0.02, 'train_loss': 1.584136962890625, 'epoch': 13.0}
***** train metrics *****
  epoch                    =       13.0
  train_loss               =     1.5841
  train_runtime            = 0:16:34.19
  train_samples            =         52
  train_samples_per_second =      1.931
  train_steps_per_second   =       0.02
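A note on the "Number of trainable parameters = 1835008" line: with P-Tuning v2 only the prefix encoder is trained, and its embedding size works out to exactly pre_seq_len × num_layers × 2 × multi_query_group_num × kv_channels = 128 × 28 × 2 × 2 × 128 = 1,835,008 parameters (the configuration values appear in the ChatGLMConfig dump above). That is also why each "Saving PrefixEncoder" checkpoint is tiny compared with the 6B-parameter base model.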
During training, you can monitor GPU memory usage with the following command, which refreshes the nvidia-smi output every 1,000 ms:
nvidia-smi -lms 1000
3.4 Inference and Evaluation with the Fine-Tuned Model
To evaluate and validate the fine-tuned model, modify the checkpoint directory in the evaluate.sh script:
PRE_SEQ_LEN=128
CHECKPOINT=my-chatglm2-6b-checkpoint
STEP=20
NUM_GPUS=1

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_predict \
    --validation_file myDataset/train_file.json \
    --test_file myDataset/val_file.json \
    --overwrite_cache \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path THUDM/chatglm2-6b \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 8
Run the following commands to perform inference and evaluation with the fine-tuned model:
cd /root/ChatGLM/ChatGLM2-6B/ptuning/
bash evaluate.sh
Reference output from the evaluation run (abridged) follows:
master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
07/20/2023 22:29:56 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
07/20/2023 22:29:56 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
  _n_gpu=1,
  do_predict=True,
  do_train=False,
  output_dir=./output/my-chatglm2-6b-checkpoint,
  per_device_eval_batch_size=6,
  predict_with_generate=True,
  seed=42,
  ...(remaining arguments omitted)...
)
Downloading and preparing dataset json/default to /root/.cache/huggingface/datasets/json/default-ffe7658adbb1cddc/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-ffe7658adbb1cddc/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
[INFO|configuration_utils.py:668] 2023-07-20 22:29:58,077 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/config.json
[INFO|configuration_utils.py:720] 2023-07-20 22:29:59,506 >> Model config ChatGLMConfig { ...(same configuration as in the training log)... }
[INFO|tokenization_utils_base.py:1802] 2023-07-20 22:30:01,156 >> loading file tokenizer.model from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/tokenizer.model
[INFO|modeling_utils.py:2403] 2023-07-20 22:30:04,903 >> loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--THUDM--chatglm2-6b/snapshots/b1502f4f75c71499a3d566b14463edd62620ce9f/pytorch_model.bin.index.json
Loading checkpoint shards: 100%|██████████| 7/7 [00:06<00:00, 1.03it/s]
[WARNING|modeling_utils.py:3034] 2023-07-20 22:30:11,836 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at THUDM/chatglm2-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
Quantized to 8 bit
input_ids [64790, 64792, 790, 30951, 517, 30910, 30939, 30996, 13, 13, 54761, 31211, ...(token ids omitted)...]
inputs [Round 1] 问:类型#裙*材质#丝绒*颜色#绿色*图案#波点*图案#印花*裙下摆#荷叶边*裙下摆#花边*裙长#长裙*裙款式#拼接 答:
label_ids [64790, 64792, 40503, 54801, 55144, 55500, ...(token ids omitted)...]
labels 这件连衣长裙多处都运用了荷叶边的拼接设计,肩头处添加花边点缀,更加凸显女性温柔典雅气质,而腰部的荷叶边正好能够勾勒修饰腰部曲线,裙摆处则是让裙装更具灵动飘逸感。波点印花元素的
07/20/2023 22:30:18 - INFO - __main__ - *** Predict ***
[INFO|trainer.py:3068] 2023-07-20 22:30:18,524 >> ***** Running Prediction *****
[INFO|trainer.py:3070] 2023-07-20 22:30:18,524 >>   Num examples = 562
[INFO|trainer.py:3073] 2023-07-20 22:30:18,524 >>   Batch size = 6
  0%|                                        | 0/94 [00:00<?, ?it/s]
...(per-batch progress bars and repeated "Generate config GenerationConfig" messages omitted)...
100%|████████████████████████████████████████| 94/94 [40:15<00:00, 26.23s/it]
Building prefix dict from the default dictionary ...
07/20/2023 23:11:03 - DEBUG - jieba - Dumping model to file cache /tmp/jieba.cache
07/20/2023 23:11:03 - DEBUG - jieba - Loading model cost 0.658 seconds.
07/20/2023 23:11:03 - DEBUG - jieba - Prefix dict has been built successfully.
***** predict metrics *****
  predict_bleu-4             =     3.8157
  predict_rouge-1            =    21.1755
  predict_rouge-2            =     2.5345
  predict_rouge-l            =    18.0414
  predict_runtime            = 0:40:46.74
  predict_samples            =        562
  predict_samples_per_second =       0.23
  predict_steps_per_second   =      0.038
3.5 Validating and Using the Fine-Tuned Model
In our own Python code, we still load the THUDM/chatglm2-6b base model, but additionally load the prefix-encoder weights from the checkpoint directory produced by fine-tuning, here ./output/my-chatglm2-6b-checkpoint/checkpoint-20 (the final checkpoint, since training ran with max_steps=20).
The Python code for validating the fine-tuned model is shown below:
>>> from transformers import AutoConfig, AutoModel, AutoTokenizer
>>> import os
>>> import torch
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> config = AutoConfig.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, pre_seq_len=128)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", config=config, trust_remote_code=True)
>>> prefix_state_dict = torch.load(os.path.join("./output/my-chatglm2-6b-checkpoint/checkpoint-20", "pytorch_model.bin"))
>>> # Keep only the prefix-encoder weights and strip their name prefix before loading.
>>> new_prefix_state_dict = {}
>>> for k, v in prefix_state_dict.items():
...     if k.startswith("transformer.prefix_encoder."):
...         new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
...
>>> model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
>>> model = model.half().cuda()
>>> model.transformer.prefix_encoder.float()
>>> model = model.eval()
>>>
>>> response, history = model.chat(tokenizer, "类型#上衣*颜色#黑白*风格#简约*风格#休闲*图案#条纹*衣样式#风衣*衣样式#外套", history=[])
>>> print(response)
首先,我们需要分析这道题目。这是一道关于风衣的问题,需要我们选择适合的颜色和风格。题目中给出了风衣的颜色为黑白,风格为简约休闲,图案为条纹。因此,我们可以选择黑色或白色作为风衣的颜色,选择简约休闲的休闲风格,或者选择条纹作为图案。
最终答案:黑/白 简约休闲 风衣。
>>>
>>> response, history = model.chat(tokenizer, "风衣有什么特征呢", history=[])
>>> print(response)
风衣是一种轻便、简洁的外套,通常由薄羊毛或尼龙纤维制成,有着流畅的曲线和轻柔的触感,因此具有优雅、高贵的特征。此外,风衣通常会有一些细节设计,如双排扣、大口袋、系带等,这些都增加了它的时尚感和个性。
>>>
>>> response, history = model.chat(tokenizer, "日常休闲一般穿什么风格的衣服好呢?", history=[])
>>> print(response)
日常休闲一般穿什么风格的衣服好呢?
首先,我们可以考虑一下自己的日常休闲活动,比如去购物、去旅游、去咖啡店等等,这些活动都需要一些舒适的衣服来穿着。接下来,我们可以考虑一下自己的喜好和风格,比如喜欢休闲的、喜欢随性的、喜欢简约的等等,这些喜好会影响我们的穿着选择。
因此,我们可以选择一些舒适、随性、简约、休闲的服装来穿着日常休闲活动。比如,一件宽松的T恤、一条舒适的牛仔裤、一双舒适的拖鞋、一个随性的背包等等,这些都可以让我们在休闲时更加舒适自在。
此外,我们也可以考虑一下自己的喜好和风格。比如,如果喜欢休闲、随性的风格,我们可以选择一些休闲的、随性的服装来穿着,比如一件宽松的T恤、一条舒适的牛仔裤、一双舒适的拖鞋、一个随性的背包等等。如果喜欢简约的,我们可以选择一些简约的、清新的服装来穿着,比如一件简约的T恤、一条清新的牛仔裤、一双舒适的拖鞋、一个清新的背包等等。
总之,日常休闲时,我们可以选择一些舒适、随性、简约、休闲的服装来穿着,这样可以让我们的生活更加轻松自在。
>>>
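If GPU memory is tight (for example, when the card is shared), the base weights can be quantized before the model is moved to the GPU. This is a sketch that assumes the quantize() method exposed by ChatGLM2-6B's remote modeling code is available in your checkout; it replaces the model = model.half().cuda() step above:

# Quantize the base weights to INT4 before moving the model to the GPU; the
# P-tuning prefix encoder is kept in fp32 so the learned prefix is not degraded.
# (quantize() comes from the model's trust_remote_code implementation --
# verify that your version provides it before relying on this.)
model = model.quantize(4)
model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()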
For comparison, we ask the original THUDM/chatglm2-6b model the same questions we put to the fine-tuned model, and look at the results.
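To run the comparison, first reload the unmodified base model, that is, without the pre_seq_len config and without loading any prefix weights; a minimal sketch:

from transformers import AutoModel, AutoTokenizer

# Load the original model: no pre_seq_len in the config and no P-tuning
# checkpoint loaded afterwards.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='cuda')
model = model.eval()

With the base model reloaded, the same three questions give: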
>>>
>>> response, history = model.chat(tokenizer, "类型#上衣*颜色#黑白*风格#简约*风格#休闲*图案#条纹*衣样式#风衣*衣样式#外套", history=[])
>>> print(response)
根据题目要求,我们需要根据上衣的类型、颜色、风格和图案,以及外套的衣样式和风格,来选择合适的答案。
首先,我们看到上衣的类型是“风衣”,颜色是“条纹”,风格是“简约”。根据这些信息,我们可以排除掉选项A、B、C。
接下来,我们看到外套的衣样式是“休闲”,风格是“简约”。根据这些信息,我们可以得出最终答案为“简约 休闲 条纹 风衣”。
因此,最终答案为“简约 休闲 条纹 风衣”。
>>>
>>> response, history = model.chat(tokenizer, "风衣有什么特征呢", history=[])
>>> print(response)
风衣是一种轻便、防风、透气、易携带的外套,通常由薄的金属材料制成,如铝或钢。以下是风衣的一些特征:
1. 防风:风衣通常采用防风面料制成,以减少风寒刺骨和风沙对身体的摩擦。
2. 透气:风衣通常采用透气面料制成,以允许身体排汗和呼吸。
3. 易携带:风衣通常轻便、易折叠,可方便地放入背包或行李中携带。
4. 耐用:风衣通常采用耐用材料制成,具有出色的耐磨性和耐用性。
5. 多功能性:风衣通常设计为具有多个功能,如防风、防水、透气、易携带等。
6. 款式多样:风衣款式多样,包括长款、短款、连帽、无帽、束腰、宽松等,适合不同的需求和场合。
>>>
>>> response, history = model.chat(tokenizer, "日常休闲一般穿什么风格的衣服好呢?", history=[])
>>> print(response)
日常休闲时,可以选择一些舒适、休闲、百搭的衣服来穿着。以下是一些常见的日常休闲服装风格:
1. 休闲衬衫:休闲衬衫是一种经典的日常休闲服装,可以搭配许多不同的下装。可以选择经典的白色或蓝色衬衫,或者选择一些有趣的印花或卡通图案的衬衫。
2. 休闲裤:休闲裤是一种轻便舒适的服装,适合在户外活动或进行日常活动。可以选择经典的牛仔裤或休闲裤,或者选择一些花色的宽松裤。
3. 运动鞋:运动鞋是一种舒适、休闲的鞋子,非常适合日常休闲活动。可以选择一双经典的运动鞋,或者选择一双时尚的运动鞋。
4. 帽子或配饰:戴上一顶帽子或配饰可以让日常休闲服装更加丰富。可以选择一款合适的帽子的,或者添加一些时尚的配饰,如围巾、帽子、手套或手表等。
5. 休闲外套:休闲外套可以为日常休闲活动增加一份时尚感。可以选择一款舒适的外套,如羊毛大衣、夹克或羽绒服等。
日常休闲时,可以选择舒适、休闲、时尚的衣服来穿着。最重要的是,要选择适合自己的衣服,让自己感到舒适自在。
>>>
As the comparison shows, the original model's answers are generally more fluent and better organized than the fine-tuned model's (though not free of factual slips, such as claiming trench coats are made of aluminum or steel). This is expected: to get through the whole workflow quickly, we fine-tuned for very few steps (max_steps=20) on a small slice of the data and did no tuning work at all, so the fine-tuned model is naturally not that good yet. Training for more steps and tuning parameters such as the learning rate and sequence lengths should narrow the gap.
4 References
- https://github.com/THUDM/ChatGLM2-6B
- https://www.heywhale.com/mw/project/64984a7b72ebe240516ae79c
- https://www.anaconda.com/download
- https://zhuanlan.zhihu.com/p/641719964
This article is published under the Attribution-NonCommercial-ShareAlike 4.0 license. You are welcome to repost, use, and republish it, provided you retain the author attribution 时延军 (including the link: http://shiyanjun.cn), do not use it for commercial purposes, and release any work derived from this article under the same license. If you have any questions, please contact me.