12.8 baichuan2-7b inference container environment setup
- Image acquisition
Get the ModelZoo PyTorch framework base image from the Ascend community image repository (address: AscendHub (huawei.com)) and download version 23.0.RC2-1.11.0.
After logging in to your account, set the image download credentials: hover over your account name and click "Set image download credentials". This sets the password used for subsequent docker logins.
Click the "Download Now" link next to the image to see the download instructions, then log in to the server and pull the image.
After following those instructions, confirm that the image has been acquired:
docker images | grep modelzoo
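Because grep modelzoo matches any listing line that contains that string, a stricter check can compare the repository and tag columns exactly. The following is a minimal sketch of such a check. The helper name has_image_tag is hypothetical; the repository and tag in the usage comment are the ones used in the docker run command in this section.

```shell
# has_image_tag REPO TAG
# Reads `docker images` output on stdin and succeeds only if some line has
# exactly REPO in the first column and TAG in the second column.
has_image_tag() {
    awk -v repo="$1" -v tag="$2" '
        $1 == repo && $2 == tag { found = 1 }
        END { exit !found }
    '
}

# Usage (run on the server after pulling the image):
#   docker images | has_image_tag \
#       ascendhub.huawei.com/public-ascendhub/pytorch-modelzoo 23.0.RC2-1.11.0 \
#       && echo "image present" || echo "image missing"
```

The function only parses text, so it can also be pointed at a saved listing when debugging a pull on another machine.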
- Start and enter the container
Execute the following command to start the container. Please note the following:
--device=/dev/davinci{num}: replace {num} with the number of the card to be used; for multiple cards, add one such line per card.
--name: choose the container name according to your needs.
The last -v mount (/home/temp) can be used as a cache directory for transferring files to and from the host.
docker run -itd -u root --ipc=host \
--device=/dev/davinci5 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
-v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /home/temp/:/home/temp \
--name {Container name} \
ascendhub.huawei.com/public-ascendhub/pytorch-modelzoo:23.0.RC2-1.11.0 \
/bin/bash
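For multi-card runs, the per-card --device lines can be generated rather than copied by hand. The following is a small sketch under that assumption. The helper name device_flags is hypothetical, and the card numbers in the usage comment are examples only; use the cards actually present on your server.

```shell
# device_flags ID...
# Prints one --device=/dev/davinciN flag per line for each numeric card ID.
# Returns 1 (printing nothing more) if an ID is empty or not a plain number.
device_flags() {
    for npu_id in "$@"; do
        case "$npu_id" in
            ''|*[!0-9]*)
                echo "invalid card number: $npu_id" >&2
                return 1
                ;;
        esac
        printf '%s%s\n' '--device=/dev/davinci' "$npu_id"
    done
}

# Usage (example card numbers 0 and 1; the remaining options are the same
# as in the command above). The substitution is left unquoted on purpose so
# that each flag becomes a separate argument:
#   docker run -itd -u root --ipc=host \
#       $(device_flags 0 1) \
#       --device=/dev/davinci_manager \
#       ...
```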
Log in to the container after startup is complete:
docker exec -it {Container name or ID} /bin/bash
- Install the in-container environment
Download and install the toolkit:
cd /home/temp
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/CANN/CANN%207.0.RC1/Ascend-cann-toolkit_7.0.RC1_linux-aarch64.run
chmod +x Ascend-cann-toolkit_7.0.RC1_linux-aarch64.run
./Ascend-cann-toolkit_7.0.RC1_linux-aarch64.run --install
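Once the installer has finished, the toolkit's environment variables usually need to be loaded into the current shell before anything that depends on CANN is run. The sketch below assumes the default root install location /usr/local/Ascend/ascend-toolkit/set_env.sh, which this guide does not state; pass a different path if the toolkit was installed elsewhere. The helper name load_cann_env is hypothetical.

```shell
# load_cann_env [PATH]
# Sources the toolkit environment script into the current shell.
# PATH defaults to the assumed default install location; returns 1 if the
# script is not found there.
load_cann_env() {
    cann_env_script="${1:-/usr/local/Ascend/ascend-toolkit/set_env.sh}"
    if [ -f "$cann_env_script" ]; then
        . "$cann_env_script"
    else
        echo "set_env.sh not found at $cann_env_script" >&2
        return 1
    fi
}

# Usage (inside the container, after the toolkit install):
#   load_cann_env
```

Checking for the file first gives a readable error when the install path differs, instead of a failure later during package installation or training.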
Install pip dependencies:
pip3 install --upgrade pip
pip3 install einops sympy regex decorator scipy setuptools-scm prompt-toolkit attrs accelerate sentencepiece transformers==4.28.1
Install deepspeed:
pip3 install deepspeed==0.9.2
git clone https://gitee.com/ascend/DeepSpeed.git
cd DeepSpeed
python setup.py develop
Clone the Baichuan source code and replace the corresponding files in the installed transformers package. (Note: replace {transformers path} with the directory in which transformers is installed.)
git clone https://gitee.com/ascend/ModelZoo-PyTorch.git
cd ModelZoo-PyTorch/PyTorch/built-in/foundation/Baichuan2/7B/transformers_modify
cp -f training_args.py {transformers path}/transformers/training_args.py
cp -f trainer.py {transformers path}/transformers/trainer.py
cp -f versions.py {transformers path}/transformers/utils/versions.py
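The {transformers path} placeholder can be resolved by asking Python where the package is installed instead of looking it up by hand. The following is a minimal sketch, assuming python3 inside the container is the interpreter that transformers was installed for. The helper name pkg_parent_dir and the variable TRANSFORMERS_PATH are hypothetical.

```shell
# pkg_parent_dir PACKAGE
# Prints the directory that contains the installed PACKAGE directory
# (for a pip install this is normally the site-packages directory).
# Returns non-zero if the package cannot be found.
pkg_parent_dir() {
    python3 -c 'import importlib.util, os, sys
spec = importlib.util.find_spec(sys.argv[1])
if spec is None or spec.origin is None:
    sys.exit(1)
print(os.path.dirname(os.path.dirname(spec.origin)))' "$1"
}

# Usage (from the directory that holds the modified files):
#   TRANSFORMERS_PATH="$(pkg_parent_dir transformers)" || exit 1
#   cp -f training_args.py "$TRANSFORMERS_PATH/transformers/training_args.py"
#   cp -f trainer.py "$TRANSFORMERS_PATH/transformers/trainer.py"
```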
- Follow-up notes
After downloading the model weights file for Baichuan2-7B-Base, replace the modeling_baichuan.py file in the downloaded model weights folder with Baichuan2-7B/modeling_baichuan.py from the root directory of the source package.
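Since this replacement overwrites a file shipped with the weights, it may be worth keeping a copy of the original first. The sketch below shows one way to do that. The helper name replace_with_backup and the WEIGHTS_DIR variable are hypothetical; set WEIGHTS_DIR to wherever the Baichuan2-7B-Base weights were downloaded.

```shell
# replace_with_backup SRC DST
# Copies SRC over DST. If DST already exists and no DST.bak is present,
# the existing DST is first saved as DST.bak. Returns 1 if SRC is missing.
replace_with_backup() {
    rwb_src="$1"
    rwb_dst="$2"
    if [ ! -f "$rwb_src" ]; then
        echo "missing source file: $rwb_src" >&2
        return 1
    fi
    if [ -f "$rwb_dst" ] && [ ! -e "$rwb_dst.bak" ]; then
        cp -p "$rwb_dst" "$rwb_dst.bak" || return 1
    fi
    cp -f "$rwb_src" "$rwb_dst"
}

# Usage (from the root directory of the source package):
#   replace_with_backup Baichuan2-7B/modeling_baichuan.py \
#       "$WEIGHTS_DIR/modeling_baichuan.py"
```

The backup is written only once, so re-running the step does not overwrite the original file with an already modified copy.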