12.10 Introduction to MindFormers

MindSpore Transformers (MindFormers) aims to provide a full-flow development suite for large models, covering training, fine-tuning, evaluation, inference, and deployment. It offers the industry's mainstream Transformer-based pre-trained models and SOTA downstream task applications, together with a rich set of parallel features, so that users can train large models and carry out research and development more easily. The suite builds on MindSpore's built-in parallel technology and a componentized design, and has the following features:

● Seamless switching from single-card to large-scale cluster training with a single line of code;

● Flexible, easy-to-use custom parallel configuration;

● Automatic topology awareness, with efficient fusion of data-parallel and model-parallel strategies;

● One-click launch of single-card or multi-card training, fine-tuning, evaluation, and inference for any task;

● Componentized configuration of any module, such as the optimizer, learning strategy, and network assembly;

● High-level, easy-to-use interfaces such as Trainer, pipeline, and AutoClass;

● Automatic download and loading of preset SOTA weights;

● Seamless migration and deployment across AI computing centers.

The list of currently supported models is as follows:

Model	Model name
LLama2	llama2_7b, llama2_13b, llama2_7b_lora, llama2_13b_lora, llama2_70b
GLM2	glm2_6b, glm2_6b_lora
CodeGeex2	codegeex2_6b
LLama	llama_7b, llama_13b, llama_7b_lora
GLM	glm_6b, glm_6b_lora
Bloom	bloom_560m, bloom_7.1b
GPT2	gpt2, gpt2_13b
PanGuAlpha	pangualpha_2_6_b, pangualpha_13b
BLIP2	blip2_stage1_vit_g
CLIP	clip_vit_b_32, clip_vit_b_16, clip_vit_l_14, clip_vit_l_14@336
T5	t5_small
SAM	sam_vit_b, sam_vit_l, sam_vit_h
MAE	mae_vit_base_p16
VIT	vit_base_p16
Swin	swin_base_p4w7
Skywork	skywork_13b
Baichuan2	baichuan2_7b, baichuan2_13b, baichuan2_7b_lora, baichuan2_13b_lora
Baichuan	baichuan_7b, baichuan_13b
Qwen	qwen_7b, qwen_14b, qwen_7b_lora, qwen_14b_lora
Wizardcoder	wizardcoder_15b
Internlm	internlm_7b, internlm_20b, internlm_7b_lora
Ziya	ziya_13b
VisualGLM	visualglm

Model Support List
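
When scripting against this list, it can help to hold the table in a small lookup structure. The following is a minimal sketch in plain Python that does not use the MindFormers API: the dictionary SUPPORTED_MODELS and the helpers find_family and lora_variants are hypothetical names introduced here for illustration, and only a few rows of the table are copied in.

```python
# Illustrative only: these names are hypothetical and not part of MindFormers.
# A few rows copied from the model support table; extend as needed.
SUPPORTED_MODELS = {
    "LLama2": ["llama2_7b", "llama2_13b", "llama2_7b_lora",
               "llama2_13b_lora", "llama2_70b"],
    "GLM2": ["glm2_6b", "glm2_6b_lora"],
    "GPT2": ["gpt2", "gpt2_13b"],
    "Qwen": ["qwen_7b", "qwen_14b", "qwen_7b_lora", "qwen_14b_lora"],
}


def find_family(model_name):
    """Return the model family that lists model_name, or None if absent."""
    for family, names in SUPPORTED_MODELS.items():
        if model_name in names:
            return family
    return None


def lora_variants(family):
    """Return the model names of a family that end in the '_lora' suffix."""
    return [name for name in SUPPORTED_MODELS.get(family, [])
            if name.endswith("_lora")]


if __name__ == "__main__":
    print(find_family("glm2_6b"))
    print(lora_variants("Qwen"))
```

A check such as find_family(name) is None can be used to reject a model name before it is passed on to a training or inference script.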
