ImageToText

  1. Using the deepface library on a Mac.
    I first wanted to use deepface, a library for face recognition, facial analysis, and related tasks. However, no matter how I installed it (with pip, or by downloading the GitHub repository and installing from source), the line from deepface import DeepFace failed with: cannot import name 'DeepFace' from partially initialized module 'deepface' (most likely due to a circular import). The cause turned out to be a silly one: I had named my own Python file deepface.py, the same name as the library, so Python imported my script instead of the installed package and reported a circular import. I fixed the mistake. What a foolish problem! I then ran into a second problem: the deepface weight files (.h5) are so large that the download times out.
    The ideas I learned:
    ViT, the basic structure of LLMs, LLM pre-training methods, the basic principles of VLMs, CLIP, BLIP, zero-shot, CogVLM, EVA-CLIP, metrics for image captioning
    Pre-training, fine-tuning, ViT, CLIP, prompt engineering
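
The naming clash from item 1 can be checked for before running anything. Below is a minimal sketch, assuming the only thing that matters is whether the script's file name equals the top-level name of the package it imports. The helper name shadows_module and the example file names are hypothetical; they are not part of deepface or the standard library.

```python
from pathlib import Path


def shadows_module(script_path: str, module_name: str) -> bool:
    """Return True if the script's file name collides with the top-level
    package it wants to import (for example deepface.py vs. deepface)."""
    top_level = module_name.split(".")[0]
    return Path(script_path).stem == top_level


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    for script in ("deepface.py", "face_demo.py"):
        if shadows_module(script, "deepface"):
            print(f"{script}: rename this file, it shadows the deepface package")
        else:
            print(f"{script}: no clash with the deepface package")
```

Python puts the script's own directory first on the module search path, which is why a local deepface.py wins over the installed package.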

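transformers.HfArgumentParser: parses command-line arguments. It builds a parser from dataclass types, turning each class's fields into command-line arguments.
mtp: multi-token prediction
vision-tower: the vision encoder
PEFT: Parameter-Efficient Fine-Tuning
lora_enable: enables the low-rank adapter (LoRA)
Prompt learning: the task setup is overly idealized. It tries to tune only a small number of parameters on the input side, so its influence on the deeper layers is quite limited, which caps the final fine-tuning result.
transformers.AutoTokenizer API: converts text input into the input format the model accepts
tune_mm_mlp_adapter:
mm_use_im_start_end: adds special image start/end tokens
deepspeed: ZeRO Stage 3
fsdp: PyTorch's native FSDP (FullyShardedDataParallel)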
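
To make the HfArgumentParser entry more concrete, here is a sketch of the underlying idea using only the standard library: each dataclass field becomes a command-line argument. This is not the transformers implementation (the real class is used as HfArgumentParser(...).parse_args_into_dataclasses()). The TrainingArgs class, its default values, and the helper parse_into_dataclass are assumptions made up for illustration.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class TrainingArgs:
    # Hypothetical fields, named after entries in the glossary above.
    vision_tower: str = "hypothetical-vision-encoder"
    lora_enable: bool = False
    learning_rate: float = 2e-5


def _str2bool(value: str) -> bool:
    # argparse's type=bool would treat any non-empty string as True.
    return value.lower() in ("1", "true", "yes")


def parse_into_dataclass(cls, argv):
    """Build an argparse parser from a dataclass and return a filled instance."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        arg_type = _str2bool if f.type is bool else f.type
        required = f.default is MISSING
        parser.add_argument(
            f"--{f.name}",
            type=arg_type,
            required=required,
            default=None if required else f.default,
        )
    return cls(**vars(parser.parse_args(argv)))


if __name__ == "__main__":
    args = parse_into_dataclass(TrainingArgs, ["--lora_enable", "true"])
    print(args)
```

The sketch assumes every field is annotated with a plain type (str, bool, float); nested or optional types would need extra handling.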

TODO:
Training methods for pre-trained models: ALBERT, LoRA, Prompt Learning, Prefix-tuning, P-tuning
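
As a starting point for the LoRA item, here is a minimal pure-Python sketch of the low-rank update, assuming the usual formulation W' = W + (alpha / r) * B A, where W is d x k, B is d x r, and A is r x k. The function names and the tiny matrices are made up for illustration; real training code would use a tensor library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha, r):
    """Return the adapted weight W + (alpha / r) * B @ A."""
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_trainable_params(d, k, r):
    """Parameters in B (d x r) plus A (r x k), versus d * k for the full matrix."""
    return r * (d + k)


if __name__ == "__main__":
    w = [[1, 2, 3], [4, 5, 6]]   # frozen pre-trained weight, d=2, k=3
    a = [[1, 0, -1]]             # r x k
    b = [[1], [2]]               # d x r
    print(lora_weight(w, a, b, alpha=2, r=1))
    print(lora_trainable_params(4096, 4096, 8), "vs", 4096 * 4096)
```

With B initialized to zero, the adapted weight equals the original W, so fine-tuning starts from the pre-trained model's behavior.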