
Llama

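This article uses the Llama2 model as an example to introduce the principles of large language models, how to fine-tune them, and strong large language models developed in China.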

1. GPT

Notes on the GPT family:

GPT-1

  • introduces the pre-training + fine-tuning method to NLP
  • Pre-training
    • Transformer decoder
  • Fine-tuning (linear+softmax layer)
    • Textual entailment
    • Similarity
    • Question Answering
    • Commonsense Reasoning
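
The "linear + softmax layer" used for fine-tuning can be sketched as follows. This is a minimal illustration in plain Python, assuming the Transformer decoder has already produced a final hidden vector for the input sequence; the function names and the toy numbers are hypothetical and are not taken from the GPT-1 code.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(hidden, class_weights):
    """Linear layer followed by softmax.

    hidden:        final hidden state from the decoder, a list of d floats
    class_weights: one row of d floats per class (the only new parameters
                   added for the downstream task in this sketch)
    """
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in class_weights]
    return softmax(logits)


# Toy example: a 2-dimensional hidden state and 3 classes.
probs = classification_head([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
predicted_class = probs.index(max(probs))
```

Each task in the list above (entailment, similarity, question answering, commonsense reasoning) differs mainly in how the input text is arranged before it reaches the decoder; the head on top stays this simple.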

GPT-2

  • large and diverse dataset
    • article or fiction books
    • Common Crawl (mostly unintelligible)
      • instead of using it directly, keep only outbound links from Reddit posts that received at least 3 karma
    • WebText (the resulting dataset)
  • Zero-shot or few-shot
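
The karma-based filtering behind WebText can be illustrated with a short sketch. The record layout (`url` and `karma` keys) and the function name are assumptions made for this example; the real pipeline also involved content extraction and further cleaning that are not shown here.

```python
def select_links(posts, min_karma=3):
    """Keep the URLs of posts with at least `min_karma`, dropping duplicates.

    posts: iterable of dicts with hypothetical keys "url" and "karma".
    Returns URLs in first-seen order.
    """
    seen = set()
    selected = []
    for post in posts:
        if post["karma"] >= min_karma and post["url"] not in seen:
            seen.add(post["url"])
            selected.append(post["url"])
    return selected


posts = [
    {"url": "https://example.com/a", "karma": 5},
    {"url": "https://example.com/b", "karma": 2},  # below the threshold
    {"url": "https://example.com/a", "karma": 3},  # duplicate URL
    {"url": "https://example.com/c", "karma": 3},
]
links = select_links(posts)
```

The idea is to use human votes as a cheap quality signal rather than training a separate quality classifier.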

GPT-3

  • scaling up language models
  • in-context learning
    • Zero-shot
    • One-shot
    • Few-shot
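
The three in-context learning settings differ only in how many demonstrations are placed in the prompt; no weights are updated. A minimal sketch, assuming a simple "input => output" prompt layout (the layout and the function name are illustrative choices, not a fixed format):

```python
def build_prompt(task_description, demonstrations, query):
    """Assemble a prompt for in-context learning.

    demonstrations: list of (input, output) pairs.
      0 pairs -> zero-shot, 1 pair -> one-shot, several pairs -> few-shot.
    """
    lines = [task_description]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to continue from here
    return "\n".join(lines)


task = "Translate English to French:"
zero_shot = build_prompt(task, [], "cheese")
one_shot = build_prompt(task, [("sea otter", "loutre de mer")], "cheese")
few_shot = build_prompt(
    task,
    [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")],
    "cheese",
)
```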

InstructGPT

  • align large language models with human intent
    • fine-tuning with human feedback
    • ranked dataset used for reinforcement learning
  • the cost function is designed to predict the next word in a sentence, which causes a misalignment between human needs and the model
  • three steps
    • Key points
      • how to label SFT data
      • how to label ranking data
        • humans come up with some prompts, plus prompts from the GPT playground
        • helpfulness for training; truthfulness and harmlessness for evaluation
      • how to train the RM
        • Pairwise ranking loss
          • K = 9 is effective not only for labeling cost but also for evaluation cost
      • how to fine-tune the model with the RM
        • PPO-ptx
          • maximize the reward of the SFT model while keeping its original ability
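
The pairwise ranking loss used to train the RM can be sketched in a few lines. The sketch assumes the labeler's ranking of K responses to one prompt is given as a list of reward-model scores ordered from most to least preferred; the function names are hypothetical. Every pair (preferred, less preferred) contributes -log(sigmoid(r_preferred - r_other)), and the result is averaged over all K·(K-1)/2 pairs.

```python
import math
from itertools import combinations


def neg_log_sigmoid(diff):
    """Numerically stable -log(sigmoid(diff))."""
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def pairwise_ranking_loss(rewards_ranked):
    """Average pairwise loss for one prompt.

    rewards_ranked: reward-model scores, ordered from the response the
    labeler ranked best to the one ranked worst. combinations() keeps
    that order, so the first element of each pair is the preferred one.
    """
    pairs = list(combinations(rewards_ranked, 2))
    total = sum(neg_log_sigmoid(better - worse) for better, worse in pairs)
    return total / len(pairs)


def count_pairs(k):
    """Number of comparisons obtained from one ranking of k responses."""
    return math.comb(k, 2)


pairs_for_k9 = count_pairs(9)  # 36 comparisons from a single ranking
loss_uninformed = pairwise_ranking_loss([0.0] * 9)  # equals log(2)
```

This also suggests why K = 9 is economical: one ranking task yields 36 comparisons, while the reward model only has to score 9 responses for that prompt.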

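Tracing the origins of GPT-3.5's capabilities

2. Fine-tuning large language models

An overview of fine-tuning: https://www.simform.com/blog/completeguide-finetuning-llm/

Fine-tuning techniques for large models: https://www.zhihu.com/question/599396505/answer/3141866148

Code for fine-tuning Llama2 (Google Colab + Hugging Face):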

https://www.youtube.com/watch?v=eeM6V5aPjhk&list=PLpdmBGJ6ELUKpTgL9RVR86cnPXjfscM5d&index=2

3. Acquiring data for fine-tuning

A tutorial for learning web crawling: https://github.com/Kr1s77/Python-crawler-tutorial-starts-from-zero/tree/master
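
As a rough idea of what the text-extraction half of a crawler looks like, here is a minimal sketch using only the Python standard library. It assumes the page has already been downloaded into a string and that the useful text sits in `<p>` tags; the class and function names are hypothetical and are not taken from the tutorial above.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text inside <p> elements of an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self._buffer = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            text = "".join(self._buffer).strip()
            if text:  # skip empty paragraphs
                self.paragraphs.append(text)
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the non-empty paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


page = "<html><body><h1>Title</h1><p>Hello <b>world</b></p><p>Second</p></body></html>"
texts = extract_paragraphs(page)
```

A real crawler would add downloading, rate limiting, and respect for each site's terms and robots.txt before any of the text is used as fine-tuning data.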

4. Other strong open-source models

Llama

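Llama2 is an open-source large language model released by Meta that is free for commercial use, and its release caused a considerable stir in the community.

However, English makes up too large a share of Llama2's training corpus, so the model's performance in Chinese is only average. For that reason, I looked for other open-source models that perform well in Chinese:

Baichuan: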

https://huggingface.co/baichuan-inc/Baichuan-13B-Chat

http://8.130.122.143:36002/

A Chinese version of Llama2:

https://mp.weixin.qq.com/s/_TbBKKRhqMv66GGyyW74XQ