LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Meta info.
- Authors: Zhang, Han
- Paper: https://arxiv.org/abs/2303.16199
TL;DR
LLaMA-Adapter is a method for quickly and efficiently fine-tuning LLaMA into an instruction-following model using self-instruct demonstrations, matching Alpaca's instruction-following performance with far fewer trainable parameters.

Suggestions
Unlike Alpaca, which fine-tunes all of LLaMA's parameters, this approach freezes the 7B LLaMA backbone and adds only 1.2M learnable adapter parameters on top of it.
- a lightweight adaption method that efficiently fine-tunes LLaMA into an instruction-following model using self-instruct demonstrations.
- with only 1.2M learnable parameters, 52K instruction data, and less than one hour of fine-tuning on 8 A100 GPUs (Alpaca took 3 hours), LLaMA-Adapter can effectively inject new instruction-following behavior into LLaMA while preserving its pre-trained knowledge. #pic5
- the model can be generalized to image-conditioned inputs for multi-modal reasoning, achieving competitive performance on the ScienceQA benchmark. (it allows adding image and video tokens as extra inputs! #pic3)
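The "zero-init attention" in the title is the key trick that keeps the frozen model intact at the start of training: attention over the learnable adaption prompts is scaled by a gating factor initialized to zero, so the adapter contributes nothing until the gate is learned. A minimal NumPy sketch of this idea (single head, separate softmax over prompt and token scores, as in the paper; the function and variable names here are illustrative, not the paper's code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def zero_init_attention(q, k, v, pk, pv, gate):
    """Attention over frozen tokens plus gated attention over adaption prompts.

    q, k, v  : queries/keys/values from the frozen LLaMA layer
    pk, pv   : keys/values projected from the learnable adaption prompts
    gate     : learnable scalar, initialized to 0 (hence "zero-init")
    """
    d = q.shape[-1]
    # standard scaled dot-product attention over the original tokens
    s_tok = softmax(q @ k.T / np.sqrt(d))
    # prompt attention, softmaxed independently and scaled by the gate;
    # with gate == 0 this term vanishes entirely
    s_prm = np.tanh(gate) * softmax(q @ pk.T / np.sqrt(d))
    return s_tok @ v + s_prm @ pv

# Sanity check: at initialization (gate = 0) the layer's output is
# identical to the frozen model's plain attention output.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
pk, pv = (rng.normal(size=(2, 8)) for _ in range(2))
out = zero_init_attention(q, k, v, pk, pv, gate=0.0)
base = softmax(q @ k.T / np.sqrt(8)) @ v
print(np.allclose(out, base))  # True
```

This is why training is stable from step one: the adapter is "invisible" initially and its influence grows smoothly as the gate moves away from zero.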
Personal note. They say it's on par with Alpaca, but it seems they never actually ran a direct performance comparison against Alpaca...? The multi-modal part is fascinating...