
Meta info.
  • Authors: Zhiyuan Liang, Dongwen Tang, Yuhao Zhou, Xuanlei Zhao, Mingjia Shi, Wangbo Zhao, Zekai Li, Peihao Wang, Konstantin Schürholt, Damian Borth, Michael M. Bronstein, Yang You, Zhangyang Wang, Kai Wang
  • Paper: https://arxiv.org/pdf/2506.16406
  • Affiliation: NUS, Oxford Univ., UT Austin, Univ. SG
  • Published: June 19, 2025

TL;DR

The paper proposes DnD, a model trained in a supervised fashion (SFT) with prompts as input and LoRA-tuned parameters as output. Once DnD has been trained, it can produce task-specific LoRA weights without any additional training per task.

Background

Even LoRA tuning is still expensive.

  • With PEFT, training low-rank matrices makes it possible to tune an LLM without full fine-tuning (FFT) > per-task fine-tuning is still required
  • Other hyper-networks such as RPG, COND P-DIFF, and ORAL usually rely on simple conditions like a task ID
    • This limits how well they handle the many variations of natural-language prompts or generalize to new tasks

Problem Statement

Can per-task LoRA weights (BA) be generated directly from raw prompts, without labels or fine-tuning?
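
For reference, the "BA" above is the standard LoRA low-rank update. The block below is a notation sketch only: the symbols W_0, d, k, r, G_θ, and p_1..p_N are introduced here for illustration and are not taken from the post.

```latex
% Standard LoRA: the frozen base weight W_0 plus a low-rank update BA
W' = W_0 + \Delta W, \qquad \Delta W = BA, \qquad
B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)

% The question, restated: can a generator G_\theta map N unlabeled prompts
% straight to the per-layer LoRA factors, with no gradient steps on the task?
\{(B_\ell, A_\ell)\}_{\ell} = G_\theta(p_1, \dots, p_N)
```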

Suggestions

DnD

  • Generates LoRA weights directly, conditioned on natural-language prompts (prompt-to-weight)
    • Conventional LoRA tuning: data > gradient > weight
    • Proposed DnD approach: data (prompt) > weight
  • A parameter generator is trained on {prompt, LoRA-tuned weight (ckpt)} pairs with an MSE loss (treated as a regression problem)
    • Collect checkpoints by LoRA-tuning the LLM on a variety of datasets (e.g., ARC, BoolQ, GSM8K)
    • Build the condition prompts to map to each weight: the queries fed as input to each task-specific model (answers are not included)
      • Sample a subset of the input texts used during LoRA tuning (in batches)
      • Random pairing: one ckpt and one prompt are randomly mapped to form a pair
    • text encoder: SBERT generates the prompt embeddings
    • parameter generator: Hyper-Convolutional Decoder
      • input: a batch of condition prompt embeddings, output: weight matrix
      • B = batch size, N = number of prompts, C = embedding dimension, L = token length (per prompt)

          c^l_W = Conv1H(Conv1W(c^{l-1}))  # capture relations within each prompt > then correlations across prompts
          c^l_H = Conv2W(Conv2H(c^{l-1}))  # capture correlations across prompts > then relations within each prompt
          c^l = ConvL((c^l_W + c^l_H + b) / 3)  # generate the LoRA weights separately for each layer
        
        • c^l_W: looks inside each prompt first > then checks patterns across prompts
          • Conv1W: captures token-sequence correlations within a prompt over the L × C dimensions
          • Conv1H: captures correlations across prompts over the N × L dimensions
        • c^l_H: same as c^l_W with the order swapped
        • c^l: averages c^l_W and c^l_H, which carry different information > passes the result through ConvL (generates the LoRA weights separately for each layer)
      • training: MSE loss between the generated weights and the actual weights
      • inference: the generated weights are plugged straight into the LLM to run inference
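
The data pairing, the regression loss, and the "plug the weights in" step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_pairs`, `mse`, `merge_lora`), the flattened-list checkpoint format, and the choice to pair prompts only with a checkpoint from the same dataset are all hypothetical.

```python
import random


def build_pairs(prompts_by_task, ckpts_by_task, batch_size, num_pairs, seed=0):
    """Randomly pair a batch of unlabeled prompts with one LoRA checkpoint.

    Assumption: prompts and checkpoint are drawn from the same task/dataset.
    """
    rng = random.Random(seed)
    tasks = sorted(prompts_by_task)
    pairs = []
    for _ in range(num_pairs):
        task = rng.choice(tasks)
        # Queries only, no answers; sampled without replacement within a batch.
        prompt_batch = rng.sample(prompts_by_task[task], batch_size)
        ckpt = rng.choice(ckpts_by_task[task])  # flattened LoRA weights
        pairs.append((prompt_batch, ckpt))
    return pairs


def mse(pred, target):
    """Mean squared error between generated and collected (flattened) weights."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def merge_lora(W, B, A):
    """Return W + B @ A for nested-list matrices (W: d x k, B: d x r, A: r x k)."""
    d, r, k = len(B), len(A), len(A[0])
    return [
        [W[i][j] + sum(B[i][t] * A[t][j] for t in range(r)) for j in range(k)]
        for i in range(d)
    ]
```

During training, the generator's output for a prompt batch would be compared to the paired checkpoint with `mse`; at inference, the generated factors are merged into the frozen weights as in `merge_lora`, with no gradient step on the new task.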

Effects

DnD comes out ahead on efficiency, on speed, and in comparisons with few-shot tuning and ICL.

  • 12,000× less compute than full fine-tuning (weights are generated in seconds)
  • Outperforms full-shot LoRA tuning; compared with few-shot tuning and ICL as well, it is always ahead below 256 shots
    • On some tasks it even outperforms the original LLM itself
  • On the tasks tested, an average performance gain of 30% over the existing optimized LoRAs was observed on unseen tasks
  • ablation study:
    • DnD also works for larger models: scaling up to a 7B model was confirmed
    • More prompts gave better results, and withholding the answers gave better performance
    • Embedding quality matters; SBERT was reportedly the best in the experiments
    • More DnD training data is reportedly better

Personal note. The gist as I read it: train one good model that takes input text and outputs the LoRA matrices as numbers (= DnD), feed it well-defined inputs, and it produces fairly plausible weight matrices. Naturally this is cheaper than LoRA-tuning for every task, and the performance is comparable or sometimes better. It looks simple to put to use in many places, and applied well to a given domain it should be easy to exploit. One example would be personalization.