Meta info.
  • Authors: Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky
  • Paper: https://arxiv.org/pdf/2407.16833
  • Affiliation: Google DeepMind
  • Published: July 23, 2024

TL;DR

(1) On RAG vs. long-context (LC) LLMs: given sufficient resources, LC LLMs ultimately perform better, but (2) for cost efficiency the paper proposes Self-Route, an approach that routes queries to RAG where possible.

Problem Statement

Identify the conditions under which LC outperforms RAG, and those under which it does not.

  • LC: feed the entire document as input, with no retrieval
  • RAG: feed the top-k (k=5) passages most relevant to the query as input, where a passage is a sub-unit of the document cut every 300 tokens
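
The RAG setup above can be illustrated with a minimal sketch. It assumes whitespace-separated words as a stand-in for tokens, and it swaps in a simple bag-of-words cosine score for the dense retriever used in the paper; the function names (`chunk_document`, `retrieve_top_k`, `build_rag_input`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def chunk_document(text, chunk_size=300):
    """Split a document into passages of at most `chunk_size` tokens.

    Assumption: tokens are approximated by whitespace-separated words.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


def score(query, passage):
    """Bag-of-words cosine similarity (a toy stand-in for a dense retriever)."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    dot = sum(q[w] * p[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def retrieve_top_k(query, passages, k=5):
    """Return the k passages with the highest similarity to the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_rag_input(query, document, k=5, chunk_size=300):
    """Build the RAG prompt: top-k passages followed by the question."""
    passages = chunk_document(document, chunk_size)
    top = retrieve_top_k(query, passages, k)
    return "\n\n".join(top) + "\n\nQuestion: " + query
```

The LC baseline corresponds to skipping `retrieve_top_k` and placing the whole document in front of the question.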

Suggestion

  1. RAG vs. LC (Table 1)
    • target: Gemini-1.5-Pro, GPT-4o, GPT-3.5-Turbo
      • retriever: Contriever
    • dataset: LongBench, ∞Bench
      • metrics: ROUGE, F1-score, accuracy, etc.
    • result: LC LLMs consistently show better long-context understanding than RAG.
      • LC leads by 7.6% with Gemini-1.5-Pro, 13.1% with GPT-4o, and 3.6% with GPT-3.5-Turbo.
      • exception: inputs longer than the model's input limit
        • here RAG performs better with GPT-3.5-Turbo (and RAG remains cheaper in computation)
  2. Self-Route
    • motivation: the result of Suggestion 1 (LC LLMs perform better), plus the observation that RAG and LC LLMs make the same prediction on about 60% of queries (Figure 2)
    • process:
      1. RAG-and-Route Step: run RAG first, giving the model the option to decline with "unanswerable" when the retrieved passages are insufficient for the query.
      2. Long-Context Prediction Step: run the LC LLM only on the queries declined as unanswerable in the first step.
    • result: comes close to LC performance while cutting the number of input tokens by up to 65% relative to LC (Figure 1)
      • e.g., with Gemini-1.5-Pro, performance is nearly on par with LC while using only 38% of the tokens
      • ablation
        • A larger k (number of retrieved passages) generally improves RAG performance, but k=5 is the best choice once cost is taken into account.
        • The main cause of RAG failures is multi-step reasoning; complex or implicit queries are also failure factors (Figure 4).
        • Results are similar with a different retriever (compared against Dragon), which suggests the findings generalize.
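
The two-step Self-Route process described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `retrieve` and `llm` are hypothetical callables supplied by the caller, the prompt wording is an assumption, and input tokens are approximated by whitespace-separated words.

```python
from typing import Callable, Dict, List

UNANSWERABLE = "unanswerable"  # assumed refusal keyword, for illustration only


def count_tokens(text: str) -> int:
    """Rough token count (assumption: whitespace-separated words)."""
    return len(text.split())


def self_route(
    query: str,
    document: str,
    retrieve: Callable[[str, str], List[str]],
    llm: Callable[[str], str],
) -> Dict[str, object]:
    """Answer with RAG when possible; fall back to the full document otherwise."""
    # Step 1 (RAG-and-Route): answer from retrieved passages, or decline.
    passages = retrieve(query, document)
    rag_prompt = (
        "Answer the question using the passages below. "
        f"If they do not contain enough information, reply '{UNANSWERABLE}'.\n\n"
        + "\n\n".join(passages)
        + f"\n\nQuestion: {query}"
    )
    answer = llm(rag_prompt)
    if UNANSWERABLE not in answer.lower():
        return {
            "answer": answer,
            "route": "rag",
            "input_tokens": count_tokens(rag_prompt),
        }

    # Step 2 (Long-Context Prediction): only for queries declined in step 1.
    lc_prompt = f"{document}\n\nQuestion: {query}"
    answer = llm(lc_prompt)
    return {
        "answer": answer,
        "route": "lc",
        # A declined query pays for both the RAG prompt and the LC prompt.
        "input_tokens": count_tokens(rag_prompt) + count_tokens(lc_prompt),
    }
```

The token saving comes from the queries that stop at step 1: they never pay for the full document, while declined queries pay for both prompts, so the overall saving depends on how often the model declines.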