Meta info.

TL;DR

Proposes a mechanism that selectively decides, at each turn of a conversation, whether the response needs to be augmented with external knowledge.

Problem Statement

RAG is known to be effective in knowledge-grounded dialogue systems, i.e. systems that integrate external knowledge (Figure 1), so it is common to run RAG at every turn. In practice, however, not every turn needs retrieval, and appending retrieved results to a turn that does not need them can actually cause hallucination.

Suggestions

Proposes RAGate, which examines at each system-response turn whether the response needs to be augmented with external knowledge (Figure 2).

  • Objective: natural, relevant, and contextually appropriate dialogue system responses
  • Task: identify when a system response should be augmented with external knowledge
  • Gate Mechanism: a binary choice between (dialogue history) and (dialogue history + external knowledge)
    1. RAGate-Prompt: the decision is requested through instructions in the prompt
      • Backbone: Llama-v2-7B, Llama-v2-13B
      • ICL: zero-shot, few-shot
    2. RAGate-PEFT: instruction-input-output tuning with QLoRA
      • Backbone: Llama-v2-7B
      • input features: dialogue history, system response, synthetic responses, named entities, retrieved knowledge
      • output: whether external knowledge is needed
    3. RAGate-MHA: uses a Multi-Head Attention encoder
      • input: token embedding + position encoding + a feed-forward network (FFNN) built from layers such as MHA (with hyperparameter search)
      • output: same as above
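
The binary gate described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `respond`, `keyword_gate`, `toy_retrieve`, and `toy_generate` are hypothetical names, and the keyword rule merely stands in for a trained gate (any of the Prompt, PEFT, or MHA variants would sit in the same place).

```python
from typing import Callable, List

# All names below are hypothetical; they only illustrate the binary gate.
Gate = Callable[[List[str]], bool]               # True means "augment this turn"
Retriever = Callable[[List[str]], List[str]]     # returns ranked knowledge snippets
Generator = Callable[[List[str], List[str]], str]


def respond(history: List[str], gate: Gate, retrieve: Retriever,
            generate: Generator, top_k: int = 1) -> str:
    """Generate a system response, retrieving only when the gate fires."""
    if gate(history):
        snippets = retrieve(history)[:top_k]   # dialogue history + external knowledge
    else:
        snippets = []                          # dialogue history only
    return generate(history, snippets)


def keyword_gate(history: List[str]) -> bool:
    # Toy rule standing in for a trained gate; RAGate learns this decision.
    last_turn = history[-1].lower()
    return any(word in last_turn for word in ("hotel", "flight", "travel"))


def toy_retrieve(history: List[str]) -> List[str]:
    # Placeholder retriever; a real one would rank snippets (e.g. TF-IDF or a BERT ranker).
    return ["Snippet A about hotels.", "Snippet B about hotels."]


def toy_generate(history: List[str], snippets: List[str]) -> str:
    # Placeholder generator that only reports which input form it received.
    if snippets:
        return "[with knowledge] " + " ".join(snippets)
    return "[history only]"


print(respond(["Can you find me a hotel?"], keyword_gate, toy_retrieve, toy_generate))
print(respond(["Thanks, that is all."], keyword_gate, toy_retrieve, toy_generate))
```

Because the retriever is called only inside the `if` branch, turns that the gate rejects pay no retrieval cost and see no retrieved text, which is the point of gating.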

Effects

  • Experiment
    • task: Binary Classification
      • metric: Precision, Recall, F1-Score, AUC, FDR
      • dataset: KETOD
        • about 5K dialogues with 52K turns
        • includes 33K external knowledge snippets (about 12.1% of turns)
      • baseline: No-Aug, Aug-all
    • retriever: TF-IDF, BERT-Ranker
      • metric: Recall@1 & Recall@3
    • generator: GPT-2
      • metric: BLEU, ROUGE-1/2/L, & BERTScore
  • Result
    • RAGate (Table 2)
      • RAGate-Prompt works to some extent, but the other two fine-tuned approaches predict better
      • With dialogue-history-only input in particular, RAGate-PEFT improves substantially over RAGate-Prompt
        • Adding input features generally improves performance, though recall sometimes drops.
      • RAGate-MHA shows the best recall and is especially good at identifying turns where RAG is needed, while RAGate-PEFT has relatively better precision.
        • Perhaps there is a middle ground?
      • ablation
        • More RAG is attempted at the beginning of a dialogue
        • RAG early in a dialogue seems more valuable, since it also makes later turns more natural
        • More RAG in domains such as travel, hotels, and flights: RAGate-MHA behaves more like humans
    • Generation (Table 3)
      • RAGate performs better than No-Aug and Aug-all
        • In particular, RAGate-MHA nearly matches Aug-all's performance with far fewer augmentations
      • Using only the most relevant snippet (chosen by BERT-Ranker) maintains and improves confidence
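
For reference, the gate and retriever metrics listed under Experiment can be computed as in the sketch below. It assumes gate labels are encoded as 1 (augment) and 0 (do not augment), and that each query has a single gold snippet for Recall@k; `gate_metrics` and `recall_at_k` are hypothetical helper names and the data is made up for illustration. AUC is left out because it needs predicted scores rather than hard labels.

```python
from typing import Dict, List, Sequence


def gate_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1, and false discovery rate for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # FDR is the share of predicted augmentations that were unnecessary (1 - precision).
    fdr = fp / (tp + fp) if tp + fp else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fdr": fdr}


def recall_at_k(ranked: List[List[str]], gold: List[str], k: int) -> float:
    """Fraction of queries whose gold snippet appears in the top-k ranked results."""
    hits = sum(1 for results, g in zip(ranked, gold) if g in results[:k])
    return hits / len(gold) if gold else 0.0


# Made-up labels: 1 = turn should be augmented, 0 = it should not.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(gate_metrics(y_true, y_pred))

# Made-up rankings for three queries, each with gold snippet "a".
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "a"]
print(recall_at_k(ranked, gold, 1), recall_at_k(ranked, gold, 3))
```

A gate with high recall but low precision has a high FDR, which is the tension noted above between RAGate-MHA and RAGate-PEFT.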