
Meta info.

TL;DR

Proposes a framework that generates diverse documents and builds QA pairs from them, in order to evaluate how well LLMs use knowledge across a range of scenarios.

Problem Statement

Existing RAG benchmarks mostly evaluate general knowledge; evaluation on specialized-domain data is not well covered.

Suggestions

  1. (stage 1) Schema Summary: Builds, via inductive reasoning, a schema that represents the essential factual information in domain-specific documents. Starting from a seed text set, it encapsulates key elements such as organization, type, event, data, place, and so on.
  2. (stage 2) Document Generation: Generates factual, consistent text derived from that schema, combining rule-based and LLM-based generation.
  3. (stage 3) QRA Generation: Generates Query - Reference - Answer triples based on the schema and the documents.
    • Query types: Factuality, multi-hop, summarization, multi-doc, unanswerable question, …
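
The three stages above can be pictured as a small data pipeline. The following is only a minimal sketch: the class names (`Schema`, `QRA`), the helper functions, and the sample values are all hypothetical, and the LLM-based parts of each stage are replaced with simple rule-based stand-ins.

```python
from dataclasses import dataclass


# Hypothetical containers; the field names are illustrative, not taken from the paper.
@dataclass
class Schema:
    domain: str
    fields: dict[str, str]  # element name -> value, e.g. "organization" -> "Acme Corp"


@dataclass
class QRA:
    query: str
    references: list[str]  # text from the document that supports the answer
    answer: str
    query_type: str


def generate_document(schema: Schema) -> str:
    """Rule-based stand-in for stage 2: render the schema fields as text."""
    facts = "; ".join(f"{name}: {value}" for name, value in schema.fields.items())
    return f"[{schema.domain}] {facts}."


def generate_qra(schema: Schema, document: str) -> list[QRA]:
    """Stand-in for stage 3: one factual QRA triple per schema field."""
    return [
        QRA(
            query=f"What is the {name}?",
            references=[document],
            answer=value,
            query_type="factual",
        )
        for name, value in schema.fields.items()
    ]


# Made-up example values, for illustration only.
schema = Schema(domain="finance", fields={"organization": "Acme Corp", "place": "Seoul"})
doc = generate_document(schema)
triples = generate_qra(schema, doc)
```

Because every answer is derived from the schema rather than from free-form generation, each triple can be traced back to a known fact, which is what makes the generated documents usable as ground truth.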

Effects

  • Evaluation metrics
    • Retrieval: EIR, Recall
    • Generation: Completeness, Hallucination, Irrelevancy
  • Results:
    • Uses the DRAGONBall dataset built with the framework above: finance, law, medicine, and other domains, in Chinese and English
    • Compares human evaluation against machine-generated evaluation
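
To make the metric names above more concrete, here is a rough sketch of how Recall and the three generation metrics could be computed. This post does not give the definitions, so the code assumes a key-point reading: each ground-truth key point is labeled as covered, contradicted, or missing in the model's answer, and the three generation scores are the fractions of each label. The function names and label strings are hypothetical, and EIR is left out because its definition is not stated here.

```python
def retrieval_recall(retrieved: list[str], references: list[str]) -> float:
    """Fraction of reference passages found inside at least one retrieved chunk.

    Assumption: a reference counts as retrieved if it appears verbatim in a chunk.
    """
    if not references:
        return 0.0
    hits = sum(1 for ref in references if any(ref in chunk for chunk in retrieved))
    return hits / len(references)


def generation_scores(labels: list[str]) -> dict[str, float]:
    """Aggregate per-key-point labels into three scores.

    Assumption: `labels` holds one label per ground-truth key point, each being
    'covered', 'contradicted', or 'missing'.
    """
    n = len(labels)
    if n == 0:
        return {"completeness": 0.0, "hallucination": 0.0, "irrelevancy": 0.0}
    return {
        "completeness": labels.count("covered") / n,
        "hallucination": labels.count("contradicted") / n,
        "irrelevancy": labels.count("missing") / n,
    }


recall = retrieval_recall(["a b c", "d"], ["b", "x"])
scores = generation_scores(["covered", "covered", "contradicted", "missing"])
```

Under this reading the three generation scores always sum to 1, so a drop in completeness shows up as either hallucination or irrelevancy.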

Personal note. This looks like a useful reference when building specialized-domain datasets with an LLM for RAG or other context-augmented tasks. The schema construction also feels a bit like a generalized Wikipedia infobox 🤔