Meta info.

TL;DR

The paper proposes CAMeL, a dataset of entities that contrast Arab and Western cultures paired with naturally occurring prompts. Case studies run with it raise the concern that LLMs are biased toward Western-culture entities.

Problem Statement

Even multilingual LMs do not account for cultural nuance, so in non-Western settings they produce generations that are inappropriate for the cultural context (Figure 1).

  • Dataset Construction
    • Contrasting entities (20K) (Figures 3, 4)
      • Types: person names, food/dishes, beverages, clothing items, locations (cities), literary authors, religious places of worship, sports clubs
      • Pattern-based entity extraction from Wikidata + CommonCrawl
        • Filtered manually
    • Naturally occurring prompts (600) (Table 1)
      • Collected from X via keyword search
      • Split into culturally dependent prompts (CAMeL-Co) and culturally independent prompts (CAMeL-Ag)
      • Human filtering + anonymization
  • Experiment
    • Backbone: 16 LMs
      • Arabic-specific models include AraBERT, ARBERT, CAMeLBERT, MARBERT, and AraGPT2; multilingual models include mBERT, XLM-R, BLOOM, GPT-3.5, and GPT-4
    • Evaluation
      • Intrinsic: embedding distance, probability, etc.
      • Extrinsic: sentiment analysis, NER, story generation, etc.
        • The cloze-style text infilling task confirms that the models do not properly understand Arab cultural context

Effect

  • Story generation: generate stories about characters with Arab and Western names (Table 2)
    • Compute an odds ratio (OR) for each adjective to compare how often it appears in stories about Arab-named versus Western-named characters; a high OR is taken to mean the adjective is strongly associated with Western names (Figure 5)
    • Western-leaning: "wealthy," "popular," "intelligent"
    • Arab-leaning: "poor," "modest," "traditional"
  • NER: fine-tune on ANERCorp (an Arabic NER dataset), or use ICL (5-shot) for GPT-style models
    • Arab entities are tagged worse, by up to 20 points or more (locations) (Figure 6)
  • Sentiment analysis:
    • False-positive rates showed no clear trend, but the models leaned relatively more positive for Western entities (Figure 6)
  • Text (entity) infilling (Figure 7)
    • Metric: CBS (Cultural Bias Score), a Western-bias score that measures how strongly the LM prefers Western over Arab entities when filling the mask token in a prompt
    • Prompt adaptation, to reduce the Western bias:
      • Add a cultural token: [Arab] as a special token
      • Few-shot demonstrations are handled so that they do not include Arab entities
    • Even with the special token, the models stay biased toward Western entities
  • Analysis of six Arabic pretraining corpora: train a 4-gram LM with OpenGRM (a frequency-based LM, so the comparison is intuitive) → measure CBS on CAMeL-Co (Figure 8)
    • Even Arabic-language sources often cover mostly Western news, so commonly used sources such as Wikipedia, taken as-is without adjustment, are probably not suitable for training a culturally unbiased LM.
    • Twitter data got the best CBS of the set, for what it's worth
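
The two measurements in the list above, the adjective odds ratio and CBS, can be sketched roughly as follows. This is a minimal sketch under my own assumptions, not the paper's implementation: the odds ratio uses one common definition (odds in Western-name stories divided by odds in Arab-name stories), the `smoothing` term is added only to avoid division by zero, CBS is taken to be the percentage of Arab/Western entity pairings in which the model scores the Western entity higher, and `score` is a hypothetical callable standing in for whatever the model provides (masked-LM probability, n-gram probability, and so on). All toy data is invented.

```python
from collections import Counter


def adjective_odds_ratio(western_adjs, arab_adjs, adjective, smoothing=0.5):
    """Odds of `adjective` in Western-name stories over its odds in Arab-name stories.

    Values above 1 lean Western, values below 1 lean Arab. `smoothing` is an
    assumption added here to avoid division by zero; it is not from the notes.
    """
    w, a = Counter(western_adjs), Counter(arab_adjs)
    w_odds = (w[adjective] + smoothing) / (sum(w.values()) - w[adjective] + smoothing)
    a_odds = (a[adjective] + smoothing) / (sum(a.values()) - a[adjective] + smoothing)
    return w_odds / a_odds


def cultural_bias_score(prompts, arab_entities, western_entities, score):
    """Percentage of (prompt, Arab, Western) triples where the Western entity scores higher.

    `score(prompt, entity)` is a hypothetical callable: any model score for
    `entity` filling the [MASK] slot of `prompt`.
    """
    wins = total = 0
    for prompt in prompts:
        for arab in arab_entities:
            for western in western_entities:
                total += 1
                wins += score(prompt, western) > score(prompt, arab)
    return 100.0 * wins / total


# Toy data, invented purely for illustration.
western_story_adjs = ["wealthy", "wealthy", "popular", "kind"]
arab_story_adjs = ["poor", "poor", "modest", "kind"]
or_wealthy = adjective_odds_ratio(western_story_adjs, arab_story_adjs, "wealthy")
or_poor = adjective_odds_ratio(western_story_adjs, arab_story_adjs, "poor")

# Stand-in for a frequency-based LM: raw counts per entity, ignoring the prompt.
toy_counts = {"Paris": 5, "London": 3, "Cairo": 4, "Amman": 1}


def toy_score(prompt, entity):
    return toy_counts.get(entity, 0)


cbs = cultural_bias_score(
    ["I live in [MASK]."], ["Cairo", "Amman"], ["Paris", "London"], toy_score
)
print(or_wealthy, or_poor, cbs)
```

With the toy counts, the Western entity wins three of the four pairings, so CBS comes out at 75; under this definition 50 would mean no preference either way.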

Personal note. The comparison stays within a single language. The cross-lingual setting is already enough of a problem on its own, so maybe that is to be expected? Even so, I have to concede that Arabic, which maps onto a cultural "sphere", is a more suitable contrast group than, say, Korean. Taking entities as the basis for the contrast also looks like a useful, sensible approach.

To show that the bias exists, demonstrating poor fill-in-the-blank performance is easier, more intuitive, and more convincing than the usual QA-based checks. Covering both NLU and NLG tasks is also a sound choice. Would converting an existing QA task into cloze style be hard? Maybe relatively easy, given that the entities extracted in this paper are Wikidata-based anyway.

Isn't it fairly obvious that there would be a Western-culture bias, in entities or anything else? The conclusion is also simply that resources should be built with cultural bias in mind. So how did the authors package an obvious result and an obvious claim? (A bit of a side point:) studying lingual bias while ignoring cultural context would make for overly narrow research.

A solid study, consistent and free of excess in driving its claim home: for a study that sets out to capture a phenomenon, dataset construction included, structuring it along this flow seems sound.

The experiments are rich: thorough explanations, plus an appendix. The chosen tasks are appropriate, and the existing metrics and the paper's own metric give consistent results. The paper does not stop at experiments and ablations; adding the corpus analysis to lend weight to an obvious claim makes the flow feel natural.