
Meta info.

TL; DR

LLM์€ persona์˜ sensitivity์— ๋งค์šฐ ๋ฏผ๊ฐํ•˜์—ฌ ๋ถ€์ •์  persona๋Š” ์ผ๊ด€์„ฑ ์—†๋Š” ๋Œ€ํ™”๋ฅผ, ๊ธ์ •์  persona๋Š” ๋” ์›ํ™œํ•˜๊ณ  ์งˆ ๋†’์€ ์ƒํ˜ธ์ž‘์šฉ์„ ํ•˜๊ธฐ ๋–„๋ฌธ์—, robustness ๊ฐœ์„ ์„ ์œ„ํ•ด polarity-aware ์ƒ์„ฑ ์ „๋žต ์ œ์•ˆ


Background

For personalized dialogue, attempts to integrate LLMs have advanced by injecting the persona at the prompt level. Existing work has found that

  • LLMs are sensitive to contextual sentiment
  • or has focused on how the persona should be used

Problem Statement

persona์˜ sentiment polarity์— ๋Œ€ํ•ด์„œ๋Š” ์–ด๋– ํ•œ๊ฐ€?

  • RQ1ย LLM์€ persona profile์˜ sentiment polarity(๊ธ์ •/๋ถ€์ •/์ค‘๋ฆฝ)์— ๋ฏผ๊ฐํ• ๊นŒ?
  • RQ2ย ์‹ค์ œ๋กœ ๋ฏผ๊ฐํ•˜๋‹ค๋ฉด ์–ด๋–ป๊ฒŒ robustํ•˜๊ฒŒ ํ•  ์ˆ˜ ์žˆ์„๊นŒ?

Suggestions

  • Large-Scale Polarity-Aware Dialogue Analysis: label ConvAI2 persona sentences as positive/negative/neutral with DistilBERT (a sentiment classifier)
  • Persona-Aware Dialogue Generation Framework
    • Turn-Based Generation: generate each profile's utterances independently, alternating one turn at a time
      • Stays faithful to each profile while mitigating the carry-over of negative sentiment
        • (Prior work) Both profiles were fed to the LLM at once to generate the whole dialogue, which reportedly let the negative side dominate or let the speakers' tones influence each other
      • Uses LLaMA-3.2-3B and Qwen-2.5-3B
    • Profile Ordering: when building the input, sort the persona sentences by polarity confidence
      • Since LLMs tend to be influenced more by information that appears earlier, negative or neutral profiles go first and positive profiles after them
      • This lets profiles whose sentiment is weak or hard to express be reflected better in the early context (positive ones are generated well anyway)
      • In the end the paper goes with center-out score ascending, i.e., starting from the most neutral sentence and weighting toward negative
    • Sentiment-Aware Prompting (SAP): add an instruction so that profiles with weak sentiment, such as negative or neutral personas, are handled well
      • Please ensure that each user's persona, especially negative or neutral personas, is well integrated into the dialogue...
      • Simply adding this instruction improves both coherence and consistency
    • Proposed Perplexity Gap (P gap) metric: using GPT2-large, measure how much the PPL of the dialogue changes when the personas are given as a condition
      • $\text{P}_\text{gap} = \text{Perplexity}(D) - \text{Perplexity}(D \mid U_1, U_2)$
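
To make the P gap definition concrete, here is a minimal sketch of the arithmetic. It assumes the scoring model (GPT2-large in the paper) has already produced per-token log-probabilities for the dialogue, once without the personas and once conditioned on them. The function names and the numbers are hypothetical and only illustrate the formula; they are not the paper's implementation or results.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def perplexity_gap(logprobs_unconditioned, logprobs_conditioned):
    """P gap = PPL of the dialogue alone minus PPL of the dialogue given the personas."""
    return perplexity(logprobs_unconditioned) - perplexity(logprobs_conditioned)

# Hypothetical log-probs for a 4-token dialogue (made-up values, not real model output).
without_persona = [math.log(0.1)] * 4  # every token gets probability 0.1 -> PPL about 10
with_persona = [math.log(0.2)] * 4     # every token gets probability 0.2 -> PPL about 5

gap = perplexity_gap(without_persona, with_persona)
print(gap)  # roughly 5.0, up to floating-point rounding
```

A positive gap means the dialogue becomes more predictable once the personas are supplied, which is the sense in which the metric can be read as a consistency signal.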

Effects

  • Evaluation setup:
    • metrics:
      • Consistency: C score, Contradiction Ratio (Contd.), Perplexity Gap (P gap), G-eval
      • Coherence: Perplexity, Q-DCE, PairEval, G-eval
    • baselines: LLaMA-3.2-3B, Qwen-2.5-7B, Ministral-8B, Gemma-2-9B
  • RQ1 Are LLMs sensitive to the sentiment polarity (positive/negative/neutral) of a persona profile? > Tab 1
    • negative profile: consistency is high, but contradictions are frequent and coherence drops
    • positive profile: the model takes up the persona selectively, so contradictions are also fewer
    • neutral (mixed) profile: the more in-between the profile, the lower the dialogue quality, reportedly
    • On polarity level
      • Performance follows a U-shaped curve over confidence, i.e., the more extreme the sentiment, the better the dialogue quality. Fig 3
  • RQ2 If they really are sensitive, how can they be made robust? Coherence is high > Tab 3 Tab 4
    • Generating turn by turn as proposed, while ordering the profile and adding the instruction, gives the best results
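
The best-performing combination (turn-based generation, profile ordering, and the SAP instruction) can be pictured as a prompt builder for a single speaker's turn. This is only a sketch under stated assumptions: the `PersonaSentence` type, the prompt layout, the example sentences, and above all the `center_out_key` scoring (neutral at the center, distance equal to classifier confidence, negative before positive on ties) are hypothetical stand-ins, since the note does not give the paper's exact definitions.

```python
from dataclasses import dataclass

# Quoted from the note above; the trailing "..." is the note's own elision.
SAP_INSTRUCTION = (
    "Please ensure that each user's persona, especially negative or neutral "
    "personas, is well integrated into the dialogue..."
)

@dataclass
class PersonaSentence:
    text: str
    label: str         # "positive", "negative", or "neutral"
    confidence: float  # classifier confidence in [0, 1]

def center_out_key(sentence):
    """Assumed ordering key: most neutral first, negative wins ties over positive."""
    distance = 0.0 if sentence.label == "neutral" else sentence.confidence
    negative_first = 0 if sentence.label == "negative" else 1
    return (distance, negative_first)

def order_profile(sentences):
    """Sort persona sentences in ascending center-out order (assumed definition)."""
    return sorted(sentences, key=center_out_key)

def build_turn_prompt(speaker, profile, history, use_sap=True):
    """Build the prompt for ONE speaker's next turn, using only that speaker's profile."""
    lines = [f"{speaker}'s persona:"]
    lines += [f"- {s.text}" for s in order_profile(profile)]
    if use_sap:
        lines.append(SAP_INSTRUCTION)
    lines.append("Dialogue so far:")
    lines += history
    lines.append(f"{speaker}:")
    return "\n".join(lines)

# Hypothetical persona sentences, invented for illustration only.
profile = [
    PersonaSentence("I love hiking.", "positive", 0.98),
    PersonaSentence("I hate my job.", "negative", 0.95),
    PersonaSentence("I have two cats.", "neutral", 0.60),
]
prompt = build_turn_prompt("User 1", profile, ["User 2: Hi, how are you?"])
print(prompt)
```

Calling `build_turn_prompt` alternately for each speaker, each time with only that speaker's profile and the shared history, keeps the other persona out of the context, which is the point of generating turn by turn.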

Personal note. Among the papers from the lab of Prof. Hwanhee Lee (이환희) at CMU, with whom we have an exchange planned in August, one on dialogue personalization caught my eye, so I read it. Because the proposal works at the prompt level it is very simple, but that simplicity seems to underline how practical it is, and I think it targets a niche rather well.