Meta info.

TL;DR

Proposes DualLoRA, which mitigates the tendency of multi-layer transformer models to forget the prompt in their later layers.

[Figures: pic1, pic2, pic3, pic4]

Problem Statement

In DST, unseen domains are hard to handle without cost-intensive data labeling or tuning.

Suggestions

Proposes DualLoRA (pic1, pic3) → keeps the prompt's influence in a separate path while applying it to every layer of the model

  • LoRA for processing the original dialogue context: takes the prompt and the context together
  • LoRA for prompt optimization: a form of prompt tuning. $B_p$ is initialized to zero to dampen the initial noise introduced by the slot prompt. ($A_p$ is Gaussian-initialized.)
  • pic2: slot embeddings within the same domain barely differ, and the model's pretrained knowledge conflicts with the slot information, especially early in training (so the model may learn the prompt incorrectly)
  • backbone: T5-style (the adapters are attached to each attention layer)
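
To make the zero initialization of $B_p$ concrete, here is a minimal pure-Python sketch of one projection with two low-rank branches. It is not the paper's implementation. The class names `LoRABranch` and `DualLoRALinear`, the sizes, and above all the way the prompt branch is merged (mean-pooled over prompt tokens and added to every position) are assumptions made for illustration. The context branch is also assumed to use the standard LoRA initialization (Gaussian A, zero B), which the notes above do not state.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gaussian(rows, cols, rng, std=0.02):
    """Matrix with entries drawn from N(0, std^2)."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

class LoRABranch:
    """Low-rank update x -> (x @ A) @ B, with A Gaussian and B zero at init."""
    def __init__(self, d_in, d_out, rank, rng):
        self.A = gaussian(d_in, rank, rng)
        self.B = zeros(rank, d_out)

    def __call__(self, x):
        return matmul(matmul(x, self.A), self.B)

class DualLoRALinear:
    """Frozen projection W plus a context branch and a prompt branch (hypothetical)."""
    def __init__(self, d_model, rank, rng):
        # Stands in for a frozen pretrained weight.
        self.W = gaussian(d_model, d_model, rng, std=0.5)
        self.context_lora = LoRABranch(d_model, d_model, rank, rng)  # A_c, B_c
        self.prompt_lora = LoRABranch(d_model, d_model, rank, rng)   # A_p, B_p

    def __call__(self, x, prompt):
        base = matmul(x, self.W)
        ctx = self.context_lora(x)
        # Assumption: mean-pool the prompt branch over the prompt tokens
        # and add the pooled vector to every position.
        p_out = self.prompt_lora(prompt)
        pooled = [sum(col) / len(p_out) for col in zip(*p_out)]
        return [[b + c + p for b, c, p in zip(brow, crow, pooled)]
                for brow, crow in zip(base, ctx)]

rng = random.Random(0)
layer = DualLoRALinear(d_model=4, rank=2, rng=rng)
x = gaussian(3, 4, rng, std=1.0)       # 3 tokens of prompt + dialogue context
prompt = gaussian(2, 4, rng, std=1.0)  # 2 slot-prompt tokens
out_at_init = layer(x, prompt)
frozen_only = matmul(x, layer.W)
```

Because both B matrices start at zero, `out_at_init` equals `frozen_only`: the slot prompt cannot disturb the pretrained behavior at step zero, and its influence only grows as $B_p$ is trained.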

Effects

SOTA on MultiWOZ and SGD (Schema-Guided Dialogue) (pic4)

Personal note. This may not be a strikingly novel idea, but it looks like an especially useful approach given that the ToD scene, DST included, tends to prefer models that are not very large.