Meta info.
  • Authors: Pei Zhou, Jay Pujara, Xiang Ren, Xinyun Chen, Heng-Tze Cheng, Quoc V. Le, Ed H. Chi, Denny Zhou, Swaroop Mishra, Huaixiu Steven Zheng
  • Paper: https://arxiv.org/pdf/2402.03620.pdf
  • Affiliation: Google DeepMind, USC
  • Published: February 6, 2024

TL;DR

๋ธ์ด ์—ฌ๋Ÿฌ reasoning techniques(CoT, critical thinking, ...) ์ค‘์—์„œ ํ•˜๋‚˜๋ฅผ ์Šค์Šค๋กœ ์„ ํƒํ•˜์—ฌ task๋ณ„๋กœ ์ ํ•ฉํ•œ ์ถ”๋ก  ์ „๋žต์„ ๊ตฌ์„ฑํ•˜๋„๋ก ํ•˜๋Š” ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ. BBH์—์„œ ๋‹จ์ˆœ CoT๋ณด๋‹ค ์„ฑ๋Šฅ์ด ์ข‹๊ณ  CoT Self-consistency๋ณด๋‹ค๋„ ์ถ”๋ก  ์—ฐ์‚ฐ์ด 10~40x ๋œ ๋“ ๋‹ค๊ณ . sLLM์—์„œ ๋” ์ž˜๋œ๋‹ค๊ณ  ์–ธ๊ธ‰.

Suggestions

  • stage1: select a reasoning structure at the task level and generate the overall key-value format
    • 3 meta-prompts
      • select: pick the proper framework (CoT, critical thinking, …)
      • adapt: rephrase the selection for the specific task
      • implement: turn it into an actionable structure whose values can be filled in
  • stage2: solve each instance by filling in the values of that format
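
The two stages above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes the model is available as a plain prompt-to-text function; the names `LLM`, `discover_structure`, and `solve_instance` are hypothetical, and the prompt wording is a paraphrase rather than the paper's actual meta-prompts.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a completion.
LLM = Callable[[str], str]


def discover_structure(llm: LLM, task_examples: List[str], modules: List[str]) -> str:
    """Stage 1 (run once per task): select -> adapt -> implement."""
    examples = "\n".join(task_examples)
    module_list = "\n".join(f"- {m}" for m in modules)

    # select: choose the reasoning modules that fit the task
    selected = llm(
        "SELECT the reasoning modules that are useful for this task.\n"
        f"Modules:\n{module_list}\nTask examples:\n{examples}"
    )
    # adapt: rephrase the chosen modules so they are task-specific
    adapted = llm(
        "ADAPT the selected modules by rephrasing them for this specific task.\n"
        f"Selected modules:\n{selected}\nTask examples:\n{examples}"
    )
    # implement: produce a key-value plan whose values are still empty
    structure = llm(
        "IMPLEMENT the adapted modules as a step-by-step reasoning plan in "
        "key-value format, leaving every value empty.\n"
        f"Adapted modules:\n{adapted}\nTask examples:\n{examples}"
    )
    return structure


def solve_instance(llm: LLM, structure: str, instance: str) -> str:
    """Stage 2 (run once per instance): fill in the values of the structure."""
    return llm(
        "Follow the reasoning structure below and fill in each value to solve "
        f"the task.\nStructure:\n{structure}\nTask:\n{instance}"
    )


if __name__ == "__main__":
    # Stub model so the sketch runs offline; it just echoes the first prompt line.
    def echo_llm(prompt: str) -> str:
        return prompt.splitlines()[0]

    plan = discover_structure(
        echo_llm,
        task_examples=["Example question 1", "Example question 2"],
        modules=["Let's think step by step", "Use critical thinking"],
    )
    print(solve_instance(echo_llm, plan, "A new question from the same task"))
```

In this sketch stage 1 costs three model calls per task and stage 2 costs one call per instance, so the structure is discovered once and reused across all instances of the task.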

Personal note. The difference from Orca's training approach seems to be that no pretraining or labels are provided. (Orca had the model choose which instruction to give when instruction-tuning a small model, whereas Self-Discover takes a similar approach through ICL.)