2 minute read

Meta info.

TL; DR

๊ฒฝ๋Ÿ‰ memorizer์™€ full-page store + deep research๋กœ Just-In-Time memory ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ, ๊ธฐ์กด ์‚ฌ์ „์••์ถ• (static) ๋ฉ”๋ชจ๋ฆฌ ๋Œ€๋น„ ๋‹ค์–‘ํ•œ long-term + multi-hop ์„ฑ๋Šฅ ํ–ฅ์ƒ ๋‹ฌ์„ฑ

image 1 image 2 image 3 image 4 image 5 image 6 image 7 image

Background

  • session history๋ฅผ ๊ธธ๊ฒŒ ์ถ•์ ํ•˜๋”๋ผ๋„ context window ํ•œ๊ณ„, ๋น„์šฉ๋ฌธ์ œ, context rot๋“ฑ์—์„œ ๊ฐˆ๋“ฑ
  • AOT(Ahead-of-Time) memory: ๊ธฐ์กด ๋ฉ”๋ชจ๋ฆฌ์‹œ์Šคํ…œ์€ ์ „์ฒด ํžˆ์Šคํ† ๋ฆฌ๋ฅผ ์˜คํ”„๋ผ์ธ์—์„œ ์••์ถ• > test-time์—์„œ ์ด ์••์ถ• ๋ฉ”๋ชจ๋ฆฌ๋งŒ ํ™œ์šฉ
    • A-mem, Mem0, MemoryOS, LightMem ๋“ฑ

Problem States

์ „์ฒด ๋‚ด์šฉ์„ ์ €์žฅํ•˜๋˜, ์‹ค์ œ ๋‹ต๋ณ€์‹œ์—๋Š” retrieval + reasoning์œผ๋กœ ์ตœ์ ์˜ ์ปจํ…์ŠคํŠธ๋ฅผ ์ฆ‰์„์œผ๋กœ(just-in-time) ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ์„๊นŒ?

  • AOT๋Š” ์ •๋ณด์†์‹ค์ด ํฌ๋‹ค : test ์‹œ์ ์— ํ•„์š”ํ•œ ์ž‘์€ ์ •๋ณด๋“ค์ด ์ด๋ฏธ ์š”์•ฝ๊ณผ์ •์—์„œ ์†Œ์‹ค
  • ์ •์  ๊ตฌ์กฐ๋ผ ์˜ˆ์ธก ๋ถˆ๊ฐ€๋Šฅํ•œ ์š”์ฒญ์— ๋Œ€์‘ ์–ด๋ ค์›€: ๋™์  ์ •๋ณด ์กฐํ•ฉ ๋ฐ ํƒ์ƒ‰์ด ํ•„์š”ํ•  ๋•Œ ๋Œ€์‘ ๋ถˆ๊ฐ€
  • heuristc ์˜์กด๋„๊ฐ€ ๋†’๊ณ  generalization ๋ถ€์กฑ: chunk ํฌ๊ธฐ, summary ๋ฐฉ์‹, category ๋“ฑ์˜ ๊ตฌ์กฐ๊ฐ€ ์—ฐ๊ตฌ์ž ์„ค๊ณ„์— ์˜์กด

Suggestions

General Agentic Memory (GAM)

  • memorizer + researcher์˜ 2๊ฐœ agent ๊ตฌ์„ฑ
  • memory system: (def) task์™€ history๋ฅผ ๋ฐ›์•„ downstream ์„ฑ๋Šฅ์€ ์ตœ๋Œ€๋กœ ์œ ์ง€ํ•˜๋ฉด์„œ ๊ฐ€์žฅ ์งง์€ context c๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ์‹œ์Šคํ…œ
  • memorizer: session๋งˆ๋‹ค page(s) ์ƒ์„ฑ. ๊ฐ page์—๋Š” header๋ผ๋Š” ์š”์•ฝ(๊ฒ€์ƒ‰ํ’ˆ์งˆ ํ–ฅ์ƒ๋ชฉ์ )์ด๋ž‘ session์˜ ๋ณธ๋ฌธ ์ „์ฒด๋ฅผ ๋‹ด๊ณ  ํŽ˜์ด์ง€๋Š” ์ตœ๋Œ€ 2048-token์œผ๋กœ ๊ตฌ์„ฑ. session ์ „์ฒด์— ๋Œ€ํ•ด์„œ๋Š” memo (์ด ์—ญ์‹œ ์š”์•ฝ) ์ƒ์„ฑ
  • researcher: planning > searching + integration > reflection
    • planning: ์–ด๋–ค retriever(BGE-M3 dense, BM25, Page ID ๊ฒ€์ƒ‰)๋ฅผ ์–ด๋–ค ๊ฒ€์ƒ‰์–ด(query)๋กœ ๊ฒ€์ƒ‰ํ• ์ง€ ๊ณ„ํš
      • Page ID ๊ฒ€์ƒ‰: ์ œ์•ˆ ๋ฐฉ์‹์œผ๋กœ, ์ง์ ‘์ ์œผ๋กœ session id๋ฅผ memo์—์„œ ์–ธ๊ธ‰ํ•˜๋Š” ๋ฐฉ์‹, ๊ทธ session์„ retrieval์—์„œ ํ™œ์šฉ
    • searching + integration: ๊ณ„ํš๋œ ๋„๊ตฌ ํ˜ธ์ถœํ•˜์—ฌ ์–ป์€ ํŽ˜์ด์ง€๋“ค๋กœ IntegrateAgent๊ฐ€ ๊ฐ ํŽ˜์ด์ง€์—์„œ ์š”์ฒญ ๊ด€๋ จ ์ •๋ณด๋งŒ ์ถ”์ถœ + ์š”์•ฝ > ๋ˆ„์  I(ํ†ตํ•ฉ๋ฉ”๋ชจ๋ฆฌ) ๊ฐฑ์‹ 
    • reflection: InfoAgent๊ฐ€ โ€œ์ด์ œ ๋‹ต๋ณ€ ๊ฐ€๋Šฅํ•œ๊ฐ€?โ€์— ๋Œ€ํ•ด ํŒ๋‹จ. ๋ถ€์กฑํ•˜๋‹ค๋ฉด FollowUpRequestAgent๊ฐ€ ํ›„์† ์ฟผ๋ฆฌ ์งˆ๋ฌธ ์ƒ์„ฑ > ๋‹ค์‹œ planning๋ถ€ํ„ฐ (๋ฐ˜๋ณต ์ˆ˜๋Š” hyper parameter)
    • ๊ฐ•ํ™”ํ•™์Šต: memorizer์™€ researcher๋ฅผ policy๋กœ ๋‘๊ณ  downstream reward๋ฅผ ์“ฐ๋Š” policy gradient update ์„ค๊ณ„ (ํ›„์† ์—ฐ๊ตฌ๋กœ ์–ธ๊ธ‰)
      • memorizer: summary/page-store ๊ตฌ์„ฑ๋ฐฉ์‹์„ task-oriented๋กœ ํ•™์Šต
      • researcher: planning & retrieval ์ „๋žต์„ ๋ณด์ƒ๊ธฐ๋ฐ˜์œผ๋กœ ์ตœ์ ํ™”

Effects

  • Experiment setup
    • backbone: gpt-4o-mini(128k), Qwen2.5-14B(128k)
    • retriever: BGE-M3, BM25, Page-ID
    • baselines:
      • Long-LLM(์›๋ฌธ ์ „์ฒด ๋„ฃ๊ธฐ), RAG
      • A-Mem, Mem0, MemoryOS, LightMem
  • Results
    • tab 1 Main Results
      • LoCoMo(๋Œ€ํ™” ์žฅ๊ธฐ๊ธฐ์–ต): GAM์ด ๋ชจ๋“  baseline์„ ๋ช…ํ™•ํ•˜๊ฒŒ ์••๋„
        • temporal: mem0 48.93 > GAM 59.45
        • ODQA: mem0 28.64 > GAM 33.3
      • HotpotQA: ๊ธธ์ด๊ฐ€ ๊ธธ์–ด์งˆ์ˆ˜๋ก baseline ์„ฑ๋Šฅ ํ•˜๋ฝ ๋Œ€๋น„ GAM์€ ํฐ ๊ฒฉ์ฐจ๋กœ ์šฐ์œ„
      • RULER(synthetic long-context): multi-hop tracing์—์„œ Long-LLM 60.6 ๋Œ€๋น„ GAM 93.2 ๋“ฑ
    • tab 2ย scaling ํšจ๊ณผ
      • memorizer: 0.5B์—์„œ 32B๋กœ ํ‚ค์›Œ๋“œ ์„ฑ๋Šฅํ–ฅ์ƒ ์ œํ•œ์  (์ž‘์€๋ชจ๋ธ๋กœ ์š”์•ฝ์€ ์ถฉ๋ถ„ํ•˜๋‹ค๊ณ  ๋ถ„์„)
      • researcher: 0.5B์—์„œ๋Š” ๋ถˆ๊ฐ€, ์ตœ์†Œ 14B์—์„œ 32B๋Š” ๋˜์–ด์•ผ ์„ฑ๋Šฅ (์ƒ๋Œ€์ ์œผ๋กœ ๊ณ„ํš, ํƒ์ƒ‰๋“ฑ์€ ๋” ์–ด๋ ค์šด ์ž‘์—…์œผ๋กœ ๋ถ„์„)
    • fig 2ย test-time scaling
      • reflection depth ์ถ”๊ฐ€: 1> 5 ๋Š˜๋ฆด์ˆ˜๋ก ์ฆ๊ฐ€ (3~4์—์„œ ์ˆ˜๋ ด)
      • retrieved pages๋„ ๋Š˜๋ฆด์ˆ˜๋ก ๋†’์€ ์„ฑ๋Šฅ
    • ablationย tab 3ย tab 4
      • BM25 ๋‹จ๋…์ด dense ๋‹จ๋…๋ณด๋‹ค๋Š” ๋‚ซ๊ณ  ๋‹ค ์กฐํ•ฉํ•ด์•ผ ์ตœ๊ณ ์„ฑ๋Šฅ
      • memorizer๋‚˜ researcher ๋ชจ๋‘ ์œ ์˜ํ•œ ๋ชจ๋“ˆ์ด๊ณ 
      • header+full-page๊ฐ€ ์ตœ๊ณ ์„ฑ๋Šฅ (vs. summary-only, summary+snippets)
    • tab 5ย ํšจ์œจ์„ฑ
      • online์ด๋ฉด ๋‹น์—ฐํžˆ AOT๋ณด๋‹ค GAM์ด ๋” ๋น„์šฉ์ด ํฌ์ง€๋งŒ ์ „์ฒด offline + online๋น„๊ต์‹œ ์ œ์•ˆ๋ฐฉ์‹์ด ์šฐ์œ„, ์„ฑ๋Šฅ ์—ญ์‹œ ์šฐ์ˆ˜ํ•จ ํ™•์ธ

Personal note. ์ œ์•ˆ ๋‚ด์šฉ์ด ๋˜‘๋ถ€๋Ÿฌ์ง„ ๊ฒƒ ๊ฐ™์ง„ ์•Š์ง€๋งŒ, ์ธํŠธ๋กœ๊ฐ€ ํฅ๋ฏธ๋กœ์›Œ์„œ ๋ดค์Šต๋‹ˆ๋‹ค. ์ƒ๋Œ€์ ์œผ๋กœ researcher๊ฐ€ ๋ฌด๊ฑฐ์›Œ๋ณด์ด๊ณ , ๋™์ ์œผ๋กœ ๊ทธ๋•Œ๊ทธ๋•Œ context ๊ฐ€์ ธ๋‹ค ์“ฐ๊ฒ ๋‹ค๋Š” ํ๋ฆ„๋„ ์ฐธ์‹ ํ•œ ์•„์ด๋””์–ด๋Š” ์•„๋‹Œ ๊ฒƒ ๊ฐ™์•„์š”.