
Meta info.
  • Authors: Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, Huajun Chen, Ningyu Zhang
  • Paper: https://arxiv.org/pdf/2510.18866
  • Affiliation: NUS, Zhejiang Univ
  • Published: October 21, 2025

TL;DR

Proposes a three-stage memory architecture (sensory > topic-aware short-term > long-term memory with sleep-time updates); reports improved accuracy on LongMemEval along with substantially lower token, API-call, and runtime costs.


Background

  • External memory structures were introduced to mitigate context loss in long-term/multi-turn dialogue: LangMem, A-MEM, MemoryOS, Mem0
  • Remaining problems: redundancy + noise, topic entanglement caused by turn-local processing, and high cost

Problem Statement

Proposes a memory system that filters redundant information, manages history per topic, and decouples update/forget from real-time inference

Suggestion

LightMem

  • Sensory memory
    • pre-compression: compress the input with LLMLingua-2 before it enters the pipeline
    • topic cueing: drop less important tokens, then compute attention scores used to segment the dialogue history
      • segment boundaries are the intersection of attention-based boundaries and similarity-based boundaries
  • Topic-aware short-term memory (Light2)
    • split utterances by semantic/topic similarity using content-adaptive boundaries (instead of a fixed window) > summarize each segment into a memory item (online update)
  • Long-term memory w/ sleep-time update (Light3)
    • CRUD. Reorganization/deduplication/abstraction run offline: maintenance is decoupled from online inference
    • Benefit (§4.6)
      • When given two pieces of information that are related but not contradictory, an LLM can mistake them for a conflict
        • Deleting the older memory in that case causes irreversible information loss
        • The information could instead be merged, or a new item could simply be added
      • The proposed method performs only incremental additions via soft updates at test time (ST) > global information can be preserved
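
The boundary intersection and the split between online soft updates and offline maintenance can be sketched roughly as below. This is a minimal illustration under stated assumptions, not LightMem's implementation: `segment_turns`, `MemoryEntry`, `LongTermMemory`, `soft_add`, and `sleep_time_update` are hypothetical names, the two boundary lists are assumed to be computed elsewhere (e.g. from attention scores and embedding similarity), and exact-match deduplication is only a stand-in for whatever reorganization runs offline.

```python
from dataclasses import dataclass, field


def segment_turns(turns, attention_boundaries, similarity_boundaries):
    """Split turns into topic segments.

    A boundary index b means "a new segment starts at turns[b]".
    Only indices proposed by BOTH signals are kept (the intersection).
    """
    if not turns:
        return []
    cuts = sorted(set(attention_boundaries) & set(similarity_boundaries))
    segments, start = [], 0
    for cut in cuts:
        if 0 < cut < len(turns):  # ignore out-of-range or trivial cuts
            segments.append(turns[start:cut])
            start = cut
    segments.append(turns[start:])
    return segments


@dataclass
class MemoryEntry:
    topic: str
    summary: str
    timestamp: int


@dataclass
class LongTermMemory:
    entries: list = field(default_factory=list)

    def soft_add(self, entry):
        """Online path: append only; nothing is overwritten or deleted."""
        self.entries.append(entry)

    def sleep_time_update(self):
        """Offline path (assumed stand-in): drop exact duplicates,
        keeping the most recent copy of each (topic, summary) pair."""
        latest = {}
        for entry in self.entries:
            key = (entry.topic, entry.summary)
            if key not in latest or entry.timestamp > latest[key].timestamp:
                latest[key] = entry
        self.entries = sorted(latest.values(), key=lambda e: e.timestamp)


# Hypothetical usage: boundary indices here are made up for illustration.
demo_turns = ["Trip to Kyoto?", "Book a ryokan", "Fix my Python bug", "Use a debugger"]
demo_segments = segment_turns(demo_turns, [1, 2], [2, 3])
print(demo_segments)

demo_memory = LongTermMemory()
demo_memory.soft_add(MemoryEntry("travel", "User plans a trip to Kyoto", 1))
demo_memory.soft_add(MemoryEntry("travel", "User plans a trip to Kyoto", 2))
demo_memory.soft_add(MemoryEntry("travel", "User prefers ryokan stays", 3))
demo_memory.sleep_time_update()
print(demo_memory.entries)
```

The point of the split is that `soft_add` never has to decide whether two items conflict, so nothing is lost at inference time; any cleanup is deferred to the offline pass.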

Effects

  • Experiment setup:
    • Task: LongMemEval-S (500 queries, about 50 sessions on average, 110k tokens)
    • Backbone: GPT-4o-mini, Qwen3-30B-A3B
    • Baseline: Full Text/Naive RAG/LangMem/A-MEM/MemoryOS/Mem0
    • Metrics: Accuracy, token/call/runtime efficiency
  • Results: LightMem proves strong, robust, and flexible on nearly all metrics and on both LLM backbones
    • accuracy:
      • ST online: accuracy improves by 2.7-9.65 percentage points
      • LT offline: roughly on par
    • efficiency: token usage can be cut by 32-106x and API calls by up to 177x
    • Especially helpful on temporal/multi-session/knowledge-update questions.
      • In single-speaker settings, naive RAG can already be strong enough

Personal note. As of writing, this paper seems to have been getting attention since yesterday. The large efficiency gains look meaningful from an engineering standpoint, while the accuracy gain does not look big. Components such as compression are borrowed from LLMLingua, so the theoretical background is not developed rigorously; it is used mainly to decide the segment boundaries. The discussion of conflicts is striking, but rather than attempting an actual resolution, the paper seems to only explain that the method leaves room for conflicts to resolve naturally.