Tags reflect my interests at the time of writing, so some keywords may be missing.

activation 3 posts

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

1 minute read

Proposes FLASHSIGMOID, which speeds up attention computation by roughly 18% by replacing the softmax with a sigmoid plus a constant bias (based on the sequence length).
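
A minimal sketch of the mechanism this summary describes, not the paper's FlashSigmoid kernel: the softmax is replaced by an elementwise sigmoid with a constant bias derived from the sequence length, so attention weights no longer need row normalization. All names below are illustrative.

```python
import numpy as np

def sigmoid_attention(Q, K, V):
    """Elementwise-sigmoid attention with a -log(n) bias (illustrative sketch)."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)            # (n, n) scaled dot-product scores
    bias = -np.log(n)                        # constant bias from sequence length n
    weights = 1.0 / (1.0 + np.exp(-(scores + bias)))  # sigmoid; no row normalization
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = sigmoid_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because each weight is an independent sigmoid in (0, 1), rows need not sum to 1, which is what makes the hardware-friendly kernel variant possible.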

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

1 minute read

Replaces the vanilla ReLU with a discontinuous activation called JumpReLU, setting a new SAE (sparse autoencoder) SOTA; despite the discontinuous activation, training remains effective thanks to a straight-through estimator.
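
As a small sketch of the nonlinearity named above (in the paper the threshold is learned per feature; here it is a fixed scalar for illustration): JumpReLU passes values through unchanged above a threshold and zeroes everything below it, so it is discontinuous at the threshold, which is why a straight-through estimator is needed for training.

```python
import numpy as np

def jump_relu(z, theta=0.5):
    """JumpReLU: z * H(z - theta) -- identity above the threshold, zero below."""
    return np.where(z > theta, z, 0.0)

z = np.array([-1.0, 0.2, 0.5, 0.8, 2.0])
print(jump_relu(z))  # zeros up to the threshold, identity above it
```

Unlike ReLU, small positive activations below theta are zeroed outright rather than shrunk, which is what improves the sparsity/fidelity trade-off the summary refers to.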

adaptor 2 posts

Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation

less than 1 minute read

Proposes DualLoRA, which mitigates the problem in multi-layer transformer-family models where the prompt is increasingly forgotten toward the later layers.

agent 10 posts

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

3 minute read

To resolve the context-window bottleneck of long-horizon LLM agents, proposes Indexed Experience Memory, a structured memory system, and MemexRL, which learns to use it.

MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks

2 minute read

Proposes a benchmark that evaluates the Memory-Agent-Environment loop in a multi-session, interdependent-subtask setting, and demonstrates that existing memory systems are highly brittle in realistic agentic settings.

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

2 minute read

Trains memory consolidation and reasoning to merge into a single internal state via RL, improving performance on long-horizon tasks while keeping the context size nearly constant.

SimpleMem: Efficient Lifelong Memory for LLM Agents

3 minute read

LLM Agent์˜ LTM์„ semantic lossless compression์œผ๋กœ ์žฌ์ •์˜ํ•˜๊ณ , write-time ๊ตฌ์กฐํ™”ยทonline synthesisยทintent-aware retrieval๋กœ ์„ฑ๋Šฅ๊ณผ ํ† ํฐ ํšจ์œจ(์ตœ๋Œ€ 30๋ฐฐ)์„ ๊ฐœ์„ ํ•œ ๋ฉ”๋ชจ๋ฆฌ ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

3 minute read

long-horizon task์—์„œ ๋ฐœ์ƒํ•˜๋Š” planning ์‹คํŒจ์˜ ํ•ต์‹ฌ ์›์ธ์„ entanglement๋กœ ๊ทœ์ •, ์ด๋ฅผ subtask ๋‹จ์œ„๋กœ ๋ถ„๋ฆฌ๋œ DAG ๊ธฐ๋ฐ˜ planning์œผ๋กœ ํ•ด๊ฒฐํ•˜๋Š” ๊ฒƒ์„ ์ œ์•ˆ, ์„ฑ๋Šฅ ํ–ฅ์ƒ ๋ฐ ํ† ํฐ ์ ˆ๊ฐ์—์„œ ์œ ์˜

Adaptation of Agentic AI

2 minute read

Notes that the concept of adaptation has been used inconsistently in agentic AI research, and proposes a taxonomy that distinguishes the adaptation target (agent vs. tool) from the signal that drives adaptation, enabling systematic system-level design and comparison.

Budget-Aware Tool-Use Enables Effective Agent Scaling

2 minute read

Simply increasing the tool-call budget does not scale agent performance (TTS); introducing a Budget Tracker that makes the budget explicit to the agent, together with the BATS framework, greatly improves cost-versus-performance scaling and the Pareto frontier.

ai-detection 3 posts

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

less than 1 minute read

๋ณ„๋„ ํ•™์Šต์ด๋‚˜ ํŠœ๋‹ ์—†์ด ํ•œ ์Œ์˜ pretrained LLM์œผ๋กœ ๊ฐ„๋‹จํžˆ ๊ณ„์‚ฐ๋งŒ ํ•˜๋ฉด machine generated text๋ฅผ ํƒ์ง€ํ•ด๋‚ด๋Š” ๋ฐฉ๋ฒ•๋ก  Binoculars ์ œ์•ˆ. ์ƒ์„ฑ๋œ sample 90% ์ด์ƒ ํƒ์ง€(pic1)

alignment-learning 6 posts

A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models

2 minute read

LLM์—์„œ์˜ ๊ฐœ์ธํ™”/๋‹ค์›์  ์„ ํ˜ธ ์ •๋ ฌ์„ training/test-time, ์‚ฌ์šฉ์ž ๋ชจ๋ธ๋ง ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•์œผ๋กœ ์ฒด๊ณ„ํ™”, ํ‰๊ฐ€ ๋ฐ ํ™•์žฅ์„ฑ ์ธก๋ฉด์˜ ๊ตฌ์กฐ์  ํ•œ๊ณ„ ํ™•์ธ

The Differences Between Direct Alignment Algorithms are a Blur

1 minute read

Analyzes the structural differences among Direct Alignment Algorithms (DAAs), suggesting that DPO-level performance can be achieved without RL.

Alignment Faking in Large Language Models

2 minute read

Studies Alignment Faking, where an LLM appears to follow the objective during alignment training but, unwilling to give up the preferences it acquired in pretraining (its own preferences), only pretends to be aligned while training is underway.

Planning Like Human: A Dual-process Framework for Dialogue Planning

1 minute read

Proposes a dual-process dialogue planning framework that complementarily combines an intuitive (fast) policy model for familiar situations with an analytical (slow) policy model for novel scenarios.

Scaling Laws for Reward Model Overoptimization

less than 1 minute read

Training a policy model against an RM (inevitably) causes overoptimization, where the gap from real (human) preference widens as training proceeds; increasing the RM's size appears to meaningfully delay the onset of this phenomenon.

attention 2 posts

Differential Transformer

1 minute read

Proposes a transformer variant that splits Q/K into two groups and computes the difference between the two resulting softmax attention maps, amplifying attention on relevant context while cancelling out noise; improves hallucination.
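
A rough sketch of the mechanism just described (names are illustrative, and the subtraction weight `lam` is a learned parameter in the paper, fixed here for simplicity): Q and K are each split into two halves, two softmax attention maps are computed, and their difference is used as the attention map, cancelling common-mode noise.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(Q, K, V, lam=0.8):
    """Differential attention: difference of two softmax maps over split Q/K halves."""
    d = Q.shape[-1] // 2
    Q1, Q2 = Q[:, :d], Q[:, d:]
    K1, K2 = K[:, :d], K[:, d:]
    A1 = softmax(Q1 @ K1.T / np.sqrt(d))   # first attention map
    A2 = softmax(Q2 @ K2.T / np.sqrt(d))   # second attention map (noise estimate)
    return (A1 - lam * A2) @ V             # subtract to cancel shared noise

rng = np.random.default_rng(1)
Q = rng.standard_normal((4, 16))
K = rng.standard_normal((4, 16))
V = rng.standard_normal((4, 16))
print(diff_attention(Q, K, V).shape)  # (4, 16)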

Selective Attention Improves Transformer

1 minute read

attention ์—ฐ์‚ฐ์—์„œ ํŒŒ๋ผ๋ฏธํ„ฐ ๋ณ€๊ฒฝ ์—†์ด, ์ƒ์„ฑ๋œ token์ด ๋‹ค๋ฅธ token์ด ๋”์ด์ƒ ํ•„์š” ์—†๋‹ค๊ณ  ๊ฒฐ์ •ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ฒ˜๋ฆฌ, ๋ฏธ๋ž˜ ์‹œ์ ์—์„œ๋Š” ํ•ด๋‹น token์ด ๋ถˆํ•„์š”ํ•˜๋‹ค๊ณ  ํŒ๋‹จํ–ˆ๋˜ token๋“ค์— ๋Œ€ํ•œ attention์„ ์ค„์ด๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ ํšจ๊ณผ์ ์œผ๋กœ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰๊ณผ ๊ณ„์‚ฐ ๋น„์šฉ์„ ...

benchmark 10 posts

MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks

2 minute read

Proposes a benchmark that evaluates the Memory-Agent-Environment loop in a multi-session, interdependent-subtask setting, and demonstrates that existing memory systems are highly brittle in realistic agentic settings.

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs

2 minute read

Proposes a benchmark evaluating four hard problems in the multi-turn setup (Instruction Retention, Inference Memory, Reliable Versioned Editing, Self-Coherence); even the latest SOTA models that succeed on existing benchmarks ...

TO CHAT OR TASK: a Multi-turn Dialogue Generation Framework for Task-Oriented Dialogue Systems

1 minute read

Proposes CTFUSION, a framework that automatically builds multi-turn dialogues combining chitchat and task requests; an ICS model trained on the resulting IVSR-CTF dataset outperforms LLMs on functional intent classification, confirming its effectiveness.

MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents

3 minute read

A benchmark for evaluating the memory of LLM-based agents that unifies multi-scenario (participation & observation) and multi-level (factual & reflective) memory types with multi-metric evaluation, called M...

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

1 minute read

ํ˜‘์—…์ /๊ฒฝ์Ÿ์  ์ƒํ™ฉ์—์„œ ์—์ด์ „ํŠธ๋ผ๋ฆฌ ์ƒํ˜ธ์ž‘์šฉํ•˜๋Š” ์‹œ์Šคํ…œ ํ‰๊ฐ€์— ๋Œ€ํ•œ ๋ฒค์น˜๋งˆํฌย MARBLEย ์ œ์•ˆ

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

1 minute read

Proposes NeedleBench, which evaluates long-context ability over (1) intervals of various lengths, (2) diverse depth ranges, (3) progressively harder tasks, and (4) two languages (English/Chinese), and reports evaluation results for a range of models.

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

less than 1 minute read

Points out that existing RAG benchmarks are limited in scope and diversity and fail to account for the influence of the retriever and the external KB; classifies the scope of RAG applications into CRUD and releases an evaluation task and dataset for each. (Chinese)

Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers

less than 1 minute read

ODQA์—์„œ ๋ชจ๋ธ response๋ฅผ ๋” ์„ธ๋ถ„ํ™”๋œ ์ˆ˜์ค€์œผ๋กœ ๋‚˜๋ˆ ์„œ ์ •ํ™•์„ฑ ๋ฐ ์ •๋ณด์„ฑ ์ธก๋ฉด์—์„œ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋Š” GRANOLA QA ๋ฒค์น˜๋งˆํฌ ๊ณต๊ฐœ ๋ฐ ๊ทธ ์„ธ๋ถ„ํ™”๋œ ์ •๋ณด์„ฑ์„ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ ๋””์ฝ”๋”ฉ ๋ฐฉ์‹ DRAG ์ œ์•ˆ

classification 1 post
code 5 posts

LLM-Assisted Code Cleaning For Training Accurate Code Generators

less than 1 minute read

When training code generation models, refactoring the training data (the code) for readability makes the model perform much better.

data-selection 1 post

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

1 minute read

Rather than selecting only good data at the instance level, diversity-centric data selection based on k-means clustering meaningfully improves the efficiency and performance of LLM finetuning.

decoding 1 post

WAIT, WAIT, WAITโ€ฆ Why Do Reasoning Models Loop?

2 minute read

Reasoning ๋ชจ๋ธ์˜ looping์€ decoding artifact๋งŒ์ด ์•„๋‹ˆ๋ผ learning errors๊ฐ€ greedy/low-temp์—์„œ ์ฆํญ๋˜๋ฉฐ ๋ฐœ์ƒ, temperature๋Š” loop๋ฅผ ์ค„์ด์ง€๋งŒ ๊ทผ๋ณธ ์›์ธ์„ ๊ณ ์น˜์ง€ ๋ชปํ•ด ๋ถˆํ•„์š”ํ•˜๊ฒŒ ๊ธด CoT๋ฅผ ์ƒ์„ฑํ•œ๋‹ค.

dialogue-system 18 posts

Flipping the Dialogue: Training and Evaluating User Language Models

3 minute read

Instructing an assistant LM to role-play as the user, as is commonly done, is inherently unrealistic; a UserLM trained on real human user behavior reproduces far more natural multi-turn user behavior and exposes the true limits of assistant performance.

LightMem: Lightweight and Efficient Memory-Augmented Generation

1 minute read

Proposes a three-stage memory architecture of sensory > topic-aware short-term > sleep-time long-term memory updates; confirms improved LongMemEval accuracy and sharply reduced token/API-call/runtime costs.

Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain an Identity

2 minute read

์ตœ์‹  ๋Œ€ํ™” ๋ชจ๋ธ์€ ์ข…์ข… ์ •์ฒด์„ฑ์„ ์œ ์ง€ํ•˜์ง€ ๋ชปํ•˜๋ฉฐ, expanded attention & classifier-based reranking์œผ๋กœ ์˜ค๋ฅ˜๋ฅผ 65% ์ค„์ผ ์ˆ˜ ์žˆ์œผ๋‚˜ ์—ฌ์ „ํžˆ challenge์ด๋‹ค.

TO CHAT OR TASK: a Multi-turn Dialogue Generation Framework for Task-Oriented Dialogue Systems

1 minute read

Proposes CTFUSION, a framework that automatically builds multi-turn dialogues combining chitchat and task requests; an ICS model trained on the resulting IVSR-CTF dataset outperforms LLMs on functional intent classification, confirming its effectiveness.

Exploring Persona Sentiment Sensitivity in Personalized Dialogue Generation

1 minute read

LLM์€ persona์˜ sensitivity์— ๋งค์šฐ ๋ฏผ๊ฐํ•˜์—ฌ ๋ถ€์ •์  persona๋Š” ์ผ๊ด€์„ฑ ์—†๋Š” ๋Œ€ํ™”๋ฅผ, ๊ธ์ •์  persona๋Š” ๋” ์›ํ™œํ•˜๊ณ  ์งˆ ๋†’์€ ์ƒํ˜ธ์ž‘์šฉ์„ ํ•˜๊ธฐ ๋–„๋ฌธ์—, robustness ๊ฐœ์„ ์„ ์œ„ํ•ด polarity-aware ์ƒ์„ฑ ์ „๋žต ์ œ์•ˆ

Dynamic Epistemic Friction in Dialogue

3 minute read

๋Œ€ํ™”์—์„œ belief์€ ํ†ต์ƒ ์—ฐ๊ตฌ๋“ค์˜ ๊ฐ€์ •์ฒ˜๋Ÿผ '๋งค๋„๋Ÿฝ๊ฒŒ' ์—…๋ฐ์ดํŠธ ๋˜์ง€ ์•Š์œผ๋ฏ€๋กœ, ์ƒˆ๋กœ์šด ์ •๋ณด์— ๋Œ€ํ•œ ์ˆ˜์šฉ ์ €ํ•ญ(epistemic friction)์„ ์ •๋Ÿ‰ํ™”/๋ฒกํ„ฐํ™”ํ•˜์—ฌ ๋ชจ๋ธ๋งํ•˜๋Š” belief ๋ณ€ํ™” ๋ชจ๋ธ๋ง ์ œ์•ˆ

CONFETTI: Conversational Function-Calling Evaluation Through Turn-Level Interactions

2 minute read

multi-turn dialogue์—์„œ LLM Function Calling์„ ํ‰๊ฐ€ํ•˜๋Š” ๋ฒค์น˜๋งˆํฌ CONFETTI ์ œ์•ˆ. ํ˜„์žฌ ๋ชจ๋ธ๋“ค์€ ์—ฌ์ „ํžˆ ๋ณต์žกํ•œ ์—ฐ์‡„์˜/๊ธด ์ปจํ…์ŠคํŠธ/๋Œ€ํ˜• API ์„ ํƒ์— ํ•œ๊ณ„๊ฐ€ ์žˆ์Œ์„ ํ™•์ธ.

Planning Like Human: A Dual-process Framework for Dialogue Planning

1 minute read

Proposes a dual-process dialogue planning framework that complementarily combines an intuitive (fast) policy model for familiar situations with an analytical (slow) policy model for novel scenarios.

Adaptive Retrieval-Augmented Generation for Conversational Systems

1 minute read

Proposes a mechanism that selectively decides, at each turn of a given dialogue, whether augmentation with external knowledge is needed.

Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation

less than 1 minute read

Proposes DualLoRA, which mitigates the problem in multi-layer transformer-family models where the prompt is increasingly forgotten toward the later layers.

ReALM: Reference Resolution As Language Modeling

less than 1 minute read

Attempts to solve reference resolution in a pipeline style with a small model (ReALM) finetuned for the task.

ChatQA: Building GPT-4 Level Conversational QA Models

less than 1 minute read

Proposes a 2-stage instruction tuning method that greatly improves conversational QA performance over LLM zero-shot.

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

less than 1 minute read

LM์ด Self-Talk๋ฅผ ํ†ตํ•ด training ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑ>์ •์ œ>SFT์— ํ™œ์šฉ (bootstrapping). ์ด ๊ณผ์ •์—์„œ ๋ณ‘๋ชฉ์„ ํ•ด์†Œํ•˜๊ธฐ ์œ„ํ•ด ๋Œ€ํ™”์„ฑ๊ณต ์—ฌ๋ถ€๋ฅผ ์ธก์ •ํ•˜๋Š” automatic metric ์ œ์•ˆ

diffusion 1 post

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

2 minute read

Proposes LongLLaDA, which achieves stable performance even beyond the context length seen during training thanks to the "local perception" of diffusion-based LLMs. With NTK-based RoPE extrapolation, a diffusion-based LLM's input le...

domain-adaptation 6 posts

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

less than 1 minute read

Proposes a framework that generates diverse documents and constructs QA pairs to evaluate LLMs' ability to use knowledge across diverse scenarios.

Specialized Language Models with Cheap Inference from Limited Domain Data

less than 1 minute read

Under four constraints, 1) generic pretraining cost, 2) domain-specific pretraining cost, 3) inference cost, and 4) size of the domain-specific training set, an empiric...

DocLLM: A layout-aware generative language model for multimodal document understanding

less than 1 minute read

Inspired by multi-modal LLMs, has the LM take both text and positional information (within a structured document) as input, tackling internal structured-document understanding.

LLaMA Pro: Progressive LLaMA with Block Expansion

less than 1 minute read

Proposes block expansion, a post-pretraining scheme in which only the parameters of newly added blocks are updated with domain data, as particularly useful for domain-specific tasks. Claims the forgetting that occurs when finetuning the whole model does not arise. Premised on using the same data...

BloombergGPT: A Large Language Model for Finance

less than 1 minute read

A combined pre-training approach for domain-specific and non-domain-specific corpora. Describes the dataset, model configuration, and training procedure fo...

dpo 3 posts

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLMโ€™s Reasoning Capability

1 minute read

์˜ค๋ฅ˜ ์ถ”๋ก ์ด ๋ฐœ์ƒํ•˜๋Š” ๊ณผ์ •์— ์ค‘์š” ์—ญํ• (์›์ธ)์„ ํ•˜๋Š” ํ† ํฐ (critical token)์„ ์‹๋ณ„ํ•˜์—ฌ ์ด ํ† ํฐ์„ ๋ชจ๋ธ ์ถ”๋ก  ๊ฐœ์„ ์— ์ ์šฉ(cDPO)ํ•˜๋Š” ๋ฐฉ๋ฒ•๋ก  ์ œ์•ˆ

Self-Rewarding Language Models

less than 1 minute read

๋ฐ˜๋ณต์ ์ธ DPO ํ›ˆ๋ จ์œผ๋กœ ์‚ฌ๋žŒ์ด ์„ค๊ณ„ํ•œ reward model์ด ์•„๋‹Œ,ย LLM-as-a-Judgeย mechanism์„ ์‚ฌ์šฉ, LM์ด ์ž์œจ์ ์œผ๋กœ instruction following & reward modeling > refine ๋ฐ˜๋ณต.

dst 1 post
ensemble 5 posts

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Knowledge Fusion of Large Language Models

1 minute read

A method that merges multiple existing LLMs (source LLMs), each with a different architecture and trained in different ways, into a stronger one (pic1): externalizing the knowledge of the LLMs and transferring their capabilities into a new LLM (target LLM)...

Blending is All You Need

less than 1 minute read

์—ฌ๋Ÿฌ ๊ฐœ์˜ ์ž‘์€ ๋ชจ๋ธ์„ Blendํ•ด์„œ ํ•˜๋‚˜์˜ ํฐ ๋ชจ๋ธ๊ณผ ๋น„์Šทํ•œ ํ˜น์€ ๋” ๋‚˜์€ ์„ฑ๋Šฅ์„ ๋‚ผ ์ˆ˜ ์žˆ๋‹ค.

evaluation 8 posts

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs

2 minute read

Proposes a benchmark evaluating four hard problems in the multi-turn setup (Instruction Retention, Inference Memory, Reliable Versioned Editing, Self-Coherence); even the latest SOTA models that succeed on existing benchmarks ...

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

less than 1 minute read

Proposes a framework that generates diverse documents and constructs QA pairs to evaluate LLMs' ability to use knowledge across diverse scenarios.

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

1 minute read

Proposes NeedleBench, which evaluates long-context ability over (1) intervals of various lengths, (2) diverse depth ranges, (3) progressively harder tasks, and (4) two languages (English/Chinese), and reports evaluation results for a range of models.

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

less than 1 minute read

Points out that existing RAG benchmarks are limited in scope and diversity and fail to account for the influence of the retriever and the external KB; classifies the scope of RAG applications into CRUD and releases an evaluation task and dataset for each. (Chinese)

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

less than 1 minute read

LM์ด Self-Talk๋ฅผ ํ†ตํ•ด training ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑ>์ •์ œ>SFT์— ํ™œ์šฉ (bootstrapping). ์ด ๊ณผ์ •์—์„œ ๋ณ‘๋ชฉ์„ ํ•ด์†Œํ•˜๊ธฐ ์œ„ํ•ด ๋Œ€ํ™”์„ฑ๊ณต ์—ฌ๋ถ€๋ฅผ ์ธก์ •ํ•˜๋Š” automatic metric ์ œ์•ˆ

factuality 5 posts

Real-time Fake News from Adversarial Feedback

1 minute read

LLM์˜ fake news๋ฅผ ๋” ์ž˜ ์ƒ์„ฑํ•˜๊ฒŒ ํ•˜๋Š” ๋ฐฉ๋ฒ•. ํ•™์Šต ์ดํ›„ ๋ฐœ์ƒ๋˜๋Š” ์‚ฌ๊ฑด์˜ fake news ํƒ์ง€๋ฅผ ์œ„ํ•ด, adversarial iterative fake news ์ƒ์„ฑ ํŒŒ์ดํ”„๋ผ์ธ ์ œ์•ˆ

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability

less than 1 minute read

Training an LM, under standard LM training, to generate a particular text does not raise the probability of the texts that text implies. To also assign high probability to the related fact set (texts) for factuality...

DocLLM: A layout-aware generative language model for multimodal document understanding

less than 1 minute read

Inspired by multi-modal LLMs, has the LM take both text and positional information (within a structured document) as input, tackling internal structured-document understanding.

Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers

less than 1 minute read

ODQA์—์„œ ๋ชจ๋ธ response๋ฅผ ๋” ์„ธ๋ถ„ํ™”๋œ ์ˆ˜์ค€์œผ๋กœ ๋‚˜๋ˆ ์„œ ์ •ํ™•์„ฑ ๋ฐ ์ •๋ณด์„ฑ ์ธก๋ฉด์—์„œ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋Š” GRANOLA QA ๋ฒค์น˜๋งˆํฌ ๊ณต๊ฐœ ๋ฐ ๊ทธ ์„ธ๋ถ„ํ™”๋œ ์ •๋ณด์„ฑ์„ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ ๋””์ฝ”๋”ฉ ๋ฐฉ์‹ DRAG ์ œ์•ˆ

function-calling 6 posts

Adaptation of Agentic AI

2 minute read

Notes that the concept of adaptation has been used inconsistently in agentic AI research, and proposes a taxonomy that distinguishes the adaptation target (agent vs. tool) from the signal that drives adaptation, enabling systematic system-level design and comparison.

Budget-Aware Tool-Use Enables Effective Agent Scaling

2 minute read

Simply increasing the tool-call budget does not scale agent performance (TTS); introducing a Budget Tracker that makes the budget explicit to the agent, together with the BATS framework, greatly improves cost-versus-performance scaling and the Pareto frontier.

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

2 minute read

์ž‘์€ 8B ์˜ค์ผ€์ŠคํŠธ๋ ˆ์ดํ„ฐ ๋ชจ๋ธ์ด ๋‹ค์–‘ํ•œ ํˆด๊ณผ LLM์„ RL๋กœ ํ†ตํ•ฉ์ ์œผ๋กœ ์กฐ์ •ํ•˜์—ฌ ์ •ํ™•๋„/๋น„์šฉ/latency/์œ ์ € ์„ ํ˜ธ๋ฅผ ๋™์‹œ์— ์ตœ์ ํ™”ํ•˜๋Š” ํˆด ๊ธฐ๋ฐ˜ ์—์ด์ „ํŠธ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ์•ˆ. GPT-5๋ณด๋‹ค ์‹ธ๊ณ  ์„ฑ๋Šฅ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์ธ๋‹ค.

fusion 1 post

Knowledge Fusion of Large Language Models

1 minute read

A method that merges multiple existing LLMs (source LLMs), each with a different architecture and trained in different ways, into a stronger one (pic1): externalizing the knowledge of the LLMs and transferring their capabilities into a new LLM (target LLM)...

gan 1 post
hallucination 5 posts

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs

2 minute read

multi-turn setup์—์„œ์˜ ๋‚œ์ œ 4๊ฐ€์ง€ (Instruction Retention, Inference Memory, Reliable Versioned Editing, Self-Coherence)๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๋ฒค์น˜๋งˆํฌ ์ œ์•ˆ, ๊ธฐ์กด ๋ฒค์น˜๋งˆํฌ์— ์„ฑ๊ณตํ•˜๋Š” ์ตœ์‹  SOTA ๋ชจ๋ธ๋“ค๋„ ์ œ์•ˆ...

Knowing When to Ask - Bridging Large Language Models and Data

1 minute read

Introduces DataGemma, which leverages Data Commons (a knowledge graph) to improve the factuality and reliability of LLM responses, closing the gap between LLMs and real data.

Pandoraโ€™s Box or Aladdinโ€™s Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models

2 minute read

LLM์˜ RAG ์ƒํ™ฉ์—์„œ ๋‹ค์–‘ํ•œ Noise๋ฅผ ๊ตฌ๋ถ„ํ•˜๊ณ  ๋ถ„์„. ์œ ์ตํ•œ Noise์˜ ๊ฒฝ์šฐ ๋ชจ๋ธ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธ. ๋ฒค์น˜๋งˆํฌ NoiserBench๋ฅผ ์ œ์‹œํ•˜์—ฌ LLM์˜ Noise ๋Œ€์‘ ํ‰๊ฐ€ ๋ฐ ์œ ์ตํ•œ noise๋Š” ํ™œ์šฉํ•˜๊ณ  ํ•ด๋กœ์šด noise๋Š” ์ค„์ด๋Š” ๋ฐฉ๋ฒ• ์ œ์‹œ.

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

1 minute read

๋ชจ๋ธ ์‚ฌ์ด์ฆˆ๊ฐ€ ํฌ๊ณ  ํ•™์Šต ์‹œ๊ฐ„์ด ๊ธธ์ˆ˜๋ก hallucination์ด ๋œ ๋ฐœ์ƒํ•˜๋Š” ๊ฑด ๋งž์ง€๋งŒ,ย ์ด๋ฅผ 5%์ดํ•˜์˜ ๋‚ฎ์€ ์ˆ˜์ค€์œผ๋กœ ์ค„์ด๋ ค๋ฉด (์ผ๋ฐ˜์ ์œผ๋กœ ์•Œ๋ ค์ง„ scaling law๋ณด๋‹ค) ํ›จ์”ฌ ๋” ํฐ ๋ชจ๋ธ๊ณผ ๋” ๋งŽ์€ ์ปดํ“จํŒ… ์ž์›์ด ํ•„์š”ํ•˜๋‹ค.

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models

2 minute read

์•„๋ž-์„œ๊ตฌ๋ฌธํ™”๊ฐ€ ๋Œ€์กฐ๋˜๋Š” entity์™€ natural occurring prompt ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ์…‹ CAMeL์„ ์ œ์•ˆํ•˜๊ณ , ์ด๋ฅผ ํ†ตํ•ด ์‚ฌ๋ก€์—ฐ๊ตฌํ•œ ๊ฒฐ๊ณผ LLM์ด ์„œ๊ตฌ๋ฌธํ™”๊ถŒ entity์— ํŽธํ–ฅ๋˜์–ด ์žˆ์Œ์— ๋Œ€ํ•œ ์šฐ๋ ค

hci 1 post
hypernetwork 1 post

Specialized Language Models with Cheap Inference from Limited Domain Data

less than 1 minute read

Under four constraints, 1) generic pretraining cost, 2) domain-specific pretraining cost, 3) inference cost, and 4) size of the domain-specific training set, an empiric...

icl 7 posts

Adaptive Retrieval-Augmented Generation for Conversational Systems

1 minute read

Proposes a mechanism that selectively decides, at each turn of a given dialogue, whether augmentation with external knowledge is needed.

Self-Discover: Large Language Models Self-Compose Reasoning Structures

less than 1 minute read

๋ธ์ด ์—ฌ๋Ÿฌ reasoning techniques(CoT, critical thinking, ...) ์ค‘์—์„œ ํ•˜๋‚˜๋ฅผ ์Šค์Šค๋กœ ์„ ํƒํ•˜์—ฌ task๋ณ„๋กœ ์ ํ•ฉํ•œ ์ถ”๋ก  ์ „๋žต์„ ๊ตฌ์„ฑํ•˜๋„๋ก ํ•˜๋Š” ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ. BBH์—์„œ ๋‹จ์ˆœ CoT๋ณด๋‹ค ์„ฑ๋Šฅ์ด ์ข‹๊ณ  CoT Self-consistency๋ณด๋‹ค๋„ ์ถ”...

Orion-14B: Open-source Multilingual Large Language Models

less than 1 minute read

ํ•œ๊ตญ์–ด ํฌํ•จ ๋™์•„์‹œ์•„๊ถŒ ์–ธ์–ด๋ฅผ ์ค‘์‹ฌ์œผ๋กœ ํ•™์Šต๋œ multilingual model ๊ณต๊ฐœ. Vocab ์‚ฌ์ด์ฆˆ๋„ย ์ƒ๋Œ€์ ์ด์ง€๋งŒย ๊ฒฐ์ฝ” ์ž‘์ง€ ์•Š๊ณ , ์‹ค์ œ ์„ฑ๋Šฅ๋„ ํ›Œ๋ฅญํ•œ ์ˆ˜์ค€.

The Power of Noise: Redefining Retrieval for RAG Systems

less than 1 minute read

RAG์—์„œ Retrieval ์— ์ง‘์ค‘ํ•˜์—ฌ, document์™€ prompt์˜ ์—ฐ๊ด€์„ฑ, prompt์—์„œ document์˜ ์œ„์น˜์™€ ์ˆ˜ ๋“ฑ ๋‹ค์–‘ํ•œ ์š”์†Œ๋ฅผ ํ‰๊ฐ€.

Corrective Retrieval Augmented Generation

less than 1 minute read

Uses a confidence score, web search, and knowledge refinement to self-correct incorrectly retrieved or suboptimal results, reducing hallucination in the model's generations.

Larger language models do in-context learning differently

1 minute read

์ถฉ๋ถ„ํžˆ ํฐ LLM์€ ์‚ฌ์ „ํ•™์Šต๊ณผ ๋ฐฐ์ฒ™๋˜๋Š” label์ด ์ฃผ์–ด์ง€๋”๋ผ๋„, ์‚ฌ์ „ํ•™์Šต ๋‚ด์šฉ์„ ๋ฎ์–ด๋‘๊ณ  ์ƒˆ๋กœ ์ฃผ์–ด์ง„ label๋กœ override ํ•  ์ˆ˜ ์žˆ์Œ. ์ด ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์ถฉ๋ถ„ํžˆ ํฐ LLM์€ label์„ ์˜๋ฏธ์ ์œผ๋กœ ๊ด€๋ จ ์—†๋Š” label๋กœ ๋Œ€์ฒดํ•ด๋„ ์„ฑ๋Šฅ์ด ๋‚˜์˜ด.

industry 4 posts

DocLLM: A layout-aware generative language model for multimodal document understanding

less than 1 minute read

Inspired by multi-modal LLMs, has the LM take both text and positional information (within a structured document) as input, tackling internal structured-document understanding.

interpretability 6 posts

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Safety Layers of Aligned Large Language Models: The Key to LLM Security

1 minute read

Confirms that safety layers exist within the internal parameters of various aligned LLMs; these layers identify and refuse malicious user queries. Building on this, proposes SPPFT, a finetuning method that preserves safety.

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

1 minute read

Replaces the vanilla ReLU with a discontinuous activation called JumpReLU, setting a new SAE (sparse autoencoder) SOTA; despite the discontinuous activation, training remains effective thanks to a straight-through estimator.

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

1 minute read

Trains a Sparse Autoencoder (SAE) on the residual stream from a middle layer of Claude 3 Sonnet; using the SAE and its feature vectors, interpretable features can be identified.

knowledge-conflicts 5 posts

When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs

4 minute read

Personalization์€ ๋‹จ์ˆœํžˆ user-aligned bias๊ฐ€ ์•„๋‹ˆ๋ผ factual representation๊ณผ entangle๋˜๋ฉด์„œ ์ฒด๊ณ„์ ์ธ hallucination์„ ๋งŒ๋“ ๋‹ค๋Š” ์‚ฌ์‹ค์„ representation level์—์„œ ๋ฐํžˆ๊ณ  inference-time์—์„œ ์ด๋ฅผ ์ œ...

Real-time Fake News from Adversarial Feedback

1 minute read

LLM์˜ fake news๋ฅผ ๋” ์ž˜ ์ƒ์„ฑํ•˜๊ฒŒ ํ•˜๋Š” ๋ฐฉ๋ฒ•. ํ•™์Šต ์ดํ›„ ๋ฐœ์ƒ๋˜๋Š” ์‚ฌ๊ฑด์˜ fake news ํƒ์ง€๋ฅผ ์œ„ํ•ด, adversarial iterative fake news ์ƒ์„ฑ ํŒŒ์ดํ”„๋ผ์ธ ์ œ์•ˆ

Pandoraโ€™s Box or Aladdinโ€™s Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models

2 minute read

LLM์˜ RAG ์ƒํ™ฉ์—์„œ ๋‹ค์–‘ํ•œ Noise๋ฅผ ๊ตฌ๋ถ„ํ•˜๊ณ  ๋ถ„์„. ์œ ์ตํ•œ Noise์˜ ๊ฒฝ์šฐ ๋ชจ๋ธ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธ. ๋ฒค์น˜๋งˆํฌ NoiserBench๋ฅผ ์ œ์‹œํ•˜์—ฌ LLM์˜ Noise ๋Œ€์‘ ํ‰๊ฐ€ ๋ฐ ์œ ์ตํ•œ noise๋Š” ํ™œ์šฉํ•˜๊ณ  ํ•ด๋กœ์šด noise๋Š” ์ค„์ด๋Š” ๋ฐฉ๋ฒ• ์ œ์‹œ.

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models

2 minute read

์•„๋ž-์„œ๊ตฌ๋ฌธํ™”๊ฐ€ ๋Œ€์กฐ๋˜๋Š” entity์™€ natural occurring prompt ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ์…‹ CAMeL์„ ์ œ์•ˆํ•˜๊ณ , ์ด๋ฅผ ํ†ตํ•ด ์‚ฌ๋ก€์—ฐ๊ตฌํ•œ ๊ฒฐ๊ณผ LLM์ด ์„œ๊ตฌ๋ฌธํ™”๊ถŒ entity์— ํŽธํ–ฅ๋˜์–ด ์žˆ์Œ์— ๋Œ€ํ•œ ์šฐ๋ ค

knowledge-editing 3 posts

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability

less than 1 minute read

Training an LM, under standard LM training, to generate a particular text does not raise the probability of the texts that text implies. To also assign high probability to the related fact set (texts) for factuality...

knowledge-graph 1 post

Knowing When to Ask - Bridging Large Language Models and Data

1 minute read

Introduces DataGemma, which leverages Data Commons (a knowledge graph) to improve the factuality and reliability of LLM responses, closing the gap between LLMs and real data.

language-modeling 30 posts

MoEE: Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

1 minute read

MoE LLM์˜ router weight๋ฅผ ํ™œ์šฉํ•˜๋ฉด ๋ณ„๋„ ์ถ”๊ฐ€ ํ•™์Šต ์—†์ด decoder-style LLM์—์„œ๋„ ๊ดœ์ฐฎ์€ representation (embedding) ๋ฝ‘์„ ์ˆ˜ ์žˆ๋‹ค.

LC-LLM RAG: Long-Context LLMs Meet RAG

2 minute read

LC-LLM์„ RAG์—์„œ ์“ธ ๋•Œ, (1) context ์ˆœ์„œ๋ฅผ ์ž˜ ์ฃผ๊ณ  (2) RAG ๋А๋‚Œ์„ ํŠœ๋‹์‹œ์ผœ์ฃผ๊ณ  (3) ๋ช…์‹œ์ ์œผ๋กœ relevant ์—ฌ๋ถ€๋ฅผ ํŒ๋‹จํ•˜๋„๋ก reasoning step ์ฃผ๋ฉด ๋” ์ž˜ํ•œ๋‹ค.

Differential Transformer

1 minute read

Proposes a transformer variant that splits Q/K into two groups and computes the difference between the two resulting softmax attention maps, amplifying attention on relevant context while cancelling out noise; improves hallucination.

Selective Attention Improves Transformer

1 minute read

Without changing any parameters in the attention computation, lets a generated token decide that another token is no longer needed, then reduces attention to the tokens judged unnecessary at future steps, effectively cutting memory usage and compute ...

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

1 minute read

Proposes FLASHSIGMOID, which speeds up attention computation by roughly 18% by replacing the softmax with a sigmoid plus a constant bias (based on the sequence length).

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

1 minute read

๋ชจ๋ธ ์‚ฌ์ด์ฆˆ๊ฐ€ ํฌ๊ณ  ํ•™์Šต ์‹œ๊ฐ„์ด ๊ธธ์ˆ˜๋ก hallucination์ด ๋œ ๋ฐœ์ƒํ•˜๋Š” ๊ฑด ๋งž์ง€๋งŒ,ย ์ด๋ฅผ 5%์ดํ•˜์˜ ๋‚ฎ์€ ์ˆ˜์ค€์œผ๋กœ ์ค„์ด๋ ค๋ฉด (์ผ๋ฐ˜์ ์œผ๋กœ ์•Œ๋ ค์ง„ scaling law๋ณด๋‹ค) ํ›จ์”ฌ ๋” ํฐ ๋ชจ๋ธ๊ณผ ๋” ๋งŽ์€ ์ปดํ“จํŒ… ์ž์›์ด ํ•„์š”ํ•˜๋‹ค.

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models

2 minute read

์•„๋ž-์„œ๊ตฌ๋ฌธํ™”๊ฐ€ ๋Œ€์กฐ๋˜๋Š” entity์™€ natural occurring prompt ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ์…‹ CAMeL์„ ์ œ์•ˆํ•˜๊ณ , ์ด๋ฅผ ํ†ตํ•ด ์‚ฌ๋ก€์—ฐ๊ตฌํ•œ ๊ฒฐ๊ณผ LLM์ด ์„œ๊ตฌ๋ฌธํ™”๊ถŒ entity์— ํŽธํ–ฅ๋˜์–ด ์žˆ์Œ์— ๋Œ€ํ•œ ์šฐ๋ ค

Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach

1 minute read

(1) On RAG vs. long-context LLMs: given sufficient resources, the LC LLM ultimately performed better; (2) for cost efficiency, proposes Self-Route, an approach that routes queries to RAG.

Better & Faster Large Language Models via Multi-token Prediction

less than 1 minute read

Training with multi-token prediction, rather than one token at a time, yields a better model. An LM trained with 4-token prediction can achieve up to 3x faster inference, even at large batch sizes.

Generative Representational Instruction Tuning

less than 1 minute read

Proposes Generative Representational Instruction Tuning, which unifies text embedding and generation. The single model, GritLM, achieves SoTA on both embedding (MTEB) and generation tasks (BBH, ...).

Chain-of-Thought Reasoning Without Prompting

less than 1 minute read

LLM์˜ decoding์„ greedy decoding์—์„œ top-k decoding์œผ๋กœ ๋ฐ”๊พธ๋ฉด prompt ์—†์ด๋„ CoT reasoning ์œ ๋„ ๊ฐ€๋Šฅ

Specialized Language Models with Cheap Inference from Limited Domain Data

less than 1 minute read

Under four constraints, 1) generic pretraining cost, 2) domain-specific pretraining cost, 3) inference cost, and 4) size of the domain-specific training set, an empiric...

Orion-14B: Open-source Multilingual Large Language Models

less than 1 minute read

Releases a multilingual model trained with a focus on East Asian languages, including Korean. Its vocab size, though relative, is by no means small, and its actual performance is excellent.

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability

less than 1 minute read

Training an LM, under standard LM training, to generate a particular text does not raise the probability of the texts that text implies. To also assign high probability to the related fact set (texts) for factuality...

Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers

less than 1 minute read

ODQA์—์„œ ๋ชจ๋ธ response๋ฅผ ๋” ์„ธ๋ถ„ํ™”๋œ ์ˆ˜์ค€์œผ๋กœ ๋‚˜๋ˆ ์„œ ์ •ํ™•์„ฑ ๋ฐ ์ •๋ณด์„ฑ ์ธก๋ฉด์—์„œ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋Š” GRANOLA QA ๋ฒค์น˜๋งˆํฌ ๊ณต๊ฐœ ๋ฐ ๊ทธ ์„ธ๋ถ„ํ™”๋œ ์ •๋ณด์„ฑ์„ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ ๋””์ฝ”๋”ฉ ๋ฐฉ์‹ DRAG ์ œ์•ˆ

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

less than 1 minute read

LM์ด Self-Talk๋ฅผ ํ†ตํ•ด training ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑ>์ •์ œ>SFT์— ํ™œ์šฉ (bootstrapping). ์ด ๊ณผ์ •์—์„œ ๋ณ‘๋ชฉ์„ ํ•ด์†Œํ•˜๊ธฐ ์œ„ํ•ด ๋Œ€ํ™”์„ฑ๊ณต ์—ฌ๋ถ€๋ฅผ ์ธก์ •ํ•˜๋Š” automatic metric ์ œ์•ˆ

Blending is All You Need

less than 1 minute read

์—ฌ๋Ÿฌ ๊ฐœ์˜ ์ž‘์€ ๋ชจ๋ธ์„ Blendํ•ด์„œ ํ•˜๋‚˜์˜ ํฐ ๋ชจ๋ธ๊ณผ ๋น„์Šทํ•œ ํ˜น์€ ๋” ๋‚˜์€ ์„ฑ๋Šฅ์„ ๋‚ผ ์ˆ˜ ์žˆ๋‹ค.

LLaMA Pro: Progressive LLaMA with Block Expansion

less than 1 minute read

Proposes block expansion, a post-pretraining scheme that updates only the newly added blocks' parameters on domain data, arguing it is especially useful for domain-specific tasks. The forgetting seen with full finetuning does not occur. Assuming the same data...

LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

less than 1 minute read

An sLLM (GPT2-small, LLaMA-7B, etc.) identifies and removes unnecessary tokens in the prompt (compression), achieving up to 20× compression while minimizing the LLM's performance loss.

LLaMA : Open and Efficient Foundation Language Models

less than 1 minute read

10๋ฐฐ ๋” ์ ์€ ํŒŒ๋ผ๋ฏธํ„ฐ(13B)๋กœ GPT-3 175B ๋Œ€๋น„ ๊ฑฐ์˜ ๋ชจ๋“  ๋ฒค์น˜๋งˆํฌ์—์„œ ๋” ๋‚˜์€ ์„ฑ๋Šฅ ๋‹ฌ์„ฑ.

llm-as-a-judge 5 posts

RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback

1 minute read

ํ•ด๋‹ต์˜ ์ •ํ™•์„ฑ ๋ฐ ๊ฐœ์„  ๊ธฐ์—ฌ ํ”ผ๋“œ๋ฐฑ์„ ๋ชจ๋‘ ํ‰๊ฐ€ํ•˜๋Š” dual-reward RL-trained critic model์„ ๋„์ž…ํ•œ RefCritic ์ œ์•ˆ, ์ˆ˜๋ฆฌ ์ถ”๋ก  ๊ณผ์ œ์—์„œ ํฐ ์„ฑ๋Šฅ ํ–ฅ์ƒ

Scaling Laws of Synthetic Data for Language Models

2 minute read

Argues that synthetic data generated with the SYNTHLLM approach scales predictably and effectively for LLM finetuning, and that, per their revised scaling law, it is a scalable remedy for the shortage of natural data.

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

1 minute read

์‚ฌ์ „์— ํ‰๊ฐ€ ๊ธฐ์ค€์„ ์ œ๊ณตํ•˜์ง€ ์•Š๊ณ , ์ž์ฒด์ ์œผ๋กœ ํ‰๊ฐ€ ๊ณ„ํš-์‹คํ–‰-ํŒ๋‹จ์„ ๋ถ„๋ฆฌํ•˜์—ฌ ์ˆ˜ํ–‰ํ•˜๋Š” Self-training loop์˜ thinking-llm-as-a-judge framework ์ œ์•ˆ, ์ ์€ ๋ฐ์ดํ„ฐ๋กœ๋„ SOTA ์„ฑ๋Šฅ๋‹ฌ์„ฑ

LLM Evaluators Recognize and Favor Their Own Generations

less than 1 minute read

LLM์€ ์ž๊ธฐ๊ฐ€ ๋งŒ๋“  ๊ฒฐ๊ณผ๋ฅผ ์„ ํ˜ธํ•œ๋‹ค๋Š” ๊ธฐ์กด ์ฃผ์žฅ์— ๋Œ€ํ•œ ์‹ฌ์ธต ๋…ผ์˜ (๊ฒฐ๋ก : ์‹ค์ œ ๊ทธ๋ ‡๋‹ค)

long-context 9 posts

LightMem: Lightweight and Efficient Memory-Augmented Generation

1 minute read

Proposes a 3-stage memory architecture with sensory > topic-aware short-term > sleep-time long-term memory updates, improving LongMemEval accuracy while sharply cutting token/API-call/runtime costs.

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs

2 minute read

Proposes a benchmark evaluating four hard problems in multi-turn setups (Instruction Retention, Inference Memory, Reliable Versioned Editing, Self-Coherence); even the latest SOTA models that pass existing benchmarks...

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

2 minute read

Proposes LongLLaDA, which stays stable beyond the context length seen in training thanks to diffusion-based LLMs' "local perception". Uses NTK-based RoPE extrapolation to extend a diffusion-based LLM's input le...

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

1 minute read

Proposes a prompting system applying gist memory and interactive look-up so the LLM, like a human, re-retrieves only the parts it needs, handling contexts up to 20× longer while improving performance.

Inference Scaling for Long-Context Retrieval Augmented Generation

2 minute read

LM์˜ RAG inference ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ scaling ์ „๋žต์„ ์ œ์•ˆํ•˜๊ณ , ์œ ํšจ ์ปจํ…์ŠคํŠธ ๊ธธ์ด์˜ ๊ทœ๋ชจ์™€ RAG ์„ฑ๋Šฅ ๊ฐ„์— ์„ ํ˜•์ ์ธ ๊ด€๊ณ„๊ฐ€ ์žˆ์Œ์„ ํ™•์ธ

LC-LLM RAG: Long-Context LLMs Meet RAG

2 minute read

LC-LLM์„ RAG์—์„œ ์“ธ ๋•Œ, (1) context ์ˆœ์„œ๋ฅผ ์ž˜ ์ฃผ๊ณ  (2) RAG ๋А๋‚Œ์„ ํŠœ๋‹์‹œ์ผœ์ฃผ๊ณ  (3) ๋ช…์‹œ์ ์œผ๋กœ relevant ์—ฌ๋ถ€๋ฅผ ํŒ๋‹จํ•˜๋„๋ก reasoning step ์ฃผ๋ฉด ๋” ์ž˜ํ•œ๋‹ค.

Differential Transformer

1 minute read

Proposes a transformer variant that splits Q/K into two groups and computes the difference between the two softmax attention maps, amplifying attention on relevant context while cancelling noise; improves hallucination.
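
A single-head NumPy sketch of the mechanism, under assumptions: λ is fixed here and causal masking is omitted, whereas the paper learns a reparameterized λ per layer and uses grouped multi-head attention.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Differential attention: subtracting the second softmax map from the
    first cancels common-mode attention noise while keeping the signal."""
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
    return (a1 - lam * a2) @ (x @ Wv)

rng = np.random.default_rng(0)
n, d_model, d_head = 4, 8, 8
x = rng.standard_normal((n, d_model))
Ws = [rng.standard_normal((d_model, d_head)) for _ in range(5)]
out = diff_attention(x, *Ws)
print(out.shape)  # (4, 8)
```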

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

1 minute read

Proposes NeedleBench, which evaluates long-context ability over (1) several length intervals, (2) varied depth ranges, (3) progressively harder tasks, and (4) two languages (English/Chinese), and reports results across a range of models.

long-horizon 3 posts

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

3 minute read

To resolve long-horizon LLM agents' context-window bottleneck, proposes Indexed Experience Memory, a structured memory system, together with MemexRL to train it.

MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks

2 minute read

Proposes a benchmark evaluating the Memory-Agent-Environment loop in multi-session, interdependent-subtask settings, demonstrating that existing memory systems are very fragile in real agentic settings.

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

3 minute read

Identifies entanglement as the core cause of planning failures in long-horizon tasks, and proposes resolving it with DAG-based planning decoupled into subtasks, with meaningful gains in performance and token savings.

lrm 2 posts

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

2 minute read

LRM์ด thinkํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์—ฌ๋„, ๋ณต์žก๋„๊ฐ€ ๋†’์œผ๋ฉด ์‹คํŒจํ•˜๊ฑฐ๋‚˜ ์ถ”๋ก ๋„ ๋น„ํšจ์œจ์ ์œผ๋กœ(=๋œ) ํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์•„, ์ง„์ •ํ•œ ์ผ๋ฐ˜ํ™” ์ถ”๋ก  ์„ฑ๋Šฅ์€ ๋ถ€์กฑํ•˜๋‹ค.

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

1 minute read

LRMs์ด overthinkingํ•˜๊ฒŒ ๋˜๋ฉด agentic ํ™˜๊ฒฝ๊ณผ ์ œ๋Œ€๋กœ ์ƒํ˜ธ์ž‘์šฉํ•˜์ง€ ๋ชปํ•˜๋Š” Reasoning-Action Dilemma๊ฐ€ ๋ฐœ์ƒ๋˜๊ณ , ์ด๋Š” ์„ฑ๋Šฅ ํ•˜๋ฝ์„ ์ดˆ๋ž˜ํ•œ๋‹ค๋Š” ๊ฒฐ๊ณผ ๋ณด๊ณ 

lvlm 1 posts

Slow Perception: Letโ€™s Perceive Geometric Figures Step-by-step

1 minute read

For geometry problems, making the model perceive slowly, step by step, helps performance.

memory 11 posts

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

3 minute read

To resolve long-horizon LLM agents' context-window bottleneck, proposes Indexed Experience Memory, a structured memory system, together with MemexRL to train it.

MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks

2 minute read

Proposes a benchmark evaluating the Memory-Agent-Environment loop in multi-session, interdependent-subtask settings, demonstrating that existing memory systems are very fragile in real agentic settings.

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

2 minute read

RL-trains memory consolidation and reasoning into a single internal state, keeping context size nearly constant on long-horizon tasks while improving performance.

SimpleMem: Efficient Lifelong Memory for LLM Agents

3 minute read

LLM Agent์˜ LTM์„ semantic lossless compression์œผ๋กœ ์žฌ์ •์˜ํ•˜๊ณ , write-time ๊ตฌ์กฐํ™”ยทonline synthesisยทintent-aware retrieval๋กœ ์„ฑ๋Šฅ๊ณผ ํ† ํฐ ํšจ์œจ(์ตœ๋Œ€ 30๋ฐฐ)์„ ๊ฐœ์„ ํ•œ ๋ฉ”๋ชจ๋ฆฌ ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ

Learning User Preferences Through Interaction for Long-Term Collaboration

2 minute read

Learning users' explicit preferences into memory during multi-turn interaction significantly improves long-term collaboration (success rate/efficiency/user burden) over simple recall-based memory.

Adaptation of Agentic AI

2 minute read

Noting that "adaptation" has been used loosely in agentic AI research, proposes a taxonomy separating what adapts (agent vs. tool) from the signal that drives adaptation, enabling systematic system-level design and comparison.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

2 minute read

Proposes Evo-Memory, a streaming benchmark evaluating LLM agents' ability to learn at test time by self-evolving past experience, plus baselines such as ExpRAG / ReMem as a basis for comparing experience-reuse-driven gains.

General Agentic Memory via Deep Research

2 minute read

Proposes a Just-In-Time memory framework combining a lightweight memorizer, a full-page store, and deep research, improving diverse long-term + multi-hop performance over existing pre-compressed (static) memory.

HaluMem: Evaluating Hallucinations in Memory Systems of Agents

4 minute read

Proposes a benchmark diagnosing where (extract > update > QA) hallucinations arise in agent memory systems.

LightMem: Lightweight and Efficient Memory-Augmented Generation

1 minute read

Proposes a 3-stage memory architecture with sensory > topic-aware short-term > sleep-time long-term memory updates, improving LongMemEval accuracy while sharply cutting token/API-call/runtime costs.

MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents

3 minute read

Proposes a benchmark for evaluating LLM-based agents' memory that unifies multi-scenario (participation & observation) and multi-level (factual & reflective) memory types and uses multi-metric evaluation: M...

mia 1 posts

Detecting Training Data of Large Language Models via Expectation Maximization

2 minute read

Proposes EM-MIA, a new MIA method for LLMs that iteratively updates membership scores and prefix scores via the Expectation-Maximization algorithm for better membership inference.

mid 1 posts
mllm 1 posts

Honeybee: Locality-enhanced Projector for Multimodal LLM

2 minute read

MLLM์—์„œ vision encoder์™€ LLM ์‚ฌ์ด์˜ visual projector๊ฐ€ ํ•ต์‹ฌ ๋ณ‘๋ชฉ์ž„์„ ๋ถ„์„, visual token flexibility์™€ locality preservation์„ ๋™์‹œ์— ๋งŒ์กฑํ•˜๋Š” Honeybee projector๋ฅผ ์ œ์•ˆ

moe 1 posts

MoEE: Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

1 minute read

MoE LLM์˜ router weight๋ฅผ ํ™œ์šฉํ•˜๋ฉด ๋ณ„๋„ ์ถ”๊ฐ€ ํ•™์Šต ์—†์ด decoder-style LLM์—์„œ๋„ ๊ดœ์ฐฎ์€ representation (embedding) ๋ฝ‘์„ ์ˆ˜ ์žˆ๋‹ค.

multi-agent 3 posts

General Agentic Memory via Deep Research

2 minute read

Proposes a Just-In-Time memory framework combining a lightweight memorizer, a full-page store, and deep research, improving diverse long-term + multi-hop performance over existing pre-compressed (static) memory.

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

1 minute read

ํ˜‘์—…์ /๊ฒฝ์Ÿ์  ์ƒํ™ฉ์—์„œ ์—์ด์ „ํŠธ๋ผ๋ฆฌ ์ƒํ˜ธ์ž‘์šฉํ•˜๋Š” ์‹œ์Šคํ…œ ํ‰๊ฐ€์— ๋Œ€ํ•œ ๋ฒค์น˜๋งˆํฌย MARBLEย ์ œ์•ˆ

multi-linguality 4 posts

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models

2 minute read

์•„๋ž-์„œ๊ตฌ๋ฌธํ™”๊ฐ€ ๋Œ€์กฐ๋˜๋Š” entity์™€ natural occurring prompt ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ์…‹ CAMeL์„ ์ œ์•ˆํ•˜๊ณ , ์ด๋ฅผ ํ†ตํ•ด ์‚ฌ๋ก€์—ฐ๊ตฌํ•œ ๊ฒฐ๊ณผ LLM์ด ์„œ๊ตฌ๋ฌธํ™”๊ถŒ entity์— ํŽธํ–ฅ๋˜์–ด ์žˆ์Œ์— ๋Œ€ํ•œ ์šฐ๋ ค

Orion-14B: Open-source Multilingual Large Language Models

less than 1 minute read

ํ•œ๊ตญ์–ด ํฌํ•จ ๋™์•„์‹œ์•„๊ถŒ ์–ธ์–ด๋ฅผ ์ค‘์‹ฌ์œผ๋กœ ํ•™์Šต๋œ multilingual model ๊ณต๊ฐœ. Vocab ์‚ฌ์ด์ฆˆ๋„ย ์ƒ๋Œ€์ ์ด์ง€๋งŒย ๊ฒฐ์ฝ” ์ž‘์ง€ ์•Š๊ณ , ์‹ค์ œ ์„ฑ๋Šฅ๋„ ํ›Œ๋ฅญํ•œ ์ˆ˜์ค€.

multi-modality 4 posts

Honeybee: Locality-enhanced Projector for Multimodal LLM

2 minute read

MLLM์—์„œ vision encoder์™€ LLM ์‚ฌ์ด์˜ visual projector๊ฐ€ ํ•ต์‹ฌ ๋ณ‘๋ชฉ์ž„์„ ๋ถ„์„, visual token flexibility์™€ locality preservation์„ ๋™์‹œ์— ๋งŒ์กฑํ•˜๋Š” Honeybee projector๋ฅผ ์ œ์•ˆ

Slow Perception: Letโ€™s Perceive Geometric Figures Step-by-step

1 minute read

For geometry problems, making the model perceive slowly, step by step, helps performance.

ReALM: Reference Resolution As Language Modeling

less than 1 minute read

Attempts to solve reference resolution pipeline-style with a small model (ReALM) finetuned for it.

multi-turn 1 posts

Flipping the Dialogue: Training and Evaluating User Language Models

3 minute read

Instructing an assistant LM to role-play the user is inherently unrealistic; a UserLM trained on real human user behavior reproduces far more natural multi-turn user behavior, exposing assistants' true limits.

odqa 5 posts

GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning

2 minute read

Trains a GraphRAG agent with RL (GRPO) plus two constrained rewards (RPA + CAF); using a hybrid of triplets and natural language as retrieval input yields large gains on multi-hop QA.

SSRL: Self-Search Reinforcement Learning

1 minute read

๊ฒ€์ƒ‰์—”์ง„์ด๋‚˜ ๋‹ค๋ฅธ LLM ๋“ฑ ์™ธ๋ถ€ tool ์—†์ด ๊ฒ€์ƒ‰์„ Full-simulationํ•ด์„œ RL โ†’ real-world๋กœ ์ „์ด ๊ฐ€๋Šฅํ•œ self-search ๋ชจ๋ธ ๊ตฌ์ถ•

Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers

less than 1 minute read

ODQA์—์„œ ๋ชจ๋ธ response๋ฅผ ๋” ์„ธ๋ถ„ํ™”๋œ ์ˆ˜์ค€์œผ๋กœ ๋‚˜๋ˆ ์„œ ์ •ํ™•์„ฑ ๋ฐ ์ •๋ณด์„ฑ ์ธก๋ฉด์—์„œ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋Š” GRANOLA QA ๋ฒค์น˜๋งˆํฌ ๊ณต๊ฐœ ๋ฐ ๊ทธ ์„ธ๋ถ„ํ™”๋œ ์ •๋ณด์„ฑ์„ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ ๋””์ฝ”๋”ฉ ๋ฐฉ์‹ DRAG ์ œ์•ˆ

Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets

less than 1 minute read

ODQA์—์„œ ์ž์ฃผ ์‚ฌ์šฉํ•˜๋Š” ๋ฒค์น˜๋งˆํฌ NQ์— ๋Œ€ํ•œ ๋น„ํŒ์  ์‹œ๊ฐ์„ ๋‹ด์€ ๋…ผ๋ฌธ. ๊ธฐ์กด ๋ฒค์น˜๋งˆํฌ๋Š” train์—์„œ ๋ณธ ๋‚ด์šฉ์„ ์•”๊ธฐํ•˜๋Š” ์—ญํ• ์„ ํ…Œ์ŠคํŠธํ•˜๋Š” ๊ฒƒ์œผ๋กœ ๋ณด์ž„.

optimization 6 posts

Planning Like Human: A Dual-process Framework for Dialogue Planning

1 minute read

Proposes a dual dialogue-planning framework that complementarily uses an intuitive (fast) policy model for familiar situations and an analytical (slow) policy model for novel scenarios.

The boundary of neural network trainability is fractal

less than 1 minute read

Fractal patterns, complex repeating patterns, show up in the settings (hyperparameters) that control the AI training process.

pbrl 1 posts
peft 5 posts

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

2 minute read

Proposes DnD, a model SFT-trained with prompts as input and LoRA-tuned parameters as output. Train DnD once and it can produce task-specific LoRA weights without any per-task training.

Differential Transformer

1 minute read

Proposes a transformer variant that splits Q/K into two groups and computes the difference between the two softmax attention maps, amplifying attention on relevant context while cancelling noise; improves hallucination.

Adaptive Retrieval-Augmented Generation for Conversational Systems

1 minute read

Proposes a mechanism that selectively decides, at each turn of a given dialogue, whether augmentation with external knowledge is needed.

Generative Representational Instruction Tuning

less than 1 minute read

Proposes Generative Representational Instruction Tuning, unifying text embedding and generation. The single model, GritLM, achieves SoTA on both embedding (MTEB) and generation tasks (BBH, ...).

Specialized Language Models with Cheap Inference from Limited Domain Data

less than 1 minute read

An empirical study of the most efficient training under four constraints: 1) generic pretraining cost, 2) domain-specific pretraining cost, 3) inference cost, 4) size of the domain-specific training set...

persona 1 posts

Persona Vectors: Monitoring and Controlling Character Traits in Language Models

3 minute read

Proposes automatically extracting and applying persona vectors to detect/predict/mitigate personality trait shifts (sycophancy, hallucination, malice) before, after, or during LLM fine-tuning.

personalization 2 posts

When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs

4 minute read

Personalization์€ ๋‹จ์ˆœํžˆ user-aligned bias๊ฐ€ ์•„๋‹ˆ๋ผ factual representation๊ณผ entangle๋˜๋ฉด์„œ ์ฒด๊ณ„์ ์ธ hallucination์„ ๋งŒ๋“ ๋‹ค๋Š” ์‚ฌ์‹ค์„ representation level์—์„œ ๋ฐํžˆ๊ณ  inference-time์—์„œ ์ด๋ฅผ ์ œ...

A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models

2 minute read

LLM์—์„œ์˜ ๊ฐœ์ธํ™”/๋‹ค์›์  ์„ ํ˜ธ ์ •๋ ฌ์„ training/test-time, ์‚ฌ์šฉ์ž ๋ชจ๋ธ๋ง ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•์œผ๋กœ ์ฒด๊ณ„ํ™”, ํ‰๊ฐ€ ๋ฐ ํ™•์žฅ์„ฑ ์ธก๋ฉด์˜ ๊ตฌ์กฐ์  ํ•œ๊ณ„ ํ™•์ธ

petl 6 posts

Selective Attention Improves Transformer

1 minute read

Without changing attention parameters, lets a generated token decide that another token is no longer needed, then reduces attention to tokens judged unnecessary at future steps, effectively cutting memory usage and compute...

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation

less than 1 minute read

Proposes DualLoRA, which mitigates the problem in multi-layer transformer-family models where the prompt fades toward the later layers.

Specialized Language Models with Cheap Inference from Limited Domain Data

less than 1 minute read

An empirical study of the most efficient training under four constraints: 1) generic pretraining cost, 2) domain-specific pretraining cost, 3) inference cost, 4) size of the domain-specific training set...

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

less than 1 minute read

Proposes a new post-training sparsification that slices weight matrices into smaller, denser matrices. Up to 25% of parameters (embeddings included) can be removed while holding the performance drop to 1%–10%.

planning 2 posts

SimpleMem: Efficient Lifelong Memory for LLM Agents

3 minute read

LLM Agent์˜ LTM์„ semantic lossless compression์œผ๋กœ ์žฌ์ •์˜ํ•˜๊ณ , write-time ๊ตฌ์กฐํ™”ยทonline synthesisยทintent-aware retrieval๋กœ ์„ฑ๋Šฅ๊ณผ ํ† ํฐ ํšจ์œจ(์ตœ๋Œ€ 30๋ฐฐ)์„ ๊ฐœ์„ ํ•œ ๋ฉ”๋ชจ๋ฆฌ ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

3 minute read

Identifies entanglement as the core cause of planning failures in long-horizon tasks, and proposes resolving it with DAG-based planning decoupled into subtasks, with meaningful gains in performance and token savings.

post-training 1 posts

Reasoning with Sampling: Your Base Model is Smarter Than You Think

2 minute read

With no extra training, plain MCMC-based sampling lets an LLM's base model reach the reasoning ability of RL post-trained models.

ppo 1 posts

GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning

2 minute read

Trains a GraphRAG agent with RL (GRPO) plus two constrained rewards (RPA + CAF); using a hybrid of triplets and natural language as retrieval input yields large gains on multi-hop QA.

preference 1 posts

Learning User Preferences Through Interaction for Long-Term Collaboration

2 minute read

Learning users' explicit preferences into memory during multi-turn interaction significantly improves long-term collaboration (success rate/efficiency/user burden) over simple recall-based memory.

projector 1 posts

Honeybee: Locality-enhanced Projector for Multimodal LLM

2 minute read

MLLM์—์„œ vision encoder์™€ LLM ์‚ฌ์ด์˜ visual projector๊ฐ€ ํ•ต์‹ฌ ๋ณ‘๋ชฉ์ž„์„ ๋ถ„์„, visual token flexibility์™€ locality preservation์„ ๋™์‹œ์— ๋งŒ์กฑํ•˜๋Š” Honeybee projector๋ฅผ ์ œ์•ˆ

prompt-compression 2 posts

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

1 minute read

Proposes a prompting system applying gist memory and interactive look-up so the LLM, like a human, re-retrieves only the parts it needs, handling contexts up to 20× longer while improving performance.

prompting 10 posts

Chain of Draft: Thinking Faster by Writing Less

1 minute read

ํ•„์ˆ˜์ ์ธ ์ค‘๊ฐ„ ์ถ”๋ก ๋งŒ ์ตœ์†Œํ•œ์œผ๋กœ ์ƒ์„ฑ, ํ† ํฐ ์‚ฌ์šฉ๊ณผ ์ถ”๋ก  ์‹œ๊ฐ„์„ ํฌ๊ฒŒ ์ค„์ด๋Š” ํ”„๋กฌํ”„ํŒ… ๋ฐฉ์‹ CoD ์ œ์•ˆ

Adaptive Retrieval-Augmented Generation for Conversational Systems

1 minute read

Proposes a mechanism that selectively decides, at each turn of a given dialogue, whether augmentation with external knowledge is needed.

Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost

1 minute read

Simply putting a length limit in the prompt enables efficient reasoning with little effect on performance.

Self-Discover: Large Language Models Self-Compose Reasoning Structures

less than 1 minute read

๋ธ์ด ์—ฌ๋Ÿฌ reasoning techniques(CoT, critical thinking, ...) ์ค‘์—์„œ ํ•˜๋‚˜๋ฅผ ์Šค์Šค๋กœ ์„ ํƒํ•˜์—ฌ task๋ณ„๋กœ ์ ํ•ฉํ•œ ์ถ”๋ก  ์ „๋žต์„ ๊ตฌ์„ฑํ•˜๋„๋ก ํ•˜๋Š” ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ. BBH์—์„œ ๋‹จ์ˆœ CoT๋ณด๋‹ค ์„ฑ๋Šฅ์ด ์ข‹๊ณ  CoT Self-consistency๋ณด๋‹ค๋„ ์ถ”...

LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

less than 1 minute read

An sLLM (GPT2-small, LLaMA-7B, etc.) identifies and removes unnecessary tokens in the prompt (compression), achieving up to 20× compression while minimizing the LLM's performance loss.

rag 27 posts

SimpleMem: Efficient Lifelong Memory for LLM Agents

3 minute read

LLM Agent์˜ LTM์„ semantic lossless compression์œผ๋กœ ์žฌ์ •์˜ํ•˜๊ณ , write-time ๊ตฌ์กฐํ™”ยทonline synthesisยทintent-aware retrieval๋กœ ์„ฑ๋Šฅ๊ณผ ํ† ํฐ ํšจ์œจ(์ตœ๋Œ€ 30๋ฐฐ)์„ ๊ฐœ์„ ํ•œ ๋ฉ”๋ชจ๋ฆฌ ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ

LightMem: Lightweight and Efficient Memory-Augmented Generation

1 minute read

Proposes a 3-stage memory architecture with sensory > topic-aware short-term > sleep-time long-term memory updates, improving LongMemEval accuracy while sharply cutting token/API-call/runtime costs.

GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning

2 minute read

Trains a GraphRAG agent with RL (GRPO) plus two constrained rewards (RPA + CAF); using a hybrid of triplets and natural language as retrieval input yields large gains on multi-hop QA.

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

1 minute read

Proposes a prompting system applying gist memory and interactive look-up so the LLM, like a human, re-retrieves only the parts it needs, handling contexts up to 20× longer while improving performance.

Inference Scaling for Long-Context Retrieval Augmented Generation

2 minute read

LM์˜ RAG inference ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ scaling ์ „๋žต์„ ์ œ์•ˆํ•˜๊ณ , ์œ ํšจ ์ปจํ…์ŠคํŠธ ๊ธธ์ด์˜ ๊ทœ๋ชจ์™€ RAG ์„ฑ๋Šฅ ๊ฐ„์— ์„ ํ˜•์ ์ธ ๊ด€๊ณ„๊ฐ€ ์žˆ์Œ์„ ํ™•์ธ

LC-LLM RAG: Long-Context LLMs Meet RAG

2 minute read

LC-LLM์„ RAG์—์„œ ์“ธ ๋•Œ, (1) context ์ˆœ์„œ๋ฅผ ์ž˜ ์ฃผ๊ณ  (2) RAG ๋А๋‚Œ์„ ํŠœ๋‹์‹œ์ผœ์ฃผ๊ณ  (3) ๋ช…์‹œ์ ์œผ๋กœ relevant ์—ฌ๋ถ€๋ฅผ ํŒ๋‹จํ•˜๋„๋ก reasoning step ์ฃผ๋ฉด ๋” ์ž˜ํ•œ๋‹ค.

Knowing When to Ask - Bridging Large Language Models and Data

1 minute read

Introduces DataGemma, which uses Data Commons (a knowledge graph) to improve the factuality and reliability of LLM responses, closing the gap between LLMs and real data.

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Pandoraโ€™s Box or Aladdinโ€™s Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models

2 minute read

LLM์˜ RAG ์ƒํ™ฉ์—์„œ ๋‹ค์–‘ํ•œ Noise๋ฅผ ๊ตฌ๋ถ„ํ•˜๊ณ  ๋ถ„์„. ์œ ์ตํ•œ Noise์˜ ๊ฒฝ์šฐ ๋ชจ๋ธ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธ. ๋ฒค์น˜๋งˆํฌ NoiserBench๋ฅผ ์ œ์‹œํ•˜์—ฌ LLM์˜ Noise ๋Œ€์‘ ํ‰๊ฐ€ ๋ฐ ์œ ์ตํ•œ noise๋Š” ํ™œ์šฉํ•˜๊ณ  ํ•ด๋กœ์šด noise๋Š” ์ค„์ด๋Š” ๋ฐฉ๋ฒ• ์ œ์‹œ.

Adaptive Retrieval-Augmented Generation for Conversational Systems

1 minute read

Proposes a mechanism that selectively decides, at each turn of a given dialogue, whether augmentation with external knowledge is needed.

Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach

1 minute read

(1) Comparing RAG vs. long-context LLMs: given enough resources, the LC LLM ultimately performs better; (2) for cost efficiency, Self-Route is proposed, an approach that routes queries to RAG.

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

less than 1 minute read

Proposes a framework that generates diverse documents and builds QA pairs to evaluate LLMs' ability to use knowledge across diverse scenarios.

Generative Representational Instruction Tuning

less than 1 minute read

Proposes Generative Representational Instruction Tuning, unifying text embedding and generation. The single model, GritLM, achieves SoTA on both embedding (MTEB) and generation tasks (BBH, ...).

The Power of Noise: Redefining Retrieval for RAG Systems

less than 1 minute read

RAG์—์„œ Retrieval ์— ์ง‘์ค‘ํ•˜์—ฌ, document์™€ prompt์˜ ์—ฐ๊ด€์„ฑ, prompt์—์„œ document์˜ ์œ„์น˜์™€ ์ˆ˜ ๋“ฑ ๋‹ค์–‘ํ•œ ์š”์†Œ๋ฅผ ํ‰๊ฐ€.

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

less than 1 minute read

Points out that existing RAG benchmarks are limited in scope and diversity and ignore the effects of the retriever and external KB; classifies RAG applications as CRUD and releases evaluation tasks and datasets for each. (Chinese)

Corrective Retrieval Augmented Generation

less than 1 minute read

Reduces hallucination in generated output by self-correcting wrongly retrieved or suboptimal results with a confidence score, web search, and knowledge refinement.

DocLLM: A layout-aware generative language model for multimodal document understanding

less than 1 minute read

Inspired by multi-modal LLMs, has the LM take text plus positional information (within structured documents) as input, addressing internal structured-document understanding.

LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

less than 1 minute read

An sLLM (GPT2-small, LLaMA-7B, etc.) identifies and removes unnecessary tokens in the prompt (compression), achieving up to 20× compression while minimizing the LLM's performance loss.

REPLUG: Retrieval-Augmented Black-Box Language Models

less than 1 minute read

์–ธ์–ด ๋ชจ๋ธ์„ ๋ธ”๋ž™๋ฐ•์Šค๋กœ ์ทจ๊ธ‰ํ•˜๊ณ  ๊ฒ€์ƒ‰ ๊ตฌ์„ฑ์š”์†Œ๋ฅผ ์ž ์žฌ์ ์œผ๋กœ ์กฐ์ •๊ฐ€๋Šฅํ•œ ๋ชจ๋“ˆ๋กœ ์ถ”๊ฐ€ํ•˜๋Š” ์ƒˆ๋กœ์šด retrieval-Augmented LM ํŒจ๋Ÿฌ๋‹ค์ž„ ์ œ์•ˆ

reasoning 17 posts

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

2 minute read

RL-trains memory consolidation and reasoning into a single internal state, keeping context size nearly constant on long-horizon tasks while improving performance.

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

3 minute read

Identifies entanglement as the core cause of planning failures in long-horizon tasks, and proposes resolving it with DAG-based planning decoupled into subtasks, with meaningful gains in performance and token savings.

WAIT, WAIT, WAITโ€ฆ Why Do Reasoning Models Loop?

2 minute read

Reasoning ๋ชจ๋ธ์˜ looping์€ decoding artifact๋งŒ์ด ์•„๋‹ˆ๋ผ learning errors๊ฐ€ greedy/low-temp์—์„œ ์ฆํญ๋˜๋ฉฐ ๋ฐœ์ƒ, temperature๋Š” loop๋ฅผ ์ค„์ด์ง€๋งŒ ๊ทผ๋ณธ ์›์ธ์„ ๊ณ ์น˜์ง€ ๋ชปํ•ด ๋ถˆํ•„์š”ํ•˜๊ฒŒ ๊ธด CoT๋ฅผ ์ƒ์„ฑํ•œ๋‹ค.

Reasoning with Sampling: Your Base Model is Smarter Than You Think

2 minute read

With no extra training, plain MCMC-based sampling lets an LLM's base model reach the reasoning ability of RL post-trained models.

SSRL: Self-Search Reinforcement Learning

1 minute read

๊ฒ€์ƒ‰์—”์ง„์ด๋‚˜ ๋‹ค๋ฅธ LLM ๋“ฑ ์™ธ๋ถ€ tool ์—†์ด ๊ฒ€์ƒ‰์„ Full-simulationํ•ด์„œ RL โ†’ real-world๋กœ ์ „์ด ๊ฐ€๋Šฅํ•œ self-search ๋ชจ๋ธ ๊ตฌ์ถ•

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

2 minute read

LRM์ด thinkํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์—ฌ๋„, ๋ณต์žก๋„๊ฐ€ ๋†’์œผ๋ฉด ์‹คํŒจํ•˜๊ฑฐ๋‚˜ ์ถ”๋ก ๋„ ๋น„ํšจ์œจ์ ์œผ๋กœ(=๋œ) ํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์•„, ์ง„์ •ํ•œ ์ผ๋ฐ˜ํ™” ์ถ”๋ก  ์„ฑ๋Šฅ์€ ๋ถ€์กฑํ•˜๋‹ค.

Reasoning Models Can Be Effective Without Thinking

1 minute read

reasoning ์—†์ด reasoning ์„ฑ๋Šฅ ๋‚ด๊ธฐ - ํ”„๋กฌํ”„ํŠธ๋งŒ ๋ฐ”๊ฟ”์„œ ์งง๊ฒŒ ์—ฌ๋Ÿฌ ๋‹ต๋ณ€ ์ƒ์„ฑ์‹œํ‚ค๋Š”๊ฒŒ ๊ธด CoT๋ณด๋‹ค ๋‚˜์„ ์ˆ˜ ์žˆ๋‹ค.

Concise Reasoning via Reinforcement Learning

1 minute read

RL๋กœ ํ•™์Šต๋œ LLM์ด ๋ถˆํ•„์š”ํ•˜๊ฒŒ ๊ธด ์ถ”๋ก ์„ ์ƒ์„ฑํ•˜์ง€๋งŒ, 2-phrase RL๋กœ ์ •ํ™•๋„๋ฅผ ์œ ์ง€ํ•˜๋ฉด์„œ ๊ฐ„๊ฒฐํ•œ ์ถ”๋ก ์„ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค.

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

1 minute read

LRMs์ด overthinkingํ•˜๊ฒŒ ๋˜๋ฉด agentic ํ™˜๊ฒฝ๊ณผ ์ œ๋Œ€๋กœ ์ƒํ˜ธ์ž‘์šฉํ•˜์ง€ ๋ชปํ•˜๋Š” Reasoning-Action Dilemma๊ฐ€ ๋ฐœ์ƒ๋˜๊ณ , ์ด๋Š” ์„ฑ๋Šฅ ํ•˜๋ฝ์„ ์ดˆ๋ž˜ํ•œ๋‹ค๋Š” ๊ฒฐ๊ณผ ๋ณด๊ณ 

LIMO - Less is More for Reasoning

1 minute read

Improving mathematical reasoning with a small amount of good data: what matters is drawing out what the model already knows.

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

1 minute read

Analyzes underthinking, where o1-like LLMs needlessly switch lines of thought too often on hard problems.

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

1 minute read

Repeated sampling brings large coverage benefits to LLM performance, and where automatic verification is possible it also greatly improves accuracy.

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLMโ€™s Reasoning Capability

1 minute read

Proposes identifying the tokens that play a key (causal) role in faulty reasoning (critical tokens) and using them to improve model reasoning (cDPO).

Reverse Thinking Makes LLMs Stronger Reasoners

1 minute read

LLM์ด '์—ญ๋ฐœ์ƒ'์„ ํ•™์Šตํ•˜๋„๋ก ํ›ˆ๋ จํ•˜๋ฉด ์ƒ์‹, ์ˆ˜ํ•™, ๋…ผ๋ฆฌ์  ์ถ”๋ก ๊ฐ™์€ task ์„ฑ๋Šฅ ํ–ฅ์ƒ์— ํฐ ๋„์›€. x10๋งŒํผ์˜ forward training(standard finetuning)๋ณด๋‹ค ์„ฑ๋Šฅ์ด ๋›ฐ์–ด๋‚˜๋‹ค๊ณ  ์ฃผ์žฅ.

reinforcement-learning 11 posts

Reasoning with Sampling: Your Base Model is Smarter Than You Think

2 minute read

With no extra training, plain MCMC-based sampling lets an LLM's base model reach the reasoning ability of RL post-trained models.

GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning

2 minute read

Trains a GraphRAG agent with RL (GRPO) plus two constrained rewards (RPA + CAF); using a hybrid of triplets and natural language as retrieval input yields large gains on multi-hop QA.

TTRL: Test-Time Reinforcement Learning

1 minute read

test ๋ฐ์ดํ„ฐ๋งŒ์œผ๋กœ majority-voting์œผ๋กœ reward ์ถ”์ •, ์ด๋ฅผ ํ†ตํ•ด RL ์‹œ๋„ํ•˜๋Š” ์ œ์•ˆ TTRL์ดย reasoning ์„ฑ๋Šฅ์„ x2~x3๊นŒ์ง€ ๋Œ์–ด์˜ฌ๋ฆด ์ˆ˜ ์žˆ๋‹ค

Concise Reasoning via Reinforcement Learning

1 minute read

RL-trained LLMs generate unnecessarily long reasoning, but a two-phase RL scheme can induce concise reasoning while maintaining accuracy.

Planning Like Human: A Dual-process Framework for Dialogue Planning

1 minute read

Proposes a dual-process dialogue planning framework that complementarily combines an intuitive (fast) policy model for familiar situations with an analytical (slow) policy model for novel scenarios.

Scaling Laws for Reward Model Overoptimization

less than 1 minute read

Training a policy model against an RM (inevitably) causes overoptimization, with the gap from real (human) preference widening as training proceeds; scaling up the RM appears to meaningfully delay its onset.

Self-Rewarding Language Models

less than 1 minute read

๋ฐ˜๋ณต์ ์ธ DPO ํ›ˆ๋ จ์œผ๋กœ ์‚ฌ๋žŒ์ด ์„ค๊ณ„ํ•œ reward model์ด ์•„๋‹Œ,ย LLM-as-a-Judgeย mechanism์„ ์‚ฌ์šฉ, LM์ด ์ž์œจ์ ์œผ๋กœ instruction following & reward modeling > refine ๋ฐ˜๋ณต.

representation-learning 5 posts

MoEE: Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

1 minute read

MoE LLM์˜ router weight๋ฅผ ํ™œ์šฉํ•˜๋ฉด ๋ณ„๋„ ์ถ”๊ฐ€ ํ•™์Šต ์—†์ด decoder-style LLM์—์„œ๋„ ๊ดœ์ฐฎ์€ representation (embedding) ๋ฝ‘์„ ์ˆ˜ ์žˆ๋‹ค.

Is Cosine-Similarity of Embeddings Really About Similarity?

less than 1 minute read

Cosine similarity should not be blindly trusted as a measure of semantic similarity.
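A toy illustration of why (my own example, not from the paper): embeddings from regularized matrix factorization are only defined up to a per-dimension rescaling, and such a rescaling can arbitrarily reorder cosine similarities.

```python
from math import sqrt

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))

def rescale(v, diag):
    # apply a diagonal rescaling D; many factorization objectives are invariant
    # to this, yet it is free to change cosine similarities
    return [x * s for x, s in zip(v, diag)]

a, b, c = [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]

# under the original embedding, b and c are equally similar to a ...
before = (cosine(a, b), cosine(a, c))

# ... but after an equally valid per-dimension rescaling, the ranking breaks
D = [10.0, 0.1]
after = (cosine(rescale(a, D), rescale(b, D)), cosine(rescale(a, D), rescale(c, D)))
```

The "similarity" thus depends on an arbitrary choice the training objective never pinned down.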

Generative Representational Instruction Tuning

less than 1 minute read

Proposes Generative Representational Instruction Tuning, unifying text embedding and generation; the single model GritLM achieves SoTA on both embedding (MTEB) and generation tasks (BBH, ...).

Improving Text Embeddings with Large Language Models

less than 1 minute read

Uses GPT-3.5 and GPT-4 with a 2-step prompt to build synthetic data (94 languages, 500K examples), then trains a decoder-only LLM (Mistral-7B) for 1 epoch with a contrastive loss. This unlab...

rl 2 posts

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

3 minute read

To address the context window bottleneck of long-horizon LLM agents, proposes the structured memory system Indexed Experience Memory and MemexRL for training it.

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

2 minute read

Trains via RL to consolidate memory and reasoning into a single internal state, maintaining a nearly constant context size on long-horizon tasks while improving performance.

sae 2 posts

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

1 minute read

Replaces the vanilla ReLU with a discontinuous JumpReLU activation for a new SAE (sparse autoencoder) SOTA; despite the discontinuous activation, training is effective via a straight-through estimator.

safety 4 posts

Safety Layers of Aligned Large Language Models: The Key to LLM Security

1 minute read

Confirms that safety layers exist among the internal parameters of various aligned LLMs; these layers identify and refuse malicious user queries. Building on this, proposes SPPFT, a finetuning method that preserves safety.

Social Learning: Towards Collaborative Learning with Large Language Models

1 minute read

Inspired by social learning, proposes a structure in which an LLM (teacher) teaches other AI models (students), increasing safety with no loss in performance.

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

less than 1 minute read

LLM๋„ ๊ธฐ๋งŒ์ (deceptive)์ผ ์ˆ˜ ์žˆ๋‹ค. LLM์ด ๋”์šฑ ์ผ๊ด€๋˜๊ณ  ๋…ผ๋ฆฌ์ ์ธ ๊ธฐ๋งŒ์„ ์ƒ์„ฑํ•˜๋„๋ก ํ•™์Šต ๊ฐ€๋Šฅํ•˜๊ณ , ์ด๋Š” standard๋กœ ์•Œ๋ ค์ง„ safety ํ•™์Šต ๋ฐฉ์‹์œผ๋กœ๋Š” ์ฒ˜๋ฆฌ๋˜์ง€ ๋ชปํ•จ.

scaling-laws 1 post

Scaling Laws of Synthetic Data for Language Models

2 minute read

Synthetic data generated with SYNTHLLM scales predictably and effectively for LLM finetuning, and the authors argue that, following their revised scaling law, it offers a scalable answer to the shortage of natural data.

self-improvement 14 posts

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

2 minute read

Proposes Evo-Memory, a streaming benchmark evaluating an LLM agent's ability to learn at test time by self-evolving its past experiences, and introduces baselines such as ExpRAG / ReMem for comparative evaluation of experience-reuse gains.

RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback

1 minute read

ํ•ด๋‹ต์˜ ์ •ํ™•์„ฑ ๋ฐ ๊ฐœ์„  ๊ธฐ์—ฌ ํ”ผ๋“œ๋ฐฑ์„ ๋ชจ๋‘ ํ‰๊ฐ€ํ•˜๋Š” dual-reward RL-trained critic model์„ ๋„์ž…ํ•œ RefCritic ์ œ์•ˆ, ์ˆ˜๋ฆฌ ์ถ”๋ก  ๊ณผ์ œ์—์„œ ํฐ ์„ฑ๋Šฅ ํ–ฅ์ƒ

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

2 minute read

LRM์ด thinkํ•˜๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์—ฌ๋„, ๋ณต์žก๋„๊ฐ€ ๋†’์œผ๋ฉด ์‹คํŒจํ•˜๊ฑฐ๋‚˜ ์ถ”๋ก ๋„ ๋น„ํšจ์œจ์ ์œผ๋กœ(=๋œ) ํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์•„, ์ง„์ •ํ•œ ์ผ๋ฐ˜ํ™” ์ถ”๋ก  ์„ฑ๋Šฅ์€ ๋ถ€์กฑํ•˜๋‹ค.

Scaling Laws of Synthetic Data for Language Models

2 minute read

Synthetic data generated with SYNTHLLM scales predictably and effectively for LLM finetuning, and the authors argue that, following their revised scaling law, it offers a scalable answer to the shortage of natural data.

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

1 minute read

Proposes a self-training-loop Thinking-LLM-as-a-Judge framework that, without predefined evaluation criteria, separates evaluation planning, execution, and judgment; achieves SOTA with little data.

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

1 minute read

Repeated sampling yields large gains in coverage for LLM performance, and where automatic verification is possible it also greatly improves accuracy.

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

1 minute read

Rather than selecting only good data at the instance level, Diversity-Centric Data Selection using k-means clustering meaningfully improves the efficiency and performance of LLM finetuning.
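A hedged sketch of cluster-based diverse selection (plain k-means plus nearest-to-centroid picks; all names are mine, and details such as the paper's iterative refinement are omitted):

```python
import random

def kmeans(points, k, iters=10, seed=0):
    """Tiny Lloyd's k-means over tuples of floats."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[i].append(p)
        # recompute centroids; keep the old center if a cluster emptied out
        centers = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

def diverse_select(points, k, per_cluster=1):
    """Pick the instances closest to each centroid: diversity across clusters
    rather than a global top-scored subset."""
    centers, clusters = kmeans(points, k)
    selected = []
    for center, cl in zip(centers, clusters):
        cl_sorted = sorted(cl, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, center)))
        selected.extend(cl_sorted[:per_cluster])
    return selected
```

In practice the points would be instruction embeddings, and per-cluster quality scoring can be layered on top of this diversity constraint.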

Self-Rewarding Language Models

less than 1 minute read

๋ฐ˜๋ณต์ ์ธ DPO ํ›ˆ๋ จ์œผ๋กœ ์‚ฌ๋žŒ์ด ์„ค๊ณ„ํ•œ reward model์ด ์•„๋‹Œ,ย LLM-as-a-Judgeย mechanism์„ ์‚ฌ์šฉ, LM์ด ์ž์œจ์ ์œผ๋กœ instruction following & reward modeling > refine ๋ฐ˜๋ณต.

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

less than 1 minute read

LM์ด Self-Talk๋ฅผ ํ†ตํ•ด training ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑ>์ •์ œ>SFT์— ํ™œ์šฉ (bootstrapping). ์ด ๊ณผ์ •์—์„œ ๋ณ‘๋ชฉ์„ ํ•ด์†Œํ•˜๊ธฐ ์œ„ํ•ด ๋Œ€ํ™”์„ฑ๊ณต ์—ฌ๋ถ€๋ฅผ ์ธก์ •ํ•˜๋Š” automatic metric ์ œ์•ˆ

self-learning 1 post
sft 1 post

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

1 minute read

Rather than selecting only good data at the instance level, Diversity-Centric Data Selection using k-means clustering meaningfully improves the efficiency and performance of LLM finetuning.

synthetic-data 1 post

Scaling Laws of Synthetic Data for Language Models

2 minute read

Synthetic data generated with SYNTHLLM scales predictably and effectively for LLM finetuning, and the authors argue that, following their revised scaling law, it offers a scalable answer to the shortage of natural data.

tableqa 1 post

Knowing When to Ask - Bridging Large Language Models and Data

1 minute read

Introduces DataGemma, which uses Data Commons (a knowledge graph) to improve the factuality and reliability of LLM responses, closing the gap between LLMs and real data.

test-time-scaling 1 post

TTRL: Test-Time Reinforcement Learning

1 minute read

test ๋ฐ์ดํ„ฐ๋งŒ์œผ๋กœ majority-voting์œผ๋กœ reward ์ถ”์ •, ์ด๋ฅผ ํ†ตํ•ด RL ์‹œ๋„ํ•˜๋Š” ์ œ์•ˆ TTRL์ดย reasoning ์„ฑ๋Šฅ์„ x2~x3๊นŒ์ง€ ๋Œ์–ด์˜ฌ๋ฆด ์ˆ˜ ์žˆ๋‹ค

time-sensitive 1 post

Real-time Fake News from Adversarial Feedback

1 minute read

LLM์˜ fake news๋ฅผ ๋” ์ž˜ ์ƒ์„ฑํ•˜๊ฒŒ ํ•˜๋Š” ๋ฐฉ๋ฒ•. ํ•™์Šต ์ดํ›„ ๋ฐœ์ƒ๋˜๋Š” ์‚ฌ๊ฑด์˜ fake news ํƒ์ง€๋ฅผ ์œ„ํ•ด, adversarial iterative fake news ์ƒ์„ฑ ํŒŒ์ดํ”„๋ผ์ธ ์ œ์•ˆ

transformers 4 posts

Differential Transformer

1 minute read

Proposes a transformer variant that splits Q/K into two groups and computes the difference between the two softmax attention maps, amplifying attention on relevant context while cancelling noise; improves hallucination.
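Per query position the core computation is softmax(Q1·K1ᵀ/√d) − λ·softmax(Q2·K2ᵀ/√d) with a learned λ. A simplified single-row sketch (λ's re-parameterization and the paper's normalization details are omitted):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def diff_attention_row(q1, q2, k1s, k2s, lam):
    """One query position: the difference of two softmax attention maps,
    computed from the two Q/K groups and scaled by a learned lambda."""
    d = len(q1)
    s1 = softmax([sum(a * b for a, b in zip(q1, k)) / math.sqrt(d) for k in k1s])
    s2 = softmax([sum(a * b for a, b in zip(q2, k)) / math.sqrt(d) for k in k2s])
    return [a - lam * b for a, b in zip(s1, s2)]
```

Since both maps sum to 1, the subtraction cancels attention mass that both groups place on irrelevant (noise) positions while preserving where they disagree.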

Selective Attention Improves Transformer

1 minute read

attention ์—ฐ์‚ฐ์—์„œ ํŒŒ๋ผ๋ฏธํ„ฐ ๋ณ€๊ฒฝ ์—†์ด, ์ƒ์„ฑ๋œ token์ด ๋‹ค๋ฅธ token์ด ๋”์ด์ƒ ํ•„์š” ์—†๋‹ค๊ณ  ๊ฒฐ์ •ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ฒ˜๋ฆฌ, ๋ฏธ๋ž˜ ์‹œ์ ์—์„œ๋Š” ํ•ด๋‹น token์ด ๋ถˆํ•„์š”ํ•˜๋‹ค๊ณ  ํŒ๋‹จํ–ˆ๋˜ token๋“ค์— ๋Œ€ํ•œ attention์„ ์ค„์ด๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ ํšจ๊ณผ์ ์œผ๋กœ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰๊ณผ ๊ณ„์‚ฐ ๋น„์šฉ์„ ...

Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation

less than 1 minute read

Proposes DualLoRA, which mitigates the problem in multi-layer transformer models where the prompt is increasingly forgotten toward later layers.

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

1 minute read

Claude3-sonet์˜ ์ค‘๊ฐ„ layer์—์„œ ๋‚˜์˜จ Residual stream๋กœ Sparse Auto-encoder (SAE) ํ•™์Šต, SAE์™€ ๊ทธ feature vector ํ™œ์šฉํ•˜์—ฌ ํ•ด์„ ๊ฐ€๋Šฅํ•œ ์ˆ˜์ค€์˜ ํŠน์„ฑ ํ™•์ธ๊ฐ€๋Šฅ.

translate 1 post
unlearning 1 post
weight-merging 2 posts

Configurable Foundation Models: Building LLMs from a Modular Perspective

2 minute read

LLM์„ ์ธ๊ฐ„์˜ ๋‡Œ์™€ ๊ฐ™์ด ๊ธฐ๋Šฅ์  ๋ชจ๋“ˆ๋กœ ์ ‘๊ทผํ•˜์ž๋Š” ๊ด€์  ์ œ์•ˆ (brick ๋‹จ์œ„๋กœ ๋ถ„ํ•ด)๊ณผ ๊ฒฝํ—˜์  ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ณด๊ณ 

Knowledge Fusion of Large Language Models

1 minute read

A method for merging multiple existing LLMs (source LLMs) with different architectures, trained in different ways, into a stronger model (pic1): it externalizes the knowledge of the source LLMs and transfers their capabilities into a new LLM (target LLM) ...