Posts 2026
Jun 2026 2 posts
May 2026 4 posts
Advancing and Benchmarking Personalized Tool Invocation for LLMs
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?
Synthetic Users, Real Differences: an Evaluation Framework for User Simulation in Multi-Turn Conversations
Latent Agents: A Post-Training Procedure for Internalized Multi-Agent Debate
Apr 2026 5 posts
Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents
ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Is the Modality Gap a Bug or a Feature? A Robustness Perspective
Mar 2026 4 posts
TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models
Honeybee: Locality-enhanced Projector for Multimodal LLM
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory
MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks