RouteLLM: Learning to Route LLMs with Preference Data

July 10, 2024

Meta info.

Authors: Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, Ion Stoica
Paper: https://arxiv.org/pdf/2406.18665
Published: June 26, 2024
Code: https://github.com/lm-sys/RouteLLM
References: http://lmsys.org/blog/2024-07-01-routellm/

TL;DR

Proposes an LLM routing method for reducing cost.

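(1) Data from the Chatbot Arena platform
(2) To augment (1), preference data between the strong model and the weak model is built by checking each model's answer against the labeled answers of gold data
Binary-class routing between a strong model (GPT-4) and a weak model (Mixtral-8x7B)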
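
The gold-label augmentation in (2) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names, the exact-match comparison, and the sample records are all assumptions made for the example.

```python
def normalize(answer: str) -> str:
    # Hypothetical normalization: ignore case and surrounding whitespace.
    return answer.strip().lower()


def preference_from_gold(strong_answer: str, weak_answer: str, gold_answer: str) -> str:
    """Derive a preference label by comparing both answers to the gold label.

    Assumption: correctness is decided by exact match after normalization.
    """
    strong_ok = normalize(strong_answer) == normalize(gold_answer)
    weak_ok = normalize(weak_answer) == normalize(gold_answer)
    if strong_ok and not weak_ok:
        return "strong"
    if weak_ok and not strong_ok:
        return "weak"
    # Both right or both wrong: neither model is preferred.
    return "tie"


# Made-up records, only to show the shape of the data.
examples = [
    {"strong": "B", "weak": "C", "gold": "B"},
    {"strong": "A", "weak": "D", "gold": "D"},
    {"strong": "C", "weak": "c ", "gold": "C"},
]
labels = [preference_from_gold(e["strong"], e["weak"], e["gold"]) for e in examples]
```

Each resulting label can then be paired with its query and added to the preference data from (1).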
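- win prediction model: trained on (1) and (2); predicts which class (strong or weak) is preferred through pairwise comparison
  - backbone: text-embedding-3-small
    1. matrix factorization router: represents each model in a low-dimensional space and learns a score function between model and query
    2. similarity-weighted ranking router: uses a Bradley-Terry model; computes the similarity between the incoming query and the queries in the training dataset, and weights the importance of each training example (past preference) accordingly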
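
The two routers above can be sketched in plain Python. Several things here are assumptions for illustration rather than details taken from this note: the bilinear form of the score function, the sigmoid over the score difference, the weight form `gamma ** (1 + similarity)`, and all function and parameter names. The similarity-weighted router is reduced to the two-model case, where the weighted Bradley-Terry maximum-likelihood estimate has a closed form (the weighted fraction of past strong-model wins).

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mf_score(model_vec, query_emb, proj, out_w):
    """Score a (model, query) pair.

    proj maps the query embedding into the model's low-dimensional space
    (one row per latent dimension); out_w weights each latent dimension.
    """
    projected = [sum(p * q for p, q in zip(row, query_emb)) for row in proj]
    return sum(w * m * z for w, m, z in zip(out_w, model_vec, projected))


def mf_win_probability(strong_vec, weak_vec, query_emb, proj, out_w):
    # Assumed Bradley-Terry style link: sigmoid of the score difference.
    diff = mf_score(strong_vec, query_emb, proj, out_w) - mf_score(
        weak_vec, query_emb, proj, out_w
    )
    return sigmoid(diff)


def cosine(a, b):
    # Assumes non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sw_win_probability(query_emb, train_embs, strong_won, gamma=10.0):
    """Similarity-weighted estimate of P(strong model wins | query).

    Past preferences on similar queries get larger weights; with two models,
    the weighted Bradley-Terry estimate equals the weighted win fraction.
    """
    weights = [gamma ** (1.0 + cosine(query_emb, e)) for e in train_embs]
    won_weight = sum(w for w, won in zip(weights, strong_won) if won)
    return won_weight / sum(weights)


# Toy data: the strong model won on a query near [1, 0] and lost near [0, 1].
train_embs = [[1.0, 0.0], [0.0, 1.0]]
strong_won = [True, False]
p_near_win = sw_win_probability([1.0, 0.0], train_embs, strong_won)
p_near_loss = sw_win_probability([0.0, 1.0], train_embs, strong_won)
```

In practice the query embeddings would come from the backbone (text-embedding-3-small), and the model vectors, projection, and output weights of the matrix factorization router would be learned from the preference data.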
Setting a cost threshold in [0, 1] adjusts the trade-off between quality and cost.
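
A minimal sketch of how such a threshold could drive routing, assuming the router outputs the probability that the strong model wins and that the query goes to the strong model when this probability reaches the threshold. The function names and sample probabilities are hypothetical.

```python
def route(strong_win_prob: float, threshold: float) -> str:
    """Pick a model for one query.

    Assumed rule: use the strong model only when its predicted win
    probability is at least the threshold.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    return "gpt-4" if strong_win_prob >= threshold else "mixtral-8x7b"


def strong_call_fraction(win_probs, threshold: float) -> float:
    # Share of queries sent to the strong model: a rough proxy for cost.
    routed = [route(p, threshold) for p in win_probs]
    return routed.count("gpt-4") / len(routed)


# Made-up win probabilities for four queries.
win_probs = [0.1, 0.4, 0.6, 0.9]
fractions = {t: strong_call_fraction(win_probs, t) for t in (0.0, 0.5, 1.0)}
```

Under this rule, raising the threshold sends fewer queries to the strong model, which lowers cost at some risk to quality; a threshold of 0 sends everything to the strong model.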