SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Meta info.
- Authors: Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman
- Paper: https://arxiv.org/pdf/2401.15024.pdf
- Affiliation: Microsoft Research
- Code: https://github.com/microsoft/TransformerCompression
- Conference: ICLR 2024
TL;DR
Proposes a new post-training sparsification method that slices each weight matrix into a smaller, dense matrix. Up to 25% of the parameters (including embeddings) can be removed while keeping the performance drop within 1% to 10%.
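To make the idea of deleting rows and columns concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a toy setup of two consecutive linear maps with no normalization, nonlinearity, or residual connection between them, and it assumes the hidden dimension is first rotated into a PCA basis of the hidden signal before slicing. The function `slice_pair` and all variable names are hypothetical.

```python
import numpy as np

def slice_pair(W_a, W_b, X0, k):
    """Toy slicing of the hidden dimension shared by W_a and W_b.

    Rotates the hidden dimension into the PCA basis of H = X0 @ W_a,
    then keeps only the top-k directions (assumption: illustrative only).
    """
    H = X0 @ W_a                                # hidden signal, shape (n, d)
    _, eigvecs = np.linalg.eigh(H.T @ H)        # eigenvalues come back ascending
    Q = eigvecs[:, ::-1]                        # orthogonal basis, largest variance first
    Q_k = Q[:, :k]                              # keep the top-k directions
    # Deleting columns of (W_a @ Q) and rows of (Q.T @ W_b):
    return W_a @ Q_k, Q_k.T @ W_b

rng = np.random.default_rng(0)
n, d0, d, h = 256, 16, 16, 8
X0 = rng.standard_normal((n, d0))               # calibration inputs (random toy data)
W_a = rng.standard_normal((d0, d))
W_b = rng.standard_normal((d, h))
Y = X0 @ W_a @ W_b                              # original output

# No slicing (k = d): the rotation alone leaves the output unchanged.
W_a_full, W_b_full = slice_pair(W_a, W_b, X0, k=d)
Y_full = X0 @ W_a_full @ W_b_full

# Slice 25% of the hidden dimension (k = 12 of 16).
k = 12
W_a_s, W_b_s = slice_pair(W_a, W_b, X0, k=k)
Y_s = X0 @ W_a_s @ W_b_s

rel_err = np.linalg.norm(Y - Y_s) / np.linalg.norm(Y)
kept = (W_a_s.size + W_b_s.size) / (W_a.size + W_b.size)
print(f"kept parameter fraction: {kept:.2f}, relative output error: {rel_err:.3f}")
```

The sliced matrices stay dense and simply have fewer rows or columns, so they run with ordinary matrix multiplication and need no sparse kernels. The error printed here comes from random toy data and says nothing about the accuracy figures reported for real models.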




