
You need all the weights for every token, so even with optimal splitting, the fraction of the weights you can farm out to an SSD is proportional to how fast your SSD is compared to your RAM.

You'd need to be in a weirdly compute-limited situation before you can replace significant amounts of RAM with SSD, unless I'm missing something big.
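To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a dense model whose weights are all read once per token, that RAM and SSD can be read in parallel, and that decoding is purely bandwidth-bound. The function names and the bandwidth figures (100 GB/s RAM, 7 GB/s SSD, 70 GB of weights) are made up for illustration, not measurements.

```python
def optimal_ssd_fraction(ram_gbps: float, ssd_gbps: float) -> float:
    """Fraction of the weights to keep on SSD so that the RAM read and the
    SSD read finish at the same time (assumes fully parallel reads)."""
    return ssd_gbps / (ram_gbps + ssd_gbps)


def seconds_per_token(model_gb: float, ram_gbps: float, ssd_gbps: float,
                      ssd_fraction: float) -> float:
    """Time to stream every weight once, with ssd_fraction of them on SSD.
    The slower of the two parallel reads sets the pace."""
    ram_time = model_gb * (1.0 - ssd_fraction) / ram_gbps
    ssd_time = model_gb * ssd_fraction / ssd_gbps
    return max(ram_time, ssd_time)


if __name__ == "__main__":
    # Illustrative, assumed numbers only.
    model_gb, ram_gbps, ssd_gbps = 70.0, 100.0, 7.0
    best = optimal_ssd_fraction(ram_gbps, ssd_gbps)
    print(f"SSD share at the break-even split: {best:.1%}")
    for share in (0.0, best, 0.25, 0.5):
        t = seconds_per_token(model_gb, ram_gbps, ssd_gbps, share)
        print(f"{share:6.1%} on SSD -> {t:.2f} s/token")
```

Under those assumptions the break-even share is ssd / (ram + ssd), which is roughly ssd / ram when the SSD is much slower. Push more than that onto the SSD and the SSD read becomes the bottleneck.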

> MoE architecture should help quite a bit here.

In that you're actually using a smaller model and swapping between them less frequently, sure.


