For most companies, the honest answer is: nobody knows.
Researchers at Tsinghua University and Z.ai built IndexCache to eliminate redundant computation in sparse attention models ...