this says it matches the performance of models with 25x more parameters by using retrieval: [1]

if we're stuck copying work we bump into, then it makes sense to figure out what has cited that paper (or bump into something else)

References
1. https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens