Meta releases quantized Llama models, Hugging Face Evaluation Guidebook, a paper on Speculative Streaming: Fast LLM Inference Without Auxiliary Models, and many more!
Deep Learning Weekly: Issue 377