Ollama Now Runs Faster on Macs
Ollama has released version 0.19, which brings major performance improvements on Apple silicon Macs by adopting Apple's MLX machine learning framework. The update boosts prefill speed by around 1.6 times and nearly doubles response generation speed, with M5 Macs seeing the biggest gains thanks to their new GPU Neural Accelerators. Improved memory management also keeps long coding sessions and AI assistant workloads responsive. The preview currently supports only Alibaba's Qwen3.5, but broader model support is planned.
Read the full story on MacRumors →