2025-07-18 09:08:32 UTC

I've been happily using Ollama (https://ollama.com) to run open-source LLMs locally.

But I haven't found anything similar for voice models that runs on arm64. Mistral recommends vLLM for running its Voxtral voice models, but it seems to be x86+CUDA only.

Any recommendations?
