Jeff Triplett on Nostr
2025-01-27 14:13:08 UTC

If you are running a Mac this is a little easier, btw, because of unified memory. Apple Silicon Macs share RAM between the CPU and GPU, which makes running small and medium-sized models with Ollama both possible and performant.
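For the curious, here's a minimal sketch of talking to a locally running Ollama server over its REST API. It assumes Ollama is serving on the default port (11434) and that you've already pulled a model; the llama3.2 name below is just an example:

```python
# Minimal sketch: one non-streaming request to a local Ollama server.
# Assumes Ollama is running on the default port and the model is pulled.
import json
import urllib.request

def generate(model: str, prompt: str) -> str:
    """Send a single /api/generate request and return the response text."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(generate("llama3.2", "Why does unified memory help local LLMs?"))
```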

These models are GPU- and RAM-hungry: between the size of the model weights and the size of the context window, memory fills up fast.
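As a rough back-of-envelope sketch, the footprint is roughly the quantized weights plus a KV cache that grows linearly with context length. The architecture numbers below are my assumptions for an 8B Llama-style model, not anything exact:

```python
# Rough memory estimate: quantized weights + KV cache.
# Assumed numbers for an 8B Llama-style model (32 layers, 8 KV heads,
# head dim 128, fp16 cache) -- illustrative, not authoritative.

GIB = 1024**3

# Weights: ~8B parameters at ~4.5 bits/param for a Q4-style quantization.
params = 8e9
weight_bytes = params * 4.5 / 8

# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes.
n_layers, n_kv_heads, head_dim, dtype_bytes = 32, 8, 128, 2
kv_per_token = 2 * n_layers * n_kv_heads * head_dim * dtype_bytes

for context in (2048, 8192, 32768):
    total = weight_bytes + kv_per_token * context
    print(f"context {context:>6}: ~{total / GIB:.1f} GiB")
```

Even a 4-bit 8B model climbs from roughly 4 GiB to 8+ GiB of RAM as the context window grows, which is why the shared CPU/GPU memory pool matters so much.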

The best video I have seen for getting this to fit in my head is: https://www.youtube.com/watch?v=QfFRNF5AhME