If you are running a Mac, this is a little easier because of Apple's unified memory architecture: the CPU and GPU share the same pool of RAM, which makes running small and medium-sized models with Ollama both possible and reasonably performant.
These models are hungry for GPU memory and RAM: the total footprint comes from both the model weights and the context size (the KV cache grows with every token in the context window).
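To make that concrete, here is a rough back-of-envelope sketch of the memory math. All the numbers below are assumptions for an 8B-parameter, Llama-style model at 4-bit quantization (32 layers, grouped-query attention with 8 KV heads, head dimension 128, fp16 KV cache); plug in your own model's config to get a feel for what fits in your RAM.

```python
def estimate_memory_gb(
    n_params: float = 8e9,         # total parameters (assumed 8B model)
    bytes_per_param: float = 0.5,  # 4-bit quantization ~ 0.5 bytes/param
    n_layers: int = 32,            # assumed transformer depth
    n_kv_heads: int = 8,           # assumed grouped-query attention
    head_dim: int = 128,           # assumed per-head dimension
    context_len: int = 8192,       # tokens in the context window
    kv_bytes: int = 2,             # fp16 KV cache entries
) -> float:
    """Rough estimate of weights + KV cache memory, in GB."""
    weights = n_params * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, per token, per KV head
    kv_cache = 2 * n_layers * context_len * n_kv_heads * head_dim * kv_bytes
    return (weights + kv_cache) / 1e9

print(f"~{estimate_memory_gb():.1f} GB")  # ~5.1 GB for this config
```

The takeaway: the weights are a fixed cost, but the KV cache scales linearly with context length, which is why cranking up the context window can blow past your RAM even when the model itself fits comfortably.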
The best video I have seen for getting this to fit in my head is: https://www.youtube.com/watch?v=QfFRNF5AhME