ABH3PO on Nostr:
Qwen3.5-a3b is absolutely blowing my mind.
It's an MoE (mixture-of-experts) model: it holds 35B parameters' worth of "knowledge" in total, but activates only around 3B parameters per token when answering a request. That makes inference faster and lets it run on less VRAM than a comparably capable dense model would need.
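To make the mechanism concrete, here is a toy mixture-of-experts routing sketch in Python/NumPy. It is purely illustrative: the layer sizes, router, and top-k choice are assumptions for demonstration, not Qwen's actual architecture.

```python
# Toy MoE layer: all experts live in memory, but only top_k run per token,
# which is why "active" parameters << total parameters.
# All sizes are made-up toy values, not Qwen's real configuration.
import numpy as np

rng = np.random.default_rng(0)

d_model = 64    # hidden size (toy value)
n_experts = 8   # total experts: all must be loaded
top_k = 2       # experts actually executed per token

experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top_k experts."""
    logits = x @ router                         # router score per expert
    top = np.argsort(logits)[-top_k:]           # indices of the top_k experts
    w = np.exp(logits[top] - logits[top].max()) # softmax over chosen experts
    w /= w.sum()
    # Only top_k expert matmuls execute; the other experts stay idle.
    return sum(wi * (x @ experts[i]) for i, wi in zip(top, w))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (64,)
```

Per token, this layer does 2 of 8 possible expert matmuls, i.e. a quarter of the compute of a dense layer with the same total parameter count.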
For me, that means it runs reliably on a 16 GB VRAM graphics card, it's very smart, it handles agentic tasks well, and it can even code reliably.
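As a rough sanity check on the 16 GB claim, here is a back-of-envelope weight-memory estimate. The bits-per-weight figure is an assumption (an aggressive GGUF-style quantization), not a measured number for this model.

```python
# Back-of-envelope VRAM estimate for a quantized 35B-parameter MoE model.
# Figures are illustrative assumptions, not measured numbers for Qwen3.5-a3b.
total_params = 35e9
bits_per_weight = 3.5  # assumed aggressive quantization level
weights_gb = total_params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weights_gb:.1f} GB")  # ~15.3 GB
# KV cache and activations come on top, so a 16 GB card is a tight fit
# unless some layers or the cache are offloaded to system RAM.
```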
Local models are essential to save the world!
Published at 2026-03-07 11:28:19 UTC

Event JSON
{
  "id": "f748db35d2aac3c54eccb4374044aa10e77cbb8ce3a44ffdaa1aabd7cbc3eb32",
  "pubkey": "c21b1a6cdb247ccbd938dcb16b15a4fa382d00ffd7b12d5cbbad172a0cd4d170",
  "created_at": 1772882899,
  "kind": 1,
  "tags": [
    [
      "alt",
      "A short note: Qwen3.5-a3b is absolutely blowing my mind. \n\nIt's ..."
    ],
    [
      "r",
      "https://qwen3.5-a3b/"
    ]
  ],
  "content": "Qwen3.5-a3b is absolutely blowing my mind. \n\nIt's an MoE model, meaning it has access to 35b parameters worth of \"knowledge\" to access. But uses only around 3b active tokens at a time to answer requests. Which means its faster for inference and can run on lower VRAM than would be required for a model that size. \n\nFor me it means that it runs reliably on a 16gb vram graphic card, it's very smart, does agentic stuff well and can even code reliably.\n\nLocal models are essential to save the world!",
  "sig": "b7c66c05efc1d58254c6f18dd39be8fa9c81f0befdc204c2e85f0990252c3dc7f16f21756734c38a03cc68f87704f3c4af02ff33dfb5b663ca55df05289b6a36"
}
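For reference, the "id" field in the event above is not arbitrary: per Nostr's NIP-01, it is the SHA-256 hash of a canonical serialization of the event. A minimal Python sketch of that computation (field order and no-whitespace separators follow the spec):

```python
# Derive a Nostr event id per NIP-01: sha256 over the UTF-8 JSON
# serialization of [0, pubkey, created_at, kind, tags, content]
# with no extra whitespace.
import hashlib
import json

def event_id(pubkey: str, created_at: int, kind: int, tags: list, content: str) -> str:
    serialized = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),  # no whitespace, per NIP-01
        ensure_ascii=False,     # raw UTF-8, per NIP-01
    )
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```

Calling this with the pubkey, created_at, kind, tags, and content fields above should reproduce the event's id, assuming the content string here survived extraction byte-for-byte.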