ABH3PO on Nostr:
Qwen3.5-a3b is absolutely blowing my mind.
It's an MoE (mixture-of-experts) model: it holds 35B parameters' worth of "knowledge" in total, but activates only around 3B parameters per token when answering a request. That makes inference faster and lets it run on less VRAM than a comparably capable dense model would need.
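To make the mechanism concrete, here is a toy mixture-of-experts routing sketch in Python/NumPy. It is purely illustrative: the layer sizes, router, and top-k choice are assumptions for demonstration, not Qwen's actual architecture.

```python
# Toy MoE layer: all experts live in memory, but only top_k run per token,
# which is why "active" parameters << total parameters.
# All sizes are made-up toy values, not Qwen's real configuration.
import numpy as np

rng = np.random.default_rng(0)

d_model = 64    # hidden size (toy value)
n_experts = 8   # total experts: all must be loaded
top_k = 2       # experts actually executed per token

experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top_k experts."""
    logits = x @ router                         # router score per expert
    top = np.argsort(logits)[-top_k:]           # indices of the top_k experts
    w = np.exp(logits[top] - logits[top].max()) # softmax over chosen experts
    w /= w.sum()
    # Only top_k expert matmuls execute; the other experts stay idle.
    return sum(wi * (x @ experts[i]) for i, wi in zip(top, w))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (64,)
```

Per token, this layer does 2 of 8 possible expert matmuls, i.e. a quarter of the compute of a dense layer with the same total parameter count.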
For me, that means it runs reliably on a 16 GB VRAM graphics card, it's very smart, it handles agentic tasks well, and it can even code reliably.
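As a rough sanity check on the 16 GB claim, here is a back-of-envelope weight-memory estimate. The bits-per-weight figure is an assumption (an aggressive GGUF-style quantization), not a measured number for this model.

```python
# Back-of-envelope VRAM estimate for a quantized 35B-parameter MoE model.
# Figures are illustrative assumptions, not measured numbers for Qwen3.5-a3b.
total_params = 35e9
bits_per_weight = 3.5  # assumed aggressive quantization level
weights_gb = total_params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weights_gb:.1f} GB")  # ~15.3 GB
# KV cache and activations come on top, so a 16 GB card is a tight fit
# unless some layers or the cache are offloaded to system RAM.
```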
Local models are essential to save the world!
Published at 2026-03-07 11:28:19 UTC

Event JSON
{
  "id": "f748db35d2aac3c54eccb4374044aa10e77cbb8ce3a44ffdaa1aabd7cbc3eb32",
  "pubkey": "c21b1a6cdb247ccbd938dcb16b15a4fa382d00ffd7b12d5cbbad172a0cd4d170",
  "created_at": 1772882899,
  "kind": 1,
  "tags": [
    [
      "alt",
      "A short note: Qwen3.5-a3b is absolutely blowing my mind. \n\nIt's ..."
    ],
    [
      "r",
      "https://qwen3.5-a3b/"
    ]
  ],
  "content": "Qwen3.5-a3b is absolutely blowing my mind. \n\nIt's an MoE model, meaning it has access to 35b parameters worth of \"knowledge\" to access. But uses only around 3b active tokens at a time to answer requests. Which means its faster for inference and can run on lower VRAM than would be required for a model that size. \n\nFor me it means that it runs reliably on a 16gb vram graphic card, it's very smart, does agentic stuff well and can even code reliably.\n\nLocal models are essential to save the world!",
  "sig": "b7c66c05efc1d58254c6f18dd39be8fa9c81f0befdc204c2e85f0990252c3dc7f16f21756734c38a03cc68f87704f3c4af02ff33dfb5b663ca55df05289b6a36"
}
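For reference, the "id" field in the event above is not arbitrary: per Nostr's NIP-01, it is the SHA-256 hash of a canonical serialization of the event. A minimal Python sketch of that computation (field order and no-whitespace separators follow the spec):

```python
# Derive a Nostr event id per NIP-01: sha256 over the UTF-8 JSON
# serialization of [0, pubkey, created_at, kind, tags, content]
# with no extra whitespace.
import hashlib
import json

def event_id(pubkey: str, created_at: int, kind: int, tags: list, content: str) -> str:
    serialized = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),  # no whitespace, per NIP-01
        ensure_ascii=False,     # raw UTF-8, per NIP-01
    )
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```

Calling this with the pubkey, created_at, kind, tags, and content fields above should reproduce the event's id, assuming the content string here survived extraction byte-for-byte.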