npub1wg…6wzsz

2025-01-31 13:54:53 UTC

I've been reading up on the Lottery Ticket Hypothesis, which is super interesting.

Basically, the observation is that these days we build *vast* neural networks with billions of parameters, but most of the parameters aren't needed. That is, after training, you can often throw away something like 95% of the weights (pruning), and the network will still work about as well.
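To make "pruning" concrete, here's a rough sketch of magnitude pruning: keep the weights with the largest absolute values and zero out the rest. The `magnitude_prune` helper and the random "trained" weights are made up for illustration; real pruning code works on a framework's weight tensors, usually layer by layer.

```python
import random

def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights.

    Returns (pruned_weights, mask), where mask[i] is 1 if weight i was kept.
    """
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Indices sorted by descending absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:n_keep])
    mask = [1 if i in kept else 0 for i in range(len(weights))]
    pruned = [w * m for w, m in zip(weights, mask)]
    return pruned, mask

rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in for trained weights
pruned, mask = magnitude_prune(weights, keep_fraction=0.05)
print(sum(mask), "of", len(weights), "weights kept")
```

The mask is returned separately because it's the piece you'd reuse later: it records *which* connections survived, independent of their values.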

The LTH paper is asking: could we start with a network just 5% of the size, and get comparable results? If so, that would be a *huge* performance win for Deep Learning.

What's interesting is that you *can* do this, but only by first training the full network (perhaps several times) to see which weights are needed. The authors argue that training a neural network isn't so much *creating* a model, as finding a lucky sub-network (a lottery ticket) within the randomly initialized network, a bit like a sculptor "finding" the bust hidden in a block of marble.
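The "train, prune, start over" loop looks roughly like this: train, prune the smallest surviving weights, reset the survivors to their *original* random values, and repeat. This is a toy sketch, not the paper's code. I've swapped the neural network for a tiny linear model so it runs in plain Python, and `train`, `find_ticket`, and every hyperparameter here are made up for illustration.

```python
import random

def train(weights, mask, data, lr=0.05, epochs=200):
    """Fit y = w . x by gradient descent; masked-out weights stay at zero."""
    w = [wi * m for wi, m in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * m for wi, xi, m in zip(w, x, mask)]
    return w

def find_ticket(init, data, rounds=3, prune_fraction=0.5):
    """Iterative magnitude pruning, rewinding to the initial weights each round."""
    mask = [1] * len(init)
    for _ in range(rounds):
        trained = train(init, mask, data)          # always restart from init
        alive = [i for i, m in enumerate(mask) if m]
        alive.sort(key=lambda i: abs(trained[i]))  # smallest magnitude first
        for i in alive[: int(len(alive) * prune_fraction)]:
            mask[i] = 0
    # The "ticket": surviving weights, still at their initial random values.
    ticket = [wi * m for wi, m in zip(init, mask)]
    return ticket, mask

rng = random.Random(0)
true_w = [2.0, -3.0] + [0.0] * 14  # toy assumption: only 2 of 16 inputs matter
data = []
for _ in range(50):
    x = [rng.uniform(-1, 1) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

init = [rng.gauss(0, 0.1) for _ in true_w]
ticket, mask = find_ticket(init, data)
final = train(ticket, mask, data)  # train only the small sub-network
print(sum(mask), "of", len(mask), "weights survive")
```

Note that the expensive part is still there: every round trains before it prunes, which is why finding the ticket costs at least as much as training the full model.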

Initial LTH paper: http://arxiv.org/abs/1803.03635
Follow-up with major clarifications: http://arxiv.org/abs/1905.01067

#science #ai #machinelearning

Author Public Key

npub1wg38fugtvv94x4n5fzlqywcw48wcwyl46md6a98nd5v58q3k8xgqg6wzsz

Seen on

wss://relay.momostr.pink

Kind type

1 Short Text Note
