John Carlos Baez

npub1nf…3nqe4

2025-01-25 06:31:31 UTC

in reply to nevent1q…n0l8

Wikipedia will eventually be a good jumping-off point for more news. Some quotes:

"DeepSeek (Chinese: 深度求索; pinyin: Shēndù Qiúsuǒ) is a Chinese artificial intelligence company that develops open-source large language models. The company is funded solely by Chinese hedge fund High-Flyer. Both DeepSeek and High-Flyer are based in Hangzhou, Zhejiang."

"In December 2024, DeepSeek-V3 was launched. It came with 671 billion parameters and trained in around 55 days at a cost of US$5.58 million, using significantly less resources compared to its peers. It was trained on a dataset of 14.8 trillion tokens. Benchmark tests showed it outperformed Llama 3.1 and Qwen 2.5 while matching GPT-4o and Claude 3.5 Sonnet. DeepSeek's optimization on limited resources highlighted potential limits of US sanctions on China's AI development. An opinion piece by The Hill described the release as American AI reaching its Sputnik moment."

"On January 20, 2025, the DeepSeek-R1 and DeepSeek-R1-Zero were released. They were based on V3-Base. Like V3, each is a MoE with 671B total parameters and 37B activated parameters. They also released some "DeepSeek-R1-Distill" models, which are not based on R1. Instead, they are similar to other open-weight models like LLaMA and Qwen, fine-tuned on synthetic data generated by R1."

(3/n)

https://en.wikipedia.org/wiki/DeepSeek
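To put the quoted figures in proportion, here is a rough back-of-the-envelope sketch. It uses only the numbers in the Wikipedia quotes above. The function names are my own, and "around 55 days" is approximate, so the per-day values are ballpark estimates rather than reported results.

```python
# Figures taken from the Wikipedia quotes above; everything derived is an estimate.
TOTAL_PARAMS = 671e9      # total parameters in the MoE model
ACTIVE_PARAMS = 37e9      # parameters activated per token
TRAIN_COST_USD = 5.58e6   # quoted training cost for DeepSeek-V3
TRAIN_DAYS = 55           # "around 55 days", so approximate
TRAIN_TOKENS = 14.8e12    # size of the training dataset in tokens


def active_fraction(total: float, active: float) -> float:
    """Share of the parameters that are activated for each token."""
    return active / total


def cost_per_day(cost_usd: float, days: float) -> float:
    """Average training cost per day in US dollars."""
    return cost_usd / days


def cost_per_million_tokens(cost_usd: float, tokens: float) -> float:
    """Average training cost per million training tokens in US dollars."""
    return cost_usd / (tokens / 1e6)


if __name__ == "__main__":
    print(f"activated share: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
    print(f"cost per day: ${cost_per_day(TRAIN_COST_USD, TRAIN_DAYS):,.0f}")
    print(f"cost per million tokens: "
          f"${cost_per_million_tokens(TRAIN_COST_USD, TRAIN_TOKENS):.2f}")
```

So only about one parameter in eighteen is active for any given token, which is part of why a model this large can be comparatively cheap to run.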

Author Public Key

npub1nf4p4rh06z6n6lsvje4txk7eqs23y3hs8vd7nraq6tgwady5qvsqy3nqe4

Kind type

1 Short Text Note

Event JSON

{
  "id": "fd49c73bc7e131b660ac9515971e2a10881b87be20ba9be2b9821477c6d51edf",
  "pubkey": "9a6a1a8eefd0b53d7e0c966ab35bd904151246f03b1be98fa0d2d0eeb4940320",
  "created_at": 1737786691,
  "kind": 1,
  "tags": [
    ["e", "34928219a99e4254a6a3d22e1dcc945a38fcdee731c3f8765570a00ade89da5d", "", "reply", "9a6a1a8eefd0b53d7e0c966ab35bd904151246f03b1be98fa0d2d0eeb4940320"],
    ["p", "9a6a1a8eefd0b53d7e0c966ab35bd904151246f03b1be98fa0d2d0eeb4940320"],
    ["e", "c4034a088e87256cd0c567960d82b429f3a9346e7fac9a8453c926cce3f06a58", "", "root", "9a6a1a8eefd0b53d7e0c966ab35bd904151246f03b1be98fa0d2d0eeb4940320"],
    ["proxy", "https://mathstodon.xyz/@johncarlosbaez/113887588603490472", "web"],
    ["proxy", "https://mathstodon.xyz/users/johncarlosbaez/statuses/113887588603490472", "activitypub"],
    ["L", "pink.momostr"],
    ["l", "pink.momostr.activitypub:https://mathstodon.xyz/users/johncarlosbaez/statuses/113887588603490472", "pink.momostr"],
    ["-"]
  ],
  "content": "Wikipedia will eventually be a good jumping-off point for more news. Some quotes:\n\n\"DeepSeek (Chinese: 深度求索; pinyin: Shēndù Qiúsuǒ) is a Chinese artificial intelligence company that develops open-source large language models. The company is funded solely by Chinese hedge fund High-Flyer. Both DeepSeek and High-Flyer are based in Hangzhou, Zhejiang.\"\n\n\"In December 2024, DeepSeek-V3 was launched. It came with 671 billion parameters and trained in around 55 days at a cost of US$5.58 million, using significantly less resources compared to its peers. It was trained on a dataset of 14.8 trillion tokens. Benchmark tests showed it outperformed Llama 3.1 and Qwen 2.5 while matching GPT-4o and Claude 3.5 Sonnet. DeepSeek's optimization on limited resources highlighted potential limits of US sanctions on China's AI development. An opinion piece by The Hill described the release as American AI reaching its Sputnik moment.\"\n\n\"On January 20, 2025, the DeepSeek-R1 and DeepSeek-R1-Zero were released. They were based on V3-Base. Like V3, each is a MoE with 671B total parameters and 37B activated parameters. They also released some \"DeepSeek-R1-Distill\" models, which are not based on R1. Instead, they are similar to other open-weight models like LLaMA and Qwen, fine-tuned on synthetic data generated by R1.\"\n\n(3/n)\n\nhttps://en.wikipedia.org/wiki/DeepSeek",
  "sig": "3d1c48203a869a3d21bf12c9ee1a76b1f06b488417f5f02cf6502ee25442bdf6b79a55ed35a5ac09295955bfa13039c08f64452463ad755277eb96c0fb88a3e1"
}
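
The fields in the event JSON above can be read with a few lines of Python. This is a minimal sketch, assuming the NIP-01 rule that an event id is the SHA-256 of the compact JSON array `[0, pubkey, created_at, kind, tags, content]`, and the NIP-10 convention of marking `e` tags as `reply` or `root`. The helper names are hypothetical, and `sample_event` carries placeholder content, so its computed id will not match the `id` shown above.

```python
import hashlib
import json
from datetime import datetime, timezone


def format_created_at(created_at: int) -> str:
    """Render a Nostr created_at value (Unix seconds) as a UTC timestamp."""
    moment = datetime.fromtimestamp(created_at, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def compute_event_id(event: dict) -> str:
    """Hash the event fields in the order and compact form NIP-01 describes."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # No whitespace between tokens; non-ASCII characters are kept verbatim.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def thread_markers(event: dict) -> dict:
    """Map each marked "e" tag (e.g. "reply", "root") to the event id it names."""
    return {
        tag[3]: tag[1]
        for tag in event["tags"]
        if tag[0] == "e" and len(tag) > 3 and tag[3]
    }


# Trimmed copy of the event above; the content is a placeholder, not the real text.
sample_event = {
    "pubkey": "9a6a1a8eefd0b53d7e0c966ab35bd904151246f03b1be98fa0d2d0eeb4940320",
    "created_at": 1737786691,
    "kind": 1,
    "tags": [
        ["e", "34928219a99e4254a6a3d22e1dcc945a38fcdee731c3f8765570a00ade89da5d",
         "", "reply"],
        ["e", "c4034a088e87256cd0c567960d82b429f3a9346e7fac9a8453c926cce3f06a58",
         "", "root"],
    ],
    "content": "placeholder content",
}

if __name__ == "__main__":
    print(format_created_at(sample_event["created_at"]))
    print(thread_markers(sample_event))
    print(compute_event_id(sample_event))
```

The timestamp conversion reproduces the publication time shown near the top of the page. I have not checked that hashing the full event reproduces its `id`.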