DeepSeek AI tops Apple App Store
Open-source AI Model from China captures the world's attention
The image above is from the Apple App Store. An open-source AI research team called DeepSeek has released a competitive and popular new foundation model. Some parts of the DeepSeek story are questionable and other parts are undeniable. I go through both.
Undeniable
One undeniable part of the story is that open-source AI models are here to stay. There was a period when some people believed closed-source models backed by greater resources had an insurmountable advantage. That no longer looks true. For all the flak Mark Zuckerberg gets, Meta’s Llama is a big reason open-source AI has been able to stay relevant.
Another undeniable part of the story is that DeepSeek’s model is very cost-effective. Large language models (LLMs) are priced per token. A token is a small chunk of text, typically a few characters long.
DeepSeek R1
$0.55 per million tokens (input)
$2.19 per million tokens (output)
OpenAI o1
$15.00 per million tokens (input)
$60.00 per million tokens (output)
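To make the gap concrete, here is a minimal sketch that applies the prices listed above to a workload. The workload size (2 million input tokens and 1 million output tokens) is a made-up example, and the function name is my own; only the per-million-token prices come from the list.

```python
# Prices in USD per million tokens, taken from the list above.
PRICES = {
    "DeepSeek R1": {"input": 0.55, "output": 2.19},
    "OpenAI o1": {"input": 15.00, "output": 60.00},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the listed per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 1_000_000

r1_cost = workload_cost("DeepSeek R1", INPUT_TOKENS, OUTPUT_TOKENS)
o1_cost = workload_cost("OpenAI o1", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"DeepSeek R1: ${r1_cost:.2f}")
print(f"OpenAI o1:   ${o1_cost:.2f}")
print(f"o1 costs about {o1_cost / r1_cost:.1f}x as much for this workload")
```

For this example workload the bill comes to about $3.29 on DeepSeek R1 versus $90.00 on OpenAI o1, roughly a 27x difference. The exact ratio shifts a little with the mix of input and output tokens, since the two price ratios are close but not identical.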
DeepSeek provides a compelling product that people are using. Otherwise, it wouldn’t be number one on the App Store.
Questionable
The questionable part is whether roughly $6 million and 2,048 NVIDIA H800 chips were really all it took to train the model. It’s possible, but it seems unlikely in my opinion.
The number seems small. The DeepSeek team has likely found a way around the advanced chip export restrictions imposed by the outgoing Biden administration but can’t talk about it publicly for obvious reasons.
By the close of 2024, the Bank of China is set to provide loans amounting to 1.91 trillion yuan (around £195 billion) to more than 100,000 technology companies.
The Bank of China would not be handing out so much money to Chinese technology companies if training foundation models were that affordable. This is just me using simple deductive reasoning.
Implications
One important implication is that throwing money at AI doesn’t ensure winning on its own. The recently announced Stargate project is a $500 billion AI infrastructure project.
Even a sum that large does not ensure victory. The other big implication of the story is that AI is still a wide-open game.
It’s in America’s best interest to bring the best minds of other countries to America and keep them. Keeping talent out of the country just creates competitors outside the scope of American hegemony. It’s much wiser to draw top minds away from other countries as we enter a critical phase of AI reshaping the world.
The most amusing idea about this story, and one that is not my own, goes like this:
Liang Wenfeng financed DeepSeek as a pet project. His day job is managing a quant hedge fund. He could have opened a short position on US and Chinese tech companies before releasing DeepSeek.
The profits could then be spent on even more compute to build better models.
There have been complaints about Liang Wenfeng spending more time on DeepSeek than on managing his hedge fund. But this would be the ultimate chess move.
It's a plan that would imply amazing foresight. But then again, we are also talking about someone who is building AI models for fun.