Asilo Weighs in on DeepSeek’s Disruptive AI Model

Stockholm (HedgeNordic) – Artificial intelligence has dominated headlines in recent months, with investors and Mr. Market alike betting on a growing demand for powerful microchips, energy, and data centers to fuel AI development. However, the release of a new AI model called DeepSeek-R1 by a Chinese upstart has upended some core assumptions about AI progress. The model demonstrates that AI advancements may not require as many high-end chips as previously expected. This revelation rattled markets on Monday, sending Nvidia shares tumbling 17 percent and triggering sell-offs across other key “picks and shovels” stocks that support the development of AI.

Finnish portfolio managers Ernst Grönblom and Henri Blomster, who run a high-conviction strategy focused on identifying “future superstar” stocks, increased the AI exposure in Asilo Argo’s portfolio from about 15 percent at the start of 2024 to over 40 percent by year-end. Even so, the duo is not overly worried about these holdings after learning that Chinese startup DeepSeek has built R1, a model that could challenge some of today’s leading AI models at a much lower cost. Grönblom, known for his reliance on mental models, invokes the Jevons paradox when discussing how DeepSeek’s more efficient model might affect the market.

“In general, it’s known that improving efficiency not only reduces the amount needed for a given purpose but also lowers the relative cost of using a resource, which in turn tends to increase its overall demand,” Grönblom explains, referencing the Jevons paradox. “The increased efficiency of coal use led to greater coal consumption, so why wouldn’t the same happen with GPUs [graphics processing units]?” he adds. However, Grönblom and his co-manager Blomster are quick to point out that their current thoughts are a “quick-and-dirty” initial take on the matter.
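
For readers who want the intuition in numbers, the sketch below is a hypothetical, back-of-the-envelope illustration of the Jevons paradox applied to compute: if each GPU-hour buys more AI output, the price of that output falls, and when demand is sufficiently elastic, total GPU consumption can rise rather than fall. The 45x efficiency figure is the one quoted later in the article; the demand-elasticity values are illustrative assumptions, not forecasts.

```python
# Hypothetical, back-of-the-envelope illustration of the Jevons paradox
# applied to compute. Demand for AI output is modeled as a constant-
# elasticity curve, demand = k * price^(-elasticity). All numbers are
# illustrative assumptions, not forecasts.

def relative_gpu_demand(efficiency_gain: float, elasticity: float) -> float:
    """GPU demand after an efficiency improvement, relative to a 1.0 baseline.

    efficiency_gain: units of AI output one GPU-hour now buys versus before
                     (e.g. 45 for the 45x figure quoted in the article).
    elasticity:      assumed price elasticity of demand for AI output
                     (values above 1 mean demand grows faster than price falls).
    """
    relative_price = 1.0 / efficiency_gain             # output gets cheaper
    relative_output = relative_price ** (-elasticity)  # demand responds to price
    return relative_output / efficiency_gain           # GPUs = output / output-per-GPU


for elasticity in (0.8, 1.0, 1.5):
    demand = relative_gpu_demand(efficiency_gain=45.0, elasticity=elasticity)
    print(f"elasticity {elasticity}: relative GPU demand {demand:.2f}x")
# With elasticity above 1, total GPU demand ends up higher than before the
# 45x efficiency gain, which is the mechanism Jevons observed with coal.
```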

The Technical Breakthroughs Behind the DeepSeek Model

According to Grönblom, a key breakthrough lies in DeepSeek’s “sophisticated mixed-precision training framework, which allows the use of 8-bit floating point numbers (FP8) throughout the entire training process,” rather than the 16- or 32-bit “full precision” formats more typically used by Western AI labs. This innovation, he explains, “saves memory and boosts performance,” leading to a dramatic reduction in GPU requirements since each GPU can handle much more data. Another major advancement is R1’s ability to “predict multiple tokens simultaneously while maintaining the quality of single-token predictions,” effectively doubling inference speed without much compromise on quality.
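
DeepSeek’s training code is not reproduced in the article, so the snippet below is only a minimal PyTorch sketch of the general idea behind low-precision storage: the same tensor held in FP8 occupies a quarter of the memory of FP32 and half that of BF16, which is where the memory and bandwidth savings come from. The torch.float8_e4m3fn dtype is available in recent PyTorch releases; the tensor size and the naive round-trip check are illustrative assumptions, not DeepSeek’s actual framework.

```python
# Minimal sketch of the memory saving from low-precision storage.
# Requires a recent PyTorch (>= 2.1) for the float8_e4m3fn dtype.
# This illustrates the general idea, not DeepSeek's training framework.
import torch

x = torch.randn(4096, 4096)                # FP32 by default
x_bf16 = x.to(torch.bfloat16)              # common mixed-precision baseline
x_fp8 = x.to(torch.float8_e4m3fn)          # 8-bit floating point storage

for name, t in [("fp32", x), ("bf16", x_bf16), ("fp8", x_fp8)]:
    mib = t.element_size() * t.numel() / 2**20
    print(f"{name}: {t.element_size()} byte(s)/element, {mib:.1f} MiB")

# Real FP8 training keeps per-tensor scaling factors and accumulates in
# higher precision; a naive round trip shows why that care is needed:
err = (x - x_fp8.to(torch.float32)).abs().max().item()
print(f"max round-trip error without scaling: {err:.4f}")
```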

Grönblom also highlights DeepSeek’s novel Multi-head Latent Attention (MLA), which “stores a compressed version of Key-Value indices, capturing essential information while using far less memory.” Additionally, its Mixture-of-Experts (MoE) Transformer architecture activates “only a small subset of parameters at any given time.” This means that during inference only a fraction of the model’s weights participates in the computation for any given token, sharply reducing the compute required per token. For example, an MoE model with 671 billion parameters may have only 37 billion parameters active at any given time. Grönblom notes that “the sum total of these innovations, when layered together, has led to the 45x efficiency improvement.”
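
Since the article quotes only the headline parameter counts, the toy sketch below is a hypothetical illustration rather than DeepSeek’s implementation: a top-k router of the kind MoE layers use, showing that only a few experts are evaluated per token, followed by the fraction implied by the 671-billion and 37-billion figures. The layer sizes, expert count, and k value are invented for illustration.

```python
# Toy illustration of Mixture-of-Experts routing: only k experts
# (a small subset of all parameters) are evaluated for each token.
# All sizes here are invented for illustration, not DeepSeek's config.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)   # scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)     # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # evaluate k experts per token
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out


layer = ToyMoELayer()
tokens = torch.randn(16, 64)
print(layer(tokens).shape)                             # torch.Size([16, 64])

# Back-of-the-envelope fraction implied by the figures quoted in the article:
total_params, active_params = 671e9, 37e9
print(f"fraction of parameters used per token: {active_params / total_params:.1%}")
```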

U.S. Labs to Study DeepSeek’s R1

Regardless of whether the training of R1 was as cost-effective as claimed, Henri Blomster believes leading U.S. labs will likely study the technological breakthroughs it offers and, if deemed useful, adopt them. “This should lead to more efficient compute, both in terms of training and inference. When you apply that efficiency to the vast number of GPUs available in the U.S., while China continues to face restricted access to computing power, the implications become significant,” he argues. “We believe scaling laws apply even after R1 DeepSeek.”

Blomster emphasizes that if the ultimate goal of companies like Anthropic, OpenAI, and other major players were to build an o1 model and sell it to consumers, “then yes, we would be alarmed by the DeepSeek R1 model’s potential in destroying the value of US top labs’ investments.” However, he argues, “o1 is not the end goal.” While its feasibility is debatable, Blomster argues that “leading labs have the goal of building a digital God.” These labs have a clear roadmap and measurable progress, and are nearing their goal, as described by Anthropic’s CEO, Dario Amodei, who stated: “My view, and I’ve been saying over the last few days that I’m becoming more confident in it, [is] this idea that we may be only two or three years away from A.I. systems being better than humans at almost all tasks.”

AI as the Manhattan Project 2.0

According to Grönblom, AI has the potential to accelerate progress across all technological fields, including military technology. “This is the Manhattan Project 2.0, with the power to shift the global balance of power,” he argues. “It seems that China has taken a step forward in a race the U.S. cannot afford to lose.” In response, he believes the logical course of action would not be to slow down development but to intensify efforts even further.

“In summary, we do not believe that algorithmic improvements will reduce the demand for AI infrastructure,” concludes Grönblom. “We do not believe that leading labs will see this as an opportunity to slow down their investments; we believe they will see this as an opportunity to get to the end goal of Artificial Superintelligence at an even faster rate.”


Eugeniu Guzun
Eugeniu Guzun serves as a data analyst responsible for maintaining and gatekeeping the Nordic Hedge Index, and as a journalist covering the Nordic hedge fund industry for HedgeNordic. Eugeniu completed his Master’s degree at the Stockholm School of Economics in 2018. Write to Eugeniu Guzun at eugene@hedgenordic.com
