Data Bottleneck Looms, AI Chaos

● AI Data Center Bottleneck Shifts to Data

Will data become the “fourth bottleneck” for AI data centers? Beyond GPUs and HBM, a one-pass look at where the money is made

5 key points you need to know right now (covered in this article)

① AI infrastructure bottlenecks have moved from GPU/HBM (semiconductors) to power (including cooling) to the network (silicon photonics, fiber optic cables, on-device/edge). The article outlines how the fourth and last bottleneck shifts to “data.”

② From the token production perspective, “how fast” is the core, and the metrics to measure it are restructured into tokens produced per second, tokens produced per watt, and the cost to produce 1 million tokens.

③ It is not just a matter of “buy GPUs”: efficiency-improving technologies (utilization rate, waiting time, memory/transfer bottlenecks) create differences in performance.

④ When the bottleneck changes, the investment sectors change too. That is, semiconductors → energy efficiency/thermal management → network (fiber/edge) → data collection.

⑤ The spread of on-device/edge AI increases the number of small data centers near base stations (small H/W) and simultaneously reallocates demand for power, networks, and chipsets.

News summary: “AI data center bottlenecks are now data”

Based on remarks from an SK vice president, the bottleneck in AI infrastructure has been moving step by step, and in the future, data is likely to become the final big hurdle.

In the past, GPUs and HBM (high-bandwidth memory) acted like the first bottleneck, then power and cooling became the second bottleneck, and after that, network (copper→optical transmission/silicon photonics/edge·on-device) was expected to fully emerge as the third bottleneck.

And the key takeaway is that within 1 to 2 years, “fourth bottleneck = data” could arrive.

First bottleneck (past~present): GPU·HBM, ultimately “semiconductors”

1) Why GPU/HBM became the bottleneck first

AI computation (training/inference) runs on the GPU, but even when the GPU processes quickly, it often has to wait for data, and that feeding bottleneck tends to occur in HBM.

2) Rethinking “semiconductors” from the token production perspective

Token production ultimately boils down to how many “units the factory prints per second.” The core of factory performance here is GPU + HBM + server/module performance, and it becomes important to design the system so that the GPU does not sit idle (utilization rate).

Second bottleneck (in progress): Power (including cooling) = the wall of energy costs

1) Why the second bottleneck is seen as power

Once built, data centers incur overwhelmingly large energy costs during operations. Because electricity is required “each time you run,” once the bottleneck moves from the GPU to power, the same hardware can deliver different results depending on its efficiency per watt.

2) Key metric: tokens produced per watt

This part is especially important from an investor’s perspective, because how many tokens you can extract from 1W (tokens produced per watt) becomes the differentiating point.

3) Thermal management methods split efficiency

For example, a circulation cooling method (chipset/pipe-based cooling) may be more efficient than air cooling. When cooling efficiency improves relative to power input, the same GPU can operate faster and more stably, ultimately improving tokens produced per watt.
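As a rough illustration of how cooling feeds into tokens per watt, the sketch below divides token throughput by total facility power, where facility power is IT power multiplied by PUE (power usage effectiveness: total facility power divided by IT equipment power). Dividing tokens per second by watts is the same as tokens per joule. The function name, the PUE values, and all numbers are hypothetical assumptions for illustration, not figures from the source.

```python
def tokens_per_watt(tokens_per_second: float, it_power_watts: float, pue: float) -> float:
    """Tokens per second per watt of total facility power (i.e. tokens per joule)."""
    if it_power_watts <= 0 or pue < 1.0:
        raise ValueError("it_power_watts must be positive and pue must be >= 1.0")
    # Facility power includes cooling and other overhead on top of the IT load.
    facility_power_watts = it_power_watts * pue
    return tokens_per_second / facility_power_watts


# Hypothetical rack: same GPUs and same throughput, two cooling setups.
# The PUE values are assumed, not measured.
air_cooled = tokens_per_watt(tokens_per_second=50_000, it_power_watts=100_000, pue=1.5)
circulation_cooled = tokens_per_watt(tokens_per_second=50_000, it_power_watts=100_000, pue=1.2)

print(f"air cooling:         {air_cooled:.3f} tokens/s per W")
print(f"circulation cooling: {circulation_cooled:.3f} tokens/s per W")
```

In this simplified model only the cooling overhead changes. If better cooling also lets the GPU sustain higher clocks, throughput would rise as well, which the sketch does not capture.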

Third bottleneck (about to fully kick in): Network = optics (光) beyond copper limits + distributed processing

1) The mechanism of network becoming a bottleneck

When GPUs, HBM, and other chips exchange data within a server over electrical signals (copper transmission), space, heat, and loss problems accumulate.

2) Why silicon photonics and fiber optic cables connect to “tokens per watt”

Switching from copper to optics can reduce transmission losses and also reduce heat generation. Less energy lost in transmission means more tokens per watt, which is where silicon photonics (optical transmission infrastructure) comes in.

3) “On-device/edge” reshapes the network bottleneck

Network bottlenecks do not arise only inside data centers; they also extend outside (base stations, devices, the edge).

As on-device AI grew, there were two choices (processing within the device vs. processing in a data center). But as distributed computing, such as a small data center near a base station, enters in the middle, the processing locations form a more complex three-layer or multi-layer structure.
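The three-layer structure can be pictured as a placement decision: run the request on the nearest tier that can both hold the model and meet the latency budget. The sketch below is a minimal illustration under invented assumptions. The tier limits, the `choose_tier` function, and every number are hypothetical and do not come from the source.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    DEVICE = "device"
    EDGE = "edge"                # e.g. a small data center near a base station
    DATA_CENTER = "data_center"


# Hypothetical limits per tier:
# (largest model it can host, in billions of parameters; network round trip in ms)
TIER_LIMITS = {
    Tier.DEVICE: (3.0, 0.0),
    Tier.EDGE: (30.0, 20.0),
    Tier.DATA_CENTER: (float("inf"), 100.0),
}


def choose_tier(model_size_b: float, latency_budget_ms: float) -> Optional[Tier]:
    """Return the nearest tier that fits the model and the latency budget, or None."""
    for tier in (Tier.DEVICE, Tier.EDGE, Tier.DATA_CENTER):
        max_size_b, round_trip_ms = TIER_LIMITS[tier]
        if model_size_b <= max_size_b and round_trip_ms <= latency_budget_ms:
            return tier
    return None


print(choose_tier(model_size_b=1.0, latency_budget_ms=10.0))
print(choose_tier(model_size_b=10.0, latency_budget_ms=50.0))
print(choose_tier(model_size_b=70.0, latency_budget_ms=200.0))
```

A real scheduler would also weigh power, cost, and current load on each tier. The point here is only that adding the edge tier creates a third option between the device and the data center.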

4) In this flow, the cost of “operations” becomes important

When the network is the bottleneck, latency increases, and when latency increases, the waiting time for GPU/HBM also increases, which can reduce token production speed.
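The chain from latency to waiting time to token speed can be approximated with a simple utilization model: if each step needs some compute time plus some time waiting on the network, only the compute fraction produces tokens. The sketch below assumes that waiting and compute do not overlap, which real systems partly avoid by overlapping communication with computation. The function name and numbers are hypothetical.

```python
def effective_tokens_per_second(peak_tokens_per_second: float,
                                compute_ms: float,
                                wait_ms: float) -> float:
    """Scale peak throughput by the fraction of each step spent computing."""
    if compute_ms <= 0 or wait_ms < 0:
        raise ValueError("compute_ms must be positive and wait_ms must be non-negative")
    # Utilization: share of wall-clock time in which the GPU is doing useful work.
    utilization = compute_ms / (compute_ms + wait_ms)
    return peak_tokens_per_second * utilization


# Same hypothetical GPU, two network conditions.
fast_network = effective_tokens_per_second(10_000, compute_ms=8.0, wait_ms=2.0)
slow_network = effective_tokens_per_second(10_000, compute_ms=8.0, wait_ms=8.0)

print(f"fast network: {fast_network:.0f} tokens/s")
print(f"slow network: {slow_network:.0f} tokens/s")
```

Under these assumptions, quadrupling the wait time from 2 ms to 8 ms drops utilization from 80% to 50% even though the GPU itself is unchanged.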

Fourth bottleneck (possible within 1 to 2 years): Data

1) Why “data” becomes the last bottleneck

At first, the model architecture and computation infrastructure seemed more important, but for model quality to improve further, you ultimately need the data required for additional training and improvement.

2) How data acquisition changes

Data is not limited to what is already “out there on the internet.” Therefore, the following strategies become important:

① buying/collecting non-public data

② generating synthetic data

③ having humans produce labeled/training data

④ creating new approaches that do the measurement/collection themselves (turning everyday experiences that weren’t digitized into data)

3) The biggest opportunity: “Who can collect data best”

Especially for physical AI (AI that moves in the real world), you need a channel to capture data that has been hard to measure before.

For example, the source mentions a structure in which physical labor environments like a “hand palm (finger factory)” are captured, and the data then flows into a specific company/research ecosystem. There are also observations that when wearables like smart glasses appear, the scope for collecting data could expand to indoor and daily life.

Three indicators to turn the token economy into an “investment language”

1) Tokens produced per second (speed)

How much the factory produces in one second. Not only GPU/HBM performance but also the utilization rate (a design that prevents the GPU from idling) is key.

2) Tokens produced per watt (energy efficiency)

When operating costs (power) become a bottleneck, the investment point shifts to whether you can “produce more tokens with the same energy.”

3) Cost to produce 1 million tokens (cost structure)

This metric reflects depreciation (capex), operating expenses (opex), and utilization rate together. In other words, even for the same tokens, the cost varies depending on “how” they’re produced.
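One way to see how these three inputs combine is a back-of-the-envelope calculation: spread capex over the hardware lifetime, add annual opex, and divide by the tokens actually produced at a given utilization. The sketch below assumes straight-line depreciation and a constant utilization rate. The function name and all dollar and throughput figures are hypothetical, not taken from the source.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def cost_per_million_tokens(capex_usd: float,
                            lifetime_years: float,
                            opex_usd_per_year: float,
                            peak_tokens_per_second: float,
                            utilization: float) -> float:
    """Annual cost divided by annual token output, expressed per 1 million tokens."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if lifetime_years <= 0 or peak_tokens_per_second <= 0:
        raise ValueError("lifetime_years and peak_tokens_per_second must be positive")
    annual_depreciation = capex_usd / lifetime_years   # straight-line (assumption)
    annual_cost = annual_depreciation + opex_usd_per_year
    annual_tokens = peak_tokens_per_second * utilization * SECONDS_PER_YEAR
    return annual_cost / annual_tokens * 1_000_000


# Same hypothetical cluster at two utilization rates.
low_util = cost_per_million_tokens(4_000_000, 4, 500_000, 100_000, utilization=0.5)
high_util = cost_per_million_tokens(4_000_000, 4, 500_000, 100_000, utilization=0.8)

print(f"50% utilization: ${low_util:.4f} per 1M tokens")
print(f"80% utilization: ${high_util:.4f} per 1M tokens")
```

Because cost is fixed while output scales with utilization, the same hardware produces cheaper tokens simply by idling less, which is the “how they’re produced” point above.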

Not only GPUs—MPUs/memory/chip ecosystems grow together

1) Agent era: you must produce tokens more cheaply

If token demand increases due to agents, inference, and service expansion but costs can’t be controlled, the business can stall even if the model is better. So the trend is that competition to lower token production cost is growing, rather than reliance on expensive GPUs alone.

2) MPU competition: Intel·AMD·Qualcomm·Rebellions, etc. + big tech’s own chips

It’s mentioned that big tech companies like Google, Microsoft, and Amazon are also building or expanding their own MPUs to optimize workloads.

3) Memory ultimately centers on “Samsung Electronics·SK hynix·Micron”

While multiple companies can make MPUs, the logic is that high-performance memory has limited supply chains, so the benefits could concentrate.

4) Even in devices (on-device), small MPUs are needed

When AI runs in devices such as smartphones, PCs, some robots, and cars, those devices also need the “small compute chips (MPUs)” and memory required to produce tokens inside the device. That means demand expands not only in data centers but also into the semiconductor ecosystem within devices.

Investment perspective checklist: “Which bottleneck you solve” is the answer

1) Whether a company makes money with AI vs uses AI to run its core business well

There’s a different lens for AI providers (companies that directly attack token production/bottleneck solutions) and for AI adopters (companies trying to improve costs/productivity in their own products/services).

2) Translating the bottleneck framework into investment ideas

① Semiconductor bottleneck: GPU/HBM/server efficiency (tokens per second)

② Energy bottleneck: cooling/power efficiency/tokens produced per watt

③ Network bottleneck: silicon photonics/fiber optic transmission + edge distributed processing (lower latency, higher utilization)

④ Data bottleneck: collection channels (wearables·physical capture·synthetic/label ecosystems)

One “core reinterpretation” that doesn’t show up much elsewhere in this kind of content

Many people in AI investing focus only on “models (scaling up)” and “GPUs (computation),” but if you reframe the content of this video from the token perspective, the conclusion becomes clearer.

The money path for AI infrastructure moves from “a company that does computation well” to “a company that solves the bottleneck that makes token production the cheapest and fastest.” And the fact that the bottleneck moves in order (GPU/HBM → power → network → data) can be a criterion for setting investment priorities over the next 1 to 2 years, which is my key viewpoint.


Conclusion

AI infrastructure bottlenecks start at “GPUs and HBM,” pass through “power (cooling)” and “network (optical, distributed),” and finally data could become the fourth bottleneck.

Investment and business decisions should ultimately ask “which bottleneck to reduce, and how, in order to improve token productivity in terms of tokens per second, tokens per watt, and cost.” That way you can find the money-making segment more accurately.

< Summary >

– AI data center bottlenecks go from GPU/HBM (semiconductors) → power/cooling → network, and within 1 to 2 years, data could become the fourth bottleneck.

– Key indicators from the token production perspective are tokens per second, tokens per watt, and cost to produce 1 million tokens.

– “Efficiency improvements” like utilization rate and waiting time, and memory/transfer bottlenecks determine performance.

– The network is reshaped not only by silicon photonics/optical transmission but also by on-device and edge (small data centers near base stations).

– The opportunity in the data bottleneck is “who can collect better” (smart glasses/physical capture/synthetic/label).

– Competition expands to lower token cost, as not only GPUs but also the MPU and memory ecosystem grow together.


*Source: [ 티타임즈TV ]

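– AI 데이터센터 병목과 돈이 될 길목, 한눈에 정리 (김지현 SK 부사장) [AI data center bottlenecks and the paths to profit, summarized at a glance (Kim Ji-hyun, SK vice president)]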

NextGenInsight.Net
