● AI Data Center Bottleneck Shifts to Data
Could data become the “fourth bottleneck” for AI data centers? Looking beyond GPUs and HBM, this article sorts out where the money is likely to be made.
5 key points you need to know right now (covered in this article)
① AI infrastructure bottlenecks have moved from GPU/HBM (semiconductors) to power (including cooling) and then to the network (silicon photonics, fiber optic cables, on-device/edge). The article outlines how the fourth and final bottleneck shifts to “data.”
② From the token production perspective, speed is the core question, and it is measured with three metrics: tokens produced per second, tokens produced per watt, and the cost to produce 1 million tokens.
③ Buying GPUs is not enough. Efficiency factors (utilization rate, waiting time, memory/transfer bottlenecks) create the differences in performance.
④ When the bottleneck changes, the investment sectors change too: semiconductors → energy efficiency/thermal management → network (fiber/edge) → data collection.
⑤ The spread of on-device/edge AI increases the number of small data centers near base stations (small hardware) and at the same time reallocates demand for power, networks, and chipsets.
News summary: “AI data center bottlenecks are now data”
Based on remarks from an SK vice president, the bottleneck in AI infrastructure has been moving step by step, and in the future, data is likely to become the final big hurdle.
GPUs and HBM (high-bandwidth memory) were the first bottleneck, power and cooling became the second, and the network (copper → optical transmission, silicon photonics, edge/on-device) is expected to fully emerge as the third.
The key takeaway is that the fourth bottleneck, data, could arrive within 1 to 2 years.
First bottleneck (past to present): GPU/HBM, ultimately “semiconductors”
1) Why GPU/HBM became the bottleneck first
AI computation (training/inference) runs on the GPU, but even when the GPU processes quickly, it often ends up waiting for data to be fed from HBM, so the bottleneck frequently sits in memory.
2) Rethinking “semiconductors” from the token production perspective
Token production ultimately boils down to how many units the “factory” prints per second. The core of factory performance is the combination of GPU, HBM, and server/module performance, and it becomes important to design the system so that the GPU does not sit idle (utilization rate).
Second bottleneck (in progress): Power (including cooling) = the wall of energy costs
1) Why the second bottleneck is seen as power
Once built, a data center’s costs are dominated by energy during operation. Because electricity is consumed “each time you run,” once the bottleneck moves from the GPU to power, the same hardware can deliver different results depending on its efficiency per watt.
2) Key metric: tokens produced per watt
This part is especially important from an investor’s perspective, because how many tokens you can extract from 1 W (tokens produced per watt) becomes the differentiating point.
3) Thermal management methods split efficiency
For example, a circulation cooling method (chipset/pipe-based cooling) may be more efficient than air cooling. When cooling efficiency improves relative to power input, the same GPU can run faster and more stably, which ultimately improves tokens produced per watt.
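As a rough illustration of the point above, the sketch below counts cooling power on top of compute power when calculating tokens per watt. The function name, the overhead ratios, and the throughput and power figures are hypothetical values chosen for illustration, not numbers from the talk. It also leaves out the second effect mentioned above (a better-cooled GPU running faster), so it understates the gap if anything.

```python
def tokens_per_facility_watt(tokens_per_second: float,
                             it_power_watts: float,
                             cooling_overhead: float) -> float:
    """Tokens per second per watt, counting cooling power on top of IT power.

    cooling_overhead is the assumed ratio of cooling power to IT power
    (for example, 0.5 means 0.5 W of cooling for every 1 W of compute).
    """
    facility_watts = it_power_watts * (1.0 + cooling_overhead)
    return tokens_per_second / facility_watts


# Hypothetical server: same hardware and throughput, two cooling setups.
air_cooled = tokens_per_facility_watt(12_000, 8_000, cooling_overhead=0.5)
circulation_cooled = tokens_per_facility_watt(12_000, 8_000, cooling_overhead=0.25)

print(f"air cooling:         {air_cooled:.2f} tokens/s per watt")
print(f"circulation cooling: {circulation_cooled:.2f} tokens/s per watt")
```

With these assumed inputs, the only thing that changes between the two runs is the cooling overhead, which is enough to move the tokens-per-watt figure.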
Third bottleneck (about to fully kick in): Network = optics beyond copper limits + distributed processing
1) How the network becomes a bottleneck
When GPUs, HBM, and other chips exchange data within a server over electrical signals (copper transmission), space constraints, heat, and signal loss accumulate.
2) Why silicon photonics and fiber optic cables connect to “tokens per watt”
Switching from copper to optics can reduce transmission losses and heat generation, so less power is spent moving data and tokens per watt improves. This is the logic behind silicon photonics (optical transmission infrastructure).
3) “On-device/edge” reshapes the network bottleneck
Network bottlenecks are not confined to the inside of data centers. They also extend outside, to base stations, devices, and the edge.
As on-device AI grew, there used to be two choices: processing within the device or processing in a data center. As distributed computing, such as small data centers near base stations, enters in the middle, the processing locations form a more complex three-layer or multi-layer structure.
4) In this flow, the cost of “operations” becomes important
When the network is the bottleneck, latency increases. When latency increases, the GPU and HBM spend more time waiting, which can reduce token production speed.
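The chain from latency to waiting time to token speed can be sketched with a deliberately simple model. It assumes each step consists of a compute phase followed by a network wait with no overlap between the two. Real systems overlap communication with computation, so treat this as an upper bound on the damage. The function name and all numbers are hypothetical.

```python
def effective_tokens_per_second(peak_tokens_per_second: float,
                                compute_ms: float,
                                network_wait_ms: float) -> float:
    """Throughput after accounting for time the GPU spends waiting on the network.

    Assumes compute and network wait happen back to back (no overlap),
    so the GPU is busy for compute_ms out of every compute_ms + network_wait_ms.
    """
    if compute_ms <= 0 or network_wait_ms < 0:
        raise ValueError("compute_ms must be positive and network_wait_ms non-negative")
    busy_fraction = compute_ms / (compute_ms + network_wait_ms)
    return peak_tokens_per_second * busy_fraction


# Same GPU, same 5 ms of compute per step, increasing network wait.
for wait_ms in (0.0, 5.0, 15.0):
    tps = effective_tokens_per_second(10_000, compute_ms=5.0, network_wait_ms=wait_ms)
    print(f"network wait {wait_ms:>4.1f} ms -> {tps:,.0f} tokens/s")
```

Under these assumptions the hardware never changes, yet throughput falls as the wait grows, which is why lower latency shows up as higher utilization.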
Fourth bottleneck (possible within 1 to 2 years): Data
1) Why “data” becomes the last bottleneck
At first, model architecture and computation infrastructure seemed more important, but for model quality to keep improving, you ultimately need the data required for additional training and refinement.
2) How data acquisition changes
Data is not limited to what is already “out there on the internet.” The following strategies therefore become important:
① buying or collecting non-public data
② generating synthetic data
③ having humans produce labeled/training data
④ building new measurement and collection methods (turning everyday experiences that were never digitized into data)
3) The biggest opportunity: “Who can collect data best”
Especially for physical AI (AI that moves in the real world), you need a channel to capture data that has been hard to measure before.
For example, the talk mentions a setup in which physical labor environments, described as a “hand palm (finger factory),” are captured and the data then flows into a specific company or research ecosystem. There are also observations that when wearables such as smart glasses arrive, the scope of data collection could expand to indoor and everyday life.
Three indicators to turn the token economy into an “investment language”
1) Tokens produced per second (speed)
How much the factory produces in one second. Not only GPU/HBM performance but also utilization rate (a design that keeps the GPU from idling) is key.
2) Tokens produced per watt (energy efficiency)
When operating costs (power) become the bottleneck, the investment question shifts to whether you can “produce more tokens with the same energy.”
3) Cost to produce 1 million tokens (cost structure)
This metric reflects depreciation (capex), operating expenses (opex), and utilization rate together. In other words, even for the same number of tokens, the cost varies depending on how they are produced.
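The three indicators can be tied together in a small calculation. The sketch below is one possible way to compute them, assuming linear depreciation, a constant power draw regardless of load, and electricity as the only operating expense. The class, its fields, and every number are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, replace

SECONDS_PER_YEAR = 365 * 24 * 3600


@dataclass
class TokenFactory:
    """Hypothetical inference server; all fields are illustrative assumptions."""
    peak_tokens_per_second: float   # throughput when the GPU never idles
    utilization: float              # fraction of time the GPU is busy (0 to 1)
    power_watts: float              # average draw, assumed constant
    capex_usd: float                # purchase price, depreciated linearly
    lifetime_years: float           # depreciation period
    electricity_usd_per_kwh: float  # power price


def tokens_per_second(f: TokenFactory) -> float:
    """Speed: tokens actually produced per second."""
    return f.peak_tokens_per_second * f.utilization


def tokens_per_watt(f: TokenFactory) -> float:
    """Energy efficiency: tokens per second for each watt drawn."""
    return tokens_per_second(f) / f.power_watts


def cost_per_million_tokens(f: TokenFactory) -> float:
    """Cost structure: capex plus power opex, spread over the tokens produced."""
    capex_per_second = f.capex_usd / (f.lifetime_years * SECONDS_PER_YEAR)
    # kW times USD/kWh gives USD per hour; divide by 3600 for USD per second.
    opex_per_second = (f.power_watts / 1000.0) * f.electricity_usd_per_kwh / 3600.0
    return (capex_per_second + opex_per_second) / tokens_per_second(f) * 1_000_000


half_idle = TokenFactory(10_000.0, 0.5, 10_000.0, 300_000.0, 5.0, 0.10)
fully_busy = replace(half_idle, utilization=1.0)

for name, factory in (("50% utilization", half_idle), ("100% utilization", fully_busy)):
    print(f"{name}: {tokens_per_second(factory):,.0f} tokens/s, "
          f"{tokens_per_watt(factory):.2f} tokens/s per watt, "
          f"${cost_per_million_tokens(factory):.3f} per 1M tokens")
```

Because capex and power are paid whether or not the GPU is busy in this model, doubling utilization halves the cost per million tokens on identical hardware.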
Not only GPUs: MPUs, memory, and chip ecosystems grow together
1) Agent era: you must produce tokens more cheaply
If token demand rises because of agents, inference, and service expansion but costs cannot be controlled, the business can stall even if the model is better. The trend, then, is toward growing competition to lower token production cost rather than relying only on expensive GPUs.
2) MPU competition: Intel, AMD, Qualcomm, Rebellions, etc. + big tech’s own chips
It’s mentioned that big tech companies such as Google, Microsoft, and Amazon are also building or expanding their own MPUs to optimize for their workloads.
3) Memory ultimately centers on Samsung Electronics, SK hynix, and Micron
While multiple companies can make MPUs, the logic is that high-performance memory has a limited supply chain, so the benefits could concentrate there.
4) Even in devices (on-device), small MPUs are needed
When AI runs on devices such as smartphones, PCs, some robots, and cars, those devices also need the small compute chips (MPUs) and memory required to produce tokens locally. Demand therefore expands beyond data centers into the semiconductor ecosystem inside devices.
Investment perspective checklist: “Which bottleneck you solve” is the answer
1) Whether a company makes money with AI vs uses AI to run its core business well
There’s a different lens for AI providers (companies that directly attack token production/bottleneck solutions) and for AI adopters (companies trying to improve costs/productivity in their own products/services).
2) Translating the bottleneck framework into investment ideas
① Semiconductor bottleneck: GPU/HBM/server efficiency (tokens per second)
② Energy bottleneck: cooling/power efficiency/tokens produced per watt
③ Network bottleneck: silicon photonics/fiber optic transmission + edge distributed processing (lower latency, higher utilization)
④ Data bottleneck: collection channels (wearables, physical capture, synthetic/label ecosystems)
One “core reinterpretation” that doesn’t show up much elsewhere in this kind of content
Many people in AI investing focus only on “models (upscaling)” and “GPUs (computation),” but if you reframe the content of this video from the token perspective, the conclusion becomes clearer.
The money path for AI infrastructure moves from “a company that does computation well” to “a company that solves the bottleneck that makes token production the cheapest and fastest.” My key viewpoint is that the order in which the bottleneck moves (GPU/HBM → power → network → data) can serve as a criterion for setting investment priorities over the next 1 to 2 years.
Main content you want to convey (conclusion)
AI infrastructure bottlenecks start at GPU/HBM, pass through power (cooling) and the network (optical, distributed), and data could finally become the fourth bottleneck.
Investment and business decisions should ultimately ask which bottleneck to relieve, and how, in order to improve token productivity as measured by tokens per second, tokens per watt, and cost. That framing makes it easier to find the money-making segment accurately.
< Summary >
– AI data center bottlenecks go from GPU/HBM (semiconductors) → power/cooling → network, and within 1 to 2 years, data could become the fourth bottleneck.
– Key indicators from the token production perspective are tokens per second, tokens per watt, and cost to produce 1 million tokens.
– Efficiency factors such as utilization rate, waiting time, and memory/transfer bottlenecks determine performance.
– The network is reshaped not only by silicon photonics/optical transmission but also by on-device and edge (small data centers near base stations).
– The opportunity in the data bottleneck is “who can collect better” (smart glasses/physical capture/synthetic/label).
– Competition expands to lower token cost, as not only GPUs but also the MPU and memory ecosystem grow together.
*Source: [ 티타임즈TV ]
– AI 데이터센터 병목과 돈이 될 길목, 한눈에 정리 (김지현 SK 부사장)


