Inference Shockwave: Nvidia's Grip Slips, Google Floods Gemini, OpenAI Chases Devices and an IPO


● Training-to-Inference Power Shift: Nvidia's Grip Slips, Google Floods Gemini, OpenAI Chases Devices and an IPO

The Real Changes Emerging as the AI Battleground Shifts from “Training” to “Inference”: Key Takeaways from a Three-Way Analysis of NVIDIA, Google, and OpenAI

Today’s post includes the following.

1) Starting in 2025, as the flow of AI money moves from “training” to “inference,” how the GPU-centered order gets shaken up

2) The structural reasons NVIDIA has become not just a semiconductor company but a “platform expansion machine”

3) The network effects created as Google injects Gemini not as an “app” but across its entire existing suite of services

4) OpenAI’s real objective in trying to handle IPO/investment/devices all at once

5) Realistic positioning for Korean companies and individuals to survive in this landscape (ride along vs go head-to-head)

1) Today’s core news in one line: “The AI market is now shifting its center of gravity from training to inference.”

Four years after ChatGPT’s launch, the shared diagnosis from all three analysts is that a major competitive realignment and paradigm shift has begun.

Until now, the money has mostly been made on the infrastructure side—GPUs (NVIDIA), memory (HBM), and data centers/cloud—

while AI model and service companies have been the ones spending it.

But now the logic is starting to change to: “We can’t keep going like this; services have to make money,” and the landscape is beginning to wobble.

2) (Semiconductors/Infrastructure) Why GPU dominance weakens as inference grows

The point is cost efficiency.

Training requires ultra-high-performance GPUs, but

inference is about running “a lot, frequently, and cheaply,” so the incentive to use dedicated chips (ASICs/inference accelerators) rises sharply.

Summarized news-style, the flow described in the video looks like this.

2-1) Change in AI workload mix: training↓, inference↑

Through 2024, training accounted for more than 60% of AI workloads, but

by 2025 a structure in which inference takes the larger share (a roughly 55:45 split was mentioned) is emerging—that’s the gist.

This means “the infrastructure mix changes.”

2-2) Why hyperscalers build their own inference chips

Google, Amazon, Microsoft, and Meta are all moving toward proprietary chips because

inference cost becomes a recurring “fixed cost” that directly eats into margins (a rough sketch of that arithmetic follows below).

Key keywords at this stage are data center investment, AI semiconductors, supply chains, and interest rates.
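To make the “recurring fixed cost” point concrete, here is a minimal back-of-envelope sketch in Python. Every number in it—the training bill, per-token serving rates, daily traffic—is an assumption for illustration, not a figure from the video.

```python
# Back-of-envelope: training is a one-time bill, inference is a recurring bill
# that scales with traffic. All numbers below are illustrative assumptions.

TRAIN_COST_USD = 100e6            # one-time training run (assumed)
GPU_COST_PER_1K_TOK = 0.0020      # serving cost per 1K tokens on general GPUs (assumed)
ASIC_COST_PER_1K_TOK = 0.0008     # same workload on a dedicated inference chip (assumed)

DAILY_1K_TOK_UNITS = 50e9 / 1000  # 50B tokens served per day (assumed)

def days_until_inference_exceeds_training(cost_per_1k_tok: float) -> int:
    """Days of serving before cumulative inference spend passes the training bill."""
    daily_spend = DAILY_1K_TOK_UNITS * cost_per_1k_tok
    return int(TRAIN_COST_USD / daily_spend) + 1

for chip, rate in [("GPU", GPU_COST_PER_1K_TOK), ("ASIC", ASIC_COST_PER_1K_TOK)]:
    print(f"{chip}: ${DAILY_1K_TOK_UNITS * rate:,.0f}/day in serving spend; "
          f"overtakes the training bill in ~{days_until_inference_exceeds_training(rate)} days")
```

Under these made-up numbers, GPU serving burns the equivalent of an entire training run roughly every three years of traffic—and traffic keeps growing. That is why a chip that cuts the per-token rate by half or more goes straight to the margin line.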

2-3) The signal that “even OpenAI can’t rely only on GPUs anymore”

The very fact that talk is emerging about OpenAI exploring its own chips (including inference-focused ones) with Broadcom

is itself a signal that the “NVIDIA alliance” is not permanently locked in.

2-4) NVIDIA’s response: “Don’t miss inference either” + “All the way to the data center?”

The reason NVIDIA is strengthening inference-dedicated chips/technology through acquisitions (as mentioned in the video) is simple.

If cost-efficient chips expand in inference, GPUs can be partially substituted—so NVIDIA needs to absorb that demand.

On top of that, if next-gen roadmaps (Vera Rubin, and then Feynman, were mentioned) drive bigger requirements for power, cooling, and facilities,

over the long run, the company could expand from “a company that sells chips” to “a company that effectively designs/leads super data centers.”

3) (Models/Apps) What happens the moment “ChatGPT’s solo dominance” starts to crack

The second fissure is change at the front end—user touchpoints.

Now, “people don’t use just one; they mix and match” becomes the default,

and as traffic disperses, the diagnosis is that a Warring States–style multipolar era is arriving.

3-1) Multipolarization with Gemini, Claude, Grok, and others

This is not merely a model performance contest.

As each company’s distribution channels (service embedding), data, and pricing policies combine, users can switch more easily.

3-2) How “agents” shake up SaaS valuations

As “agents that actually do work on a computer,” such as OpenAI Operator-type offerings, Anthropic Computer Use-type offerings, and OpenAI Codex, gain attention,

traditional SaaS that charges per seat (headcount-based pricing) comes under pressure.

So even companies with growing revenue may see their stock/valuation shaken as the market questions their future potential—that’s the argument.

4) NVIDIA: Not a “GPU company,” but a “developer-platform-industry expansion” company

The scary part of NVIDIA that all three highlight is not performance but “how it expands.”

4-1) Phase 1: Lock-in developers with CUDA

As GPUs became used for AI training, NVIDIA captured the developer community through the CUDA ecosystem,

and as a result, model companies ended up in a structure where they had no choice but to pick NVIDIA—that’s the explanation.

4-2) Phase 2: Another round of lock-in with the agent-era toolchain (NIM/NeMo, etc.)

When agents become mainstream, developers need “tools to build agents,”

and if NVIDIA lays down free tools/frameworks here,

the same structure repeats where NVIDIA ultimately takes the infrastructure behind it (DGX, etc.).
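For readers wondering what “tools to build agents” means concretely, below is a toy Python sketch of the loop such toolchains scaffold: a model chooses a tool, the runtime executes it, and the result feeds back in. The tool functions and the stubbed call_model are invented for illustration; real frameworks wrap this same pattern with model endpoints, sandboxing, and monitoring—and, per the lock-in argument, steer it toward particular infrastructure.

```python
# Toy agent loop: the pattern agent toolchains scaffold for developers.
# All tool names and the stubbed model call are hypothetical.

def search_web(query: str) -> str:            # hypothetical tool
    return f"results for {query!r}"

def write_file(path: str, text: str) -> str:  # hypothetical tool
    return f"wrote {len(text)} chars to {path}"

TOOLS = {"search_web": search_web, "write_file": write_file}

def call_model(history: list[dict]) -> dict:
    """Stand-in for an LLM call; a real framework would hit a model endpoint."""
    return {"tool": "search_web", "args": {"query": "HBM demand 2025"}, "done": True}

def run_agent(task: str, max_steps: int = 5) -> list[dict]:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = call_model(history)                    # model picks the next action
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append({"role": "tool", "content": result})
        if decision.get("done"):                          # model signals completion
            break
    return history

print(run_agent("Summarize the HBM demand outlook"))
```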

4-3) Phase 3: Platformization across industries—autos, robots, and digital twins

By moving into autonomous driving (automotive engines/platforms), robots (Omniverse/Isaac/Cosmos, etc.), and digital twins,

NVIDIA is expanding toward “everything that AI goes into.”

The core point here was the observation that “the company that breaks down the boundary between inside and outside controls the platform.”

5) Google: Winning by putting Gemini not into an “app,” but into “every existing product”

Google’s strength is less about the model itself and more about the fact that

it already has services people use every day (Calendar, YouTube, Search, Android, Workspace, etc.) and the data.

5-1) Gemini embedding strategy = Make users use AI “where they already were using things”

This eliminates user learning cost.

Instead of installing a new app and changing habits,

AI gets “injected” into existing workflows, so adoption spreads faster.

5-2) Penetration into outside industries: cars, home appliances, and robots

As seen in CES demos, when vision-language-action (VLA) models are attached to robots, home appliances, and cars,

Google targets a structure where “my model runs inside your product.”

5-3) A Google-style expansion trait: “Spread wide first, then clean up later”

Google runs startup-like experiments that big companies usually struggle to attempt,

and its ability to later prune those experiments down into a focused few is read as a competitive advantage.

6) OpenAI: Why it looks at IPO/investment/devices simultaneously

OpenAI has a meaningful B2C share, which leaves it strongly exposed to competitive dynamics,

so its moves can be read as simultaneously seeking more stable revenue sources (B2B) and a new arena (devices).

6-1) “We need more money” means the model competition is a long game

Continuous capital is needed for model advancement, agent expansion, and partnership scaling.

This phase is especially influenced by global supply chains, data center investment cycles, and the interest-rate environment.

6-2) The essence of device obsession: To flip the platform, you need a “new interface”

If a post-smartphone interface (pins, pens, glasses, etc.) takes hold,

you can design a new ecosystem on top of it.

It is the same logic as Meta’s obsession with Ray-Ban glasses/Quest.

7) The real reason AI devices (smart glasses) matter: “Even if it’s inconvenient, people use it if the benefit is big enough.”

Glasses will inevitably be heavy and uncomfortable (processor/memory/camera/microphone/speaker, etc.),

but if combined with AI to create utility that needs no persuasion, adoption happens.

The core point is that it is not the hardware but

the model, infrastructure, and connected experience behind it that determine usability.

8) Korean company strategy: “Ride along (HBM/infrastructure) + Go head-to-head (customer touchpoints/apps)”

8-1) A representative ride-along case: HBM

In the AI semiconductor era, HBM is an area where Korea has already built strength,

and it remains an important lever throughout the infrastructure expansion phase.

8-2) The domain you must challenge: customer touchpoints (applications/services)

Even though smartphone OSes were captured by Apple and Google,

regional players still survived in services/apps.

AI, too, is likely not “the model eats everything,” but rather

“who captures the screen customers open every day (work/finance/manufacturing/education, etc.)” becomes the decisive battleground.

8-3) Where opportunity emerges for Korea: applied, on-the-ground AX (AI transformation)

Areas like manufacturing, finance, and education—where AI reshapes “on-the-ground processes”—are mentioned as fields

where Korean companies can demonstrate strong execution.

9) Individual survival strategy: Become someone who doesn’t just “use AI,” but “collaborates with AI”

If you summarize why the junior hiring market is tough these days in one sentence, it’s this.

“When someone who already works well also uses AI well, more work concentrates on that person.”

9-1) So what you need is “AI collaboration experience”

People now say that “Have you worked with AI?” is weighed as heavily as “Have you done team projects?”

9-2) A realistic prescription: Artificially create “collaboration situations” through communities/projects

If this is hard to learn inside a company,

job seekers and juniors should form goal-based projects on their own

and build AI + human collaboration routines in advance—that becomes a competitive edge.

9-3) One step further: “Curiosity + skepticism (verification)” is differentiation

Don’t accept AI’s answers as-is;

the habit of asking back, “Why did this result come out?”, is what ultimately builds skill.

10) The “most important core points” that other news outlets and YouTube channels relatively under-discuss (blog-ready key takeaways)

Point A: Inference-dedicated chip competition is decided more by the “software layer (orchestration)” than by “chip performance.”

In the latter part of the video, IBM/Oracle-style orchestration (Harmony/Orchestra) comes up,

and this is a genuinely important signal.

As inference chips (MPU/ASIC) proliferate, the management/optimization layer that lets you run “on any chip, any model” becomes where the money is (see the sketch below).

In other words, the semiconductor war can revert from “hardware” back to a “software platform” war.
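To ground Point A, here is a minimal Python sketch of the orchestration layer’s core decision: given heterogeneous chips that each host only some models, route every request to the cheapest backend that meets the latency budget. All backend names, costs, and latencies are assumed for illustration.

```python
# Minimal "any chip, any model" routing: pick the cheapest backend that hosts
# the requested model and meets the latency budget. Numbers are assumed.

from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    models: set[str]        # models compiled/available for this chip pool
    cost_per_1k_tok: float  # USD per 1K tokens (assumed)
    p50_latency_ms: float   # typical latency (assumed)

BACKENDS = [
    Backend("gpu-pool",  {"llm-large", "llm-small"}, 0.0020, 120),
    Backend("asic-pool", {"llm-small"},              0.0008, 80),
]

def route(model: str, latency_budget_ms: float) -> Backend:
    """Cheapest backend that hosts `model` within the latency budget."""
    candidates = [b for b in BACKENDS
                  if model in b.models and b.p50_latency_ms <= latency_budget_ms]
    if not candidates:
        raise RuntimeError(f"no backend serves {model} within {latency_budget_ms}ms")
    return min(candidates, key=lambda b: b.cost_per_1k_tok)

print(route("llm-small", 100).name)  # -> asic-pool: cheaper and fast enough
print(route("llm-large", 200).name)  # -> gpu-pool: only pool hosting the model
```

Whoever owns this routing logic—not any single chip vendor—captures the margin whenever a cheaper backend comes online, which is exactly why the war can move to the layer above the silicon.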

Point B: NVIDIA’s next threat is not GPUs but “data center design authority.”

Next-gen chips change power/cooling/rack design.

This expands beyond simple component supply into a fight over “who sets data center standards,” and

if NVIDIA holds the standard, the entire supply chain can be reshaped.

Point C: The battle over AI devices (smart glasses) is decided not by comfort but by “repeat utility strong enough to change habits.”

The condition for smart glasses to spread even if they are uncomfortable is “repeat value that makes you use them every day.”

If that holds, the next customer touchpoint after smartphones changes,

and at that moment, app/advertising/commerce revenue models shift in one sweep.

< Summary >

The AI market is shifting from training-centric to inference-centric, and as a result inference-dedicated chips (ASIC/MPU) and orchestration layers are becoming important.

NVIDIA strengthens its dominance through “cross-industry expansion” from CUDA → agent tools → automotive/robot platforms.

Google counterattacks by injecting Gemini not as a standalone app but across its entire existing services, leveraging distribution and data.

OpenAI, amid a long-run model competition, simultaneously pursues B2B stabilization and devices to change the game.

For Korea, a two-track approach is realistic: ride along with HBM/infrastructure, and compete head-to-head in customer touchpoints (services/apps/AX).

Individuals survive by becoming people who don’t just “use AI,” but “collaborate with AI” to produce outcomes.

[Related posts…]

How changes in NVIDIA’s AI semiconductor roadmap impact Korea’s supply chain

The Inference Era: The ASIC war and data center investment points

*Source: [ 티타임즈TV ]

– Three tech experts analyze NVIDIA, Google, and OpenAI (VP Kim Ji-hyun, Prof. Choi Jae-hong, Prof. Yoon Jong-young)

