AI Automation Frenzy


● AI Agents Go Full Automation

Three decisive signals that AI is shifting from an “answering model” to an “agent workflow that gets the job done to the end”

Today’s news rests on exactly three pillars.

1) Anthropic is testing Claude in a form that is “always on, responsive to triggers, and runs in its own environment”

2) Z.ai released a screen-aware (vision) coding model designed to carry screen recognition through to real coding and work

3) Alibaba is pushing “repository-level engineering + agent execution” with Qwen 3.6 Plus, which puts a 1M (1 million) token context front and center

These three are converging in one direction.
It’s moving from a setup where the model stops at the “answering” stage to one where it keeps working after looking at the screen (observing) → reasoning (thinking) → connecting with tools/systems (acting).

And if you pin this flow down in SEO keywords:
AI agents, vision-based coding, context windows, multimodal workflows, and software automation run through the entire story.


1) Anthropic: Testing “Claude CONWAY” as a separate ‘standalone environment’ for agents

[Key takeaway] This reads like a signal that Claude is evolving beyond the chat window into a persistent agent environment, with instances that appear as an option in the sidebar.

1-1. Conway is closer to “persistent (always-on)” than “sessions”

  • Instead of normal chat, users select Conway as a separate option (sidebar)
  • When clicked, a “Conway instance” runs
  • Internally, it’s also expressed in a way that’s closer to a resident/maintained agent workspace rather than a simple session
  • In other words, it’s not structured around “the model answers and stops,” but around work units that carry state

1-2. Agent workspace: chat/search/system split apart

  • Chat: the conversation function you’d generally expect
  • Search: appears to be connected to an experimental hotkey
  • System: the real differentiator
  • Manage the agent environment
  • Install/Connect extensions
  • Add UI tabs
  • Configure context handlers

1-3. The “CNW ZIP” extension ecosystem: operating like a ‘platform,’ not just a model

  • Preparing Conway extensions in a CNW ZIP file packaging format
  • A direction where developers package tools and attach them “like an app” inside the agent environment
  • Why this is important:
  • In the future, Claude is likely to become more than a single model—more like an execution platform where tools are plugged in and wired together
  • Ultimately, the competitive point shifts from “performance only” to “extensions/integration/operating structure”
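The CNW ZIP format itself isn’t public, so here is a minimal sketch of what “packaging a tool like an app” could look like, using Python’s standard zipfile module. Every manifest field and function name here is invented for illustration; the real format may differ entirely.

```python
# Hypothetical sketch of a zip-packaged extension in the spirit of the
# "CNW ZIP" idea described above. Manifest fields are invented, not real.
import io
import json
import zipfile

def package_extension(name, version, entrypoint, files):
    """Bundle an illustrative manifest plus source files into an in-memory zip."""
    manifest = {"name": name, "version": version, "entrypoint": entrypoint}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
        for path, source in files.items():
            zf.writestr(path, source)
    return buf.getvalue()

def list_extension(blob):
    """Inspect an extension archive the way a host platform might on install."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        return manifest, sorted(zf.namelist())
```

The design point is the platform shift: once tools ship as self-describing archives, the host can install, list, and wire them together without caring what is inside.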

1-4. Connectors/tools + Chrome toggle: the browser enters the agent loop

  • Display the tools exposed to the connected client
  • There’s a toggle for Claude (browser) to connect directly to Conway
  • The browser itself can become the agent’s input/workspace
  • This reads less like a simple demo and more like a signal to create a real-world loop (observe-act)

1-5. Webhook trigger: a structure that works even if you don’t leave it “open”

  • A webhook system is embedded inside Conway
  • External services call it via a public URL → the agent “wakes up”
  • So it’s not about keeping the user waiting, but moving toward an always-on agent that runs event-based
  • This also aligns with Anthropic’s Claude Code/agent workflow direction
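The “external call wakes the agent” pattern can be sketched with nothing but the standard library. `AgentRunner` and the `/wake` endpoint below are illustrative stand-ins, not Anthropic’s actual API.

```python
# Minimal sketch of the event-driven "wake on webhook" pattern described
# above. All names (AgentRunner, /wake) are invented for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AgentRunner:
    """Stand-in for a persistent agent that stays dormant until triggered."""
    def __init__(self):
        self.events = []

    def wake(self, payload):
        # In a real system this would resume the agent's work loop.
        self.events.append(payload)
        return {"status": "agent woken", "event": payload.get("event")}

agent = AgentRunner()

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = agent.wake(payload)  # external call -> agent wakes up
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep output quiet
        pass

def serve(port=0):
    """Return a server on an ephemeral port; the caller drives requests."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

The point of the pattern: the user doesn’t wait on the agent, and the agent doesn’t poll; a public URL turns any external event into a resume signal.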

1-6. Improved developer experience: Claude Code “NO_FLICKER mode” + mouse support

[Key takeaway] The point isn’t only that agents are getting stronger—it’s also that they improved “terminal UX” so developers can actually use it more comfortably.

  • NO_FLICKER mode
  • Fixes flicker/jumps/long-session performance degradation commonly seen in terminals
  • Updates via screen buffer method (updates only the observable area) instead of full re-rendering
  • Stabilizes CPU/memory usage (considering long conversations and even multimodal multi-agent workflows)
  • Full mouse support
  • Cursor position via clicking
  • Clickable tool output expansion
  • Click a URL to open immediately
  • Click a file path to open in the editor
  • Drag to select → automatic clipboard copy
  • Smooth scrolling-wheel navigation
  • More precise selection units with double/triple click (word/line)
  • One-line trade-off
  • Some native search shortcut keys may behave differently (experimental)
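The screen-buffer idea behind flicker reduction can be illustrated in a few lines: diff the new frame against the previous one and emit cursor-move/clear sequences only for lines that changed, instead of reprinting the whole screen. This is a generic sketch of the technique, not Claude Code’s actual implementation.

```python
# Illustrative partial-redraw sketch: update only changed lines via ANSI
# escape sequences rather than re-rendering the full frame.

CSI = "\x1b["  # ANSI control sequence introducer

def diff_redraw(old_frame, new_frame):
    """Return the escape-sequence commands needed to update changed lines only.

    Each frame is a list of strings, one per terminal row. For every row
    that differs, emit: move cursor to (row, 1), erase the line, write new text.
    """
    commands = []
    for row, (old, new) in enumerate(zip(old_frame, new_frame), start=1):
        if old != new:
            commands.append(f"{CSI}{row};1H{CSI}2K{new}")
    return commands
```

With this approach a long session touching one status line costs one small write per update, which is where the CPU and flicker savings come from.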

2) Z.ai: Aiming for “screen awareness + coding/work” in one go with GLM-5V-Turbo

[Key takeaway] It’s not just about having the ability to see the screen; it’s about connecting directly into a screen-to-coding/work flow.
As the name suggests, “5V Turbo” focuses on vision-based coding and the vision agent workflow.

2-1. A direct hit on the existing problem: “can see, but can’t get the work done”

  • Many multimodal models
  • describe images well, but
  • have weaknesses in connecting them to actually useful code/actions
  • GLM-5V-Turbo claims it’s designed to handle both sides “at the same time”

2-2. Inputs include screens/documents/videos: accepting real work formats as-is

  • Supported scope (main points)
  • Images, video, UI layouts, design mockups
  • Dense documents
  • Given real-world workflows, the core is this:
  • In practice, it’s not “clean text”
  • It includes “messy evidence” like broken screens, PDFs, bug screenshots, and problem videos
  • So the model needs to properly capture visual grounding to make real work possible.

2-3. Technical keywords (company claims): Cogvit Vision Encoder + MTP + speed/long-output optimization

  • Preserve fine visual details/layout with Cogvit Vision Encoder
  • Strengthen handling of speed and long outputs with MTP (multi-token prediction)
  • Translated into one line:
  • “Observe clearly (preserve) → think fast (predict) → produce long work results (long output)”

2-4. A 200,000 context window + simultaneous multi-task training

  • 200K context
  • A strategy to handle long documents, large codebases, and long flows based on vision/video in one go
  • Train 30+ tasks at the same time (claim)
  • Including STEM reasoning, vision grounding, video analysis, and tool use
  • It’s not a model that only does one capability well—
  • You can summarize it as aiming for the entire chain that goes from observing → understanding → continuing to the next action.

2-5. Optimizing the agent workflow: oriented toward OpenClaw/Claude Code

  • Optimized so the agent moves in a screen-based work environment
  • Example:
  • Look at the screen and help with setup
  • Decide the next action by reading the screen
  • Proceed step-by-step like real computer work
  • It also mentions linkage with Claude Code
  • Show screenshots/bug situations and it suggests code
  • “Pointing instead of explaining” becomes natural
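The “pointing instead of explaining” flow usually means attaching a screenshot to a chat message. Below is a hedged sketch using the OpenAI-style multimodal payload shape that many vision endpoints accept; the model name is illustrative, not a confirmed identifier.

```python
# Sketch of pairing a bug screenshot with a coding question in the
# common OpenAI-style multimodal message format. The model slug is a
# placeholder, not an official Z.ai identifier.
import base64

def screenshot_message(image_bytes, question, model="glm-5v-turbo"):
    """Build a chat payload that pairs an image with a coding question."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": question},
            ],
        }],
    }
```

Usage is exactly the “messy evidence” case from above: read the screenshot bytes, attach a one-line question, and let the model ground its answer in the pixels.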

2-6. Benchmark mentions: evaluating multimodal coding/agent execution

  • CCbench, V2, Zclaw Bench, Claw Eval, etc.
  • The key is that the test isn’t only about “understanding visually”—
  • multimodal coding
  • multimodal multi-step agent execution
  • producing useful results
    That’s what they’re set up to measure.

3) Alibaba: Accelerating “project-level agent coding” with Qwen 3.6 Plus + 1M context

[Key takeaway] The strongest number here is 1M tokens (1 million) context.
And the goal isn’t “a chatbot demo,” but repo-level engineering + real execution.

3-1. “Capability loop”: repeat perception-reasoning-action in one workflow

  • What Alibaba emphasizes is the “full capability loop”
  • In other words, it’s not just answering once and stopping;
  • decomposing tasks
  • doing step-by-step work
  • running tests/corrections
  • moving forward continuously
  • Especially in coding, as agent reliability and repeat execution become more important, this direction aligns well with the market trend (agent workflows).
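The perception-reasoning-action loop described above can be sketched as a retry-driven step runner. The “steps” here are toy functions standing in for real tool calls; this is the shape of the loop, not Alibaba’s implementation.

```python
# Minimal sketch of a full capability loop: decompose a task into steps,
# act, verify the result (tests), and retry a failing step before moving on.

def run_capability_loop(steps, max_retries=2):
    """Execute (name, action, check) steps in order.

    Re-runs a failing step up to max_retries times; records every attempt.
    """
    log = []
    for name, action, check in steps:
        for attempt in range(max_retries + 1):
            result = action()              # act
            if check(result):              # observe + verify
                log.append((name, "ok", attempt))
                break
            log.append((name, "retry", attempt))  # correct and repeat
        else:
            log.append((name, "failed", max_retries))
    return log
```

The design point is exactly the reliability argument in the bullets: a loop that verifies and retries each step is what separates “answer once and stop” from continuous forward progress.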

3-2. Repository engineering: handling the whole project, not snippets

  • Not a single code fragment;
  • perform work across the entire codebase
  • Meaning:
  • Keeping long-form context
  • Tracking dependencies across files
  • Needing multiple edit/validation loops
  • That’s why it matches the nature of 1M context.

3-3. The 1M context window: the foundation for an agent to “keep memory”

  • 1M tokens means the agent can contain more information at once (long documents/large code/long instructions)
  • What an agent really needs isn’t short Q&A but maintained context:
  • what it did previously
  • which files/tools were important
  • what work still remains
  • As the window grows, the cost of re-supplying that context and the risk of omissions both go down.
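A back-of-the-envelope sketch makes the 1M-token claim concrete: with a crude heuristic of roughly 4 characters per token (real tokenizers vary by language and code style), you can estimate how much of a repository fits in a window while reserving room for output.

```python
# Rough token-budget sketch for repo-level context. The 4-chars-per-token
# ratio is a common rule of thumb, not an exact tokenizer measurement.

def estimate_tokens(text, chars_per_token=4):
    """Crude token estimate from character count."""
    return len(text) // chars_per_token

def fits_in_context(files, context_window=1_000_000, reserve=50_000):
    """Check whether a set of file contents fits, leaving room for output.

    files: mapping of path -> file contents.
    Returns (estimated_total_tokens, fits_boolean).
    """
    total = sum(estimate_tokens(src) for src in files.values())
    return total, total <= context_window - reserve
```

By this heuristic, 1M tokens is on the order of 4 MB of source text, which is why repository-scale work becomes plausible in a single window.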

3-4. Preview on OpenRouter + (for now) free access: expanding experimental accessibility

  • Offered in a preview form on OpenRouter
  • Currently, free access based on 1M context is mentioned
  • This can also be viewed as a mechanism to help “developers try it quickly and attach it to workflows.”
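Trying the preview is a single OpenAI-compatible request against OpenRouter’s endpoint. The model slug below is a guess based on the article’s naming; check OpenRouter’s model list for the actual identifier before use.

```python
# Sketch of calling the preview via OpenRouter's OpenAI-compatible
# chat-completions endpoint. MODEL is a hypothetical slug, not confirmed.
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "qwen/qwen-3.6-plus"  # hypothetical slug; verify before use

def build_request(prompt, api_key, model=MODEL):
    """Construct the HTTP request without sending it."""
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

def ask(prompt, api_key):
    """Send the request and return the model's reply text."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same request shape works with most existing agent tooling, which is precisely the “attach it to workflows quickly” mechanism the article describes.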

3-5. Efficiency/reliability: hybrid architecture + stronger agent execution stability (claim)

  • Improving the hybrid architecture →
  • efficiency
  • reduced energy consumption
  • improved scaling
  • It explains that it strengthened inference/agent execution reliability compared to the 3.5 series.

3-6. Deployment-oriented: Wukong (enterprise automation) + agent tool integrations

  • Wukong: a multi-agent platform for automating enterprise work
  • Mentions connections with OpenClaw, Claude Code, Cline, etc.
  • Also for multimodal:
  • parsing dense documents
  • real-world visual analysis
  • long video inference
  • screenshots/hand-drawn wireframes/mockups → generating frontend code
    It describes everything centered on “real work inputs.”

The most important turning point to take away from this news

The core is one thing.

The market’s competitive point is moving
from “how convincingly it answers”
to “whether it can keep repeating observe-reason-act to the very end inside the agent workflow”.

So even though the three companies’ directions differ (Claude CONWAY vs GLM-5V-Turbo vs Qwen 3.6 Plus), they share the same denominator.

  • Persistent (always-on) agents: triggers/webhooks/independent instances
  • Screen-based multimodality: treat screenshots/video/layouts as “work inputs”
  • Context expansion: keep project/long instructions with 200K~1M scale
  • Software automation: finish results through repo-level coding, tool integration, and repeated execution

In short:
AI is moving beyond being a conversation partner toward becoming the “agent-like entity that runs work,” like an operating system.


5 questions to check from an investment/work perspective on this update

1) Does the workflow I use connect not just “chat,” but also “event triggers/tool execution”?
2) Does it take screens (screenshots/videos/layouts) as real work inputs and carry that through to the results?
3) Is the context window large enough so project-level work doesn’t break apart?
4) Is the structure one where you can support extensions or tool connections?
5) In multi-step execution, does it reliably work “all the way to the end”?


The main takeaway (one-line summary)

As these three come together—Claude CONWAY’s persistent agent structure, Z.ai’s screen-aware coding, and Alibaba’s 1M-context-based project execution—AI agents are entering the “business workflow automation” stage in earnest.


< Summary >

  • Anthropic tests Claude with a Conway persistent agent environment, showing an always-on execution direction with an extension ecosystem (CNW ZIP), connectors, and a webhook trigger
  • Claude Code improves long-session developer UX with NO_FLICKER mode and full mouse support
  • Z.ai has GLM-5V-Turbo designed to more directly understand screen/layout/document/video inputs and carry it through to agent coding (claims include a 200K context and simultaneous multi-task training)
  • Alibaba ships Qwen 3.6 Plus with 1M context by default and puts the perception-reasoning-action loop and repository engineering front and center
  • Conclusion: Competition is moving from “answers” to “the ability to get it done to the end inside the agent workflow”

[Related articles]

*Source: [ AI Revolution ]

– Anthropic’s New Claude CONWAY Is Unlike Any AI Before

