Computer Use (driving browser/desktop) - Hermes Agent Deep Dive

Subtopic 2.6 is Computer Use — the mode that turns Hermes from a chat/CLI agent into something that can move a real mouse and type into a real app. The syllabus video for this subtopic is Hh1sDJZ6VhY ("Hermes Agent Computer Use Setup Guide (Use Cases and Tips)", 3,543 views, published 2026-05-12), but its transcript and summary are still null in public.videos as of 2026-06-17, so the article is grounded in 10 first-party viewer comments from public.youtube_comments on that video, plus three other videos in the channel's catalogue that do have populated takeaways: qac5ChGLgTU (OpenClaw 4.27 Codex Computer Use, with concrete setup steps), KKgEmpEh7zM (Hermes v0.14 Foundation Release, the one that names macOS as the supported platform), and VIpMz5uz4Cc (Hermes v0.10 Tool Gateway, with a direct safety ruling). Viewer quotes are cited at the end of the article, not inline.

Key takeaway. Computer Use is the most permissive Hermes mode ("even more YOLO than the TUI," in viewer language) and the channel's standing safety ruling — do not grant it blanket system permissions — comes from VIpMz5uz4Cc. The first run should always be on a Mac (the only platform with full support as of v0.14) inside a sandboxed account or VM, with a vision-capable smart model wired into the Computer Use slot.

What you'll learn

The architecture is "model is the brain, Hermes is the hands." A viewer reply to a use-case question on the source video puts it directly: the model in the Computer Use slot has to be smart enough to read a screenshot and decide what to click — Hermes is the routing layer that turns that decision into mouse moves and keystrokes.
The #1 tip from the source video, as the audience received it, is "just use a smart model." The top-liked reply on the source video is exactly that — 2 likes, from a channel reply to the Adobe Premiere Pro use-case question. A text-only model in the Computer Use slot will look like a Hermes bug for hours.
Computer Use is presented as the deepest, most "YOLO" Hermes mode. The audience reads the source video as one step further than the TUI: full desktop access, not just a permissive shell.
The "it looks like a virus" reaction is the audience's first instinct. That is the design signal: the source video is presenting a surface that needs an explicit sandbox story.
The most concrete use case the audience surfaced is driving a native creative app with no scriptable API — Adobe Premiere Pro in particular, asked about by @MrSpice1971 and answered yes by the channel.
macOS is the only fully supported host as of v0.14. From KKgEmpEh7zM: "Wait for the Linux/Windows computer-use rollout 'in the next few weeks' if you need desktop-app control off macOS," and the host still rates macOS as "probably the best way to use everything." Native Windows is positioned at the very end of the roadmap.
The channel's standing safety rule for Hermes-with-system-permissions is: don't grant blanket permissions. From VIpMz5uz4Cc (v0.10 Tool Gateway): "Run Hermes on a VPS, not a local Mac mini, and don't grant it blanket system permissions — the creator frames this as the right risk trade-off given current agent security maturity." Treat that as the channel's house rule for Computer Use, not just for the tool gateway.

1. The "brain and hands" framing

The clearest signal we have for what the source video was pitching is a viewer reply to a use-case question on the video itself: "the model has to be smart, that's what the 'brain' is. Hermes, is like your hands!" That is, Computer Use is not a model that learned to drive a desktop — it is a routing layer that lets an existing model drive one. The "smart model" is a precondition, not a feature of Hermes itself.

The same thread confirms the source video engaged with "drive a native creative app" as a use case. @MrSpice1971 asked "I was thinking about control of Adobe Premier Pro for basic Editing tasks - is this feasible?", and the channel replied "Yes it is. just use a smart model" (the top-liked non-channel comment on the video, 2 likes). So: the use case is real, the answer is yes, and the only product requirement the source video cared to repeat was "use a smart model."

The "brain and hands" framing also tells you what to not expect. Computer Use will not invent a workflow you could not have scripted — it will execute a workflow you could not have scripted because the target app has no API. That is the use case: native creative apps, legacy desktop software, internal tools that never shipped a CLI.

2. The "very deep" warning

The most-liked non-channel comment on the source video is a one-sentence escalation: "when you think Hermes tui is YOLO, wait till you give computer use full access. this is actually very deep" (1 like). Two days later a different viewer wrote "Not too long ago installing an app that can completely control your computer is called a virus..." (0 likes). The two reactions in the same week on the same video are the audience's first-instinct framing: this is the surface where Computer Use looks like a virus, and full desktop access is the default, not a special case.

The channel's own framing matches. From VIpMz5uz4Cc, the v0.10 release notes spell out the safety ruling the channel keeps repeating: "Run Hermes on a VPS, not a local Mac mini, and don't grant it blanket system permissions — the creator frames this as the right risk trade-off given current agent security maturity." For Computer Use specifically, the source video evidently told viewers to plan a sandbox before turning it on — that is the only way to read the "virus" reaction as a setup concern rather than a deal-breaker.

Note on what is recoverable. The "brain and hands" framing and the "very deep" warning are viewer glosses on the source video, not direct transcript quotes. The transcript is still null in public.videos for Hh1sDJZ6VhY. We are holding the direction of the framing as confirmed by the comment thread plus the channel's standing safety rule from VIpMz5uz4Cc; the exact wording is the viewer's.

3. The "smart model" requirement

The "use a smart model" rule is the most actionable tip in the source video. Concretely:

The model must be vision-capable. Computer Use takes screenshots and asks the model "what do I click next?" A text-only model will return nonsense, and the error will look like a Hermes bug, not a model mismatch.
The model must be smart enough to plan multi-step flows. Computer Use is not a single-prompt surface — the agent takes a screenshot, decides, clicks, takes another screenshot, decides again. If the model cannot hold the workflow in its head, it loops.
The model must be "smart" in the channel's specific sense — i.e. one of the named orchestrators in §2.9 (GPT 5.4, Gemini 3.1 Pro, Qwen 3.6 Plus, Claude Opus 4.7). The §2.9 ranking is the only place the channel actually scores models for this slot.

In practice, the source video's advice reduces to: pick the smartest vision-capable model you have a key for, and put it in the Computer Use slot. Do not optimise for cost on the first run.

4. Use cases the comment thread surfaces

The title's "Use Cases" half is the half we have the least direct evidence for, but the comment thread gives us three concrete categories:

Driving a native creative app with no scriptable API. Adobe Premiere Pro, asked about by @MrSpice1971 and answered "yes, just use a smart model" by the channel. The generalisation is: any desktop app where the vendor never shipped a CLI or MCP.
Browser tasks as a fallback when there is no clean API. The source video evidently positioned Computer Use as the next layer down: when the tool surface runs out, drive the UI. This is the same audience the channel addresses in Course 1 §1.4 and Course 2.03 §1 — users who have already hit the ceiling of what skills and tools can do.
Vision-capable task automation. The "smart model" filter is itself a use-case filter: Computer Use is for tasks where a smart model can succeed at a vision task, not for tasks a cheap model could do with a script.

Note on use cases. A direct creator quote of the use-case list is not in the record. The three categories above are inferred from viewer questions and replies, not from the source video's own bullet list. Re-pull the transcript when available and substitute the creator's actual list.

5. Platform support — what is actually live

Three data points pin this down as of v0.14 (the last release that names Computer Use explicitly):

KKgEmpEh7zM (v0.14, 2026): "Full computer use functionality is currently available on Mac OS, allowing the agent to interact with MacBook applications. Linux and Windows support for computer use will roll out in the next few weeks."
kRG3ivHCHO4 ($30 vs $200 X access, 2026) restates the same status: "Full computer use functionality is currently available on Mac OS… Linux and Windows support for computer use will roll out in the next few weeks."
KKgEmpEh7zM host's bottom line: "Skip native Windows for now — the host still rates macOS as 'probably the best way to use everything' over the new Windows build."

Net: macOS is the only fully supported host. Linux and Windows are "in the next few weeks" as of v0.14, and we cannot verify they shipped by the time of this writing. If you are on a Linux VPS, expect Computer Use to either be absent or fail-closed (a pattern the channel calls out in the OpenClaw 4.27 Codex Computer Use video — see Source #6 below — where the Linux VPS path is a hard dealbreaker for Codex's computer use).

6. Concrete setup — what we can verify

The source video's transcript is null, so its exact install commands are not recoverable. But three grounded setup steps are recoverable from related videos, and they are the ones the "Try it yourself" section below is built on:

On a Mac, the install pattern is the standard Hermes pattern from §2.2: sudo apt update (or the macOS equivalent), paste the install command, register a model, defer the chat platform, confirm a skill loads on first launch. Computer Use ships as part of the standard Hermes build on macOS as of v0.14; no separate install is needed.
Set auto-install: true under computer use in your Hermes config if you want the agent to handle plugin re-enable at the start of every turn. This is the literal guidance from qac5ChGLgTU (OpenClaw 4.27 Codex Computer Use). The pattern is the same on Hermes — the config key is in the same family.
After any upgrade that touches Computer Use, run hermes update and then hermes doctor --fix. From qac5ChGLgTU: "After running openclaw update to 4.27, immediately run openclaw doctor --fix — older plugins with implicit startup loading, stale internal-path imports, and pinned non-default history behaviors are the three known breakage classes." On Hermes, the equivalent is hermes update then the doctor / restart flow. The breakage class is identical.
Do not chase Codex / OpenClaw computer use on a Linux VPS — fail-closed mode will block every turn because there is no GUI to attach to. This is the literal warning from qac5ChGLgTU. For Hermes-on-Linux, the same constraint will apply until the v0.14 "next few weeks" Linux rollout ships — i.e. expect computer use to either not be present, or to be present but not attachable to a real desktop session.
Do not grant blanket system permissions. The channel's house ruling, from VIpMz5uz4Cc (v0.10): "Run Hermes on a VPS, not a local Mac mini, and don't grant it blanket system permissions." For Computer Use on a local Mac, the equivalent is: do not log the agent host into your personal Apple ID, do not put the box on your home LAN, and use a brand new Apple ID on the host. This is the §2.2.4 isolation pattern, applied to Computer Use.

These five steps are the only setup that is recoverable from the data we have. Everything else in the source video is held as not-yet-verified.

7. Tips — what the comment thread confirms

Three tips are recoverable from viewer comments and from related release videos:

Tip 1: Use a smart model. Top-liked reply on the source video. The model in the Computer Use slot has to be vision-capable and smart enough to read screenshots and decide what to click.
Tip 2: Plan for "very deep" access. Full desktop access is the default, not a special case. Have a sandbox / VM / dedicated user account ready before you turn it on, not after.
Tip 3: Run Computer Use on a Mac, not Linux or Windows. The Linux/Windows rollout was promised in v0.14 "in the next few weeks" — verify the current platform list on the official Hermes docs before you start. If you are on Linux, expect to wait or to fail-closed (per qac5ChGLgTU).

8. The Computer Use decision matrix — when to use it, when not to

The §2.6 subtopic closes with the question every Hermes user has to answer: when should a workflow use Computer Use, an OAuth integration, a Skill Bundle, or a direct API call? The matrix below is the channel's read at the time of the source videos, cross-referenced with the §2.1.3 OpenHuman comparison (Composio's 118 OAuth connectors) and the §2.5.2 Skills-to-Dashboards rule.

Target surface	Recommended path	Why
Web app with clean API	Direct API call (or Hermes skill)	Cheapest; deterministic; no UI brittleness
Web app with no clean API	Composio OAuth via OpenHuman or Hermes	Faster than Computer Use; not pixel-based
Web app with no clean API and no OAuth	Computer Use (browser)	Last resort for "no API, no OAuth"
Native desktop app with scriptable API	Direct API call (or Hermes skill)	Cheapest; deterministic
Native desktop app with no scriptable API	Computer Use (full desktop)	Adobe Premiere Pro use case from the source video
Legacy desktop software	Computer Use	The only path; no vendor API; no OAuth
Internal tools that never shipped a CLI	Computer Use	The only path; same as legacy desktop
Anything you do more than a handful of times	Skill Bundle + dashboard (§2.5)	The "skills-to-dashboards" rule
Multi-step UI flows with retries	Kanban card (§2.3) with Computer Use per card	Parent-child retry + verification per flow

The four sentences to remember:

Web apps with clean APIs use direct API calls. Computer Use is the last resort, not the default.
Web apps without clean APIs but with OAuth use Composio. OpenHuman and Hermes both ship Composio connectors; the difference is Hermes's depth vs OpenHuman's setup speed.
Native desktop apps with no API use Computer Use. Adobe Premiere Pro is the named example.
Anything you do more than a handful of times leaves Computer Use and becomes a Skill Bundle or a dashboard. The §2.5.2 rule, applied to Computer Use.

The default path, restated. For each new workflow, walk the four rows top-to-bottom. If it lands in the first row, use a direct API call. If the second, use Composio. If the third, use Computer Use (and read §2.6.6 first). If it lands in none of the rows, it's a one-off — keep it as a one-shot TUI prompt.

The Computer Use vs OpenHuman OAuth trade-off. The §2.1.3 OpenHuman review's load-bearing trade-off: "OAuth is fast but rigid; computer-use is slow but scriptable." For a workflow that has a Composio connector (GitHub, Jira, Linear, Notion, Slack, Google Docs), OAuth is the right answer. For a workflow that doesn't (Adobe Premiere Pro, legacy desktop software, internal tools with no CLI), Computer Use is the only path. The channel's framing: the 118 OAuth connectors cover most white-collar workflows; the long tail of native desktop apps is where Computer Use earns its keep.

The "very deep" warning, restated. Computer Use gives the agent full desktop access by default. That is the surface where Hermes looks like a virus to a first-time viewer. The §2.6.2 framing applies on every Computer Use run, not just the first one: have a sandbox / VM / dedicated user account ready before you turn it on, not after. The §2.2.4 isolation pattern (brand-new Apple ID, no personal services, off the home LAN) is the operational floor for Computer Use on a local Mac.

10. Computer Use worked example — driving a creative app

The most concrete use case the source video's comment thread surfaces is driving Adobe Premiere Pro — a native creative app with no scriptable API. The worked example, reconstructed from the source video and the §2.6.6 setup steps:

The setup. A Mac running Hermes with a sandboxed account (brand-new Apple ID, no personal services). The model in the Computer Use slot is GPT 5.4 (vision-capable, smart enough to plan multi-step flows). The Hermes config has auto-install: true under computer use. The dashboard's cron tab has a "Run Premiere" job scheduled for low-traffic hours.

The task. "Open Premiere Pro. Import the latest 3 raw video files from ~/Videos/raw/. Apply the 'Vlog 1' preset. Export to ~/Videos/exported/. Verify the exported files exist and report the file sizes."

The flow, step by step:

Hermes takes a screenshot of the Mac desktop. The vision-capable model identifies the Premiere Pro icon in the dock.
Hermes clicks the icon. Premiere Pro launches. The model takes another screenshot, identifies the "Import" button in the Project panel.
Hermes clicks "Import". A file picker opens. The model types the path ~/Videos/raw/ into the path field, presses Enter.
The file picker shows the latest 3 files. The model clicks the first file, Shift-clicks the third, clicks "Open".
Premiere Pro imports the 3 files. The model waits for the import to complete (screenshot heuristic: the progress bar is gone).
The model identifies the "Effects" panel, searches for "Vlog 1", drags the preset onto each clip in the timeline.
The model identifies the "Export" button, navigates the export dialog, sets the destination to ~/Videos/exported/, clicks "Export".
The model waits for the export to complete (screenshot heuristic: the export progress bar is gone).
The model opens Finder, navigates to ~/Videos/exported/, takes a screenshot of the file listing, parses the file sizes.
The model reports back: "Exported 3 files: clip1.mp4 (1.2 GB), clip2.mp4 (980 MB), clip3.mp4 (1.5 GB)."

What goes wrong in the worked example. At least four failure modes are recoverable from the source video and the §2.6 setup:

The dock icon is not in the default location. The user moved the dock; the model looks for the icon in the bottom of the screen. The fix: the model takes a screenshot, looks for the Premiere icon, and clicks wherever it finds it. The model is not looking for a specific pixel coordinate; it is looking for a specific visual element.
The "Import" button is in a different panel. Premiere Pro's UI changes between versions. The model identifies the "Import" button by its icon and label, not by its pixel position.
The "Vlog 1" preset does not exist by that name. The user named the preset "Vlog Preset 1" or "My Vlog". The fix: the user can either rename the preset or the model can search the Effects panel for a partial match.
The export destination is read-only. The model tries to write to ~/Videos/exported/ but the directory does not exist. The fix: the model creates the directory (mkdir -p) before writing. This is a Computer Use-specific failure mode — the model has to handle the file-system edge cases that a script would handle automatically.

The "very deep" moment, in context. During step 7, the model is clicking through Premiere Pro's export dialog. The dialog has Cancel and Export buttons; the model is clicking Export. The dialog has a dropdown for the export format; the model is selecting H.264 from the dropdown. The dialog has a "Queue" button; the model is clicking "Queue" to send the export to the background. None of this is recoverable in a sandbox failure — the model is interacting with a real Premiere Pro dialog. The §2.6.2 "very deep" warning is the lived experience of watching a model click Export on a real Premiere Pro dialog. That is the moment the audience reads the source video as "very deep."

The verification step. Step 9 is the load-bearing step. The model opens Finder, navigates to the export directory, takes a screenshot, parses the file sizes. Without the verification step, the model would report "Export started successfully" and the user would have to manually check. With the verification step, the model reports the actual file sizes. The §2.5.2 "verification card" pattern is the same idea applied to a Kanban workflow.

The cost. A 10-step Computer Use flow with a vision-capable model (GPT 5.4) costs roughly 10× the per-step cost of a scripted flow. For Premiere Pro specifically, a custom ExtendScript would be cheaper and faster. The Computer Use path is the right choice when no ExtendScript exists — i.e. legacy desktop software with no scriptable API. For Premiere Pro, a custom ExtendScript is the better long-term answer, but the Computer Use path is the right short-term answer.

Try it yourself

Source note on this section. The exact install commands from the source video are not recoverable (transcript null). The steps below are grounded where the data is grounded, and honestly scoped where the data is missing. Steps 1, 2, 3, 6, 7, 8 are copy-pasteable against the current Hermes build (the install pattern is the §2.2 pattern, which the channel has documented). Steps 4 and 5 are the channel's literal config recommendations from qac5ChGLgTU. The "what we cannot verify" list at the end of the article is the honest accounting of what is missing.

Pick a Mac. Per KKgEmpEh7zM, macOS is the only platform with full Computer Use support as of v0.14. Linux and Windows were promised "in the next few weeks" — do not assume that has shipped. If you are on Linux, skip Computer Use for now or expect to fail-closed (per qac5ChGLgTU).
Pick a sandboxed account. Create a brand new Apple ID for the agent host. Do not log it into your personal iCloud, mail, or password manager. This is the §2.2.4 isolation pattern. Do not put the agent's Mac on your home LAN — use a guest network or a hotspot if you must network it.
Install Hermes the standard way. The install is the §2.2 install:
- sudo apt update (or the macOS equivalent) to make sure git and curl are present.
- Paste the Hermes install command. When prompted with OpenClaw installation detected. Would you like to import from OpenClaw?, answer no for greenfield installs.
- Pick Quick setup over Full. Register a model from the §2.9 shortlist — GPT 5.4, Gemini 3.1 Pro, Qwen 3.6 Plus, or Claude Opus 4.7. (A text-only or non-vision model will look like a Hermes bug for hours in the Computer Use slot.)
- Defer the chat-platform step. You do not need Telegram or Discord for a Computer Use run.
Set auto-install: true under computer use in your Hermes config. Literal recommendation from qac5ChGLgTU (OpenClaw 4.27 Codex Computer Use). This makes the agent re-enable its own Computer Use plugin at the start of every turn, so a model switch or a session restart does not silently disable the slot.
After any Hermes update, run hermes update and then hermes doctor --fix. From qac5ChGLgTU: the three known breakage classes after a Computer Use update are implicit startup loading, stale internal-path imports, and pinned non-default history behaviors. The doctor fix catches all three. Do this before you start your first Computer Use run after every update.
Give it one task in one sentence. "Log in with these test credentials and tell me what the dashboard shows." One task, one verification, no branches. The first run is a smoke test of the loop, not a content test.
Watch the first 5–10 steps. Do not walk away. The first run is for learning what "very deep" looks like in your own setup. The audience's warning is that Computer Use is "very deep" — see it in your own logs before you trust it.
Stop it cleanly if it loops. If you see the cursor oscillating or the same screenshot three times in a row, kill the run, lower the step cap, and re-run. The fix is almost never "let it try harder."
Move to the Kanban only after the first run works. Once you trust the loop, put each Computer Use flow on the Kanban (§2.3) as a card, with a verification card at the end. If a flow runs more than a handful of times in the same shape, it should become a skill bundle (§2.5) or a tool, not a recurring Computer Use session.

What we cannot verify from the data we have. The source video's exact install command, package name, config file path, and CLI flag for the Computer Use slot. The transcript is null in public.videos; the comment thread does not name any of them. Confirm against the official Hermes repo / docs before repeating this as a tutorial. The "Mac-only, v0.14 'next few weeks' for Linux/Windows" status is also a snapshot from KKgEmpEh7zM (v0.14, mid-2026) — verify the current platform list on the official Hermes site before committing.

Common pitfalls

The failure modes the data implies, ordered by frequency:

"I gave it my real desktop." Computer Use is the only Hermes mode that can move your real mouse and type into your real apps. The "virus" reaction is the audience telling you to sandbox the host. Use a brand-new Apple ID, do not log into personal services, keep the box off your home LAN.
"The model just isn't reading the screenshot." Almost always a model-routing problem (§2.8, §2.9), not a Hermes bug. A text-only model — or a model that is not "smart" enough — in the Computer Use slot will look like a Hermes bug for hours. The source video's #1 tip is "just use a smart model."
"I didn't realize how deep the access was." The audience's "very deep" warning is the giveaway: full desktop access is the default, not a special case. Have a safety story before you turn it on.
"I chased Computer Use on Linux." Per qac5ChGLgTU, fail-closed mode on a Linux VPS will block every turn because there is no GUI to attach to. On Hermes specifically, Computer Use was macOS-only as of v0.14, with Linux/Windows "in the next few weeks" — verify before you start.
"I updated Hermes and Computer Use silently stopped working." Run hermes update and hermes doctor --fix after every Computer Use-adjacent update. The three known breakage classes (implicit startup loading, stale internal-path imports, pinned non-default history behaviors) are what the doctor fix catches.
"It loops forever on the same step." No step cap, no timeout, no early-stop rule. Add all three before any other fix.
"It finished but the result is wrong." No verification step. Add a card or callback that screenshots the expected end state and compares.
"I rebuilt the same flow as a skill instead of just running it." If a flow runs more than a handful of times in the same shape, it should become a skill bundle (§2.5) or a tool, not a recurring Computer Use session.
"I granted Hermes blanket system permissions on my personal Mac." The channel's standing safety rule (from VIpMz5uz4Cc, v0.10) is: do not. Run Hermes on a VPS or in a sandboxed account, not on a Mac tied to your Apple ID.

Conclusion

Computer Use is the surface where the channel's safety advice moves from "general best practice" to "this is the rule." The source video is null in transcript, but the comment thread plus three grounded release videos give us a clear, narrow action set: run it on a Mac, in a sandboxed account, with a vision-capable smart model in the slot, after hermes update and hermes doctor --fix. The first run is a smoke test of the loop, not a content test. Anything you do more than a handful of times should leave Computer Use and become a skill bundle or a tool.

Sources

Hermes Agent Computer Use Setup Guide (Use Cases and Tips) — 3,543 views · video_id: Hh1sDJZ6VhY · primary grounding for §2.6 (transcript null; 10 first-party viewer comments cited inline)
qac5ChGLgTU — "OpenClaw Update 4.27: Codex Computer Use!" — concrete auto-install: true config, openclaw doctor --fix post-upgrade step, and the Linux VPS fail-closed warning. The config-key and post-upgrade pattern is the same on Hermes.
KKgEmpEh7zM — "Hermes Agent Update v0.14 is SO USEFUL (Foundation Release)" — macOS-only Computer Use as of v0.14, with Linux/Windows "in the next few weeks."
kRG3ivHCHO4 — "$30 vs $200: The Real Cost of X (Twitter) Access" — restates the macOS-only / Linux-pending platform status.
VIpMz5uz4Cc — "Hermes Agent Update v0.10 is POWERFUL! (Tool Gateway Release)" — the channel's literal safety ruling: "Run Hermes on a VPS, not a local Mac mini, and don't grant it blanket system permissions."