The Kanban (durable, retryable, inspectable tasks) - Hermes Agent Deep Dive

Subtopic 2.3 is the Kanban board — the single feature that the channel comes back to over and over as the reason to use Hermes over a vanilla orchestrator. Three videos are anchored in the syllabus: the canonical setup guide, the cron pairing, and the multi-board update. Across the three, a clear shape emerges: the Kanban turns multi-agent work into durable, retryable, inspectable tasks on a board, with named profile workers, isolation per board, and a worker-logs trail that an ordinary sub-agent flow does not give you.

The load-bearing detail — the one thing to walk away with — is the parent-child retry loop. A normal orchestrator notifies you when a run fails; the Kanban retries. The creator logged a build that needed 6 runs and a separate test that needed 81. A vanilla orchestrator would have notified him and stopped at run 1. The full history lives under Worker Logs in the Kanban UI — check it before assuming a pipeline is broken. That single fact is the §2.3 anchor.

What you'll learn

The Kanban is not just a UI — every assignee is a named profile under ~/.hermes/profiles/<name>/ with its own SOUL.md and config, and the parent agent's API key is not inherited. You set inference_provider and the key on every profile by hand.
Setup is a four-step sequence: hermes update, init the Kanban DB, hermes gateway start, then hermes profile create <role> for every assignee. On older builds, these commands do not exist.
Finished reports live in ~/hermes/kanban/, not ~/hermes/profiles/ — the worker logs and the final markdown file land in different trees, and the dashboard only flips to done on a successful write.
The Kanban's killer feature is the parent-child retry loop. A normal orchestrator notifies you when a run fails; the Kanban retries. The creator logged a Space Shooter game build that took 6 runs before he terminated it, and a separate test that hit 81 runs before succeeding.
Pairing the Kanban with cron turns a 14-step sequential pipeline into a 9-worker parallel one, but you hit four real bugs: gateway exits early, duplicate parent tasks on test runs, no cron-to-Kanban dedup, and task accumulation (after one week the creator's board held 7 parent tasks and 63 children).
The multi-board update adds a hermes kanban boards subcommand, a Profiles UI section with a one-click "Copy CLI command", and a soul.md editor per profile — but profile name collisions across boards break isolation.

1. The canonical Kanban setup

Hermes Agent Kanban Setup Guide (Multi-Agent Task Board) (16,341 views) is the most-viewed Kanban video on the channel and the canonical setup walkthrough. The release was ~8 hours old at recording, which is why every command below is gated on first running hermes update.

The Kanban's role in the multi-agent pattern. The Kanban is the operational surface underneath the §2.1.1 orchestrator + adversarial verifier pattern. Where Mavis has a built-in verifier that audits outputs without shared context, Hermes's Kanban generalises that idea: workers produce, the parent task dispatches, the worker logs show the audit trail. The Kanban's three roles — parent task, worker profiles, verifier lane (in v0.15) — are the same orchestrator + worker + verifier pattern, implemented at the Kanban level.

The Kanban's relationship to the Skill Bundle primitive. A Skill Bundle pins the skill set the agent uses (cross-reference §2.5). A Kanban profile is a persistent role with its own SOUL.md and memory (cross-reference §2.3.1). The two are complementary: a Skill Bundle runs the workflow; a Kanban profile runs the worker. The same /research bundle can be invoked from a Kanban worker profile or from a one-shot TUI prompt. The Kanban is the persistence layer; the bundle is the deterministic layer.

The Kanban's relationship to the cron tab. A cron fires the parent task; a Kanban dispatches the child workers (cross-reference §2.4 and §2.3.2). The dashboard's cron tab is the schedule; the Kanban is the dispatch. If you want a daily AI-news briefing delivered to Discord at 9:00 a.m., set the cron in the dashboard. If you want that briefing to fan out into 9 parallel search workers, set the cron in the agent and let the Kanban dispatch — that is the §2.3.2 / §2.5 pattern, not the §2.4 one.

The Kanban's relationship to the Curator. The Curator (§2.7) is the operational layer underneath all of the above. Every skill the Kanban workers use is a candidate for the Curator's archive. Every 15-turn self-evolution loop (§2.1.2) creates new skills that the Kanban workers may pick up. The Curator's weekly cron is the housekeeping that keeps the skill library clean.

Why the Kanban is the channel's "killer feature." The §2.3.2 source video names the parent-child retry loop as the Kanban's killer feature. The creator logged a Space Shooter game build that took 6 runs before he terminated it, and a separate test that hit 81 runs before succeeding. A vanilla orchestrator would have notified the creator and stopped at run 1. The Kanban retries, and the Worker Logs surface tells the user whether the retries are making forward progress. That is the "killer" part: the Kanban is the only multi-agent surface that retries without asking.

The four stories — when to use each. The source video enumerates four usage patterns, numbered Story 1–4:

Story 1 — solo dev shipping a feature (one assignee). Beginner default. The Kanban is overkill for a single-assignee task; the win is the dashboard, the cron, the Worker Logs. Use Story 1 for the first Kanban install.
Story 2 — fleet farming (multiple schemas × multiple assignees). Useful for parallel data ingestion. Less common than Story 1 or 3; mostly used for ingest pipelines.
Story 3 — multi-role pipeline with retries. "Sort of N8N territory" with inspectable nodes. The daily AI-news briefing (5 search workers, 2 editors, 2 publishers) is the canonical Story 3 example.
Story 4 — circuit breaker / crash recovery. A Kanban task that monitors other Kanban tasks and intervenes on failures. The most advanced pattern; not for beginners.

Recommendation: stay on Story 1 until it works twice in a row, then graduate. The "spin up the Kanban and watch" hands-off flow belongs to Story 3, not Story 1.

The Kanban's four lanes — what each one means. The default board has three lanes (inbox, doing, done); the multi-board update adds a fourth (blocked). Each lane has a specific role in the parent-child flow:

inbox — parent tasks that have not yet been dispatched. The cron in the Dashboard's cron jobs tab drops parent tasks into the inbox lane. The dispatcher picks up tasks from the inbox lane, looks at the assignee list, and dispatches to the matching Kanban profile.
doing — parent tasks that have been dispatched but not yet completed. Worker logs are populated for each child task. The dashboard's log session tab shows the parent's status; the Kanban's Worker Logs shows each child's status.
done — parent tasks that completed successfully. The dashboard flips to done only on a successful write to ~/hermes/kanban/, so an empty or missing file in the profiles tree does not mean the run failed (cross-reference §2.3.1).
blocked — parent tasks that hit a bug, a missing model key, or a downstream failure. The dashboard lets you block tasks but not delete them (cross-reference §2.3.2). The blocked lane is the safety net; tasks land there when the dispatcher's retry logic gives up.
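One way to picture the lanes is as a small state machine. The Python sketch below is illustrative only: Lane, ALLOWED, and move are made-up names, not Hermes APIs, and the transition table is an assumption inferred from the lane descriptions above (the source does not say, for example, whether a blocked task can be moved back to the inbox).

```python
from enum import Enum


class Lane(Enum):
    INBOX = "inbox"
    DOING = "doing"
    DONE = "done"
    BLOCKED = "blocked"


# Assumed transitions, inferred from the lane descriptions:
# the dispatcher moves inbox -> doing, a successful write moves
# doing -> done, and exhausted retries move doing -> blocked.
# Blocking a task straight from the inbox is also assumed to be possible.
ALLOWED = {
    Lane.INBOX: {Lane.DOING, Lane.BLOCKED},
    Lane.DOING: {Lane.DONE, Lane.BLOCKED},
    Lane.DONE: set(),
    Lane.BLOCKED: set(),
}


def move(current: Lane, target: Lane) -> Lane:
    """Return the new lane, or raise if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move {current.value} -> {target.value}")
    return target
```

Writing the table out makes one property of the board explicit: in this model nothing ever leaves done or blocked, which is consistent with the task-accumulation problem described later.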

The parent-child retry loop — the §2.3 anchor. The Kanban's killer feature, in detail. When a worker task fails (e.g. the API returned a 5xx, the model's response was malformed, the network timed out), the Kanban retries. The retry happens at the child-task level, not the parent-task level; the parent task stays in doing until all child tasks complete (or hit the retry cap). The retry count, the failure reason, and the worker logs are all visible in the Kanban UI.

The two real-world examples:

The Space Shooter build that took 6 runs. A coding task to build a Space Shooter game. The first 5 runs produced builds that crashed on launch; the 6th produced a build that ran. The creator stopped the task after that 6th, successful run. The Kanban's retry loop kept the task alive long enough for the model to converge.
The separate test that took 81 runs. A different coding task where the model kept producing builds that failed a specific test. The Kanban's retry loop kept the task alive for 81 runs. The 81st run was successful.

The takeaway: don't assume retries mean failure. Check the Worker Logs. The Kanban's retry loop is the safety net; the Worker Logs is the audit trail.
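The child-level retry described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Hermes's dispatcher: ChildTask, run_child, run_parent, flaky_build, and the max_runs cap are hypothetical names, children run one after another here rather than in parallel, and the log list only stands in for the Worker Logs view.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChildTask:
    name: str
    run: Callable[[int], None]  # hypothetical worker body; raises on failure
    max_runs: int = 100         # hypothetical cap, not a Hermes default
    log: List[str] = field(default_factory=list)  # stand-in for Worker Logs


def run_child(task: ChildTask) -> bool:
    """Retry one child until it succeeds or the cap is reached."""
    for attempt in range(1, task.max_runs + 1):
        try:
            task.run(attempt)
        except Exception as exc:
            # Record the failure reason, then try again.
            task.log.append(f"run {attempt}: failed ({exc})")
            continue
        task.log.append(f"run {attempt}: ok")
        return True
    return False


def run_parent(children: List[ChildTask]) -> str:
    """The parent is only 'done' once every child has succeeded."""
    results = [run_child(child) for child in children]
    return "done" if all(results) else "blocked"


def flaky_build(attempt: int) -> None:
    """Example worker: fails on runs 1 to 5, succeeds on run 6."""
    if attempt < 6:
        raise RuntimeError("build crashed on launch")
```

Calling run_parent([ChildTask("build", flaky_build)]) returns "done" with six log entries on the child, which is the shape of the Space Shooter example: five recorded failures, then a success, with the parent waiting throughout.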

The Skill Bundle integration. The Kanban does not replace the Skill Bundle primitive; the two are complementary. A Kanban worker profile invokes a Skill Bundle via the bundle's slash command (e.g. /research). The bundle's YAML pins the skill set; the Kanban worker's SOUL.md pins the persona; the cron in the dashboard pins the schedule. The three together are the daily AI-news pipeline.

What the Kanban actually is. It's a live dashboard inside the Hermes runtime where multiple named agents — each with its own role — collaborate on a project. The contrast the creator draws matters: a standard orchestrator's sub-agents are disposable workers that hand in their homework and disappear. A Kanban profile is a persistent, configurable agent living under ~/.hermes/profiles/<name>/ with its own SOUL.md and a config file. They are not the same thing.

The setup sequence. After hermes update:

Initialise the Kanban DB.
Start the gateway with hermes gateway start.
Create a task.
Run hermes profile create <role> (e.g. researcher, backend-dev) for every assignee.

The friction point the creator flagged on stream: the researcher profile has no inference provider configured out of the box and does not inherit the parent agent's API key. You set it manually for each profile. The gotcha: profile .env files start empty even when a key exists in the main agent's dotenv. Remove empty api_key fields from config.yaml and copy the real key into each profile. Reuse the same key across roles if you are on a Kimi coding plan and want to fan out.
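Because an empty key is easy to miss, it can help to check every profile before dispatching work. The sketch below assumes each profile keeps its key in a .env file directly inside its profile directory; the function names and the EXAMPLE_API_KEY variable are hypothetical, so substitute whatever variable your provider actually expects.

```python
from pathlib import Path
from typing import Dict, List


def read_env(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def profiles_missing_key(profiles_root: Path, key_name: str) -> List[str]:
    """List profile directories whose .env has no non-empty key_name."""
    missing = []
    for profile_dir in sorted(p for p in profiles_root.iterdir() if p.is_dir()):
        if not read_env(profile_dir / ".env").get(key_name):
            missing.append(profile_dir.name)
    return missing
```

A call such as profiles_missing_key(Path.home() / ".hermes" / "profiles", "EXAMPLE_API_KEY") would then list the profiles that still need a key pasted in.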

Resuming the right session. The TUI resume command is hermes --profile <name> --resume <session_id>. The default hermes sessions list only shows the main agent's sessions — a subtle but common confusion.

The demo result. Using the bundled example, a single researcher assignee mapped the AI funding landscape. It pulled live TechCrunch articles, ran 14 attempts (the first 7 crashed), and produced a structured markdown report dated May 4, 2026. The report saved to ~/hermes/kanban/, not ~/hermes/profiles/ — the creator initially looked in the wrong folder and accused the agent of gaslighting him. The dashboard only flips to done on a successful write, so an empty/missing file in the profiles tree does not mean the run failed.

Four use cases, numbered Story 1–4. The creator enumerates the patterns:

Story 1 — solo dev shipping a feature (one assignee; what was demoed). Beginner default.
Story 2 — fleet farming (multiple schemas × multiple assignees).
Story 3 — multi-role pipeline with retries — described as "sort of N8N territory" with inspectable nodes.
Story 4 — circuit breaker / crash recovery.

Verdict on sub-agents. The channel's framing: orchestrator QC is "still not enough," and Kanban assignees "go really in depth to complete the task" — a real fix for the quality problem in vibe-coded multi-agent flows.

NOTE: the example report is dated May 4, 2026 in the source video. If you are reading this later, treat that as a snapshot of the demo run, not a feature guarantee.

2. Kanban + cron — a 14-step pipeline becomes 9 parallel workers

Hermes Agent Kanban + Cron Job is POWERFUL (Setup Guide) (5,119 views) is the cron-pairing story. The headline: a 14-step sequential cron becomes a 9-worker parallel pipeline — but only after four custom workarounds.

The 14-step sequential pipeline (the "old setup"). A single sub-agent runs 14 web searches sequentially: model releases, tool releases, agent frameworks, trending workflows, news, research papers, GitHub trending, Hacker News, Reddit, X trending, podcasts, YouTube, academic, plus one meta-search. After all 14 complete, the agent writes a report, updates an HTML landing page, and posts to Discord. The whole thing takes ~45 minutes. One failed search stalls everything; the shell date syntax was being passed literally into search queries; reports had been getting shorter since May 2nd (because the agent was running out of context window).

The 9-worker parallel pipeline (the "new setup"). The same cron, but the parent task dispatches 9 child workers:

5 parallel search workers. Each search worker is responsible for 2–3 of the original 14 searches. The 5 workers run in parallel; the parent task waits for all 5 to complete.
2 parallel editor workers. Each editor takes the raw search results from 2–3 search workers, filters duplicates, ranks by importance, and produces a structured markdown section.
2 parallel publisher workers. One publisher updates the HTML landing page; the other fires the Discord notification.

The total time: ~12 minutes (the slowest single chain is a search worker + an editor + a publisher; the parent task does not need to wait for all 9 workers in series). Output is now structured with tables, categorisation, and 48-hour verification.

The token budget for the 9 workers. The source video reports the agent itself proposed these token budgets:

Search workers: 90–100 tokens per response (model picks the most relevant snippet, the editor handles the rest).
Editor workers: mid-range (50–70 tokens per response; the editor does the most synthesis work).
Publisher workers: 20–30 tokens per response (the publisher just formats and posts).

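Total per run: ~500 tokens across the 9 workers as quoted, although the per-worker budgets above sum to roughly 590–700 at their limits (5 × 90–100 + 2 × 50–70 + 2 × 20–30). Either figure is well below the ~1,800 tokens for the 14-step sequential pipeline. The parallel pipeline is not just faster; it is cheaper, because each worker has a narrower scope and a smaller context window.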

The "study the documentation" prompt. The source video's first prompt to the agent: "Can you study and understand the official documentation of the Kanban features?" That prompt is the workaround for the "don't assume the agent knows its own features" failure mode. The agent has access to the official Kanban docs; the user pastes the doc links and asks how they apply to the pipeline. The agent in the source video produced a Kanban setup plan based on the docs; the user approved the plan, and the agent created the specialist profiles and tool sets via TUI.

The four things that broke with cron + Kanban — the source's load-bearing caveats. These are the real ones, named in the source video:

Gateway exits early. Ready child tasks never get dispatched after the parent completes. The Kanban dispatches child tasks via the gateway; if the gateway dies after the parent completes, the children are orphaned. Fix: a systemd service keeps the gateway alive 24/7. The creator warns this will burn electricity on a local machine — VPS is the right place to run this. The §2.2 dashboard tmux wrapper survives logout only ~50% of the time; systemd is the right answer.
Duplicate parent tasks on test runs. A test run on May 6, 2026 at 9:10 a.m. produced two completed task sets and two Discord pings: the cron fired the parent task twice within a 10-minute window, and the Kanban did not dedup. There is no native dedup yet. Fix: the agent now does a dedup check before spawning; the user can also add a guard to the cron that suppresses duplicate firings within a configurable window.
No cron-to-Kanban dedup. Every scheduled run creates a fresh task set regardless of what's already on the board. If a parent task is still in doing when the next cron fires, the Kanban spawns a second parent task. The two run in parallel, and the second one races the first to update the HTML landing page. Fix: the agent now does a dedup check before spawning (the same workaround as for duplicate parent tasks).
Task accumulation. After one week, the board had 7 parent tasks and 63 children. The dashboard currently lets you block tasks but not delete them. The blocked lane fills up; the done lane fills up; the doing lane has stale tasks. The fix is a delete button (not yet shipped at the time of the source video) or a manual block sweep.

The cron syntax — the "shell date was being passed literally" gotcha. The source video names a subtle failure mode: the shell date syntax ($(date +%Y-%m-%d), for example) was being passed literally into the search queries, not expanded. The fix: the cron job must use a wrapper script that expands date before passing the value to the agent. The wrapper script is a one-line bash file that does echo "Today is $(date +%Y-%m-%d)" and pipes the output to the agent's prompt.
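The source describes the wrapper as a one-line bash file. The sketch below shows the same idea in Python: resolve the date first, then hand the agent a prompt that already contains it. The build_prompt name and the example prompt text are illustrative assumptions, not taken from the video.

```python
from datetime import date
from typing import Optional

# What the broken cron effectively sent: shell syntax, never expanded.
LITERAL_PROMPT = "Today is $(date +%Y-%m-%d). Search for AI model releases."


def build_prompt(today: Optional[date] = None) -> str:
    """Resolve the date in code, then build the prompt text."""
    today = today or date.today()
    # isoformat() yields YYYY-MM-DD, the same shape as `date +%Y-%m-%d`.
    return f"Today is {today.isoformat()}. Search for AI model releases."
```

The point of the design is that the agent never sees the substitution syntax at all; whatever launches the job does the expansion, so nothing downstream can mistake it for search text.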

The cron timing: 9:00 a.m. Hong Kong time. The source video's pipeline fires daily at 9:00 a.m. Hong Kong time because the creator wants the briefing ready before the US morning. That is his preference, not a recommendation: the cron timezone is the user's choice, and the agent's prompt is timezone-agnostic.

The "dedup check before spawning" workaround, in detail. The source video's workaround for the duplicate parent task problem: the agent now does a dedup check before spawning. The check is: "is there a parent task on the board with the same prompt and the same scheduled time, in the doing or inbox lane?" If yes, skip spawning. The check is hand-coded in the agent's prompt; it is not a native Kanban feature. The fix is a temporary workaround until the native dedup ships.
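In the source the check lives in the agent's prompt, but the rule itself is simple enough to write down. The Python sketch below is one reading of it; ParentTask, should_spawn, and the timestamp format are assumptions made for illustration, not Kanban data structures.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ParentTask:
    prompt: str
    scheduled_for: str  # e.g. "2026-05-06T09:00"; the format is an assumption
    lane: str           # "inbox", "doing", "done" or "blocked"


def should_spawn(board: Iterable[ParentTask], prompt: str, scheduled_for: str) -> bool:
    """Skip spawning if a matching parent is already in inbox or doing."""
    for task in board:
        if (
            task.lane in ("inbox", "doing")
            and task.prompt == prompt
            and task.scheduled_for == scheduled_for
        ):
            return False
    return True
```

As written, this matches only on the same prompt and the same scheduled time, which covers a double firing of one slot. Catching a parent from an earlier slot that is still running would need a looser match, for example on the prompt alone.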

The recommendation, restated. The creator is explicit: start with a non-cron project in a dedicated workspace folder, not a scheduled pipeline. On a rebuild, he would skip the cron pairing entirely until the delete button and native dedup ship. The four real bugs (gateway exits early, duplicate parent tasks, no cron-to-Kanban dedup, task accumulation) are not worth the risk.

Old setup vs. new setup. The old cron fired daily at 9:00 a.m. Hong Kong time and spawned a single sub-agent that ran 14 web searches sequentially, wrote the report, updated the HTML landing page, and posted to Discord. One failed search stalled everything, the shell date syntax was passed literally into search queries, and reports had been getting shorter since May 2nd.

The new pipeline runs the same cron, but the parent task dispatches:

5 parallel search workers (model releases, tool releases, agent frameworks, trending workflows, plus one for active inputs).
2 editors that filter duplicates and rank by importance.
2 publishers that update the HTML and fire the Discord notification.

How to actually get it running. Don't assume the agent knows its own features. The creator's first prompt was literally: "Can you study and understand the official documentation of the Kanban features?" — paste the doc links and ask how they apply to your pipeline. Let the agent create specialist profiles and tool sets via TUI. Token budget suggestions (proposed by the agent itself in the source): 90–100 for researchers, mid-range for editors, 20–30 for publishers.

3. The multi-board update — profiles, parent-child, and naming discipline

Hermes Agent Kanban UPDATE: Multiple Boards Setup (3,350 views) is the most recent Kanban piece in the syllabus. The release adds multi-board support, a first-class Profiles section, and a clearer separation between persistent profile agents and disposable sub-agents.

The update you can't skip. The Kanban now ships with multiple boards, accessible from the dashboard via New Board or from the terminal with the hermes kanban boards subcommand. Before the update, only a single default board existed. As the creator shows on screen, trying to spin up a second board on the old build throws an error. If you see that error, your Hermes is on the old Kanban — run hermes update.

Profiles are now first-class. The same release adds a Profiles section for multi-agent setups. Each profile gets a one-click Copy CLI command that drops you into nano of the right .env file, so you no longer have to hunt for which API key belongs to which agent. You can also edit each profile's soul.md directly from the dashboard to customise the system prompt. The creator's point: these are persistent agents with memory and system prompts, not the disposable sub-agents you get from an orchestrator — sub-agents "get spawned, they do their job, and then they're gone."

Why the parent-child link matters — the §2.3 anchor. The Kanban's core advantage over a normal orchestrator/sub-agent flow is the parent-child dependency. If a run fails midway, the Kanban retries. The creator logged a Space Shooter game build that took 6 runs before he terminated it, and remembers another test that hit 81 runs before succeeding. A vanilla orchestrator would have notified him and stopped at run 1. The full history lives under Worker Logs in the Kanban UI — check it before assuming a pipeline is broken.

Isolation requires naming discipline. Each board runs against its own SQLite database, workspace directory, logs, and HERMES_KANBAN_BOARD env var — workers only see their own board. That isolation breaks the moment two boards share a profile name like researcher or editor. The creator's rule: name profiles per project (e.g. ai_researcher, crypto_researcher, sports_researcher) or context bleeds and the agent loses track of which project it's serving.
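A quick way to enforce the naming rule is to list which profile names appear on more than one board. The sketch below takes a plain mapping of board name to profile names; how you obtain that mapping is up to you, and shared_profile_names is an illustrative helper, not a Hermes command.

```python
from collections import defaultdict
from typing import Dict, List


def shared_profile_names(boards: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each profile name used by more than one board to those boards."""
    used_by = defaultdict(set)
    for board, profiles in boards.items():
        for name in profiles:
            used_by[name].add(board)
    return {name: sorted(bs) for name, bs in used_by.items() if len(bs) > 1}
```

Given boards "ai_news" and "crypto" that both list a profile called researcher, the function returns {"researcher": ["ai_news", "crypto"]}. With project-prefixed names such as ai_researcher and crypto_researcher it returns an empty dict.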

Audit every day. The creator runs a daily audit of his AI-news pipeline that verifies the gateway is up — critical on VPS, because if the gateway dies, the child task "will forever wait for the parent task." Lazy users can audit weekly, but on a VPS the audit is the difference between a 24/7 pipeline and a 24/7 zombie.

4. Kanban swarm — the v0.15 Velocity extension

The v0.15 "Velocity Release" (cross-reference §2.8 Part D) shipped a Kanban Swarm primitive that generalises the parent-child retry loop. The standalone Kanban (§2.3.1–§2.3.3) handles a single board with a parent task and N child workers; Kanban Swarm extends that into a task graph with a root task, parallel workers, a verifier, and a synthesizer.

What the swarm ships with (per the summary of the v0.15 video, video_id: GL2FhteoPBA):

Work-tree-per-task — coding jobs don't collide in one directory. Each worker operates in its own git worktree.
TTL claims — workers claim tasks with a time-to-live; if the worker dies mid-run, the TTL expires and another worker can claim the task.
Retry fingerprinting — the swarm identifies repeated failure modes across retries and re-routes around them.
Stale-task respawn guards — orphaned tasks (e.g. children left undispatched when the gateway exits early, as flagged in the §2.3.2 cron + Kanban video) are detected and respawned cleanly.
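To make the TTL-claim idea concrete, here is a minimal lease sketch in Python. It is not the swarm's implementation: Task, Claim, and try_claim are illustrative names, and the caller passes the clock value in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    worker: str
    expires_at: float  # seconds on whatever clock the caller uses


class Task:
    def __init__(self) -> None:
        self.claim: Optional[Claim] = None

    def try_claim(self, worker: str, now: float, ttl: float) -> bool:
        """Claim the task if it is unclaimed or the old claim has expired."""
        if self.claim is not None and now < self.claim.expires_at:
            return False  # another worker still holds a live claim
        self.claim = Claim(worker, now + ttl)
        return True
```

If worker-a claims a task at time 0 with a 30-second TTL and then dies, worker-b is refused at time 10 but succeeds at time 30, once the claim has lapsed. No explicit "worker died" signal is needed; the expiry does that job.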

The v0.15 release also shipped with the velocity improvements that made the swarm possible: hermes --version from 700ms to 258ms, per-turn framework overhead from 399,000 to 213,000 function calls, and session search from ~90s to 20ms (~4,500× faster). The swarm's per-task verifier and synthesizer loops would be impractically slow without the velocity work.

The verifier lane — what it does. The §2.1 Mavis video named the orchestrator + adversarial verifier pattern: workers produce, a separate agent audits from first principles. The v0.15 Kanban Swarm has the verifier lane built in. The synthesizer lane then merges the verifier-approved worker outputs into a final report. That is the same Mavis pattern, implemented at the Kanban level.

The audience's take. Top-liked viewer @luckyjc3 (2 likes) on the v0.15 video: "If I were on that team, I wouldn't be advertising that I wrote 16k lines in 1 file" — the audience's read on the v0.15 refactor is "this was overdue." Viewer @enzopaupau2302 (1 like) is asking for the kanban swarm deep-dive explicitly, cross-confirming the source video's "standout feature" framing. The swarm is what the channel will cover in the next major Kanban release.

The 6-run Space Shooter, 81-run test — what they actually look like. The source video's "Space Shooter build that took 6 runs" is a real first-party signal: the Kanban's parent-child retry loop kept retrying a failing build until either the build succeeded or the creator terminated it. The 81-run test is the upper bound: a separate experiment where the creator let the retry loop run to completion. Two takeaways:

Don't assume retries mean failure. The Kanban's killer feature is the parent-child retry loop. A build that takes 6 or 81 runs to succeed is still a successful build, as long as the worker logs show forward progress each retry.
Set a retry cap. A naive "retry forever" loop is a fire-hazard. The v0.15 swarm's TTL claims + retry fingerprinting are the fix; before the swarm, the cap was a hand-coded max_retries per profile.

The dashboard's auditability surface for retries. All of this — retries, TTL expirations, fingerprinting decisions — lands in the Worker Logs view in the Kanban UI. Cross-reference §2.4: the Dashboard's log session tab is the per-task log surface; the Kanban's Worker Logs is the per-run audit trail for multi-task flows. If a pipeline is "stuck," Worker Logs is the first place to look; log session is the second.

5. The Kanban decision matrix — when to use it, when not to

The §2.3 subtopic closes with the question every new Hermes user has to answer: should this task be a Kanban task, a Skill Bundle, a /goal-anchored session, or just a one-shot prompt? The matrix below is the channel's read at the time of the source videos.

Workflow type	Recommended surface	Why
One-off exploration	One-shot TUI prompt	Cheapest path; no setup overhead
Daily, quota-bound, deterministic	Skill Bundle + cron in Dashboard tab	Deterministic on the what; cron decides the when
Daily, quota-bound, but variable	Kanban board (single-assignee Story 1)	Persistent worker + retry loop + Worker Logs
Multi-role pipeline, scheduled	Kanban board (multi-assignee Story 3)	Parent-child dependency + verifier lane; cron fires parent
Multi-role pipeline, ad-hoc	Kanban board (multi-assignee) + /goal	Goal anchors intent; Kanban dispatches workers
Multi-board workflows (research + trading + sports)	Multiple Kanban boards with project-prefixed profile names	Isolation; one board per project
Computer Use flows	Kanban card per flow + verification card	§2.6 anchor; one card per Computer Use session
Long-running autonomous skill maintenance	The Curator (§2.7) on a weekly cron	Curator is the operational layer underneath all of the above

The four sentences to remember:

One-shots stay one-shots. Don't Kanban-ify a single prompt.
Daily, deterministic work is a Skill Bundle + Dashboard cron. The §2.5 → §2.4 pattern.
Daily, variable work is a Kanban board. The §2.3 anchor.
Multi-board workflows need naming discipline. Prefix every profile name with the project slug.

The default path, restated. Start with Story 1 (one-assignee, no cron, dedicated workspace). When that works twice in a row, add a second assignee. When that works, add a verifier lane. When that works, add a cron in the Dashboard. When all of that works, run a Curator pass (§2.7) and review the report. Skip the cron pairing on a Story 1 board — the four real bugs (gateway exits early, duplicate parent tasks, no cron-to-Kanban dedup, task accumulation) are not worth the risk.

Try it yourself

This is a hands-on subtopic — pick a project, not a sandbox.

Update first. Run hermes update. Confirm hermes kanban boards and hermes gateway start are present — if not, you are on the old build.
Pick one project, not a cron. Create a dedicated workspace folder (e.g. AI News/). Do not start with a scheduled pipeline — the cron pairing is Story 3.
Initialise the board and gateway. Run the Kanban DB init, then hermes gateway start. On VPS, wrap the gateway in a systemd service so it survives logouts. A bare hermes dashboard tmux wrapper is not enough — the §2.2 dashboard tmux session survives logout only ~50% of the time.
Create profiles by hand. For a Story 1 setup, run hermes profile create researcher (or your role name). Then edit ~/.hermes/profiles/researcher/ to set inference_provider and paste in the API key — the parent agent's key is not inherited. Remove empty api_key fields from config.yaml and copy the real key into each profile.
Smoke-test the worker. Drop one task with a single web_search-only assignee that writes a single markdown file. The first run is a schema test, not a content test. If the first 7 attempts crash (the source video saw exactly that), let the retry loop do its job. The parent-child retry is the Kanban's killer feature.
Find the artefact. Look in ~/hermes/kanban/, not ~/hermes/profiles/. The dashboard only flips to done on a successful write, so an empty profiles tree is normal.
Resume correctly. Use hermes --profile <name> --resume <session_id>. Do not use hermes sessions list — that command shows the main agent only.
Graduate slowly. Only when Story 1 works twice in a row, add a second assignee. Only when that works, move to Story 3 (cron + multi-role pipeline) — and only after the delete button and native dedup ship.
Audit daily on VPS. Confirm the gateway is up. A dead gateway on a VPS leaves child tasks waiting on the parent forever, and retries stall with it: a build like the creator's 6-run Space Shooter only finishes if the gateway stays alive to dispatch each retry.
Name profiles per project. If you add a second board, prefix every profile name with the project slug (ai_researcher, crypto_researcher) or context bleeds across boards.

Common pitfalls

These are the gotchas the source videos flag explicitly, plus a few general ones for the multi-agent pattern.

Don't skip hermes update. If hermes kanban boards, hermes gateway start, or hermes profile create is missing, you are on the old Kanban — the multi-board UI will throw on board #2.
Don't assume the parent agent's API key is inherited. Profile .env files start empty. Remove empty api_key fields from config.yaml and copy the real key into each profile .env.
Don't look for the report in ~/hermes/profiles/. Artefacts land in ~/hermes/kanban/. The worker logs and the final markdown file live in different trees.
Don't use hermes sessions list to resume a Kanban profile. Use hermes --profile <name> --resume <session_id> — the default command only shows the main agent's sessions.
Don't start with a scheduled pipeline. Cron + Kanban ships four real bugs: gateway exits early, duplicate parent tasks, no cron-to-Kanban dedup, and task accumulation (7 parent tasks and 63 children on the creator's board after one week). Start with a one-shot project.
Don't trust block to clean up. The dashboard lets you block tasks but not delete them — task accumulation is a real concern until the delete button ships.
Don't share profile names across boards. Two boards with a researcher profile will see each other's context. Prefix every profile name with the project slug (ai_researcher, crypto_researcher).
Don't run cron + Kanban without a persistent gateway on VPS. A tmux-wrapped gateway survives logout ~50% of the time. Use systemd.
Don't assume retries mean failure — this is the §2.3 anchor. The Kanban's killer feature is the parent-child retry loop. The creator logged a build that took 6 runs to succeed and a separate test that took 81. Check Worker Logs before declaring a pipeline dead.
Don't skip the daily gateway audit on VPS. A dead gateway leaves child tasks waiting on the parent forever.
Don't treat profiles as disposable sub-agents. Profiles are persistent roles with their own soul.md and memory. Sub-agents spawn, run, and die; profiles stick around. The two are different kinds of agent.
Don't bolt on paid API endpoints before the smoke test runs. Get the free web_search-only researcher working first (TechCrunch etc. is enough for the default profile), then expand.
Don't expect a one-click "spin up the Kanban and watch" flow on Story 1. That level of automation is Story 3. Story 1 requires manual profile creation per assignee.

Sources

Hermes Agent Kanban Setup Guide (Multi-Agent Task Board) — 16,341 views · video_id: R_aLVXYzDac
Hermes Agent Kanban + Cron Job is POWERFUL (Setup Guide) — 5,119 views · video_id: iN2fD36Sgdg
Hermes Agent Kanban UPDATE: Multiple Boards Setup — 3,350 views · video_id: fKoPRL0dhyk