1.5 days. 6 developers. Build with AI, stay in control.
The core mental model for agentic development. AI handles the deliverables—code, tests, docs. Humans handle the decisions—scope, quality, risk.
3 focus areas: Foundations — how agentic coding works. Practice — real workflows on real code. Proof — ship a working component.
3 arcs: Foundations (Day 1 AM) → Practice (Day 1 PM) → Proof (Day 2).
The bottleneck is no longer writing code—it's defining the right work, checking outputs, and managing risk.
Developers become architects of outcomes, not typists of code.
The shift: 70% of your time moves to planning & defining. AI writes the code. You orchestrate the outcome.
An AI system that autonomously takes actions to achieve a goal, using tools, making decisions, and iterating based on feedback.
Decides what to do next without explicit instruction.
Invokes external tools and APIs to accomplish tasks.
Works toward a defined objective, not just responding.
Evaluates results and tries again if needed.
Reads the environment and adapts behavior.
AUTOCOMPLETE
Tab completion
CHAT
Q&A only
COPILOT
Inline suggestions
AGENT MODE
Multi-step tasks
AUTONOMOUS
Background work
↑ Most devs start here
↑ Workshop goal
Less agentic → More agentic. The jump from Copilot to Agent Mode is where real productivity gains happen.
1. OBSERVE
Read current state
2. THINK
Reason about needs
3. PLAN
Decide on approach
4. ACT
Execute using tools
5. EVALUATE
Check results
The agent repeats this cycle until the goal is met. Each iteration uses feedback from the previous one.
THE CYCLE
Observe → Think → Plan → Act → Evaluate → repeat
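The loop is simple enough to sketch in code. A minimal TypeScript sketch of the cycle; all types and names here are illustrative, not Claude Code internals:

```typescript
// Illustrative sketch of the agent loop, not Claude Code's actual internals.
type Observation = { files: string[]; testOutput: string };
type Action = { tool: string; input: string };

interface Agent {
  observe(): Observation;             // 1. OBSERVE: read current state
  plan(obs: Observation): Action[];   // 2-3. THINK + PLAN: reason, decide approach
  act(action: Action): void;          // 4. ACT: execute using tools
  goalMet(obs: Observation): boolean; // 5. EVALUATE: check results
}

function runLoop(agent: Agent, maxIterations = 10): void {
  for (let i = 0; i < maxIterations; i++) {
    const obs = agent.observe();
    if (agent.goalMet(obs)) return;   // stop when the goal is met
    agent.plan(obs).forEach((a) => agent.act(a));
    // The next pass re-observes, feeding results back into planning.
  }
}
```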
Most projects fit in a single context window. Use /compact for long sessions.
What: A standard protocol for connecting AI models to external tools.
Why: Standardized connections = portable workflows across tools.
Benefit: Same integration works across Claude Code, Cursor, and more.
Two complementary frameworks: how we work with AI and what we build with AI.
Each unit of work should make subsequent work easier, not harder.
Always give AI a way to verify its work—tests, types, linting.
Testing harness + CI/CD = exponential dividends over time.
Plan → Work → Review → Compound → Repeat. Each cycle builds on the last.
No AI. All code hand-written.
Copy-paste snippets from chat.
Inline suggestions, you accept/reject.
Multi-file editing, iterative loops.
Compound engineering. Workshop goal.
Most teams are at level 2–3. This workshop takes you to level 5: AI as a full agent with compound engineering principles.
State-of-the-art AI results come from systems with multiple specialized components, not monolithic models.
Documentation Q&A, context-aware search.
Claude Code: reads, edits, runs, iterates.
Parallel research, writing, and review agents.
Multiple specialized components > monolithic models. MCP enables building these systems easily.
1. Define the Goal, Not the Steps
2. Provide Context, Not Instructions
3. Verify, Don't Trust
4. Iterate in Loops
5. Invest in Infrastructure
6. Use the Best of N Pattern
7. Branch First, Experiment Freely
DO:
"Fix the bug where users can't log in with special characters."
DON'T:
"Open file X, find function Y, change line Z."
Share the project's stack, conventions, and constraints—not step-by-step instructions.
CLAUDE.md captures this context so every session starts informed.
Always validate agent output: tests, type checking, linting, human review.
This is the compound engineering principle: trust but verify.
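One concrete way to apply this: hand the agent a failing test it must turn green. A minimal Vitest sketch for the earlier login-bug example; the function name and import path are hypothetical, not from the labs:

```typescript
// Hypothetical verification target for the special-characters login bug.
import { describe, it, expect } from "vitest";
import { validateLogin } from "../src/auth"; // illustrative path

describe("login with special characters", () => {
  it("accepts passwords containing quotes and unicode", () => {
    expect(validateLogin("maria", `p@ss"wörd'!`)).toBe(true);
  });
});
```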
Agent tries → you review → provide feedback → agent tries again. Faster than specifying everything upfront.
Good CLAUDE.md = better performance. Good tests = safer autonomy. Good CI/CD = faster feedback.
Generate multiple versions, cherry-pick the best. AI creates diverse solutions. You bring judgment to select.
Always create a git branch before major AI changes. Try, fail, retry without risk. Merge only what works.
Front-load all context (stack, constraints, patterns), then state the task and requirements.
Have Claude review its own work for security, edge cases, performance, and accessibility.
Get multiple perspectives: security engineer, performance specialist, UX developer.
Complex features in stages: data model → API → UI → integration → tests.
Limit scope to get focused output. "Implement ONLY X. Do not add Y or Z. I'll handle those separately."
The agentic coding environment we'll use for every lab. Read, edit, run, iterate—all through natural language.
What: A markdown file in your project root giving Claude persistent context.
Where: /project-root/CLAUDE.md
Why: Claude reads it automatically at session start. Reduces repeated context, ensures consistent style, documents tribal knowledge.
Include: Overview, tech stack, architecture, conventions, common commands.
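A skeleton of what this might look like; the project details below are invented for illustration:

```markdown
# Project: Acme Dashboard (hypothetical example)

## Overview
Internal analytics dashboard for the support team.

## Tech stack
React 18, TypeScript, Tailwind, Vite, Vitest.

## Architecture
Feature folders under src/features/; shared UI in src/components/.

## Conventions
Named exports only. Props typed with interfaces. Tests colocated as *.test.tsx.

## Common commands
npm run dev, npm test, npm run lint
```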
What: A file where Claude stores learnings that persist across sessions.
Where: .claude/memory.md
Contents persist across sessions; access them with the /memory command.
Spawn separate Claude instances for parallel work. Tests while refactoring, research while coding. Give clear, scoped instructions.
Shell commands that trigger at specific events: PreToolUse, PostToolUse, Notification, Stop. Auto-run tests after edits.
/help, /clear, /compact, /config, /cost, /doctor, /init, /mcp, /memory, /review, /revert. Plus your own custom commands.
Plan: Claude proposes, you approve. Safer for learning.
Auto: Claude executes freely. Faster for routine tasks.
We build a team agreement on Day 2.
Open labs/lab1-setup.md to begin.
Foundations are set. Now we build. Core workflows, power features, and real code.
Every agentic task follows the same three-step loop.
Set the goal, provide context, specify acceptance criteria. The better the definition, the better the output.
Run the code, check the tests, review the diff. Verify AI output before trusting it.
Iterate based on what you see. Add improvements, fix edge cases, polish. Repeat until done.
This is faster than trying to specify everything upfront. Define broadly, validate quickly, refine iteratively.
Ask for N different approaches with tradeoffs explained.
Review all versions. Compare against team patterns and maintenance cost.
Pick the best, or combine elements from multiple. Refine the chosen approach.
"Generate 3 different approaches to this authentication flow. For each, explain the tradeoffs. Then I'll pick one."
AI generates diverse solutions. You bring judgment to select.
Building a UserProfileCard component. Open labs/lab2-workflows.md.
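Before opening the lab, here is a rough sketch of what such a component might look like, so you know the shape of the target. Props and styling are assumptions; the lab's spec is authoritative:

```tsx
// Hypothetical UserProfileCard sketch. The lab defines the real requirements.
interface UserProfileCardProps {
  name: string;
  role: string;
  avatarUrl?: string; // falls back to an initial when absent
  onMessage?: () => void;
}

export function UserProfileCard({ name, role, avatarUrl, onMessage }: UserProfileCardProps) {
  return (
    <article aria-label={`Profile: ${name}`} className="rounded-xl border p-4">
      {avatarUrl ? (
        <img src={avatarUrl} alt="" className="h-12 w-12 rounded-full" />
      ) : (
        <span className="inline-flex h-12 w-12 items-center justify-center rounded-full bg-gray-200">
          {name.charAt(0)}
        </span>
      )}
      <h2 className="mt-2 font-semibold">{name}</h2>
      <p className="text-sm text-gray-600">{role}</p>
      {onMessage && (
        <button type="button" onClick={onMessage} className="mt-2 text-sm underline">
          Message
        </button>
      )}
    </article>
  );
}
```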
Spawn separate Claude instances for independent tasks. The main agent coordinates, subagents execute in parallel.
"Work on the login page while a subagent writes tests for the API."
When to use: tests + docs simultaneously, research while coding, refactoring while testing.
PostToolUse hook matching Write|Edit → runs npm test after every file change. Tests run automatically.
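A sketch of what that hook could look like in .claude/settings.json; verify the exact schema against the current Claude Code docs before relying on it:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [{ "type": "command", "command": "npm test" }]
      }
    ]
  }
}
```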
Create .claude/commands/ with markdown files that define reusable prompts.
/review-security: check for vulnerabilities.
/analyze-failures: parse failure-log patterns.
Use $ARGUMENTS for dynamic input.
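A command is just a markdown file whose filename becomes the slash command. A sketch of .claude/commands/analyze-failures.md; the prompt body is illustrative:

```markdown
Read .claude/failures.log and analyze the last $ARGUMENTS entries:
1. Group failures by type (build, test, lint, runtime).
2. Identify recurring patterns.
3. Suggest CLAUDE.md additions that would prevent each pattern.
```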
PostToolUse hook logs every failed command to .claude/failures.log.
Then /analyze-failures reads the log, groups by type, identifies patterns, and suggests CLAUDE.md updates.
This is compound engineering: each failure makes future work easier.
Open labs/lab3-power.md to begin.
Agentic mindset, core concepts, frameworks, 7 principles, Claude Code setup. Lab 1 complete.
Define → Validate → Refine loop, Best of N, UserProfileCard built. Lab 2 complete.
Subagents, hooks, custom commands, compound loop. Lab 3 complete.
Tomorrow: MCP integrations (Figma, Playwright), Ship It challenge, and Team Playbook.
Day 2 starts at 9:00 AM. Have your Figma token ready.
Connect AI to external tools. Go from Figma to deployed code.
MCP lets Claude Code talk to Figma, browsers, databases, and more. A standard protocol for unlimited tool access.
Install Figma MCP server. Configure with your Personal Access Token. Verify with /mcp.
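Configuration is one JSON entry. A sketch assuming the community figma-developer-mcp server; the package name, flag, and env var are assumptions, so follow the lab's install steps:

```json
{
  "mcpServers": {
    "figma": {
      "command": "npx",
      "args": ["-y", "figma-developer-mcp", "--stdio"],
      "env": { "FIGMA_API_KEY": "<your-personal-access-token>" }
    }
  }
}
```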
Read Figma files. List components. Extract design tokens: colors, typography, spacing, variants.
Claude creates React components matching the design exactly. TypeScript, Tailwind, accessibility built in.
The complete design-to-code pipeline: Figma spec → AI reads design → AI writes component → human validates.
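Extracted tokens usually land as structured data the generated component can consume. A hypothetical TypeScript shape; names and values are invented:

```typescript
// Hypothetical design tokens extracted from Figma. All values are illustrative.
export const tokens = {
  colors: { primary: "#2563eb", surface: "#ffffff", textMuted: "#6b7280" },
  typography: {
    heading: { size: "1.25rem", weight: 600 },
    body: { size: "0.875rem", weight: 400 },
  },
  spacing: { sm: "0.5rem", md: "1rem", lg: "1.5rem" },
  radius: { card: "0.75rem" },
} as const;
```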
Open labs/lab4-compound.md to begin.
Compound AI: Figma MCP + Playwright MCP = visual regression testing.
Open labs/lab5-playwright.md to begin.
Start with Figma. End with a deployed component. The final challenge.
EXTRACT
Figma → Spec
IMPLEMENT
AI writes code
VALIDATE
Tests + review
REFINE
Iterate + polish
DEPLOY
Ship to preview
Open labs/lab6-figma.md to begin.
Share your deployed component URL with the group.
URL: _________________
URL: _________________
URL: _________________
URL: _________________
URL: _________________
URL: _________________
Turn what we learned into team norms. What AI does, what humans own, and how we ship safely.
Sync on what landed and what's blocked. Troubleshooting session with the team. Fix early friction.
Working session on a topic the team picks. Flag what's working and what needs adjusting.
Team owns the playbook. Final review, close out gaps, full handoff. Agentic workflows are routine.
The workshop gives the team the tools. The follow-up sessions make them stick.
You shipped it. Now make it stick.