Chanl
Learning AI

Part 4: All 7 Extension Points in One Production Codebase

50+ skills, multiple MCP servers, scoped rules, safety hooks — here's how all 7 Claude extension points compose in a real NestJS monorepo with 17 projects. What works, what fights, and what we'd do differently.

Dean Grover, Co-founder
March 19, 2026
20 min read
Illustration: developers at a cafe terrace with a rocket deployment diagram on screen

We've been running Claude Code as our primary development tool for about a year. Not as a side experiment. Not for generating boilerplate. As the main way code gets written, reviewed, tested, deployed, and documented across a monorepo with 8 NestJS microservices, 3 frontend apps, a Python voice bot, an SDK, and a CLI.

Over that time, we've built 50+ skills, configured multiple MCP servers, written hooks for safety, and rewritten our CLAUDE.md more times than I can count. Some of those extension points compose beautifully. Others fight each other in ways the documentation doesn't warn you about.

This is Part 4 of our series on the Claude extension stack. Part 1 covered the mental model and decision framework. Part 2 went deep on CLAUDE.md, hooks, and skills. Part 3 covered MCP servers, connectors, and Claude Apps. This article is different: no theory, no decision frameworks. Just a walkthrough of what all 7 extension points look like in one production codebase, what composes well, and what we'd do differently.

The codebase: what we're working with

Our platform is a monorepo with 17 projects spanning two languages, three deployment targets, and wildly different development workflows. Here's the landscape:

Backend: 8 NestJS microservices (agent management, interactions, knowledge base, channels, integrations, scenarios, assessments, and a platform gateway). They share a common package for auth, health checks, and inter-service communication. MongoDB for persistence, Redis for transport.

Frontend: 3 Next.js apps — an admin dashboard, a voice analytics UI, and an assessment platform. All share a common SDK for data hooks and API access.

SDK and CLI: A TypeScript SDK (@chanl-ai/platform-sdk) that provides React hooks, API modules, and a CLI. Both frontend apps and the CLI consume it. The SDK is the single source of truth for all data access — from agent tools to memory to analytics.

Voice bot: A Python service using Pipecat for real-time voice AI. Different language, different deployment target (Pipecat Cloud instead of Fly.io), different everything.

MCP server: A Vercel-hosted MCP server that gives external AI agents access to our platform's tools and data.

This matters because the extension stack needs to handle all of it. Backend TypeScript patterns are different from frontend React patterns. Python has different conventions than TypeScript. Deploying to Fly.io is different from deploying to Vercel or Pipecat Cloud. A single monolithic configuration would either be too generic to be useful or so long that Claude ignores most of it.

That tension — breadth vs. depth, universal rules vs. context-specific guidance — is the core problem the extension stack solves.

CLAUDE.md in practice: the 3-tier strategy

We structure project instructions across three tiers. Root CLAUDE.md provides navigation and universal rules. Service-level CLAUDE.md files provide project-specific context. Scoped rules auto-load based on which files you're touching.

Tier 1: root CLAUDE.md (~500 lines)

The root CLAUDE.md is the entry point. It establishes hard rules that apply everywhere, provides navigation to find anything in the monorepo, and documents the orchestrator pattern for multi-project work.

Here's the structure we landed on after months of iteration:

```markdown
# CLAUDE.md

## Session Start — Context Before Code
Every session begins here. Load context for the topic
before writing any code.

## Work Routing — Orchestrator + Subagent Architecture
The main thread plans, dispatches, reviews. Subagents implement.

## Hard Rules
10 non-negotiable rules. Link to full doc.

## Quick Commands
make backend, make api-get, make deploy-staging...

## Architecture
Scoped rules mapping, reference docs, related repos.

## Services Overview
Port/route table for all 8 services.

## Deploy
Fly.io, Vercel, Pipecat Cloud commands.

## Skills (Slash Commands)
Quick reference for /commit, /deploy, /plan, /board...
```

The key insight: the root CLAUDE.md is a navigation hub, not an encyclopedia. It tells Claude what exists and where to find it. The actual rules live in scoped files that load on demand.

We learned this the hard way. Our CLAUDE.md used to be over 600 lines — it contained all backend patterns, all frontend rules, all deploy procedures, everything. Claude would frequently ignore rules buried deep in the file. Splitting to a navigation hub with scoped rules was a turning point.

Tier 2: service-level CLAUDE.md

Each of the 17 projects has its own CLAUDE.md with project-specific details:

```markdown
# CLAUDE.md - agent-service

> **Parent docs**: See root CLAUDE.md for hard rules and architecture.

Quick reference for agent-service development.

**Port**: 8002
**Owner**: Agent configuration, tools, prompts, memory
**Consumers**: MCP server, interactions-service, platform-sdk

**Deploy**: Part of mono deploy (make deploy-staging)
**Health**: curl -sf http://localhost:8002/health
**Logs**: make dev-logs SVC=agent

## Endpoints
| Method | Path           | Description     |
|--------|----------------|-----------------|
| POST   | /agents        | Create agent    |
| GET    | /agents        | List agents     |
| GET    | /agents/stats  | Workspace stats |
...
```

These files are concise — port, endpoints, owners, gotchas. When a subagent gets dispatched to work on agent-service, it reads this file and immediately knows the lay of the land.

Tier 3: scoped rules (the game-changer)

Scoped rules in .claude/rules/ are the most underappreciated feature in the entire extension stack. Each rules file has a globs: frontmatter that tells Claude when to load it:

```yaml
# .claude/rules/backend-services.md
---
globs:
  - "services/**/*.ts"
  - "packages/nestjs-common/**/*.ts"
  - "packages/platform-server/**/*.ts"
---

# Backend Services — chanl-platform
Rules and patterns for NestJS microservice development...
```

```yaml
# .claude/rules/frontend-apps.md
---
globs:
  - "apps/**/*.tsx"
  - "apps/**/*.ts"
  - "apps/**/*.css"
  - "packages/platform-sdk/src/react/**/*.ts"
---

# Frontend Apps — chanl-platform
Rules for React apps, shadcn components, design tokens...
```

When you edit a backend service file, only backend rules load. When you edit a React component, only frontend rules load. This means Claude gets deep, relevant context without wasting tokens on irrelevant rules.

Here's our full rules directory:

```text
.claude/rules/
  api-contracts.md         # Response format, pagination, error codes
  backend-services.md      # NestJS modules, schemas, DTOs
  deploy-infra.md          # Fly.io, Vercel, Doppler, Docker
  figma-design-system.md   # Design tokens, component architecture
  frontend-apps.md         # React, shadcn, responsive, state
  inter-service.md         # ServiceProxy, auth flow, Redis transport
  lessons-backend.md       # Hard-won debugging lessons (backend)
  lessons-deploy.md        # Hard-won debugging lessons (deploy)
  lessons-frontend.md      # Hard-won debugging lessons (frontend)
  lessons-integration.md   # Hard-won debugging lessons (integration)
  sdk-cli.md               # SDK modules, hooks, query keys, CLI
  wizard-dialog-ux.md      # Multi-step dialog patterns
```

The lessons-* files deserve special mention. Every time we spend 30+ minutes debugging something that Claude should have known, we write it up:

```markdown
# lessons-backend.md

### mongoose-doc-save-race-condition
**Trigger**: any code using doc.save() for updates
**What happened**: concurrent updates overwrite each other
**Why wrong**: save() loads full document, modifies in memory,
  writes back — no atomicity
**Fix**: use findByIdAndUpdate(id, { $set: {...} },
  { new: true, runValidators: true }). Always.
```

These are filed with globs that match the relevant file types, so the next time Claude touches a Mongoose model, it automatically sees the lesson. We have about 30 of these across four files. They've prevented hundreds of repeated mistakes.
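The scoping itself is just the same globs frontmatter the other rules files carry. A lessons file header might look like this (the glob patterns here are illustrative, not our exact file):

```yaml
# .claude/rules/lessons-backend.md
---
globs:
  - "services/**/*.schema.ts"
  - "services/**/*.service.ts"
---
```

With those globs, the Mongoose lesson above loads only when Claude is actually editing a schema or service file.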

Skills in the wild: 50+ and counting

We have 38 skills in .claude/commands/ and 15 more in .claude/skills/. They fall into four categories: workflow, domain, operational, and cross-project.

Workflow Skills

These manage the development lifecycle:

| Skill | Purpose |
|-------|---------|
| /plan | TDD session plan with stories, tasks, use cases |
| /review | Check session work against the active plan |
| /commit | 7-phase commit: review, reflect, board sync, propose, push |
| /deploy | Multi-target deploy (Fly.io, Vercel, Pipecat Cloud) |
| /board | GitHub issue + project board management |
| /git-push | Push with pre-push verification |

The /commit skill is the most complex. Here's its header:

```yaml
---
description: Review + reflect + commit in one flow
argument-hint: [type] [scope] [message] OR [--wip] OR [--push]
allowed-tools: Bash(git:*), Bash(make:*), Bash(pnpm:*),
  Read, Edit, Write, Glob, Grep, AskUserQuestion
---
```

It runs seven phases: check the active plan, review changes against the plan, analyze the conversation transcript for anything missed, sync with the GitHub project board, verify use case tests pass, propose a commit message, and optionally push. The allowed-tools field constrains what Claude can do inside the skill — it can run git and make commands, read and edit files, but nothing else.

Domain Skills

These encode expertise about specific parts of the codebase:

| Skill | Purpose |
|-------|---------|
| /nestjs-coder | TDD-driven NestJS implementation with chanl patterns |
| /voice-bot | Voice bot development (Python/Pipecat) |
| /test-writer | Generate use case test scripts |
| /crud-test | CRUD test generation for any service |
| /migrate-ui-comp | Move UI components from app to SDK |

The /nestjs-coder skill is essentially a packaged expert. It includes TDD workflow documentation, pattern templates, a gotchas file, and scripts. When dispatched to implement a feature, it follows RED-GREEN-REFACTOR methodology using our specific patterns:

```text
.claude/skills/nestjs-coder/
  SKILL.md           # Main skill definition
  TDD-WORKFLOW.md    # RED → GREEN → REFACTOR steps
  PATTERNS.md        # Entity relations, response formats
  GOTCHAS.md         # Common pitfalls
  templates/         # Schema, DTO, controller templates
  scripts/           # Test runner helpers
```

Operational Skills

These provide observability and maintenance:

| Skill | Purpose |
|-------|---------|
| /status | Multi-environment health dashboard |
| /health | Quick service health check |
| /db | Database queries and inspection |
| /logs | Tail service logs |
| /users | User management operations |

Cross-project Skills

Some skills work across multiple repositories:

| Skill | Purpose |
|-------|---------|
| /context | Load deep context about any topic/project |
| /blog | Blog writing workflow (works from any repo) |
| /dispatch | Route work to specialized subagents |
| /helpme | Navigate the monorepo |

The /context skill is one of our favorites. Before writing any code, you run /context mcp or /context voice and it loads pre-built knowledge files about that area — architecture docs, recent changes, known gotchas. This replaces expensive exploration where Claude would spend 10 minutes reading files to build context that we've already documented.

What makes a good Skill

After building 50+, we've learned what separates useful skills from noise:

The description field is more important than the body. Claude decides whether to invoke a skill based on the description — a one-line summary in the frontmatter. If it's vague ("Helpful deployment utility"), Claude won't know when to use it. If it's specific ("Deploy chanl-platform to staging or production on Fly.io"), Claude matches it reliably.

Skills should have clear argument patterns. The argument-hint field tells Claude what inputs the skill expects. /deploy staging is clear. A skill that takes unnamed positional arguments is confusing.

Scope the allowed tools. A deploy skill shouldn't be able to edit source files. A code review skill shouldn't be able to run deploys. The allowed-tools field is your permission boundary.
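For instance, a read-only review skill might scope its tools like this (a sketch; the exact tool list is illustrative):

```yaml
---
description: Review staged changes against the active plan (read-only)
argument-hint: [plan-file]
allowed-tools: Read, Glob, Grep, Bash(git diff:*), Bash(git log:*)
---
```

It can inspect the diff and the history, but it cannot edit files, push, or deploy, no matter what the conversation drifts into.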

Kill skills that overlap. We retired several skills that overlapped with others. When Claude sees two skills that could match a request, it sometimes picks the wrong one or asks the user to choose — both are friction. Consolidate aggressively.

Hooks: the safety net

Hooks intercept tool calls before they execute. They're the mechanical enforcement layer — CLAUDE.md suggests behavior, hooks enforce it.

Project-level hook: production deploy gate

Our .claude/settings.json configures a PreToolUse hook on all Bash commands:

```json
{
  "permissions": {
    "deny": [
      "Bash(gh pr merge*--squash*)",
      "Bash(gh pr merge*--rebase*)"
    ]
  },
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/block-prod-deploy.sh"
          }
        ]
      }
    ]
  }
}
```

The deny list hard-blocks squash and rebase merges (we always use merge commits). The hook script checks every Bash command against a list of production-destructive patterns:

```bash
#!/bin/bash
# Exit 0 with no output = allow
# Exit 0 with hookSpecificOutput JSON = force user prompt
# Exit 2 = hard block

INPUT=$(cat)
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
[ -z "$COMMAND" ] && exit 0

REASON=""

PROD_MAKE_TARGETS=(
  "deploy-prod"
  "deploy-prod-full"
  "deploy-agents-prod"
  "release-promote"
  "mcp-deploy-prod"
  "voice-bot-deploy-prd"
  "doppler-sync-prod"
)

for target in "${PROD_MAKE_TARGETS[@]}"; do
  if echo "$COMMAND" | grep -qE "make\s+.*\b${target}\b"; then
    REASON="Production deploy: make $target"
    break
  fi
done

# If production command detected, force user confirmation
if [ -n "$REASON" ]; then
  cat <<EOF
{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "ask",
    "permissionDecisionReason": "$REASON — confirm to proceed"
  }
}
EOF
fi
```

Staging deploys pass through automatically. Production deploys require the user to approve in the CLI prompt. This is defense in depth — CLAUDE.md says "never deploy to production without confirmation," but the hook enforces it mechanically even when Claude is deep in a complex task and might forget.

Global hook: directory permissions

We also run a global hook at ~/.claude/hooks/directory-permissions.sh that applies across all projects. It auto-approves safe operations (Read, Grep, Glob) and blocks truly destructive commands regardless of project context.

The hook receives JSON input with the tool name and parameters, then makes a decision:

```bash
INPUT=$(cat)
TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name // empty')
CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')

# Allow silently: exit 0 with no output (message is informational only)
approve() { exit 0; }

# Explicit safe tools (always approve)
case "$TOOL_NAME" in
  Read|Grep|Glob|WebFetch|WebSearch|Write|Edit)
    approve "Safe tool auto-approved" ;;
esac

# Check Bash commands for dangerous patterns
is_blocked_bash() {
  local full="$CMD"
  # Block rm -rf on critical paths
  echo "$full" | grep -qE 'rm -rf +(/|~|\$HOME)' && return 0
  # Block git push --force to main
  echo "$full" | grep -qE 'git push .*--force.*(main|master)' && return 0
  # Block destructive database operations
  echo "$full" | grep -qiE 'dropdatabase|drop database' && return 0
  return 1
}

[ "$TOOL_NAME" = "Bash" ] && is_blocked_bash && exit 2  # hard block
exit 0
```

The hook pattern is simple: parse the JSON input, check against your rules, output a decision. The three possible outcomes — allow silently, ask the user, or hard block — give you granular control.

Hooks we wish we'd written sooner

We started with all safety rules in CLAUDE.md. For months, we relied on "Never push to main" and "Never deploy to production without asking" as text instructions. They worked 95% of the time. The 5% failures were catastrophic enough that we should have written hooks from day one.

The lesson: anything that would be a serious problem if violated once deserves a hook, not just a CLAUDE.md instruction.

MCP servers: where Claude meets the world

We connect multiple MCP servers to Claude Code, each serving a different purpose. MCP servers provide data access and actions that Claude can't get from the filesystem alone.

Our MCP server stack

| Server | What It Provides |
|--------|------------------|
| Serena | Semantic code analysis — find symbols, navigate references, understand structure |
| Playwright | Browser automation for testing UI changes |
| Figma | Design context, screenshots, component metadata |
| Slack | Read channels, send messages, team communication |
| shadcn | Component registry search, examples, audit checklists |
| Our own MCP server | Dogfooding — agents, tools, memory, analytics from our platform |

How MCP complements Skills

The relationship between MCP servers and skills is one of the most natural compositions in the stack. Skills define the workflow. MCP servers provide the data. They rarely conflict because they operate at different levels.

Example: our /deploy skill knows the deployment sequence — build, push, verify health, run smoke tests, post to Slack. The Slack MCP server handles the "post to Slack" step. The skill doesn't need to know Slack's API. The MCP server doesn't need to know the deployment workflow. Clean separation.

Another example: our /commit skill includes a phase that syncs with the GitHub project board. If an issue was being worked on, the commit skill updates its status. The board integration happens through gh CLI commands (not MCP), but the pattern is the same — the skill orchestrates, the external tool provides access.

The Figma-to-code pipeline

One of our most productive MCP integrations is the Figma connection. When implementing UI from designs, the workflow is:

  1. get_design_context from Figma MCP — structured code for the target component
  2. get_screenshot — visual reference
  3. The frontend rules file (auto-loaded via scoped rules) tells Claude how to adapt — replace raw colors with semantic tokens, replace arbitrary spacing with 8pt grid values, use existing shadcn components
  4. The shadcn MCP searches the component registry for existing patterns to copy

This pipeline turns a Figma frame into a production component in minutes. The scoped rules are critical — without them, Claude would use the raw Tailwind classes from Figma instead of our semantic token system.

Dogfooding our own MCP server

We use our own product's MCP server in development. This is simultaneously useful and humbling. When the MCP server breaks in dev, we experience it as developers before our users do. We've caught multiple issues this way — tool resolution bugs, authentication edge cases, context window overflow from chatty tool responses.

The chatty response problem is worth calling out. One of our MCP tools returns detailed analytics data. In early versions, a single tool call would dump 2,000+ tokens of JSON into Claude's context. Across a few tool calls in a session, we'd burn through context window budget fast. We learned to design MCP tool responses with context window cost in mind — return summaries by default, offer detail endpoints for drill-down. If you're building your own MCP server, our guide to building your first MCP server covers the protocol fundamentals, and advanced MCP patterns covers the response design lessons we learned the hard way.
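The shape of that fix can be sketched in TypeScript. None of this is our actual analytics API — the names are illustrative — but it shows the summary-by-default, detail-on-request pattern:

```typescript
// Illustrative types, not our real analytics schema.
interface AnalyticsRow {
  agentId: string;
  interactions: number;
  avgLatencyMs: number;
}

interface AnalyticsSummary {
  totalAgents: number;
  totalInteractions: number;
  // Tell the model how to get more, instead of handing it everything.
  detailHint: string;
}

// Return a compact summary unless the caller explicitly asks for rows.
function summarizeAnalytics(
  rows: AnalyticsRow[],
  opts: { detail?: boolean } = {}
): AnalyticsSummary | AnalyticsRow[] {
  if (opts.detail) return rows; // full dump only on request
  return {
    totalAgents: rows.length,
    totalInteractions: rows.reduce((n, r) => n + r.interactions, 0),
    detailHint: "call again with { detail: true } for per-agent rows",
  };
}
```

A response like this costs a few dozen tokens instead of thousands, and the hint tells Claude how to drill down when the user actually asks for detail.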

The orchestrator pattern: subagents at scale

For any work that touches more than one project in the monorepo, the main Claude thread acts as an orchestrator. It never writes code directly for multi-project tasks. Instead, it plans, dispatches, and reviews.

User Request → Phase 0: Clarify → Phase 1: Task Plan → Phase 2: Dispatch (Backend, SDK, and UI subagents) → Phase 3: Verify → Cross-Project Review

Orchestrator pattern: main thread plans and dispatches, subagents implement in parallel

Phase 0: clarify

Before any code, the orchestrator loads context for the relevant topic and asks clarifying questions. It reads config/projects.yaml to identify which projects are affected and determines the layer stack — backend services, SDK modules, UI components.

Phase 1: task plan

Tasks are ordered inside-out following what we call the DRY onion:

  1. Backend tasks first — schema changes, DTOs, controllers, services, tests
  2. SDK tasks next — types, module methods, React hooks, tests
  3. UI tasks last — components and pages that consume SDK hooks

This order matters. UI depends on SDK. SDK depends on API. If you build UI first and discover the API shape is wrong, you're reworking three layers instead of one.

Each task gets explicit acceptance criteria and test requirements. The task description includes: what to implement, what test to write, and what constitutes done.

Phase 2: dispatch

Each subagent gets a context packet:

text
Project: agent-service
Path: services/agent-service/
CLAUDE.md: services/agent-service/CLAUDE.md
Rules: .claude/rules/backend-services.md
Task: Add lastActive timestamp to agent schema.
  Update on every agent interaction.
  Test: verify timestamp updates on PATCH /agents/:id
Commands: build, test, health from projects.yaml

Independent tasks dispatch in parallel. Dependent tasks dispatch sequentially — the SDK subagent waits for the backend subagent to finish so it knows the final API shape.
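That sequencing is simple enough to sketch. Here dispatch() is a stand-in for the real subagent call, and the task names are illustrative:

```typescript
// Sketch: independent tasks run in parallel, dependent tasks run in order.
type Task = { name: string; run: () => Promise<void> };

// Stand-in for dispatching a subagent and awaiting its result.
async function dispatch(task: Task): Promise<void> {
  await task.run();
}

async function orchestrate(): Promise<string[]> {
  const completed: string[] = [];
  const mk = (name: string): Task => ({
    name,
    run: async () => { completed.push(name); },
  });

  // Two unrelated services: dispatch in parallel.
  await Promise.all([dispatch(mk("agent-service")), dispatch(mk("channels-service"))]);
  // SDK waits for the backend so it knows the final API shape.
  await dispatch(mk("sdk"));
  // UI consumes SDK hooks, so it goes last.
  await dispatch(mk("ui"));
  return completed;
}
```

The real orchestrator derives this ordering from the task plan's dependency graph rather than hard-coding it, but the await/Promise.all split is the whole idea.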

Phase 3: verify

The orchestrator reviews all subagent results for cross-project consistency. Do SDK types match the actual API response? Do UI hooks use the correct SDK methods? Do the test assertions match the implementation?

This phase catches integration gaps that no individual subagent can see. A backend subagent might return { items: [...] } while the SDK subagent expected { data: [...] }. The orchestrator spots the mismatch.
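Even a crude mechanical check catches that class of mismatch. A hypothetical sketch (the key lists are illustrative):

```typescript
// Does a live API response contain the top-level keys the SDK type expects?
function findMissingKeys(
  expectedKeys: string[],
  response: Record<string, unknown>
): string[] {
  return expectedKeys.filter((k) => !(k in response));
}

// SDK was written against { data, total }; backend actually shipped { items, total }.
const sdkExpects = ["data", "total"];
const apiResponse: Record<string, unknown> = { items: [], total: 0 };

const mismatch = findMissingKeys(sdkExpects, apiResponse);
// mismatch is ["data"] — flagged before the UI subagent builds on the wrong shape
```

In practice the orchestrator does this by reading both sides' code rather than running a script, but the comparison it performs is the same.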

Scope guardrails

We've encoded blast-radius checks into the CLAUDE.md:

If a change touches core schemas or shared contracts, enumerate the full blast radius before writing any code.

Specific triggers:

  • Schema field change on a core entity (Agent, Interaction, Scenario) — map ALL downstream consumers first
  • More than 5 tasks in one session — split into multiple PRs
  • Shared module change (the common NestJS package) — every service depends on it, what breaks?
  • API contract change (response shape, status codes) — SDK and all consuming apps may break

These guardrails have prevented several "I renamed a field and broke 12 things" disasters.

What composes well (and what fights)

Skills plus MCP servers compose naturally because they operate at different abstraction levels. Scoped rules plus root CLAUDE.md compose well because they split breadth and depth. The friction comes from too many skills (decision paralysis), chatty MCP responses (context burn), and monolithic CLAUDE.md files that Claude ignores past the first 200 lines.

Natural compositions

Skills + MCP = workflow with data. Skills define the steps. MCP servers provide external data. The deploy skill knows the sequence, the Slack MCP posts the results. The UI skill knows the component patterns, the Figma MCP provides the design. They never step on each other because they operate at different levels of abstraction.

Scoped rules + root CLAUDE.md = context without bloat. The root CLAUDE.md stays navigable at ~500 lines. Scoped rules provide deep context for whatever you're touching — 2,600+ lines of rules that load selectively. When editing a backend service, Claude sees backend patterns, Mongoose gotchas, and API contract rules. When editing a React component, it sees design tokens, spacing rules, and shadcn patterns. Same session can switch between both.

Hooks + CLAUDE.md = defense in depth. CLAUDE.md says "don't deploy to production without asking." The hook mechanically blocks it. CLAUDE.md reduces attempts. Hooks catch the ones that slip through. Together they create reliable safety without either being perfect alone.

Lessons files + scoped globs = institutional memory. When Claude touches a Mongoose model, it automatically sees the lesson about doc.save() race conditions. When it writes a service-to-service call, it sees the lesson about missing workspace headers. These prevent recurring mistakes without cluttering the main CLAUDE.md.

Orchestrator + subagents + service CLAUDE.md = parallel multi-project work. Each subagent gets exactly the context it needs. The backend subagent reads the service's CLAUDE.md and backend rules. The SDK subagent reads the SDK rules. The orchestrator tracks dependencies and reviews consistency. Work that would take a single agent hours of context-switching happens in parallel.

Friction points

Too many skills = decision paralysis. When Claude sees 50+ skills, it sometimes picks the wrong one or asks "did you mean /deploy or /deploy-prod?" We've consolidated where possible and deprecated skills that overlap (prefixed with dontuse- so they're still in version control but clearly retired).

MCP servers that are too chatty eat context. Our own MCP server's analytics tool initially returned full dataset objects. A few tool calls in a conversation would burn thousands of context tokens. We redesigned responses to return summaries first, with explicit drill-down options. General rule: MCP tool responses should be under 500 tokens unless the user asked for detail.

Everything-in-CLAUDE.md fails at scale. We proved this empirically. Our 600-line CLAUDE.md had rules about Mongoose patterns, React component conventions, deploy procedures, and testing strategies all in one file. Claude reliably followed rules in the first 200 lines and intermittently ignored rules deeper in the file. Splitting to scoped rules fixed this — each rules file is 100-300 lines, well within Claude's reliable attention.

Skills without clear descriptions never fire. We have a few skills with generic descriptions like "Code quality utilities" that Claude never invokes unprompted. The skills with specific descriptions — "Deploy chanl-platform to staging or production on Fly.io" — get invoked reliably. The description is a matching function, not documentation.

Subagent context hand-off is imperfect. Subagents can't see the orchestrator's conversation. They get a task description and file paths, but miss nuance from the discussion. We've mitigated this by making task descriptions very explicit — include the "why" and the constraints, not just the "what."

What we'd do differently

Hooks first, scoped rules from day one, and half as many skills with twice the description quality. Those three changes would have saved us months of rework and dozens of incidents where CLAUDE.md soft-guardrails failed under pressure.

Start with Hooks for safety

We spent our first three months with all safety rules in CLAUDE.md. "Never push to main." "Never deploy without asking." "Never delete production data." These rules worked most of the time. The failures were rare but severe.

Hooks should be the first thing you configure, not the last. Any rule where a single violation causes real damage belongs in a hook. CLAUDE.md is for guidance and patterns. Hooks are for hard constraints.

Use scoped rules from day 1

We wrote a monolithic CLAUDE.md for months before discovering scoped rules. By the time we split it, the file was over 600 lines and Claude was ignoring entire sections.

If you're starting a new project: create .claude/rules/ immediately. Put your domain-specific rules in scoped files with glob patterns. Keep root CLAUDE.md as a navigation hub that's genuinely quick to scan.

Fewer, better Skills

Our skill count grew organically. Someone needed a deploy shortcut, so we wrote /deploy. Then /deploy-prod for a slightly different workflow. Then /deploy-staging as a wrapper. Now we have three skills where one with argument routing would suffice.

Start with the question: "Is this a distinct workflow, or a variant of an existing one?" If it's a variant, add argument handling to the existing skill. Only create a new skill when the workflow is fundamentally different.
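Concretely, one consolidated deploy skill can route on its first argument instead of existing as three near-duplicates. A sketch of what its frontmatter might look like (the flags are illustrative):

```yaml
---
description: Deploy chanl-platform to staging or production on Fly.io
argument-hint: [staging|prod] [--service name] [--skip-checks]
allowed-tools: Bash(make:*), Bash(fly:*), Read
---
```

One skill, one description for Claude to match on, and the environment difference lives in an argument rather than in the skill catalog.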

Better Skill descriptions

The description field in skill frontmatter is the most important line in the file. We initially treated it as documentation. It's actually a matching function — Claude reads it to decide whether to invoke the skill.

Bad: "Helpful utility for code operations"
Good: "Review + reflect + commit in one flow"
Better: "Deploy chanl-platform to staging or production on Fly.io"

Specific verbs, specific nouns, specific targets. Claude matches on these.

Invest in lessons files early

Our lessons files (lessons-backend.md, lessons-integration.md, etc.) are the highest-ROI artifacts in the entire configuration. Each entry takes 5 minutes to write and saves 30+ minutes every time the same situation comes up.

The format matters: Trigger (when does this apply?), What happened (what went wrong?), Why wrong (root cause), Fix (what to do instead). Scoped with globs so they load automatically when touching relevant files.

Start logging lessons from your first debugging session. You'll thank yourself within a week.

Design MCP tool responses for context budget

Every token in an MCP tool response is a token Claude can't use for reasoning. We learned this when our analytics MCP tool started returning 2KB JSON blobs. Three tool calls in a conversation and Claude was running out of room to think.

Design your MCP responses like API responses for mobile: return the minimum useful payload. Offer pagination. Let the caller request detail when needed. Your future self — burning context window budget in a complex refactoring session — will appreciate the restraint.

The extension stack in action: a real session

A single feature touches all seven extension points in sequence: context loading, task planning, scoped rule activation, subagent dispatch, orchestrator verification, hook-gated commit, and MCP-powered team notification. Here's that flow for a real task: adding a lastActive timestamp to agents that updates whenever the agent handles an interaction.

1. Context loads (/context agents). Pre-built knowledge about agent-service architecture, schema, consumers. Saves 10 minutes of exploration.

2. Plan creates tasks (/plan "add lastActive to agents"). Orchestrator identifies three layers: backend (schema + service), SDK (types + hooks), UI (agent list column). Creates tasks in dependency order.

3. Scoped rules activate. When the backend subagent opens services/agent-service/src/agents/agent.schema.ts, backend rules and lessons files auto-load. Claude sees the Mongoose findByIdAndUpdate pattern, not doc.save(). It sees the lesson about virtual .id lost in serialization.

4. Subagent implements. The backend subagent adds the schema field, updates the service to set lastActive on interaction, writes a test. The SDK subagent adds the type and updates the hook. The UI subagent adds the column.

5. Orchestrator verifies. Checks that the SDK type matches the API response. Checks that the UI column references the SDK hook, not a direct API call. Runs the test suite.

6. Commit goes through hooks (/commit --push). The commit skill reviews changes, the hook lets staging-related commands pass but would block any accidental production deploy. Post-push, a message goes to the team Slack channel via MCP.

Seven extension points, one task: CLAUDE.md for navigation, scoped rules for patterns, skill for workflow, hook for safety, MCP for Slack, subagents for parallel work, lessons files for avoiding known pitfalls.

None of them are complex individually. The power is in composition.

Getting started: a practical sequence

Start with hooks and a minimal CLAUDE.md in week one, add scoped rules in week two, write your first skills in week three, and begin logging lessons in week four. MCP servers and orchestrator patterns come later — they compound on top of the foundation.

Week 1: Hooks + minimal CLAUDE.md. Write hooks for your hard safety constraints. Write a CLAUDE.md under 200 lines that covers project structure, key commands, and the 5 rules you care most about.

Week 2: Scoped rules. Create .claude/rules/ with 2-3 files for your main domains (backend, frontend, deploy). Add glob patterns. Move domain-specific content out of root CLAUDE.md.

Week 3: First skills. Write 3-5 skills for your most common workflows. Commit, deploy, and test are good starting points. Focus on descriptions.

Week 4: Lessons files. Start logging debugging lessons. One file per domain. Scope with globs. This compounds — every week the lessons file gets more valuable.

Month 2: MCP servers. Connect external MCP servers that match your workflow (Figma for design-to-code, Slack for communication, database for queries). Build your own MCP server if you're dogfooding a platform product.

Month 3: Orchestrator pattern. If you're in a monorepo or multi-project setup, document the subagent dispatch pattern. Write service-level CLAUDE.md files. Build the /dispatch skill.

The key is to start simple and iterate. Every configuration artifact should exist because it solved a real problem, not because it seemed theoretically useful.


This article is the most specific in the series because it draws from real production usage, not theory. The patterns here evolved over a year of daily use, hundreds of debugging sessions, and more CLAUDE.md rewrites than I care to count.

The extension stack isn't a destination — it's infrastructure that grows with your codebase. The 50+ skills, scoped rules, safety hooks, and MCP integrations we run today started as a 100-line CLAUDE.md and a dream. They'll look different in another year.

If you're earlier in the journey, the rest of this series covers the fundamentals. Part 1 has the mental model for deciding which extension point to use. Part 2 goes deep on CLAUDE.md, hooks, and skills. Part 3 covers MCP servers and external integrations. And if you want to see how MCP tool management works at scale or what happens when your agent has 30 tools and no idea when to use them, those articles have the technical detail.

Build the safety net first. Add context second. Automate workflows third. That's the sequence that works.

Build AI agents with production-grade tools

Chanl gives your agents tools, knowledge, memory, and testing — so you can focus on the customer experience, not the infrastructure.
