Chanl
Learning AI

Prompt Engineering from First Principles: 12 Techniques Every AI Developer Needs

Master 12 essential prompt engineering techniques with real TypeScript examples. From zero-shot to ReAct, build better AI agents from first principles.

Chanl Team
AI Agent Testing Platform
March 6, 2026
25 min read
Illustration of a person writing thoughtfully at a desk with sticky notes and a warm lamp

Most prompt engineering guides read like recipe books. Do this, get that. They work until they don't — and when they don't, you're stuck because you never understood why the technique worked in the first place.

This guide is different. We're going to build up from fundamentals, showing you twelve techniques that form the backbone of every production AI system. Each one comes with real TypeScript code you can run today, using the Anthropic or OpenAI SDK. No pseudocode. No hand-waving.

By the end, you'll have a mental model for when to reach for each technique and why it works — not just a bag of tricks.


1. Zero-Shot Prompting

Zero-shot prompting is the simplest technique: you give the model an instruction with no examples. Just tell it what you want.

You'd use this when the task is straightforward enough that the model's training data covers it. Classification, summarization, extraction — anything where the expected behavior is well-understood from the instruction alone.

The naive version:

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 256,
  messages: [
    { role: "user", content: "Is this review positive or negative? 'The food was amazing but the service was painfully slow.'" }
  ],
});
 
// Output: A long paragraph discussing nuances of the review

The model gives you a thoughtful essay when you wanted a label. Here's the fix — be explicit about format:

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 64,
  messages: [
    {
      role: "user",
      content: `Classify this customer review as exactly one of: POSITIVE, NEGATIVE, or MIXED.
Respond with only the classification label, nothing else.
 
Review: "The food was amazing but the service was painfully slow."
 
Classification:`
    }
  ],
});
 
// Output: "MIXED"

The key principle: zero-shot works when you make the task unambiguous. Specify the output format. Constrain the response space. Don't leave room for interpretation.
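Constraining the response space also means validating it. Even with a tight prompt, production code shouldn't trust that the model obeyed. A minimal guard, as a sketch (the label set mirrors the example above):

```typescript
// Validate that a zero-shot classification landed in the allowed label set
// before acting on it. If this returns null, retry or fall back.
const LABELS = ["POSITIVE", "NEGATIVE", "MIXED"] as const;
type Label = (typeof LABELS)[number];

function parseLabel(raw: string): Label | null {
  const cleaned = raw.trim().toUpperCase();
  return (LABELS as readonly string[]).includes(cleaned)
    ? (cleaned as Label)
    : null;
}

parseLabel(" mixed\n");   // "MIXED" — tolerates whitespace and case
parseLabel("It's mixed"); // null — not an exact label, so don't trust it
```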


2. Few-Shot Prompting

When zero-shot isn't cutting it — maybe the model keeps misinterpreting your intent, or the output format needs to be very specific — give it examples. Few-shot prompting means providing 2-5 input/output pairs that demonstrate exactly what you want.

This shines when you need consistent formatting, domain-specific classification labels, or when the task has subtle rules that are easier to show than explain.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 128,
  messages: [
    {
      role: "user",
      content: `Extract the action items from meeting notes. Format each as a JSON object.
 
Meeting notes: "John will send the Q3 report by Friday. Sarah needs to review the API docs."
Action items:
[{"owner": "John", "task": "Send Q3 report", "deadline": "Friday"},
 {"owner": "Sarah", "task": "Review API docs", "deadline": null}]
 
Meeting notes: "We agreed to postpone the launch. Mike will update the roadmap and notify stakeholders by EOD."
Action items:
[{"owner": "Mike", "task": "Update roadmap", "deadline": "EOD"},
 {"owner": "Mike", "task": "Notify stakeholders", "deadline": "EOD"}]
 
Meeting notes: "Lisa will schedule a follow-up for next week. The team should review the new pricing tiers before then."
Action items:`
    }
  ],
});
 
// Output: [{"owner": "Lisa", "task": "Schedule follow-up", "deadline": "next week"},
//          {"owner": "Team", "task": "Review new pricing tiers", "deadline": "next week"}]

A few things to notice. The examples establish the JSON schema implicitly — the model picks up field names, null handling, and how to normalize deadlines without you writing a single line of specification. Choose examples that cover edge cases (like the null deadline) and your few-shot prompt becomes a spec by demonstration.


3. Chain-of-Thought (CoT)

Chain-of-thought prompting asks the model to show its reasoning before giving a final answer. It's the difference between a student writing down just the answer versus showing their work.

This technique is essential for multi-step reasoning: math, logic, code debugging, any task where jumping straight to an answer leads to errors.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Without CoT — the model often gets this wrong
const naive = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 256,
  messages: [
    {
      role: "user",
      content: "A customer bought 3 items at $12.99 each, has a 15% discount coupon, and shipping is $5.99 for orders under $40 but free for orders $40+. What's the total?"
    }
  ],
});
 
// With CoT — dramatically more accurate
const withCoT = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 512,
  messages: [
    {
      role: "user",
      content: `A customer bought 3 items at $12.99 each, has a 15% discount coupon, and shipping is $5.99 for orders under $40 but free for orders $40+. What's the total?
 
Think through this step-by-step:
1. Calculate the subtotal
2. Apply the discount
3. Determine shipping cost
4. Calculate the final total
 
Show your reasoning, then give the final answer.`
    }
  ],
});
 
// Output:
// 1. Subtotal: 3 × $12.99 = $38.97
// 2. Discount: $38.97 × 0.15 = $5.85, so after discount: $38.97 - $5.85 = $33.12
// 3. Shipping: $33.12 < $40, so shipping is $5.99
// 4. Final total: $33.12 + $5.99 = $39.11

The magic of CoT isn't just accuracy — it's debuggability. When the model gets it wrong, you can see where the reasoning broke down. That's invaluable in production systems where you need to understand failures, not just detect them.
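To get both benefits in production — reasoning for logs, a clean answer for code — a common pattern is to instruct the model to end with a marker line, then split on it. A sketch (the "FINAL ANSWER:" marker is an assumption; use whatever delimiter you tell the model to emit):

```typescript
// Split a chain-of-thought response into reasoning (for logging/debugging)
// and the final answer (for downstream code). Returns answer: null if the
// model didn't emit the marker, so callers can retry or flag the response.
function splitCoT(output: string): { reasoning: string; answer: string | null } {
  const marker = "FINAL ANSWER:";
  const idx = output.lastIndexOf(marker);
  if (idx === -1) return { reasoning: output.trim(), answer: null };
  return {
    reasoning: output.slice(0, idx).trim(),
    answer: output.slice(idx + marker.length).trim(),
  };
}

splitCoT("1. Subtotal: $38.97\n2. After discount: $33.12\nFINAL ANSWER: $39.11");
// { reasoning: "1. Subtotal: $38.97\n2. After discount: $33.12", answer: "$39.11" }
```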


4. System Prompts and Role Prompting

System prompts set the stage before any user interaction. They define who the model is, what it knows, and how it should behave. Role prompting takes this further by assigning a specific persona with domain expertise.

This is the backbone of every production AI agent. Without a well-crafted system prompt, your agent is a generalist trying to be everything to everyone.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 1024,
  system: `You are a senior billing support specialist at a SaaS company.
 
Your knowledge:
- Subscription tiers: Free, Pro ($29/mo), Enterprise ($99/mo)
- Billing cycles are on the 1st of each month
- Prorated refunds are available within 14 days of a charge
- You can issue credits but cannot process direct refunds — those go to the billing team
 
Your behavior:
- Always verify the customer's email before discussing account details
- Be empathetic but concise — customers contacting billing are often frustrated
- If a request is outside your authority, explain what you CAN do and offer to escalate
- Never guess at account balances or charge amounts — say you'll look it up`,
  messages: [
    {
      role: "user",
      content: "I was charged twice this month and I want my money back."
    }
  ],
});
 
// The model responds in character: asks for email verification,
// acknowledges the frustration, explains the refund escalation process

The system prompt above does four things well: it defines the domain (billing), sets boundaries (what the agent can and can't do), establishes tone (empathetic but concise), and provides guardrails (never guess at numbers). When you're building agents for customer experience, that combination — knowledge, authority limits, tone, and safety rails — is what separates a useful agent from a liability.

If you're managing prompts across multiple agents, version control becomes critical. Platforms with dedicated prompt management let you iterate on system prompts without redeploying code.


5. Structured Output

Getting consistent, parseable output from an LLM is one of the most practically important skills in production AI. Structured output techniques force the model to respond in a specific format — JSON, XML, or a predefined schema.

typescript
import OpenAI from "openai";
 
const client = new OpenAI();
 
// Using OpenAI's response_format for guaranteed JSON
const response = await client.chat.completions.create({
  model: "gpt-4o",
  response_format: { type: "json_object" },
  messages: [
    {
      role: "system",
      content: "You extract structured data from customer support messages. Always respond in JSON."
    },
    {
      role: "user",
      content: `Extract the following from this message:
- intent (one of: billing_question, technical_issue, feature_request, cancellation, general_inquiry)
- urgency (low, medium, high)
- entities (any products, features, or account details mentioned)
 
Message: "My enterprise dashboard has been showing wrong analytics data since Tuesday.
This is blocking our quarterly review tomorrow — we need this fixed ASAP."
 
Respond as JSON with keys: intent, urgency, entities, summary`
    }
  ],
});
 
const parsed = JSON.parse(response.choices[0].message.content!);
// {
//   "intent": "technical_issue",
//   "urgency": "high",
//   "entities": ["enterprise dashboard", "analytics", "quarterly review"],
//   "summary": "Analytics data showing incorrect information since Tuesday, blocking quarterly review"
// }

For Anthropic's API, you can achieve reliable structured output with XML tags — Claude handles them particularly well:

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 512,
  messages: [
    {
      role: "user",
      content: `Analyze this support ticket and provide your analysis in the following XML format:
 
<analysis>
  <intent>the primary intent</intent>
  <urgency>low|medium|high</urgency>
  <sentiment>positive|neutral|negative</sentiment>
  <recommended_action>what should happen next</recommended_action>
</analysis>
 
Ticket: "I've been a customer for 3 years and I love your product, but this new update
completely broke the reporting feature. I need it fixed or I'll have to look at alternatives."
 
Provide only the XML, no other text.`
    }
  ],
});
 
// Output:
// <analysis>
//   <intent>technical_issue</intent>
//   <urgency>high</urgency>
//   <sentiment>negative</sentiment>
//   <recommended_action>Escalate to engineering for reporting feature regression fix;
//   acknowledge loyalty and provide timeline</recommended_action>
// </analysis>

The choice between JSON and XML depends on your downstream pipeline. JSON is easier to parse programmatically. XML is easier for the model to produce correctly because the closing tags act as natural delimiters. In practice, use JSON with response_format when available, and XML tags for Claude or when you need nested structures the model can reason about.
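For the XML route, extraction on the consuming side can stay simple. A sketch of a tag extractor that handles the flat, well-formed tags the prompt above asks for — reach for a real XML parser if your schema grows nested or adversarial:

```typescript
// Pull the text content of a single tag out of a model's XML-formatted reply.
// Assumes flat, well-formed tags (which is what the prompt constrains the
// model to produce); returns null if the tag is missing.
function extractTag(xml: string, tag: string): string | null {
  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match ? match[1].trim() : null;
}

const raw = `<analysis>
  <intent>technical_issue</intent>
  <urgency>high</urgency>
</analysis>`;

extractTag(raw, "urgency"); // "high"
extractTag(raw, "sentiment"); // null — tag absent, caller decides what to do
```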


6. Template Variables

Real-world prompts aren't static strings — they're templates with dynamic values injected at runtime. Template variables let you separate prompt logic from prompt data, making prompts reusable across different contexts.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Define your template
const supportTemplate = (vars: {
  agentName: string;
  companyName: string;
  customerName: string;
  productTier: string;
  knowledgeBase: string;
}) => `You are ${vars.agentName}, a support agent for ${vars.companyName}.
 
The customer you're speaking with:
- Name: ${vars.customerName}
- Plan: ${vars.productTier}
- ${vars.productTier === "Enterprise" ? "This is a high-priority account. Prioritize their request." : "Standard priority."}
 
Reference knowledge:
${vars.knowledgeBase}
 
Respond helpfully and reference specific documentation when possible.`;
 
// Use with different contexts
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 1024,
  system: supportTemplate({
    agentName: "Alex",
    companyName: "Acme Cloud",
    customerName: "Sarah Chen",
    productTier: "Enterprise",
    knowledgeBase: "- SSO is configured via Settings > Security > SAML\n- Enterprise accounts have dedicated support SLAs",
  }),
  messages: [
    { role: "user", content: "How do I set up SSO for my team?" }
  ],
});

Template variables become especially powerful when combined with a prompt management system. Instead of hardcoding templates in your application code, you store them externally and inject variables at runtime. This means product managers can tweak the agent's personality, update knowledge references, or adjust per-tier behavior — all without touching code. This is exactly the kind of workflow that prompt management tools are built for.
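When templates live outside your code, the function-based approach above gives way to string interpolation at runtime. A minimal sketch, assuming a `{{placeholder}}` syntax (your prompt store may use a different one):

```typescript
// Substitute {{name}} placeholders in an externally stored prompt template.
// Throws on missing variables so a typo in the store fails loudly at render
// time instead of silently shipping "{{agentName}}" to the model.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => {
    if (!(key in vars)) throw new Error(`Missing template variable: ${key}`);
    return vars[key];
  });
}

// e.g. a template fetched from your prompt management system:
const stored = "You are {{agentName}}, a support agent for {{companyName}}.";
renderTemplate(stored, { agentName: "Alex", companyName: "Acme Cloud" });
// "You are Alex, a support agent for Acme Cloud."
```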


7. Instruction Hierarchy

When a prompt has multiple instructions, order matters. Models tend to follow instructions that appear later in the prompt more reliably, and they can lose track of constraints mentioned early on. Instruction hierarchy is about structuring your prompt so the most important rules are in the strongest positions.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Bad: Important constraint buried in the middle
const weakPrompt = `Help the customer with their billing question.
Never reveal internal pricing formulas or discount approval thresholds.
Be friendly and conversational.
If they ask about competitor pricing, redirect to our value proposition.
Always confirm the customer's identity before sharing account details.
Use their first name when possible.`;
 
// Good: Structured with clear priority levels
const strongPrompt = `## CRITICAL RULES (never violate)
1. Verify customer identity (email + last 4 of card) before sharing ANY account details
2. Never reveal internal pricing formulas or discount approval thresholds
3. Never share other customers' information
 
## RESPONSE GUIDELINES
- Be friendly and conversational; use the customer's first name
- If asked about competitor pricing, redirect to our value proposition
- Keep responses under 3 sentences unless the customer asks for detail
 
## KNOWLEDGE
- Current plans: Starter ($19/mo), Growth ($49/mo), Scale ($149/mo)
- Billing cycle: 1st of each month
- Refund window: 30 days`;
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 1024,
  system: strongPrompt,
  messages: [
    { role: "user", content: "What discount can you give me?" }
  ],
});

The structured version uses three layers: critical rules (must never be broken), response guidelines (should be followed), and reference knowledge (context). This mirrors how Anthropic recommends structuring prompts — clarity and constraints first, style second, context third. When your agent has dozens of instructions, this hierarchy prevents the model from "forgetting" the important ones.


8. Negative Prompting

Sometimes telling the model what NOT to do is more effective than trying to describe everything it should do. Negative prompting defines boundaries and prevents common failure modes.

This is particularly useful when you've observed specific failure patterns in testing. Rather than trying to engineer the perfect positive instruction, you add guardrails for the known failure cases.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 1024,
  system: `You are a medical information assistant providing general health education.
 
DO:
- Provide general health information from established medical sources
- Suggest the user consult a healthcare provider for personal medical questions
- Use clear, accessible language
 
DO NOT:
- Diagnose conditions or interpret symptoms for a specific person
- Recommend specific medications, dosages, or treatment plans
- Say "you probably have..." or "it sounds like you might have..."
- Provide information about self-harm methods
- Contradict established medical consensus (e.g., vaccine safety)
- Use phrases like "I'm not a doctor, but..." — instead, directly state the general information and recommend professional consultation`,
  messages: [
    {
      role: "user",
      content: "I've been having chest pain for two days. What do I have?"
    }
  ],
});
 
// Output explains that chest pain has many possible causes (general info),
// strongly recommends seeking immediate medical attention,
// does NOT attempt to diagnose

The "DO NOT" list in the example above came from real failure observations — models tend to hedge with "I'm not a doctor, but..." and then effectively diagnose anyway. Negative prompts catch that specific pattern. When you're running scenario-based testing against your agents, you'll discover these failure patterns quickly, and negative prompts are the fastest way to patch them.
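A cheap way to enforce those guardrails in tests is an output-side scan for the exact phrasings the system prompt forbids, so violations surface in CI instead of production. A sketch — the phrase list mirrors the example's DO NOT section and is an assumption; build yours from observed failures:

```typescript
// Scan a model response for phrasings the negative prompt forbids.
// Returns the matched phrases so failing tests can report what leaked.
const FORBIDDEN_PHRASES = [
  "i'm not a doctor, but",
  "you probably have",
  "it sounds like you might have",
];

function findViolations(output: string): string[] {
  const lower = output.toLowerCase();
  return FORBIDDEN_PHRASES.filter((phrase) => lower.includes(phrase));
}

findViolations("I'm not a doctor, but that sounds like angina.");
// ["i'm not a doctor, but"]
findViolations("Chest pain has many causes; please seek immediate care.");
// []
```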


9. Self-Consistency

Self-consistency runs the same prompt multiple times, collects several reasoning paths, and picks the answer that appears most often. It's a brute-force reliability technique: if three out of five runs agree on an answer, you can be more confident than trusting a single run.

This is worth the extra API cost when accuracy matters more than latency — classification tasks, data extraction, content moderation decisions.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
async function classifyWithConsistency(
  text: string,
  labels: string[],
  runs: number = 5
): Promise<{ label: string; confidence: number }> {
  const prompt = `Classify the following text into exactly one category: ${labels.join(", ")}.
Respond with only the category label.
 
Text: "${text}"
 
Category:`;
 
  // Run multiple classifications in parallel
  const results = await Promise.all(
    Array.from({ length: runs }, () =>
      client.messages.create({
        model: "claude-sonnet-4-5-20250514",
        max_tokens: 32,
        temperature: 0.7, // Some variation to get diverse reasoning paths
        messages: [{ role: "user", content: prompt }],
      })
    )
  );
 
  // Count votes
  const votes: Record<string, number> = {};
  for (const result of results) {
    const label = (result.content[0] as { text: string }).text.trim();
    votes[label] = (votes[label] || 0) + 1;
  }
 
  // Find majority
  const sorted = Object.entries(votes).sort((a, b) => b[1] - a[1]);
  const [topLabel, topCount] = sorted[0];
 
  return {
    label: topLabel,
    confidence: topCount / runs,
  };
}
 
// Usage
const result = await classifyWithConsistency(
  "Your product is okay I guess, but I expected more for the price",
  ["positive", "negative", "neutral", "mixed"]
);
 
console.log(result);
// { label: "mixed", confidence: 0.8 }  — 4 out of 5 runs agreed

The temperature: 0.7 is deliberate. If you run at temperature 0, you'll get the same answer every time — which defeats the purpose. You want enough variation to surface alternative interpretations, then let the majority vote resolve ambiguity. In practice, 3-5 runs is the sweet spot between reliability and cost.


10. Prompt Chaining

Prompt chaining breaks a complex task into a sequence of simpler prompts, where each step's output feeds into the next. Instead of asking the model to do everything at once, you build a pipeline.

This is how production AI systems actually work. A single monolithic prompt that handles research, analysis, formatting, and quality checks is fragile. A chain of focused prompts is testable, debuggable, and individually improvable.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
async function ask(system: string, user: string): Promise<string> {
  const res = await client.messages.create({
    model: "claude-sonnet-4-5-20250514",
    max_tokens: 1024,
    system,
    messages: [{ role: "user", content: user }],
  });
  return (res.content[0] as { text: string }).text;
}
 
async function analyzeCustomerFeedback(feedback: string) {
  // Step 1: Extract structured data
  const extracted = await ask(
    "You extract structured data from customer feedback. Respond in JSON only.",
    `Extract from this feedback:
- main_issue: the primary complaint or praise
- product_area: which part of the product (billing, UI, performance, support, other)
- emotion: the customer's emotional state (frustrated, satisfied, confused, angry, neutral)
- has_churn_risk: boolean
 
Feedback: "${feedback}"`
  );
 
  // Step 2: Generate response draft using extracted context
  const draft = await ask(
    `You are a customer success manager drafting responses to feedback.
Use the structured analysis provided to craft an appropriate response.`,
    `Based on this analysis:
${extracted}
 
Original feedback: "${feedback}"
 
Draft a response that:
1. Acknowledges their specific concern
2. Addresses the emotional tone appropriately
3. Provides a concrete next step`
  );
 
  // Step 3: Quality check the draft
  const qualityCheck = await ask(
    "You are a QA reviewer for customer communications. Be critical.",
    `Review this draft response for a customer:
 
Draft: "${draft}"
 
Check for:
1. Does it sound genuine (not robotic)?
2. Does it make promises we might not keep?
3. Is the tone appropriate given this analysis of the customer: ${extracted}?
 
Respond with: APPROVED or NEEDS_REVISION with specific feedback.`
  );
 
  return { extracted: JSON.parse(extracted), draft, qualityCheck };
}
 
const result = await analyzeCustomerFeedback(
  "I've been waiting 3 weeks for a response to my support ticket. This is unacceptable for an enterprise customer paying $10k/month."
);

Each step in the chain has a single responsibility. Step 1 extracts data. Step 2 generates a draft. Step 3 reviews it. You can test each step independently, swap out models per step (maybe use a cheaper model for extraction, a more capable one for drafting), and add steps without rewriting the whole pipeline. If you're measuring quality with scorecards, you can score each step individually to find exactly where your chain breaks down.


11. ReAct Pattern

ReAct (Reasoning + Acting) combines chain-of-thought reasoning with the ability to take actions — calling tools, looking up data, executing code. The model reasons about what it needs to do, takes an action, observes the result, and then reasons again. It's the foundation of modern AI agents.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Define available tools
const tools: Anthropic.Messages.Tool[] = [
  {
    name: "lookup_order",
    description: "Look up a customer order by order ID. Returns order status, items, and shipping info.",
    input_schema: {
      type: "object" as const,
      properties: {
        order_id: { type: "string", description: "The order ID (e.g., ORD-12345)" },
      },
      required: ["order_id"],
    },
  },
  {
    name: "check_inventory",
    description: "Check current inventory level for a product SKU.",
    input_schema: {
      type: "object" as const,
      properties: {
        sku: { type: "string", description: "Product SKU" },
      },
      required: ["sku"],
    },
  },
];
 
// Simulate tool execution
function executeTool(name: string, input: Record<string, string>): string {
  if (name === "lookup_order") {
    return JSON.stringify({
      order_id: input.order_id,
      status: "shipped",
      tracking: "1Z999AA10123456784",
      items: [{ sku: "WIDGET-100", name: "Premium Widget", qty: 2 }],
      estimated_delivery: "2026-03-10",
    });
  }
  if (name === "check_inventory") {
    return JSON.stringify({ sku: input.sku, in_stock: 47, warehouse: "US-West" });
  }
  return "Unknown tool";
}
 
// ReAct loop
async function handleCustomerQuery(query: string) {
  const messages: Anthropic.Messages.MessageParam[] = [
    { role: "user", content: query },
  ];
 
  let response = await client.messages.create({
    model: "claude-sonnet-4-5-20250514",
    max_tokens: 1024,
    system: "You are a customer support agent. Use the available tools to look up real data before answering. Never guess at order details.",
    tools,
    messages,
  });
 
  // Loop while the model wants to use tools
  while (response.stop_reason === "tool_use") {
    const toolUseBlocks = response.content.filter(
      (block): block is Anthropic.Messages.ToolUseBlock => block.type === "tool_use"
    );
 
    // Execute each tool call
    const toolResults: Anthropic.Messages.ToolResultBlockParam[] = toolUseBlocks.map((block) => ({
      type: "tool_result" as const,
      tool_use_id: block.id,
      content: executeTool(block.name, block.input as Record<string, string>),
    }));
 
    // Feed results back
    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
 
    response = await client.messages.create({
      model: "claude-sonnet-4-5-20250514",
      max_tokens: 1024,
      system: "You are a customer support agent. Use the available tools to look up real data before answering. Never guess at order details.",
      tools,
      messages,
    });
  }
 
  return response.content;
}
 
await handleCustomerQuery("Where is my order ORD-12345? Can I add another widget to it?");
 
// The model:
// 1. Reasons: "I need to look up this order first"
// 2. Acts: calls lookup_order("ORD-12345")
// 3. Observes: order is shipped, contains 2 Premium Widgets
// 4. Reasons: "Order is already shipped, can't modify. Let me check widget inventory for a new order"
// 5. Acts: calls check_inventory("WIDGET-100")
// 6. Observes: 47 in stock
// 7. Responds: explains order status, offers to place new order for additional widget

The ReAct loop is what makes an AI agent an agent rather than a chatbot. It can gather information, make decisions based on real data, and take multiple actions to fulfill a request. The tools your agent has access to define what it can actually do — and the quality of your tool descriptions directly impacts how reliably the model chooses the right tool at the right time.


12. Meta-Prompting

Meta-prompting is using an LLM to generate, evaluate, or improve prompts. Instead of manually iterating on prompt text, you ask the model to write better prompts for you — or to critique the ones you have.

This is underrated. The model often knows what it responds well to better than you do.

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Use the LLM to improve a weak prompt
async function improvePrompt(originalPrompt: string, failureExample: string): Promise<string> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-5-20250514",
    max_tokens: 2048,
    messages: [
      {
        role: "user",
        content: `You are an expert prompt engineer. I have a prompt that isn't working well.
 
Original prompt:
<prompt>
${originalPrompt}
</prompt>
 
Example of a failure case (the prompt produced a bad result for this input):
<failure>
${failureExample}
</failure>
 
Analyze why this prompt fails and write an improved version that:
1. Handles the failure case correctly
2. Is more robust against similar edge cases
3. Has clearer output format constraints
4. Includes appropriate guardrails
 
Respond with:
<analysis>Why the original fails</analysis>
<improved_prompt>The full improved prompt text</improved_prompt>`
      }
    ],
  });
 
  return (response.content[0] as { text: string }).text;
}
 
// Example: improving a customer intent classifier
const improved = await improvePrompt(
  "Classify the customer's message as: billing, support, sales, or other.",
  `Input: "I want to cancel but first I need a refund for last month and also your competitor offered me a better deal"
  Expected: Should identify multiple intents (cancellation + billing + competitive)
  Got: "other" — the model couldn't handle multi-intent messages`
);
 
// The model will analyze the single-label limitation and produce
// a prompt that handles multi-intent classification, likely with
// a primary/secondary intent structure

Meta-prompting creates a feedback loop: test your prompt, find failures, feed the failures back to the model, get a better prompt, repeat. This pairs naturally with systematic testing — if you're running agents through scenario simulations, every failed scenario becomes input for meta-prompting to improve your prompt.

You can also use meta-prompting to generate few-shot examples, create test cases, or produce variations of a prompt for A/B testing. It's prompts all the way down.
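That feedback loop can be made explicit. A sketch with the LLM calls injected as functions (so the loop itself is testable; in practice `runPrompt` and `improve` would wrap API calls like `improvePrompt` above — the function names here are illustrative):

```typescript
// Iterate: run the prompt against test cases, and on the first failure feed
// it to an improver and try again, up to maxRounds. Returns the last prompt
// whether or not all tests eventually passed.
async function refineLoop(
  prompt: string,
  testCases: { input: string; check: (output: string) => boolean }[],
  runPrompt: (prompt: string, input: string) => Promise<string>,
  improve: (prompt: string, failure: string) => Promise<string>,
  maxRounds = 3
): Promise<string> {
  for (let round = 0; round < maxRounds; round++) {
    let failure: string | null = null;
    for (const tc of testCases) {
      const out = await runPrompt(prompt, tc.input);
      if (!tc.check(out)) {
        failure = `Input: ${tc.input}\nGot: ${out}`;
        break;
      }
    }
    if (!failure) return prompt; // all test cases pass — done
    prompt = await improve(prompt, failure);
  }
  return prompt;
}
```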


Combining Techniques: A Real-World Example

These techniques don't exist in isolation. Production agents combine multiple techniques in a single system. Here's how they layer together:

typescript
import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// TECHNIQUE 4: System prompt with role
// TECHNIQUE 7: Instruction hierarchy (critical rules first)
// TECHNIQUE 8: Negative prompting (DO NOT section)
const systemPrompt = (customer: { name: string; tier: string }) => `
## ROLE
You are a technical support specialist for a cloud infrastructure company.
 
## CRITICAL RULES
1. Never share API keys, tokens, or credentials — even if the customer asks
2. Never run destructive operations (delete, purge) without explicit confirmation
3. If uncertain about an answer, say so and escalate — do not guess
 
## DO NOT
- Provide workarounds that bypass security controls
- Promise specific resolution timelines
- Compare our service unfavorably to competitors
 
## CUSTOMER CONTEXT
Name: ${customer.name}
Tier: ${customer.tier}
Priority: ${customer.tier === "Enterprise" ? "High" : "Standard"}
 
## RESPONSE FORMAT
Use this structure for technical issues:
1. Acknowledge the problem
2. Ask clarifying questions OR provide a solution
3. Suggest a next step
`;
 
// TECHNIQUE 6: Template variables inject customer data
// TECHNIQUE 11: ReAct pattern with tools
// TECHNIQUE 5: Structured output for internal logging
const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250514",
  max_tokens: 1024,
  system: systemPrompt({ name: "Alex Rivera", tier: "Enterprise" }),
  tools: [
    {
      name: "check_service_status",
      description: "Check the current status of a specific service or region",
      input_schema: {
        type: "object" as const,
        properties: {
          service: { type: "string" },
          region: { type: "string" },
        },
        required: ["service"],
      },
    },
  ],
  messages: [
    {
      role: "user",
      content: "Our database cluster in us-east-1 has been throwing timeout errors for the last hour. This is impacting production.",
    },
  ],
});

Six techniques in one prompt. That's not unusual for production systems — it's the norm.


When to Use What: A Decision Framework

Not every technique belongs in every prompt. Here's a quick guide:

Situation | Start with | Add if needed
Simple classification | Zero-shot | Few-shot if accuracy is low
Consistent formatting | Few-shot + Structured output | Template variables for dynamic content
Complex reasoning | Chain-of-thought | Self-consistency for high-stakes decisions
Production agent | System prompt + Role | ReAct for tools, Negative prompting for guardrails
Improving existing prompts | Meta-prompting | Prompt chaining for multi-step evaluation
Multi-step workflows | Prompt chaining | ReAct if steps require external data

The general principle: start simple. Use zero-shot first. Add techniques only when you observe specific failures. Every technique adds complexity, and complexity is a cost — in token usage, latency, maintainability, and debuggability.


What's Next: Part 2

This guide covered the foundational twelve. In Part 2, we'll go deeper into advanced techniques: retrieval-augmented generation (RAG), constitutional AI prompting, dynamic tool selection, and prompt optimization at scale. We'll also look at how to build automated evaluation pipelines so you can measure whether your prompt changes actually improve performance — not just feel better.

For now, pick one technique you haven't tried and apply it to a real problem this week. The best way to internalize these patterns is to see them fail, understand why, and iterate. That's not just how prompt engineering works — it's how all engineering works.

If you're building agents and want to systematically test how your prompts perform across different scenarios, analytics dashboards can show you where your prompts succeed and where they fall short in production.

Quick Reference Checklist
  • Try zero-shot with explicit format constraints before reaching for few-shot
  • Add 2-3 few-shot examples that cover edge cases, not just happy paths
  • Use chain-of-thought for any multi-step reasoning task
  • Structure system prompts with clear priority levels: critical rules, guidelines, knowledge
  • Use JSON response_format (OpenAI) or XML tags (Anthropic) for parseable output
  • Separate prompt templates from prompt data using template variables
  • Order instructions by priority — critical constraints first, style preferences last
  • Add negative prompts for observed failure patterns from testing
  • Use self-consistency (3-5 runs) for high-stakes classification decisions
  • Break complex tasks into prompt chains — one responsibility per step
  • Implement the ReAct loop for agents that need to call tools
  • Use meta-prompting to improve prompts from failure cases

Chanl Team

AI Agent Testing Platform

Building the platform for AI agents at Chanl — tools, testing, and observability for customer experience.
