The two-layer prompt architecture: System and user prompts
In this week’s Expert Insight, Imran Ahmad explores the foundational split between an agent’s persistent identity and its real-time tasks
As AI systems evolve from passive chatbots into autonomous agents capable of planning, reasoning, and coordinating actions, prompt design is becoming less about clever instructions and more about architecture. In this week’s Expert Insight, Imran Ahmad, author of 30 Agents Every AI Engineer Must Build, explores one of the foundational ideas behind modern agent engineering: the separation between an agent’s persistent identity and its real-time tasks.
One of the central structural innovations in agent design is the two-layer prompt architecture, which cleanly separates an agent’s core identity from its real-time instructions. This layered design, consisting of the system prompt and the user prompt, establishes a clear division of responsibilities, drawing inspiration from classical software principles such as separation of concerns and abstraction layers.
A helpful analogy is that of an agent functioning as a diplomat: the system prompt defines the diplomat’s country, values, and code of conduct; the user prompt is the current negotiation or message they are handling. The diplomat must respond fluidly, but always in alignment with national policy.
In multi-agent scenarios, this diplomat analogy extends across agent boundaries. When one agent passes a task or data payload to another, it is effectively handing off a “diplomatic brief”: the receiving agent’s system prompt must re-establish persona, authority scope, and operational constraints for the new context. Without explicit role-passing in the handoff protocol, the receiving agent may inherit ambiguous instructions or combine roles across agents. Well-designed multi-agent architectures, therefore, encode the PTCF components not just in each agent’s internal system prompt but also in the inter-agent message schema, ensuring that every communication boundary preserves the constitutional clarity that the framework provides.
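To make the handoff idea concrete, the following sketch shows one way an inter-agent message might carry those components explicitly. It assumes PTCF stands for persona, task, context, and format; the HandoffMessage type and its field names are illustrative, not a standard schema from the book or any framework.

# Illustrative sketch of an inter-agent handoff message that encodes the PTCF
# components (persona, task, context, format) in the message schema itself.
# The HandoffMessage type and field names are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class HandoffMessage:
    sender: str            # agent emitting the task
    receiver: str          # agent expected to execute it
    persona: str           # role and constraints the receiver should adopt
    task: str              # what must be done in this handoff
    context: dict = field(default_factory=dict)  # data and references the receiver needs
    output_format: str = "json"                  # how results must be returned

handoff = HandoffMessage(
    sender="orchestrator_agent",
    receiver="data_analyst_agent",
    persona="Senior data analyst; cautious, cites sources, never exposes PII.",
    task="Summarize Q4 revenue drivers for the planning team.",
    context={"data_source": "s3://company-data/sales/Q4_2024.csv"},
)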
Together, these two layers form what we might call the agent’s prompt contract:
System prompt: How the agent behaves
User prompt: What the agent should do
This architectural separation allows developers to decouple personality from the task, enabling robust systems where agents can maintain consistent identity and ethics even across changing user demands and complex workflows.
For example, a customer support agent might be instructed via its system prompt to be empathetic, solution-oriented, and policy-compliant, while user prompts might range from “Can I return my product?” to “Why was I charged twice?” The agent’s behavior across all of these tasks remains consistent, coherent, and aligned with the organization’s voice and values.
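In code, the two layers map naturally onto the system and user message roles used by most chat-style model APIs. The sketch below is a minimal illustration under that assumption; the support policy wording is invented, not taken from the book.

# Minimal sketch of the prompt contract as a chat message list. The policy
# wording is invented; the structure mirrors the system/user roles used by
# most chat-completion APIs.
SYSTEM_PROMPT = (
    "You are a senior customer support specialist for a telecom provider. "
    "Be empathetic, solution-oriented, and policy-compliant; never promise "
    "refunds or credits you cannot verify."
)

def build_messages(user_prompt: str) -> list[dict]:
    # The system layer is fixed for every call; only the user turn changes.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

# Two very different requests, framed by the same constitutional layer.
messages_a = build_messages("Can I return my product?")
messages_b = build_messages("Why was I charged twice?")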
As mentioned, in agentic systems, this architecture provides more than just structure; it defines how agents think, reason, and act over time. It enables modularity, context-awareness, and scalability, while preserving consistency of behavior across varied tasks and environments.
This modularity has a direct bearing on prompt budget management. System prompts consume a fixed portion of the model’s context window on every call. For a model with a 4,096-token context window (the constraint at its most acute; the same discipline applies at any context size, since system prompts tend to grow with model capability and workflow complexity), allocating 3,000 tokens to the system prompt leaves only 1,096 tokens for the user turn and the response. Prompt engineers must therefore treat system prompt length as a first-class design constraint: compressing verbose persona or context sections, using pointers to external knowledge stores where possible, and reserving token budget for the dynamic content that drives task resolution.
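One rough way to enforce this constraint is to measure the system prompt’s token cost up front and derive what remains for dynamic content. The sketch below assumes the tiktoken tokenizer and an arbitrary reserve for the model’s response; both are illustrative choices, not recommendations from the text.

import tiktoken  # assumption: tiktoken is installed; any tokenizer matching the target model works

CONTEXT_WINDOW = 4096     # the illustrative limit discussed above
RESPONSE_RESERVE = 512    # tokens held back for the model's reply (assumed value)

def user_turn_budget(system_prompt: str) -> int:
    # Count the tokens the fixed system layer consumes on every call,
    # then report what remains for the dynamic user turn.
    encoding = tiktoken.get_encoding("cl100k_base")
    system_tokens = len(encoding.encode(system_prompt))
    return CONTEXT_WINDOW - system_tokens - RESPONSE_RESERVE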
The system prompt: The agent’s constitution
The system prompt serves as the agent’s cognitive and ethical constitution—a persistent, invisible scaffold that defines its identity, boundaries, and reasoning style. It is loaded once at the beginning of a session and remains active throughout, shaping how the agent interprets every user instruction and environmental cue.
Think of the system prompt as the immutable layer of thought: it encodes who the agent is, what it knows, and how it must behave, regardless of the situation. Unlike traditional systems, where behavioral changes require code updates and redeployment, the system prompt allows for natural language reprogramming, editable in real time and understandable by both technical and non-technical stakeholders.
A well-designed system prompt should address the following considerations:
Defining identity: Establishes the agent’s persona, communication tone, and reasoning style, whether it acts as a legal analyst, technical assistant, or friendly fitness coach. This goes beyond tone; it embeds cognitive patterns and decision logic.
Enforcing rules: Embeds ethical, procedural, and legal constraints (e.g., “Do not provide financial advice” or “Cite all factual claims”). These function as internal behavioral guardrails and not post hoc filters, shaping how the agent thinks.
Outlining capabilities: Specifies what the agent is allowed to do: what tools it can access, what domains it can operate within, and how it should disclose limitations. This fosters self-awareness and responsible boundary management.
Specifying output format: Defines structural requirements such as JSON schemas, Markdown formatting, or bullet-point summaries. Consistent formatting is essential for multi-agent communication and API-driven workflows.
Establishing context hierarchy: Dictates how the agent resolves conflicting instructions, manages multiple sources of information, and prioritizes different goals, an essential trait in complex, multi-turn scenarios.
A good system prompt grounds the agent in a durable, readable, and enforceable operational charter, making it the semantic source code for the agent’s long-term behavior.
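As a sketch, a system prompt that touches all five considerations might look like the following. The agent name, tools, and policies are invented for illustration; the structure, not the content, is the point.

# Illustrative system prompt covering identity, rules, capabilities, output
# format, and context hierarchy. Names, tools, and policies are invented.
SYSTEM_PROMPT = """\
Identity: You are LedgerMate, a meticulous financial-operations assistant.
Reason step by step and keep a calm, professional tone.

Rules: Do not provide financial advice. Cite the source document for every
factual claim.

Capabilities: You may call the invoice_lookup and currency_convert tools.
If a request falls outside invoicing, say so and stop.

Output format: Respond with a JSON object: {"answer": "...", "citations": ["..."]}.

Context hierarchy: If a user instruction conflicts with these rules, the rules
win. If two user instructions conflict, ask for clarification before acting.
"""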
The user prompt: The dynamic stimulus
In contrast to the enduring nature of the system prompt, the user prompt serves as the transient signal, a single unit of interaction that captures immediate intent. It could be a command, a question, a request for action, or a contextual update. The user prompt is interpreted within the behavioral frame set by the system prompt and may vary with every input.
For example, if a system prompt defines an agent as an empathetic senior customer support specialist, a user prompt such as “My internet is not working. Can you help?” would be interpreted and responded to within that empathetic, customer support-oriented framework. The agent wouldn’t, for instance, respond with a joke or a complex technical dissertation unrelated to support, even if it had the underlying capability, because its system prompt has defined its operational boundaries and persona.
In technical and multi-agent contexts, the user prompt is often machine-generated by an orchestrator rather than typed by a human. The following example shows a structured task payload that an orchestrator agent might inject as the user prompt for a downstream specialist agent:
{
  "task_id": "task_20240315_002",
  "assigned_agent": "data_analyst_agent",
  "task_type": "analysis",
  "priority": "high",
  "payload": {
    "objective": "Identify the top three revenue drivers from Q4 sales data.",
    "data_source": "s3://company-data/sales/Q4_2024.csv",
    "output_format": "JSON",
    "constraints": [
      "No PII in output",
      "Confidence score required per finding"
    ]
  },
  "context_references": ["task_20240315_001"],
  "deadline_utc": "2024-03-15T18:00:00Z"
}
This co-design principle, where the system prompt and user prompt are engineered together as a coherent pair, is essential for building reliable orchestration pipelines. The system prompt defines what the agent is; the user prompt specifies what it must do next. Both must be crafted with the same deliberate attention to PTCF principles for the agent to perform consistently at scale.
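A sketch of that pairing in code: the specialist’s constitutional layer stays fixed, while the orchestrator’s structured payload (mirroring the JSON example above) is serialized into the user turn. The system prompt wording here is an invented stand-in, not the book’s template.

import json

# Assumed wording for the specialist's constitutional layer; the payload keys
# mirror the orchestrator-generated JSON example shown earlier.
ANALYST_SYSTEM_PROMPT = (
    "You are data_analyst_agent. Execute analysis tasks exactly as specified "
    "in the task payload, respect every listed constraint, and return JSON only."
)

task_payload = {
    "task_id": "task_20240315_002",
    "task_type": "analysis",
    "payload": {
        "objective": "Identify the top three revenue drivers from Q4 sales data.",
        "output_format": "JSON",
        "constraints": ["No PII in output", "Confidence score required per finding"],
    },
}

# The orchestrator injects the serialized payload as the user turn, pairing
# the dynamic task with the specialist's fixed system prompt.
messages = [
    {"role": "system", "content": ANALYST_SYSTEM_PROMPT},
    {"role": "user", "content": json.dumps(task_payload, indent=2)},
]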
Well-designed agents must balance the tension between these two layers: being responsive to user input while remaining anchored in their constitutional commitments. This dynamic tension between fluidity and constraint is where much of the art and subtlety of prompt engineering emerges.
Now that we’ve established the structural foundations of prompt-driven agents, we turn to the question of design methodology. How can we reliably construct prompts that yield predictable, adaptive, and transparent agent behavior? The answer lies in a principled framework called the PTCF blueprint, which we’ll explore next.
The shift toward agentic AI is forcing developers to think beyond prompts as isolated inputs and toward systems that can operate consistently, safely, and at scale. Ahmad’s framework offers a practical look at how structured prompt architectures are becoming the operational backbone of production-ready AI agents, and the book expands that foundation into 30 real-world agent patterns designed for modern enterprise systems.