AegisAgent — Deep Agent Workflow Design (June 2026 reset)

Product: AegisAgent
Category: Agent Action Integrity → Integrity-anchored Agent SOC
Version: v0.3 (re-anchored on the integrity-anchored Agent SOC)
Date: 2026-06-05
Read first: AegisAgent_Gap_Reassessment_2026-06.md
SOC architecture: AegisAgent_Agent_SOC_Design.md

⚠️ Reset note (two layers). v0.1 designed the generic "every action passes a runtime decision point" workflow — now commodity. v0.2 rebuilt the Approval workflow (§7) around frozen-action hashing + fail-closed SDK, the Authorization workflow (§6) around deterministic provenance, and added verifiable receipts to audit (§9). v0.3 adds Workflow 7 (§10): the async SOC pipeline — every decision emits an event that detection/correlation/response consume out-of-band. The action path is never slowed; the SOC rides the receipt + provenance spine.

1. Design thesis

Every meaningful agent action passes a runtime decision point — the decision is provable (executed action cryptographically bound to the human approval), provenance-gated (untrusted source cannot drive a privileged action), and observable (the decision streams asynchronously into a SOC that detects, correlates, contains, and proves).

Baseline interception is necessary but no longer differentiating. AegisAgent's workflow engine adds three guarantees: integrity (approved == executed), deterministic provenance (untrusted source can't drive a privileged action), and operability (a SOC on the resulting tamper-evident evidence).

2. Core question (per risky action)

Should THIS agent perform THIS action on THIS resource, via THIS tool, under THIS source-trust, right now —
if a human approves, can we prove the executed bytes are exactly the approved bytes —
and can the SOC later correlate and PROVE what happened across the whole run?

3. The seven workflows

1. Agent registration
2. Tool / MCP registration (+ manifest pinning)
3. Runtime authorization (provenance-gated) — §6
4. Human approval (frozen-action integrity) — §7
5. Memory / RAG trust (provenance + receipts; later) — §8
6. Audit & investigation (verifiable receipts) — §9
7. SOC detection / correlation / response (async, on the receipt stream) — §10

4. End-to-end flow

INLINE (sync)                                          ASYNC SOC (out-of-band)
User / App
   v
AI Agent Runtime (LangGraph / OpenAI Agents SDK / CrewAI / AutoGen / custom)
   v
AegisAgent SDK  ── canonicalize action -> action_hash; FAIL CLOSED on mismatch before execute
   v
AegisAgent Gateway
   ├─ Identity Resolver
   ├─ Trust-Provenance Gate     (deterministic 6-level label -> Cedar context)
   ├─ Policy Engine (Cedar)     (action_hash + source_trust native inputs)
   ├─ Risk Engine               (enrich/route for display; never overrides forbid, never gates)
   ├─ Approval Integrity Engine (freeze -> hash -> bind -> single-use -> verify)
   ├─ Token Broker              (agents never hold raw tool creds)
   ├─ Receipt + Audit Writer    (hash-chained verifiable receipts)
   └─ Event emitter ───mpsc───► Normalize → Detect → Correlate → Alert
   v                                  → { Respond (freeze/revoke/quarantine), Notify, Index, RCA (LLM, sandboxed) }
External Tool / MCP Server               → SOC Console (provable incident timelines)

The SDK is part of the trust boundary: the final fail-closed action_hash check happens there. The event emitter is fire-and-forget: SOC work never blocks the action path (Design Law 3).
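The fire-and-forget contract can be sketched with a bounded, non-blocking queue. This is a minimal sketch under stated assumptions, not the gateway's emitter: the class name `EventEmitter`, the `dropped` counter, and the buffer size are hypothetical, and Python's `queue.Queue` stands in for the mpsc channel shown in the diagram.

```python
import queue


class EventEmitter:
    """Hypothetical emitter: the action path never waits on the SOC plane."""

    def __init__(self, maxsize=1024):
        self._queue = queue.Queue(maxsize=maxsize)
        self.dropped = 0  # stand-in for a real drop metric

    def emit(self, event):
        """Enqueue without blocking; if the buffer is full, drop and count."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Consumer side (SOC pipeline): take everything currently buffered."""
        events = []
        while True:
            try:
                events.append(self._queue.get_nowait())
            except queue.Empty:
                return events
```

Because `emit` returns immediately whether or not the event was buffered, a stalled or absent consumer costs monitoring coverage rather than action latency.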

5. Workflows 1–2: registration & manifest pinning (table stakes)

Agent registration: identity, owner, environment, framework, model, risk tier, status (active/disabled/frozen/quarantined); issue tenant-scoped token.
Tool/MCP registration: register tools + per-action risk/mutation flags; register MCP servers, discover + pin + hash manifests; deny unknown tools by default. Manifest drift downgrades provenance to unknown/malicious_suspected, feeds §6, and raises a SOC drift detection (AEG-4002).
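Manifest pinning and drift detection can be illustrated as follows. This is a sketch only: the function names, the sorted-key JSON canonical form, and the choice of `unknown` (rather than `malicious_suspected`) as the downgraded label are assumptions made for the example.

```python
import hashlib
import json


def manifest_hash(manifest):
    """Hash a manifest over an assumed canonical form (sorted keys, no whitespace)."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_manifest(pinned_hash, current_manifest):
    """Compare a freshly discovered manifest against the hash pinned at registration."""
    if manifest_hash(current_manifest) == pinned_hash:
        return {"drift": False, "trust_level": None, "detection": None}
    # Drift: downgrade provenance and raise the drift detection named in the text.
    return {"drift": True, "trust_level": "unknown", "detection": "AEG-4002"}
```

Key order does not affect the pinned hash, so a server that merely re-serializes its manifest does not trigger drift; any change to a tool name or flag does.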

6. Workflow 3 — Runtime authorization (provenance-gated)

1. Agent proposes tool call.
2. SDK canonicalizes {tool, action, resource, parameters} -> action_hash.
3. SDK -> POST /v1/authorize (with run's source_trust label).
4. Gateway resolves tenant/agent/user; Trust-Provenance Gate sets context.trust_level
   = lowest trust level of any content consumed in the run.
5. Cedar evaluates with trust_level + mutates_state + resource + environment:
      - mutating + untrusted_external/malicious_suspected/unknown -> DENY (deterministic forbid)
      - mutating + semi_trusted_customer                          -> REQUIRE_APPROVAL
      - read-only / trusted                                       -> ALLOW
6. Risk Engine enriches/routes for display (cannot override a forbid; never gates).
7. Decision + action_hash + source_trust returned; receipt written; Agent Security Event (ASE) EMITTED (async).
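Step 2 can be sketched as below. The canonical form (sorted-key, whitespace-free JSON) and the use of SHA-256 for action_hash are assumptions for illustration; the document does not specify the canonicalization scheme, and a production system would likely want a formally specified one.

```python
import hashlib
import json


def canonicalize(tool, action, resource, parameters):
    """Serialize the four fields deterministically (assumed canonical form)."""
    body = {
        "tool": tool,
        "action": action,
        "resource": resource,
        "parameters": parameters,
    }
    text = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def action_hash(tool, action, resource, parameters):
    """Hex SHA-256 over the canonical bytes of the proposed action."""
    return hashlib.sha256(canonicalize(tool, action, resource, parameters)).hexdigest()
```

Two calls that differ only in parameter key order hash identically, while any change to the tool, action, resource, or a parameter value yields a different hash.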

Determinism rule: a classifier or SOC anomaly score may lower the trust label (tighten) but never raise it, and never re-open a deterministic gate (Design Law 1). The deny for "mutating + untrusted" is not overridable by a "looks benign" score.
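The "lowest label wins" computation from step 4 and the tighten-only rule can be sketched together. The ordering below is hypothetical: it covers only the labels named in this document, the relative rank of the three lowest labels is a guess, and `trusted` is an assumed label name.

```python
# Hypothetical ordering, least trusted first (not the full 6-level scheme).
TRUST_ORDER = [
    "malicious_suspected",
    "untrusted_external",
    "unknown",
    "semi_trusted_customer",
    "trusted",
]
RANK = {label: i for i, label in enumerate(TRUST_ORDER)}


def run_trust_level(labels):
    """Lowest trust level of any content consumed in the run."""
    labels = list(labels)
    if not labels:
        raise ValueError("run has no labelled content")
    return min(labels, key=RANK.__getitem__)


def apply_score_suggestion(current, suggested):
    """A classifier or SOC score may lower (tighten) the label, never raise it."""
    return suggested if RANK[suggested] < RANK[current] else current
```

A "looks benign" suggestion on an `untrusted_external` run is ignored, so the deterministic deny stays closed.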

// Deterministic gate: forbid any state-mutating tool call once the run has consumed low-trust content.
forbid (principal, action == Action::"tool_call", resource)
when {
  context.mutates_state == true &&
  (context.trust_level == "untrusted_external" || context.trust_level == "malicious_suspected" || context.trust_level == "unknown")
};
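For unit-testing expectations outside Cedar, the step 5 decision table can be mirrored in a few lines. This is an illustrative mirror, not the policy engine: `decide` is a hypothetical helper, `trusted` is an assumed label name, and denying unrecognized labels on mutating actions is an assumption in the fail-closed spirit of the design.

```python
DENY_LEVELS = {"untrusted_external", "malicious_suspected", "unknown"}
APPROVAL_LEVELS = {"semi_trusted_customer"}
ALLOW_LEVELS = {"trusted"}  # assumed label name


def decide(trust_level, mutates_state):
    """Mirror of the step 5 table for test fixtures."""
    if not mutates_state:
        return "ALLOW"  # read-only actions are allowed at any trust level
    if trust_level in DENY_LEVELS:
        return "DENY"  # deterministic forbid, never overridable by a score
    if trust_level in APPROVAL_LEVELS:
        return "REQUIRE_APPROVAL"
    if trust_level in ALLOW_LEVELS:
        return "ALLOW"
    return "DENY"  # assumption: unrecognized labels fail closed
```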

7. Workflow 4 — Human approval (frozen-action integrity) — centerpiece

1. Decision = require_approval.
2. Approval Integrity Engine FREEZES the exact canonical action and stores:
      { action_hash, canonical_action, approver_group, expires_at, consumed_at=NULL }.
3. Agent execution pauses (LangGraph interrupt / OpenAI HITL / SDK block).
4. Slack/Teams/dashboard card renders the CANONICAL action + source_trust label,
   so the human approves exactly what will run. (Callback signature verified; approver role checked.)
5. Human decision:
      APPROVE -> approval bound to action_hash + approver identity + timestamp
      EDIT    -> edited params -> NEW canonical action -> NEW action_hash -> RE-EVALUATE (fresh decision)
      REJECT / ESCALATE / EXPIRE -> action never runs
6. SDK resumes: re-fetch approval -> CONSUME (single-use, atomic) -> recompute hash(about_to_run).
      hash == approved action_hash AND status == approved AND not expired AND consume succeeded
          -> execute via Token Broker proxy
      else -> FAIL CLOSED + emit tamper-attempt receipt.
7. Verifiable receipt written (hash-chained); ASE EMITTED on every transition (created/approved/
   edited/rejected/consumed) so the SOC sees the full approval lifecycle.

Approval state machine: CREATED → NOTIFIED → {APPROVED→CONSUMED | EDITED→re-evaluate | REJECTED | ESCALATED | EXPIRED | CANCELLED}. Single-use (consumed_at), time-boxed (expires_at), replay-checked.
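The state machine and the step 6 resume check can be sketched in-process as follows. The class and function names are hypothetical, and an in-memory object cannot provide the atomic consume the design calls for; a real store would need a compare-and-set on consumed_at. Following the step order above, the approval is consumed before the hash comparison, so a mismatched attempt also burns the approval.

```python
import hmac

# Transitions exactly as listed in the state machine above.
TRANSITIONS = {
    "CREATED": {"NOTIFIED"},
    "NOTIFIED": {"APPROVED", "EDITED", "REJECTED", "ESCALATED", "EXPIRED", "CANCELLED"},
    "APPROVED": {"CONSUMED"},
}


class Approval:
    def __init__(self, action_hash, expires_at):
        self.action_hash = action_hash
        self.expires_at = expires_at
        self.status = "CREATED"
        self.consumed_at = None

    def transition(self, new_status):
        if new_status not in TRANSITIONS.get(self.status, set()):
            raise ValueError("illegal transition %s -> %s" % (self.status, new_status))
        self.status = new_status


def verify_and_consume(approval, about_to_run_hash, now):
    """Return True only if the action may execute; every other path fails closed."""
    if approval.status != "APPROVED":
        return False  # covers rejected, expired, and already-consumed (replay)
    if now >= approval.expires_at:
        return False  # time-boxed
    approval.transition("CONSUMED")  # single-use: consumed before the hash check
    approval.consumed_at = now
    # Constant-time comparison of the approved hash and the recomputed hash.
    return hmac.compare_digest(approval.action_hash, about_to_run_hash)
```

A caller would execute through the Token Broker proxy only on `True`, and emit a tamper-attempt receipt on `False`.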

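Threats this workflow closes (and the SOC detections they raise): approve-then-swap, caught as a hash mismatch (T-A1, high-severity detection); post-approval parameter tampering (T-A2); replay/reuse, rejected with a 409 on consume (T-A3); and render-vs-bytes divergence, where the approval card shows something other than the bytes that execute (T-A4).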

ATTACK: agent gets "comment_on_pr" approved (hash Ha), then tries "merge_pull_request" (hash Hb).
RESULT: SDK recomputes Hb != Ha -> FAIL CLOSED. Nothing runs. Receipt records the tamper attempt;
        the SOC raises a T-A1 detection and (if repeated) freezes the agent.

8. Workflow 5 — Memory / RAG trust (provenance + receipts; later)

Apply the same primitives to memory writes and RAG ingestion (AgentPoison/PoisonedRAG class): label each write/retrieval with source trust; block memory writes from untrusted sources unless approved; require provenance + receipts for knowledge-base updates. Reuses §6 provenance and §9 receipts — and emits ASEs the SOC correlates (e.g., poisoned-memory-write → later privileged action).

9. Workflow 6 — Audit & investigation (verifiable receipts)

Every protected action emits a hash-chained verifiable receipt (receipt_hash = SHA-256(canonicalize(body incl. prev_receipt_hash))), not just a log line. Investigation timeline reconstructs: run start → content consumed + source-trust labels → proposed actions + action_hash → policy/provenance decisions → approval (approver, bound hash) → executed result → receipt chain.

GET /v1/receipts/:id/verify recomputes the chain and returns verified | tampered. Receipts export via OTel/webhook and serve as evidence for SOC 2 and for Article 14 (human oversight) of the EU AI Act. This receipt chain is also the SOC's evidence spine (Workflow 7): every alert and incident references the receipt_hash links covering its events, which makes SOC incident timelines provable rather than merely logged (see action-receipt-spec.md §7).
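The chaining formula and the verify operation can be sketched as below. The receipt layout (`body`, `payload`), the all-zero genesis value, and the sorted-key JSON canonical form are assumptions for the example; only the formula receipt_hash = SHA-256(canonicalize(body incl. prev_receipt_hash)) comes from the text.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_receipt_hash for the first receipt


def receipt_hash(body):
    """SHA-256 over the canonical body, which includes prev_receipt_hash."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_receipt(chain, payload):
    """Link a new receipt to the previous one and append it to the chain."""
    prev = chain[-1]["receipt_hash"] if chain else GENESIS
    body = {"payload": payload, "prev_receipt_hash": prev}
    receipt = {"body": body, "receipt_hash": receipt_hash(body)}
    chain.append(receipt)
    return receipt


def verify_chain(chain):
    """Recompute every link; any edited body or broken link is reported."""
    prev = GENESIS
    for receipt in chain:
        body = receipt["body"]
        if body["prev_receipt_hash"] != prev:
            return "tampered"
        if receipt_hash(body) != receipt["receipt_hash"]:
            return "tampered"
        prev = receipt["receipt_hash"]
    return "verified"
```

Editing any earlier receipt changes its recomputed hash, which no longer matches the stored value, so a single altered field surfaces as tampered.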

10. Workflow 7 — SOC detection / correlation / response (async) — new

The decision (§6/§7) emits an Agent Security Event (ASE); the SOC consumes it out-of-band. All detection is deterministic; the only LLM in the pipeline narrates incidents after they close (Design Laws 1–2).

1. ASE on the bus -> Normalizer reshapes + enriches (data_access, destination, manifest_hash).
2. Atomic rules match a single event (e.g. AEG-1002 confused-deputy-mutation, level 12, ATLAS AML.T0051).
3. Correlation updates per-(agent_id, run_id) windows:
      - frequency: AEG-2010 deny-storm (5 denies / 60s) -> throttle + alert
      - sequence : AEG-3007 read-sensitive -> external-write (within 300s) -> exfil incident
4. On match: build alert (level, ATLAS/OWASP tags) + open/append an incident with evidence_receipts[]
   (the receipt_hash chain covering the events -> PROVABLE timeline).
5. Response Engine maps verdict -> deterministic action via the gateway control API:
      freeze_agent | revoke_token | quarantine_mcp_server | notify_slack | open_incident
   (tenant-scoped, fail-closed, reversible, audited; agents.status honored on the next action.)
6. On incident close: the sandboxed RCA narrator (LLM) drafts a human-readable summary from inert evidence.
7. SOC Console renders the live feed + the provable incident timeline (one-click receipt-chain verify).
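The two correlation rules in step 3 can be sketched with per-(agent_id, run_id) state. The event field names (`ts`, `decision`, `kind`), the class name, and the window-boundary handling are assumptions; the rule IDs and thresholds come from the text.

```python
from collections import defaultdict, deque


class Correlator:
    """Hypothetical deterministic correlator keyed by (agent_id, run_id)."""

    def __init__(self):
        self._denies = defaultdict(deque)  # key -> deny timestamps in window
        self._last_sensitive_read = {}     # key -> timestamp of last sensitive read

    def observe(self, event):
        """Update the windows for one event and return any rule IDs that fired."""
        key = (event["agent_id"], event["run_id"])
        ts = event["ts"]
        alerts = []

        # Frequency rule: AEG-2010 deny-storm, 5 denies within 60 seconds.
        if event.get("decision") == "deny":
            window = self._denies[key]
            window.append(ts)
            while window and ts - window[0] > 60:
                window.popleft()
            if len(window) >= 5:
                alerts.append("AEG-2010")
                window.clear()  # start a fresh window after firing

        # Sequence rule: AEG-3007 read-sensitive then external-write within 300s.
        if event.get("kind") == "read_sensitive":
            self._last_sensitive_read[key] = ts
        elif event.get("kind") == "external_write":
            read_ts = self._last_sensitive_read.get(key)
            if read_ts is not None and 0 <= ts - read_ts <= 300:
                alerts.append("AEG-3007")

        return alerts
```

Both rules are pure functions of the event stream, so replaying the same events reproduces the same alerts, which is what lets an incident timeline be re-derived from the receipts it cites.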

Automation levels (graduated): L0 observe → L1 auto-enrich → L2 auto-triage/incident → L3 auto-contain safe/reversible actions (high-risk → approval, critical → deny) → L4 autonomous with human supervision. Reversible, low-blast-radius containment (deny, require-approval, throttle, freeze) may auto-fire early; destructive actions stay human-gated.

11. Reliability / fail-closed behavior

Failure condition	Behavior
Gateway/policy unreachable	SDK fails closed for mutating/high-risk actions; read-only actions may fail open only if explicitly configured
Approval channel down	Approval stays pending; fall back to the dashboard; auto-deny on timeout
Receipt/audit pipeline down	Critical actions block until a receipt can be written; low-risk actions buffer and retry
action_hash mismatch (any cause)	FAIL CLOSED: never execute; record the tamper attempt; raise a SOC detection
SOC plane down	Action path UNAFFECTED (async by construction); events buffer or drop with a metric; monitoring degrades, never the action path
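The first row of the table can be sketched as an SDK-side wrapper. Everything named here is hypothetical (`GatewayUnavailable`, `authorize_or_fail_closed`, the `read_only_fail_open` flag); the sketch only shows the shape of the rule that fail-open is an explicit opt-in restricted to read-only, non-high-risk actions.

```python
class GatewayUnavailable(Exception):
    """Raised by the (hypothetical) gateway client when no decision is reachable."""


def unreachable_gateway():
    """Stand-in for a gateway call that times out."""
    raise GatewayUnavailable("gateway timed out")


def authorize_or_fail_closed(call_gateway, mutates_state, high_risk=False,
                             read_only_fail_open=False):
    """Return the gateway's decision, or a local fail-closed decision if it is down."""
    try:
        return call_gateway()
    except GatewayUnavailable:
        # Fail open only for read-only, non-high-risk actions, and only when
        # the deployment has explicitly opted in.
        if not mutates_state and not high_risk and read_only_fail_open:
            return "ALLOW"
        return "DENY"
```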

12. Framework integration

LangGraph: HITL middleware interrupt on require_approval; resume only after SDK hash verification + consume.
OpenAI Agents SDK: tool guardrail computes action_hash, pauses for human review, verifies before side-effecting execution.
CrewAI / AutoGen: before-tool-call hooks / tool wrappers calling the SDK; same fail-closed contract.
Layer-on: when fronting an existing gateway, AegisAgent consumes that gateway's allow decision and adds freeze→hash→bind→verify + receipt + the SOC event stream.
Agentless: where the SDK can't be installed, ingest existing logs/traces/webhooks → normalize to ASE → same SOC pipeline (no inline enforcement, but full detection + provable evidence).

13. Workflow design principles

Decide close to the action; enforce integrity at the last step (SDK).
Provenance is deterministic; classifiers and scores only tighten, never gate.
Humans approve risk, not routine (risk-based gating).
Every approval binds to exactly one frozen action and is single-use.
Every protected action yields verifiable evidence — the SOC's spine.
Detection is asynchronous and deterministic; the only LLM in the pipeline narrates closed incidents.
Fail closed on any ambiguity for mutating/high-risk actions; SOC failure never fails the action path open.

14. Workflow recommendation

Build the seven workflows, but obsess over Workflow 4 (approval integrity), the provenance gate in Workflow 3, and the async emission that begins Workflow 7:

A pause-and-ask-a-human approval is table stakes. A pause-and-ask-a-human approval that is cryptographically bound to the exact executed action, deterministically gated on source provenance, recorded as a verifiable receipt, and streamed into a deterministic SOC that detects, correlates, contains, and proves — that is AegisAgent.