CTGT — Runtime AI Governance for Multi-Agent Systems
Technical Brief & POC Framework


Active deployment scenarios, technical documentation on CTGT's graph-based policy enforcement and activation steering approaches, and a proposed framework for a joint proof-of-concept engagement.

Prepared for
Cognizant AI Trust & Safety
From
CTGT — ctgt.ai
Classification
Confidential
Date
March 2026
Active Deployments

Where this is already working

CTGT's Policy Engine is in production or under active evaluation at some of the most demanding enterprise environments in regulated industries. Each engagement below represents a distinct governance architecture and use case.

Fortune 15 Global Bank — Second Line of Defense

Optimizing the marginal cost of customization

The bank's senior technical fellow framed CTGT's core value as optimizing the marginal cost of customization for revenue-generating workflows. In personalized client communications and portfolio commentary, deploying dedicated ML engineering teams for each compliance variation is not economically viable. CTGT's policy graph serves as the central control plane: policies are ingested once as natural language documents and enforced consistently across all workflows, regardless of the underlying model.

The firm is actively building internal AI solutions and integrating CTGT as their governance layer. They position the model's own capabilities as their first line of defense, infrastructure as their second, and human oversight as their third. CTGT operates at the second line, ensuring that governance persists even as models and infrastructure change.

Alpha deployment; positioned as second line of defense
Fortune 15 Global Bank — Communications Surveillance

From post-hoc detection to proactive compliance

The wealth management division of this institution is scoped for an alpha evaluation using CTGT to monitor electronic communications in real time. The current system relies on post-hoc surveillance: hundreds of thousands of rules-based triggers that fire after the fact, generating investigations for compliance teams to resolve.

The proposed architecture shifts this to proactive governance. CTGT evaluates communications against SEC, FINRA, and OCC requirements before they are sent, catching violations like unauthorized forward-looking statements, undisclosed conflicts, and prohibited product recommendations at the point of generation rather than after distribution.

Alpha scoped with wealth management division
Global Financial Data & Media Company — Editorial AI Governance

Executive-sponsored evaluation for editorial and AI safety

The CTO and Chief Data Officer of a global financial data and media company have given executive sponsorship to evaluate CTGT for AI safety and editorial standards enforcement across the media division. The firm generates hundreds of articles daily, operates a proprietary terminal platform, and requires on-premises deployment with zero data egress.

The evaluation scope includes multi-article summarization accuracy, attribution and source verification, prevention of speculative language, and PII detection across AI-assisted editorial workflows. Engineering leadership has been designated to drive the technical evaluation.

Executive sponsorship secured; POC scoped with engineering team
Top-3 Global Asset Manager — Enterprise AI Guardrails

Replacing non-deterministic cloud provider defaults

The CDAO and enterprise architecture leadership of a top-3 global asset manager conducted a technical deep-dive into CTGT's platform after their architects identified a fundamental limitation: the default guardrails provided by their primary cloud AI service are non-deterministic, producing inconsistent enforcement results. Their GenAI footprint is expanding rapidly across multiple cloud providers and model families, and they need a governance layer that works consistently regardless of the underlying model or platform.

The firm's architecture team validated CTGT's deterministic approach, its model-agnostic deployment model, and its ability to operate within their existing cloud infrastructure. Their internal Data Defense Office, which maintains a dedicated test bed for guardrail evaluation, has been identified as the validation path.

Architecture validated by CDAO; evaluation path defined through Data Defense Office
Technical Architecture

How the Policy Engine works

The Policy Engine combines two complementary governance mechanisms: graph-based policy enforcement for surgical, rule-level compliance, and activation steering for broad behavioral alignment. The two operate independently or in combination depending on the model type and governance objectives.

The Policy Graph

The core of the system is an immutable knowledge graph that represents an organization's policies, SOPs, regulations, and business logic. When a policy document is uploaded, the engine chunks it into overlapping segments, extracts structured entities and relationships using an LLM via the DSPy framework, and stores the results in two databases: a vector store for semantic search (finding relevant policies by meaning) and a graph database for relationship traversal (understanding how policies connect, conflict, and depend on each other).
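The ingestion flow above can be sketched as follows. This is an illustrative outline only: the LLM-based entity extraction (orchestrated via DSPy in the real system) is stubbed out, and all function names and store shapes are hypothetical.

```python
# Hypothetical sketch of policy ingestion: chunk a document into
# overlapping segments, extract relationship triples (stubbed here in
# place of the LLM/DSPy step), and populate both stores.

def chunk_document(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split a policy document into overlapping segments."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def extract_triples(chunk: str) -> list[tuple[str, str, str]]:
    """Stub for LLM-based extraction; returns (entity, relation, entity) triples.
    The production system performs this with an LLM via DSPy."""
    words = chunk.split()
    return [(words[0], "mentions", chunk[:20])] if words else []

def ingest(text: str, vector_store: list, graph: dict) -> None:
    """Populate the vector store (semantic search) and the graph (relationships)."""
    for chunk in chunk_document(text):
        vector_store.append(chunk)                # indexed for semantic search
        for head, rel, tail in extract_triples(chunk):
            graph.setdefault(head, []).append((rel, tail))  # relationship edges
```

In the real pipeline the vector store would hold embeddings rather than raw text, and the graph database would support typed traversal; the dual-write pattern is the point of the sketch.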

At runtime, every piece of AI-generated content is evaluated against this dual-database system. The vector store identifies which policies are semantically relevant. The graph database traverses the relationships between those policies to detect violations that depend on context: a statement that is compliant in isolation but violates a policy when combined with something said three turns earlier, or an action that is permitted under one policy but excluded by another.
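A minimal sketch of that two-stage check, with keyword overlap standing in for embedding similarity and a toy edge type standing in for real graph traversal (all names and data shapes here are assumptions, not CTGT's API):

```python
# Illustrative dual-database evaluation: first find semantically relevant
# policies, then traverse their relationships against the full conversation
# context to catch violations that are only visible across turns.

def semantically_relevant(content: str, vector_store: dict) -> list[str]:
    """Toy relevance check: keyword overlap stands in for embedding search."""
    words = set(content.lower().split())
    return [pid for pid, text in vector_store.items()
            if words & set(text.lower().split())]

def violates(content: str, policy_ids: list[str], graph: dict,
             history: list[str]) -> list[str]:
    """A policy fires if an excluded term appears in the current output
    OR in an earlier turn -- the context-dependent case described above."""
    context = " ".join(history + [content]).lower()
    hits = []
    for pid in policy_ids:
        for relation, term in graph.get(pid, []):
            if relation == "excludes" and term in context:
                hits.append(pid)
    return hits
```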

When a violation is detected, the system does not simply block the output. It identifies the specific non-compliant segment, isolates it from the rest of the response, and remediates it while preserving the original intent and structure. The end user sees a corrected response. The compliance team sees a complete audit trail with clause-level linkage to the violated policies.
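The remediate-rather-than-block behavior reduces to something like the following sketch (the violation record fields are hypothetical; in production the replacement text would itself be generated and re-validated):

```python
# Illustrative surgical remediation: replace only the offending span,
# preserve the rest of the response, and emit a clause-level audit entry.

def remediate(response: str, violations: list[dict]) -> tuple[str, list[dict]]:
    """Return the corrected response and an audit trail linking each
    correction to the specific policy clause it violated."""
    audit = []
    for v in violations:
        span, fix, clause = v["span"], v["replacement"], v["clause"]
        if span in response:
            response = response.replace(span, fix)
            audit.append({"violated_clause": clause,
                          "original": span,
                          "corrected": fix})
    return response, audit
```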

Multi-turn trajectory analysis

The graph structure is particularly well-suited to multi-turn, trajectory-based risks. Because the graph is immutable and deterministic, each interaction is logged as a traversal. Patterns that look benign in isolation but form a problematic sequence (crescendo attacks, gradual scope expansion, slow policy drift) naturally emerge as repeated or escalating traversals of specific nodes. This is a structural property of the graph, not something that needs to be engineered case by case.
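As a toy illustration of why this falls out of the structure: if each turn is logged as the set of policy nodes it traversed, escalating patterns are just repeated visits to the same nodes. The threshold and node names below are invented for the example.

```python
# Illustrative trajectory check: turns that look benign individually
# surface as repeated traversals of the same policy nodes.

from collections import Counter

def flag_trajectory(traversals: list[list[str]], threshold: int = 3) -> set[str]:
    """Return policy nodes touched at least `threshold` times across the
    session -- a crude proxy for crescendo / scope-expansion patterns."""
    counts = Counter(node for turn in traversals for node in turn)
    return {node for node, n in counts.items() if n >= threshold}
```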

Activation Steering

For open-weight models, CTGT applies activation steering at inference time. This technique identifies the internal features of a model most relevant to a given governance objective, then applies targeted adjustments to those activation layers during generation, without modifying the underlying model weights.

The approach starts with a probing step: a set of targeted prompts that reveal which internal activations correspond to the behaviors being governed (e.g., avoiding speculative language, staying within a particular domain, adhering to a specific regulatory tone). At runtime, those activations are adjusted to steer the model's output.
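The probe-then-steer pattern can be sketched in two functions. This is a generic illustration of activation steering, not CTGT's implementation: the steering direction is the mean difference between activations on prompts that exhibit the desired behavior and prompts that do not, applied at inference time with a scaling factor.

```python
# Illustrative activation steering: derive a direction from probe
# activations, then shift hidden activations along it at generation time.
# Model weights are never modified.

def steering_vector(pos_acts: list[list[float]],
                    neg_acts: list[list[float]]) -> list[float]:
    """Probe step: mean activation difference between desired-behavior
    prompts (pos) and undesired-behavior prompts (neg)."""
    dim = len(pos_acts[0])
    mean = lambda rows, j: sum(r[j] for r in rows) / len(rows)
    return [mean(pos_acts, j) - mean(neg_acts, j) for j in range(dim)]

def steer(hidden: list[float], direction: list[float],
          alpha: float = 1.0) -> list[float]:
    """Inference step: adjust the activations by a scaled steering vector."""
    return [h + alpha * d for h, d in zip(hidden, direction)]
```

In practice the activations would be tensors hooked out of specific transformer layers; the arithmetic is the same.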

Steering is effective for broad behavioral alignment: maintaining a consistent tone, staying within domain boundaries, and performing well in specific languages. It is less effective for surgical, rule-level enforcement, which is where the Policy Graph takes over. In practice, the two methods are complementary. The graph handles precise policy governance. Steering handles the broader behavioral envelope.

Open-weight vs. proprietary models

When working with proprietary models where direct activation access is not available, the insights from steering research are translated into a graph-based context engineering approach. The Policy Graph breaks the response into sub-problems, validates each independently, and surgically corrects any violations. This is how CTGT maintains model-agnostic governance: the same policy graph works regardless of whether the underlying model is open-weight or closed-source.

Deployment model

The Policy Engine is API middleware. It sits between the output of the model and the end user or downstream system. Integration is a single API endpoint change. No infrastructure modifications are required on the client side.
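The integration pattern, reduced to a sketch: the client swaps its model endpoint for a governed one, and everything else stays the same. The function names and the enforcement callback signature are assumptions for illustration.

```python
# Illustrative middleware wrapper: intercept model output, apply policy
# enforcement, return the corrected response. From the client's side the
# only change is which endpoint it calls.

def governed_endpoint(model_call, enforce):
    """Wrap a model-calling function with a governance layer.
    `enforce` returns (corrected_text, audit_entries)."""
    def endpoint(prompt: str) -> str:
        raw = model_call(prompt)
        corrected, _audit = enforce(raw)   # audit trail goes to compliance
        return corrected
    return endpoint
```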

Capability           | Traditional approach             | CTGT Policy Engine
Policy encoding      | Prompt engineering, manual rules | Natural language document upload
Violation handling   | Block output, alert human        | Surgical remediation preserving intent
Multi-turn awareness | Stateless per-request evaluation | Graph traversal tracks interaction history
Audit trail          | Probabilistic confidence scores  | Deterministic, clause-level policy linkage
Model dependency     | Vendor-specific tuning           | Model-agnostic; any API endpoint
Deployment           | SaaS or multi-month integration  | API middleware; on-prem, VPC, or SaaS
Defense in Depth

Threat detection at the runtime layer

Beyond policy enforcement, the engine runs seven parallel analyzers on every incoming request. Their scores are weighted and combined into a single threat level. The system auto-adjusts weights based on available context: when policies are loaded, policy-based analyzers receive more weight; without policies, behavioral analyzers compensate.

Analyzer            | Function
Pattern Matching    | Detects known attack signatures and injection patterns
Semantic Similarity | Embedding-based anomaly detection; flags inputs that are semantically unusual relative to the session
Policy Violation    | Boundary checking against the loaded policy graph
Behavioral Drift    | Session history analysis; detects gradual shifts in user behavior or request patterns over time
Response Behavior   | LLM output monitoring; flags when model responses deviate from expected patterns
Topic Shift         | Conversation manipulation detection; catches attempts to gradually steer a conversation off-policy
LLM Judge           | AI-powered content safety evaluation for nuanced cases that deterministic methods may miss

The output is a weighted composite threat level (None, Low, Medium, High, or Critical) for every request. One-time violations are caught by pattern matching and policy checking. Trajectory-based risks like crescendo attacks and collusion patterns surface through behavioral drift, topic shift, and session history analysis.
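The weighted-composite scheme described above looks roughly like this sketch. The analyzer names, weight values, and level thresholds are all invented for the example; the real engine's weighting logic is not public.

```python
# Illustrative composite threat scoring: combine per-analyzer scores
# (each in [0, 1]) under weights that shift with available context.

def threat_level(scores: dict[str, float], weights: dict[str, float]) -> str:
    """Weighted average of analyzer scores mapped to a discrete level."""
    total = sum(weights[name] * scores.get(name, 0.0) for name in weights)
    total /= sum(weights.values())
    for level, floor in [("Critical", 0.9), ("High", 0.7),
                         ("Medium", 0.4), ("Low", 0.1)]:
        if total >= floor:
            return level
    return "None"

def adjusted_weights(base: dict[str, float], policies_loaded: bool) -> dict[str, float]:
    """Shift weight toward the policy analyzer when a policy graph is
    loaded; otherwise let behavioral analyzers carry more of the score."""
    w = dict(base)
    w["policy_violation"] *= 2.0 if policies_loaded else 0.5
    return w
```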

Proposed POC

A framework for joint evaluation

The following is a proposed framework for a joint proof-of-concept. It is intended as a starting point for collaborative scoping, not a fixed specification.

Scenario

A multi-agent financial services use case using synthetic data that mimics real-world complexity. Electronic communications surveillance for wealth management is the most natural fit: it maps directly to an active CTGT deployment and to regulatory requirements that banking clients face daily. A secondary scenario around claims underwriting or agent coordination could run in parallel depending on team bandwidth.

Methods

Demonstration of both graph-based policy enforcement and activation steering, benchmarked against conventional approaches: deterministic threshold checks and LLM-as-judge for drift detection. The value of the evaluation is in the direct comparison, running the same scenarios through both approaches and measuring the results.

The POC should include a combination of open-weight and proprietary models to demonstrate how governance persists across model types.

Synthetic data

All evaluation data will be synthetic. CTGT provides the base data generation framework and violation injection methodology. Cognizant's team would refine the scenarios to ensure they reflect the complexity and edge cases observed in production client environments. No real client data is required at any stage.
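The injection methodology reduces to labeling which synthetic records carry a known violation, so that detection can be scored against ground truth. A minimal sketch, with hypothetical message and violation text:

```python
# Illustrative violation injection for synthetic evaluation data: a
# fraction of clean messages get a known violation appended, and every
# record carries a ground-truth label for scoring detection.

import random

def inject_violations(messages: list[str], violations: list[str],
                      rate: float = 0.2, seed: int = 0) -> list[tuple[str, bool]]:
    """Return (message, has_violation) pairs with a deterministic seed
    so evaluation runs are reproducible."""
    rng = random.Random(seed)
    out = []
    for msg in messages:
        if rng.random() < rate:
            out.append((msg + " " + rng.choice(violations), True))
        else:
            out.append((msg, False))
    return out
```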

Success criteria

Success criteria should be defined collaboratively. The natural framing: can the Policy Engine catch the subtle, trajectory-based risks that conventional approaches miss? Multi-turn drifts, agent collusion patterns, emergent behaviors, and scope violations represent the highest-value governance challenges in multi-agent systems. A measurable lift on those classes of risk would constitute a compelling result for Cognizant's financial services client base.

Timeline

Three weeks of execution, with results presentable within four weeks. The output should be a packaged, repeatable scenario suitable for client-facing demonstrations across Cognizant's service lines.

Environment

Designed to fit existing infrastructure

CTGT's architecture is API middleware that deploys into any standard cloud or on-premises environment. The POC environment should be determined jointly based on what creates the least friction. All evaluation data is synthetic, so there are no data sensitivity constraints on where the engagement runs.

Integration model
API middleware
Sits between model output and the end user. Single API endpoint change. No infrastructure modifications required.
Data
100% synthetic
Generated collaboratively. No real client data required at any stage of the evaluation.
Cloud compatibility
Azure / AWS / GCP
Natively compatible with all three. Supports existing demo environments or fresh provisioning.
Production deployment
On-prem, VPC, or SaaS
Supports hybrid and fully air-gapped configurations. One production customer operates with zero telemetry.
Next Steps

Scoping the engagement

Four items to align on to move from framework to execution.

Item 1
Finalize the POC scenario and scope
Confirm the e-comms surveillance scenario or adjust focus to the use case most compelling for Cognizant's target client base. Define scope boundaries.
Item 2
Align on the synthetic data approach
Review CTGT's data generation framework. Determine what additional complexity or edge cases should be injected to ensure the evaluation reflects production-grade scenarios.
Item 3
Determine the evaluation environment
Identify the preferred infrastructure (Azure, AWS, or GCP). CTGT's requirements are lightweight: standard API access and GPU allocation for open-weight model inference.
Item 4
Define success criteria
Collaboratively establish the evaluation criteria. What constitutes a compelling result for internal stakeholders? What does the Cognizant go-to-market team need to take this to clients?
Accompanying documentation

This brief is accompanied by a separate technical document covering the full Policy Engine architecture: system topology, data flow from policy ingestion to real-time enforcement, the dual-database design, threat detection pipeline, multi-tenancy model, deployment configuration, and codebase design patterns.