Architecture Reference

Policy Engine Technical Deep-Dive

System architecture, benchmarks, and integration specifications for CTGT's graph-based policy engine and feature-level intervention system.

System Architecture

CTGT operates as a middleware layer between client applications and LLM providers, combining feature-level intervention (for open-weight models) with graph-based verification (for all models including closed-weight) to enforce policy compliance.

Client Request (Query + Context) → Policy Graph (Neo4j Backend) → LLM Inference (Any Provider) → Verification (Graph + Feature) → Compliant Output (+ Audit Trail)
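
The request flow above can be read as a chain of stages that each add to an audit trail. The following is a minimal sketch of that ordering only. Every name in it (`PipelineResult`, `run_pipeline`, and the three callables) is hypothetical and stands in for the real components, which this document does not show.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    output: str                                            # text returned to the client
    audit_trail: List[str] = field(default_factory=list)   # one entry per stage


def run_pipeline(
    query: str,
    retrieve_policies: Callable[[str], List[str]],   # stands in for the policy graph lookup
    call_llm: Callable[[str], str],                  # stands in for any LLM provider
    verify: Callable[[str, List[str]], str],         # stands in for graph + feature verification
) -> PipelineResult:
    """Run the stages in the order shown in the flow: retrieve, infer, verify."""
    trail: List[str] = []
    policies = retrieve_policies(query)
    trail.append(f"retrieved {len(policies)} policies")
    draft = call_llm(query)
    trail.append("llm inference complete")
    final = verify(draft, policies)
    trail.append("verified" if final == draft else "remediated")
    return PipelineResult(output=final, audit_trail=trail)


# Toy stand-ins so the sketch runs end to end (not real components).
result = run_pipeline(
    "Is this fund safe?",
    retrieve_policies=lambda q: ["FINRA-2210-d-1-A"],
    call_llm=lambda q: "This fund is risk-free.",
    verify=lambda text, pols: text.replace("risk-free", "subject to risk"),
)
print(result.output)
print(result.audit_trail)
```

Passing the stages in as callables keeps the sketch provider-agnostic, which mirrors the "Any Provider" step in the flow.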

Feature-Level Intervention (Open-Weight Models)

For open-weight models, we intervene at the activation level during the forward pass. We identify latent feature vectors associated with specific behaviors (bias, misconception, confabulation) and adjust the hidden state by subtracting a scaled projection onto the chosen vector:

feature_intervention.py (Python)

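# Feature-level intervention formula
# h = original hidden state, shape (..., d_model)
# v = feature vector (e.g., "confabulation", "misconception"), shape (d_model,)
# alpha = intervention strength coefficient

from torch import Tensor

def intervene(h: Tensor, v: Tensor, alpha: float) -> Tensor:
    """
    Modify hidden state to suppress (alpha > 0) or amplify (alpha < 0) a feature.
    Arithmetic operation with negligible overhead (<10ms on R1).
    """
    # Normalize so that (h @ v) is the projection coefficient
    # (a no-op when v is already unit length)
    v = v / v.norm()
    # Keep a trailing axis so the coefficient broadcasts over batched h
    coeff = (h @ v).unsqueeze(-1)
    h_prime = h - alpha * coeff * v
    return h_prime

# Example: Suppress "confabulation" feature to reduce hallucination
# This approach improved TruthfulQA from 21% to 70% on GPT-OSS-120B
# Note: load_feature_vector and hidden_state are not defined in this excerpt
confabulation_vector = load_feature_vector("confabulation")
modified_state = intervene(hidden_state, confabulation_vector, alpha=0.8)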
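
To see what the formula does, take a unit-length feature vector v. Then h'·v = (1 − alpha)(h·v): alpha = 1 removes the component of h along v entirely, alpha = 0.8 leaves 20% of it, and a negative alpha amplifies it. The sketch below checks this with plain Python lists in place of tensors. The helper names `dot` and `intervene_list` and the numbers are illustrative assumptions, not part of the engine.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def intervene_list(h: List[float], v: List[float], alpha: float) -> List[float]:
    """h' = h - alpha * (h . v) * v, for a unit-length v."""
    coeff = dot(h, v)
    return [hi - alpha * coeff * vi for hi, vi in zip(h, v)]


v = [0.6, 0.8]   # unit length: 0.36 + 0.64 = 1
h = [2.0, 1.0]   # h . v = 1.2 + 0.8 = 2.0

h_suppressed = intervene_list(h, v, alpha=0.8)
h_removed = intervene_list(h, v, alpha=1.0)

print(dot(h, v))             # component along v before intervention
print(dot(h_suppressed, v))  # (1 - 0.8) * 2.0, up to float rounding
print(dot(h_removed, v))     # approximately 0
```

Only the component along v changes. Any part of h orthogonal to v passes through untouched, which is why the operation can target one feature without rewriting the rest of the state.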
                    

Closed-Weight Model Support

For closed-weight models (GPT-4, Claude, Gemini), where activations are not accessible, we use graph-based verification combined with a representation of the model's features based on comparable open-source models and internal data. This avoids extensive prompt engineering iterations while achieving policy enforcement similar to the open-weight path.

Policy Node Schema

Each policy is represented as a node in a Neo4j graph, with subgraphs representing facets such as positive examples, key phrases, and violation patterns.

policy_schema.ts (TypeScript)

interface PolicyNode {
  id: string;                    // Unique policy identifier
  name: string;                  // Human-readable name
  description: string;           // Policy description
  source_document: string;       // Original document reference
  criticality: 1 | 2 | 3 | 4 | 5; // Priority weight
  domain: string;                // e.g., "FINRA", "HIPAA", "Brand"
  
  // Subgraph facets
  positive_examples: string[];   // Examples of compliant output
  negative_examples: string[];   // Examples of violations
  key_phrases: string[];         // Triggering keywords
  forbidden_patterns: string[];  // Prohibited patterns (regex source strings, as stored in JSON)
  
  // Graph relationships
  parent_policies: string[];     // Hierarchy links
  conflicts_with: string[];      // Known collision nodes
  supersedes: string[];          // Override relationships
}
                    

example_policy.json (JSON)

{
  "id": "FINRA-2210-d-1-A",
  "name": "Fair and Balanced Communications",
  "description": "Communications must be fair, balanced, and not misleading",
  "source_document": "FINRA Rule 2210(d)(1)(A)",
  "criticality": 5,
  "domain": "FINRA",
  "positive_examples": [
    "Past performance does not guarantee future results.",
    "Investments involve risk, including possible loss of principal."
  ],
  "forbidden_patterns": [
    "guaranteed.*return",
    "risk[- ]?free",
    "can't lose"
  ]
}
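
As an illustration of how the `forbidden_patterns` facet could be applied to model output, here is a minimal sketch using Python's standard `re` module. The function name `find_violations` and the choice of case-insensitive matching are assumptions made for this example. Only the fields needed for the check are copied from the policy above.

```python
import json
import re
from typing import List

# Subset of the example policy above; the other fields are omitted for brevity.
policy = json.loads("""
{
  "id": "FINRA-2210-d-1-A",
  "forbidden_patterns": ["guaranteed.*return", "risk[- ]?free", "can't lose"]
}
""")


def find_violations(text: str, patterns: List[str]) -> List[str]:
    """Return the patterns that match anywhere in text (case-insensitive)."""
    return [p for p in patterns if re.search(p, text, flags=re.IGNORECASE)]


compliant = "Investments involve risk, including possible loss of principal."
violating = "This is a risk-free product with a guaranteed 10% return."

print(find_violations(compliant, policy["forbidden_patterns"]))
print(find_violations(violating, policy["forbidden_patterns"]))
```

The compliant sentence matches nothing, while the violating one trips both `guaranteed.*return` and `risk[- ]?free`. Returning the matched patterns rather than a boolean gives an audit record something concrete to cite.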
                    

Collision Resolution Engine

When policies conflict, the collision engine applies deterministic resolution using criticality weighting, semantic similarity, and hierarchy traversal.

collision_resolver.py (Python)

from typing import List

from torch import Tensor

# PolicyNode, QueryContext, cosine_similarity, get_hierarchy_depth, and
# log_audit_trail are not shown in this excerpt.

def resolve_collision(
    policies: List[PolicyNode],
    query_embedding: Tensor,
    context: QueryContext
) -> PolicyNode:
    """
    Deterministic policy selection when multiple policies apply.
    Returns the winning policy and logs the full audit trail.
    """
    if not policies:
        raise ValueError("resolve_collision requires at least one policy")

    scored = []
    for policy in policies:
        # Compute weighted score
        criticality_weight = policy.criticality / 5.0  # maps 1..5 to 0.2..1.0
        semantic_score = cosine_similarity(
            query_embedding,
            policy.embedding
        )
        hierarchy_bonus = get_hierarchy_depth(policy) * 0.1

        final_score = (
            criticality_weight * 0.5 +
            semantic_score * 0.35 +
            hierarchy_bonus * 0.15
        )
        scored.append((policy, final_score))

    # Deterministic selection: highest score wins; on a tie, max() keeps
    # the first policy in input order
    winner = max(scored, key=lambda x: x[1])
    log_audit_trail(policies, winner, context)
    return winner[0]
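
A worked example makes the weighting concrete. The sketch below reuses the same coefficients as `resolve_collision` but takes the three inputs directly, so it runs without the graph or an embedding model. The two candidate policies and their values are hypothetical.

```python
def final_score(criticality: int, semantic_score: float, hierarchy_depth: int) -> float:
    """Same weighting as resolve_collision, with the inputs passed in directly."""
    criticality_weight = criticality / 5.0
    hierarchy_bonus = hierarchy_depth * 0.1
    return criticality_weight * 0.5 + semantic_score * 0.35 + hierarchy_bonus * 0.15


# Hypothetical candidates: (label, criticality, semantic similarity, hierarchy depth)
candidates = [
    ("regulatory", 5, 0.60, 2),
    ("brand", 3, 0.90, 1),
]

scores = {label: final_score(c, s, d) for label, c, s, d in candidates}
winner = max(scores, key=scores.get)
print(scores)
print(winner)
```

The regulatory policy scores roughly 0.74 (0.50 + 0.21 + 0.03) against roughly 0.63 (0.30 + 0.315 + 0.015) for the brand policy, so it wins despite the lower semantic match. With these weights, a two-level criticality gap is worth 0.2, which a competing policy can only offset with a similarity advantage of about 0.57.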
                    

Benchmark Results

Performance was validated against HaluEval, TruthfulQA, and internal FINRA benchmarks. The results show gains over RAG and prompt engineering baselines; on HaluEval-QA, CTGT scores 27 to 48 points above RAG across the models tested.

HaluEval-QA Accuracy: 96.5%
Remediation Accuracy: 89.2%
P90 Policy Retrieval: 20ms
Feature Intervention Overhead: <10ms (at P90)

HaluEval-QA Benchmark Results

Model	Base	RAG	CTGT	Improvement
GPT-OSS-120B	61%	44%	74%	+30 pts vs RAG
GPT 5.2 (Frontier)	74%	52%	91%	+39 pts vs RAG
Claude Sonnet	74%	48%	96%	+48 pts vs RAG
GPT-OSS-120B (Indexical)	9%	64%	91%	+27 pts vs RAG

Key Finding

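RAG often degrades accuracy by adding noise: in three of the four rows above, RAG scores below the base model, and on Claude Sonnet it dropped accuracy from 74% to 48%. CTGT achieved 96% by fixing reasoning rather than just adding context. With CTGT, the open-weight GPT-OSS-120B matches the 74% frontier base score in the standard row and reaches 91% in the Indexical row.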
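
The Improvement column in the table is measured against the RAG score, not the base model. The short script below recomputes it from the table values and also prints the gain over base and the RAG-versus-base difference, which are not shown in the table.

```python
# (model, base, rag, ctgt) accuracy in percent, copied from the HaluEval-QA table
rows = [
    ("GPT-OSS-120B", 61, 44, 74),
    ("GPT 5.2 (Frontier)", 74, 52, 91),
    ("Claude Sonnet", 74, 48, 96),
    ("GPT-OSS-120B (Indexical)", 9, 64, 91),
]

for model, base, rag, ctgt in rows:
    vs_rag = ctgt - rag        # the table's Improvement column
    vs_base = ctgt - base      # gain over the unmodified model
    rag_vs_base = rag - base   # negative means RAG scored below the base model
    print(f"{model}: {vs_rag:+d} vs RAG, {vs_base:+d} vs base, RAG {rag_vs_base:+d} vs base")
```

Measured against base rather than RAG, the gains are 13, 17, 22, and 82 points respectively.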

Latency Benchmarks (Production Environment)

Operation	P50	P90	P99
Policy Retrieval (25K policies)	12ms	20ms	35ms
Feature Intervention (R1 model)	6ms	9ms	14ms
Graph Verification	8ms	15ms	28ms
Full Pipeline (excl. LLM inference)	45ms	72ms	95ms
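
For readers reproducing figures like these, the sketch below shows one common way to compute P50/P90/P99 from raw latency samples, the nearest-rank method. This document does not state which percentile method was used, and the samples here are synthetic.

```python
import math
from typing import List


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))   # 1-based rank
    return ordered[rank - 1]


# Synthetic latencies in milliseconds, for illustration only.
latencies_ms = [float(x) for x in range(1, 101)]   # 1.0, 2.0, ..., 100.0

for p in (50, 90, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

Percentiles of separate stages do not add up to the percentile of the whole, so the Full Pipeline row has to be measured end to end rather than derived from the rows above it.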

Oracle RAI Capability Matrix

Mapping Oracle's stated requirements to CTGT's production capabilities.

Capability	Oracle Requirement	CTGT Status
Knowledge Graph for Policy Selection	Exploring	Production
Hybrid Traditional ML + LLM	Required	Production
Nested/Contradicting Policy Handling	Critical	Production
Feature-Level Control	Desired for Gov	Production
On-Premise Deployment	Required	Production
Model Agnostic	Required	Production
Defensible Audit Trail	Required	Production
Active Remediation (not just blocking)	Desired	89.2% Accuracy
HaluEval Benchmark Performance	Competitive	96.5% Score
SOC 2 Compliance	Required	Certified

Deployment Architecture

CTGT supports multiple deployment models to meet Oracle's security and data sovereignty requirements.

Single-Tenant VPC (Recommended)

Runs entirely within Oracle Cloud Infrastructure
Dedicated compute instances
Private networking with VCN isolation
Customer-managed encryption keys
Data never leaves customer VPC
CTGT manages software updates only

Full On-Premise

Docker container delivery
Air-gapped installation supported
Zero internet connectivity required
Helm charts for Kubernetes deployment
SDK integration for custom stacks
Government/ITAR compatible

kubernetes-deployment.yaml (YAML)

# Example Kubernetes deployment for on-premise installation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ctgt-policy-engine
  namespace: ai-governance
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ctgt-policy-engine
  template:
    metadata:
      labels:
        app: ctgt-policy-engine   # must match spec.selector.matchLabels
    spec:
      containers:
      - name: policy-engine
        image: ctgt/policy-engine:2.4.1
        resources:
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "16Gi"
            cpu: "8"
        env:
        - name: NEO4J_URI
          valueFrom:
            secretKeyRef:
              name: ctgt-secrets
              key: neo4j-uri
        ports:
        - containerPort: 8080
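
The manifest injects the Neo4j connection string through the `NEO4J_URI` environment variable, sourced from the `ctgt-secrets` secret. A process inside the container could read and sanity-check that value at startup along these lines. The function name `load_neo4j_uri`, the example hostname, and the list of accepted URI schemes are assumptions for illustration, not part of the shipped engine.

```python
import os
from urllib.parse import urlparse

# URI schemes commonly used with Neo4j; treat this list as an assumption.
ALLOWED_SCHEMES = {"bolt", "bolt+s", "neo4j", "neo4j+s"}


def load_neo4j_uri(env=os.environ) -> str:
    """Read NEO4J_URI from the environment and fail fast if it is missing or malformed."""
    uri = env.get("NEO4J_URI", "").strip()
    if not uri:
        raise RuntimeError("NEO4J_URI is not set; check the ctgt-secrets secret")
    parsed = urlparse(uri)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        raise RuntimeError(f"NEO4J_URI has an unexpected form: scheme={parsed.scheme!r}")
    return uri


# Hypothetical in-cluster address, passed explicitly so the sketch runs anywhere.
print(load_neo4j_uri({"NEO4J_URI": "bolt://neo4j.ai-governance.svc:7687"}))
```

Failing at startup surfaces a missing or mistyped secret as a crash-looping pod, instead of as errors on the first policy lookup.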