CTGT Technical Appendix | Oracle RAI Partnership
Oracle RAI Partnership | January 2026
Architecture Reference

Policy Engine Technical Deep-Dive

System architecture, benchmarks, and integration specifications for CTGT's graph-based policy engine and feature-level intervention system.

System Architecture

CTGT operates as a middleware layer between client applications and LLM providers. It enforces policy compliance by combining feature-level intervention (open-weight models only) with graph-based verification (all models, including closed-weight ones).

Request flow: Client Request (Query + Context) → Policy Graph (Neo4j Backend) → LLM Inference (Any Provider) → Verification (Graph + Feature) → Compliant Output (+ Audit Trail)
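
The request flow above can be sketched as a thin middleware function that runs the stages in order and records each step. This is an illustration only: `handle_request`, `PipelineResult`, and the three stage callables are hypothetical names, not CTGT's API, and the stages are stubbed with plain lambdas rather than a real graph or model. Remediation of a non-compliant draft is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    output: str
    compliant: bool
    audit_trail: List[str] = field(default_factory=list)


def handle_request(
    query: str,
    retrieve_policies: Callable[[str], List[str]],
    call_llm: Callable[[str], str],
    verify: Callable[[str, List[str]], List[str]],
) -> PipelineResult:
    """Run the stages in order and record each step for auditing."""
    audit = [f"request: {query!r}"]

    policy_ids = retrieve_policies(query)   # Policy Graph stage
    audit.append(f"policies: {policy_ids}")

    draft = call_llm(query)                 # LLM Inference stage (any provider)
    audit.append("llm: draft produced")

    violated = verify(draft, policy_ids)    # Verification stage
    audit.append(f"violations: {violated}")

    return PipelineResult(draft, compliant=not violated, audit_trail=audit)


# Stubbed usage with stand-in stages (no real graph or model involved).
result = handle_request(
    "Is this fund safe?",
    retrieve_policies=lambda q: ["FINRA-2210-d-1-A"],
    call_llm=lambda q: "This fund is risk-free.",
    verify=lambda text, ids: ids if "risk-free" in text else [],
)
print(result.compliant, len(result.audit_trail))
```

Passing the stages in as callables keeps the sketch provider-agnostic, which mirrors the "Any Provider" step in the flow.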

Feature-Level Intervention (Open-Weight Models)

For open-weight models, we intervene at the activation level during the forward pass. We identify latent feature vectors associated with specific behaviors (bias, misconception, confabulation) and mathematically modify the hidden state:

feature_intervention.py Python
# Feature-level intervention formula
# h = original hidden state, shape (..., d)
# v = feature vector, shape (d,) (e.g., "confabulation", "misconception")
# alpha = intervention strength coefficient

from torch import Tensor

def intervene(h: Tensor, v: Tensor, alpha: float) -> Tensor:
    """
    Modify hidden state to suppress (alpha > 0) or amplify (alpha < 0)
    a specific feature. Assumes v has unit norm.
    Arithmetic operation with negligible overhead (<10ms on R1).
    """
    # Projection coefficient of h onto v; unsqueeze so it broadcasts
    # over batched hidden states of shape (..., d), not only 1-D vectors
    coeff = (h @ v).unsqueeze(-1)
    h_prime = h - alpha * coeff * v
    return h_prime

# Example: Suppress "confabulation" feature to reduce hallucination
# This approach improved TruthfulQA from 21% to 70% on GPT-OSS-120B
# (load_feature_vector and hidden_state come from the surrounding runtime, not shown)
confabulation_vector = load_feature_vector("confabulation")
modified_state = intervene(hidden_state, confabulation_vector, alpha=0.8)
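
To make the effect of the formula concrete, the sketch below applies it to a toy three-dimensional vector using only the standard library. It assumes the feature vector should be treated as a direction and normalizes it to unit length; the source does not state this, but reading `(h @ v) * v` as a projection depends on it. Under that assumption, the component of `h` along `v` is scaled by `1 - alpha` and the orthogonal part is untouched. The numbers are invented for illustration.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def intervene(h: List[float], v: List[float], alpha: float) -> List[float]:
    """h' = h - alpha * (h . u) * u, where u is v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    u = [x / norm for x in v]
    scale = alpha * dot(h, u)
    return [hi - scale * ui for hi, ui in zip(h, u)]


h = [3.0, 4.0, 0.0]   # toy hidden state
v = [2.0, 0.0, 0.0]   # toy feature direction (not unit length on purpose)

h_prime = intervene(h, v, alpha=0.8)

unit_v = [1.0, 0.0, 0.0]
along_before = dot(h, unit_v)        # 3.0
along_after = dot(h_prime, unit_v)   # about 3.0 * (1 - 0.8)
print(along_before, round(along_after, 6))
print(h_prime[1:])                   # components orthogonal to v are unchanged
```

With `alpha=1.0` the component along `v` is removed entirely, and a negative `alpha` increases it, which corresponds to the "suppress or amplify" wording in the docstring.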

Closed-Weight Model Support

For closed-weight models (GPT-4, Claude, Gemini), we use graph-based verification combined with a representation of the model's features based on comparable open-source models and internal data. This avoids extensive prompt engineering iterations while achieving similar policy enforcement.

Policy Node Schema

Each policy is represented as a node in a Neo4j graph, with subgraphs representing facets such as positive examples, key phrases, and violation patterns.

policy_schema.ts TypeScript
interface PolicyNode {
  id: string;                    // Unique policy identifier
  name: string;                  // Human-readable name
  description: string;           // Policy description
  source_document: string;       // Original document reference
  criticality: 1 | 2 | 3 | 4 | 5; // Priority weight
  domain: string;                // e.g., "FINRA", "HIPAA", "Brand"
  
  // Subgraph facets
  positive_examples: string[];   // Examples of compliant output
  negative_examples: string[];   // Examples of violations
  key_phrases: string[];         // Triggering keywords
  forbidden_patterns: RegExp[];  // Prohibited patterns (serialized as regex source strings in JSON)
  
  // Graph relationships
  parent_policies: string[];     // Hierarchy links
  conflicts_with: string[];      // Known collision nodes
  supersedes: string[];          // Override relationships
}

example_policy.json JSON
{
  "id": "FINRA-2210-d-1-A",
  "name": "Fair and Balanced Communications",
  "description": "Communications must be fair, balanced, and not misleading",
  "source_document": "FINRA Rule 2210(d)(1)(A)",
  "criticality": 5,
  "domain": "FINRA",
  "positive_examples": [
    "Past performance does not guarantee future results.",
    "Investments involve risk, including possible loss of principal."
  ],
  "forbidden_patterns": [
    "guaranteed.*return",
    "risk[- ]?free",
    "can't lose"
  ]
}
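
As an illustration of how the `forbidden_patterns` facet might be applied, the sketch below loads the three patterns from the example policy and scans a candidate output. The source does not describe the actual matching logic, so two things here are assumptions: that patterns are matched case-insensitively with Python's `re` module, and the helper name `find_violations`, which is hypothetical.

```python
import json
import re
from typing import List

# Trimmed copy of the example policy: only the fields this sketch uses.
POLICY_JSON = """
{
  "id": "FINRA-2210-d-1-A",
  "forbidden_patterns": ["guaranteed.*return", "risk[- ]?free", "can't lose"]
}
"""


def find_violations(text: str, patterns: List[str]) -> List[str]:
    """Return the pattern sources that match anywhere in `text`."""
    return [p for p in patterns if re.search(p, text, flags=re.IGNORECASE)]


policy = json.loads(POLICY_JSON)

risky = "This fund offers a guaranteed 12% return and is risk free."
safe = "Investments involve risk, including possible loss of principal."

print(find_violations(risky, policy["forbidden_patterns"]))
print(find_violations(safe, policy["forbidden_patterns"]))
```

Returning the matching pattern sources, rather than a bare boolean, keeps enough detail to record which rule fired in an audit log.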

Collision Resolution Engine

When policies conflict, the collision engine applies deterministic resolution using criticality weighting, semantic similarity, and hierarchy traversal.

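collision_resolver.py Python
from typing import List

import torch.nn.functional as F
from torch import Tensor

# PolicyNode, QueryContext, get_hierarchy_depth, and log_audit_trail are
# defined elsewhere in the engine (not shown here). policy.embedding is a
# precomputed 1-D embedding that is not part of the schema shown earlier.

def resolve_collision(
    policies: List[PolicyNode],
    query_embedding: Tensor,
    context: QueryContext
) -> PolicyNode:
    """
    Deterministic policy selection when multiple policies apply.
    Returns the winning policy and logs the full audit trail.
    """
    if not policies:
        raise ValueError("resolve_collision requires at least one policy")

    scored = []
    for policy in policies:
        # Compute weighted score
        criticality_weight = policy.criticality / 5.0
        semantic_score = F.cosine_similarity(
            query_embedding,
            policy.embedding,
            dim=0
        ).item()
        hierarchy_bonus = get_hierarchy_depth(policy) * 0.1

        final_score = (
            criticality_weight * 0.5 +
            semantic_score * 0.35 +
            hierarchy_bonus * 0.15
        )
        scored.append((policy, final_score))

    # Deterministic selection: highest score wins. Ties go to the smallest
    # policy id, so the result does not depend on input order.
    winner = min(scored, key=lambda x: (-x[1], x[0].id))
    log_audit_trail(policies, winner, context)
    return winner[0]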
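
A worked example makes the weighting easier to follow. The sketch below applies the same formula to two candidates with precomputed inputs. The `Candidate` type, the second policy id `BRAND-TONE-01`, and all input values are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    policy_id: str
    criticality: int        # 1..5, as in the policy schema
    semantic_score: float   # cosine similarity to the query, assumed precomputed
    hierarchy_depth: int    # assumed result of get_hierarchy_depth


def final_score(c: Candidate) -> float:
    """Same weights as resolve_collision: 0.5 / 0.35 / 0.15."""
    return (
        (c.criticality / 5.0) * 0.5
        + c.semantic_score * 0.35
        + (c.hierarchy_depth * 0.1) * 0.15
    )


candidates = [
    Candidate("FINRA-2210-d-1-A", criticality=5, semantic_score=0.60, hierarchy_depth=1),
    Candidate("BRAND-TONE-01", criticality=3, semantic_score=0.90, hierarchy_depth=2),
]

# FINRA: 0.500 + 0.210 + 0.015 = 0.725
# BRAND: 0.300 + 0.315 + 0.030 = 0.645
scores = {c.policy_id: round(final_score(c), 3) for c in candidates}
winner = max(candidates, key=final_score)
print(scores)
print(winner.policy_id)
```

In this example the higher-criticality policy wins even though the other candidate is a closer semantic match and sits deeper in the hierarchy. Each hierarchy level adds only 0.015 to the score, so with these weights criticality and semantic similarity dominate.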

Benchmark Results

Performance was validated against HaluEval, TruthfulQA, and internal FINRA benchmarks. Results show improvements over RAG and prompt engineering baselines; on HaluEval-QA, the gain over RAG ranges from 27 to 48 points.

  • HaluEval-QA Accuracy: 96.5%
  • Remediation Accuracy: 89.2%
  • P90 Policy Retrieval: 20 ms
  • Feature Intervention Overhead: <10 ms (at P90)

HaluEval-QA Benchmark Results

Model                    | Base | RAG | CTGT | Improvement
GPT-OSS-120B             | 61%  | 44% | 74%  | +30 pts vs RAG
GPT 5.2 (Frontier)       | 74%  | 52% | 91%  | +39 pts vs RAG
Claude Sonnet            | 74%  | 48% | 96%  | +48 pts vs RAG
GPT-OSS-120B (Indexical) | 9%   | 64% | 91%  | +27 pts vs RAG
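
Key Finding

RAG often degrades accuracy by adding noise: it lowered accuracy in three of the four configurations above. On Claude Sonnet, RAG dropped accuracy from 74% to 48%, while CTGT reached 96% by fixing reasoning rather than just adding context. With CTGT, the open-weight GPT-OSS-120B matches or exceeds the frontier models' 74% base accuracy (74% in the standard configuration, 91% in the Indexical one).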
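
The deltas behind this finding can be recomputed directly from the HaluEval-QA figures reported above. The short sketch below uses only those reported percentages. The dictionary layout and variable names are illustrative.

```python
# (base, rag, ctgt) accuracy in percent, copied from the HaluEval-QA results.
results = {
    "GPT-OSS-120B": (61, 44, 74),
    "GPT 5.2 (Frontier)": (74, 52, 91),
    "Claude Sonnet": (74, 48, 96),
    "GPT-OSS-120B (Indexical)": (9, 64, 91),
}

ctgt_vs_rag = {m: ctgt - rag for m, (base, rag, ctgt) in results.items()}
rag_vs_base = {m: rag - base for m, (base, rag, ctgt) in results.items()}

for model in results:
    print(f"{model}: CTGT {ctgt_vs_rag[model]:+d} pts vs RAG, "
          f"RAG {rag_vs_base[model]:+d} pts vs base")

# Count the configurations where RAG scored below the base model.
rag_degraded = sum(1 for delta in rag_vs_base.values() if delta < 0)
print(rag_degraded, "of", len(results))
```

The second column shows where RAG helped and where it hurt: the Indexical configuration is the one case in which RAG improved on the base model.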

Latency Benchmarks (Production Environment)

Operation                           | P50   | P90   | P99
Policy Retrieval (25K policies)     | 12 ms | 20 ms | 35 ms
Feature Intervention (R1 model)     | 6 ms  | 9 ms  | 14 ms
Graph Verification                  | 8 ms  | 15 ms | 28 ms
Full Pipeline (excl. LLM inference) | 45 ms | 72 ms | 95 ms

Oracle RAI Capability Matrix

Mapping Oracle's stated requirements to CTGT's production capabilities.

Capability                             | Oracle Requirement | CTGT Status
Knowledge Graph for Policy Selection   | Exploring          | Production
Hybrid Traditional ML + LLM            | Required           | Production
Nested/Contradicting Policy Handling   | Critical           | Production
Feature-Level Control                  | Desired for Gov    | Production
On-Premise Deployment                  | Required           | Production
Model Agnostic                         | Required           | Production
Defensible Audit Trail                 | Required           | Production
Active Remediation (not just blocking) | Desired            | 89.2% Accuracy
HaluEval Benchmark Performance         | Competitive        | 96.5% Score
SOC-2 Compliance                       | Required           | Certified

Deployment Architecture

CTGT supports multiple deployment models to meet Oracle's security and data sovereignty requirements.

Single-Tenant VPC (Recommended)

  • Runs entirely within Oracle Cloud Infrastructure
  • Dedicated compute instances
  • Private networking with VCN isolation
  • Customer-managed encryption keys
  • Data never leaves customer VPC
  • CTGT manages software updates only

Full On-Premise

  • Docker container delivery
  • Air-gapped installation supported
  • Zero internet connectivity required
  • Helm charts for Kubernetes deployment
  • SDK integration for custom stacks
  • Government/ITAR compatible

kubernetes-deployment.yaml YAML
# Example Kubernetes deployment for on-premise installation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ctgt-policy-engine
  namespace: ai-governance
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ctgt-policy-engine
  template:
    metadata:
      labels:
        app: ctgt-policy-engine  # must match spec.selector.matchLabels
    spec:
      containers:
      - name: policy-engine
        image: ctgt/policy-engine:2.4.1
        resources:
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "16Gi"
            cpu: "8"
        env:
        - name: NEO4J_URI
          valueFrom:
            secretKeyRef:
              name: ctgt-secrets
              key: neo4j-uri
        ports:
        - containerPort: 8080