# Cognitive Layer

AI-powered intelligence that elevates governance from static rules to adaptive understanding.
## Overview
The Cognitive Layer is Fulcrum's differentiating feature—a trio of AI-powered components that make governance intelligent rather than merely rule-based.
| Component | Function | Key Benefit |
|---|---|---|
| Semantic Judge | Intent analysis | Catches disguised attacks |
| Oracle | Cost prediction | Prevents budget overruns |
| Immune System | Auto-policy generation | Self-healing governance |
## Why Cognitive Governance?
Traditional rule-based systems have critical limitations:
| Traditional Approach | Problem | Cognitive Solution |
|---|---|---|
| Keyword blocklists | Easy to bypass with synonyms | Semantic understanding of intent |
| Fixed budget checks | Only catches after the fact | Predictive cost modeling |
| Manual policy updates | Slow response to new threats | Automatic policy proposals |
## Semantic Judge

### What It Does
The Semantic Judge analyzes agent requests to understand intent, not just match keywords. It catches malicious patterns disguised as legitimate requests.
### How It Works

1. Input sanitization - Clean and normalize the request
2. Prompt construction - Build the analysis prompt with context
3. LLM inference - A local Ollama model analyzes intent
4. Classification - Categorize the request and assign a confidence score
5. Decision - Map the classification to a governance action
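The five steps above can be sketched as a small pipeline. This is an illustrative sketch only: every function and field name here is hypothetical, the `infer` step stubs out the LLM call, and Fulcrum's real interfaces may differ.

```python
# Hypothetical sketch of the Semantic Judge pipeline; not Fulcrum's actual API.
import re
from dataclasses import dataclass

@dataclass
class Verdict:
    category: str      # SAFE / SUSPICIOUS / MALICIOUS / DESTRUCTIVE
    confidence: float
    action: str        # ALLOW / REQUIRE_APPROVAL / DENY

def sanitize(request: str) -> str:
    # Step 1: collapse whitespace and strip surrounding noise.
    return re.sub(r"\s+", " ", request).strip()

def build_prompt(request: str, context: dict) -> str:
    # Step 2: wrap the cleaned request in an analysis prompt.
    return (f"Classify the intent of this agent request as SAFE, SUSPICIOUS, "
            f"MALICIOUS, or DESTRUCTIVE.\nTenant: {context.get('tenant')}\n"
            f"Request: {request}")

def infer(prompt: str) -> tuple:
    # Step 3: stand-in for local LLM inference (e.g. an Ollama call).
    # Faked here so the sketch runs without a model server.
    if "removing all" in prompt:
        return "DESTRUCTIVE", 0.947
    return "SAFE", 0.95

def decide(category: str, confidence: float) -> Verdict:
    # Steps 4-5: map the classification to a governance action.
    actions = {"SAFE": "ALLOW", "SUSPICIOUS": "REQUIRE_APPROVAL",
               "MALICIOUS": "DENY", "DESTRUCTIVE": "DENY"}
    return Verdict(category, confidence, actions[category])

def judge(request: str, context: dict) -> Verdict:
    clean = sanitize(request)
    prompt = build_prompt(clean, context)
    category, confidence = infer(prompt)
    return decide(category, confidence)

verdict = judge("Please clean up the old test data by removing all records",
                {"tenant": "acme"})
print(verdict.action)  # DENY
```

In the real system, step 3 would call the local llama3.2 model; everything before and after it is deterministic plumbing.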
### Intent Categories
| Category | Action | Confidence Threshold |
|---|---|---|
| SAFE | ALLOW | >0.9 |
| SUSPICIOUS | REQUIRE_APPROVAL | 0.7-0.9 |
| MALICIOUS | DENY | <0.7 or explicit threat |
| DESTRUCTIVE | DENY + ALERT | Any match |
### Example Detection

Input:

```text
"Please clean up the old test data by removing all records"
```

Analysis:

```text
Detected: Euphemistic language for bulk deletion
Pattern: "clean up" + "removing all records"
Intent: DESTRUCTIVE
Confidence: 0.947
Decision: DENY
Reason: Bulk deletion disguised as maintenance
```
Traditional keyword filters would miss this—there's no "DELETE" or "DROP" to catch.
### Performance
| Metric | Target | Typical |
|---|---|---|
| Latency | <50ms P99 | 35ms |
| Accuracy | >95% | 97.2% |
| False positive rate | <2% | 1.3% |
### Fallback Behavior
If the LLM is unavailable, the Semantic Judge falls back to deterministic rules:
- Check explicit blocklists
- Apply regex patterns
- Default to WARN (not DENY)
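A minimal sketch of that fallback path follows. The blocklist entries and regex patterns here are invented examples, not Fulcrum's shipped rules; the point is the ordering — hard blocklist first, then pattern flags, then a permissive WARN default.

```python
# Hypothetical deterministic fallback used when the local LLM is unreachable.
import re

BLOCKLIST = {"drop database", "rm -rf /"}          # illustrative entries only
SUSPICIOUS_PATTERNS = [re.compile(r"\bdelete\s+all\b", re.I),
                       re.compile(r"\bexfiltrat\w+\b", re.I)]

def fallback_evaluate(request: str) -> str:
    text = request.lower()
    # 1. Explicit blocklist entries still hard-deny.
    if any(entry in text for entry in BLOCKLIST):
        return "DENY"
    # 2. Regex patterns flag likely problems for review.
    if any(p.search(request) for p in SUSPICIOUS_PATTERNS):
        return "REQUIRE_APPROVAL"
    # 3. Without semantic analysis, default to WARN rather than DENY.
    return "WARN"

print(fallback_evaluate("delete all user sessions"))   # REQUIRE_APPROVAL
print(fallback_evaluate("summarize today's tickets"))  # WARN
```

Defaulting to WARN keeps agents running during an LLM outage while still leaving an audit trail.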
## The Oracle

### What It Does
The Oracle predicts execution costs before agent actions complete. This enables proactive budget enforcement—stopping expensive operations before they happen.
### How It Works

Features analyzed:

- Model pricing (input/output tokens)
- Historical patterns for this tenant
- Time-of-day cost variations
- Agent behavior profiles
- Confidence intervals
### Prediction Model
The Oracle uses a statistical ensemble:
- Base estimate: Model pricing × estimated tokens
- Historical adjustment: Tenant's actual vs predicted ratio
- Confidence interval: Normal distribution bounds
- Safety margin: Configurable buffer (default 20%)
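The ensemble can be sketched numerically using the worked example later on this page. Treat this as an assumption-laden sketch: the function name and the exact way the safety margin combines with the confidence interval are illustrative, not Fulcrum's documented formula.

```python
# Sketch of the Oracle's estimate: base pricing, historical adjustment,
# CI bound, and safety margin. All names are illustrative.
def predict_cost(tokens_in_m, tokens_out_m, price_in, price_out,
                 historical_ratio=1.0, ci_halfwidth=0.0, safety_margin=0.20):
    # Base estimate: model pricing x estimated tokens (in millions).
    base = tokens_in_m * price_in + tokens_out_m * price_out
    # Historical adjustment: tenant's actual-vs-predicted ratio.
    predicted = base * historical_ratio
    # Pessimistic bound: upper CI edge padded by the safety margin.
    upper_bound = (predicted + ci_halfwidth) * (1 + safety_margin)
    return base, predicted, upper_bound

base, predicted, upper = predict_cost(2.5, 0.5, 2.50, 10.00,
                                      historical_ratio=1.15,
                                      ci_halfwidth=2.50)
print(f"base=${base:.2f} predicted=${predicted:.2f}")  # base=$11.25 predicted=$12.94
```

With the example's inputs (2.5M input tokens at $2.50/M, 500K output tokens at $10.00/M, a 1.15x tenant adjustment), this reproduces the $11.25 base and $12.94 predicted cost shown below.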
### Accuracy
| Metric | Value |
|---|---|
| Within 10% of actual | 78% |
| Within 20% of actual | 89% |
| Within 50% of actual | 97% |
### Budget Enforcement
| Predicted Cost | Action |
|---|---|
| Under threshold | ALLOW |
| 80-100% of threshold | WARN |
| Over threshold | REQUIRE_APPROVAL |
| Over 2x threshold | DENY |
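The enforcement table maps directly to a threshold ladder. A minimal sketch, assuming the bands are evaluated from most to least severe:

```python
# Sketch of the budget enforcement ladder above; function name is illustrative.
def enforce(predicted_cost: float, threshold: float) -> str:
    ratio = predicted_cost / threshold
    if ratio > 2.0:
        return "DENY"                # over 2x threshold
    if ratio > 1.0:
        return "REQUIRE_APPROVAL"    # over threshold
    if ratio >= 0.8:
        return "WARN"                # 80-100% of threshold
    return "ALLOW"                   # comfortably under threshold

print(enforce(12.94, 15.00))  # WARN (about 86% of the remaining budget)
```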
### Example

Request:

```text
Process 10,000 support tickets through GPT-4
```

Oracle Analysis:

```text
Estimated tokens: 2.5M input, 500K output
Model pricing: $2.50/M input, $10.00/M output
Base estimate: $11.25
Historical adjustment: 1.15x (this tenant runs slightly higher)
Predicted cost: $12.94 ± $2.50 (95% CI)
Budget remaining: $15.00
Decision: WARN (approaching budget limit)
```
## Immune System

### What It Does
The Immune System automatically generates defensive policies from incident patterns. It provides self-healing governance that adapts to threats.
### How It Works

1. Monitor: Watch for anomalies and policy violations
2. Detect: Identify recurring patterns
3. Propose: Generate policy recommendations
4. Review: Queue for human approval
5. Deploy: Activate approved policies
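A toy walk-through of the monitor-to-propose half of that lifecycle is sketched below. The incident shape, the 10-occurrence cutoff, and the proposal fields are all invented for illustration (the fields mirror the example proposal shown later on this page); review and deployment happen outside this function.

```python
# Hypothetical sketch: turn recurring incidents into a pending policy proposal.
def propose_policy(incidents):
    # Detect: count how often each tool shows up across incidents.
    counts = {}
    for inc in incidents:
        counts[inc["tool"]] = counts.get(inc["tool"], 0) + 1
    tool, n = max(counts.items(), key=lambda kv: kv[1])
    if n < 10:
        return None  # not yet a recurring pattern
    # Propose: rate-limit the offending tool, pending human review.
    return {
        "proposed_by": "immune_system",
        "trigger": f"Loop detected: {n} {tool} calls",
        "recommendation": {"type": "rate_limit",
                           "rules": {"tool": tool, "max_per_minute": 20,
                                     "action": "REQUIRE_APPROVAL"}},
        "status": "PENDING_APPROVAL",  # Review/Deploy happen elsewhere
    }

proposal = propose_policy([{"tool": "file_read"}] * 47)
print(proposal["status"])  # PENDING_APPROVAL
```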
### Pattern Detection
| Pattern | Detection Criteria |
|---|---|
| Loop detection | N iterations of same action in T seconds |
| Data exfiltration | Bulk queries without limits |
| Privilege escalation | Sequential permission requests |
| Resource exhaustion | Cost velocity spikes |
| Prompt injection | Repeated boundary violations |
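The first criterion in the table, "N iterations of the same action in T seconds", is a classic sliding-window check. A minimal sketch, with thresholds chosen purely for illustration:

```python
# Sliding-window loop detector; class and parameter names are illustrative.
from collections import deque, defaultdict

class LoopDetector:
    def __init__(self, max_repeats: int = 40, window_seconds: float = 30.0):
        self.max_repeats = max_repeats
        self.window = window_seconds
        self.events = defaultdict(deque)  # action -> recent timestamps

    def record(self, action: str, ts: float) -> bool:
        """Record one occurrence; return True if the loop threshold is exceeded."""
        q = self.events[action]
        q.append(ts)
        # Drop timestamps that have aged out of the window.
        while q and ts - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_repeats

det = LoopDetector(max_repeats=40, window_seconds=30.0)
alerts = [det.record("file_read", t * 0.5) for t in range(47)]
print(any(alerts))  # True: 47 reads in under 30 seconds trips the detector
```

The same events spread ten seconds apart would never accumulate in the window, so normal steady usage stays below the threshold.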
### Auto-Generated Policies

The Immune System proposes policies that require human approval:

Example:

```json
{
  "proposed_by": "immune_system",
  "trigger": "Loop detected: 47 file reads in 30 seconds",
  "recommendation": {
    "type": "rate_limit",
    "rules": {
      "tool": "file_read",
      "max_per_minute": 20,
      "action": "REQUIRE_APPROVAL"
    }
  },
  "confidence": 0.89,
  "status": "PENDING_APPROVAL"
}
```
### Human-in-the-Loop
The Immune System never auto-deploys policies. Every recommendation flows through the approval queue:
1. Dashboard notification: Alert appears for administrators
2. Context provided: Full incident history and reasoning
3. One-click approval: Accept, modify, or reject
4. Audit trail: All decisions logged
## Using the Cognitive Layer

### Enabling Semantic Evaluation

In policy definitions:

```yaml
type: semantic
rules:
  check_intent: true
  deny_categories:
    - DESTRUCTIVE
    - DATA_EXFILTRATION
  require_approval_categories:
    - SUSPICIOUS
```
### Configuring the Oracle

```yaml
# Budget with Oracle integration
budget:
  amount_usd: 100.00
  period: monthly
  oracle:
    enabled: true
    safety_margin: 0.20
    alert_threshold: 0.80
```
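One plausible reading of how these settings interact: pad the prediction by `safety_margin`, add it to spend so far, and compare the projection against the budget and its `alert_threshold` fraction. This is an assumption about the evaluation logic, not documented behavior; only the key names come from the config above.

```python
# Hypothetical sketch of how the Oracle config might drive a budget check.
def check_budget(predicted: float, spent: float, amount_usd: float,
                 safety_margin: float = 0.20, alert_threshold: float = 0.80) -> str:
    padded = predicted * (1 + safety_margin)   # apply the safety buffer
    projected = spent + padded                 # projected spend this period
    if projected > amount_usd:
        return "REQUIRE_APPROVAL"              # would exceed the budget
    if projected > amount_usd * alert_threshold:
        return "WARN"                          # past the alert threshold
    return "ALLOW"

print(check_budget(predicted=12.94, spent=70.00, amount_usd=100.00))  # WARN
```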
### Viewing Immune System Proposals
Navigate to Approvals in the dashboard to see pending recommendations:
- Trigger: What pattern was detected
- Recommendation: Proposed policy
- Evidence: Historical incidents
- Actions: Approve / Modify / Dismiss
## Architecture

```text
┌─────────────────────────────────────────────────────────────────┐
│                        Cognitive Layer                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────┐ ┌──────────────────┐ ┌───────────────┐    │
│  │  Semantic Judge  │ │      Oracle      │ │ Immune System │    │
│  │                  │ │                  │ │               │    │
│  │ - Intent         │ │ - Prediction     │ │ - Detection   │    │
│  │ - Classification │ │ - Budget check   │ │ - Proposals   │    │
│  │ - Confidence     │ │ - Forecasting    │ │ - Learning    │    │
│  └────────┬─────────┘ └────────┬─────────┘ └───────┬───────┘    │
│           │                    │                   │            │
│           └────────────────────┼───────────────────┘            │
│                                ▼                                │
│                      Local LLM (Ollama)                         │
│                        llama3.2 model                           │
└─────────────────────────────────────────────────────────────────┘
```
## LLM Requirements
The Cognitive Layer uses local LLM inference via Ollama:
| Requirement | Specification |
|---|---|
| Model | llama3.2 (3B parameters) |
| RAM | 8GB minimum |
| Latency | <50ms P99 target |
| Fallback | Deterministic rules |
## Privacy Benefits
All cognitive processing happens locally:
- No data sent to external AI services
- Prompts and responses stay on-premises
- Full audit trail of LLM decisions
- Compliance-friendly architecture
## Best Practices
- Start with logging - Enable semantic evaluation in WARN mode first
- Tune thresholds - Adjust confidence levels based on your risk tolerance
- Review proposals - Check Immune System recommendations daily
- Monitor accuracy - Track false positives in the dashboard
- Update models - Keep Ollama models current for best performance
## Related Concepts
- Policies - Rules that the Cognitive Layer enhances
- Envelopes - Execution containers being analyzed
- Dashboard Guide - Viewing cognitive insights
Document Version: 1.0 | Last Updated: January 20, 2026