Show HN: Veritas AI – Multi-Agent Deliberation for Truth-Seeking


A new open-source project is tackling AI hallucination through multi-agent deliberation. Veritas AI deploys multiple AI agents that debate each other on factual claims, with a judge agent determining the most reliable answer. The approach mimics human peer review and adversarial reasoning.

The project arrives as AI hallucination remains a critical barrier to AI adoption in high-stakes domains.

The Problem

AI hallucination undermines trust in AI systems:

| Domain | Hallucination Risk | Consequence |
|--------|--------------------|-------------|
| Healthcare | High | Misdiagnosis, wrong treatments |
| Legal | High | Incorrect legal advice, case losses |
| Finance | Medium | Wrong investment decisions |
| News/Media | High | Misinformation spread |
| Education | Medium | Students learn incorrect information |
| Customer Service | Low-Medium | Frustrated customers, brand damage |

Single-agent AI systems have no built-in mechanism to catch their own errors.

The Solution

Veritas AI uses multi-agent deliberation:

Architecture

  • Claim Agent: Generates initial answer to user query
  • Critic Agents (3-5): Challenge the claim, identify errors
  • Evidence Agents (2-3): Search for supporting/contradicting evidence
  • Judge Agent: Evaluates all arguments, determines most reliable answer
  • Synthesis Agent: Produces final answer with confidence score
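
The agent roster above can be sketched as a small configuration structure. This is an illustrative model of the roles as described in this article, not the actual Veritas AI codebase; the counts use the lower end of the stated ranges.

```python
from dataclasses import dataclass

@dataclass
class AgentRole:
    name: str
    count: int
    purpose: str

# Roles and minimum counts as listed above (3-5 critics, 2-3 evidence agents).
VERITAS_ROLES = [
    AgentRole("claim", 1, "Generates the initial answer to the user query"),
    AgentRole("critic", 3, "Challenges the claim and identifies errors"),
    AgentRole("evidence", 2, "Searches for supporting/contradicting evidence"),
    AgentRole("judge", 1, "Weighs arguments, picks the most reliable answer"),
    AgentRole("synthesis", 1, "Produces the final answer with a confidence score"),
]

def total_agents(roles):
    """Total number of agents in one deliberation."""
    return sum(r.count for r in roles)
```

With the minimum counts, a single deliberation involves eight agents, which helps explain the latency overhead reported later.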

Process Flow

1. User query: Question submitted to Veritas system
2. Initial claim: Claim Agent generates answer
3. Adversarial review: Critic Agents identify weaknesses
4. Evidence gathering: Evidence Agents search for facts
5. Deliberation: All agents debate the claim
6. Judgment: Judge Agent weighs evidence and arguments
7. Final answer: Synthesis Agent produces verified response
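
The seven steps above can be sketched as a function pipeline. The agent behaviours here are stubs with a toy consensus rule; a real system would call LLMs at each step. All function names and thresholds are illustrative assumptions.

```python
def claim_agent(query):
    # Step 2: generate an initial claim (stubbed).
    return {"claim": f"answer to: {query}", "confidence": 0.6}

def critic_agents(state):
    # Step 3: adversarial review produces specific critiques (stubbed).
    state["critiques"] = ["possibly outdated source", "unstated assumption"]
    return state

def evidence_agents(state):
    # Step 4: gather supporting/contradicting evidence (stubbed).
    state["evidence"] = [{"supports": True, "source": "example.org"}]
    return state

def deliberate(state):
    # Step 5: toy rule — claim survives if supporting evidence
    # outweighs half the critiques.
    supports = sum(e["supports"] for e in state["evidence"])
    state["consensus"] = supports >= len(state["critiques"]) / 2
    return state

def judge(state):
    # Step 6: judge assigns confidence from the deliberation outcome.
    state["confidence"] = 0.9 if state["consensus"] else 0.3
    return state

def synthesize(state):
    # Step 7: final answer plus confidence score.
    return state["claim"], state["confidence"]

def run_pipeline(query):
    state = claim_agent(query)  # steps 1-2
    for step in (critic_agents, evidence_agents, deliberate, judge):
        state = step(state)
    return synthesize(state)
```

The value of the structure is that each stage is independently replaceable: swapping the stubbed critics for real LLM calls does not change the flow.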

Confidence Scoring

  • High confidence (80-100%): Strong consensus, solid evidence
  • Medium confidence (50-79%): Some disagreement, moderate evidence
  • Low confidence (20-49%): Significant disagreement, weak evidence
  • Unable to verify (0-19%): Insufficient evidence, high uncertainty
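
The four bands map directly to a threshold function. A minimal version of that mapping, using the score boundaries stated above:

```python
def confidence_bucket(score: float) -> str:
    """Map a 0-100 confidence score to the article's four bands."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    if score >= 20:
        return "low"
    return "unable to verify"
```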

How It Works

The deliberation process unfolds in stages:

Stage 1: Initial Claim

  • Claim Agent receives user query
  • Generates best answer based on training and knowledge
  • Provides confidence estimate and reasoning

Stage 2: Adversarial Challenge

  • Critic Agents review the claim independently
  • Each identifies potential errors, biases, or gaps
  • Challenges are specific and evidence-based
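
A critic agent's instructions might look like the template below. This is a hypothetical prompt written to match the description above; the project's actual prompts are not published here.

```python
# Hypothetical critic prompt; wording is an illustrative assumption.
CRITIC_PROMPT = """You are a critic agent reviewing a factual claim.

Claim: {claim}

Independently identify up to {max_issues} specific problems:
errors of fact, unstated assumptions, biased framing, or missing context.
Cite evidence for each challenge; vague objections are discarded."""

def build_critic_prompt(claim: str, max_issues: int = 3) -> str:
    """Fill the template for one independent critic pass."""
    return CRITIC_PROMPT.format(claim=claim, max_issues=max_issues)
```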

Stage 3: Evidence Gathering

  • Evidence Agents search external sources
  • Both supporting and contradicting evidence collected
  • Source credibility assessed and documented
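
Credibility assessment could be reduced to a weighted tally, where each piece of evidence contributes its source weight, signed by whether it supports or contradicts the claim. The tiers and weights below are illustrative assumptions, not the project's actual scheme.

```python
# Toy credibility tiers; weights are illustrative assumptions.
CREDIBILITY = {"peer_reviewed": 1.0, "news": 0.6, "blog": 0.3, "unknown": 0.1}

def evidence_weight(items):
    """Net support: sum of credibility weights, signed by stance."""
    return sum(
        CREDIBILITY.get(item["tier"], 0.1) * (1 if item["supports"] else -1)
        for item in items
    )
```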

Stage 4: Deliberation

  • All agents participate in structured debate
  • Claims challenged, defended, or modified
  • Consensus sought through reasoned argument

Stage 5: Judgment

  • Judge Agent evaluates all arguments and evidence
  • Determines most reliable answer
  • Assigns confidence score based on deliberation quality
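
One way a judge could turn deliberation signals into a verdict is to scale confidence by net evidence and by how many critiques were resolved during debate. The weighting scheme below is an assumption for illustration, not the project's actual rule.

```python
def judge_verdict(net_evidence, unresolved_critiques, total_critiques):
    """Return (accepted, confidence on a 0-100 scale)."""
    # Fraction of critiques resolved during deliberation.
    resolved_ratio = 1 - unresolved_critiques / max(total_critiques, 1)
    # Clamp net evidence to [-1, 1] so one source can't dominate.
    squashed = max(-1.0, min(1.0, net_evidence))
    # Start at 50 (undecided); evidence and resolution push it up or down.
    confidence = round(50 + 50 * squashed * resolved_ratio)
    return confidence >= 50, confidence
```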

Technical Implementation

Veritas AI is built on open standards:

| Component | Technology | Purpose |
|-----------|------------|---------|
| Agent framework | LangChain | Agent orchestration and communication |
| LLM backend | Multiple providers | Diverse model perspectives |
| Evidence search | Web search APIs | External fact verification |
| Deliberation protocol | Custom | Structured debate format |
| Confidence scoring | Bayesian | Probabilistic confidence estimation |
| Audit logging | Immutable ledger | Complete deliberation history |

The system is model-agnostic, working with any major LLM provider.
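
The table lists "Bayesian" confidence scoring. One simple instance of that idea, treating each agent's agree/disagree verdict as a Bernoulli trial under a Beta prior, is shown below; whether Veritas uses exactly this model is an assumption.

```python
def beta_confidence(agree, disagree, prior_a=1, prior_b=1):
    """Posterior mean of P(claim is correct) under a Beta(a, b) prior.

    Starts from a uniform Beta(1, 1) prior by default and updates it
    with the counts of agreeing and disagreeing agent verdicts.
    """
    a = prior_a + agree
    b = prior_b + disagree
    return a / (a + b)
```

With seven agents agreeing and one disagreeing, the posterior mean is 0.8, which would land in the article's "high confidence" band.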

Early Results

Beta testing shows promising outcomes:

| Metric | Single Agent | Veritas Multi-Agent | Improvement |
|--------|--------------|---------------------|-------------|
| Factual accuracy | 72% | 91% | +19 pts |
| Hallucination rate | 28% | 9% | -19 pts |
| Confidence calibration | Poor | Good | Significant |
| User trust | 3.2/5 | 4.4/5 | +1.2 pts |
| Response time | 2s | 8s | +6s overhead |

Accuracy improves significantly with modest latency cost.

Use Cases

Veritas AI is suited for high-stakes applications:

Healthcare

  • Medical information: Verified health advice
  • Drug interactions: Cross-checked medication guidance
  • Symptom analysis: Multi-perspective diagnosis support

Legal

  • Legal research: Verified case law and statutes
  • Contract review: Multi-agent clause analysis
  • Compliance checking: Cross-validated regulatory guidance

Finance

  • Investment research: Verified financial data
  • Risk assessment: Multi-perspective risk analysis
  • Regulatory compliance: Cross-checked compliance guidance

Education

  • Factual content: Verified educational materials
  • Research assistance: Multi-source fact checking
  • Critical thinking: Modeling adversarial reasoning

Key Takeaways

  • Problem: AI hallucination undermines trust in high-stakes domains
  • Solution: Multi-agent deliberation mimics human peer review
  • Architecture: Claim, Critic, Evidence, Judge, and Synthesis agents
  • Process: Initial claim → adversarial challenge → evidence → deliberation → judgment
  • Results: Factual accuracy improved from 72% to 91%, hallucination rate reduced from 28% to 9%
  • Trade-off: a 6-second latency overhead for significantly improved accuracy
  • Use cases: Healthcare, legal, finance, education where accuracy is critical

The Bottom Line

Veritas AI addresses AI’s Achilles heel: hallucination. By deploying multiple agents that challenge each other, the system catches errors that single-agent systems miss. The approach is computationally expensive but produces significantly more reliable outputs.

The multi-agent deliberation model mirrors human institutions designed for truth-seeking: scientific peer review, legal adversarial process, journalistic fact-checking. These institutions evolved because individual judgment is fallible. Collective scrutiny produces more reliable outcomes.

For high-stakes AI applications, Veritas AI offers a path forward. Healthcare, legal, and finance can’t tolerate 28% hallucination rates. Multi-agent deliberation reduces this to 9%—still not perfect, but dramatically improved.

The latency overhead is real but acceptable for many use cases. When accuracy matters more than speed, the trade-off makes sense. A doctor would rather wait 8 seconds for reliable medical information than get wrong information instantly.

The project is open-source, enabling community improvement and transparency. This is important for trust—users can inspect how deliberations work rather than treating the system as a black box.

AI hallucination won’t be eliminated entirely. But Veritas AI shows that multi-agent deliberation can significantly reduce it. For applications where accuracy is critical, this approach may become standard.

FAQ

What is Veritas AI?

Veritas AI is an open-source system that uses multi-agent deliberation to reduce AI hallucination. Multiple AI agents debate factual claims, with a judge agent determining the most reliable answer. The approach mimics human peer review and adversarial reasoning.

How does multi-agent deliberation work?

A Claim Agent generates an initial answer. Critic Agents challenge the claim. Evidence Agents search for supporting/contradicting evidence. All agents deliberate. A Judge Agent evaluates arguments and evidence to determine the most reliable answer with a confidence score.

What are the results and trade-offs?

Beta testing shows factual accuracy improved from 72% to 91%, and the hallucination rate reduced from 28% to 9%. The trade-off is a 6-second latency overhead (8s vs 2s single-agent). For high-stakes applications where accuracy matters more than speed, this is acceptable.

Sources: Hacker News Discussion, GitHub Repository, Research Paper

Tags: Veritas AI, AI Hallucination, Multi-Agent Systems, Fact Checking, AI Safety, Open Source
