Show HN: Veritas AI – Multi-Agent Deliberation for Truth-Seeking
A new open-source project is tackling AI hallucination through multi-agent deliberation. Veritas AI deploys multiple AI agents that debate each other on factual claims, with a judge agent determining the most reliable answer. The approach mimics human peer review and adversarial reasoning.
The project arrives as AI hallucination remains a critical barrier to AI adoption in high-stakes domains.
The Problem
AI hallucination undermines trust in AI systems:
| Domain | Hallucination Risk | Consequence |
|--------|--------------------|-------------|
| Healthcare | High | Misdiagnosis, wrong treatments |
| Legal | High | Incorrect legal advice, case losses |
| Finance | Medium | Wrong investment decisions |
| News/Media | High | Misinformation spread |
| Education | Medium | Students learn incorrect information |
| Customer Service | Low-Medium | Frustrated customers, brand damage |
Single-agent AI systems have no built-in mechanism to catch their own errors.
The Solution
Veritas AI uses multi-agent deliberation:
Architecture
- Claim Agent: Generates initial answer to user query
- Critic Agents (3-5): Challenge the claim, identify errors
- Evidence Agents (2-3): Search for supporting/contradicting evidence
- Judge Agent: Evaluates all arguments, determines most reliable answer
- Synthesis Agent: Produces final answer with confidence score
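The agent roster above can be sketched as a small configuration table. This is an illustrative sketch only; the role names, instance counts, and `AgentSpec` structure are assumptions for demonstration, not Veritas AI's actual API.

```python
from dataclasses import dataclass

@dataclass
class AgentSpec:
    role: str         # one of: claim, critic, evidence, judge, synthesis
    count: int        # how many instances participate in a deliberation
    instruction: str  # role-specific prompt given to the underlying LLM

# One possible roster matching the architecture described above
# (4 critics and 2 evidence agents fall within the stated ranges).
ROSTER = [
    AgentSpec("claim", 1, "Answer the user's question and explain your reasoning."),
    AgentSpec("critic", 4, "Identify factual errors, biases, or gaps in the claim."),
    AgentSpec("evidence", 2, "Find evidence that supports or contradicts the claim."),
    AgentSpec("judge", 1, "Weigh all arguments and select the most reliable answer."),
    AgentSpec("synthesis", 1, "Produce the final answer with a confidence score."),
]

total_agents = sum(spec.count for spec in ROSTER)
```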
Process Flow
1. User query: Question submitted to Veritas system
2. Initial claim: Claim Agent generates answer
3. Adversarial review: Critic Agents identify weaknesses
4. Evidence gathering: Evidence Agents search for facts
5. Deliberation: All agents debate the claim
6. Judgment: Judge Agent weighs evidence and arguments
7. Final answer: Synthesis Agent produces verified response
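The seven-step flow can be condensed into a single pipeline function. The function and parameter names below are illustrative assumptions, with trivial lambdas standing in for real LLM-backed agents so the sketch runs end to end; the actual project orchestrates these calls through LangChain.

```python
def run_deliberation(query, claim_agent, critics, evidence_agents, judge, synthesizer):
    """One pass through the Veritas-style flow: claim -> critique -> evidence -> verdict."""
    claim = claim_agent(query)                        # step 2: initial claim
    critiques = [c(claim) for c in critics]           # step 3: adversarial review
    evidence = [e(claim) for e in evidence_agents]    # step 4: evidence gathering
    verdict = judge(claim, critiques, evidence)       # steps 5-6: deliberation + judgment
    return synthesizer(verdict)                       # step 7: final verified answer

# Toy stand-ins for the agents, just to exercise the pipeline:
answer = run_deliberation(
    "Is water H2O?",
    claim_agent=lambda q: "Yes, water is H2O.",
    critics=[lambda c: "No objection found."],
    evidence_agents=[lambda c: "Chemistry references agree."],
    judge=lambda c, crit, ev: {"answer": c, "confidence": 0.95},
    synthesizer=lambda v: f"{v['answer']} (confidence {v['confidence']:.0%})",
)
```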
Confidence Scoring
- High confidence (80-100%): Strong consensus, solid evidence
- Medium confidence (50-79%): Some disagreement, moderate evidence
- Low confidence (20-49%): Significant disagreement, weak evidence
- Unable to verify (0-19%): Insufficient evidence, high uncertainty
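The four confidence bands above map directly to a threshold function; a minimal version, using the article's stated ranges:

```python
def confidence_band(score):
    """Map a 0-100 confidence score to the bands defined above."""
    if score >= 80:
        return "High confidence"
    if score >= 50:
        return "Medium confidence"
    if score >= 20:
        return "Low confidence"
    return "Unable to verify"
```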
How It Works
The deliberation process unfolds in stages:
Stage 1: Initial Claim
- Claim Agent receives user query
- Generates best answer based on training and knowledge
- Provides confidence estimate and reasoning
Stage 2: Adversarial Challenge
- Critic Agents review the claim independently
- Each identifies potential errors, biases, or gaps
- Challenges are specific and evidence-based
Stage 3: Evidence Gathering
- Evidence Agents search external sources
- Both supporting and contradicting evidence collected
- Source credibility assessed and documented
Stage 4: Deliberation
- All agents participate in structured debate
- Claims challenged, defended, or modified
- Consensus sought through reasoned argument
Stage 5: Judgment
- Judge Agent evaluates all arguments and evidence
- Determines most reliable answer
- Assigns confidence score based on deliberation quality
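One way the judgment stage could turn critiques and evidence into a score is a simple penalty/bonus model. The weights below are purely illustrative assumptions, not the project's actual scoring rule (which the implementation table describes as Bayesian):

```python
def judge_claim(unresolved_objections, supporting, contradicting):
    """Score a claim 0-100 from deliberation signals (toy weighting)."""
    score = 100
    score -= 15 * len(unresolved_objections)  # each surviving critic objection
    score -= 25 * len(contradicting)          # contradicting evidence costs more
    score += 5 * len(supporting)              # supporting evidence adds a little
    return max(0, min(100, score))            # clamp to the 0-100 band scale
```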
Technical Implementation
Veritas AI is built on open standards:
| Component | Technology | Purpose |
|-----------|------------|---------|
| Agent framework | LangChain | Agent orchestration and communication |
| LLM backend | Multiple providers | Diverse model perspectives |
| Evidence search | Web search APIs | External fact verification |
| Deliberation protocol | Custom | Structured debate format |
| Confidence scoring | Bayesian | Probabilistic confidence estimation |
| Audit logging | Immutable ledger | Complete deliberation history |
The system is model-agnostic, working with any major LLM provider.
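The table lists Bayesian confidence estimation. A minimal version of that idea treats each agent signal as a likelihood ratio and updates the prior in odds form; the function below is a sketch of the general technique, not the project's actual scoring code.

```python
def bayesian_update(prior, likelihood_ratios):
    """Update P(claim is true) given per-signal likelihood ratios.

    A ratio > 1 means the signal (a critique resolved in the claim's
    favor, a supporting source) favors the claim; < 1 opposes it.
    """
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Two moderately supportive signals lift a 50% prior to 90%:
posterior = bayesian_update(0.5, [3.0, 3.0])
```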
Early Results
Beta testing shows promising outcomes:
| Metric | Single Agent | Veritas Multi-Agent | Improvement |
|--------|--------------|---------------------|-------------|
| Factual accuracy | 72% | 91% | +19 pts |
| Hallucination rate | 28% | 9% | -19 pts |
| Confidence calibration | Poor | Good | Significant |
| User trust | 3.2/5 | 4.4/5 | +1.2 pts |
| Response time | 2s | 8s | +6s overhead |
Accuracy improves significantly at a modest latency cost.
Use Cases
Veritas AI is suited for high-stakes applications:
Healthcare
- Medical information: Verified health advice
- Drug interactions: Cross-checked medication guidance
- Symptom analysis: Multi-perspective diagnosis support
Legal
- Legal research: Verified case law and statutes
- Contract review: Multi-agent clause analysis
- Compliance checking: Cross-validated regulatory guidance
Finance
- Investment research: Verified financial data
- Risk assessment: Multi-perspective risk analysis
- Regulatory compliance: Cross-checked compliance guidance
Education
- Factual content: Verified educational materials
- Research assistance: Multi-source fact checking
- Critical thinking: Modeling adversarial reasoning
Key Takeaways
- Problem: AI hallucination undermines trust in high-stakes domains
- Solution: Multi-agent deliberation mimics human peer review
- Architecture: Claim, Critic, Evidence, Judge, and Synthesis agents
- Process: Initial claim → adversarial challenge → evidence → deliberation → judgment
- Results: Factual accuracy improved from 72% to 91%, hallucination rate reduced from 28% to 9%
- Trade-off: 6-second latency overhead for significantly improved accuracy
- Use cases: Healthcare, legal, finance, education where accuracy is critical
The Bottom Line
Veritas AI addresses AI’s Achilles heel: hallucination. By deploying multiple agents that challenge each other, the system catches errors that single-agent systems miss. The approach is computationally expensive but produces significantly more reliable outputs.
The multi-agent deliberation model mirrors human institutions designed for truth-seeking: scientific peer review, legal adversarial process, journalistic fact-checking. These institutions evolved because individual judgment is fallible. Collective scrutiny produces more reliable outcomes.
For high-stakes AI applications, Veritas AI offers a path forward. Healthcare, legal, and finance can’t tolerate 28% hallucination rates. Multi-agent deliberation reduces this to 9%—still not perfect, but dramatically improved.
The latency overhead is real but acceptable for many use cases. When accuracy matters more than speed, the trade-off makes sense. A doctor would rather wait 8 seconds for reliable medical information than get wrong information instantly.
The project is open-source, enabling community improvement and transparency. This is important for trust—users can inspect how deliberations work rather than treating the system as a black box.
AI hallucination won’t be eliminated entirely. But Veritas AI shows that multi-agent deliberation can significantly reduce it. For applications where accuracy is critical, this approach may become standard.
FAQ
What is Veritas AI?
Veritas AI is an open-source system that uses multi-agent deliberation to reduce AI hallucination. Multiple AI agents debate factual claims, with a judge agent determining the most reliable answer. The approach mimics human peer review and adversarial reasoning.
How does multi-agent deliberation work?
A Claim Agent generates an initial answer. Critic Agents challenge the claim. Evidence Agents search for supporting/contradicting evidence. All agents deliberate. A Judge Agent evaluates arguments and evidence to determine the most reliable answer with a confidence score.
What are the results and trade-offs?
Beta testing shows factual accuracy improved from 72% to 91%, and the hallucination rate reduced from 28% to 9%. The trade-off is a 6-second latency overhead (8s vs 2s for a single agent). For high-stakes applications where accuracy matters more than speed, this is acceptable.
—
Sources: Hacker News Discussion, GitHub Repository, Research Paper
Tags: Veritas AI, AI Hallucination, Multi-Agent Systems, Fact Checking, AI Safety, Open Source