Provable
Cited answers for questions. Ranked chunks for agents. Both signed, both verifiable, both on your machine.
from provable import Pipeline
p = Pipeline.from_documents("./pdfs/")
ans = p.query("data retention policy?")
ans.answer # cited text
ans.citations # source spans
ans.proof # SHA-256 Merkle
Every stage is deterministic, inspectable, and reproducible. No black-box embedding API. No vendor upload. No surprises.
Pure Python. Runs entirely on your machine.
from provable import Pipeline
pipeline = Pipeline.from_documents("./my_corpus/")
result = pipeline.query("What does our compliance policy say about data retention?")
# result.answer → cited natural-language answer
# result.citations → exact source spans with doc IDs
# result.proof → SHA-256 Merkle commitment (verifiable in under 40 ms)
# result.verdict → "ANSWERED" or "ABSTAINED"
# Same query against a running Provable server
curl "https://your-host/api/query?q=What+does+our+compliance+policy+say"
# Independently verify any returned proof:
curl -X POST "https://your-host/api/verify" \
  -H "content-type: application/json" \
  -d '{"query": "...", "retrieved_doc_ids": [...], "proof_signature": "PROOF-..."}'
# → { "valid": true, "verify_ms": 1.8, "reason": "all hashes match" }
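For intuition about what a check like this involves, here is a minimal sketch of a SHA-256 Merkle commitment over a query and its retrieved document IDs. It is an illustration under stated assumptions, not Provable's actual proof format: the leaf layout, the domain-separation bytes, the full-length hex encoding, and the function names `merkle_root`, `commit`, and `verify` are all hypothetical.

```python
import hashlib
import hmac


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash leaves, then fold pairwise until one root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Assumption: prefix leaves with 0x00 and interior nodes with 0x01
    # so a leaf can never be mistaken for an interior node.
    level = [_sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # carry an unpaired node up unchanged
        level = nxt
    return level[0]


def commit(query: str, doc_ids: list[str]) -> str:
    """Hypothetical signature: 'PROOF-' plus the hex Merkle root."""
    leaves = [query.encode("utf-8")] + [d.encode("utf-8") for d in doc_ids]
    return "PROOF-" + merkle_root(leaves).hex().upper()


def verify(query: str, doc_ids: list[str], proof_signature: str) -> bool:
    """Recompute the commitment and compare in constant time."""
    expected = commit(query, doc_ids).encode("utf-8")
    return hmac.compare_digest(expected, proof_signature.encode("utf-8"))
```

Verification in this sketch is just recomputation: anyone holding the query and the document IDs can rebuild the root and compare. Changing, adding, or reordering a single document ID produces a different root.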
{
  "verdict": "ANSWERED",
  "answer": "Data retention policy requires 7-year storage...",
  "primary_doc_id": "policies/retention-2024.pdf#p3",
  "citations": [
    { "doc_id": "policies/retention-2024.pdf#p3", "score": 9.41 },
    { "doc_id": "compliance/gdpr-summary.md", "score": 7.22 }
  ],
  "proof_signature": "PROOF-A8B3C1D2E4F5G6H7I8J9K0L1",
  "latency_ms": 113.4
}
There is no "best guess." There is no fabricated citation. Every query lands in exactly one of two sets: ANSWERED or ABSTAINED.
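A caller can lean on that two-way split directly. The sketch below assumes a response dict shaped like the sample JSON on this page; the `render` helper and its abstention message are hypothetical, not part of Provable.

```python
def render(result: dict) -> str:
    """Turn a response dict into display text, branching on the verdict."""
    verdict = result.get("verdict")
    if verdict == "ABSTAINED":
        # No answer is shown at all; nothing is guessed.
        return "No supported answer found in the corpus."
    if verdict != "ANSWERED":
        raise ValueError(f"unexpected verdict: {verdict!r}")
    citations = result.get("citations", [])
    if not citations:
        # Assumption: an ANSWERED result always carries at least one citation.
        raise ValueError("ANSWERED result carries no citations")
    sources = ", ".join(c["doc_id"] for c in citations)
    return f"{result['answer']} [sources: {sources}]"
```

Treating any third verdict as an error keeps the caller honest: an unknown state fails loudly instead of being shown as an answer.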
Modern agents waste their context window on irrelevant text. Provable filters your corpus down to the chunks that actually matter for the decision. Cited, signed, ranked. Drop them straight into the system prompt. The agent decides on verified ground truth.
retention-policy-2024.pdf#p3.
Redact PII after year 5 per gdpr-summary.pdf#p12.
PROOF-A8B3C1D2E4F5G6H7
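Dropping ranked chunks into a system prompt could look like the sketch below. It assumes each chunk is a dict with `doc_id`, `score`, and `text` keys. Only `doc_id` and `score` appear in the sample response on this page, so the `text` field, the `build_system_prompt` helper, and the prompt wording are hypothetical.

```python
def build_system_prompt(chunks: list[dict], proof_signature: str,
                        max_chunks: int = 5) -> str:
    """Assemble a system prompt from the highest-scoring chunks."""
    # Keep only the top-scoring chunks so the context window stays small.
    ranked = sorted(chunks, key=lambda c: c["score"], reverse=True)[:max_chunks]
    lines = [
        "Answer using only the verified sources below. "
        "Cite the doc_id for every claim.",
        "",
    ]
    for i, chunk in enumerate(ranked, start=1):
        lines.append(f"[{i}] {chunk['doc_id']} (score {chunk['score']:.2f})")
        lines.append(chunk["text"])  # assumed field, see note above
        lines.append("")
    # Carry the proof along so the agent's output can be audited later.
    lines.append(f"Proof: {proof_signature}")
    return "\n".join(lines)
```

Capping the chunk count is a design choice for this sketch: the agent sees fewer, higher-ranked sources rather than the whole corpus.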
Clinical AI that cites. Audit trail verifiable in milliseconds.
Real cases, real paragraphs. Opposing counsel verifies the citation from the signed proof in under 40 ms.
Cited to the actual disclosure document. Supervision review becomes minutes, not hours.
Cited to the actual regulation. The administrative record writes itself.
Pre-loaded with 518 documents across medical, legal, finance, science, and software. Drop your own folder anytime. Same engine.
One folder, any format. Cited answers, signed proofs. Honest abstentions.