Security & Data Handling

erabot.ai scans code that is, by definition, proprietary. This page describes exactly what we see, what we keep, and what we never do. It is deliberately specific — if you are buying for a team that handles customer data, this is the page your security lead will read.

How we handle your code

Code you submit (paste, upload, or GitHub connect) is parsed with tree-sitter locally in our scanner. Before any code is sent to our audit LLM (Gemini 2.0 Flash), it passes through a redactor that strips obvious secrets (API keys, JWTs, private keys, AWS credentials) and common PII patterns (emails, addresses, SSN-shaped strings). Redacted segments are replaced with sentinel tokens before prompt assembly.
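As an illustration only, a redaction pass of this kind can be sketched with regular expressions. This is not erabot.ai's actual redactor: the pattern list, the `redact` function name, and the `[REDACTED_...]` sentinel format are all assumptions made for the example, and a production redactor would need far broader coverage.

```python
import re

# Hypothetical patterns for illustration; real secret formats are more varied.
PATTERNS = [
    ("PRIVATE_KEY", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    )),
    ("AWS_ACCESS_KEY", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("JWT", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    ("EMAIL", re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def redact(source: str) -> str:
    """Replace each matched secret or PII span with a sentinel token."""
    for label, pattern in PATTERNS:
        # The sentinel format is an assumption, not the product's real token.
        source = pattern.sub(f"[REDACTED_{label}]", source)
    return source


if __name__ == "__main__":
    sample = 'key = "AKIAABCDEFGHIJKLMNOP"  # contact bob@example.com'
    print(redact(sample))
```

Running the redaction before prompt assembly means the model only ever sees the sentinel, so a secret cannot be echoed back in a finding.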

The redactor also wraps untrusted code in <untrusted_input> XML tags so that LLM prompt-injection attempts in your code cannot manipulate our audit prompt. Our system prompt explicitly instructs the model to ignore instructions appearing inside those tags.
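The wrapping step can be sketched as follows. The source only states that code is wrapped in `<untrusted_input>` tags; escaping any copies of the delimiter that already appear inside the submitted code is an assumption added here, since without it a file could close the wrapper early. The function names are hypothetical.

```python
import re

OPEN_TAG = "<untrusted_input>"
CLOSE_TAG = "</untrusted_input>"

# Matches the opening or closing delimiter in any letter case.
_DELIMITER = re.compile(r"<(/?)untrusted_input>", re.IGNORECASE)


def wrap_untrusted(code: str) -> str:
    """Wrap submitted code so the model can tell data from instructions."""
    # Assumption: neutralize embedded delimiters so the code cannot
    # terminate the wrapper and smuggle text outside it.
    safe = _DELIMITER.sub(r"&lt;\1untrusted_input&gt;", code)
    return f"{OPEN_TAG}\n{safe}\n{CLOSE_TAG}"


def build_prompt(system_prompt: str, code: str) -> str:
    """Assemble the audit prompt: trusted instructions, then wrapped code."""
    return f"{system_prompt}\n\n{wrap_untrusted(code)}"
```

Delimiting is a mitigation rather than a guarantee: it relies on the model following the system prompt's instruction to ignore anything inside the tags.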

Retention

  • Code snippets and pasted files: retained only for the duration of the scan job. Purged within 1 hour of scan completion.
  • Scan findings and agent-instructions.md: retained on your account until you delete them. 30-day rolling backups. Fully encrypted at rest.
  • LLM audit logs: we log prompt metadata (model, token counts, latency) but not the redacted prompt content itself. Logs retained 90 days.
  • Training: we do not train models on your code, your scans, or your findings. Ever. Gemini is called with standard safety settings and no sharing.

Encryption

  • TLS 1.3 in flight for every request.
  • API keys and provider credentials stored with Fernet symmetric encryption at rest. The master key is managed by Fly.io secrets, never committed to the repo.
  • JWTs live in HTTP-only, Secure, SameSite=Strict cookies. No tokens in localStorage.
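For readers who want to see what those cookie attributes look like on the wire, here is a minimal sketch using Python's standard `http.cookies` module. The cookie name `session`, the path, and the one-hour lifetime are assumptions for the example, not documented erabot.ai values.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str, max_age_seconds: int = 3600) -> str:
    """Build a Set-Cookie value with HttpOnly, Secure, and SameSite=Strict."""
    cookie = SimpleCookie()
    cookie["session"] = token          # cookie name is hypothetical
    morsel = cookie["session"]
    morsel["httponly"] = True          # not readable from JavaScript
    morsel["secure"] = True            # only sent over HTTPS
    morsel["samesite"] = "Strict"      # not sent on cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


if __name__ == "__main__":
    print("Set-Cookie:", session_cookie_header("aaa.bbb.ccc"))
```

Keeping the token out of localStorage and behind HttpOnly means a cross-site scripting bug cannot read it directly from page scripts.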

Access control

  • Every resource (scans, API keys, projects) is scoped by user_id in the ORM and enforced in the route layer. Cross-tenant access is rejected with HTTP 404 (not 403) so resource existence isn’t leaked.
  • Budget and quota operations use SELECT FOR UPDATE on Postgres so concurrent calls can’t race around the limit.
  • Team tier supports role-based access control (admin, member, viewer). SSO / SAML is available on Enterprise.
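The 404-instead-of-403 behavior described above can be sketched with an in-memory stand-in for the database. The `Scan` type, the `SCANS` table, and `get_scan_for_user` are hypothetical names for illustration; the real service does this through its ORM and route layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    id: int
    user_id: int


class NotFound(Exception):
    """Maps to an HTTP 404 response in the route layer."""
    status_code = 404


# In-memory stand-in for the scans table (illustrative data only).
SCANS = {1: Scan(id=1, user_id=10), 2: Scan(id=2, user_id=20)}


def get_scan_for_user(scan_id: int, user_id: int) -> Scan:
    """Return the scan only if it exists and belongs to the caller."""
    scan = SCANS.get(scan_id)
    # A missing scan and another tenant's scan raise the same error with
    # the same message, so a caller cannot probe which IDs exist.
    if scan is None or scan.user_id != user_id:
        raise NotFound("scan not found")
    return scan
```

The key design choice is that both failure cases share one code path, which keeps the response indistinguishable in status and body.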

Eval methodology & accuracy

F1 = 1.00: call-site detection, 105 OSS repos + fixtures, bootstrap 95% CI [1.00, 1.00]

This measures whether we correctly find every LLM API call in source — not whether the findings written on top are high-quality recommendations. Full methodology, per-language / per-provider / per-complexity breakdown, and raw numbers: /eval.

We publish our eval results because a scanner that can’t tell you its precision is a liability, not a tool. The corpus is 105 real-world open-source repositories across Python, TypeScript, JavaScript, Go, and Rust, stratified by LLM provider usage, plus ~80 synthetic fixtures. Results are regenerated on every main branch build — the number you see above is what the latest commit produced, not a snapshot.
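To make the headline number easier to interpret, here is a minimal sketch of a percentile bootstrap over per-repository detection counts. The resampling unit (one repository), the micro-averaged F1, the resample count, and the toy counts are all assumptions for illustration; they are not the published methodology or data, which live at /eval.

```python
import random


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    # With nothing to find and nothing reported, this sketch returns 1.0.
    return 2 * tp / denom if denom else 1.0


def bootstrap_f1_ci(per_repo, n_resamples=1000, seed=0):
    """Percentile 95% CI for micro-averaged F1, resampling repositories.

    per_repo is a list of (tp, fp, fn) tuples, one per repository.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_resamples):
        sample = rng.choices(per_repo, k=len(per_repo))
        tp = sum(r[0] for r in sample)
        fp = sum(r[1] for r in sample)
        fn = sum(r[2] for r in sample)
        scores.append(f1_score(tp, fp, fn))
    scores.sort()
    low = scores[int(0.025 * (n_resamples - 1))]
    high = scores[int(0.975 * (n_resamples - 1))]
    return low, high


if __name__ == "__main__":
    # Toy counts, not real results: every repository detected perfectly.
    perfect = [(5, 0, 0)] * 105
    print(bootstrap_f1_ci(perfect))
```

When every resampled unit is perfect, every resample scores 1.0 and the interval collapses to a single point. An interval of [1.00, 1.00] therefore says the corpus contains no observed misses; it does not bound the error rate on code unlike the corpus.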

Finding-quality accuracy (does this recommendation actually save money when applied?) is a separate dimension we label per-finding with confidence bands (high / medium / low / directional only) rather than a single headline number. This is the honest framing.

SOC 2 roadmap

We do not have a SOC 2 Type II report today. We are onboarding to a compliance-automation platform (Vanta); the target window for the Type II report is Q3 2026, with Type I attestation targeted earlier, in Q2 2026.

In the meantime, Enterprise contracts include a custom DPA and, on request, specific data-residency commitments. Book a 30-minute call at /for-teams to discuss your requirements before the attestation lands.

Infrastructure

  • Backend: Fly.io (Frankfurt region primary, US-East replica).
  • Frontend: Vercel edge network.
  • Database: Postgres with encrypted-at-rest volumes, daily snapshots.
  • Cache: Redis, TLS-only.
  • Status page: status.erabot.ai

Responsible disclosure

Found a vulnerability? Email security@erabot.ai. We respond within 24 hours and prioritize ethical researchers. We do not pursue legal action against good-faith researchers who follow standard disclosure practice.

This page is not a legal contract. For the binding version, see the Terms and Privacy Policy. Last updated 2026-04-14.