What sits between a forwarded scam and the reply you read
This page is for curious readers who want to know what kinds of attacks our service is built against, and what categories of defense are in place. We name the techniques; we don't publish the recipe.
Who this page is for
The rest of our site is written for a general audience — plain English, no jargon, big buttons. This page is the one place where we go a layer deeper, for readers who want to evaluate our service the way they would any other piece of security software.
What follows is honest about the categories of detection and defense in place. It is deliberately not a how-to: we leave out specific thresholds, regex patterns, rate limits, and other knobs that would help an attacker tune around our checks. If you have a question this page doesn't answer, write to help@tro-net.com and a person will reply.
The threats we built against
We optimize for the categories of attack that disproportionately target older email users — and the technical evasions modern scam infrastructure uses to land them.
- Brand impersonation. Mail that claims to be from a bank, a tax authority, a shipping carrier, a utility, or a major retailer, but is sent from a domain the real organization does not own.
- Look-alike domains. Sender or link domains that visually resemble a real brand — homograph (Cyrillic/Greek confusables), substring ("brand-secure-login.com"), or near-miss typo squats.
- Authentication forgery. Mail whose From: header is set to a brand's domain but which fails the cryptographic checks that real brand mail passes.
- Weaponized attachments. Files with renamed extensions (invoice.pdf.exe), payload formats disguised as documents, and archive types commonly used to evade naive content scanners.
- Prompt injection inside the email body. Hostile instructions, invisible Unicode, and manipulation payloads embedded in the forwarded message, designed to re-target an automated reader.
- Reply-loop and amplification abuse. Mail crafted to provoke an automated reply (to ourselves, to a third party, or in a self-sustaining loop), and high-volume bursts intended to exhaust a service budget.
- Cross-user content collisions. One user's reported scam being re-served to another user from a stale cache, or a compromised sender appearing "safe" because of a confidently-wrong prior verdict.
Sender authentication
The single most reliable phishing signal is whether the sender is who they say they are. Before we read a single word of the message, we verify it.
Every inbound message is parsed for SPF, DKIM, and DMARC results — the three standards that let a domain owner say "this server is allowed to send for me" and "this message has not been altered in flight." We look at alignment (whether the authenticated domain matches the visible From: domain), not just whether each check independently passed.
For mail that has been forwarded — which is most of what we receive, because that's the product — we walk the message's ARC (Authenticated Received Chain) seal so we can recover the sender's original authentication state from before the forwarding hop. A forwarded scam doesn't get a free pass just because the forwarder relayed it cleanly.
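The alignment idea can be illustrated in a few lines of Python. This is a minimal sketch and not the production code: the function names are hypothetical, and `org_domain` is a deliberately naive stand-in that keeps the last two labels, where a real implementation would consult the Public Suffix List.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only. Real code should use the Public
    Suffix List so that hosts like "example.co.uk" are handled correctly.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_aligned(from_domain: str, authenticated_domain: str, strict: bool = False) -> bool:
    """True if the authenticated domain aligns with the visible From: domain."""
    a = from_domain.lower().rstrip(".")
    b = authenticated_domain.lower().rstrip(".")
    if strict:
        return a == b                        # strict: exact match only
    return org_domain(a) == org_domain(b)    # relaxed: same organizational domain


def dmarc_style_pass(from_domain: str,
                     spf_pass: bool, spf_domain: str,
                     dkim_pass: bool, dkim_domain: str) -> bool:
    """Pass only if at least one mechanism both passes AND aligns."""
    spf_ok = spf_pass and is_aligned(from_domain, spf_domain)
    dkim_ok = dkim_pass and is_aligned(from_domain, dkim_domain)
    return spf_ok or dkim_ok
```

The point of the last function is that a message can pass SPF for some unrelated bulk-sending domain and still fail here, because the domain that passed is not the domain shown in From:.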
Look-alike domain detection
A scam often points at a domain that isn't a brand's real domain — but is close enough that a human reader won't notice. We compare every URL host (and the sender's domain) against a curated set of canonical domains for known-impersonated organizations.
The comparison runs four passes, in order of decreasing confidence:
- Exact and suffix match. The domain (or one of its subdomains) is a real canonical domain for the brand.
- Homograph match. The domain uses Unicode confusables (Cyrillic / Greek / digit lookalikes) to mimic a brand's spelling.
- Substring match. A brand's name appears as a label inside a longer host ("brand-account-verify.example") that the brand does not own.
- Bounded edit-distance match. A short, controlled tolerance for typo squats, scoped to brand stems long enough to make false positives unlikely.
When three independent gates agree (the body claims a brand, the sender domain doesn't authenticate as that brand, and a link domain is a look-alike for that brand), they produce the strongest phishing signal we have. Signals are weighted so that a single-pass match on its own never short-circuits to a final verdict.
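The four passes can be sketched as follows. Everything here is illustrative: the brand `examplebank`, the tiny confusables map, and the values of `MIN_STEM_LEN` and `MAX_EDITS` are assumptions chosen for the example, not the thresholds used in production.

```python
# Hypothetical brand -> canonical domains table.
CANONICAL = {"examplebank": {"examplebank.com"}}

# Tiny illustrative confusables map (Cyrillic a, e, o, r, s and two digits).
# A real table would be far larger.
CONFUSABLES = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c",
    "0": "o", "1": "l",
})

MIN_STEM_LEN = 8   # assumed: only fuzzy-match reasonably long brand stems
MAX_EDITS = 1      # assumed: tolerate a single typo


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def _is_canonical(host: str, brand: str) -> bool:
    return any(host == real or host.endswith("." + real) for real in CANONICAL[brand])


def classify_host(host: str, brand: str) -> str:
    host = host.lower().rstrip(".")
    # Pass 1: exact or suffix match against a canonical domain.
    if _is_canonical(host, brand):
        return "canonical"
    # Pass 2: homograph. Folding confusables turns the host into a canonical one.
    folded = host.translate(CONFUSABLES)
    if folded != host and _is_canonical(folded, brand):
        return "homograph"
    # Pass 3: substring. The brand appears as a dot- or hyphen-separated part.
    parts = [p for label in host.split(".") for p in label.split("-")]
    if brand in parts:
        return "substring"
    # Pass 4: bounded edit distance, only for long-enough brand stems.
    if len(brand) >= MIN_STEM_LEN and any(
        0 < edit_distance(label, brand) <= MAX_EDITS for label in host.split(".")
    ):
        return "typosquat"
    return "no_match"
```

Each pass returns a tier rather than a verdict, which fits the weighting described above: the tier is one input among several.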
Attachment screening
Filenames lie. We treat the declared file type as a hint, not a fact.
Inbound attachments go through two layers of inspection:
- Filename heuristics. Suspicious extensions, double-extension patterns (invoice.pdf.exe), and archive containers commonly used to evade content scanners are flagged at the filename level.
- Magic-byte sniffing. We read the leading bytes of the actual attachment payload and identify the real file format. A file declared .pdf but whose first bytes identify it as a Windows executable produces a "declared-vs-actual mismatch" signal that travels with the message into the rest of the pipeline.
Attachment signals are deliberately allowed to override our verdict cache: two messages with identical bodies but different attachments never share a cached classification.
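A minimal sketch of the two inspection layers, assuming a very small table of well-known file signatures. The extension sets, the signal names, and the `attachment_signals` function are hypothetical; a real scanner would recognize many more formats.

```python
import os

# A few well-known leading-byte signatures.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"MZ", "windows-executable"),
    (b"PK\x03\x04", "zip"),
]

# Illustrative tables only.
EXT_TO_FORMAT = {".pdf": "pdf", ".exe": "windows-executable", ".zip": "zip"}
DOCUMENT_EXTS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".txt", ".jpg"}
EXECUTABLE_EXTS = {".exe", ".scr", ".bat", ".cmd", ".js"}


def sniff_format(payload: bytes):
    """Return the format implied by the leading bytes, or None if unknown."""
    for magic, fmt in MAGIC:
        if payload.startswith(magic):
            return fmt
    return None


def attachment_signals(filename: str, payload: bytes) -> dict:
    stem, ext = os.path.splitext(filename.lower())
    inner_ext = os.path.splitext(stem)[1]   # ".pdf" for "invoice.pdf.exe"
    declared = EXT_TO_FORMAT.get(ext)
    actual = sniff_format(payload)
    return {
        # Layer 1: filename heuristics.
        "double_extension": inner_ext in DOCUMENT_EXTS and ext in EXECUTABLE_EXTS,
        # Layer 2: the declared type is a hint, the bytes are the fact.
        "declared_vs_actual_mismatch": (
            declared is not None and actual is not None and declared != actual
        ),
        "actual_format": actual,
    }
```

In this sketch a file named invoice.pdf that starts with the bytes `MZ` sets the mismatch signal even though its name looks harmless.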
Content analysis
Once the structural checks above are done, the remaining work is reading. We use a machine-learning classifier — a large language model — operated under tight constraints.
The classifier sees the email body, the user's optional framing note, and a structured bundle of pre-extracted signals (authentication results, brand-mismatch tier, attachment heuristics, link metadata). It returns a structured JSON verdict — verdict tier, confidence, reasons, recommended actions, and an optional impersonated organization slug — which we validate against a strict schema before using.
A few constraints that matter:
- Two-track routing. When the deterministic gates above produce a clear-cut picture (strong brand-mismatch with failing authentication, dangerous attachment type, obvious newsletter structure), we route to a smaller, faster model. Ambiguous cases get the larger one. The router uses signal strength, never the email's own text.
- Hedged fallback verdicts. If the model output fails schema validation, fails our content checks, or the call itself errors, we fall back to a hedged "we couldn't analyze this — here's what to check yourself" reply. We never silently substitute a confident verdict.
- Strict separation of trusted and untrusted text. Our pre-extracted signals (which are produced by our own code) and the email body (which is hostile input) live in clearly-delimited regions of the prompt. The system prompt warns the model not to follow instructions found inside the email body.
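The "validate, or fall back to a hedged reply" rule might look like this in outline. The field names, tier names, and fallback wording are assumptions made for the example; the real schema is not published here.

```python
import copy
import json

ALLOWED_TIERS = {"safe", "suspicious", "dangerous"}   # hypothetical tier names

HEDGED_FALLBACK = {
    "verdict": "unknown",
    "confidence": 0.0,
    "reasons": ["We couldn't analyze this message automatically."],
    "actions": ["Check the sender and any links yourself before acting."],
    "org_slug": None,
    "fallback": True,
}


def parse_verdict(raw: str) -> dict:
    """Validate model output; on any problem, return the hedged fallback."""
    try:
        data = json.loads(raw)               # JSONDecodeError is a ValueError
        if not isinstance(data, dict):
            raise ValueError("not an object")
        if data.get("verdict") not in ALLOWED_TIERS:
            raise ValueError("unknown tier")
        conf = data.get("confidence")
        if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
            raise ValueError("bad confidence")
        for key in ("reasons", "actions"):
            items = data.get(key)
            if not isinstance(items, list) or not all(isinstance(x, str) for x in items):
                raise ValueError("bad " + key)
        slug = data.get("org_slug")
        if slug is not None and not isinstance(slug, str):
            raise ValueError("bad slug")
    except ValueError:
        return copy.deepcopy(HEDGED_FALLBACK)
    # Copy only the whitelisted fields; anything else the model added is dropped.
    return {
        "verdict": data["verdict"],
        "confidence": float(conf),
        "reasons": list(data["reasons"]),
        "actions": list(data["actions"]),
        "org_slug": slug,
        "fallback": False,
    }
```

Note that there is no branch in which a malformed response is "repaired" into a confident verdict: every failure path returns the same hedged object.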
Defense against prompt injection
A forwarded scam is by definition a piece of hostile text reaching an automated reader. We treat every email body as adversarial.
Concretely, we apply several layered defenses before the model ever sees the message:
- Invisible Unicode stripping. Zero-width joiners, bidirectional overrides, and Plane 14 tag characters (the basis of several published prompt-injection exploits in 2024) are removed before classification.
- Bounded input length. Email bodies and user notes are truncated. The bound is not advertised here, but it serves both as a token-bill ceiling and as a way to limit the surface a manipulated payload can attack.
- Untrusted-content tagging. Email content is wrapped in dedicated tags inside the prompt; the system prompt instructs the model to treat text inside those tags as data only — never as instructions.
- Server-side rendering of all contact details. The model is structurally unable to emit a phone number or website to a user. It emits an organization slug, and our server resolves that slug against a curated, hand-verified allowlist before any contact information appears in a reply. If the slug is unknown, the reply hedges instead of guessing.
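The first three defenses in the list above can be sketched together. The tag name `untrusted_email` and the value of `MAX_BODY_CHARS` are placeholders invented for this example; the real bound and prompt layout are intentionally not published.

```python
import re

# Invisible or direction-changing code points stripped before classification.
INVISIBLE = re.compile(
    "["
    "\u200b-\u200f"          # zero-width space, non-joiner, joiner, LRM, RLM
    "\u202a-\u202e"          # bidirectional embeddings and overrides
    "\u2060-\u2064"          # word joiner and invisible operators
    "\u2066-\u2069"          # bidirectional isolates
    "\ufeff"                 # zero-width no-break space (BOM)
    "\U000e0000-\U000e007f"  # Plane 14 tag characters
    "]"
)

MAX_BODY_CHARS = 8000        # hypothetical bound, for illustration only
OPEN_TAG = "<untrusted_email>"
CLOSE_TAG = "</untrusted_email>"


def prepare_untrusted(body: str) -> str:
    """Strip invisible characters, bound the length, and wrap as data."""
    cleaned = INVISIBLE.sub("", body)
    # Remove any attempt to close the wrapper early. Loop because removing
    # one occurrence can splice a new one together from its neighbors.
    while CLOSE_TAG in cleaned:
        cleaned = cleaned.replace(CLOSE_TAG, "")
    cleaned = cleaned[:MAX_BODY_CHARS]
    return f"{OPEN_TAG}\n{cleaned}\n{CLOSE_TAG}"
```

Stripping runs before the closing-tag check on purpose, so a tag split apart by zero-width characters is reassembled and then removed rather than slipping through.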
The verified-contact allowlist
When a reply tells you to call your bank, that phone number was verified by hand against the bank's own website.
We maintain an internal allowlist of major frequently-impersonated organizations — banks, tax authorities, shipping carriers, payment providers, large consumer-tech accounts. Each entry has a display name, official phone number, and official website, every one of which was verified against the organization's real web presence on the date noted in the entry, and re-verified on a regular cadence.
The classifier identifies which organization a scam is impersonating; our server picks what to say about that organization. Any slug not on the allowlist falls through to a hedged "look the company up yourself" message — we would rather decline to give a number than risk pointing a reader at one we haven't verified.
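As a rough sketch of that split of responsibilities: the model supplies only a slug, and a server-side lookup decides what contact text, if any, reaches the reader. The entry below is fictional placeholder data, and the names `VerifiedContact`, `ALLOWLIST`, and `contact_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VerifiedContact:
    display_name: str
    phone: str
    website: str
    verified_on: str   # date the entry was last checked by hand


# Fictional placeholder entry; real entries are curated and hand-verified.
ALLOWLIST = {
    "examplebank": VerifiedContact(
        display_name="Example Bank",
        phone="+1-555-0100",
        website="https://examplebank.example",
        verified_on="2024-01-01",
    ),
}

HEDGED_MESSAGE = (
    "Look up the company's official contact details yourself, "
    "for example on a statement or the back of your card."
)


def contact_line(slug: Optional[str]) -> str:
    """Resolve a model-supplied slug; never pass through model-written contacts."""
    entry = ALLOWLIST.get(slug) if slug else None
    if entry is None:
        return HEDGED_MESSAGE          # unknown slug: hedge instead of guessing
    return f"Contact {entry.display_name} at {entry.phone} or {entry.website}."
```

Because the only path to a phone number runs through the table, a manipulated model can at worst select the wrong verified organization or an unknown slug, never a number of the attacker's choosing.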
Result caching, with safety floors
If two users forward the same scam from the same campaign within a short window, we don't pay for two independent classifications. But the cache is not a passthrough.
The cache is keyed on a normalized signature of the forwarded message — sender domain, subject, and body — chosen so that the same campaign collides into the same cache row across users while still discriminating attachments and authentication state.
A weakly-confident "safe" verdict in the cache is treated as a miss and re-classified, by design: a borderline-safe call shouldn't get blindly re-served to the next reader. Hedged fallback verdicts are never cached. And when an attachment-based signal fires on a particular message, the cache is bypassed entirely so the message is reasoned about in isolation.
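The safety floors can be sketched as a thin layer around an ordinary key-value cache. This is an assumption-laden illustration: the sketch keys only on the three named fields (how the real key accounts for attachments and authentication state is not described here), and `SAFE_CONFIDENCE_FLOOR` is an invented value.

```python
import hashlib
import re

SAFE_CONFIDENCE_FLOOR = 0.9   # hypothetical threshold, for illustration only


def cache_key(sender_domain: str, subject: str, body: str) -> str:
    """Normalized signature so the same campaign maps to the same row."""
    def norm(text: str) -> str:
        return re.sub(r"\s+", " ", text).strip().lower()

    material = "\x1f".join(norm(p) for p in (sender_domain, subject, body))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def cache_lookup(cache: dict, key: str, attachment_signal_fired: bool):
    """Return a cached verdict, or None when the message must be re-classified."""
    if attachment_signal_fired:
        return None                      # bypass: reason about it in isolation
    hit = cache.get(key)
    if hit is None:
        return None
    if hit["verdict"] == "safe" and hit["confidence"] < SAFE_CONFIDENCE_FLOOR:
        return None                      # weakly-confident "safe" counts as a miss
    return hit


def cache_store(cache: dict, key: str, verdict: dict) -> None:
    if verdict.get("fallback"):
        return                           # hedged fallbacks are never cached
    cache[key] = verdict
```

The asymmetry is deliberate in this sketch: a low-confidence "dangerous" verdict is still served from cache, while a low-confidence "safe" one is not, because wrongly reassuring a reader is the costlier mistake.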
Outbound discipline
A service that auto-replies to incoming mail is one bug away from being a mail-loop generator or a spam amplifier. Every reply we send passes one gate.
That gate enforces, in layered order:
- RFC 3834 loop suppression. We honor Auto-Submitted headers, and we set our own outbound to Auto-Submitted: auto-replied so other systems honor us in turn.
- No-reply / bounce / list detection. Senders shaped like noreply@, postmaster@, mailing lists (List-Id, List-Unsubscribe), and null Return-Path bounces never receive an auto-reply.
- Suppression list. Hard bounces and abuse complaints accumulate; once an address is on the list, we never email it again.
- Per-thread reply caps. A bounded number of auto-replies per thread per rolling window, enforced atomically at the database level so a burst of concurrent inbound can't race the limit.
- Loop-counter header. We stamp our own outbound with a depth counter so that a reply we wrote, somehow returning to us, is recognized and dropped.
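Most of that gate can be sketched with the standard library's email package. The header name `X-Example-Loop-Depth`, the local-part list, and `MAX_LOOP_DEPTH` are invented for the example, and the per-thread cap is omitted because it depends on an atomic database operation that a short snippet cannot show faithfully.

```python
from email.message import EmailMessage

LOOP_HEADER = "X-Example-Loop-Depth"   # hypothetical header name
MAX_LOOP_DEPTH = 1                     # assumed: drop anything we already replied to
NO_REPLY_LOCALPARTS = ("noreply", "no-reply", "postmaster", "mailer-daemon")


def should_auto_reply(msg: EmailMessage, envelope_sender: str, suppressed: set) -> bool:
    # RFC 3834: never answer a message that is itself automatic.
    auto = str(msg.get("Auto-Submitted") or "no").strip().lower()
    if auto != "no":
        return False
    # A null Return-Path marks a bounce.
    if envelope_sender.strip() in ("", "<>"):
        return False
    # Mailing-list traffic.
    if msg.get("List-Id") or msg.get("List-Unsubscribe"):
        return False
    # Role addresses that should never receive replies.
    if envelope_sender.split("@", 1)[0].lower() in NO_REPLY_LOCALPARTS:
        return False
    # Hard bounces and complaints.
    if envelope_sender.lower() in suppressed:
        return False
    # Our own depth counter coming back to us.
    depth = str(msg.get(LOOP_HEADER, "0")).strip()
    if not depth.isdigit() or int(depth) >= MAX_LOOP_DEPTH:
        return False
    return True


def stamp_reply(reply: EmailMessage, inbound_depth: int = 0) -> None:
    """Mark an outbound reply so other systems, and this one, can recognize it."""
    reply["Auto-Submitted"] = "auto-replied"
    reply[LOOP_HEADER] = str(inbound_depth + 1)
```

Every check fails closed: a malformed depth header or an unexpected Auto-Submitted value results in no reply rather than a best-effort one.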
Abuse and resource limits
Anyone who can forward us a message can, in principle, try to forward us a million. Every step on the hot path is bounded.
- Inbound webhook hardening. Postmark's basic-auth credentials are verified with a constant-time comparison; message IDs are deduplicated so retries from the email provider can't double-process a forward.
- Bounded payloads. Request bodies, attachment byte counts, and the URL list extracted from a single email are all bounded before any expensive work runs.
- Per-sender throttling. A single sending address can only land a bounded number of inbound forwards per minute before later traffic is paused.
- Per-user classifier budgets. Classifier calls are rate-limited per user, so one noisy account can't exhaust the analysis budget for everyone else.
- Quota gating before spend. Plan-level quota is checked before we call the classifier, so a forward that would exceed quota costs nothing in compute.
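Two of the items above, constant-time credential checks and message-ID deduplication, are small enough to sketch. The function and class names are hypothetical, and the in-memory set stands in for what would more plausibly be a database uniqueness constraint.

```python
import base64
import hmac


def basic_auth_ok(header_value: str, expected_user: str, expected_password: str) -> bool:
    """Check an HTTP Basic Authorization header without leaking timing."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        supplied = base64.b64decode(encoded.strip(), validate=True)
    except ValueError:                       # binascii.Error is a ValueError
        return False
    expected = f"{expected_user}:{expected_password}".encode("utf-8")
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(supplied, expected)


class Deduplicator:
    """Remember message IDs so provider retries are processed once.

    In-memory for illustration; a real service would persist this.
    """

    def __init__(self) -> None:
        self._seen = set()

    def first_time(self, message_id: str) -> bool:
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

An ordinary `==` comparison can return as soon as it finds a differing byte, which in principle lets an attacker guess a secret one position at a time; `hmac.compare_digest` is the standard-library way to avoid that.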
What we deliberately don't do
A few common patterns in this category of product are absent on purpose.
- No user-facing report-this-as-phishing button. Our audience came to us so they wouldn't have to decide; asking them to also classify is at odds with that promise. A reporting endpoint is also an attack surface — an adversary with throwaway accounts could use it to blocklist a legitimate sender. Any feedback flows through human-read support email instead.
- No model-generated phone numbers or websites. The classifier is structurally unable to emit either. We enforce this through three layers: the system prompt, a constrained JSON schema, and a server-side stripping step that drops anything phone- or URL-shaped from model output. See "The verified-contact allowlist" above. This is the single highest-stakes property in the product.
- No "looks safe, skip the check" heuristic. We don't decide a message is safe without running the full pipeline. The only short-circuits to "safe" are the ones explicitly enumerated above, each of which is structurally narrow.
- No outbound calls or text messages. We communicate with you exclusively by email. Anyone calling and claiming to be from our service is not.
- No password requests, ever. We never ask for any password — for your email, your bank, or anything else. A reply from us claiming otherwise is not from us.
More questions
If you're a security researcher, a journalist, or a technically-curious user with a question this page didn't cover, we're happy to talk.
For non-technical questions about the service — privacy, pricing, how to forward — see the FAQ or the step-by-step guide. For everything else, including responsible disclosure of a vulnerability you may have found, write to help@tro-net.com and a person will reply.