Real-time ingestion. New datasets every hour.

See your exposure before adversaries do.

Continuous indexing of every public breach, infostealer log, and credential dump. Built for security, fraud, and threat-intel teams that need to know which of their domains, employees, and customers are already in the wild.

Records indexed: 22,066,497,922
Datasets in catalog: 2,410
Distinct sources: 1,912
Ingestion cadence: Real-time

Threat intelligence sources

Three feeds. One unified index.

Every record we ingest is normalized and joined into a single search index. One query reaches the full corpus. Built for teams that need to act on exposure, not assemble it.

Breach data

users.csv · 10.6M rows
email|ssn|pwd_hash
jane.d…@gmail.com|***-**-2847|$2$10$f9p…
alex.k…@yahoo.com|***-**-9134|$2$10$xa3…
sam.li…@proton.me|***-**-5061|$2$10$kl1…

Database dumps from compromised companies. Email, name, postal address, SSN, password hashes — plaintext when the breach was that bad.

What gets exposed

email, name, address, ssn, dob, password

Infostealer logs

session_1812.log
redline
user:jane.doe@gmail.com
pass:●●●●●●●●●●●●
card:4532 ●●●● ●●●● 4827 12/27
wallet:seed: ridge alley orbit shy…
cookie:__Secure-1PAPISID=pUq…

Output from infostealer malware on infected endpoints. Browser-saved credentials, autofill, session cookies, authenticated tokens, saved cards, wallet seeds.

What gets exposed

url, username, password, cookies, credit_card, wallet_seed

Drop sites

onion-mirror · thread/429
3h ago
[DUMP] 50K • ssn + cc fullz
jane.d…|***-**-2847|4532●●●●4827
alex.k…|***-**-9134|5412●●●●1923
sam.li…|***-**-5061|4111●●●●3872
+49,997 more

Files dropped on paste sites, exposed cloud buckets, and adversary forums. Combolists, scraped profiles, exfiltrated SSN dumps, leaked cards.

What gets exposed

email, username, password, ssn, credit_card, ip
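As a rough illustration of what "normalized and joined into a single search index" could mean across the three feeds, here is a minimal Python sketch. It is not the service's actual pipeline: the alias table, the `normalize` and `build_index` helpers, and the choice of email or username as the join key are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from feed-specific field names to one shared schema.
# The page does not document the real schema; these aliases are assumptions.
FIELD_ALIASES = {
    "pwd_hash": "password",
    "pass": "password",
    "user": "username",
    "card": "credit_card",
}


def normalize(record: Dict[str, str], feed: str) -> Dict[str, str]:
    """Rename fields to the shared schema and tidy string values."""
    out = {"feed": feed}
    for key, value in record.items():
        name = FIELD_ALIASES.get(key, key)
        if isinstance(value, str):
            value = value.strip()
            if name == "email":
                value = value.lower()
        out[name] = value
    return out


def build_index(feeds: Dict[str, List[Dict[str, str]]]) -> Dict[str, List[Dict[str, str]]]:
    """Join records from every feed under one identity key (assumed: email, else username)."""
    index = defaultdict(list)
    for feed, records in feeds.items():
        for record in records:
            norm = normalize(record, feed)
            key = norm.get("email") or norm.get("username")
            if key:
                index[key.lower()].append(norm)
    return dict(index)


# Illustrative input only; the addresses are placeholders.
sample_feeds = {
    "breach": [{"email": " Jane@Example.com ", "pwd_hash": "h1"}],
    "infostealer": [{"user": "jane@example.com", "pass": "p1"}],
}
unified = build_index(sample_feeds)
```

With this toy input, one lookup on the identity key returns both the breach record and the infostealer record, which is the "one query reaches the full corpus" idea in miniature.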

Latest intelligence

Recent high-impact disclosures.

The largest collections currently in the catalog, ranked by record count. Click into any of them to inspect the schema and source provenance, or to run a scoped query.


USDoD

1,238,395,718 records · Disclosed Apr 1, 2024 · database

In April 2024, a large trove of data made headlines as having exposed "3 billion people" due to a breach of the National Public Data background check service. The initial corpus of data released in the breach contained billions of rows of personal information, including US social security numbers. Further partial data sets were later released, including extensive personal information and 134M unique email addresses, although the origin and accuracy of the data remain in question.

peopledatalabs.com

416,634,583 records · Disclosed Oct 16, 2019 · scraping

A massive 1.2-billion-record data exposure, discovered in October 2019, involving an unsecured Elasticsearch server. This file represents a 416-million-record subset in JSON format. Data includes names, LinkedIn profile IDs and URLs, email addresses, phone numbers, and geographic locations aggregated by People Data Labs (PDL), a data enrichment company. The exposure was not a direct hack of PDL but an exposed server belonging to an unknown third party that had licensed PDL data.

comelec.gov.ph

100,479,164 records · Disclosed Mar 27, 2016 · other

A breach of the Commission on Elections (COMELEC) of the Philippines, exposing the entire Philippine voter registration database. The archive contains voter registration records (new_id_released.txt, web_id_onhand.txt, web_id_disapproved.txt), overseas absentee voter data (overseas_absentee_all.txt, overseas_absentee_scratch.txt), geographic reference codes, embassy and country codes, web application user accounts with hashed passwords (dbadmin_usersinformation.txt), and internal system user accounts (fum_users.txt). The data includes full names, dates of birth, addresses, fingerprint data, voter identification numbers (VINs), passport numbers, and biometric information for millions of Filipino voters including overseas absentee voters.

linkedin.com

96,378,643 records · Disclosed Apr 1, 2021 · scraping

A large-scale LinkedIn data scrape/breach containing approximately 700 million records. The dataset includes full names, email addresses, phone numbers, LinkedIn profile URLs, LinkedIn IDs, Facebook URLs, job titles, company information, location data, inferred salaries, skills, education, work history, gender, birth year, Twitter/GitHub handles, and more. Data appears to have been harvested via scraping or API abuse and compiled into structured CSV files partitioned into ~700 parts, plus separate email-only and phone-only files.

Unknown ID-Hash Database

80,898,386 records · Disclosed Jun 4, 2026 · other

A dataset containing numeric user IDs paired with what appear to be SHA-1 password hashes (40 hex characters), with some entries redacted as 'xxx'. The IDs range from low values (1410) to ~20 million, suggesting a moderately large platform. No filename, domain, or company name is available to identify the target. The format is consistent with a leaked user credentials table.

aitype.com

75,014,733 records · Disclosed Jun 1, 2014 · other

Breach of AIType, an Android AI keyboard application. The dataset contains user records including IP addresses, city, country, device brand and model, device ID, Android version, app package name, username (email addresses), geographic coordinates, GCM push notification IDs, and user language preferences. The data appears to be a MongoDB export from AIType's backend infrastructure.

Threat briefings

What our analysts are tracking.

Original reporting on emerging breaches, leak campaigns, and the operators behind them. No press releases. No reposts.


Who uses it

Built for the teams defending exposed identities.

Security operations

Surface credential exposure across your domain. Pivot from a single leaked email to every dataset that record appears in, then push remediation to your IdP in seconds.

Fraud and trust

Detect compromised customer accounts before they're exploited. Score session and signup risk against known-stolen credentials joined to identity attributes.

Engineering

Hit the API from your own infrastructure. Boolean queries, cursor pagination, sparse fields. Same query language as the UI, free for the first 500 calls a day.

For engineers

Wire exposure data into your stack.

Same engine that powers this site, exposed as a clean REST API. Boolean queries. Cursor pagination. Sparse fields. HATEOAS links. Free for the first 500 calls a day, with quota tiers for production traffic.
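To make the cursor pagination and sparse fields concrete, here is a minimal Python sketch of a client loop. It is an assumption-laden illustration, not the documented API: the `records` and `next_cursor` response keys, the comma-separated `fields` parameter, and the query string are all hypothetical, and the network call is replaced by a stand-in function so the example runs offline.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_records(
    fetch_page: Callable[..., Dict],
    query: str,
    fields: List[str],
) -> Iterator[Dict]:
    """Yield every record for a query by following cursors until none is returned."""
    cursor: Optional[str] = None
    while True:
        # Sparse fields: request only the named fields (assumed comma-separated).
        page = fetch_page(query=query, fields=",".join(fields), cursor=cursor)
        yield from page.get("records", [])
        cursor = page.get("next_cursor")  # hypothetical response key
        if not cursor:
            return


def fake_fetch_page(query: str, fields: str, cursor: Optional[str] = None) -> Dict:
    """Stand-in for an HTTP call; returns two canned pages of placeholder data."""
    pages = {
        None: {"records": [{"email": "a@example.com"}], "next_cursor": "c1"},
        "c1": {"records": [{"email": "b@example.com"}], "next_cursor": None},
    }
    return pages[cursor]


# Hypothetical Boolean query; the real query language may differ.
emails = [
    record["email"]
    for record in iter_records(fake_fetch_page, "domain:example.com AND feed:breach", ["email"])
]
```

In real use, `fetch_page` would wrap an authenticated HTTP request. Keeping it injectable also makes it easy to count calls against the daily free quota.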
