Automating the Fight Against Data Brokers: A Deep Dive into auto-identity-remove

May 18, 2026

The modern internet has turned personal identity into a commodity. Data brokers—companies that scrape, aggregate, and sell personal information—operate in the shadows, often making it nearly impossible for an individual to manually request the removal of their data. For most, the process is a tedious cycle of searching for their own profile across hundreds of sites and filling out repetitive, often intentionally obstructive, opt-out forms.

Recently, a new open-source project, auto-identity-remove, has emerged to tackle this problem. Developed by stephenlthorn, this tool aims to automate the removal of personal information from over 500 people-search sites and data broker databases on a recurring schedule. By leveraging browser automation and AI-powered CAPTCHA solving, it attempts to shift the burden of privacy maintenance from the user to the machine.

How the Automation Works

The tool is built on Node.js and Playwright, which lets it drive a real browser and interact with pages the way a person would. Instead of one fixed script, it follows a multi-step workflow to handle the varied opt-out processes that brokers use:

  1. Discovery and Identification: The script searches for the user's name and state to find specific profile URLs, which many brokers require for a precise opt-out request.
  2. Automated Submission: It fills and submits opt-out forms using personal data stored locally in a config.json file.
  3. CAPTCHA Bypass: To overcome the intentional friction added by brokers, the tool integrates with CapSolver, an AI-powered service that solves reCAPTCHAs for a nominal fee.
  4. State Tracking: A state.json file tracks successful removals. Because brokers often re-add data, the tool uses a 90-day re-check window so that re-listed profiles are found and removed again.
  5. Hybrid Execution: For sites that are too complex for automation (like Google's "Results About You"), the tool opens the page in the user's browser for manual intervention.
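
The state-tracking step can be pictured with a short sketch. This is only an illustration under stated assumptions: the record shape (`broker`, `removedAt`) and the function names are hypothetical and are not taken from the project's actual state.json schema. Only the 90-day window comes from the description above.

```typescript
// Hypothetical shape of one entry in state.json; the real schema may differ.
interface RemovalRecord {
  broker: string;
  removedAt: string; // ISO 8601 timestamp of the last successful opt-out
}

const RECHECK_WINDOW_DAYS = 90; // the re-check window described above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the record is at least `windowDays` old and should be re-checked.
function isDueForRecheck(
  record: RemovalRecord,
  now: Date,
  windowDays: number = RECHECK_WINDOW_DAYS
): boolean {
  const removedAt = new Date(record.removedAt).getTime();
  if (isNaN(removedAt)) return true; // unreadable date: re-check to be safe
  return now.getTime() - removedAt >= windowDays * MS_PER_DAY;
}

// Brokers to visit on this run: never-removed ones plus those past the window.
function brokersToProcess(
  allBrokers: string[],
  state: RemovalRecord[],
  now: Date
): string[] {
  return allBrokers.filter((broker) => {
    const record = state.filter((r) => r.broker === broker)[0];
    return record === undefined || isDueForRecheck(record, now);
  });
}
```

With this shape, a scheduled run can start more often than every 90 days (for example monthly) and still skip any broker whose last successful removal is recent.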

Broker Coverage Strategies

The project categorizes brokers into three tiers of automation:

  • Explicitly Defined (30+ sites): High-value brokers like Spokeo, WhitePages, and B2B giants like ZoomInfo and Clearbit have dedicated logic to handle their specific search-and-remove flows.
  • Generic Runners (470+ sites): Using datasets from The Markup and the "Big Ass Data Broker Opt Out List," the tool employs four heuristic strategies (e.g., looking for "Do Not Sell My Personal Information" buttons or OneTrust privacy managers) to attempt automated opt-outs.
  • Manual Fallbacks: Sites requiring account authentication or complex verification are flagged for the user to handle manually.
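
A rough sketch of how a generic runner might choose among heuristics is shown below. It is an illustration, not the project's code: the `PageSnapshot` type, the strategy names, and the `onetrust-consent-sdk` marker are assumptions, and only the two heuristics named above are modeled (the source mentions four in total). The real tool works against a live Playwright page rather than a static snapshot.

```typescript
// Simplified, hypothetical snapshot of a page; the real tool drives a live browser.
interface PageSnapshot {
  html: string;
  linkTexts: string[];
}

interface Strategy {
  name: string;
  matches: (page: PageSnapshot) => boolean;
}

// Matches common wordings such as "Do Not Sell My Personal Information".
const DO_NOT_SELL = /do not sell( or share)? my (personal )?info(rmation)?/i;

// Strategies are tried in order; the first match wins.
const strategies: Strategy[] = [
  {
    name: "do-not-sell-link",
    matches: (page) => page.linkTexts.some((text) => DO_NOT_SELL.test(text)),
  },
  {
    name: "onetrust-manager",
    // Assumption: a OneTrust banner leaves an "onetrust-consent-sdk" marker in the HTML.
    matches: (page) =>
      page.html.toLowerCase().indexOf("onetrust-consent-sdk") !== -1,
  },
];

// Returns the name of the first matching strategy, or flags the site for manual handling.
function chooseStrategy(page: PageSnapshot): string {
  for (const strategy of strategies) {
    if (strategy.matches(page)) return strategy.name;
  }
  return "manual-fallback";
}
```

Keeping the strategies in an ordered list makes it straightforward to add or reorder heuristics, and any site that matches none of them falls through to the manual tier.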

Technical Considerations and Limitations

The tool provides a useful framework for privacy maintenance, but it has clear limitations. The community discussion on Hacker News highlighted several technical and philosophical hurdles.

Platform Dependency

Currently, the tool is closely tied to macOS: it uses launchd for scheduling and the Messages app for notifications. The core logic is Node.js, so users on Linux or Windows would need to set up their own scheduling (such as cron) to get the same monthly automation.

The "Active User" Paradox

A significant point of contention among users is whether automating opt-outs actually helps or inadvertently harms the user. Some argue that submitting a form—which requires providing current contact information—might signal to a broker that the data they hold is accurate and that the user is "active."

"I unironically suspect the purpose of many opt-out forms is merely to record the up-to-date information."

International Utility

The tool is primarily designed for the US market, relying on "State" and "ZIP code" as primary identifiers. Users outside the US have reported difficulties, noting that non-numeric postal codes or different address formats often break the automation logic.
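
To see why non-US addresses can break this kind of automation, consider a minimal sketch of a US-only postal code check. The regular expression and function names here are hypothetical and are not taken from the project; they only illustrate how a validator written around five-digit ZIP codes rejects formats such as UK or Canadian postal codes.

```typescript
// Hypothetical validator: a 5-digit US ZIP or ZIP+4. Not the tool's actual code.
const US_ZIP = /^\d{5}(-\d{4})?$/;

function isUsZip(postalCode: string): boolean {
  return US_ZIP.test(postalCode.trim());
}

// A form-filling step that assumes US identifiers fails early for other formats.
function describeLocation(state: string, postalCode: string): string {
  if (!isUsZip(postalCode)) {
    throw new Error(`Unsupported postal code format: "${postalCode}"`);
  }
  return `${state.toUpperCase()} ${postalCode.trim()}`;
}
```

Under these assumptions, `isUsZip("94110")` and `isUsZip("94110-1234")` pass, while `isUsZip("SW1A 1AA")` and `isUsZip("K1A 0B1")` do not, so any step that depends on the check stops before a form is submitted.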

Comparison: Open Source vs. Paid Services

There are established paid services like Incogni and Optery that offer similar functionality. The developer of auto-identity-remove suggests a complementary approach rather than a competitive one. Paid services often have professionally maintained flows that cover a wider array of brokers, but they are closed-source and require subscriptions.

This open-source tool offers transparency and control, specifically targeting gaps that paid services might miss, such as Acxiom or LexisNexis. For the privacy-conscious user, the strongest strategy may be combining a paid service for bulk removal with this script for targeted, high-value gaps.

Future Outlook

The project is currently in beta, and the developer is seeking help to improve the heuristic approach for generic sites and to implement email verification flows. As AI models for "computer use" (like Claude's latest capabilities) evolve, there is growing curiosity about whether they could replace static selectors with a dynamic, visual understanding of opt-out forms, which could raise the success rate of automated identity removal.
