How a Missing Security Control Turned WhatsApp’s API into a Data Goldmine

Alex Cipher · 8 min read

A single overlooked security measure turned WhatsApp’s API into a treasure trove for data harvesters. When researchers discovered that WhatsApp’s contact-discovery API lacked robust rate limiting, they didn’t just find a minor oversight—they uncovered a vulnerability that allowed the enumeration of 3.5 billion active accounts, dwarfing previous data leaks in scale and sensitivity. By automating requests through just a handful of authenticated sessions, the researchers were able to query over 100 million phone numbers per hour, collecting not only account existence but also profile photos, status messages, and even public encryption keys (BleepingComputer).

This incident isn’t just about WhatsApp. It’s a wake-up call for the entire tech industry, echoing past breaches at Facebook, Twitter, and Dell, where APIs without proper throttling became open doors for mass data scraping. The WhatsApp case stands out for its global reach, revealing active users in countries where the app is banned and exposing regulatory gaps that could lead to hefty fines under laws like the GDPR. As APIs become the backbone of digital services, this story highlights the urgent need for smarter, adaptive security—especially as attackers grow more sophisticated and data becomes ever more valuable.

How Rate Limiting (or the Lack Thereof) Turned WhatsApp’s API into a Data Goldmine

The Role of Rate Limiting in API Security

Rate limiting is a foundational security mechanism in API design that restricts the number of requests a user or system can make to a service within a defined timeframe. Its primary purpose is to prevent abuse, such as brute-force attacks, credential stuffing, or mass enumeration of data. When properly implemented, rate limiting can detect and throttle suspicious activity, ensuring that APIs are not exploited for large-scale data harvesting. In the context of WhatsApp’s contact-discovery API, the absence of robust rate limiting created an environment where automated systems could query the platform at unprecedented speed and scale, enabling the enumeration of billions of accounts without detection or intervention (BleepingComputer).
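
To make the mechanism concrete, here is a minimal Python sketch of one common rate-limiting approach, the token bucket. It is an illustration only and says nothing about how WhatsApp's servers are built; the class name and the `capacity` and `rate` parameters are assumptions chosen for the example.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, now=None):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Spend one token if available; return False when the caller must wait."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical limits: bursts of 3 lookups, then 1 lookup per second.
bucket = TokenBucket(capacity=3, rate=1.0, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]
print(burst)  # the first three calls pass, the rest are throttled
```

In a real service one bucket would typically be kept per client, and a rejected request would be answered with an error such as HTTP 429 rather than silently dropped.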

Exploiting the Contact-Discovery API: Technical Mechanics

The WhatsApp contact-discovery API, specifically the GetDeviceList endpoint, was designed to let users check whether a phone number is registered on the platform and to retrieve information about its associated devices. Because the API lacked effective rate limiting, researchers (and potentially malicious actors) could automate requests at massive scale. Using just five authenticated sessions on a single university server, the researchers sent over 100 million queries per hour. This throughput was possible because the API did not enforce per-user, per-session, or per-IP restrictions, nor did it monitor for behavioral anomalies such as repeated sequential queries across vast phone number ranges (BleepingComputer).
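
The behavioral signal mentioned above, sequential queries walking through a number range, can be approximated with a small detector. The Python sketch below is hypothetical: the class name, the window size, and the thresholds are all assumptions, and it is not a description of any check WhatsApp runs.

```python
from collections import deque


class EnumerationDetector:
    """Flag a session whose lookups step through a number range in small increments."""

    def __init__(self, window=100, max_step=10, threshold=0.8):
        self.window = window        # how many recent lookups to remember
        self.max_step = max_step    # largest gap still counted as "adjacent"
        self.threshold = threshold  # share of adjacent lookups that raises a flag
        self.recent = deque(maxlen=window)

    def observe(self, phone_number):
        """Record one lookup; return True if the recent history looks like enumeration."""
        self.recent.append(int(phone_number.lstrip("+")))
        if len(self.recent) < self.window:
            return False  # not enough history to judge yet
        numbers = list(self.recent)
        adjacent = sum(
            1
            for prev, cur in zip(numbers, numbers[1:])
            if 0 < abs(cur - prev) <= self.max_step
        )
        return adjacent / (len(numbers) - 1) >= self.threshold


detector = EnumerationDetector(window=10, max_step=1)
flags = [detector.observe(f"+49151{i:08d}") for i in range(10)]
print(flags[-1])  # the tenth consecutive number trips the detector
```

A scraper that shuffles its candidate list would evade this particular check, which is why a signal like this can only complement hard per-client limits, not replace them.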

The researchers generated a global set of 63 billion possible mobile numbers and systematically checked each one against the API. This brute-force enumeration yielded 3.5 billion active WhatsApp accounts, demonstrating the catastrophic scale of exposure possible when rate limiting is absent.
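
A quick back-of-the-envelope calculation shows what these figures imply. The Python snippet below uses only the numbers reported above; the function name is illustrative, and because the reported rate is "over 100 million" per hour, the resulting duration is an upper bound on the arithmetic, not a claim about how long the study actually ran.

```python
def enumeration_estimate(candidates, rate_per_hour, hits):
    """Return (hours, days, hit_rate) for a full sweep of the candidate set."""
    hours = candidates / rate_per_hour
    return hours, hours / 24, hits / candidates


hours, days, hit_rate = enumeration_estimate(
    candidates=63_000_000_000,   # generated candidate numbers
    rate_per_hour=100_000_000,   # reported lower bound on lookups per hour
    hits=3_500_000_000,          # active accounts found
)
print(f"{hours:.0f} hours, about {days:.0f} days, hit rate {hit_rate:.1%}")
# prints "630 hours, about 26 days, hit rate 5.6%"
```

In other words, at the reported rate a sweep of the entire candidate space fits in under a month, and roughly one in eighteen guesses lands on a real account.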

Data Harvested: Scope and Sensitivity

The lack of rate limiting did not merely expose the existence of WhatsApp accounts. By chaining multiple API endpoints—including GetUserInfo, GetPrekeys, and FetchPicture—the researchers were able to collect a wide array of sensitive user data at scale. The harvested dataset included:

  • Phone numbers: The primary identifier for WhatsApp accounts, often linked to personal identity.
  • Timestamps: Metadata indicating when accounts were active or created.
  • Profile photos: Downloaded in bulk, with 77 million US-based images retrieved, many displaying identifiable faces.
  • “About” text: User-generated status messages, sometimes containing personal details or links to other social profiles.
  • Public keys for E2EE encryption: While not directly compromising message content, these keys are part of WhatsApp’s end-to-end encryption infrastructure and could be leveraged in future attacks or for metadata analysis (BleepingComputer).

The scale of the operation was unprecedented. The researchers noted that, had the dataset been released publicly, it would have constituted the largest data leak in history, surpassing even the 2021 Facebook phone-number scrape.

Global Impact: Geographic and Regulatory Implications

The enumeration provided a unique, previously unavailable snapshot of WhatsApp’s global user base. Notably, the data revealed:

  • India: 749 million active accounts
  • Indonesia: 235 million
  • Brazil: 206 million
  • United States: 138 million
  • Russia: 133 million
  • Mexico: 128 million

Additionally, millions of active accounts were identified in countries where WhatsApp was officially banned, such as China, Iran, North Korea, and Myanmar. In Iran, for example, usage continued to grow even as government restrictions were in place, with a notable uptick after the ban was lifted in December 2024 (BleepingComputer).

The exposure of such granular, country-level data has significant implications for privacy, regulatory compliance, and national security. In the European Union, for instance, similar incidents have previously resulted in substantial fines under the General Data Protection Regulation (GDPR), as seen when Meta was fined €265 million after a Facebook scraping incident caused by a comparable lack of API safeguards (BleepingComputer).

Comparative Analysis: WhatsApp and Other API Scraping Incidents

The WhatsApp incident is not isolated; it is emblematic of a broader pattern of API vulnerabilities across major online platforms. In 2021, Facebook’s “Add Friend” feature was similarly abused to scrape data on 533 million users, including phone numbers, Facebook IDs, names, and genders. The root cause was again the absence of adequate rate limiting, which allowed attackers to upload vast contact lists and check them en masse against the platform (BleepingComputer).

Other notable examples include:

  • Twitter: Attackers exploited an API vulnerability to match phone numbers and email addresses to 54 million accounts, again due to insufficient request throttling.
  • Dell: 49 million customer records were scraped after attackers abused an unprotected API endpoint.

These incidents underscore a systemic issue: APIs that facilitate account or data lookups without robust rate limits become prime targets for large-scale enumeration and data exfiltration. The WhatsApp case stands out for its sheer scale—3.5 billion active accounts—and the breadth of personal data exposed, but it is part of a recurring pattern of API misconfigurations leading to mass privacy violations (BleepingComputer).

Long-Term Risks and the Persistence of Leaked Data

The consequences of large-scale data scraping extend far beyond the initial exposure. Phone numbers and associated metadata remain valuable for years, enabling a range of malicious activities, including targeted phishing, SIM swapping, identity theft, and cross-platform account correlation. The WhatsApp researchers found that 58% of the phone numbers leaked in the 2021 Facebook incident were still active on WhatsApp in 2025, highlighting the enduring utility of such datasets for threat actors (BleepingComputer).

Moreover, the aggregation of profile photos, “about” texts, and device information allows for the construction of detailed user profiles, which can be leveraged for social engineering or further attacks. The persistence of these risks is compounded by the slow pace at which users change their phone numbers or update their privacy settings, making the fallout from such leaks both widespread and long-lasting.

Organizational Response and Remediation Efforts

Following disclosure of the vulnerability, WhatsApp implemented rate-limiting protections to prevent similar abuse. This involved introducing restrictions on the number of requests that could be made from a single session, user, or IP address within a given timeframe. Such measures are now considered industry best practice and are essential for any API that exposes user data or facilitates account lookups.
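
The details of WhatsApp's fix are not public in this article, but the general idea of limiting by session, user, and IP address at the same time can be sketched as follows. This Python example uses a sliding-window counter per dimension; every name and every limit in it is an assumption made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the last `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def has_room(self, key, now):
        q = self.hits[key]
        while q and now - q[0] >= self.window:
            q.popleft()  # forget requests that fell out of the window
        return len(q) < self.limit

    def record(self, key, now):
        self.hits[key].append(now)


class LayeredLimiter:
    """Accept a request only if every dimension (session, user, IP) has room."""

    def __init__(self, limits):
        # limits maps a dimension name to (limit, window_seconds)
        self.limiters = {
            name: SlidingWindowLimiter(limit, window)
            for name, (limit, window) in limits.items()
        }

    def allow(self, identity, now=None):
        now = time.monotonic() if now is None else now
        pairs = [(self.limiters[dim], key) for dim, key in identity.items()]
        # Check every dimension first so a rejected request is not counted anywhere.
        if not all(limiter.has_room(key, now) for limiter, key in pairs):
            return False
        for limiter, key in pairs:
            limiter.record(key, now)
        return True


# Hypothetical limits: 2 lookups per session and 3 per IP, per minute.
limiter = LayeredLimiter({"session": (2, 60), "ip": (3, 60)})
request = {"session": "s1", "ip": "203.0.113.7"}
print([limiter.allow(request, now=0.0) for _ in range(3)])
```

Layering matters here because the researchers used several sessions from one server: a per-session limit alone could be sidestepped by opening more sessions, while the per-IP limit still caps the total.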

The incident has prompted broader scrutiny of API security practices across the technology sector. Organizations are increasingly recognizing the need for:

  • Comprehensive monitoring: Detecting unusual patterns of API usage indicative of automated scraping.
  • Adaptive rate limiting: Adjusting thresholds dynamically based on user behavior and risk signals.
  • Authentication and authorization: Ensuring that only legitimate, authenticated users can access sensitive endpoints.
  • Data minimization: Limiting the amount and sensitivity of data returned by APIs, especially for public or easily automatable endpoints.
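
As one illustration of the last point, data minimization, the Python sketch below shows a lookup handler that returns profile details only when the account's visibility setting permits it. The `Profile` fields, the visibility values, and the response shape are hypothetical and do not reflect WhatsApp's actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    phone: str
    about: str
    photo_url: str
    visibility: str = "contacts"      # "everyone", "contacts", or "nobody" (assumed)
    contacts: frozenset = frozenset()  # numbers this user has saved


def lookup_response(profile: Optional[Profile], requester_phone: str) -> dict:
    """Build the smallest response that still answers 'is this number registered?'."""
    if profile is None:
        return {"registered": False}
    response = {"registered": True}
    visible = profile.visibility == "everyone" or (
        profile.visibility == "contacts" and requester_phone in profile.contacts
    )
    if visible:
        # Extra fields are released only to requesters the owner has opted in.
        response["about"] = profile.about
        response["photo_url"] = profile.photo_url
    return response


alice = Profile(
    phone="+10000000001",
    about="Out hiking",
    photo_url="https://example.invalid/alice.jpg",
    contacts=frozenset({"+10000000002"}),
)
print(lookup_response(alice, "+10000000003"))  # stranger: registration status only
print(lookup_response(alice, "+10000000002"))  # saved contact: full profile
```

With a restrictive default like this, an enumeration run would still learn which numbers are registered, but it would not collect photos or "about" text along the way.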

These lessons are being incorporated into regulatory frameworks and industry guidelines, as reflected in the growing number of fines and enforcement actions against companies that fail to protect user data adequately.

Broader Industry Implications and Future Directions

The WhatsApp scraping incident has catalyzed a shift in how API security is approached, both technically and organizationally. Security leaders are now prioritizing API protection in their budgets and strategic planning, as evidenced by recent CISO benchmark reports (BleepingComputer). The focus is on:

  • Proactive threat modeling: Identifying potential abuse cases before APIs are deployed.
  • Continuous testing: Employing automated tools to simulate attacks and uncover vulnerabilities.
  • Cross-platform vigilance: Recognizing that attackers often correlate data across multiple services, amplifying the impact of individual leaks.

As APIs continue to proliferate and underpin critical digital services, the lessons from WhatsApp’s experience are shaping the future of privacy, security, and trust online. The imperative for robust rate limiting and vigilant API management has never been clearer, with the protection of billions of users’ data hanging in the balance.

Final Thoughts

The WhatsApp API flaw is a stark reminder that even the most widely used platforms can fall prey to basic security oversights. The absence of rate limiting didn’t just expose billions of accounts—it set a new benchmark for what’s possible when API security is neglected. The ripple effects are profound: from the persistence of leaked data fueling future attacks, to the regulatory and reputational fallout for organizations that fail to protect user privacy (BleepingComputer).

Looking ahead, the lesson is clear: API security can’t be an afterthought. As emerging technologies like AI and IoT drive even more data through APIs, the stakes will only rise. Proactive threat modeling, adaptive rate limiting, and continuous monitoring aren’t just best practices—they’re essential defenses in a landscape where a single flaw can expose billions. The WhatsApp incident should serve as a catalyst for organizations everywhere to rethink their approach to API protection and to prioritize user trust above all.

References