In 2019, security researchers discovered that Dalil, a popular Saudi Arabian caller identification application with over 5 million users, had left its entire MongoDB database exposed to the internet without any authentication or access controls. The 585-gigabyte database contained detailed personal information including real names, phone numbers, precise GPS location data, device information, and mobile carrier details for millions of Saudi users.
Key Facts
- What: Dalil caller ID app left a 585GB MongoDB database open without authentication.
- Who: Over 5 million Saudi users of the Dalil app.
- Data Exposed: Real names, phone numbers, GPS locations, device IDs, and carrier data.
- Outcome: Pre-PDPL; zero security controls on a surveillance-grade dataset.
What Was Exposed
- Real names associated with phone numbers for approximately 5 million users, predominantly Saudi Arabian nationals
- Phone numbers including both the primary number and contacts from users' address books uploaded by the app
- Precise GPS location data with latitude and longitude coordinates, revealing users' physical movements and frequented locations
- Device information including phone model, operating system version, IMEI numbers, and device identifiers
- Mobile carrier information revealing which telecommunications provider each user subscribed to
- App usage data including installation dates, last active timestamps, and feature usage patterns
- Email addresses associated with user accounts
The Dalil breach is a case study in the risks posed by the proliferation of mobile applications that collect far more data than their core functionality requires. A caller identification app needs access to a phone number database and, arguably, a user's own phone number. It does not need continuous GPS location tracking, device IMEI numbers, or carrier information. The collection of this excessive data, combined with the absence of any database security whatsoever, created a surveillance-grade dataset on 5 million people that was freely accessible to anyone who knew where to look.
The GPS location data is the most concerning element of the exposure. With latitude and longitude coordinates associated with identified individuals, the dataset could be used to track users' movements, identify their home and workplace addresses, determine their daily routines, and establish patterns of association between individuals who frequent the same locations.
In a region where personal privacy is culturally valued and where certain gatherings or associations may carry social or legal sensitivity, the exposure of granular location data poses risks that extend far beyond conventional identity theft.
The data could be weaponized for stalking, blackmail, social engineering, or targeted surveillance. A domestic abuser could track a victim's movements. A business rival could monitor a competitor's meetings and associations. A criminal organization could identify high-value targets based on their home addresses and daily patterns. The potential for harm from location data is limited only by the imagination and intent of those who access it.
The 585GB volume of the database suggests that the exposure included not only current user data but potentially historical location data spanning the application's entire operational lifetime. This longitudinal location dataset would enable the construction of detailed movement profiles showing how individuals' patterns changed over time, where they traveled, and whom they met. For security services, intelligence agencies, or malicious actors, this type of historical location data is extraordinarily valuable and would normally require significant resources and legal authority to obtain.
Open MongoDB instances have been a persistent vulnerability pattern across the technology industry, with search engines like Shodan making it trivial to discover unprotected databases. The Dalil case demonstrates that this vulnerability is not limited to obscure startups in mature technology markets; it is a global problem that affects applications serving populations in every region.
The fact that an app serving 5 million users in a high-profile market like Saudi Arabia could operate with zero database authentication suggests a fundamental lack of security awareness and resources during the application's development and deployment.
Regulatory Analysis
The Dalil breach occurred in 2019, four years before Saudi Arabia's Personal Data Protection Law (PDPL) came into force in September 2023. At the time of the breach, Saudi Arabia lacked a comprehensive data protection law, which meant there was no dedicated regulatory body to investigate the breach, no mandatory notification requirements, and no specific penalties for the type of data exposure that occurred. This regulatory vacuum illustrates why the PDPL was necessary: the Dalil incident, and others like it, showed that the Saudi digital ecosystem had outpaced the legal framework meant to protect it.
Analyzed under the PDPL as it now stands, the Dalil breach would trigger multiple violations. Article 5 establishes consent requirements for the collection of personal data, mandating that individuals be informed of the purpose for which their data is being collected and that consent be specific, informed, and freely given. The collection of GPS location data and device identifiers by a caller ID application raises serious questions about whether users were adequately informed about the scope of data collection and whether their consent extended to continuous location tracking.
If the app's privacy policy did not explicitly disclose the collection of GPS data, or if consent was bundled into a general terms-of-service acceptance, the processing would lack a valid legal basis under Article 6. The PDPL requires granular consent that is specific to each purpose of processing, not blanket authorization hidden in dense legal language that no reasonable user would read or understand.
Article 10 addresses the processing of personal data by third parties and the obligations of data controllers to ensure processor compliance. If Dalil used third-party cloud hosting services for its MongoDB instance, the responsibility for securing the database would remain with Dalil as the data controller. The fact that the database was deployed without authentication suggests either a catastrophic misconfiguration or a deliberate decision to forgo security controls, neither of which would constitute a valid defense under the PDPL.
Article 19's requirement for appropriate technical and organizational measures is where the PDPL would apply most directly to the Dalil case. An exposed MongoDB database with no authentication represents the most basic possible failure of technical security measures. There were no access controls, no encryption, no monitoring, and no alerts. This is not a case where sophisticated attackers defeated well-designed defenses; it is a case where no defenses existed.
Under the PDPL, the Saudi Data and Artificial Intelligence Authority (SDAIA) would be justified in imposing the maximum fine of SAR 5 million, as it would be difficult to identify a more complete failure of the duty to implement appropriate security measures.
What Should Have Been Done
The most fundamental control that should have been in place was database authentication and access control. MongoDB supports robust authentication mechanisms including SCRAM-SHA-256, X.509 certificates, and LDAP integration. Any of these mechanisms, even the simplest username-and-password authentication, would have prevented the database from being publicly accessible. The fact that the database was deployed without any authentication represents a failure of the most basic security hygiene.
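As an illustration of how cheaply this failure could have been caught, the following is a minimal sketch of a pre-deployment check that flags a MongoDB configuration with access control disabled. It assumes the `mongod.conf` file has already been parsed into a nested dictionary by some YAML loader; the helper names `get_setting` and `audit_authentication` and the sample configurations are hypothetical. `security.authorization` is the standard mongod setting that turns on access control. The sketch deliberately ignores other settings that can imply access control, such as a replica set key file.

```python
from typing import Any


def get_setting(config: dict[str, Any], dotted_key: str, default: Any = None) -> Any:
    """Look up a nested setting such as 'security.authorization'."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def audit_authentication(config: dict[str, Any]) -> list[str]:
    """Return findings; an empty list means this check found no problem."""
    findings: list[str] = []
    # mongod enforces role-based access control when this setting is "enabled".
    if get_setting(config, "security.authorization") != "enabled":
        findings.append(
            "security.authorization is not 'enabled': "
            "clients may connect without credentials"
        )
    return findings


# Hypothetical parsed configurations, for illustration only.
open_config = {"net": {"port": 27017, "bindIp": "0.0.0.0"}}
locked_config = {"net": {"port": 27017}, "security": {"authorization": "enabled"}}

print(audit_authentication(open_config))    # one finding
print(audit_authentication(locked_config))  # prints []
```

A check of this kind could run in a deployment pipeline and block any release whose findings list is not empty.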
Network-level access controls should have restricted database connections to only the application servers that needed access, using firewall rules, VPN requirements, or cloud security groups to ensure that the database was never directly reachable from the internet. Defense in depth means that even if database authentication were somehow bypassed, network-level controls would prevent unauthorized connections from reaching the database at all.
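The allowlist idea can be sketched in a few lines. The example below assumes the database's firewall or security group rules are available as a list of CIDR strings; the function names and the sample addresses are hypothetical, and a real deployment would enforce these rules in the firewall or cloud provider rather than in application code.

```python
import ipaddress


def find_open_rules(allowed_cidrs: list[str]) -> list[str]:
    """Return allowlist entries that admit every address, such as 0.0.0.0/0."""
    return [
        cidr
        for cidr in allowed_cidrs
        if ipaddress.ip_network(cidr, strict=False).prefixlen == 0
    ]


def is_source_allowed(source_ip: str, allowed_cidrs: list[str]) -> bool:
    """Check whether a connecting address falls inside any allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )


# Hypothetical rule set: only the application subnet may reach the database.
app_subnets = ["10.0.1.0/24"]

print(is_source_allowed("10.0.1.15", app_subnets))    # application server
print(is_source_allowed("203.0.113.7", app_subnets))  # arbitrary internet host
print(find_open_rules(["10.0.1.0/24", "0.0.0.0/0"]))  # flags the open rule
```

Flagging rules with a prefix length of zero is a deliberately narrow test; it catches the "open to the whole internet" case described here, not every overly broad rule.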
Data minimization should have been a core design principle from the application's inception. A caller identification app does not need to collect continuous GPS location data, device IMEI numbers, or carrier information to perform its primary function. The PDPL's data minimization principle, which requires that data collection be limited to what is necessary for the stated purpose, would have prevented the creation of this surveillance-grade dataset in the first place.
Had the app collected only the data necessary for caller identification, the blast radius of any breach would have been limited to phone numbers and associated names, rather than comprehensive profiles that included physical location histories.
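One way to make data minimization concrete is to enforce a field allowlist at the point of ingestion, so that anything outside the stated purpose is never stored. The sketch below is illustrative only: the field names, the `ALLOWED_FIELDS` set, and the sample record are assumptions, not Dalil's actual schema.

```python
from typing import Any

# Hypothetical allowlist: the only fields a caller ID lookup arguably needs.
ALLOWED_FIELDS = frozenset({"phone_number", "display_name"})


def minimize(record: dict[str, Any]) -> dict[str, Any]:
    """Keep only allowlisted fields before the record is persisted."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def dropped_fields(record: dict[str, Any]) -> list[str]:
    """List the fields that minimization would discard, for audit logging."""
    return sorted(key for key in record if key not in ALLOWED_FIELDS)


# Fabricated sample payload resembling what an over-collecting client might send.
incoming = {
    "phone_number": "+966500000000",
    "display_name": "Example User",
    "latitude": 24.7136,
    "longitude": 46.6753,
    "imei": "000000000000000",
    "carrier": "ExampleTel",
}

print(minimize(incoming))
print(dropped_fields(incoming))
```

An allowlist is used rather than a blocklist so that any new field added by a client is dropped by default until someone justifies storing it.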
Encryption of data at rest should have been implemented to protect the database contents if the underlying storage were compromised. MongoDB Enterprise provides native encryption at rest through its encrypted storage engine; deployments on the community edition can achieve a similar result with disk-level or volume-level encryption. If the data files had been encrypted, an attacker who obtained the raw files or backups would have been unable to read them without the encryption keys. Encryption at rest would not, however, have protected data served over an unauthenticated database connection, because the server decrypts records transparently for any client it accepts.
Combined with field-level encryption for the most sensitive fields, such as GPS coordinates and IMEI numbers, encryption at rest would have provided a critical last line of defense.
Continuous security monitoring and vulnerability scanning should have been part of the application's operational procedures. Automated tools that scan for exposed databases, open ports, and misconfigured cloud services are widely available and are considered a baseline requirement for any internet-facing application. Regular penetration testing would have identified the open MongoDB instance as a critical vulnerability, and a vulnerability management program would have ensured that the issue was remediated promptly.
The extended period of exposure suggests that no such monitoring or testing was in place, leaving the 5 million users' data accessible to anyone with the technical skill to run a Shodan search.
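A basic version of such a scan is simple to write. The sketch below checks whether the default MongoDB port accepts TCP connections on a list of hosts; it assumes it is run from outside the private network, and only against hosts the operator owns or is authorized to test. The function names are hypothetical. A reachable port does not prove that authentication is missing, but a database port that answers from the public internet is already a finding worth investigating.

```python
import socket

MONGODB_DEFAULT_PORT = 27017


def is_port_reachable(host: str, port: int = MONGODB_DEFAULT_PORT,
                      timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def find_exposed_hosts(hosts: list[str], port: int = MONGODB_DEFAULT_PORT,
                       timeout: float = 2.0) -> list[str]:
    """Return the subset of hosts whose database port accepts connections."""
    return [host for host in hosts if is_port_reachable(host, port, timeout)]
```

Calling `find_exposed_hosts` on an inventory of the organization's own public addresses from a scheduled job, and alerting when the result is not empty, would give a minimal form of the continuous monitoring described above.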
An open MongoDB instance containing the location histories and personal details of 5 million users is not the result of a sophisticated attack; it is a failure to implement the most elementary security controls. The Dalil breach predated the PDPL, but it exemplifies the data protection failures the law was designed to prevent. Any application collecting personal data in Saudi Arabia today must treat database security as a non-negotiable minimum, not an afterthought.