INTELLIGENCE
ZERO|TOLERANCE
Intelligence Advisory
zerotolerance.me

Egyptian Scholastic Test Exposes 72K+ Children's PII on Open AWS S3

2022-2023 · 72K children

Publication Date
2022-01-01
Category
Data Breaches
Author
K. Ellabban
Organization
Zero|Tolerance Security Research

An unprotected Amazon Web Services (AWS) S3 bucket belonging to an educational testing service exposed the personal data of more than 72,000 Egyptian children to the open internet. Discovered by cybersecurity researchers between 2022 and 2023, the bucket was publicly accessible with no authentication, encryption, or access controls of any kind.

Executive Summary

Key Facts

  • What: Open AWS S3 bucket exposed personal data of 72,000+ Egyptian children.
  • Who: Egyptian schoolchildren and their parents across multiple governorates.
  • Data Exposed: Children’s names, national IDs, test scores, and parent contact details.
  • Outcome: Bucket secured after researcher disclosure; no regulatory penalty reported.
Impact Assessment

What Was Exposed

  • Full legal names of 72,000+ Egyptian children, in both Arabic and transliterated forms, linked to their educational records and testing profiles
  • Dates of birth providing exact age information for minor children, a critical data point for identity construction and a key element of identity verification across government and financial services
  • Egyptian national identification numbers (al-raqm al-qawmi), which serve as lifelong identifiers in Egypt’s civil registry system and are used for every significant government and financial interaction
  • School names and locations, enabling physical identification and tracking of specific children to specific schools and geographic areas
  • Test scores and academic performance data, which constitute educational records with privacy implications under multiple regulatory frameworks and can be used for profiling and discrimination
  • Parent and guardian contact information, including phone numbers, email addresses, and in some cases residential addresses linked to school registration
  • Registration metadata including dates of enrollment, testing session identifiers, payment records for testing fees, and administrative notes about special accommodations or test conditions
  • Photographs of children submitted as part of registration and identification verification processes

The combination of data elements in this exposure creates a uniquely dangerous profile for each affected child. A national ID number linked to a name, date of birth, school, and parent contact information provides every element needed for identity fraud that can follow a child for decades. In Egypt, the national ID number is used across government services, financial transactions, and civil registrations throughout a person’s entire life.

A child whose national ID is compromised at age eight will still be dealing with the consequences at age thirty, as the number cannot be easily changed and is linked to every significant interaction with the state and the financial system.

The long-term nature of child identity compromise is what makes it categorically more severe than adult identity exposure. Adults who discover identity fraud typically have existing financial accounts, credit histories, and established identities that provide a baseline against which fraud can be detected. Children have none of these.

Fraudulent accounts, loans, and government benefit claims opened using a child’s national ID may go undetected for years or even decades, until the child reaches adulthood and attempts to open their first bank account, apply for a government service, or establish their own financial identity.

By that point, the damage has compounded into a complex web of fraudulent records that takes months or years to untangle - if it can be fully resolved at all.

The school identification data adds a physical dimension to the digital exposure.

Knowing which school a specific child attends, combined with their name, age, and photograph (if included in registration records), creates a stalking and targeting risk that extends beyond the digital realm. While the probability of such targeting is statistically low, the severity if it occurs is extreme, and any responsible data protection framework assigns the highest safeguards to data that could facilitate physical harm to children.

The combination of school name, child name, and parent contact information also enables targeted social engineering attacks against parents, such as emergency scam calls claiming a child has been in an accident at their specific school.

The academic performance data introduces potential for discrimination and profiling that follows children through their educational careers. Test scores linked to identifiable individuals could be used by educational institutions, potential employers, or social contacts to make judgments about a child’s capabilities and potential. In competitive educational environments where test scores influence school placement and opportunity, the unauthorized disclosure of performance data can have tangible consequences for the affected children’s educational trajectories and future prospects.

The AWS S3 misconfiguration is a well-documented and entirely preventable class of vulnerability. Amazon has implemented multiple safeguards to prevent accidental public exposure of S3 buckets, including default-deny access policies, public access block settings at the account level, prominent visual warnings in the management console when a bucket is configured for public access, and automated security findings through AWS Trusted Advisor and AWS Config.

For this bucket to have been publicly accessible, someone had to either deliberately configure it that way or override multiple default protections. This suggests either a fundamental misunderstanding of cloud security by the development team, a conscious decision to prioritize convenience over security during development that was never remediated before production deployment, or the absence of any security review in the application deployment process.

The duration of the exposure - months of unrestricted public access - compounds the severity. During that window, the bucket contents could have been accessed, downloaded, cached, and redistributed by any number of parties. Automated scanning tools like GrayhatWarfare, Bucket Finder, and custom scripts regularly probe for open S3 buckets across the entire AWS namespace, meaning the probability that the data was accessed by unauthorized parties before researchers identified it is extremely high.

Security research has consistently demonstrated that newly created public S3 buckets are typically discovered by automated scanners within hours, not days. Even after the bucket was secured, any copies made during the exposure window remain in circulation with no mechanism for recall or deletion.
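The scanning technique these tools rely on is straightforward: an unauthenticated request to a bucket's S3 endpoint reveals its status from the HTTP response code alone. A minimal sketch of that classification logic (simplified - real scanners also handle redirects and region-specific endpoints):

```python
def classify_bucket(status_code: int) -> str:
    """Interpret the HTTP status an unauthenticated GET to
    https://<bucket>.s3.amazonaws.com returns. Simplified sketch:
    production scanners also follow 301 redirects to regional endpoints."""
    if status_code == 200:
        return "public (listable)"       # contents enumerable by anyone
    if status_code == 403:
        return "exists, access denied"   # bucket present but not world-readable
    if status_code == 404:
        return "no such bucket"
    return "indeterminate"
```

Because the namespace of bucket names is global and flat, wordlist-driven probing of candidate names finds newly exposed buckets quickly, which is why the exposure window should be assumed fully exploited.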

The educational testing context adds further concern. Parents who registered their children for scholastic testing did so with a reasonable expectation that the testing service would protect their children’s information. They were required to provide sensitive data - including national IDs and dates of birth - as a condition of participation. The service collected this data under an implicit duty of care that it comprehensively failed to honor.

For families in Egypt, where educational testing is often a high-stakes process linked to school placement and academic opportunity, opting out of data collection was not a realistic option.

This power asymmetry between the service and the families it served makes the negligent data handling particularly egregious.

The vendor landscape for educational technology in Egypt and the broader MENA region includes many smaller companies that may lack the security expertise and resources of larger technology firms. These EdTech providers often handle significant volumes of sensitive data - particularly children’s data - while operating with minimal security oversight. The Egypt Scholastic Test exposure is likely not an isolated incident but rather a visible example of a broader pattern of inadequate data security across the educational technology sector in the region. Without regulatory pressure or industry standards specific to EdTech data protection, similar exposures are likely occurring undetected.

Compliance Impact

Regulatory Analysis

This breach intersects with Egypt’s data protection framework at multiple critical points, and the involvement of children’s data elevates every regulatory dimension. Law No. 151 of 2020 on the Protection of Personal Data establishes children’s personal data as a special category requiring enhanced protections. While the law does not specify a distinct age threshold for children’s data (unlike the GDPR’s allowance for member states to set thresholds between 13 and 16), it recognizes the heightened vulnerability of minors and the need for additional safeguards when processing their information.

Article 2 of Law No. 151/2020 classifies data relating to children as “sensitive personal data” subject to enhanced processing restrictions. Under Article 3, the processing of sensitive personal data requires explicit consent - and for children, this consent must come from a parent or legal guardian. The educational testing service likely obtained some form of consent during the registration process, but consent for processing does not authorize negligent storage.

The obligation to protect data through its entire lifecycle, from collection through processing, storage, and eventual deletion, is a fundamental principle that the S3 misconfiguration violated at the storage phase. Consent to collect is not consent to expose.

Article 4 mandates that data controllers implement appropriate technical and organizational measures to ensure data security. An open S3 bucket with no authentication represents the most basic possible failure of this obligation.

The law does not specify particular technologies, but any reasonable interpretation of “appropriate measures” for children’s sensitive data would include, at minimum, access authentication, encryption at rest, access logging, and regular security reviews. The exposed bucket satisfied none of these requirements. The gap between the legal standard and the operational reality is not a matter of degree - it is a total absence of any security control whatsoever on a publicly accessible internet endpoint containing children’s personal data.

The data minimization principle embedded in Law No. 151/2020 also applies. The testing service should have evaluated whether it was necessary to collect and retain national ID numbers, parent contact details, dates of birth, and photographs in a single dataset. If testing could be administered using a pseudonymized identifier with the mapping to real identities stored separately under stronger controls, then the consolidation of all data elements in a single publicly accessible bucket violated the principle of collecting only what is necessary for the stated purpose.

The principle of storage limitation further requires that personal data be retained only for as long as necessary to fulfill the purpose for which it was collected. If test records from previous years were included in the bucket, the retention of historical children’s data beyond the period necessary for test administration raises additional compliance concerns.

Egypt’s Child Law (Law No. 12 of 1996, amended by Law No. 126 of 2008) provides additional protections for children’s rights, including the right to privacy and the obligation of institutions dealing with children to act in the child’s best interest. While this law was not designed with data protection specifically in mind, it establishes a broader legal framework within which the negligent exposure of 72,000 children’s personal data can be evaluated as a failure of institutional duty to the minors in their care.

The National Council for Childhood and Motherhood (NCCM), established under the Child Law, has authority to address violations of children’s rights and could potentially play a role in investigating data protection failures affecting minors even in the absence of full Data Protection Center capacity.

The international dimension of cloud infrastructure adds complexity to the regulatory analysis.

AWS S3 buckets are hosted in specific geographic regions, and if the bucket containing Egyptian children’s data was hosted outside Egypt (a common configuration for organizations that select regions based on cost or latency rather than data sovereignty), this raises cross-border data transfer issues under Law No. 151/2020. Article 14 restricts the transfer of personal data outside Egypt to countries that provide an adequate level of protection, with additional safeguards required in the absence of an adequacy determination.

The storage of sensitive children’s data on cloud infrastructure without consideration of data residency requirements adds a further dimension of non-compliance.

The enforcement challenge remains the central obstacle. The Data Protection Center established under Law No. 151/2020 has not achieved full operational capacity, meaning the specialized institution designed to investigate and penalize data protection violations cannot yet fulfill this function for a breach involving children’s data - the very category that most urgently demands regulatory attention.

The maximum penalty under the law is EGP 5 million (approximately $100,000 USD), which for a data breach affecting 72,000 children seems inadequate to achieve either punitive or deterrent objectives. The gap between the law on paper and its enforcement capacity in practice is perhaps nowhere more starkly illustrated than in a case involving the unprotected exposure of tens of thousands of children’s records to the open internet.

In the absence of domestic enforcement capacity, international data protection frameworks may provide additional accountability mechanisms. If any of the affected children hold EU citizenship or residency (as children of Egyptian expatriate families using the testing service from abroad), the GDPR could apply to the processing of their data, subjecting the testing service to EU regulatory enforcement.

Similarly, if the testing service processes data on behalf of schools or educational authorities in countries with active data protection enforcement, the contractual and regulatory obligations flowing from those relationships could create accountability pathways that do not depend on Egyptian enforcement capacity.

Assessment

What Should Have Been Done

The first and most fundamental requirement is proper cloud security configuration.

AWS provides multiple layers of protection against public S3 bucket exposure, and every single one of them should have been enabled. The S3 Block Public Access feature, available at both the account level and the bucket level, should have been activated as a blanket policy across the organization’s entire AWS account. This feature overrides any bucket-level configurations that might permit public access, acting as a failsafe against human error or deliberate misconfiguration.

AWS explicitly recommends this as a default security control for any account handling sensitive data, and it can be enabled with a single API call or console toggle.
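The setting in question is a four-flag configuration. The sketch below shows the flag set (matching the `PublicAccessBlockConfiguration` shape the S3 APIs accept, e.g. via boto3's `put_public_access_block`) together with a simple audit check; applying it to a live account is the single call referred to above:

```python
# The four S3 Block Public Access flags; all must be True for a full
# lockdown. This dict matches the PublicAccessBlockConfiguration shape
# the S3 / S3 Control put_public_access_block APIs accept.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # neutralize any existing public ACLs
    "BlockPublicPolicy": True,      # reject bucket policies granting public access
    "RestrictPublicBuckets": True,  # cut off public and cross-account access
}

def is_fully_locked_down(config: dict) -> bool:
    """True only if every Block Public Access flag is enabled."""
    return all(config.get(flag) is True for flag in PUBLIC_ACCESS_BLOCK)
```

Note that a partial configuration (any single flag left off) still leaves an exposure path, which is why the audit check requires all four.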

Beyond the Block Public Access control, the bucket should have been configured with a restrictive bucket policy that explicitly denied access from any principal outside the organization’s AWS account. IAM roles and policies should have been used to grant access only to specific application service accounts and authorized administrative users, following the principle of least privilege. No human user should have had permanent access to the bucket - access should have been granted through temporary role assumption with session time limits.
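A restrictive policy of the kind described can be expressed with an explicit Deny keyed on the `aws:PrincipalAccount` global condition key. A sketch, with the bucket name and account ID as placeholder values:

```python
import json

def deny_external_policy(bucket_name: str, account_id: str) -> str:
    """Bucket policy denying every request whose principal is outside the
    owning AWS account. bucket_name and account_id are placeholders;
    aws:PrincipalAccount is a global IAM condition key."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyExternalPrincipals",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:PrincipalAccount": account_id}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

An explicit Deny overrides any Allow elsewhere in the account, so this statement holds even if a later IAM policy is written too broadly.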

The data should have been encrypted at rest using AWS KMS with customer-managed keys, ensuring that even if bucket permissions were misconfigured, the underlying data would remain cryptographically protected and inaccessible without the appropriate decryption key.
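As a sketch of what that default-encryption configuration looks like (matching the `ServerSideEncryptionConfiguration` shape taken by the S3 `PutBucketEncryption` API; the key ARN is a placeholder):

```python
# Default-encryption rule for a bucket using a customer-managed KMS key.
# The KMS key ARN below is a placeholder, not a real key.
ENCRYPTION_CONFIG = {
    "Rules": [
        {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:REGION:ACCOUNT:key/KEY-ID",
            },
            # Bucket Keys reduce per-object KMS request volume and cost
            "BucketKeyEnabled": True,
        }
    ]
}
```

With a customer-managed key, access to the decryption operation is itself governed by a KMS key policy, providing an independent control plane from the bucket permissions.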

For data of this sensitivity involving minors, the testing service should have implemented data segregation and pseudonymization. Children’s national ID numbers, names, dates of birth, and photographs should never have been stored in the same bucket or database as test scores and school information. A pseudonymized architecture would store test records with randomly generated identifiers, with the mapping between those identifiers and real children’s identities stored in a separate, heavily restricted database with independent access controls and encryption.

This design ensures that even a complete compromise of the test results database exposes no personally identifiable information, and that de-pseudonymization requires access to a separate system that can be monitored and controlled independently.
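The split described above can be sketched in a few lines. Field names are illustrative, not the testing service's actual schema:

```python
import secrets

def pseudonymize(records: list[dict]) -> tuple[dict, list[dict]]:
    """Split each child record into (a) an identity vault keyed by a random
    pseudonym, to be stored separately under independent access controls,
    and (b) a results table holding only the pseudonym and score, safe for
    use by the testing application itself. Field names are illustrative."""
    identity_vault = {}
    results_table = []
    for rec in records:
        pid = secrets.token_hex(16)  # 128-bit random, non-derivable identifier
        identity_vault[pid] = {
            "name": rec["name"],
            "national_id": rec["national_id"],
            "birth_date": rec["birth_date"],
        }
        results_table.append({"pseudonym": pid, "score": rec["score"]})
    return identity_vault, results_table
```

The essential property is that the pseudonym is random rather than derived from the national ID, so no amount of computation on the results table alone can recover an identity.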

Automated cloud security posture management (CSPM) should have been deployed to continuously monitor the AWS environment for misconfigurations. Tools like AWS Config (with conformance packs for security best practices), AWS Security Hub (which aggregates findings from multiple security services), or third-party CSPM solutions (Prisma Cloud, Wiz, Lacework, Orca) can detect publicly accessible S3 buckets within minutes of misconfiguration and trigger automated remediation or alert security teams.

AWS Config can be configured with a rule that automatically checks whether any S3 bucket allows public access and triggers an AWS Lambda function to remediate the configuration immediately. For an organization handling children’s data, continuous configuration monitoring is not an optional enhancement - it is a fundamental control that compensates for the reality that human operators will inevitably make configuration mistakes.
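The remediation step can be sketched as a Lambda-style handler. The event shape here is simplified from the real AWS Config `invokingEvent` payload, and the live version would call `put_public_access_block` on the offending bucket rather than just returning the plan:

```python
def remediate_handler(event: dict) -> dict:
    """Handler for a (simplified) AWS Config noncompliance notification.
    Returns the remediation it would apply; a live deployment would call
    s3.put_public_access_block(Bucket=bucket, ...) here."""
    bucket = event["resourceId"]
    if event["complianceType"] != "NON_COMPLIANT":
        return {"bucket": bucket, "action": "none"}
    return {
        "bucket": bucket,
        "action": "put_public_access_block",
        "configuration": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    }
```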

Access logging and monitoring should have been enabled on the S3 bucket using AWS CloudTrail and S3 Server Access Logging. These controls record every access request to the bucket, including the source IP address, timestamp, action performed, and response status. Had logging been enabled, the organization would have been able to detect unusual access patterns - such as bulk downloads from unknown IP addresses or access from geographic regions inconsistent with normal application usage - and respond before months of exposure accumulated.

The absence of logging meant that even after the bucket was secured, the organization had no way to determine the full scope of unauthorized access, leaving it unable to assess which children’s data was specifically accessed and by whom.
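The bulk-download detection described above reduces to counting object reads per source IP once the access logs are parsed. A sketch over pre-parsed (IP, operation) pairs, with an illustrative threshold:

```python
from collections import Counter

def flag_bulk_downloaders(entries: list[tuple[str, str]],
                          threshold: int = 1000) -> list[str]:
    """Given (source_ip, operation) pairs distilled from S3 server access
    logs, flag IPs whose object-read count exceeds a threshold.
    REST.GET.OBJECT is the operation name S3 access logs use for object
    downloads; the threshold of 1000 is illustrative."""
    gets = Counter(ip for ip, op in entries if op == "REST.GET.OBJECT")
    return [ip for ip, count in gets.items() if count > threshold]
```

In practice this would run as a scheduled query over the log bucket (for example via Athena), alerting on any IP that crosses the threshold within a short window.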

The organization should have conducted a Data Protection Impact Assessment (DPIA) before deploying the testing platform. Any system processing children’s sensitive data at scale warrants a formal assessment of privacy risks, security controls, and the necessity and proportionality of data collection. A DPIA would have identified the S3 storage architecture as a high-risk processing activity and required specific technical safeguards before launch.

The assessment should have questioned whether it was necessary to store all data elements in a single location, whether pseudonymization could reduce risk, and whether the development team had sufficient cloud security expertise to configure the storage securely.

The absence of any formal risk assessment suggests that data protection was not considered during the system’s design or deployment phases.

Secure development practices should have been embedded in the application development lifecycle. The organization should have adopted a DevSecOps approach that integrates security checks into every stage of development, from code review through testing and deployment. Infrastructure-as-code tools (Terraform, CloudFormation) should have been used to define S3 bucket configurations in version-controlled templates that are reviewed for security before deployment.

Automated security scanning of infrastructure configurations (using tools like Checkov, tfsec, or AWS CloudFormation Guard) would have flagged a publicly accessible bucket configuration before it was ever deployed to production.
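The core of such a check is simple pattern matching over the template text. A toy version of the rule those tools implement far more thoroughly, flagging a public canned ACL in a Terraform-style template:

```python
import re

def find_public_acl(iac_source: str) -> list[int]:
    """Flag lines in a Terraform-style template that set a public canned
    ACL - a toy version of the checks Checkov or tfsec run against full
    parsed configurations. Returns 1-based offending line numbers."""
    pattern = re.compile(r'acl\s*=\s*"(public-read|public-read-write)"')
    return [
        lineno
        for lineno, line in enumerate(iac_source.splitlines(), start=1)
        if pattern.search(line)
    ]
```

Wired into a CI pipeline as a blocking check, even a rule this crude would have prevented the deployment at issue here.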

The organization should have established a responsible disclosure program that made it easy for security researchers to report vulnerabilities without legal risk. The fact that the exposure was discovered by external researchers rather than internal monitoring demonstrates that the organization’s own security capabilities were insufficient.

A security contact published in a security.txt file at the organization’s domain root, a bug bounty program through platforms like HackerOne or Bugcrowd, or at minimum a published responsible disclosure policy with a dedicated security email address would have facilitated faster remediation and demonstrated a commitment to security that is especially important for organizations handling children’s data.
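A security.txt file is small enough to show in full. A minimal sketch following RFC 9116, served at `/.well-known/security.txt`; the addresses and dates are placeholders:

```python
# Minimal security.txt per RFC 9116; Contact and Expires are the two
# required fields. All values below are placeholders.
SECURITY_TXT = "\n".join([
    "Contact: mailto:security@example.org",
    "Expires: 2026-12-31T23:59:59Z",
    "Policy: https://example.org/security-policy",
    "Preferred-Languages: en, ar",
])
```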

Data retention policies should have ensured that children’s data was deleted promptly after the testing purpose was fulfilled. Test scores and associated personal data should be retained only for the minimum period necessary to deliver test results and fulfill any legitimate administrative purpose. Automated data lifecycle management using S3 lifecycle policies can automatically transition data to restricted storage classes and eventually delete it according to predefined retention schedules.

The persistent storage of children’s data beyond its useful life increases exposure risk without any corresponding benefit, violating both the storage limitation principle and basic risk management logic.
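The lifecycle rule such a retention schedule requires is a short configuration. A sketch matching the rule shape taken by the S3 `PutBucketLifecycleConfiguration` API; the prefix and retention period are placeholders:

```python
# Lifecycle rule expiring registration objects a fixed number of days
# after creation. Prefix and retention period are illustrative values,
# not the testing service's actual schedule.
RETENTION_RULE = {
    "ID": "expire-registration-data",
    "Status": "Enabled",
    "Filter": {"Prefix": "registrations/"},
    "Expiration": {"Days": 365},  # delete one year after the testing cycle
}
```

Automating deletion this way removes the dependency on anyone remembering to purge old records: data that no longer exists cannot be exposed.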

Finally, regulatory and contractual safeguards should have been in place between the testing service and the educational institutions or government bodies that authorized the testing. These organizations have a duty to their students and their families to ensure that any third party handling children’s data maintains adequate security standards. Data processing agreements should mandate specific security controls, regular security assessments, and immediate breach notification.

The absence of these contractual safeguards suggests that the organizations commissioning the testing service did not conduct adequate due diligence on the service provider’s data protection capabilities before entrusting it with their students’ most sensitive information.

Exposing 72,000 children’s personal data on an unprotected cloud storage bucket is not a sophisticated attack or an unforeseeable event - it is a basic cloud security failure with lifelong consequences for the affected minors.

Egyptian children whose national IDs were exposed in 2022 will carry the risk of that exposure for decades. Organizations that collect children’s data accept an elevated duty of care that demands security measures proportionate to the vulnerability of their data subjects. When that duty is met with a publicly accessible S3 bucket and no monitoring, the failure is not technical - it is institutional.