Our Methodology

How SchoolSeek processes and presents school data.

Official EMIS Data + NSC Matric Results + NEIMS/EFMS Infrastructure

Last processed: 2026-03-30. School data: DBE EMIS Q3 2025. Matric results: NSC School Performance Report 2025. Infrastructure: NEIMS/EFMS 2021 (historical).

Data Source: EMIS

All institutional school data on SchoolSeek comes from the Education Management Information System (EMIS), maintained by the South African Department of Basic Education. The EMIS dataset includes institutional information for every registered school in South Africa, both public and independent. SchoolSeek currently uses the Q3 2025 national EMIS extract, which covers 25,527 schools. We filter this to the Western Cape province (1,935 schools) for our initial launch.

Data Processing Pipeline

Raw EMIS data is processed through an automated pipeline that: (1) parses the national Excel file, (2) filters to the target province, (3) normalises inconsistent fields (urban/rural casing, no-fee status values, phone numbers), (4) validates GIS coordinates against South Africa's geographic bounds, (5) replaces sentinel values (99, None) with explicit null markers, and (6) generates URL-safe slugs for every school and area. The pipeline runs comprehensive validation tests after every execution.
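Steps (3) to (6) above could look roughly like the following Python sketch. It is an illustration only, not SchoolSeek's actual pipeline code: the function names, the bounding-box values, and the string spellings of the sentinel values are all assumptions.

```python
import re
import unicodedata

# Assumed bounding box for mainland South Africa (illustrative values only).
SA_LAT_RANGE = (-35.0, -22.0)
SA_LON_RANGE = (16.0, 33.0)

# Sentinel values named in the text, plus assumed string spellings of them.
SENTINELS = {99, "99", "None", ""}


def clean_sentinel(value):
    """Return None for a sentinel value, otherwise the value unchanged.

    Only safe for fields where 99 is never a real value (e.g. quintile),
    not for counts such as learner numbers.
    """
    if value is None or value in SENTINELS:
        return None
    return value


def normalise_urban_rural(value):
    """Collapse casing variants such as ' URBAN ' to 'Urban' or 'Rural'."""
    value = clean_sentinel(value)
    if value is None:
        return None
    # Unrecognised values map to None rather than being guessed.
    return {"urban": "Urban", "rural": "Rural"}.get(str(value).strip().lower())


def in_south_africa(lat, lon):
    """True when both coordinates fall inside the assumed bounding box."""
    if lat is None or lon is None:
        return False
    return (SA_LAT_RANGE[0] <= lat <= SA_LAT_RANGE[1]
            and SA_LON_RANGE[0] <= lon <= SA_LON_RANGE[1])


def slugify(name):
    """Build a lowercase, hyphen-separated ASCII slug from a name."""
    ascii_name = (unicodedata.normalize("NFKD", name)
                  .encode("ascii", "ignore").decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
```

In this sketch, a name with a diacritic such as "Bellville Hoërskool" yields the slug "bellville-hoerskool", and a coordinate pair with latitude and longitude swapped fails the bounds check.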

Class Size Context Methodology

Schools are grouped into peer groups based on three dimensions: quintile (or independent status), phase (primary, secondary, combined), and province. Within each peer group, we calculate the percentile rank for the learner-to-educator ratio. A school with "smaller classes than 80% of schools in the same quintile and phase" has a lower (more favourable) ratio than 80% of schools in its peer group. Schools with missing data receive no percentile. Peer groups with fewer than 5 schools are flagged as having limited comparison value.
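As a rough illustration of the peer-group calculation, the sketch below computes, for each school, the percentage of other schools in its peer group with a higher learner-to-educator ratio. The record keys, the treatment of ties (a tied school does not count as "larger"), and the choice to divide by the number of other schools in the group are assumptions, not details confirmed by the pipeline.

```python
from collections import defaultdict

MIN_PEER_GROUP_SIZE = 5  # groups smaller than this are flagged, per the text


def class_size_percentiles(schools):
    """Return {emis: (percent of peers with a higher ratio, limited_flag)}.

    Each school is a dict with hypothetical keys: emis, quintile, phase,
    province, and ratio (learners per educator, or None when missing).
    Schools with a missing ratio get no entry at all.
    """
    groups = defaultdict(list)
    for school in schools:
        if school["ratio"] is not None:
            key = (school["quintile"], school["phase"], school["province"])
            groups[key].append(school)

    result = {}
    for members in groups.values():
        limited = len(members) < MIN_PEER_GROUP_SIZE
        others = len(members) - 1
        for school in members:
            if others == 0:
                # A school alone in its group has nothing to be compared with.
                result[school["emis"]] = (None, True)
                continue
            larger = sum(1 for m in members if m["ratio"] > school["ratio"])
            result[school["emis"]] = (round(100 * larger / others), limited)
    return result
```

For independent schools, the `quintile` key in this sketch would hold a marker such as "independent" so that they form their own peer groups.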

Data Exclusions and Overrides

To ensure meaningful comparisons, we apply two data quality measures: (1) Schools with fewer than 5 enrolled learners are excluded from class size percentile calculations. These are typically newly registered, not yet operational, or special-purpose schools where a ratio like "0.5:1" would be statistically meaningless and would skew peer group distributions. These schools still have profiles on SchoolSeek — they simply show "comparison not available" with an explanation. (2) We maintain a manual overrides file to correct known EMIS data errors (such as address data in suburb fields). All overrides are documented with the original value, the correction, and the reason. Both the exclusion threshold and all overrides are visible in our open-source pipeline code.
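The two measures could be expressed along these lines. This is a minimal sketch: the override record layout (emis, field, original, corrected, reason) mirrors the documentation described above but is a hypothetical format, and the rule that an override only applies while the current value still matches the recorded original is an assumed safeguard.

```python
MIN_LEARNERS = 5  # exclusion threshold stated in the text


def eligible_for_comparison(school):
    """True when a school has enough learners for a meaningful ratio."""
    learners = school.get("learners")
    return learners is not None and learners >= MIN_LEARNERS


def apply_overrides(school, overrides):
    """Return a corrected copy of a school record, leaving the input intact.

    Each override is a dict with hypothetical keys: emis, field, original,
    corrected, reason. An override is skipped when the current value no
    longer matches the recorded original, so a stale correction cannot
    silently overwrite data fixed upstream in a newer EMIS extract.
    """
    fixed = dict(school)
    for override in overrides:
        if (override["emis"] == school["emis"]
                and fixed.get(override["field"]) == override["original"]):
            fixed[override["field"]] = override["corrected"]
    return fixed
```

Schools that fail `eligible_for_comparison` would still be published; they would only be left out of the percentile calculation.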

NSC Matric Results

SchoolSeek integrates National Senior Certificate (NSC) examination results from the DBE School Performance Report 2025 (published January 2026). This report covers all schools that participated in the 2025 NSC Examination, including public, special, and independent schools. For the Western Cape, 435 secondary and combined schools have matric data. For each year (2023, 2024, and 2025), the data includes the number of candidates who wrote the exam, the number who achieved the NSC, and the pass rate percentage. We do not have per-school bachelor pass rates or distinction counts, because the School Performance Report publishes only overall pass/fail data at the school level.

Matric Pass Rate Context

We calculate pass rate percentiles within peer groups (schools in the same quintile and province). This contextual comparison is essential for impartiality — a Quintile 1 school's pass rate should be compared to other Quintile 1 schools, not to Quintile 5 schools with vastly different resources. The percentile shows what proportion of schools in the same quintile achieved a lower pass rate. For example, "higher pass rate than 75% of Q3 schools" means this school's pass rate exceeds three-quarters of Quintile 3 schools in the Western Cape. Schools are joined to matric data by EMIS number. Primary-only schools (which do not write matric) are correctly excluded.
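The join and the percentile can be sketched as follows, assuming both inputs are already filtered to one province. The dictionary shapes are hypothetical, and dividing by the number of other schools in the quintile is one possible reading of "proportion of schools"; the real pipeline may handle ties and denominators differently.

```python
from collections import defaultdict


def pass_rate_percentiles(schools, matric):
    """Return {emis: percent of same-quintile schools with a lower pass rate}.

    schools: {emis_number: quintile} for one province (hypothetical shape).
    matric:  {emis_number: pass_rate_percent} from the NSC report.
    """
    # Join on EMIS number. Schools without a matric row (for example,
    # primary-only schools) simply drop out of the comparison.
    by_quintile = defaultdict(dict)
    for emis, quintile in schools.items():
        if emis in matric:
            by_quintile[quintile][emis] = matric[emis]

    result = {}
    for rates in by_quintile.values():
        others = len(rates) - 1
        for emis, rate in rates.items():
            if others == 0:
                result[emis] = None  # nothing to compare against
                continue
            lower = sum(1 for r in rates.values() if r < rate)
            result[emis] = round(100 * lower / others)
    return result
```

With five Quintile 3 schools at 50%, 60%, 70%, 80%, and 90%, the school at 80% has a higher pass rate than three of its four peers, which this sketch reports as 75.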

Infrastructure Data (NEIMS/EFMS)

SchoolSeek displays province-level school infrastructure statistics from the National Education Infrastructure Management System (NEIMS), as reported on 12 April 2021 by the Department of Basic Education. NEIMS has since been superseded by the Education Facility Management System (EFMS), but the EFMS 2023 report uses the same 2021 data snapshot — meaning the 2021 figures remain the most recent publicly available infrastructure census data. This data covers 1,457 ordinary operational public schools in the Western Cape and includes: facilities (libraries, laboratories, computer centres, sports facilities), utilities (electricity, water supply, sanitation), connectivity (internet access, landline, cell network), and security features. Important: This is historical province-level aggregate data — it shows what percentage of schools had each facility in 2021, not whether a specific school has a library or lab today. Per-school infrastructure data is not publicly available from the DBE. When per-school data becomes available (via future DBE publications or PAIA requests), we will integrate it to show facility information on individual school profiles.

What We Measure

With EMIS data and NSC results, we can report on: school name, EMIS number, phase, sector, quintile, no-fee status, urban/rural classification, learner count, educator count, learner-to-educator ratio, physical address, GIS coordinates, contact details, and NSC matric pass rates (2023–2025). We calculate the learner-to-educator ratio and peer group percentiles for both class size and pass rate.

What We Cannot Measure

Our data does not include: bachelor pass rates or distinction counts per school (the DBE School Performance Report only publishes overall pass/fail at school level), learner progress or value-added measures, teaching quality assessments, school culture or safety, parent or learner satisfaction, extracurricular offerings, or per-school infrastructure details (whether a specific school has a library, lab, or computer centre — only province-level NEIMS aggregate data is publicly available). We present what we can verify from official sources and clearly label the limitations.

Impartiality Principles

SchoolSeek follows strict impartiality rules: (1) We never produce composite scores, letter grades, or star ratings — these would primarily reflect community wealth, not school quality. (2) We always compare like-with-like — a Quintile 1 school is only compared to other Quintile 1 schools. (3) We never conflate resources with quality — a school's quintile and funding level describe its context, not its effectiveness. (4) We present "Data not available" rather than hiding missing fields. (5) Before publishing any metric, we ask: "Will this cause schools serving disadvantaged communities to be perceived as inferior?"

Data Coverage and Completeness

We cross-referenced our EMIS Q3 2025 dataset against multiple external sources to verify coverage: (1) the WCED Sector Analysis 2022 (published by the World Bank), which reported 1,827 schools (1,451 public ordinary, 304 independent, 72 special needs); (2) Wikipedia's List of Secondary Schools in the Western Cape; and (3) ISASA independent school directories. Our dataset contains 1,935 schools, 108 more than the 2022 baseline, which is consistent with three years of new school registrations (particularly independent schools). Over 97% of Wikipedia-listed secondary schools were found in our data; most apparent gaps were due to Afrikaans name variants (e.g., "Bellville High" is listed as "Bellville Hoërskool" in EMIS). GIS coordinate coverage is 99.1% (1,918 of 1,935 schools). A small number of schools are commonly known by names that differ from their EMIS registration names; for example, Diocesan College appears as "Bishops" and SACS appears as "S.A. College". These mappings are documented in our overrides file. Full audit findings are published in our data repository.

Data Limitations and Known Issues

The EMIS dataset has known limitations. Approximately 25% of Western Cape schools have missing suburb values (we use fallback logic: suburb, then town/city, then district). Some Eastern Cape schools have latitude and longitude coordinates swapped (corrected in our pipeline). The NoFeeSchool and Urban_Rural fields have inconsistent formatting across provinces (normalised in our pipeline). GIS coordinates for some schools are missing or inaccurate (17 of the 1,935 Western Cape schools lack coordinates). We validate all coordinates against South Africa's geographic bounds and flag outliers. The Eden and Central Karoo district name is truncated in some EMIS records; both variants are mapped to the same district in our pipeline.
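Two of these workarounds, the suburb fallback and the swapped-coordinate correction, could be sketched as below. The field names, status labels, and bounding-box values are assumptions made for illustration and are not taken from the pipeline.

```python
# Assumed bounding box for mainland South Africa (illustrative values only).
SA_LAT_RANGE = (-35.0, -22.0)
SA_LON_RANGE = (16.0, 33.0)


def in_bounds(lat, lon):
    """True when the pair falls inside the assumed bounding box."""
    return (SA_LAT_RANGE[0] <= lat <= SA_LAT_RANGE[1]
            and SA_LON_RANGE[0] <= lon <= SA_LON_RANGE[1])


def check_coordinates(lat, lon):
    """Return (lat, lon, status), swapping the pair when that fixes it.

    Status is one of 'missing', 'ok', 'swapped', or 'outlier'.
    """
    if lat is None or lon is None:
        return None, None, "missing"
    if in_bounds(lat, lon):
        return lat, lon, "ok"
    if in_bounds(lon, lat):
        # South African latitudes are negative and longitudes positive,
        # so a swapped pair is only valid once the values are exchanged.
        return lon, lat, "swapped"
    return lat, lon, "outlier"


def area_label(record):
    """Pick the first non-empty value of suburb, town/city, district."""
    for key in ("suburb", "town_city", "district"):
        value = (record.get(key) or "").strip()
        if value:
            return value
    return None
```

Because the two coordinate ranges do not overlap in this sketch, a pair can never pass the bounds check in both orders, so the swap is unambiguous.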

Updates and Corrections

EMIS data is updated annually by the Department of Basic Education. SchoolSeek refreshes its data when new EMIS extracts are published. The "last updated" date on every page reflects when our data was last processed. If you notice an error in a school's data, please contact us — we will verify against the source EMIS data and correct our pipeline if needed.