Our Methodology

How SchoolSeek processes and presents school data.

Official EMIS Data + NSC Matric Results + NEIMS/EFMS Infrastructure

Last processed: 2026-05-03. School data: DBE EMIS Q3 2025. Matric results: NSC School Performance Report 2025. Infrastructure: NEIMS/EFMS 2021 (historical).

Data Source: EMIS

Core school data on SchoolSeek comes from the Education Management Information System (EMIS), maintained by the South African Department of Basic Education; matric results and infrastructure statistics come from the separate sources described below. The EMIS dataset includes institutional information for every registered school in South Africa, both public and independent. SchoolSeek currently uses the Q3 2025 national EMIS extract, which covers 25,527 schools across all 9 provinces.

EMIS Classification Fields

Beyond core fields like name, location, and learner count, we surface EMIS classification metadata including phase, sector, quintile, and specialisation. The specialisation field distinguishes ordinary schools (the majority), schools focused on specific subject phases (PRIMARY SUBJECTS, SECONDARY SUBJECTS), and schools serving specific learner groups (SPECIAL NEEDS EDUCATION, SCHOOL OF SKILL). Some provinces include richer SEN sub-types in this field — e.g. AUTISM, DEAF, CEREBRAL PALSIED. These are administrative classifications: they describe what kind of school it is, not how well it performs. We never use specialisation as an input to peer comparison, ranking, or any quality signal. Like all EMIS data, classification values reflect the most recent annual snapshot we have ingested. Specialisation is among the most stable EMIS fields — a school's administrative classification rarely changes between annual refreshes (reclassification is a multi-year DBE process), so values are typically reliable. Contact details, by contrast, are far more likely to drift between refreshes; please report any incorrect contact information so we can verify and update.

Data Processing Pipeline

Raw EMIS data is processed through an automated pipeline that: (1) parses the national Excel file, (2) filters to the target province, (3) normalises inconsistent fields (urban/rural casing, no-fee status values, phone numbers), (4) validates GIS coordinates against South Africa's geographic bounds, (5) replaces sentinel values (99, None) with explicit null markers, and (6) generates URL-safe slugs for every school and area. The pipeline runs comprehensive validation tests after every execution.
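
Steps (3) to (6) can be illustrated with a minimal Python sketch. This is not SchoolSeek's pipeline code: the function names, the bounding-box values, and the exact sentinel spellings are all assumptions chosen for illustration.

```python
import re
import unicodedata

# Approximate bounding box for mainland South Africa (assumed values).
SA_LAT = (-35.0, -22.0)
SA_LON = (16.0, 33.0)
# Assumed sentinel spellings. A real pipeline would apply this only to
# fields where 99 is a documented placeholder, not to genuine counts.
SENTINELS = {"99", "None", ""}


def clean_value(value):
    """Replace sentinel values with an explicit null (None)."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text in SENTINELS else text


def normalise_urban_rural(value):
    """Collapse casing variants such as 'urban' or 'Urban ' to 'URBAN'."""
    text = clean_value(value)
    return text.upper() if text is not None else None


def coords_in_bounds(lat, lon):
    """True only when both coordinates are present and inside the box."""
    if lat is None or lon is None:
        return False
    return SA_LAT[0] <= lat <= SA_LAT[1] and SA_LON[0] <= lon <= SA_LON[1]


def slugify(name):
    """Build a URL-safe slug: ASCII only, lowercase, hyphen-separated."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
```

For example, slugify("Bellville Hoërskool") yields "bellville-hoerskool", and a record whose latitude and longitude fall outside the box fails coords_in_bounds and can be flagged for review.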

Learner-Educator Ratio Methodology

Schools are grouped into peer groups based on three dimensions: quintile (or independent status), phase (primary, secondary, combined), and province. Within each peer group, we calculate the percentile rank for the learner-to-educator ratio. A school with a "lower L:E ratio than 80% of similar schools" has a more favourable ratio than 80% of schools in its peer group. Important: the EMIS "educators" count includes all staff classified as educators — principals, deputy principals, heads of department, and specialist staff — not just classroom teachers. The L:E ratio is therefore a staffing indicator, not a direct measure of class size. Actual class sizes are typically higher than the L:E ratio suggests. Schools with missing data receive no percentile. Peer groups with fewer than 5 schools are flagged as having limited comparison value.
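
The peer-group percentile described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the production calculation: the record keys (emis, quintile, phase, province, learners, educators) are hypothetical, and we assume the percentile is the share of the other schools in the peer group that have a strictly higher ratio.

```python
from collections import defaultdict

MIN_PEER_GROUP = 5  # groups smaller than this are flagged as limited


def le_ratio(school):
    """Learners per educator, or None when either count is missing or zero."""
    learners = school.get("learners")
    educators = school.get("educators")
    if not learners or not educators:
        return None
    return learners / educators


def le_percentiles(schools):
    """Map EMIS number to the share of peers with a higher L:E ratio."""
    groups = defaultdict(list)
    for school in schools:
        ratio = le_ratio(school)
        if ratio is None:
            continue  # schools with missing data receive no percentile
        key = (school["quintile"], school["phase"], school["province"])
        groups[key].append((school["emis"], ratio))

    results = {}
    for members in groups.values():
        limited = len(members) < MIN_PEER_GROUP
        for emis, ratio in members:
            peers = [r for e, r in members if e != emis]
            if peers:
                higher = sum(1 for r in peers if r > ratio)
                pct = round(100 * higher / len(peers))
            else:
                pct = None  # no peers to compare against
            results[emis] = {"lower_than_pct": pct, "limited": limited}
    return results
```

A result of lower_than_pct = 80 corresponds to the phrase "lower L:E ratio than 80% of similar schools". Independent schools can be handled by using a value such as "INDEPENDENT" in place of the quintile in the grouping key.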

Data Exclusions and Overrides

To ensure meaningful comparisons, we apply several data quality measures: (1) Schools with fewer than 5 enrolled learners are excluded from learner-educator ratio percentile calculations. These are typically newly registered, not yet operational, or special-purpose schools where a ratio like "0.5:1" would be statistically meaningless and would skew peer group distributions. These schools still have profiles on SchoolSeek — they simply show "comparison not available" with an explanation. (2) We maintain a manual overrides file to correct known EMIS data errors (such as incorrect suburb fields, changed phone numbers, or other contact details). All overrides are documented with the original value, the correction, and the reason. (3) Area-to-district assignment uses majority voting — when schools in the same suburb span multiple education districts (due to EMIS data errors or boundary edge cases), the area is assigned to the district containing the majority of its schools. This prevents a single misclassified school from placing an entire area in the wrong district. Both the exclusion threshold and all overrides are documented and available on request.
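
The majority-voting rule in (3) can be sketched in a few lines of Python. The record keys (suburb, district) are hypothetical, and the tie-breaking behaviour shown here (the district seen first wins) is an assumption, since the text above does not say how ties are resolved.

```python
from collections import Counter, defaultdict


def assign_districts(schools):
    """Assign each suburb to the district that most of its schools report."""
    votes = defaultdict(Counter)
    for school in schools:
        suburb = school.get("suburb")
        district = school.get("district")
        if suburb and district:
            votes[suburb][district] += 1
    # most_common(1) returns the top (district, count) pair; on a tie the
    # district encountered first is returned (assumed tie-break).
    return {
        suburb: counts.most_common(1)[0][0]
        for suburb, counts in votes.items()
    }
```

With two schools in a suburb recorded under one district and a single school recorded under another, the suburb is assigned to the first district, so one misclassified record cannot move the whole area.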

EMIS data freshness and user-reported corrections

The Department of Basic Education refreshes EMIS once per year. Between refreshes, school details — particularly contact information like phone numbers and email addresses — can become outdated. Schools rarely submit interim corrections to EMIS, so a number on EMIS may continue to point to a disconnected line for many months after a school has changed it. We address this in two ways. First, we cross-reference school contact details against secondary sources (school websites, Facebook pages, recent news coverage, and direct school communication) when discrepancies are flagged. Second, we accept and apply user-reported corrections — but only when corroborated by at least one credible secondary source (such as the school's own Facebook page, an official school communication, or direct confirmation from school leadership). Every applied correction is logged in our public overrides file with the reporting source, the corroboration, and the date applied. We never apply a correction based on a single anonymous report without independent verification, and we never alter peer-comparison or matric data through this channel — overrides are scoped to factual contact and location fields only. If you spot incorrect information on a school profile, please email us with the correction and any supporting evidence; we will verify and update.
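
One way to picture an entry in the overrides file is as a small validated record. The sketch below is hypothetical: the class name, the field names, and the list of allowed fields are assumptions for illustration, not the actual file format. It encodes the two rules stated above: overrides are limited to contact and location fields, and every correction needs at least one corroborating source.

```python
from dataclasses import dataclass
from datetime import date

# Assumed set of fields that overrides may touch (contact and location only).
ALLOWED_FIELDS = {"phone", "email", "address", "suburb", "latitude", "longitude"}


@dataclass(frozen=True)
class Override:
    emis: str
    field_name: str
    original: str
    corrected: str
    reason: str
    reported_by: str
    corroboration: tuple  # one or more independent sources
    applied_on: date

    def __post_init__(self):
        if self.field_name not in ALLOWED_FIELDS:
            raise ValueError(f"{self.field_name!r} cannot be overridden")
        if not self.corroboration:
            raise ValueError("at least one corroborating source is required")
```

Under this sketch, an attempt to override a pass rate or a learner count raises an error, as does an uncorroborated report.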

NSC Matric Results

SchoolSeek integrates National Senior Certificate (NSC) examination results from the DBE 2025 School Performance Report (published January 2026, source: education.gov.za). This report covers all schools that participated in the 2025 NSC Examination across all 9 provinces, including public, special, and independent schools. Nationally, 6,578 schools have matric data. For each school we store: number of progressed learners, number of candidates who wrote the exam, number who achieved the NSC, and the pass rate percentage for each year (2023, 2024, and 2025). On school profiles we display the pass rate, the number of candidates who passed out of those who wrote, and a 3-year trend chart. We show candidate counts alongside pass rates because they add essential context — a 100% pass rate from 5 candidates is a very different result from 100% from 500 candidates. We do not have per-school bachelor pass rates or distinction counts — the School Performance Report only publishes overall pass/fail data at the school level. IEB and Cambridge schools do not have per-school results in the DBE report; these schools are clearly labelled as "IEB/Cambridge results not publicly available" rather than shown as having no data.
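
The relationship between the stored counts and the displayed pass rate is simple arithmetic, sketched below. The function names and the one-decimal rounding are assumptions; the DBE report publishes its own pass rate figure, so this only illustrates why the candidate count is shown next to the percentage.

```python
def pass_rate(wrote, achieved):
    """Pass rate as a percentage of candidates who wrote, or None if nobody wrote."""
    if not wrote:
        return None
    return round(100 * achieved / wrote, 1)


def display(wrote, achieved):
    """Format the pass rate together with the candidate counts."""
    rate = pass_rate(wrote, achieved)
    if rate is None:
        return "Data not available"
    return f"{rate}% ({achieved} of {wrote} candidates passed)"
```

Both display(5, 5) and display(500, 500) report 100.0%, but the counts in brackets make the difference in scale visible.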

Matric Pass Rate Context

We calculate pass rate percentiles within peer groups (schools in the same quintile and province). This contextual comparison is essential for impartiality: a Quintile 1 school's pass rate should be compared to other Quintile 1 schools, not to Quintile 5 schools with vastly different resources. The percentile shows what proportion of schools in the same peer group achieved a lower pass rate. For example, for a Western Cape school, "higher pass rate than 75% of Q3 schools" means its pass rate exceeds that of three-quarters of Quintile 3 schools in the Western Cape. Schools are joined to matric data by EMIS number. Primary-only schools, which do not write matric, are excluded.
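
The join and the peer-group percentile can be sketched as below. This is a sketch under assumptions rather than the production code: the input shapes are hypothetical (two dictionaries keyed by EMIS number), and the percentile is taken as the share of the other schools in the group with a strictly lower pass rate.

```python
from bisect import bisect_left
from collections import defaultdict


def pass_rate_percentiles(schools, matric):
    """schools: {emis: {"quintile": ..., "province": ...}}; matric: {emis: rate}."""
    groups = defaultdict(list)
    for emis, rate in matric.items():
        school = schools.get(emis)  # join on EMIS number
        if school is None:
            continue  # matric row with no matching school record
        groups[(school["quintile"], school["province"])].append((emis, rate))

    results = {}
    for members in groups.values():
        rates = sorted(rate for _, rate in members)
        for emis, rate in members:
            if len(rates) < 2:
                results[emis] = None  # no peers to compare against
                continue
            lower = bisect_left(rates, rate)  # peers with a strictly lower rate
            results[emis] = round(100 * lower / (len(rates) - 1))
    return results
```

Because the loop starts from the matric table, schools that never wrote matric (such as primary-only schools) simply never enter a peer group.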

Infrastructure Data (NEIMS/EFMS)

SchoolSeek displays province-level school infrastructure statistics from the National Education Infrastructure Management System (NEIMS), as reported on 12 April 2021 by the Department of Basic Education. NEIMS has since been superseded by the Education Facility Management System (EFMS), but the EFMS 2023 report uses the same 2021 data snapshot — meaning the 2021 figures remain the most recent publicly available infrastructure census data. This data covers 1,457 ordinary operational public schools in the Western Cape and includes: facilities (libraries, laboratories, computer centres, sports facilities), utilities (electricity, water supply, sanitation), connectivity (internet access, landline, cell network), and security features. Important: This is historical province-level aggregate data — it shows what percentage of schools had each facility in 2021, not whether a specific school has a library or lab today. Per-school infrastructure data is not publicly available from the DBE. When per-school data becomes available (via future DBE publications or PAIA requests), we will integrate it to show facility information on individual school profiles.

What We Measure

With EMIS data and NSC results, we can report on: school name, EMIS number, phase, sector, specialisation (administrative classification — see "EMIS Classification Fields" above), quintile, no-fee status, urban/rural classification, learner count, educator count, learner-to-educator ratio, physical address, GIS coordinates, contact details, and NSC matric pass rates (2023–2025). We calculate the learner-to-educator ratio and peer group percentiles for both L:E ratio and pass rate. Note: the educator count includes all staff classified as educators by EMIS (teachers, HODs, deputy principals, principals) — the L:E ratio is a staffing indicator, not a direct measure of class size.

What We Cannot Measure

Our data does not include: bachelor pass rates or distinction counts per school (the DBE School Performance Report only publishes overall pass/fail at school level), learner progress or value-added measures, teaching quality assessments, school culture or safety, parent or learner satisfaction, extracurricular offerings, or per-school infrastructure details (whether a specific school has a library, lab, or computer centre — only province-level NEIMS aggregate data is publicly available). We present what we can verify from official sources and clearly label the limitations.

Impartiality Principles

SchoolSeek follows strict impartiality rules: (1) We never produce composite scores, letter grades, or star ratings — these would primarily reflect community wealth, not school quality. (2) We always compare like-with-like — a Quintile 1 school is only compared to other Quintile 1 schools. (3) We never conflate resources with quality — a school's quintile and funding level describe its context, not its effectiveness. (4) We present "Data not available" rather than hiding missing fields. (5) Before publishing any metric, we ask: "Will this cause schools serving disadvantaged communities to be perceived as inferior?"

Data Coverage and Completeness

We cross-referenced the Western Cape portion of our EMIS Q3 2025 dataset against multiple external sources to verify coverage: (1) the WCED Sector Analysis 2022 (published by the World Bank), which reported 1,827 schools (1,451 public ordinary, 304 independent, 72 special needs); (2) Wikipedia's List of Secondary Schools in the Western Cape; and (3) ISASA independent school directories. Our Western Cape dataset contains 1,935 schools, 108 more than the 2022 baseline, which is consistent with three years of new school registrations (particularly independent schools). Over 97% of Wikipedia-listed secondary schools were found in our data; most apparent gaps were due to Afrikaans name variants (e.g., "Bellville High" is listed as "Bellville Hoërskool" in EMIS). GIS coordinate coverage is 99.1% (1,918 of 1,935 schools). A small number of schools are commonly known by names that differ from their EMIS registration names; for example, Diocesan College appears as "Bishops" and SACS appears as "S.A. College". These mappings are documented in our overrides file. Full audit findings are published in our data repository.
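
The coverage figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones stated in the paragraph.

```python
# Figures as quoted in the coverage paragraph above.
baseline_2022 = {"public_ordinary": 1451, "independent": 304, "special_needs": 72}
dataset_total = 1935
with_coordinates = 1918

baseline_total = sum(baseline_2022.values())   # schools in the 2022 baseline
growth = dataset_total - baseline_total        # schools added since 2022
gis_coverage = round(100 * with_coordinates / dataset_total, 1)  # percent

print(baseline_total, growth, gis_coverage)
```

The three components of the 2022 baseline sum to 1,827, the difference from the current dataset is 108 schools, and the coordinate coverage rounds to 99.1%.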

Data Limitations and Known Issues

The EMIS dataset has known limitations: (1) Approximately 100 of 601 Northern Cape schools are missing GIS coordinates in the EMIS source data. These schools will not appear as pins in the Map Explorer, but their profiles and list-view entries remain fully functional. The remaining Northern Cape schools have valid coordinates and show on the map normally. (2) Approximately 25% of schools nationally have missing suburb values (we use fallback logic: suburb, then town/city, then district). (3) Some Eastern Cape schools have latitude and longitude coordinates swapped (corrected in our pipeline). (4) The Urban/Rural classification field is sourced directly from EMIS and reflects historical registration data. Some schools, particularly in metropolitan areas, may be classified as RURAL despite being in urban suburbs. This is a known EMIS data quality issue and does not reflect the current operational setting of the school. (5) A small number of schools at Quintile 4 or 5 are classified as No-Fee in the EMIS data. This can occur when schools are reclassified from lower quintiles but the no-fee status field is not updated simultaneously in the source data. Parents should verify current fee arrangements directly with the school. (6) The NoFeeSchool field and various name fields have inconsistent formatting across provinces (normalised in our pipeline). (7) The Eden and Central Karoo district name is truncated in some EMIS records; both variants are mapped to the same district in our pipeline. In addition, we validate all coordinates against South Africa's geographic bounds and flag outliers.
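
Two of the corrections mentioned above, the suburb fallback in (2) and the swapped coordinates in (3), can be sketched as follows. This is a minimal illustration, not the pipeline itself: the field names and bounding-box values are assumptions. The swap check works because South African latitudes are negative and longitudes positive, so a swapped pair falls outside the box until it is reversed.

```python
# Approximate bounding box for mainland South Africa (assumed values).
SA_LAT = (-35.0, -22.0)
SA_LON = (16.0, 33.0)


def in_bounds(lat, lon):
    """True when the pair lies inside the assumed South African box."""
    return SA_LAT[0] <= lat <= SA_LAT[1] and SA_LON[0] <= lon <= SA_LON[1]


def fix_coordinates(lat, lon):
    """Return (lat, lon, status): 'ok', 'swapped', 'outlier', or 'missing'."""
    if lat is None or lon is None:
        return None, None, "missing"
    if in_bounds(lat, lon):
        return lat, lon, "ok"
    if in_bounds(lon, lat):
        return lon, lat, "swapped"  # values were entered in the wrong order
    return None, None, "outlier"    # flag for manual review


def area_name(record):
    """Fallback order: suburb, then town/city, then district."""
    for key in ("suburb", "town_city", "district"):
        value = (record.get(key) or "").strip()
        if value:
            return value
    return None
```

For instance, fix_coordinates(27.9, -32.5) returns (-32.5, 27.9, "swapped"), while a pair that fits neither orientation is returned as an outlier rather than silently kept.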

Updates and Corrections

EMIS data is updated annually by the Department of Basic Education. SchoolSeek refreshes its data when new EMIS extracts are published. The "last updated" date on every page reflects when our data was last processed. If you notice an error in a school's data, please contact us — we will verify against the source EMIS data and correct our pipeline if needed.