95 changes: 95 additions & 0 deletions products/facilities/TODO_csv_refactor.md
@@ -0,0 +1,95 @@
# TODO: CSV Refactor for FACTYPE/FACSUBGRP Assignment

## Goal

Replace the hardcoded CASE statements that assign `factype` and `facsubgrp` in pipeline SQL files with two lookup CSVs. This enforces the many-to-one relationship from `FACTYPE` to `FACSUBGRP` (each `factype` belongs to exactly one `facsubgrp`) structurally rather than through a dbt test.

## Prerequisites

- The `many_to_one_facdb_export__FACTYPE____FACSUBGRP_` dbt test is **passing** (completed).
- All pipeline SQL files have been fixed so each `factype` maps to exactly one `facsubgrp`.

## Two Lookup CSVs to Create

### 1. `facdb/data/lookup_factype_source.csv`
Maps source values to `factype`:
- Columns: `source_name`, `source_column`, `source_value`, `factype`
- Replaces the per-pipeline CASE statements that assign `factype`
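
A minimal sketch of how a lookup join could stand in for a per-pipeline CASE. It assumes SQLite (through Python's `sqlite3`) as a stand-in for the project's PostgreSQL database; the `merged` table and `classification_key` column mirror the names used in the `nysed_activeinstitutions` pipeline, and the `UNKNOWN|VALUE` row is a hypothetical unmapped value.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lookup_factype_source (
        source_name TEXT,
        source_column TEXT,
        source_value TEXT,
        factype TEXT
    );
    INSERT INTO lookup_factype_source VALUES
        ('nysed_activeinstitutions', 'classification_key',
         'CUNY|CUNY COMMUNITY COLLEGE', 'CUNY - Community College'),
        ('nysed_activeinstitutions', 'classification_key',
         'LIBRARIES|SPECIAL LIBRARIES', 'Special Libraries');

    -- Stand-in for the pipeline's intermediate rows.
    CREATE TABLE merged (classification_key TEXT);
    INSERT INTO merged VALUES ('CUNY|CUNY COMMUNITY COLLEGE');
    INSERT INTO merged VALUES ('UNKNOWN|VALUE');
    """
)

# The LEFT JOIN keeps every source row; rows without a lookup match
# come back with factype = NULL (None in Python).
rows = conn.execute(
    """
    SELECT merged.classification_key, lfs.factype
    FROM merged
    LEFT JOIN lookup_factype_source AS lfs
        ON lfs.source_name = 'nysed_activeinstitutions'
        AND lfs.source_value = merged.classification_key
    ORDER BY merged.classification_key
    """
).fetchall()

for key, factype in rows:
    print(key, "->", factype)
```

Because the join is a `LEFT JOIN`, a source value with no row in the CSV yields a NULL `factype` instead of dropping the facility, which is what lets a `not_null` test surface it.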

### 2. `facdb/data/lookup_factype.csv`
Maps `factype` to `facsubgrp`:
- Columns: `factype`, `facsubgrp`
- Enforces the many-to-one constraint structurally (one row per `factype`, so each `factype` has exactly one `facsubgrp`)
- Do **not** merge this into `lookup_classification.csv` — keep it separate
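
The "one row per factype" property only holds if the CSV itself has no repeated `factype` values, and the table definition in this change declares no key to enforce that. A small check along the following lines could catch a repeat before the file is loaded. This is a sketch, not part of the pipeline: the function name `find_duplicate_factypes` and the conflicting sample row are hypothetical.

```python
import csv
import io
from collections import Counter


def find_duplicate_factypes(csv_text):
    """Return the sorted factype values that appear in more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["factype"] for row in reader)
    return sorted(factype for factype, n in counts.items() if n > 1)


# Rows copied from lookup_factype.csv.
sample_ok = """factype,facsubgrp
Academic Libraries,Academic and Special Libraries
Special Libraries,Academic and Special Libraries
Feeding Site,Child Nutrition
"""

# Hypothetical bad edit: a second row for an existing factype.
sample_bad = sample_ok + "Feeding Site,Museums\n"

print(find_duplicate_factypes(sample_ok))
print(find_duplicate_factypes(sample_bad))
```

A repeated `factype` matters even when both rows agree on `facsubgrp`, since the join from a pipeline to `lookup_factype` would then emit the same facility twice.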

## Rules

- Add `ELSE NULL` to any remaining CASE statements not covered by the CSVs, so unmapped values fail the `not_null` test rather than silently passing through.
- Do **not** add `factype` to `lookup_classification.csv`.
- All 55 pipelines listed below are in scope — not just the four fixed in the prior phase.
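
The `ELSE NULL` rule above can be illustrated with a short sketch, again assuming SQLite (through Python's `sqlite3`) in place of PostgreSQL. The `raw_source` table, its `category` column, and the sample values are hypothetical and do not correspond to a real pipeline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical source table; not one of the real pipelines.
    CREATE TABLE raw_source (category TEXT);
    INSERT INTO raw_source VALUES ('LIBRARY'), ('MUSEUM'), ('SOMETHING NEW');
    """
)

# Catch-all ELSE: the unmapped value silently receives a factype.
catch_all = """
    SELECT count(*) FROM (
        SELECT CASE
            WHEN category = 'LIBRARY' THEN 'Special Libraries'
            ELSE 'Public Museums And Sites'
        END AS factype
        FROM raw_source
    ) AS t
    WHERE factype IS NULL
"""

# Explicit ELSE NULL: the unmapped value becomes NULL, so a
# not-null check on factype can detect it.
explicit_null = """
    SELECT count(*) FROM (
        SELECT CASE
            WHEN category = 'LIBRARY' THEN 'Special Libraries'
            WHEN category = 'MUSEUM' THEN 'Public Museums And Sites'
            ELSE NULL
        END AS factype
        FROM raw_source
    ) AS t
    WHERE factype IS NULL
"""

nulls_with_catch_all = conn.execute(catch_all).fetchone()[0]
nulls_with_else_null = conn.execute(explicit_null).fetchone()[0]
print(nulls_with_catch_all, nulls_with_else_null)
```

With the catch-all branch, the count of NULL `factype` values stays at zero and the new category goes unnoticed; with `ELSE NULL`, the same row is counted and a not-null test has something to fail on.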

## Pipelines in Scope

All 55 pipeline files that assign `factype` and/or `facsubgrp`:

```
bpl_libraries.sql
dca_operatingbusinesses.sql
dcla_culturalinstitutions.sql
dcp_colp.sql
dcp_pops.sql
dcp_sfpsd.sql
dep_wwtc.sql
dfta_contracts.sql
doe_busroutesgarages.sql
doe_lcgms.sql
doe_universalprek.sql
dohmh_daycare.sql
dot_bridgehouses.sql
dot_ferryterminals.sql
dot_mannedfacilities.sql
dot_pedplazas.sql
dot_publicparking.sql
dpr_parksproperties.sql
dsny_donatenycdirectory.sql
dsny_electronicsdrop.sql
dsny_fooddrop.sql
dsny_garages.sql
dsny_leafdrop.sql
dsny_specialwastedrop.sql
dycd_service_sites.sql
fbop_corrections.sql
fdny_firehouses.sql
foodbankny_foodbanks.sql
hhc_hospitals.sql
hra_jobcenters.sql
hra_medicaid.sql
hra_snapcenters.sql
moeo_socialservicesitelocations.sql
nycdoc_corrections.sql
nycha_communitycenters.sql
nycha_policeservice.sql
nycourts_courts.sql
nypl_libraries.sql
nysdec_lands.sql
nysdec_solidwaste.sql
nysdoccs_corrections.sql
nysdoh_healthfacilities.sql
nysdoh_nursinghomes.sql
nysed_activeinstitutions.sql
nysoasas_programs.sql
nysomh_mentalhealth.sql
nysopwdd_providers.sql
nysparks_historicplaces.sql
nysparks_parks.sql
qpl_libraries.sql
sbs_workforce1.sql
uscourts_courts.sql
usdot_airports.sql
usdot_ports.sql
usnps_parks.sql
```

## Reference

See `facdb/data/lookup_classification.csv` and `facdb/sql/_create_facdb_classification.sql` for the existing pattern to follow when wiring up new CSVs.
35 changes: 35 additions & 0 deletions products/facilities/facdb/data/lookup_factype.csv
@@ -0,0 +1,35 @@
factype,facsubgrp
2 Year Independent,Colleges or Universities
2 Year Proprietary,Colleges or Universities
4-Year Independent,Colleges or Universities
4 Year Proprietary,Colleges or Universities
Academic Libraries,Academic and Special Libraries
Approved Private Schools For Swd,Public and Private Special Education Schools
Charter School,Charter K-12 Schools
CUNY - 4 Year College,Colleges or Universities
CUNY - Community College,Colleges or Universities
CUNY - Graduate Center,Colleges or Universities
Elementary School - Non-public,Non-Public K-12 Schools
Feeding Site,Child Nutrition
Ged-Alternative High School Equivalency Prep Programs,GED and Alternative High School Equivalency
Graduate Programs Only,Colleges or Universities
High School - Non-public,Non-Public K-12 Schools
Historical Societies,Historical Societies
Licensed Private Schools,Proprietary Schools
Middle School - Non-public,Non-Public K-12 Schools
Other School - Non-public,Non-Public K-12 Schools
Pre-School Evaluation Providers For Swd,Preschools for Students with Disabilities
Pre-School For Students With Disabilities,Preschools for Students with Disabilities
Private Museums And Sites,Museums
Proprietary Schools,Proprietary Schools
Public Museums And Sites,Museums
Regents Approved Foreign Colleges,Non-Public K-12 Schools
Registered Business Schools,Proprietary Schools
Registered Esl Schools,Proprietary Schools
Satellite Site For Students With Disabilities,Non-Public K-12 Schools
Special Libraries,Academic and Special Libraries
Summer Only Feeding Site,Child Nutrition
SUNY - Community Colleges,Colleges or Universities
SUNY - Health Science Centers,Colleges or Universities
SUNY - Specialized Colleges,Colleges or Universities
Usda Community Eligibility Option,Child Nutrition
36 changes: 36 additions & 0 deletions products/facilities/facdb/data/lookup_factype_source.csv
@@ -0,0 +1,36 @@
source_name,source_column,source_value,factype
nysed_activeinstitutions,classification_key,APPROVED PRE-SCHOOL PROGRAMS|PRE-SCHOOL FOR STUDENTS WITH DISABILITIES,Pre-School For Students With Disabilities
nysed_activeinstitutions,classification_key,APPROVED PRE-SCHOOL PROGRAMS|SATELLITE SITE FOR STUDENTS WITH DISABILITIES,Satellite Site For Students With Disabilities
nysed_activeinstitutions,classification_key,CHILD NUTRITION|FEEDING SITE,Feeding Site
nysed_activeinstitutions,classification_key,CHILD NUTRITION|SUMMER ONLY FEEDING SITE,Summer Only Feeding Site
nysed_activeinstitutions,classification_key,CHILD NUTRITION|USDA COMMUNITY ELIGIBILITY OPTION,Usda Community Eligibility Option
nysed_activeinstitutions,classification_key,CUNY|CUNY 4 YEAR COLLEGE,CUNY - 4 Year College
nysed_activeinstitutions,classification_key,CUNY|CUNY COMMUNITY COLLEGE,CUNY - Community College
nysed_activeinstitutions,classification_key,CUNY|CUNY GRADUATE CENTER,CUNY - Graduate Center
nysed_activeinstitutions,classification_key,LIBRARIES|ACADEMIC LIBRARIES,Academic Libraries
nysed_activeinstitutions,classification_key,LIBRARIES|SPECIAL LIBRARIES,Special Libraries
nysed_activeinstitutions,classification_key,MUSEUMS AND HISTORICAL SOCIETIES|HISTORICAL SOCIETIES,Historical Societies
nysed_activeinstitutions,classification_key,MUSEUMS AND HISTORICAL SOCIETIES|PRIVATE MUSEUMS AND SITES,Private Museums And Sites
nysed_activeinstitutions,classification_key,MUSEUMS AND HISTORICAL SOCIETIES|PUBLIC MUSEUMS AND SITES,Public Museums And Sites
nysed_activeinstitutions,classification_key,NON-PUBLIC SCHOOLS|ELEM,Elementary School - Non-public
nysed_activeinstitutions,classification_key,NON-PUBLIC SCHOOLS|MIDDLE,Middle School - Non-public
nysed_activeinstitutions,classification_key,NON-PUBLIC SCHOOLS|HIGH,High School - Non-public
nysed_activeinstitutions,classification_key,NON-PUBLIC SCHOOLS|OTHER,Other School - Non-public
nysed_activeinstitutions,classification_key,OTHER HE|REGENTS APPROVED FOREIGN COLLEGES,Regents Approved Foreign Colleges
nysed_activeinstitutions,classification_key,OTHER SCHOOLS SERVING STUDENTS WITH DISABILITIES|APPROVED PRIVATE SCHOOLS FOR SWD,Approved Private Schools For Swd
nysed_activeinstitutions,classification_key,OTHER SCHOOLS SERVING STUDENTS WITH DISABILITIES|PRE-SCHOOL EVALUATION PROVIDERS FOR SWD,Pre-School Evaluation Providers For Swd
nysed_activeinstitutions,classification_key,PROPRIETARY SCHOOLS|LICENSED PRIVATE SCHOOLS,Licensed Private Schools
nysed_activeinstitutions,classification_key,PROPRIETARY SCHOOLS|REGISTERED BUSINESS SCHOOLS,Registered Business Schools
nysed_activeinstitutions,classification_key,PROPRIETARY SCHOOLS|REGISTERED ESL SCHOOLS,Registered Esl Schools
nysed_activeinstitutions,classification_key,PUBLIC SCHOOLS|CHARTER SCHOOL,Charter School
nysed_activeinstitutions,classification_key,PUBLIC SCHOOLS|GED-ALTERNATIVE HIGH SCHOOL EQUIVALENCY PREP PROGRAMS(AHSEP),Ged-Alternative High School Equivalency Prep Programs
nysed_activeinstitutions,classification_key,PUBLIC SCHOOLS|SATELLITE SITE FOR CHARTER SCHOOLS,Charter School
nysed_activeinstitutions,classification_key,REGENTS APPROVED INDEPENDENT COLLEGES|2 YEAR INDEPENDENT,2 Year Independent
nysed_activeinstitutions,classification_key,REGENTS APPROVED INDEPENDENT COLLEGES|4-YEAR INDEPENDENT,4-Year Independent
nysed_activeinstitutions,classification_key,REGENTS APPROVED INDEPENDENT COLLEGES|GRADUATE PROGRAMS ONLY,Graduate Programs Only
nysed_activeinstitutions,classification_key,REGENTS APPROVED PROPRIETARY COLLEGES|2 YEAR PROPRIETARY,2 Year Proprietary
nysed_activeinstitutions,classification_key,REGENTS APPROVED PROPRIETARY COLLEGES|4 YEAR PROPRIETARY,4 Year Proprietary
nysed_activeinstitutions,classification_key,REGENTS APPROVED PROPRIETARY COLLEGES|GRADUATE PROGRAMS ONLY,Graduate Programs Only
nysed_activeinstitutions,classification_key,SUNY|SUNY COMMUNITY COLLEGES,SUNY - Community Colleges
nysed_activeinstitutions,classification_key,SUNY|SUNY HEALTH SCIENCE CENTERS,SUNY - Health Science Centers
nysed_activeinstitutions,classification_key,SUNY|SUNY SPECIALIZED COLLEGES,SUNY - Specialized Colleges
18 changes: 18 additions & 0 deletions products/facilities/facdb/sql/_create_reference_tables.sql
@@ -35,3 +35,21 @@ CREATE TABLE manual_corrections (
new_value TEXT
);
\COPY manual_corrections FROM 'facdb/data/manual_corrections.csv' DELIMITER ',' CSV HEADER;


DROP TABLE IF EXISTS lookup_factype_source;
CREATE TABLE lookup_factype_source (
source_name TEXT,
source_column TEXT,
source_value TEXT,
factype TEXT
);
\COPY lookup_factype_source FROM 'facdb/data/lookup_factype_source.csv' DELIMITER ',' CSV HEADER;


DROP TABLE IF EXISTS lookup_factype;
CREATE TABLE lookup_factype (
factype TEXT,
facsubgrp TEXT
);
\COPY lookup_factype FROM 'facdb/data/lookup_factype.csv' DELIMITER ',' CSV HEADER;
@@ -41,7 +41,28 @@ WITH merged AS (
             + g11::numeric
             + g12::numeric
             + ugs::numeric
-        ) AS enrollment
+        ) AS enrollment,
+        -- Pre-compute grade-band key for NON-PUBLIC SCHOOLS (factype depends on
+        -- enrollment counts, not just source category values).
+        -- All other institution types use inst_type_description|inst_sub_type_description.
+        CASE
+            WHEN inst_type_description = 'NON-PUBLIC SCHOOLS'
+                AND (
+                    prek::numeric + halfk::numeric + fullk::numeric
+                    + g01::numeric + g02::numeric + g03::numeric
+                    + g04::numeric + g05::numeric + uge::numeric
+                ) > 0
+                THEN 'NON-PUBLIC SCHOOLS|ELEM'
+            WHEN inst_type_description = 'NON-PUBLIC SCHOOLS'
+                AND (g06::numeric + g07::numeric + g08::numeric) > 0
+                THEN 'NON-PUBLIC SCHOOLS|MIDDLE'
+            WHEN inst_type_description = 'NON-PUBLIC SCHOOLS'
+                AND (g09::numeric + g10::numeric + g11::numeric + g12::numeric + ugs::numeric) > 0
+                THEN 'NON-PUBLIC SCHOOLS|HIGH'
+            WHEN inst_type_description = 'NON-PUBLIC SCHOOLS'
+                THEN 'NON-PUBLIC SCHOOLS|OTHER'
+            ELSE inst_type_description || '|' || inst_sub_type_description
+        END AS classification_key
     FROM nysed_activeinstitutions
     LEFT JOIN nysed_nonpublicenrollment
         ON nysed_activeinstitutions.sed_code::bigint = nysed_nonpublicenrollment.beds_code
@@ -85,66 +106,8 @@ SELECT
     NULL AS borocode,
     NULL AS bin,
     NULL AS bbl,
-    (
-        CASE
-            WHEN inst_sub_type_description LIKE '%CHARTER SCHOOL%' THEN 'Charter School'
-            WHEN inst_type_description = 'CUNY'
-                THEN concat(
-                    'CUNY - ',
-                    initcap(right(inst_sub_type_description, -5))
-                )
-            WHEN inst_type_description = 'SUNY'
-                THEN concat(
-                    'SUNY - ',
-                    initcap(right(inst_sub_type_description, -5))
-                )
-            WHEN
-                inst_type_description = 'NON-PUBLIC SCHOOLS'
-                AND prek + halfk + fullk + g01 + g02 + g03 + g04 + g05 + uge > 0
-                THEN 'Elementary School - Non-public'
-            WHEN
-                inst_type_description = 'NON-PUBLIC SCHOOLS'
-                AND g06 + g07 + g08 > 0 THEN 'Middle School - Non-public'
-            WHEN
-                inst_type_description = 'NON-PUBLIC SCHOOLS'
-                AND g09 + g10 + g11 + g12 + ugs > 0 THEN 'High School - Non-public'
-            WHEN
-                inst_type_description = 'NON-PUBLIC SCHOOLS'
-                AND inst_sub_type_description NOT LIKE 'ESL'
-                AND inst_sub_type_description NOT LIKE 'BUILDING' THEN 'Other School - Non-public'
-            WHEN inst_sub_type_description LIKE '%AHSEP%' THEN initcap(split_part(inst_sub_type_description, '(', 1))
-            ELSE initcap(inst_sub_type_description)
-        END
-    ) AS factype,
-    (
-        CASE
-            WHEN inst_sub_type_description LIKE '%GED-ALTERNATIVE%' THEN 'GED and Alternative High School Equivalency'
-            WHEN inst_sub_type_description LIKE '%CHARTER SCHOOL%' THEN 'Charter K-12 Schools'
-            WHEN inst_sub_type_description LIKE '%MUSEUM%' THEN 'Museums'
-            WHEN inst_sub_type_description LIKE '%HISTORICAL%' THEN 'Historical Societies'
-            WHEN inst_type_description LIKE '%LIBRARIES%' THEN 'Academic and Special Libraries'
-            WHEN inst_type_description LIKE '%CHILD NUTRITION%' THEN 'Child Nutrition'
-            WHEN
-                inst_sub_type_description LIKE '%PRE-SCHOOL%'
-                AND (
-                    inst_sub_type_description LIKE '%DISABILITIES%'
-                    OR inst_sub_type_description LIKE '%SWD%'
-                ) THEN 'Preschools for Students with Disabilities'
-            WHEN (inst_type_description LIKE '%DISABILITIES%') THEN 'Public and Private Special Education Schools'
-            WHEN inst_sub_type_description LIKE '%PRE-K%' THEN 'City Government Offices'
-            WHEN
-                (inst_type_description LIKE 'PUBLIC%')
-                OR (inst_sub_type_description LIKE 'PUBLIC%') THEN 'Public K-12 Schools'
-            WHEN
-                (inst_type_description LIKE '%COLLEGE%')
-                OR (inst_type_description LIKE '%CUNY%')
-                OR (inst_type_description LIKE '%SUNY%')
-                OR (inst_type_description LIKE '%SUNY%') THEN 'Colleges or Universities'
-            WHEN inst_type_description LIKE '%PROPRIETARY%' THEN 'Proprietary Schools'
-            WHEN inst_type_description LIKE '%NON-IMF%' THEN 'Public K-12 Schools'
-            ELSE 'Non-Public K-12 Schools'
-        END
-    ) AS facsubgrp,
+    lfs.factype,
+    lft.facsubgrp,
     (
         CASE
             WHEN inst_type_description = 'PUBLIC SCHOOLS' THEN 'NYC Department of Education'
@@ -173,6 +136,11 @@ SELECT
     NULL AS geo_bl,
     NULL AS geo_bn
 INTO _nysed_activeinstitutions
-FROM merged;
+FROM merged
+LEFT JOIN lookup_factype_source AS lfs
+    ON lfs.source_name = 'nysed_activeinstitutions'
+    AND lfs.source_value = merged.classification_key
+LEFT JOIN lookup_factype AS lft
+    ON lft.factype = lfs.factype;
 
 CALL append_to_facdb_base('_nysed_activeinstitutions');