Nationalizing the Houston New-Home Dashboard

Public Data Sources, Automation Paths, and Modeling Recipes

← Back to research index

Table of Contents

  1. Executive Summary
  2. Houston Presentation Metric Inventory
  3. Public and Government-Adjacent Source Catalog
  4. Metric-to-Source Mapping and Per-Metric Replication Difficulty
  5. Automated Ingestion Playbooks by Connector Family
  6. Houston-Specific Source Inference and Gap Analysis
  7. Modeling Methods
  8. National Rollout Plan Across Top-25 CBSAs
  9. Implementation Artifacts

Executive Summary

The Houston presentation (Land Tejas Private Consult dated Feb 12, 2026) is essentially a multi-layer housing "operating system": it mixes (a) macro demand drivers (population, jobs), (b) resale pressure/relief (MLS inventory & closings), and (c) new-home supply pipeline metrics that require construction-lifecycle visibility (starts → under construction → completed → sold/closed, plus lot pipeline: VDL and "future lots"). In Houston, the presenter's "CBAS" metrics strongly indicate a hybrid system: public sources for macro/permits + a field-research / community-level operational database for starts/closings/UC/FV/VDL/future lots (that level of granularity is not available nationally from federal statistics). This is consistent with CBAS's positioning around "primary and secondary granular housing data" and the presence of a "Director of Field Research Operations."

For a national rollout across "top-25 CBSAs" (unspecified—placeholders maintained), the most automatable national spine is:

To reproduce Houston-style "starts / UC / completed / finished vacant / VDL / future lots" at CBSA and submarket level, you have three scalable options (in increasing fidelity):

  1. Federal-only approximation (fast, lower fidelity): Use CBSA permits (BPS) + national/region SOC lags + HUD-USPS vacancy/no-stat (where available) to estimate UC stock and "finished vacant." This will be directionally useful but not "CBAS-equivalent" at community/submarket resolution.
  2. Permit-system lifecycle extraction (best public path): Build connectors to local permitting/inspection systems (Accela, Tyler, OpenGov, ArcGIS/Socrata open-data portals) and compute starts/UC/completions directly from inspection events and permit statuses. Accela documents offset/limit pagination and rate-limit headers; ArcGIS query pagination uses resultOffset/resultRecordCount and exceededTransferLimit.
  3. Recorder + assessor fusion (true closings; hardest): Pull deed transfers from county clerk/recorder systems (often account-gated and fee-based, as in Harris County) and combine with assessor parcels and HMDA to classify purchase type (cash vs financed) and new-construction flags.

A pragmatic national rollout is therefore a connector program: federal spine first, then add (a) the "local permitting lifecycle" connector family and (b) a smaller set of "recorder/assessor" patterns for top markets where automation is feasible.

Houston Presentation Metric Inventory

The table below itemizes every market metric / chart / table-type element present in the Houston deck (excluding purely administrative "member events/training" slides). Where a slide is clearly visual but text extraction does not reveal the data series (e.g., some image-only charts), the element is marked ambiguous and treated as a placeholder.

Full Metric Inventory Table (30+ rows)
Deck Location (Page) Metric / Visual Element Definition (As Shown or Inferred) Units Time Window / Cadence Viz Type
7, 12 Metro population by year; annual population change bars "Metro Houston Population by Year" and annual total change People (level); people/year (change) Annual (2015–2024 shown) Line + bar
8 Population growth rank table (top MSAs) MSA population as of 7/1/24; change 2023→2024 and % People; % Annual Table
9 County growth rank table County 2024 population and 2024 population growth People Annual Table
10–11 Components of population change Births−deaths, domestic migration, international migration; annual and cumulative since 2020 People Annual + cumulative Stacked bars + callouts
14–15 Employment growth trend (Houston vs peers) Rank table: total employment and annual job growth; plus Houston YoY job growth time series Jobs Monthly/YoY Table + line
16 Unemployment rate & payroll employment growth Unemployment rate "constant since early 2022" + nonfarm payroll employment level (2003–2025) %; jobs Monthly Line
18 Annual employment growth by sector Sector-level job growth counts (positive/negative) Jobs Annual Bar
19 Employment growth forecast Greater Houston Partnership annual job growth forecast Jobs Annual forecast Callout
21 MLS resale "stats" table Annual sales, monthly sales, avg price, median price, active listings, days on market, sale/list ratio Counts; $; %; days Dec 2025 snapshot + YoY Table
22 Resale closings & inventory time series Monthly closed listings vs active listings; commentary "peaked" Counts Monthly (Sep 2024–Nov 2025 shown) Dual-axis line/bar
25 New-home starts & closings history "Estimated Annual Starts" and "Estimated Annual Closings" (Houston MSA) + YoY deltas Homes Annual (through 3Q 2025 shown) Line/bar
26 New-home market by the numbers Quarterly starts; quarterly closings; homes under construction; finished vacant inventory; months supply Homes; months Quarterly (3Q 2025 shown) KPI tiles
27 Lot pipeline by the numbers Vacant developed lots (VDL); VDL months supply; future lots; future lots with active site work Lots; months Quarterly (3Q 2025 shown) KPI tiles
29 Starts & closings by market area Annual starts/closings split by market area Homes Annual (3Q 2025 roll) Bar
30 Starts & closings by lot size program Annual starts/closings by lot width bucket (<34, 35–39, …, 90+) Homes Annual Bar
31–32 High-volume subdivision counts Count of subdivisions above a start/closing threshold Subdivision count Annual KPI statement
34–36 Finished vacant inventory across the market Finished vacant homes in inventory and months supply by market area; table of "submarkets with high FV"; FV by lot size Homes; months Quarterly / annualized Bar + table
39 Median new home price over time Median new home price and median price per SF $; $/SF Monthly/quarterly through Dec 2025 Line
40 Floorplan quarter-over-quarter price direction Counts/percent of plans with decreases / no-change / increases; median change sizes Floorplans; %; $ Quarterly KPI + distribution summary
41–42 Starts/closings and FV by base price band Annual starts/closings by base price buckets; finished vacant by base price buckets Homes Annual Bar
43–46 Extremes of base-priced floorplans Cheapest and most expensive base-priced floorplans (plan, community, beds/baths, SF, base price) $; SF Point-in-time Tables
48–50 Top communities & neighborhoods Ranked lists of communities by annual starts/closings Homes Annual Tables
53–55 High VDL submarkets/subdivisions; VDL by lot size Submarkets with VDL > threshold and high MOS; count of subdivisions with ≥100 VDL; VDL inventory & MOS by lot size Lots; months Quarterly Table + KPI + bar
56–57 Future lots by market area and "status" Future lots split by market area; split by condition (vacant, clearing, WS&D, paving) Lots Quarterly Bar
58 Future planned subdivision locations Count of planned subdivisions across market Subdivision count Point-in-time Map + KPI
61–67 288 South Corridor package "By the numbers" KPIs; activity table by quarter (closings/models/completed vacant/UC/etc); top communities/builders; starts/closings by lot size and base price; VDL & future planned lots Mix Quarterly & annual KPI + table + bars
69–74 Other Land Tejas submarket packages Similar "submarket snapshot" pages (Grand Magnolia, River Ranch, Lago Mar East) Mix Recent quarter KPI + map/table
76 Mortgage rate forecast commentary "Big mortgage rate declines not expected soon" % Forecast (source-cited) Text + chart (ambiguous)
78 Consumer confidence commentary "Consumer Confidence remains a challenge" Index Current Text + chart (ambiguous)
80 New-home starts forecast range Houston annual starts forecast ranges for 2025 and 2026; change vs prior year Homes; % Annual forecast KPI
81 Permits vs starts; long-term average Annual single-family permits vs estimated annual starts; long-term average Homes Annual Line/bar

Public and Government-Adjacent Source Catalog with Endpoints and Automation Assessment

This section is an automatable catalog oriented around what can be pulled nationally and how it maps to Houston-style KPIs.

Federal "Backbone" Sources (High Coverage, High Stability)

Federal Backbone Sources Table
Source What It Covers Direct Access Method Key Endpoints Formats Cadence Geography Automation Difficulty (1–5)
U.S. Census Bureau Building Permits Survey (BPS) Permits issued (units, buildings, valuation) Bulk downloads + ASCII files Docs: census.gov/permits; CBSA ASCII doc: cbsaasc.pdf; CBSA Excel: msamonthly.html XLSX; TXT (CSV-delimited) Monthly + annual CBSA, county, place (permit office) 2 (bulk is easy; semantics take work)
U.S. Census Bureau Survey of Construction (SOC): New Residential Construction (NRC) Starts, authorized-not-started, under construction, completions (national/region) Bulk tables + press releases Landing: census.gov/construction/nrc/; tables: starts_cust.xlsx and comps_cust.xlsx XLSX + HTML Monthly + quarterly supplements National + Census region 1 (very easy)
U.S. Census Bureau SOC: New Residential Sales (NRS) New-home sales, for-sale inventory, months supply; stage-of-construction inventory Bulk tables + press releases Landing: census.gov/construction/nrs/; series listing: series.html XLSX + HTML + PDFs Monthly + quarterly supplements National + Census region 1
Federal Reserve Bank of St. Louis FRED API Programmatic access to many housing series (including SOC-derived series) REST API Docs: series_observations.html JSON/XML/CSV/XLSX Varies (often monthly) National; some regional 1–2
Federal Housing Finance Agency House Price Index HPI by metro/state/national; repeat-sales index Direct CSV download hpi_master.csv CSV Monthly + quarterly releases National, state, MSA/city 1
FFIEC / CFPB HMDA Public Data Mortgage originations (purchase/refi), lender, some geography (MSA/MD) Data browser downloads + "dataset filtering" URLs; bulk national datasets HMDA overview: ffiec.gov/data/hmda CSV downloads; bulk files (yearly) Annual release cycles National; MSA/MD; tract (in some products) 3 (large files + privacy-modified fields)
HUD-USPS Vacancy & No-Stat (restricted) Vacancy and "no-stat" counts (includes under-construction/no-mail cases) for occupancy/vacancy proxies Login-gated download (gov/nonprofit only) Dataset page: huduser.gov USPS datasets; FAQ: FAQ.pdf CSV (varies), ZIP+4 aggregates (derived) Quarterly Tract/ZIP+4 aggregates (varies) 5 (access restriction + licensing)
U.S. Bureau of Labor Statistics employment/unemployment APIs Job growth & unemployment used in the deck Public API v2 API features/limits via BLS documentation JSON Monthly National/metro (dataset-dependent) 2

Critical constraint: SOC (starts/UC/completions) and NRS (new-home inventory stages) are not published at CBSA in the standard public tables—only national and region are guaranteed. So for "top-25 CBSAs," permits are the only guaranteed federal CBSA-level production indicator from Census.

Federal "Distribution Layers" (ArcGIS/Open Data Shells)

HUD republishes some Census BPS aggregations through ArcGIS Hub, which can be an easier ingestion surface when you want GIS-ready joins. Example: "Residential Construction Permits by County" on HUD's open data hub. This is particularly useful when your internal geography is county-based and you need quick joins to CBSA via a county→CBSA delineation file.

CBSA Mapping Primitives You Will Need Everywhere

To nationalize cleanly, you need a single authoritative CBSA crosswalk. The U.S. Census Bureau delineation files provide county membership for each CBSA; the "List1" Excel is a canonical artifact. For HMDA, the Federal Financial Institutions Examination Council also publishes tract lists and crosswalks keyed to MSA/MD for compliance geocoding, which can help reconcile annual boundary changes.

Metric-to-Source Mapping and Per-Metric Replication Difficulty

The Houston deck's metrics fall into three replicability classes:

Core Houston KPIs and Best National Source Candidates

Houston KPI (Deck Phrasing) Best Public Source(s) Local/Vendor Source(s) Often Needed Key Transform Steps National Replication Difficulty (1–5)
Permits issued (implied in permits vs starts) BPS CBSA/county/place files CBSA code standardization; single-family vs total units; revision handling 2
Starts (by quarter / annual) No direct CBSA federal series; use SOC national/regional to calibrate models Permitting/inspection systems (Accela/Tyler/OpenGov etc) to detect "first inspection/foundation" Define "start event"; permit→start lag model by jurisdiction; aggregate to CBSA 4–5
Closings / sales (new homes) HMDA purchase loans (financed closings proxy) County recorder deeds to capture cash and true transfers; MLS new construction (proprietary) Filter purchase loans; dedupe; geography to CBSA; fuse recorder for cash 4–5
Homes under construction (stock) SOC national/region defines UC concept Local inspection status "in progress" is best; or USPS no-stat + permits model Build stock-flow: started-not-finaled; reconcile to permit counts 4
Completions (by quarter) SOC national/region only Local CO/final inspection events; assessor "year built"; utility connects Choose completion definition; handle partial finals; aggregate 4
Finished vacant new-home inventory NRS stage-of-construction inventory is national/region only Local: CO but not sold (recorder/HMDA), plus listing status; USPS vacancy/no-stat aids Define "finished" (CO/final); define "vacant" (unsold/unoccupied) 5
VDL (vacant developed lots) & months supply No federal equivalent Planning/plat, developer filings; GIS parcel+subdivision; proprietary "lot status" databases Normalize lot lifecycle steps; compute supply vs starts/absorption 5
Future lots / active site work / future planned subdivisions No direct; partial via plats or subdivision approvals some cities publish City planning reports; ePortal "plat" approvals; ArcGIS layers; proprietary land pipeline Create "future lot" entity; detect site work via permits/inspections/engineering permits 5
Resale MLS metrics Not federal; "government-adjacent" only if Realtor association publishes Houston Association of Realtors publishes monthly market updates; other markets vary Standardize MLS metric definitions; handle membership/API access constraints 4–5

Automated Ingestion Playbooks by Connector Family

You asked for per-metric step-by-step; the most scalable way to implement that is connector-first. Once each connector is in place, metrics become SQL transforms.

BPS (Permits) Connector

Why It Matters

BPS is your only guaranteed CBSA-level federal production indicator; CBSA ASCII files explicitly include buildings/units/valuation and provide file naming conventions.

Endpoints & Layouts

BPS Ingestion Steps
  1. Discover new files by monthly survey month (YYYYMM). Use a "manifest" table keyed by dataset + survey month.
  2. Download CBSA monthly current-month file and YTD file; the CBSA ASCII doc defines names like CBSA<YYMM>C.TXT and CBSA<YYMM>Y.TXT.
  3. Parse CSV-delimited ASCII; store both "with imputation" and "reported only" columns (CBSA ASCII includes both sets).
  4. Handle revisions: the SOC/BPS release process includes revised permits releases and periodic methodological changes; treat each month as mutable and re-load an N-month "revision window."
  5. Normalize geography: store 5-digit CBSA and optional CSA; join to a fixed CBSA dim built from Census delineation files.
  6. Quality checks: negative deltas should be explainable via revisions; track vintage.

Rate limits / failure modes: Bulk downloads are stable; primary risks are schema updates and file format shifts (Census has changed SOC tables from .xls to .xlsx historically).

SOC/NRC/NRS Connector (National Calibration + Stage-of-Construction Inventory)

Why It Matters

SOC describes starts/UC/completions and notes estimates are adjusted using authorized units (BPS) and account for late reports and starts/sales before permits.

SOC Ingestion Steps
  1. Download NRC/NRS XLSX tables each release cycle from the series pages.
  2. Track quarterly "supplement tables" (SOC schedule lists quarterly and revision items).
  3. Use these series strictly for:
    • National/region benchmarking,
    • Lag calibration (permit→start; completion timing),
    • Definitional alignment (what counts as "under construction," "completed," "for sale," etc.).

HMDA Connector (Financed Closings Proxy)

Why It Matters

HMDA is the most scalable public source for purchase-loan activity and lender detail, administered through FFIEC/CFPB infrastructure and released as national datasets and data browser tools.

HMDA Ingestion Steps
  1. Decide your product: "snapshot" national loan-level dataset vs filtered downloads (tradeoff between size and convenience).
  2. Prefer annual bulk ingestion: load the whole year, then create a curated analytic table with only:
    • purchase originations (exclude refi),
    • owner-occupied vs investor (where available),
    • loan amount, action taken, property type, lien status,
    • MSA/MD codes (for CBSA mapping).
  3. Cash proxy / gap: HMDA misses cash purchases; you need recorder data or modeling (see gap section).
  4. Governance: HMDA public data are modified for privacy; don't expect perfect address-level matching.

HUD-USPS Vacancy/No-Stat Connector (Restricted, High Value for "Finished Vacant")

Why It Matters

HUD's USPS data represent the universe of addresses updated quarterly, but are only accessible to government entities and registered nonprofits under a sublicense. The USPS "no-stat" bucket explicitly includes "under construction and not yet occupied" in addition to abandoned/inactive cases, making it a powerful—but tricky—signal for new construction pipeline and "finished vacant" inference.

HUD-USPS Ingestion Steps (If Eligible)
  1. Complete HUD user registration and obtain access (organizational prerequisite).
  2. Pull quarterly tract-level and/or ZIP+4 aggregates for:
    • total addresses,
    • vacant counts by duration buckets,
    • no-stat counts by duration buckets. (Field definitions and caveats are in HUD FAQs.)
  3. Use HUD-USPS ZIP crosswalk files to allocate to CBSA when you ingest at ZIP level; HUD documents the ratio fields and posting timelines.
  4. Model separation of "growth no-stat" vs "distress no-stat" using address base changes; HUD notes interpretive issues and recommends local knowledge.

Automation risk: Access restrictions and legal terms (difficulty 5) dominate, not engineering.

ArcGIS Hub / ArcGIS REST Connector (Open-Data Permitting and Parcel Layers)

Many cities/counties publish permitting, inspections, parcels, subdivision plats, etc. through ArcGIS Hub or ArcGIS Server feature services.

Core REST Patterns

Query endpoint: .../FeatureServer/<layerId>/query with where, outFields, and pagination via resultOffset / resultRecordCount.

Always check exceededTransferLimit to know whether to continue paging.

ArcGIS Ingestion Steps
  1. Discover the dataset and identify layers/tables.
  2. Pull schema (f=pjson) and detect system field names (OBJECTID, create/edit timestamps).
  3. Incremental loads: query on edit_date/last_edited_date if present; otherwise do daily diff by OBJECTID ranges.
  4. Pagination: page until exceededTransferLimit=false (do not rely on count alone).
  5. Normalize to your permit schema (see schema section).

Socrata (SODA) Connector (Common for City Open Data)

Socrata APIs support app tokens and throttling improvements; default result size is commonly 1,000 rows unless you use $limit/$offset.

Socrata Ingestion Steps
  1. Identify a dataset's domain and resource ID: https://{domain}/resource/{id}.json.
  2. Use $select to limit fields, $where for incremental filters, $order for deterministic paging, $limit/$offset for paging.
  3. Provide X-App-Token to increase practical rate limits.
  4. Schema drift: Socrata can omit type headers; don't depend on X-SODA2-* headers for critical typing.

Permitting Platforms: Accela, Tyler Enterprise Permitting & Licensing, OpenGov

These systems are ubiquitous in large metros, but public access varies dramatically.

Automation reality: These are "government-adjacent" but often not open. For top-25 CBSAs you will likely need a strategy of:

Houston-Specific Source Inference and Gap Analysis

What Houston Elements Are Clearly Public vs. Likely CBAS Proprietary

Clear Public Signals Used in the Deck

Strong Evidence of Proprietary / Field Research Components

Gap Analysis Table

Metric Family Reproducible from Federal Public Sources Alone? Requires Local Open Data / Permitting Portals? Likely Needs Proprietary Inputs? Practical Substitute
Permits (CBSA) Yes (BPS) No No Use BPS as baseline + place/county splits
Starts / UC / completions (CBSA) No (SOC is national/region) Yes (inspections/CO events) Sometimes (if portals closed) Permit-lag model calibrated to SOC + partial local lifecycle
New-home closings (CBSA) No Sometimes (CO + deed/recording) Often (MLS/new-home sales feeds; builder data) HMDA purchase loans + recorder for cash + builder roster modeling
Finished vacant new-home inventory No Yes (CO + unsold; occupancy signals) Often USPS vacancy/no-stat (if eligible) + "CO minus sale" model
VDL / future lots pipeline No Sometimes (plats/engineering permits, GIS) Often Parcel subdivision + plat approval layers + heuristic site-work detection

Modeling Methods to Estimate Metro-Level Starts/UC/Completions and Finished-Vacant Inventory

These are practical approaches when you cannot get direct lifecycle events for every jurisdiction.

Permit-to-Start and Permit-to-Completion Stock-Flow Model (CBSA Level)

Inputs

Model Sketch

St = Σk wk · Pt−k

where wk is a lag distribution (k months from permit to start).

Ct = Σj vj · St−j

where vj is a start-to-completion distribution.

UCt = UCt−1 + StCt

Calibration

Calibrate wk and vj to match regional SOC starts/completions and UC stock dynamics, but allow CBSA-specific adjustment using:

Limitations

Cannot reproduce community/submarket counts like CBAS; it's a CBSA-level approximation.

Finished-Vacant "CO Minus Sale" Model

Goal: estimate completed but unsold/unoccupied new homes.

Best-Available Public Approximation

If you can get completion events (CO/final inspection) from permitting portals (local), and sale/closing events from recorder or HMDA, then:

FinishedVacantStock ≈ CompletedNotSoldNotOccupied

Practical Proxy Layers

Cash Closings Estimation (HMDA + Recorder Fusion)

Approach

  1. Use recorder deeds as "ground truth closings count" where automatable.
  2. Match or compare to HMDA purchase-loan counts by MSA/MD.
  3. Estimate cash share as:
Cash Share = 1 − (HMDAfinanced / Totaldeeds)

by time and price bucket.

National Rollout Plan Across Top-25 CBSAs with Connectors, Effort, and "Data Availability Cards"

Placeholder Top-25 CBSA List

Top-25 is unspecified. A reproducible way to define "top 25 new-home markets" is to rank CBSAs by single-family permits (BPS CBSA annual or rolling 12-month). Use placeholders until you choose the definition (population, permits, starts estimates, revenue).

Placeholder Rank CBSA Code CBSA Name Notes
1–25 (TBD) (TBD) Define top-25 criterion (population vs permits vs strategic markets)

CBSA Data-Availability Card Template (Prefilled with Houston Examples)

CBSA: Houston-area (example)

Key questions: permits, starts, UC, completions, finished vacant, closings, VDL/future lots.

Federal Baseline

Local Permitting / Lifecycle

Recorder / Assessor

Difficulty Score Per Metric

Prioritized Connector Backlog with Weeks-to-Build

Connector Scope Typical Blockers Estimated Engineering Effort
BPS (CBSA/county/place) Parse CBSA ASCII + Excel; revision windows; CBSA joins Minimal 1–2 weeks
SOC NRC/NRS ingestion Pull XLSX tables; maintain release calendar Minimal <1 week
FRED API Standard series pulls; caching; retries API key management <1 week
FHFA HPI Ingest hpi_master.csv; pick flavors/levels Minimal <1 week
HMDA bulk Annual bulk load; curated analytics; MSA/MD joins Scale + privacy-modified fields 2–4 weeks
ArcGIS REST generic FeatureServer/MapServer query + paging; schema inference maxRecordCount/pagination complexities 2–4 weeks
Socrata SODA generic $where/$limit/$offset; tokens Dataset-specific schema drift 1–3 weeks
Accela Construct API Records/permits/inspections extraction Agency auth + permissions; rate limiting 4–8 weeks
Tyler Enterprise Permitting & Licensing Permits/inspections via API toolkit Contractual access; per-version differences 6–10 weeks
OpenGov Permitting & Licensing API Permits/licensing workflows API key approval + scope definitions 3–6 weeks
Recorder ingestion patterns Deeds/transfers, document metadata Paywalls, CAPTCHA, fees, legal constraints 6–12+ weeks per market

Rollout Prioritization Heuristic for CBSAs

Because lifecycle and recorder access dominate difficulty, prioritize CBSAs in this order:

  1. CBSAs where major jurisdictions publish permits/inspections as ArcGIS or Socrata open data (fastest path to starts/UC/completions using lifecycle events).
  2. CBSAs where permitting platform APIs are available with manageable access (OpenGov / Accela / Tyler) and jurisdictions are willing to issue credentials.
  3. CBSAs where recorder data is programmatically accessible or sold in bulk (otherwise you'll be stuck with HMDA-only closings proxies).

Implementation Artifacts: Normalization Schema, ETL Diagrams, and SQL DDL + Sample Queries

Normalization Schema Recommendation (Minimum Viable)

You want to support:

Suggested core tables:

Mermaid: Data Flow / ETL

flowchart LR
  A[BPS CBSA/county/place files] --> S[(Raw landing zone)]
  B[SOC NRC/NRS tables] --> S
  C[HMDA bulk/yearly] --> S
  D[HUD-USPS vacancy/no-stat (restricted)] --> S
  E[Local open data: ArcGIS/Socrata] --> S
  F[Permitting platforms: Accela/Tyler/OpenGov (credentialed)] --> S
  G[Recorder/Assessor portals] --> S

  S --> N[Normalize & map geographies (county->CBSA)]
  N --> W[(Warehouse: permit/lifecycle/sales)]
  W --> M[Metric transforms: Q starts, Q closings, UC stock, finished vacant, VDL]
  M --> DSH[Dashboard/API layer]

Mermaid: Release-Timeline Gantt (Illustrative)

gantt
  title Housing data release/refresh cadence (typical)
  dateFormat  YYYY-MM-DD
  axisFormat  %b %Y

  section Monthly
  BPS permits (monthly)            :active, 2025-01-01, 2026-12-31
  NRC starts/UC/completions        :active, 2025-01-01, 2026-12-31
  NRS sales/inventory              :active, 2025-01-01, 2026-12-31
  FHFA HPI (monthly file)          :active, 2025-01-01, 2026-12-31

  section Quarterly
  HUD-USPS vacancy/no-stat (Q)     :active, 2025-01-01, 2026-12-31

  section Annual
  HMDA public release (annual)     :active, 2025-01-01, 2026-12-31

Notes: SOC release schedule details (monthly press releases, quarterly tables, seasonal revisions) are documented by Census. HUD's dataset schedule calls out USPS vacancies posting timelines and a "20th business day" cadence for HUD SOCDS permits (where used).

SQL DDL: Permits, Lifecycle, and Closings Tables (Houston-Like)

SQL DDL — Full Table Definitions (5 tables)
-- 1) Permit aggregates (BPS-like, monthly, CBSA/jurisdiction)
CREATE TABLE fact_permit_monthly (
  source_system         TEXT NOT NULL,          -- e.g., 'census_bps'
  geo_level             TEXT NOT NULL,          -- 'cbsa' | 'county' | 'place'
  geo_id                TEXT NOT NULL,          -- cbsa_code, county_geoid, or place_id
  survey_month          DATE NOT NULL,          -- normalized to first day of month
  units_1u              INTEGER,                -- if available; otherwise total_units
  units_2u              INTEGER,
  units_3_4u            INTEGER,
  units_5pu             INTEGER,
  units_total           INTEGER NOT NULL,
  valuation_kusd_total  NUMERIC,                -- valuation in $000 if from BPS ASCII
  is_imputed_series     BOOLEAN NOT NULL,       -- true = "with imputation", false = "reported only"
  loaded_at             TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (source_system, geo_level, geo_id, survey_month, is_imputed_series)
);

-- 2) Permit record-level (local portals / vendor APIs)
CREATE TABLE fact_permit_record (
  source_system        TEXT NOT NULL,           -- 'accela', 'arcgis', 'socrata', 'houston_portal', etc.
  jurisdiction_id      TEXT NOT NULL,
  permit_id            TEXT NOT NULL,
  permit_type          TEXT,
  work_type            TEXT,                    -- new construction, addition, etc.
  structure_type       TEXT,                    -- SF, MF, etc.
  units                INTEGER,
  valuation_usd        NUMERIC,
  status               TEXT,                    -- issued, finaled, expired, etc.
  applied_date         DATE,
  issued_date          DATE,
  final_date           DATE,                    -- final inspection or CO date if present
  address              TEXT,
  latitude             NUMERIC,
  longitude            NUMERIC,
  raw_payload          JSONB,                   -- keep original for schema drift
  loaded_at            TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (source_system, jurisdiction_id, permit_id)
);

-- 3) Inspection / milestone events (start/completion inference)
CREATE TABLE fact_inspection_event (
  source_system        TEXT NOT NULL,
  jurisdiction_id      TEXT NOT NULL,
  permit_id            TEXT NOT NULL,
  inspection_id        TEXT NOT NULL,
  inspection_code      TEXT,                    -- e.g., 'FOUNDATION', 'FRAMING', 'FINAL'
  inspection_result    TEXT,                    -- pass/fail/cancel
  scheduled_date       TIMESTAMP,
  completed_date       TIMESTAMP,
  raw_payload          JSONB,
  loaded_at            TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (source_system, jurisdiction_id, inspection_id)
);

-- 4) Closings / transfers (recorder)
CREATE TABLE fact_sale_transfer (
  source_system        TEXT NOT NULL,           -- 'county_recorder'
  county_geoid         TEXT NOT NULL,
  document_number      TEXT NOT NULL,
  recording_date       DATE NOT NULL,
  sale_date            DATE,
  grantor              TEXT,
  grantee              TEXT,
  sale_price_usd        NUMERIC,
  address              TEXT,
  parcel_id            TEXT,
  deed_type            TEXT,
  raw_payload          JSONB,
  loaded_at            TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (source_system, county_geoid, document_number)
);

-- 5) HMDA curated (financed purchase proxy)
CREATE TABLE fact_mortgage_origination (
  source_system        TEXT NOT NULL,           -- 'hmda'
  year                 INTEGER NOT NULL,
  msa_md_code          TEXT,                    -- 5-digit
  action_taken         TEXT,
  loan_purpose         TEXT,
  loan_amount          NUMERIC,
  occupancy_type       TEXT,
  property_type        TEXT,
  lien_status          TEXT,
  purchaser_type       TEXT,
  is_purchase          BOOLEAN NOT NULL,
  is_originated        BOOLEAN NOT NULL,
  loaded_at            TIMESTAMP NOT NULL DEFAULT NOW(),
  PRIMARY KEY (source_system, year, msa_md_code, action_taken, loan_purpose,
               loan_amount, occupancy_type, property_type, lien_status, purchaser_type)
);

Sample SQL Queries: Q4 Starts, Completions, UC Stock, Finished Vacant Inventory

Assume:

Query A: Q4 Starts (count permits with a start-event in the quarter)
-- Parameters
-- :cbsa_code, :q_start_date, :q_end_date are inputs
-- Example q_start_date='2025-10-01', q_end_date='2025-12-31'

-- A) Q4 Starts (count permits with a start-event in the quarter)
SELECT
  :cbsa_code AS cbsa_code,
  COUNT(DISTINCT pr.permit_id) AS q_starts
FROM fact_permit_record pr
JOIN fact_inspection_event ie
  ON pr.source_system = ie.source_system
 AND pr.jurisdiction_id = ie.jurisdiction_id
 AND pr.permit_id = ie.permit_id
WHERE pr.structure_type = 'SF'
  AND ie.inspection_code IN ('FOUNDATION', 'SLAB', 'PIER_BEAM')   -- customize per jurisdiction
  AND ie.completed_date::date BETWEEN :q_start_date AND :q_end_date;
Query B: Q4 Completions (count permits with final/CO event in quarter)
-- B) Q4 Completions (count permits with final/CO event in quarter)
SELECT
  :cbsa_code AS cbsa_code,
  COUNT(DISTINCT pr.permit_id) AS q_completions
FROM fact_permit_record pr
JOIN fact_inspection_event ie
  ON pr.source_system = ie.source_system
 AND pr.jurisdiction_id = ie.jurisdiction_id
 AND pr.permit_id = ie.permit_id
WHERE pr.structure_type = 'SF'
  AND ie.inspection_code IN ('FINAL', 'CERTIFICATE_OF_OCCUPANCY', 'CO') -- customize
  AND ie.completed_date::date BETWEEN :q_start_date AND :q_end_date;
Query C: Under-Construction Stock at Quarter End
-- C) Under-Construction Stock at Quarter End
-- UC = started on/before q_end and NOT completed by q_end
WITH starts AS (
  SELECT
    pr.source_system, pr.jurisdiction_id, pr.permit_id,
    MIN(ie.completed_date)::date AS start_date
  FROM fact_permit_record pr
  JOIN fact_inspection_event ie
    ON pr.source_system = ie.source_system
   AND pr.jurisdiction_id = ie.jurisdiction_id
   AND pr.permit_id = ie.permit_id
  WHERE pr.structure_type='SF'
    AND ie.inspection_code IN ('FOUNDATION','SLAB','PIER_BEAM')
  GROUP BY 1,2,3
),
completions AS (
  SELECT
    pr.source_system, pr.jurisdiction_id, pr.permit_id,
    MIN(ie.completed_date)::date AS completion_date
  FROM fact_permit_record pr
  JOIN fact_inspection_event ie
    ON pr.source_system = ie.source_system
   AND pr.jurisdiction_id = ie.jurisdiction_id
   AND pr.permit_id = ie.permit_id
  WHERE pr.structure_type='SF'
    AND ie.inspection_code IN ('FINAL','CERTIFICATE_OF_OCCUPANCY','CO')
  GROUP BY 1,2,3
)
SELECT
  :cbsa_code AS cbsa_code,
  COUNT(*) AS uc_stock_end_q
FROM starts s
LEFT JOIN completions c
  ON s.source_system=c.source_system
 AND s.jurisdiction_id=c.jurisdiction_id
 AND s.permit_id=c.permit_id
WHERE s.start_date <= :q_end_date
  AND (c.completion_date IS NULL OR c.completion_date > :q_end_date);
Query D: Finished Vacant Inventory at Quarter End
-- D) Finished Vacant Inventory at Quarter End (completed but not sold/closed by end date)
-- Requires at least one sales signal (recorder or HMDA proxy).
WITH completed_units AS (
  SELECT
    pr.source_system, pr.jurisdiction_id, pr.permit_id,
    MIN(ie.completed_date)::date AS completion_date,
    pr.address
  FROM fact_permit_record pr
  JOIN fact_inspection_event ie
    ON pr.source_system = ie.source_system
   AND pr.jurisdiction_id = ie.jurisdiction_id
   AND pr.permit_id = ie.permit_id
  WHERE pr.structure_type='SF'
    AND ie.inspection_code IN ('FINAL','CERTIFICATE_OF_OCCUPANCY','CO')
  GROUP BY 1,2,3,5
),
sales AS (
  SELECT
    st.address,
    MIN(st.recording_date) AS first_recording_date
  FROM fact_sale_transfer st
  GROUP BY 1
)
SELECT
  :cbsa_code AS cbsa_code,
  COUNT(*) AS finished_vacant_end_q
FROM completed_units cu
LEFT JOIN sales s
  ON cu.address = s.address
WHERE cu.completion_date <= :q_end_date
  AND (s.first_recording_date IS NULL OR s.first_recording_date > :q_end_date);

Important: Address matching for sales is the hardest part in practice (standardization, unit numbers, parcels). This is why recorder/assessor integration tends to be the long pole.


← Back to research index