What is RealHouse?

RealHouse is a national housing-market data platform that ingests federal and local public data sources to produce CBSA-level homebuilding metrics — permits, starts, under construction, completions, finished vacant inventory, and closings. The MVP targets the Houston-The Woodlands-Sugar Land CBSA (26420) to validate the approach before expanding nationally.

System Overview

Federal Data Sources: BPS, NRC, HPI, FRED
  → Python Ingest CLI: Connectors + Models
  → Supabase (Postgres): Warehouse Schema
  → Next.js Dashboard: KPIs + Charts

What questions can RealHouse answer?

  1. How many building permits were issued in Q4 2025?
  2. What is the estimated number of starts and completions this quarter?
  3. How many homes are estimated under construction?
  4. What is the estimated finished vacant inventory and months' supply?
  5. How is employment trending? House prices?
  6. How do Houston permits compare to the national trend?

Deep Dives

Two-Tier Data Architecture

RealHouse uses a two-tier data architecture. The federal backbone provides uniform, automatable national coverage through standardized Census and agency data products. The local systems layer provides higher-fidelity lifecycle data but is fragmented across jurisdictions, requiring per-metro connector work.

  • Federal Backbone: BPS Permits, NRC/NRS Pipeline, FHFA HPI, FRED, BLS, Census PEP, HMDA
  • Local Systems: Permit Portals, Inspections, CO Events, Recorder/Assessor, ArcGIS/Socrata

The federal backbone is consistent across geographies but lacks metro-level starts/UC/completions (NRC/NRS are national/regional only). The local systems layer can unlock these metrics through inspection events and certificates of occupancy, but varies by jurisdiction.

See all data sources →

The Pipeline Model

Because NRC publishes starts, UC, and completions only at the national and regional levels, we use a distributed lag model to estimate metro-level pipeline metrics from BPS permits.

$$S_t = \sum_k w_k \, P_{t-k}$$

Starts from permits via lag weights

$$C_t = \sum_j v_j \, S_{t-j}$$

Completions from starts via completion weights

$$\mathrm{UC}_t = \mathrm{UC}_{t-1} + S_t - C_t$$

Under-construction stock-flow identity

$$\mathrm{FV}_t = \mathrm{FV}_{t-1} + C_t - \mathrm{closings}_t$$

Finished vacant inventory
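
The four equations above can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the function names (`distributed_lag`, `run_pipeline`), the weights, and the input series are all hypothetical, and lags that reach before the first observation are simply skipped, which is one way the warm-up underestimate can arise.

```python
from typing import Sequence


def distributed_lag(series: Sequence[float], weights: Sequence[float]) -> list[float]:
    """Return out[t] = sum_k weights[k] * series[t - k], skipping lags before t = 0."""
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, w in enumerate(weights):
            if t - k >= 0:  # no history exists before the first observation
                total += w * series[t - k]
        out.append(total)
    return out


def run_pipeline(permits, closings, w, v, uc0=0.0, fv0=0.0):
    """Apply the lag equations and the two stock-flow identities period by period."""
    starts = distributed_lag(permits, w)        # S_t from permits
    completions = distributed_lag(starts, v)    # C_t from starts
    uc, fv = [], []
    uc_prev, fv_prev = uc0, fv0
    for s, c, sold in zip(starts, completions, closings):
        uc_prev = uc_prev + s - c               # UC_t = UC_{t-1} + S_t - C_t
        fv_prev = fv_prev + c - sold            # FV_t = FV_{t-1} + C_t - closings_t
        uc.append(uc_prev)
        fv.append(fv_prev)
    return {"starts": starts, "completions": completions, "uc": uc, "fv": fv}


# Illustrative inputs only (not real Houston data).
result = run_pipeline(
    permits=[100.0] * 6,
    closings=[0.0, 0.0, 50.0, 50.0, 100.0, 100.0],
    w=[0.5, 0.25, 0.25],
    v=[0.0, 0.5, 0.5],
)
```

With constant permits of 100 per period, modeled starts only reach 100 in the third period, once every lag has history to draw on; the first two periods are underestimated.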

Calibration: Adjust w and v so national/regional aggregates match NRC totals (South region for Houston)
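
One simple way to approach this calibration is proportional rescaling: run the lag model on regional permits, compare the modeled total to the NRC total over the overlap window, and scale the weights by the ratio. The sketch below assumes that approach; the source does not specify the actual fitting method, and `calibrate_weights` and its inputs are hypothetical.

```python
def calibrate_weights(weights, regional_permits, nrc_regional_starts):
    """Rescale lag weights so modeled regional starts match the NRC total.

    Warm-up periods (where some lags have no history) are excluded from the
    comparison so they do not bias the scaling factor.
    """
    modeled = [
        sum(w * regional_permits[t - k] for k, w in enumerate(weights) if t - k >= 0)
        for t in range(len(regional_permits))
    ]
    warmup = len(weights) - 1
    modeled_total = sum(modeled[warmup:])
    target_total = sum(nrc_regional_starts[warmup:])
    if modeled_total == 0:
        raise ValueError("modeled starts sum to zero; cannot calibrate")
    factor = target_total / modeled_total
    return [w * factor for w in weights]


# Illustrative numbers only: the model overshoots NRC starts by about 11%.
calibrated = calibrate_weights(
    weights=[0.5, 0.25, 0.25],
    regional_permits=[100.0] * 5,
    nrc_regional_starts=[90.0] * 5,
)
```

The same scaling could be applied to the completion weights v against NRC completions. A fuller calibration might fit the shape of the lag distribution as well as its level.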

Note: The model has a warm-up period of roughly 9 months. In the first quarters of a series, the lag sums lack earlier permit history, so starts and completions are underestimated. As of Phase 2, HMDA purchase-loan originations serve as the closings signal in the finished-vacant estimate, replacing the earlier placeholder.

See full model details →

The Warehouse Schema

Geography

  • dim_cbsa — CBSA codes, names, delineation vintage
  • bridge_cbsa_county — County-to-CBSA membership mapping
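
As a rough sketch of how the county-to-CBSA bridge might be used, the snippet below rolls county-level values up to CBSA totals. The function name, the in-memory dict layout, and the permit counts are illustrative assumptions; only the FIPS and CBSA codes are real.

```python
from collections import defaultdict


def aggregate_to_cbsa(county_values: dict[str, float], bridge: dict[str, str]) -> dict[str, float]:
    """Sum county-level values into CBSA totals using a county -> CBSA mapping."""
    totals: dict[str, float] = defaultdict(float)
    for county_fips, value in county_values.items():
        cbsa_code = bridge.get(county_fips)
        if cbsa_code is None:
            continue  # county is not mapped to a CBSA in this delineation vintage
        totals[cbsa_code] += value
    return dict(totals)


# Mapping rows as they might appear in bridge_cbsa_county (subset, for illustration).
bridge = {
    "48201": "26420",  # Harris County -> Houston CBSA
    "48157": "26420",  # Fort Bend County -> Houston CBSA
    "48339": "26420",  # Montgomery County -> Houston CBSA
    "48453": "12420",  # Travis County -> Austin CBSA
}

# Made-up county permit counts.
county_permits = {"48201": 1200.0, "48157": 800.0, "48339": 500.0, "48453": 900.0}
cbsa_permits = aggregate_to_cbsa(county_permits, bridge)
```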

Federal Facts

  • fact_bps_permits — Monthly building permit counts by CBSA, structure type, and revision vintage
  • fact_fhfa_hpi — Quarterly FHFA house price index values by CBSA
  • fact_nrc_pipeline — National/regional housing starts, under construction, and completions
  • fact_nrs_inventory — National new residential for-sale inventory and months' supply
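
Because fact_bps_permits keeps multiple revision vintages, downstream consumers usually want only the most recent revision for each month. A minimal sketch of that selection follows; the column names (`cbsa_code`, `period`, `structure_type`, `vintage`, `units`) are assumptions rather than the actual schema, and vintages are assumed to be ISO date strings so that string comparison orders them correctly.

```python
def latest_vintage(rows):
    """Keep one row per (CBSA, period, structure type): the latest revision vintage."""
    best = {}
    for row in rows:
        key = (row["cbsa_code"], row["period"], row["structure_type"])
        # ISO-formatted dates compare correctly as plain strings.
        if key not in best or row["vintage"] > best[key]["vintage"]:
            best[key] = row
    return list(best.values())


# Made-up rows: the 2025-10 count is revised in a later vintage.
rows = [
    {"cbsa_code": "26420", "period": "2025-10", "structure_type": "1-unit",
     "vintage": "2025-11-25", "units": 4100},
    {"cbsa_code": "26420", "period": "2025-10", "structure_type": "1-unit",
     "vintage": "2025-12-23", "units": 4180},
    {"cbsa_code": "26420", "period": "2025-11", "structure_type": "1-unit",
     "vintage": "2025-12-23", "units": 3900},
]
current = latest_vintage(rows)
```

Keeping earlier vintages in the table, rather than overwriting them, also makes it possible to reproduce what the model saw at a given point in time.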

Phase 2 Federal (Complete)

  • fact_bls_employment — Metro-area payroll employment and unemployment rates
  • fact_census_pop — Annual population estimates and migration components
  • fact_hmda_originations — Mortgage origination aggregates by MSA/MD
  • fact_fred_series — FRED API economic time series (mortgage rates, macro indicators)

Houston Local (Phase 3)

  • fact_hcad_parcels — Harris County Appraisal District parcel-level data
  • fact_houston_permits_agg — Aggregated Houston permitting data

TPIA (Pending, Phase 3)

  • fact_permit_record — Individual permit records from TPIA request
  • fact_inspection_event — Inspection events tied to permits (foundation, framing, final/CO)
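
Once these tables are populated, inspection events could replace modeled lifecycle estimates with observed ones. The sketch below assumes a mapping the source does not confirm: the first foundation inspection is treated as a proxy for the start, and the first final/CO event as a proxy for completion. Event type labels, the tuple layout, and the sample events are hypothetical.

```python
from datetime import date


def lifecycle_dates(events):
    """Derive per-permit start and completion dates from inspection events.

    events: iterable of (permit_id, inspection_type, event_date).
    Assumption: "foundation" marks a start and "final" marks a completion.
    """
    out = {}
    for permit_id, kind, when in events:
        rec = out.setdefault(permit_id, {"start": None, "completion": None})
        if kind == "foundation":
            if rec["start"] is None or when < rec["start"]:
                rec["start"] = when
        elif kind == "final":
            if rec["completion"] is None or when < rec["completion"]:
                rec["completion"] = when
    return out


def under_construction(lifecycle):
    """Permits that have started but have no completion event yet."""
    return sorted(
        pid for pid, rec in lifecycle.items()
        if rec["start"] is not None and rec["completion"] is None
    )


# Made-up events for two permits.
events = [
    ("P-001", "foundation", date(2025, 3, 10)),
    ("P-001", "framing", date(2025, 5, 2)),
    ("P-001", "final", date(2025, 9, 18)),
    ("P-002", "foundation", date(2025, 6, 1)),
    ("P-002", "framing", date(2025, 8, 15)),
]
lifecycle = lifecycle_dates(events)
active = under_construction(lifecycle)
```

A real implementation would also need to handle failed or repeated inspections, which this sketch ignores.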

Derived

  • mart_pipeline_quarterly — Modeled starts, under construction, completions, and finished-vacant inventory per CBSA per quarter
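
If the model runs on monthly permits (an assumption; the source only states that permits are monthly and the mart is quarterly), the roll-up to quarters has to treat flows and stocks differently: starts and completions are summed over the quarter, while under construction and finished vacant take the end-of-quarter value. The field names below are hypothetical.

```python
def to_quarter(month: str) -> str:
    """Convert 'YYYY-MM' to 'YYYYQn'."""
    year, mm = month.split("-")
    return f"{year}Q{(int(mm) - 1) // 3 + 1}"


def quarterly_rollup(monthly):
    """Aggregate monthly model output (sorted by month) to one row per quarter."""
    out = {}
    for row in monthly:
        q = to_quarter(row["month"])
        agg = out.setdefault(
            q, {"quarter": q, "starts": 0.0, "completions": 0.0, "uc": None, "fv": None}
        )
        agg["starts"] += row["starts"]            # flow: sum within the quarter
        agg["completions"] += row["completions"]  # flow: sum within the quarter
        agg["uc"] = row["uc"]                     # stock: last month of the quarter wins
        agg["fv"] = row["fv"]                     # stock: last month of the quarter wins
    return list(out.values())


# Made-up monthly output spanning two quarters.
monthly = [
    {"month": "2025-01", "starts": 100.0, "completions": 80.0, "uc": 520.0, "fv": 60.0},
    {"month": "2025-02", "starts": 110.0, "completions": 90.0, "uc": 540.0, "fv": 65.0},
    {"month": "2025-03", "starts": 120.0, "completions": 95.0, "uc": 565.0, "fv": 70.0},
    {"month": "2025-04", "starts": 105.0, "completions": 100.0, "uc": 570.0, "fv": 72.0},
]
quarters = quarterly_rollup(monthly)
```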

See full schema details →

The Dashboard

The Next.js 15 dashboard (with shadcn/ui components and Recharts visualizations) provides six pages, all scoped to a CBSA. All pages are now implemented as of Phase 2:

| Page | Route | Content |
|---|---|---|
| Market Overview | /market/26420 | Six KPI tiles (permits, starts, UC, completions, finished vacant, months' supply) + quarterly trend chart |
| Permits | /market/26420/permits | Monthly permit time series, single-family vs total, year-over-year bars |
| Pipeline | /market/26420/pipeline | Modeled starts/UC/completions chart, UC stock trend |
| Employment | /market/26420/employment | Payroll trend, unemployment rate, year-over-year job growth |
| Prices | /market/26420/prices | FHFA HPI trend, year-over-year price change, vs national |
| Demographics | /market/26420/demographics | Population trend, migration components |

Project Status & Roadmap

Phase 1: Federal Backbone + Modeled Pipeline (Complete)

BPS permits ingestion, FHFA HPI ingestion, NRC/NRS calibration data, permit-to-start distributed lag model, and dashboard scaffold with market overview and permits detail pages.

Phase 2: Employment + Prices + Closings (Complete)

BLS employment connector, Census PEP population connector, FRED API series connector, HMDA mortgage origination connector (Data Browser API), four new database migrations, and three new dashboard pages (employment, demographics, prices). HMDA closings are now wired into the finished-vacant inventory estimate.

Phase 3: Houston Local + TPIA (Future)

HCAD parcel data ingestion, Houston permit aggregates, and TPIA lifecycle data (individual permits, inspection events, certificates of occupancy) when the public information request is fulfilled.


What Remains

  • Merge Phase 2 PR (feat/phase2-employment-pop-hmda)
  • Set up Supabase project with real credentials and run migrations
  • Phase 3 planning (Houston local data + TPIA lifecycle)
  • File TPIA request with Houston Permitting Center
  • Set up GitHub Actions CI