A comprehensive metric-by-metric reproduction plan mapping every Houston CBAS presentation metric to public data sources, with national replicability assessments, modeling strategies, and SQL implementation artifacts.
The Houston presentation you referenced (and the uploaded PDF dated 02/12/2026) is best understood as a hybrid of: (a) federal macro indicators that are easy to automate nationwide, (b) local government process data (permits, plats, inspections, certificates, recordings) that is automatable but fragmented and vendor-dependent, and (c) proprietary "new-home market intelligence" (community-level starts/closings, spec inventory, floorplan/base-price catalogs, vacant developed lots, future-lots pipelines) that is difficult to replicate purely from public sources without building a sophisticated local-data fusion + modeling layer.
A national rollout across "top-25 CBSAs" (unspecified; placeholders provided later) is feasible if you design the product as a layered stack:
Important scope note: the Houston deck's new-home and lot pipeline metrics appear to have been produced by Community Builders Advisory Services ("Source: CBAS" appears repeatedly in the deck), implying an internal address/community database and field-verified pipeline. A public-data-only implementation will inevitably be more probabilistic unless you integrate deep local permitting/inspection/plat/parcel/recording data at address level.
This section itemizes every metric/chart/table in the Houston presentation, focusing on: metric name, interpretation/definition, units, time window, and visual type. Where the deck does not define the metric precisely, it is flagged as ambiguous (per your instruction).
| Section in Deck | Slide Element (Metric/Table/Chart Name) | Definition as Shown or Inferred | Units | Time Window Shown | Visualization Type |
|---|---|---|---|---|---|
| Demographics | Total Population Change | Net population change (not decomposed on this slide) | People | 2001–2023 | Line chart + callouts |
| Demographics | Metro population growth ranking table | Rank + metro population + numeric/population change | People | 2023–2024 (vintage implied) | Table (top metros) |
| Demographics | County population growth ranking table | Rank + county population + numeric change + % change | People, % | 2023–2024 (vintage implied) | Table (top counties) |
| Demographics | Births–Deaths, Domestic Migration, International Migration | Components of population change | People | 2020–2024 | Multi-series chart |
| Demographics | "Metro Houston Population by Year" | Population level | People | 2015–2024 | Line chart |
| Employment | Employment growth trend ranking table | Total employment, annual job growth (# and %), unemployment rate | Jobs, %, % | "as of" ~Sep 2025 (implied by slide) | Table (top metros) |
| Employment | Houston annual job growth | Annual job growth as a time series | Jobs (or change) | Jan 2024–Sep 2025 | Line chart |
| Employment | Total nonfarm payroll employment (seasonally adjusted) | Employment level | Jobs | 2003–2025 | Line chart |
| Employment | "Metro Houston Employment – Select Industries" | Employment by industry | Jobs | A point-in-time (implied) | Table |
| Employment | Annual employment growth by sector | Sector contribution/growth | Jobs change | ~latest year | Bar chart |
| Employment | 2026 employment growth forecast | Forecasted job gains/losses by sector + total forecast | Jobs change | 2026 | Bar chart |
| Resale market | "Single Family True Resale Home Sales – MLS Stats" | Annual sales, monthly sales, average price, median price, active listings, days on market, sales/list price, and YoY % changes | Counts, $, days, % | December 2025 snapshot (annual + monthly) | Table (with callouts) |
| Resale market | "Resale Home Closings and Inventory" | Monthly closings and active listing inventory | Counts | Jan 2018–Nov 2025 | Dual-axis line chart |
| New home market | Houston starts and closings history | Ambiguous definition of "estimated starts" and "estimated closings" (likely for-sale single-family new construction) | Homes | 4Q 2000/2002–3Q 2025 | Dual-series chart (rolling annual by quarter) |
| New home market | "3Q 2025 New Home Market by the Numbers" | Quarterly starts; quarterly closings; homes under construction; finished vacant homes inventory; finished vacant months' supply | Homes, months | 3Q 2025 | KPI tiles |
| Lots / pipeline | "3Q 2025 New Home Market by the Number" (lots) | Vacant developed lots; VDL months' supply; future lots; future lots w/ active site work | Lots, months | 3Q 2025 | KPI tiles |
| New home segmentation | "Annual starts and closings by market area" | Starts and closings split by sub-areas (North/West/etc.) | Homes | "Annual" at 3Q 2025 (likely trailing 4 quarters) | Bar chart + map |
| New home segmentation | "Annual starts and closings by lot size program" | Starts and closings split by lot width/depth bands | Homes | Annual at 3Q 2025 | Bar chart |
| New home segmentation | High volume subdivisions (starts) | Count of subdivisions with 50+ annual starts | Subdivisions | Annual at 3Q 2025 | Map + count callout |
| New home segmentation | High volume subdivisions (closings) | Count of subdivisions with 50+ annual closings | Subdivisions | Annual at 3Q 2025 | Map + count callout |
| New home supply | Finished vacant homes by market area | Finished vacant inventory by area + months' supply | Homes, months | 3Q 2025 | Bar + line |
| New home supply | Submarkets with high finished vacant inventory | Finished vacant homes and months' supply for selected submarkets; includes selection criteria text | Homes, months | 3Q 2025 | Table |
| New home supply | Finished vacant inventory by lot size program | Finished vacant inventory + months' supply by lot-size band | Homes, months | 3Q 2025 | Bar + line |
| Pricing | Median new home price trend + price per square foot | Median $ and $/sf | $, $/sf | Through Dec 2025 (multi-year) | Dual-axis line chart |
| Pricing | Floorplan price direction | Count/% of floorplans with price decreases/no change/increases; median magnitude of change | Count, %, $ | QoQ (quarter-over-quarter) | KPI/stat block |
| Pricing | Annual starts/closings by base price band | Starts and closings by base price buckets | Homes | Annual at 3Q 2025 | Bar chart |
| Pricing | Finished vacant inventory by price band | Finished vacant inventory + months' supply by price bucket | Homes, months | 3Q 2025 | Bar + line |
| Floorplans | Least expensive base priced floor plan | Specific plan: base price, beds/baths, sqft; community + builder | $, count, sqft | Snapshot | Profile card |
| Floorplans | Most expensive base priced floor plan(s) | Plan specs and base price | $, count, sqft | Snapshot | Profile card(s) |
| Rankings | Top communities & neighborhoods ranked by annual starts | Rank + annual starts + annual closings by community | Homes | Annual at 3Q 2025 | Table (spans multiple slides) |
| Lot inventory | VDL inventory by market area + months' supply | Vacant developed lots by market area + months' supply | Lots, months | 3Q 2025 | Bar + line |
| Lot inventory | Submarkets with high VDL inventory | VDL inventory and months' supply by submarket + selection criteria | Lots, months | 3Q 2025 | Table |
| Lot inventory | High VDL subdivisions | Count of subdivisions with 100+ VDL | Subdivisions | 3Q 2025 | Map + count callout |
| Lot inventory | VDL by lot size program | VDL inventory + months' supply by lot-size band | Lots, months | 3Q 2025 | Bar + line |
| Future lots | Future lots by market area | Lots in future pipeline split by "raw land" vs "active site work" (labels inferred from legend) | Lots | 3Q 2025 | Stacked bar chart |
| Future lots | Future lots by status | Lots by status buckets (Vacant, Clearing, WS&D, Paving) | Lots | 3Q 2025 | Bar chart |
| Future lots | Future planned subdivision locations | Count of identified future planned subdivisions | Subdivisions | 3Q 2025 | Map + count callout |
| Focus area | 288 South corridor "Historical Community Activity" | Quarterly closings; models; complete vacant; under construction; total inventory; total supply; quarterly starts; VDL; VDL supply; future lots; lot deliveries | Homes/lots, months | 1Q 2025–4Q 2025 | Table |
| Focus area | 288 South "Top Communities" table | For each community: 4Q models; quarterly starts & annual; quarterly closings & annual | Homes, models | 1Q 2025–4Q 2025 + annual | Table |
| Focus area | 288 South "Top Builders" table | Annual starts, annual closings, 4Q market share, annual market share | Homes, % | Annual at 4Q 2025 | Table |
| Focus area | 288 South starts/closings by lot size | Starts and closings by lot-size band | Homes | Annual at 4Q 2025 | Bar chart |
| Focus area | 288 South starts/closings by base price band | Starts and closings by price bucket | Homes | Annual at 4Q 2025 | Bar chart |
| Focus area | 288 South VDL by lot size | VDL inventory + months' supply by lot size | Lots, months | 4Q 2025 | Bar + line |
| Focus area | 288 South future planned developments | "Future planned lots identified" | Lots | 4Q 2025 | KPI callout |
| Focus area | Grand Magnolia (map + community table) | Map of submarkets + community table as above | Homes, models | 1Q 2025–4Q 2025 + annual | Map + table |
| Focus area | River Ranch (map + community table) | Map of submarkets + community table as above | Homes, models | 1Q 2025–4Q 2025 + annual | Map + table |
| Focus area | Lago Mar East (map + community table) | Map of submarkets + community table as above | Homes, models | 1Q 2025–4Q 2025 + annual | Map + table |
| Conclusions | Houston starts forecast ranges | 2025 and 2026 forecast start ranges and % deltas | Homes, % | 2025–2026 | KPI callout |
| Conclusions | Permits vs estimated starts history + long-term average | Annual SF building permits vs estimated annual starts; includes long-term average line | Homes | 4Q 2000–4Q 2026 | Dual-axis bar+line |
The following are central to reproducing the deck but not formally defined on-slide, so a national build should treat them as metric-spec decisions you must codify:
These are solvable, but they materially affect replication success.
This section is a catalog of the most relevant nationwide public sources, with endpoints/patterns, cadence, granularity, and automation difficulty.
The U.S. Census Bureau Building Permits Survey (BPS) is the most important uniform national feed for local-market construction activity. Revised permits are released on the 17th workday of each month and are published down to CBSA, county, and permit-issuing place.
BPS bulk files are distributed through Census "FTP-style" directories (public HTTP). The directory structure includes CBSA, County, Place, State, and a large "Master Data Set."
Concrete endpoint patterns (BPS):

    # Directory landing (browseable)
    https://www2.census.gov/econ/bps/
    # CBSA revised monthly & year-to-date files (text)
    https://www2.census.gov/econ/bps/CBSA%20%28beginning%20Jan%202024%29/cbsaYYMMc.txt
    https://www2.census.gov/econ/bps/CBSA%20%28beginning%20Jan%202024%29/cbsaYYMMy.txt
    # Example shown in directory listing
    .../cbsa2512c.txt (revised monthly, Dec 2025)
    .../cbsa2512y.txt (YTD, Dec 2025)
    # Master compiled data documentation (notes it is extremely large)
    https://www2.census.gov/econ/bps/Master%20Data%20Set/Compiled%20Data%20Documentation.docx

The Master Compiled Data Set is described as extremely large (millions of rows / multi-GB) and is usually unnecessary if you only need top CBSAs; it's typically easier to ingest the monthly CBSA files plus county/place as needed.
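As a rough illustration of the file-name pattern above, the sketch below builds the two CBSA URLs for a given year and month. The function name `bps_cbsa_urls` is hypothetical, and the code assumes the "CBSA (beginning Jan 2024)" directory name stays as shown; earlier periods may live under a different directory.

```python
from urllib.parse import quote

BPS_BASE = "https://www2.census.gov/econ/bps/"
# Directory name taken from the endpoint pattern above (assumed stable).
CBSA_DIR = "CBSA (beginning Jan 2024)"


def bps_cbsa_urls(year: int, month: int) -> dict[str, str]:
    """Return the monthly ('c') and year-to-date ('y') CBSA file URLs."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    yymm = f"{year % 100:02d}{month:02d}"
    # quote() percent-encodes spaces and parentheses, matching the listing.
    directory = BPS_BASE + quote(CBSA_DIR, safe="") + "/"
    return {
        "monthly": f"{directory}cbsa{yymm}c.txt",
        "ytd": f"{directory}cbsa{yymm}y.txt",
    }
```

Calling `bps_cbsa_urls(2025, 12)` yields URLs ending in `cbsa2512c.txt` and `cbsa2512y.txt`, the same names as the directory-listing example.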
Automation difficulty: 1/5 (bulk file ingest; stable)
Time to automate ingestion: ~1–2 engineer-days for a robust downloader/parser + 2–3 days for QA and schema stabilization.
The Census "New Residential Construction" (NRC) and "New Residential Sales" (NRS) releases (from the Survey of Construction and BPS) provide national and regional estimates for: starts, under construction, completions, and stage-of-construction inventory, including "completed houses for sale."
These are released on the Survey of Construction schedule: NRC typically on the 12th workday of the month, and NRS and revised permits on the 17th workday.
Key constraint: NRC/NRS are not published at CBSA granularity for the "under construction / completions / stage inventory" measures, so they function mainly as macro calibration targets for any metro estimation model.
Automation difficulty: 1/5 (direct Excel downloads)
Time to automate ingestion: <1 week including field dictionary mapping.
HMDA is maintained and published through the FFIEC HMDA platform (with CFPB stewardship). The dataset is a powerful proxy for mortgage-financed purchase originations and can be summarized by MSA/MD (metro). The FFIEC provides a "Data Browser API" that returns either aggregated JSON or raw CSV subsets, filtered by filing year and geography.
Key endpoints from the official documentation include:

    # Aggregations (JSON)
    GET https://ffiec.cfpb.gov/v2/data-browser-api/view/aggregations?years=YYYY&msamds=#####&actions_taken=...
    # Raw streamed CSV (careful: can be huge)
    GET https://ffiec.cfpb.gov/v2/data-browser-api/view/csv?years=YYYY&msamds=#####&actions_taken=...

The documentation specifies required parameters (year + at least one HMDA data filter) and geographic filters including msamds and counties.
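A minimal sketch of a query builder for the aggregations endpoint follows. It only assembles the URL from the parameters named above (`years`, `msamds`, `actions_taken`); the helper name `hmda_aggregation_url` is hypothetical, and the example value `actions_taken=1` assumes the standard HMDA coding in which 1 means "loan originated".

```python
from urllib.parse import urlencode

HMDA_AGGREGATIONS = "https://ffiec.cfpb.gov/v2/data-browser-api/view/aggregations"


def hmda_aggregation_url(year: int, msamd: str, actions_taken: str) -> str:
    """Build an aggregations URL for one filing year and one MSA/MD."""
    # Parameter names come from the endpoint pattern above.
    params = {"years": year, "msamds": msamd, "actions_taken": actions_taken}
    return f"{HMDA_AGGREGATIONS}?{urlencode(params)}"


# Example: originated loans in CBSA 26420 for an illustrative filing year.
example_url = hmda_aggregation_url(2023, "26420", "1")
```

Keeping URL construction separate from the HTTP call makes it easy to log every query for the audit trail before any request is sent.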
For bulk files, "HMDA File Serving" documents institution-level modified LAR endpoints (CSV/TXT per LEI) and describes that other files are served from a public bucket with a fixed prefix.
Automation difficulty: 3/5
Time to automate ingestion: ~2–4 weeks for a production-grade pipeline (query builder, paging/streaming, retries, audit logs, and a stable metric layer).
The Federal Housing Finance Agency publishes a "master" House Price Index file as direct CSV; it contains CBSA-level series (among many levels).
Concrete endpoint (FHFA HPI):
https://www.fhfa.gov/hpi/download/monthly/hpi_master.csv
Automation difficulty: 1/5
Time to automate ingestion: ~2–4 engineer-days including a CBSA filter/extract and QA.
Many of the BPS permit series and other macro series are mirrored in FRED. FRED is particularly useful when you want simple CBSA series IDs and consistent format options, and you're comfortable depending on FRED as a distributor.
Example: Houston 1-unit permits series includes a definition that 1-unit structures correspond to single-family homes (including certain attached forms if separated by ground-to-roof walls).
FRED Web Services provide a stable API with a required API key parameter and support JSON/CSV output via `file_type`.
Automation difficulty: 1/5
Time to automate ingestion: <1 week.
The Census Population Estimates Program publishes metro/micro population totals and components of change (including natural change and net domestic/international migration components) in downloadable files for 2020–2024 vintages.
This is a direct match for the deck's "births–deaths / domestic migration / international migration" visuals at the metro level.
Automation difficulty: 2/5 (file layout + annual refresh)
Time to automate ingestion: ~1–2 weeks including crosswalk stabilization across vintages.
HUD has two major "distribution surfaces" relevant here:
HUD's aggregated USPS administrative data is extremely relevant to your "vacancy / under construction" problem because it includes quarterly counts and defines "No-Stat" as including addresses like homes under construction and not yet occupied.
However, HUD states that under its agreement with USPS it can make the data accessible only to governmental entities and registered nonprofits, and access requires registration.
HUD also describes a newer "Neighborhood Change Web Map API" and a HUD User dataset API tester that requires an access token; again the access model is restricted.
Automation difficulty: 5/5 (because access eligibility and licensing are the gating factors)
Time to automate ingestion: engineering time is ~2–4 weeks once access is granted, but "time to access" is organizational/legal and may dominate.
This is where national replication becomes "connector-driven." You should assume each CBSA will require a portfolio of permitting sources (city + county + special districts) rather than one.
Two common patterns:
- Socrata-style open data portals (e.g., data.city.gov): default `$limit` behavior and offset-based paging are standard; application tokens can raise throttling limits.
- ArcGIS REST services (FeatureServer/MapServer layers): `query` endpoints with pagination via `resultOffset` and `resultRecordCount`, when supported.
This section maps each Houston metric family to the most plausible public sources, plus an explicit replicability assessment (difficulty 1–5). Where a metric is likely CBAS-proprietary, the mapping focuses on best-effort public substitutes and modeling strategies.
| Metric Family | Examples from Houston Deck | Best Public Source(s) | National Coverage | Difficulty (1–5) | Notes |
|---|---|---|---|---|---|
| Permits issued | "Annual SF building permits," implicit in starts forecast | Census BPS (CBSA/county/place); FRED permit series | Excellent | 1 | BPS is your backbone. |
| Population + migration | Components of change, population levels | Census metro population estimates + components files | Excellent | 2 | Annual refresh; stable file layouts. |
| Employment + unemployment | Metro tables and time series | BLS time series (LAUS/CES); BLS public API | Excellent | 2 | Requires series-id management and caching. |
| House price index | Price trend proxy for resale/new home | FHFA HPI (CBSA series); optionally Freddie Mac FMHPI | Strong | 1–2 | FHFA is fully public and downloadable. |
| Mortgage-financed sales/closings | Closings proxy | HMDA Data Browser API aggregations by msamds | Good | 3 | Mortgage-only; does not cover cash. |
| Total sales/closings (incl cash) | Resale closings counts | County recorder deed transfers (varies), assessor sales files (varies) | Fragmented | 4–5 | Many portals are paywalled or scraping-only; normalization hard. |
| Starts/UC/completions at metro | Quarterly starts, UC stock, completions | Requires local permit + inspection + CO event fusion; calibrate to NRC/SOC | Fragmented | 4 | NRC/SOC is national/regional only; use it for calibration. |
| Finished vacant new-home inventory | "Finished Vacant Homes in Inventory" | Best-effort: CO/completion minus recorded/financed sales; alternative: HUD-USPS no-stat/vacancy (restricted) | Fragmented | 4–5 | "Finished vacant" is conceptually computable if you can unify CO and sales, but it's data-work heavy. |
| VDL + future lots pipeline | VDL supply, future lots, "active site work" | Plats + land-development permits + parcel subdivision buildout heuristics | Highly fragmented | 5 | Closest public analogs are plat agendas/approvals and infrastructure permits, but definitions differ. |
| MLS stats | Active listings, DOM, sale/list ratio | Requires MLS (proprietary) | Not public | 5 | Public proxies exist but are not government-adjacent; your best "adjacent" route is recorder+permit+HMDA. |
Below are the key Houston metrics you called out (permits, quarterly starts, UC, completions/closings, finished vacant inventory), each with a reproduction plan.
Source(s): Census BPS revised CBSA and county/place files.
Cadence: monthly revised (17th workday).
Granularity: CBSA/county/place; by unit type and structure where provided (depends on file).
Difficulty: 1/5 nationally.
Automation plan (robust + incremental):
1. Download `cbsaYYMMc.txt` and `cbsaYYMMy.txt` from the Census directory listing.
2. Store each raw file with `(source_file, ingest_timestamp, row_hash)` for audit.
3. Load the parsed rows into `fact_permits(cbsa, month, unit_type, units_authorized, ...)`.
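One way the audit fields could be attached during raw landing is sketched below. The function name `audit_rows` and the choice of SHA-256 over the raw line are assumptions for illustration; any stable hash of the unparsed row would serve the same purpose.

```python
import hashlib
from datetime import datetime, timezone


def audit_rows(source_file: str, lines: list[str]) -> list[dict]:
    """Attach (source_file, ingest_timestamp, row_hash) to each raw line."""
    ingest_ts = datetime.now(timezone.utc).isoformat()
    out = []
    for line in lines:
        raw = line.rstrip("\r\n")
        if not raw:
            continue  # skip blank lines
        out.append({
            "source_file": source_file,
            "ingest_timestamp": ingest_ts,
            # Hash the raw text so re-downloads of an unchanged row match.
            "row_hash": hashlib.sha256(raw.encode("utf-8")).hexdigest(),
            "raw_line": raw,
        })
    return out
```

Because the hash depends only on the row text, comparing `row_hash` values between two downloads of the same file is a cheap way to detect revised rows.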
Public-source reality check: There is no uniform federal CBSA "starts" series comparable to NRC/SOC starts; NRC is national/regional.
Therefore: You either (A) treat permits as an approximation of starts, or (B) build a local lifecycle model from permits + inspections.
Option A (fast, permits-as-starts):
`starts_qtr = sum(bps_permits_1unit)` by quarter and CBSA.
Difficulty: 2/5 (simple, but conceptually imperfect).
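The permits-as-starts aggregation can be sketched in a few lines. The input shape (tuples of CBSA code, month date, 1-unit permits) and the function name `quarterly_starts` are assumptions; the numbers in the usage example are made up.

```python
from collections import defaultdict
from datetime import date


def quarterly_starts(monthly_permits):
    """Sum 1-unit permits by (cbsa_code, year, quarter).

    monthly_permits: iterable of (cbsa_code, month_date, units_1unit).
    """
    totals = defaultdict(int)
    for cbsa, ym, units in monthly_permits:
        quarter = (ym.month - 1) // 3 + 1
        totals[(cbsa, ym.year, quarter)] += units
    return dict(totals)


# Illustrative values only, not real BPS figures.
sample = [
    ("26420", date(2025, 10, 1), 100),
    ("26420", date(2025, 11, 1), 120),
    ("26420", date(2025, 12, 1), 80),
    ("26420", date(2026, 1, 1), 50),
]
starts_by_quarter = quarterly_starts(sample)
```

This treats the permit month as the start month, which is the conceptual imperfection noted above: real starts lag permits, and some permits never become starts.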
Option B (Houston-style, lifecycle-derived starts):
Difficulty: 4/5 (requires local connectors).
Option A (modeled stock from starts):
Difficulty: 3/5 (no local data, but statistical).
Option B (observed stock from local lifecycle):
Difficulty: 4/5.
Mortgage-financed closings proxy (nationally uniform):
HMDA "originations" filtered to purchase loans in a CBSA give a mortgage-closing proxy (not total closings). Use the Data Browser aggregation endpoint with `msamds` and the relevant filters.
Difficulty: 3/5.
Total closings (mortgage + cash):
Difficulty: 5/5.
Hybrid "cash closings" estimate:
`cash_proxy = recorder_sales - hmda_purchase_originations` (after aligning geographies and time windows).
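A small sketch of the subtraction, applied per period, is shown below. Flooring the result at zero is an added assumption (misaligned windows or geographies can make HMDA counts exceed recorder counts), and the function name and sample figures are hypothetical.

```python
def cash_closings_proxy(recorder_sales: dict, hmda_originations: dict) -> dict:
    """Per-period cash proxy: recorder sales minus HMDA purchase originations.

    Both inputs map a period key (e.g. a year) to a count. Periods missing
    from either source are skipped rather than guessed.
    """
    proxy = {}
    for period, sales in recorder_sales.items():
        if period not in hmda_originations:
            continue
        # Assumption: floor at zero so alignment noise cannot go negative.
        proxy[period] = max(sales - hmda_originations[period], 0)
    return proxy


# Illustrative counts only.
example_proxy = cash_closings_proxy({2023: 1000, 2024: 900}, {2023: 700, 2024: 950})
```

A run of zero-floored periods is itself a useful QA signal that the two sources are not aligned on geography or timing.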
There is no federal CBSA "finished vacant new homes" statistic. The closest public analogs are national stage-of-construction series and, potentially, USPS vacancy/no-stat signals (but restricted).
Lifecycle-based computation (closest to the Houston KPI concept):
Difficulty: 4–5/5 depending on recorder availability.
USPS-based proxy (if eligible for HUD–USPS access):
Difficulty: 5/5 (access-limited).
This section addresses: "Which local Houston sources or vendor systems did the presenter likely use for each non-federal metric, and why?"
The deck explicitly cites "HAR.com" as the source for the resale table and the resale closings/inventory time series. This strongly indicates the presenter used data from Houston Association of Realtors (HAR MLS data distribution), which is proprietary and not a government dataset.
Implication for national rollout: you cannot reproduce "Active Listings," "Days on Market," or "Sales/List Price" purely from government sources; you need MLS partnerships or accept alternative proxies.
Across the new-home and lot pipeline slides, the deck repeatedly cites "Source: CBAS," and many visuals are at community/subdivision granularity rather than jurisdiction granularity. That pattern is consistent with a proprietary internal "new-home market census" (address/community-level tracking). The data elements that are especially indicative of private tracking rather than a single public feed include:
Public-data inference: CBAS likely used a blend of (1) permit data and inspection milestones from local permitting agencies, (2) parcel/subdivision geometry from county appraisal/GIS sources, and (3) field validation / builder portal scraping for floorplans and prices.
Houston-area publicly visible building-process systems that could contribute include:
If CBAS produced new-home "closings," they likely relied on either: (a) builder-reported closings, (b) deed recording data, or (c) MLS new construction closings (if covered). Public recorder systems exist in the Houston region, but they commonly have account requirements and transaction frictions. Example: Harris County Clerk real property records portal emphasizes account creation and paid copies.
For national replication, this is one of the hardest domains to automate cleanly.
This section gives you the "connector playbook" for city/county permitting, inspection, platting, and recording systems, including typical API patterns, auth/rate limits, and normalization.
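- Socrata (SODA) endpoints use `$limit` + `$offset` paging; the default page size is ~1,000 rows, and results can be ordered for stable paging.
- Send an application token (`X-App-Token`) for improved throttling and traceability.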
Typical query pattern:

    GET https://{domain}/resource/{dataset_id}.json?$select=...&$where=...&$order=...&$limit=1000&$offset=...
    Header: X-App-Token: <token>

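The paging pattern above might be wrapped in a generator like the following sketch. The HTTP call is injected as `fetch` (a hypothetical callable that takes a URL and returns a list of row dicts) so the loop can be exercised without network access; ordering by the `:id` system field is an assumption chosen to keep pages stable between requests.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode


def socrata_rows(
    resource_url: str,
    fetch: Callable[[str], List[dict]],
    page_size: int = 1000,
    order: str = ":id",
) -> Iterator[dict]:
    """Yield every row of a dataset using $limit/$offset paging."""
    offset = 0
    while True:
        query = urlencode({"$order": order, "$limit": page_size, "$offset": offset})
        rows = fetch(f"{resource_url}?{query}")
        yield from rows
        # A short (or empty) page means the dataset is exhausted.
        if len(rows) < page_size:
            return
        offset += page_size
```

In production, `fetch` would issue the GET with the `X-App-Token` header and handle retries; keeping that outside the generator keeps the paging logic testable.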
- ArcGIS REST layers expose `query` endpoints; page using `resultOffset` and `resultRecordCount` where supported.
- Check `exceededTransferLimit` in the response to decide whether to continue paging.

Typical query pattern:

    GET https://{host}/ArcGIS/rest/services/{svc}/FeatureServer/{layer}/query
      ?where=1%3D1
      &outFields=*
      &f=json
      &resultRecordCount=2000
      &resultOffset=0

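A comparable sketch for ArcGIS layers is below. As before, the HTTP call is injected: `fetch_json` is a hypothetical callable taking the query URL and a parameter dict and returning the decoded JSON response. The loop assumes the layer supports pagination and reports `exceededTransferLimit` when more records remain.

```python
from typing import Callable, Iterator


def arcgis_features(
    query_url: str,
    fetch_json: Callable[[str, dict], dict],
    page_size: int = 2000,
) -> Iterator[dict]:
    """Yield all features from a layer's query endpoint, page by page."""
    offset = 0
    while True:
        params = {
            "where": "1=1",
            "outFields": "*",
            "f": "json",
            "resultRecordCount": page_size,
            "resultOffset": offset,
        }
        payload = fetch_json(query_url, params)
        features = payload.get("features", [])
        yield from features
        # Stop when the server returns nothing or signals no further pages.
        if not features or not payload.get("exceededTransferLimit", False):
            return
        # Advance by what was actually returned; servers may cap page size.
        offset += len(features)
```

Advancing by `len(features)` rather than `page_size` guards against layers whose maximum record count is lower than the requested page size.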
Accela publishes an API with explicit offset/limit pagination and rate-limit headers (`x-ratelimit-*`). This is a "best case" compared with scraping public portals.
Tyler describes an API toolkit for permits and code enforcement, implying access to building permits and inspections data programmatically. OpenGov provides a developer portal and Permitting & Licensing API catalog access, again typically gated.
For both, your connector strategy should assume:
Recorder portals vary widely, often require user accounts, and may not have stable JSON APIs. Harris County Clerk's real property records portal emphasizes portal login/account and copy purchasing. Texas also has county-aggregating portals (e.g., Tyler-hosted "countygovernmentrecords" experience), indicating vendor concentration but not necessarily a public API.
Practical automation stance:
The Building & Land Development Specification (BLDS) exists specifically for standardizing building permit open data. It is a useful reference point for your schema layer.
A Houston-style dashboard, however, needs more than permit issuance: it needs a unit lifecycle. Recommended canonical entities:
This section gives detailed practical methods when direct measurements are not uniformly available.
Inputs:
Core idea:
Pros: quick to nationalize; stable inputs.
Cons: cannot reproduce community-level tables; finished vacant inventory is weakly identified.
Inputs per jurisdiction:
Definitions (suggested defaults; you can change):
Calibration and QA:
If you can meet HUD/USPS access requirements, tract-level quarterly "No-Stat" and vacancy counts can serve as an additional signal for "under construction" addresses, because HUD notes "No-Stat" includes homes under construction not yet occupied.
This method is most useful as a sanity-check layer rather than a sole estimator.
You requested that the exact top-25 list be treated as unspecified. Use placeholders until you define whether "top" means population, new-home starts, transaction volume, or another criterion:
<CBSA_01>, <CBSA_02>, …, <CBSA_25>
| Connector | Scope | What It Unlocks | Est. Effort (weeks) | Risk Notes |
|---|---|---|---|---|
| Census BPS ingest | Federal baseline | Permits (CBSA/county/place) | 1–2 | Low risk; stable files. |
| FRED ingest | Federal convenience | Permits & macro series by CBSA | 1 | Requires API key. |
| FHFA HPI ingest | Pricing trend | CBSA HPI series | 1 | Simple CSV. |
| Census PEP metro ingest | Pop + migration | Population and components | 2 | Annual refresh; vintage changes. |
| BLS API ingest | Labor market | Employment/unemployment series | 2–3 | Series-id management + rate limits. |
| HMDA Data Browser API | Mortgage closings proxy | Purchase originations counts/sums by metro | 3–5 | Large payloads; query constraints. |
| ArcGIS FeatureServer connector | Local open data | Permits/inspections/CO/plats where exposed | 3–6 | Endpoint variability; pagination quirks. |
| Socrata connector | Local open data | Same as above | 2–4 | Dataset discovery and churn. |
| Accela API connector | Permitting workflow | Permit + inspection lifecycle where credentialed | 4–8 | Requires credentials/agency cooperation. |
| Recorder/assessor connector patterns | Closings (cash+financed) | Total transfer counts and prices | 6–12 | Most difficult; legal + paywalls + normalization. |
| HUD–USPS vacancy (if eligible) | Vacancy/under-construction proxies | Tract-level vacancy/no-stat trend | 4–8 | Access eligibility dominates. |
A practical heuristic is to prioritize CBSAs where you can achieve a "minimum viable Houston" with high automation:

    CBSA: <CBSA_NAME>
    Counties (dominant): <COUNTY_LIST>
    Primary cities/jurisdictions for permits: <CITY_LIST>
    Permitting system(s) observed: <Accela / EnerGov(Tyler) / OpenGov / custom / unknown>
    Open-data portals present: <ArcGIS Hub / Socrata / none / unknown>
    Recorder access pattern: <bulk export / searchable portal / paywall / unknown>
    Parcel / assessor layer: <GIS downloads available? yes/no>
    Key metrics support:
    <yes> (BPS CBSA files; baseline)
    <modeled / lifecycle>
    <HMDA-only / recorder>
    <derived / not supported>
    <modeled from plats / not supported>

Houston example notes (from public sources):

    flowchart LR
      subgraph Federal[Federal baseline]
        BPS[Census BPS permits files]
        PEP[Census metro pop + components]
        BLS[BLS employment/unemployment]
        FHFA[FHFA HPI]
        HMDA[FFIEC HMDA Data Browser API]
        FRED[FRED series API]
      end
      subgraph Local[Local government-adjacent]
        PERMITS[City/County permit systems]
        INSP[Inspections + CO events]
        PLATS[Plats / entitlement agendas]
        PARCELS[Assessor / parcel GIS]
        REC[Recorder deeds / transfers]
      end
      subgraph ETL[Ingestion + normalization]
        RAW[(Raw landing tables)]
        STG[(Staging/standardization)]
        CORE[(Canonical warehouse)]
        METRICS[(Metric marts)]
      end
      subgraph Outputs[Products]
        DASH[CBSA dashboards]
        API[Public/internal metrics API]
      end
      BPS-->RAW
      PEP-->RAW
      BLS-->RAW
      FHFA-->RAW
      HMDA-->RAW
      FRED-->RAW
      PERMITS-->RAW
      INSP-->RAW
      PLATS-->RAW
      PARCELS-->RAW
      REC-->RAW
      RAW-->STG-->CORE-->METRICS
      METRICS-->DASH
      METRICS-->API


    gantt
      title Typical monthly release cadence (conceptual)
      dateFormat YYYY-MM-DD
      axisFormat %d
      section Census/SOC releases
      New Residential Construction (12th workday) :a1, 2026-03-12, 1d
      New Residential Sales (17th workday) :a2, 2026-03-19, 1d
      Revised Building Permits (17th workday) :a3, 2026-03-19, 1d
      section Other feeds
      FHFA HPI (monthly, lagged) :b1, 2026-03-28, 1d
      BLS metro series (monthly cycles vary) :b2, 2026-03-15, 2d
      HMDA (annual publication window varies) :b3, 2026-04-01, 30d

The 12th-workday schedule for NRC and the 17th-workday schedule for NRS and revised permits are documented on the Census SOC and BPS schedules. HMDA publication is annual rather than tied to a monthly schedule; public announcements confirm annual availability for a filing year.
Below are exact SQL DDL examples (PostgreSQL-style) for a Houston-like pipeline, and sample queries to compute Q4 starts/completions/under-construction/finished-vacant metrics.

    -- Jurisdictions (cities, counties, agencies)
    CREATE TABLE dim_jurisdiction (
        jurisdiction_id BIGSERIAL PRIMARY KEY,
        name TEXT NOT NULL,
        jurisdiction_type TEXT NOT NULL, -- city, county, agency, special_district
        state_fips TEXT,
        county_fips TEXT,
        source_system TEXT, -- accela, energov, opengov, socrata, arcgis, custom
        source_base_url TEXT,
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );

    -- Core CBSA dimension (your own, seeded from Census delineation files)
    CREATE TABLE dim_cbsa (
        cbsa_code TEXT PRIMARY KEY, -- e.g., '26420'
        cbsa_name TEXT NOT NULL,
        delineation_year INT NOT NULL, -- e.g., 2023
        is_metro BOOLEAN NOT NULL
    );

    -- Address/parcel entity (normalized)
    CREATE TABLE dim_property (
        property_id BIGSERIAL PRIMARY KEY,
        address_full TEXT,
        address_norm TEXT, -- normalized (USPS style)
        city TEXT,
        state TEXT,
        postal_code TEXT,
        parcel_id_raw TEXT,
        latitude NUMERIC(10,7),
        longitude NUMERIC(10,7),
        county_fips TEXT,
        cbsa_code TEXT REFERENCES dim_cbsa(cbsa_code),
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );

    -- Permit header (raw ingest may be separate; this is standardized)
    CREATE TABLE fact_permit (
        permit_id BIGSERIAL PRIMARY KEY,
        jurisdiction_id BIGINT REFERENCES dim_jurisdiction(jurisdiction_id),
        permit_number TEXT NOT NULL,
        permit_type TEXT, -- building, electrical, plumbing, etc.
        work_class TEXT, -- new, addition, alteration, demo, etc
        residential_flag BOOLEAN,
        single_family_flag BOOLEAN,
        units_authorized INT,
        declared_value_usd NUMERIC(14,2),
        application_date DATE,
        issue_date DATE,
        status TEXT,
        source_record_id TEXT, -- vendor/system id
        property_id BIGINT REFERENCES dim_property(property_id),
        updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
        UNIQUE(jurisdiction_id, permit_number)
    );

    -- Inspection/event log (foundation, framing, final, CO, etc)
    CREATE TABLE fact_inspection_event (
        inspection_event_id BIGSERIAL PRIMARY KEY,
        jurisdiction_id BIGINT REFERENCES dim_jurisdiction(jurisdiction_id),
        permit_number TEXT NOT NULL,
        event_type TEXT NOT NULL, -- foundation_pass, framing_pass, final_pass, co_issued, etc
        event_status TEXT, -- pass/fail/issued/scheduled
        event_date DATE NOT NULL,
        source_record_id TEXT,
        property_id BIGINT REFERENCES dim_property(property_id),
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );

    -- Recorder transfers (deeds / conveyances)
    CREATE TABLE fact_property_transfer (
        transfer_id BIGSERIAL PRIMARY KEY,
        county_fips TEXT NOT NULL,
        instrument_number TEXT,
        document_type TEXT, -- warranty deed, deed, etc
        record_date DATE NOT NULL, -- recording date
        sale_date DATE, -- if available; else null
        sale_price_usd NUMERIC(14,2), -- if available
        grantor TEXT,
        grantee TEXT,
        property_id BIGINT REFERENCES dim_property(property_id),
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );

    -- Derived unit lifecycle table (one row per "housing unit observation key")
    -- If you track at address+unit granularity, add unit_number or a unit_id.
    CREATE TABLE fact_unit_lifecycle (
        unit_id BIGSERIAL PRIMARY KEY,
        property_id BIGINT REFERENCES dim_property(property_id),
        cbsa_code TEXT REFERENCES dim_cbsa(cbsa_code),
        start_date DATE, -- your chosen start definition
        completion_date DATE, -- CO or final pass
        closing_date DATE, -- deed record date or best proxy
        is_model_home BOOLEAN DEFAULT FALSE,
        source_confidence NUMERIC(3,2) DEFAULT 0.70, -- 0..1
        created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );

    -- Optional: monthly permits mart (BPS)
    CREATE TABLE fact_cbsa_permits_bps (
        cbsa_code TEXT REFERENCES dim_cbsa(cbsa_code),
        ym DATE NOT NULL, -- first day of month
        units_1unit INT,
        units_total INT,
        updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
        PRIMARY KEY (cbsa_code, ym)
    );

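To illustrate how `start_date` and `completion_date` in `fact_unit_lifecycle` might be derived from `fact_inspection_event` rows, here is a hedged sketch. It assumes one particular start definition (first passed foundation inspection) and takes completion as CO issuance, falling back to a passed final inspection; both choices, and the function names, are illustrative rather than prescribed.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

Event = Tuple[str, str, date]  # (event_type, event_status, event_date)


def _first_date(events, event_type: str, ok_status: str) -> Optional[date]:
    """Earliest date on which the given event occurred with the given status."""
    dates = [d for t, s, d in events if t == event_type and s == ok_status]
    return min(dates) if dates else None


def derive_lifecycle(events: Iterable[Event]) -> dict:
    """Derive start/completion dates for one unit from its event log."""
    evs = list(events)
    # Assumed start definition: first passed foundation inspection.
    start = _first_date(evs, "foundation_pass", "pass")
    # Completion: CO issued, else first passed final inspection.
    completion = _first_date(evs, "co_issued", "issued") or _first_date(
        evs, "final_pass", "pass"
    )
    return {"start_date": start, "completion_date": completion}
```

Whatever definitions are chosen, recording them in one function like this keeps the metric spec auditable and makes it straightforward to re-derive the table when a definition changes.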
The queries below assume you have derived `start_date`, `completion_date`, and `closing_date` in `fact_unit_lifecycle`.

    -- Parameterize these in your application
    -- Example for Q4 2025:
    WITH params AS (
        SELECT DATE '2025-10-01' AS q_start,
               DATE '2025-12-31' AS q_end
    )
    SELECT
        ul.cbsa_code,
        COUNT(*) FILTER (WHERE ul.start_date BETWEEN p.q_start AND p.q_end) AS q4_starts,
        COUNT(*) FILTER (WHERE ul.completion_date BETWEEN p.q_start AND p.q_end) AS q4_completions,
        COUNT(*) FILTER (WHERE ul.closing_date BETWEEN p.q_start AND p.q_end) AS q4_closings
    FROM fact_unit_lifecycle ul
    CROSS JOIN params p
    GROUP BY ul.cbsa_code
    ORDER BY ul.cbsa_code;


    -- Under-construction stock at quarter end:
    -- started on or before q_end and not yet completed as of q_end
    WITH params AS (
        SELECT DATE '2025-12-31' AS q_end
    )
    SELECT
        ul.cbsa_code,
        COUNT(*) AS under_construction_stock
    FROM fact_unit_lifecycle ul
    CROSS JOIN params p
    WHERE ul.start_date IS NOT NULL
      AND ul.start_date <= p.q_end
      AND (ul.completion_date IS NULL OR ul.completion_date > p.q_end)
    GROUP BY ul.cbsa_code;


    -- Finished vacant inventory at quarter end:
    -- completed on or before q_end, not yet closed, excluding model homes
    WITH params AS (
        SELECT DATE '2025-12-31' AS q_end
    )
    SELECT
        ul.cbsa_code,
        COUNT(*) AS finished_vacant_inventory
    FROM fact_unit_lifecycle ul
    CROSS JOIN params p
    WHERE ul.completion_date IS NOT NULL
      AND ul.completion_date <= p.q_end
      AND (ul.closing_date IS NULL OR ul.closing_date > p.q_end)
      AND COALESCE(ul.is_model_home, FALSE) = FALSE
    GROUP BY ul.cbsa_code;

A common definition is: months supply = (finished vacant inventory) / (average monthly closings over trailing 3 months).

    -- Finished-vacant months' supply using trailing 3-month closings
    WITH params AS (
        SELECT DATE '2025-12-31' AS q_end,
               DATE '2025-10-01' AS trailing_start
    ),
    finished_vacant AS (
        SELECT ul.cbsa_code, COUNT(*) AS fv
        FROM fact_unit_lifecycle ul
        CROSS JOIN params p
        WHERE ul.completion_date <= p.q_end
          AND (ul.closing_date IS NULL OR ul.closing_date > p.q_end)
          AND COALESCE(ul.is_model_home, FALSE) = FALSE
        GROUP BY ul.cbsa_code
    ),
    trailing_closings AS (
        SELECT ul.cbsa_code, COUNT(*) AS closings_3mo
        FROM fact_unit_lifecycle ul
        CROSS JOIN params p
        WHERE ul.closing_date BETWEEN p.trailing_start AND p.q_end
        GROUP BY ul.cbsa_code
    )
    SELECT
        fv.cbsa_code,
        fv.fv AS finished_vacant_inventory,
        tc.closings_3mo,
        -- NULL when there were no closings in the trailing window
        CASE
            WHEN tc.closings_3mo = 0 THEN NULL
            ELSE (fv.fv::NUMERIC / (tc.closings_3mo::NUMERIC / 3.0))
        END AS months_supply_finished_vacant
    FROM finished_vacant fv
    LEFT JOIN trailing_closings tc USING (cbsa_code)
    ORDER BY fv.cbsa_code;

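For unit-testing the metric layer outside the database, the same months' supply arithmetic can be mirrored in a small function. This is a sketch with a hypothetical name; it returns `None` when there are no trailing closings, matching the NULL the SQL produces in that case.

```python
from typing import Optional


def months_supply(finished_vacant: int, closings_3mo: Optional[int]) -> Optional[float]:
    """Finished vacant inventory / average monthly closings (trailing 3 months)."""
    if not closings_3mo:
        return None  # no closings in the window: supply is undefined
    return finished_vacant / (closings_3mo / 3.0)
```

For example, 300 finished vacant homes against 150 closings over the trailing three months (50 per month) gives 6.0 months of supply.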
A Houston-style national product is achievable if you explicitly separate:
The biggest strategic decision is whether your national MVP should:
Either way, the sources and connector patterns above provide a realistic, automatable path grounded in primary federal and government-adjacent systems.