CBAS Parity Analysis

The Houston CBAS presentation contains 61 distinct metrics, charts, and tables. Most are reachable — only 3 are truly proprietary. Here’s the full breakdown by what we have, what we can buy, what we can build, and what remains out of reach.

Summary

Of the 61 metrics in the Houston CBAS presentation (Land Tejas Private Consult, Feb 2026), we categorized each by what it takes for RealHouse to replicate it:

  • 8 Full parity: Federal data, automatable today
  • 6 Partial: Gaps in scope or cadence
  • 5 Local lifecycle: Permit events (TPIA/FOIA)
  • 12 Paid data: MLS, satellite, builder scraping
  • 27 Deep curation: Submarket definitions + field work
  • 3 Proprietary: Forecasts & system-level models

58 of 61 metrics are reachable with the right combination of government data requests, paid data licenses, and engineering investment. The Houston presentation is heavily weighted toward submarket corridor deep-dives (27 metrics) that require community curation and field work — labor-intensive but not technically proprietary. Only 3 metrics (CBAS forecasts and community-level system tables) have no reproducible path.
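The tier arithmetic above can be checked in a few lines. This is only a sketch: the counts are copied from the summary, while the dictionary keys and variable names are illustrative labels rather than anything defined in the RealHouse codebase.

```python
# Tier counts as stated in the summary above.
tiers = {
    "full_parity": 8,
    "partial": 6,
    "local_lifecycle": 5,
    "paid_data": 12,
    "deep_curation": 27,
    "proprietary": 3,
}

total = sum(tiers.values())
reachable = total - tiers["proprietary"]
reachable_share = reachable / total

print(total, reachable, f"{reachable_share:.0%}")  # 61 58 95%
```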

Full Parity — Federal Data

These 8 metrics can be fully replicated for any U.S. metro using standardized federal data sources. RealHouse already has connectors built for all of them.

Houston CBAS Metric | RealHouse Source | Cadence | Status
Metro population by year | Census PEP metro estimates | Annual | Built
MSA population growth rankings | Census PEP metro estimates | Annual | Built
County population growth rankings | Census PEP county estimates | Annual | Built
Components of change (births, deaths, migration) | Census PEP components of change | Annual | Built
Employment growth trend (top MSAs) | BLS LAUS (place-of-residence) | Monthly | Built
YoY employment growth (Houston) | BLS LAUS metro series | Monthly | Built
Total nonfarm payroll employment | BLS CES metro series | Monthly | Built
Mortgage rates trend | FRED (Freddie Mac PMMS 30-yr) | Weekly | Built

Data sources used
  • Census PEP: Population Estimates Program. Annual metro and county population totals plus components of change. Connector: packages/ingest/src/connectors/census_pop.py
  • BLS LAUS/CES: Local Area Unemployment Statistics and Current Employment Statistics. Monthly metro employment, unemployment, payroll. Connector: packages/ingest/src/connectors/bls.py
  • FRED: Federal Reserve Economic Data. Mortgage rate series (PMMS) and other macro indicators. Connector: packages/ingest/src/connectors/fred.py
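As an illustration of how little is involved in the FRED piece, the sketch below builds a request URL for the weekly 30-year mortgage rate series. It is not the code in `fred.py`, which is not shown here. It assumes the public FRED `series/observations` endpoint and the series ID `MORTGAGE30US`; the function name, default start date, and placeholder API key are hypothetical.

```python
from urllib.parse import urlencode

FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"

def build_pmms_request(api_key, start="2020-01-01"):
    """Build a FRED observations URL for the 30-year PMMS series."""
    params = {
        "series_id": "MORTGAGE30US",   # assumed ID for the PMMS 30-yr fixed average
        "api_key": api_key,            # caller supplies a real FRED API key
        "file_type": "json",
        "observation_start": start,
    }
    return f"{FRED_OBSERVATIONS_URL}?{urlencode(params)}"

# Placeholder key for illustration only; no request is sent here.
url = build_pmms_request("YOUR_API_KEY")
```

The URL is only constructed, not fetched, so the sketch runs without network access or credentials.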

Partial Parity — Gaps in Scope or Cadence

These 6 metrics can be partially replicated, but the public data source has a meaningful gap compared to what CBAS shows.

Houston CBAS Metric | RealHouse Source | What’s Missing
Employment by sector (table) | BLS CES industry series | Some Houston-specific sector detail (medical/energy) may require Texas Workforce Commission data, which has inconsistent availability.
Annual employment growth by sector | BLS CES industry series | Same as above; the sector taxonomy may not perfectly match CBAS’s presentation groupings.
Annual new-home closings (financed) | HMDA purchase originations | Only covers mortgage-financed purchases. Misses all-cash transactions (significant in the new-home market). Annual cadence, not quarterly.
Quarterly closings KPI | HMDA purchase originations | HMDA is an annual bulk release, not quarterly. Would need recorder deed data for quarterly granularity.
Consumer confidence | FRED alternative sentiment series | The Conference Board index requires a paid subscription. Public alternatives (U. Michigan consumer sentiment via FRED) exist but are different measures.
SF permits vs estimated annual starts | Census BPS (permits only) | We have the permits bar chart. The “estimated starts” overlay line is what CBAS adds; that requires either local lifecycle data or a model.

Needs Local Lifecycle Data

These 5 metrics are the core of the “new-home operational layer” — they track the construction pipeline from start through completion and sale. No federal source publishes these at the metro level. To produce them, you need actual permit inspection events from local jurisdictions (foundation inspection = start; certificate of occupancy = completion).

The pipeline gap

Federal data gives us permits (how many were authorized) but not starts (how many actually broke ground), under construction (how many are in progress), or completions (how many finished). The Census Survey of Construction (SOC) publishes these concepts nationally and regionally, but not at the metro level. CBAS fills this gap with field-verified lifecycle tracking.

Houston CBAS Metric | What We’d Need | Current Status
Annual / quarterly new-home starts | Foundation/slab inspection dates from local permit portals | TPIA filed
Homes under construction (stock) | Start dates + completion dates to compute stock-flow: UC = started − completed | TPIA filed
Completions (quarterly) | Certificate of occupancy (CO) or final inspection dates from local permit portals | TPIA filed
Finished vacant inventory | Completions (CO dates) + sales signal (recorder deeds or HMDA) to compute: completed but not yet sold | Phase 3
Finished vacant months’ supply | Derived: finished vacant inventory ÷ trailing monthly closings | Phase 3

Houston-specific path to these metrics

For Houston, we’ve filed a TPIA (Texas Public Information Act) request with the Houston Permitting Center for bulk historical permit and inspection data. If fulfilled, this gives us foundation inspection dates (starts) and CO dates (completions) for the City of Houston jurisdiction. Harris County ePermits covers unincorporated areas.

For other metros, the equivalent approach is to find open-data portals (ArcGIS, Socrata) or file equivalent public records requests in each jurisdiction.
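The lifecycle metrics in this section reduce to simple stock-flow counts once each unit has a start date, a completion date, and a sale date. The following is a minimal sketch under that assumption; the `Unit` record, the function names, and the sample dates are all hypothetical, and the 12-month trailing window is an illustrative choice.

```python
from datetime import date
from typing import NamedTuple, Optional

class Unit(NamedTuple):
    start: date                # foundation inspection date (start)
    completed: Optional[date]  # CO / final inspection date, if any
    sold: Optional[date]       # deed recording or other sale signal, if any

def under_construction(units, as_of):
    """Units started by as_of and not yet completed by as_of."""
    return sum(
        1 for u in units
        if u.start <= as_of and (u.completed is None or u.completed > as_of)
    )

def finished_vacant(units, as_of):
    """Units completed by as_of but not yet sold by as_of."""
    return sum(
        1 for u in units
        if u.completed is not None and u.completed <= as_of
        and (u.sold is None or u.sold > as_of)
    )

def months_supply(inventory, trailing_closings, months=12):
    """Inventory divided by average monthly closings over the window."""
    monthly_rate = trailing_closings / months
    return inventory / monthly_rate if monthly_rate else float("inf")

# Made-up units for illustration.
units = [
    Unit(date(2025, 1, 10), date(2025, 6, 1), date(2025, 7, 15)),  # sold
    Unit(date(2025, 3, 5), date(2025, 9, 20), None),               # finished, unsold
    Unit(date(2025, 8, 1), None, None),                            # still building
]
as_of = date(2025, 10, 1)
uc = under_construction(units, as_of)   # 1
fv = finished_vacant(units, as_of)      # 1
```

Evaluating the counts as of a date, rather than subtracting cumulative totals, keeps the stock correct when permit records arrive late or out of order.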

Reachable with Paid Data & Engineering

These 12 metrics have no free public data source, but each can be replicated by purchasing a data license, building a scraping pipeline, or both. None require field research or manual curation — they are technical problems with known commercial solutions.

MLS / Resale Market (3 metrics) — MLS license
CBAS Metric | Data Source | Path to Parity
MLS resale snapshot (sales, prices, DOM, active listings, sale/list ratio) | HAR / RESO Web API | License a data feed from the local REALTOR association ($8–15k/yr) or use a national aggregator like CoreLogic/ATTOM.
Resale closings vs active inventory time series | HAR / RESO Web API | Same MLS license: a time-series view of the same data.
Median new-home price + price per SF over time | MLS new-construction filter | MLS listings tagged as new construction provide list price proxies. Not identical to CBAS base prices but close.

Cost: $8–15k/yr per metro for RESO Web API access. ~100% parity for resale metrics; close proxy for new-home pricing.

Lot Pipeline — VDL & Future Lots (4 metrics) — satellite + parcel GIS
CBAS Metric | Data Source | Path to Parity
Vacant developed lots (VDL) count | Satellite imagery + parcel GIS | Detect cleared/graded lots without vertical construction via temporal change analysis on monthly satellite imagery (Planet Labs / Maxar). Cross-reference with parcel GIS for lot boundaries.
VDL months’ supply | Derived: VDL ÷ trailing starts | Once VDL count and starts are available, this is a straightforward computation.
Future lots count | Plat records + engineering permits | Track subdivision plat applications from Houston Planning Department reports. Supplement with engineering permit data for infrastructure work.
Future lots with active site work | Satellite imagery | Month-over-month satellite comparison detects clearing, grading, utility trenching, and paving activity on platted lots.

Cost: $15–50k/yr for satellite imagery + free–$5k for parcel GIS. ~70–80% parity; residual gap is underground utilities and ownership intent that satellite can’t detect.

Pricing & Builder Analytics (5 metrics) — builder scraping
CBAS Metric | Data Source | Path to Parity
Floorplan QoQ price direction stats | Builder website scraping | Daily/weekly scrape of top builder websites for floorplan catalogs with base prices. Top 10–20 builders cover ~60% of metro production.
Starts/closings by base price band | Builder scraping + lifecycle data | Link permit lifecycle events to floorplan base prices from scraped catalog. Requires builder name normalization.
Finished vacant by base price band | Builder scraping + lifecycle + transactions | Combines price catalog, completion dates (CO), and sale signals (recorder/HMDA) for inventory by price tier.
Cheapest floorplan profile | Builder website scraping | Rank active floorplans by current base price from scraped catalog data.
Most expensive floorplan profiles | Builder website scraping | Same, at the opposite end of the price ranking.

Cost: $10–30k/yr for scraping infrastructure + direct feeds. ~85% parity; residual gap is distinguishing genuine price cuts from incentive changes (~20–30% of observed “price changes”).

Deep Curation & Field Work

These 27 metrics are the corridor and submarket deep-dives that form the bulk of the CBAS presentation. They are technically reachable — every input is obtainable through data described in the tiers above — but they require significant manual curation to define community boundaries, normalize builder rosters, and validate lot status through field sampling.

Submarket & Corridor Breakdowns (27 metrics)

The Houston presentation includes detailed corridor deep-dives (288 South, Grand Magnolia, River Ranch, Lago Mar East) with the following metric template repeated per corridor:

  • Starts/closings by market area, lot size, and price band
  • Finished vacant inventory and months’ supply by market area and lot size
  • High-volume subdivision counts and maps
  • Top communities ranked by starts and closings
  • Top builders by starts, closings, and market share
  • VDL by market area and lot size
  • Future lots by market area, condition, and planned subdivisions
  • Submarket maps with community locations

What’s needed to reproduce these:

  • All data from tiers above — permit lifecycle (starts/completions), parcel GIS (lot sizing), transaction data (closings), satellite (VDL/site work), builder scraping (price bands)
  • Submarket polygon definitions — define corridor/market-area boundaries using ZIP clusters, county subareas, or planning-department designations ($10–25k labor per metro)
  • Community/subdivision reference table — normalize subdivision names from plat records, link to geographic polygons, and maintain a builder-to-community roster
  • Builder name normalization — permit applicant/contractor field is ~60–70% accurate as a builder proxy; manual curation closes the gap
  • Field validation (optional but improves accuracy) — 2–4 field specialists sample 20–30% of VDL inventory quarterly ($30–100k/yr per metro)

Cost: $10–25k one-time + $35–115k/yr per metro (with field sampling) or $10–25k one-time + $5–15k/yr (without field sampling). ~70% parity without field work; ~85% with it.
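Once submarket polygons exist, slicing any geocoded metric by corridor is a point-in-polygon lookup. The sketch below uses a plain ray-casting test so it has no dependencies; a production pipeline would more likely use a GIS library. The corridor names and coordinates are made-up placeholders, not real Houston boundaries.

```python
from collections import Counter

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def assign_submarket(lon, lat, submarkets):
    """Return the first submarket whose polygon contains the point."""
    for name, polygon in submarkets.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None  # outside every defined corridor

# Hypothetical corridors on an abstract coordinate grid.
submarkets = {
    "corridor_a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "corridor_b": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
# Hypothetical geocoded start locations.
starts = [(1.0, 1.0), (3.0, 1.0), (3.5, 0.5), (5.0, 5.0)]
starts_by_submarket = Counter(
    assign_submarket(lon, lat, submarkets) for lon, lat in starts
)
```

Points that fall outside every polygon are counted under `None`, which makes gaps in the curated boundaries visible instead of silently dropping permits.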

Truly Proprietary

Only 3 of the 61 metrics (5%) have no reproducible path. These rely on CBAS’s proprietary forecast models, builder survey relationships, and a fully integrated community-level tracking system built over 10+ years.

CBAS Metric | Why Proprietary
2026 employment growth forecast by sector | Sourced from Greater Houston Partnership with proprietary methodology. Public alternatives exist (BLS projections, Moody’s, Oxford Economics) but produce different numbers and aren’t CBAS-equivalent.
Houston SF starts forecast range (2025, 2026) | CBAS’s own proprietary forecast model calibrated to 10+ years of Houston field data. We could build a competing model, but it would take 2–3 years of backtesting to validate.
Community-level quarterly metrics tables | Not a single data source but the output of the entire CBAS platform: community database, lifecycle tracking, builder attribution, and quarterly field verification combined into a production-grade reporting system.

Closing the Gap: Data Acquisition Playbook

The 44 metrics beyond our current federal + partial tiers break down into concrete data acquisitions, each with known costs and timelines. Here’s the investment-by-investment playbook — ordered from highest leverage to most specialized.

1. MLS License (RESO Web API)

Cost: $8–15k/yr per metro. Timeline: 2–4 weeks.

License a data feed from the local REALTOR association (HAR in Houston) via the RESO Web API standard. Alternatively, use a national aggregator like CoreLogic or ATTOM.

Unlocks:

  • Resale snapshot — annual/monthly sales, prices, days on market, active listings, sale-to-list ratio
  • Resale closings vs active inventory time series
  • Median new-home price + price/SF — via new-construction filter on MLS listings
~100% parity for resale metrics; close proxy for new-home pricing

Residual gap: MLS new-construction tagging varies by market. List price ≠ base price (CBAS tracks base prices before incentives).
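To make the resale snapshot concrete, the sketch below computes it from a list of listing records. It assumes the feed exposes the RESO Data Dictionary field names `StandardStatus`, `ClosePrice`, `ListPrice`, and `DaysOnMarket`; the actual HAR feed may differ. The sample records are invented, and the sale-to-list ratio is computed here as total close price over total list price, which is one of several reasonable definitions.

```python
from statistics import median

def resale_snapshot(listings):
    """Summarize closed and active listings into snapshot KPIs."""
    closed = [r for r in listings if r["StandardStatus"] == "Closed"]
    active = [r for r in listings if r["StandardStatus"] == "Active"]
    snapshot = {
        "closed_sales": len(closed),
        "active_listings": len(active),
        "median_close_price": None,
        "median_dom": None,
        "sale_to_list": None,
    }
    if closed:  # medians and ratios are undefined with no closed sales
        snapshot["median_close_price"] = median(r["ClosePrice"] for r in closed)
        snapshot["median_dom"] = median(r["DaysOnMarket"] for r in closed)
        snapshot["sale_to_list"] = (
            sum(r["ClosePrice"] for r in closed)
            / sum(r["ListPrice"] for r in closed)
        )
    return snapshot

# Invented records for illustration.
listings = [
    {"StandardStatus": "Closed", "ClosePrice": 300_000, "ListPrice": 310_000, "DaysOnMarket": 30},
    {"StandardStatus": "Closed", "ClosePrice": 400_000, "ListPrice": 400_000, "DaysOnMarket": 50},
    {"StandardStatus": "Closed", "ClosePrice": 350_000, "ListPrice": 340_000, "DaysOnMarket": 10},
    {"StandardStatus": "Active", "ClosePrice": None, "ListPrice": 375_000, "DaysOnMarket": 12},
]
snapshot = resale_snapshot(listings)
```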

2. Local Permit Lifecycle Data

Cost: $15–30k one-time per metro. Ongoing: $5–15k/yr.

Obtain bulk permit + inspection event data from local jurisdictions via TPIA/FOIA requests or negotiate API access to platforms like Accela, Tyler, or OpenGov. For many metros, open-data portals (ArcGIS, Socrata) provide this data for free.

Unlocks:

  • Quarterly/annual starts — foundation inspection date = start per Census SOC definition
  • Under construction stock — started but not yet completed
  • Completions — CO / final inspection = completion
  • Finished vacant inventory — completed minus sold (combined with recorder/HMDA data)
  • Finished vacant months’ supply (derived)
  • Starts/closings by geography — once geocoded, can aggregate to any polygon
  • Builder attribution — permit applicant/contractor field as proxy (~60–70% accurate; needs normalization)
~95% parity for pipeline KPIs — the single biggest unlock

Residual gap: Inspection event definitions vary by jurisdiction (Houston uses “Foundation”; another metro may use “Footing” or “Slab”). Normalization required per metro.
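The per-metro normalization described above could look roughly like the following. This is a sketch, not the ingest code: the alias sets only cover the labels mentioned in this document, each jurisdiction's real inspection vocabulary would need to be confirmed against its data, and the permit IDs and dates are invented.

```python
from datetime import date

# Hypothetical alias sets; extend per jurisdiction after inspecting real data.
START_TYPES = {"foundation", "footing", "slab"}
COMPLETION_TYPES = {"certificate of occupancy", "final inspection"}

def classify_event(raw_type):
    """Map a raw inspection label to 'start', 'completion', or None."""
    key = " ".join(raw_type.lower().split())  # lowercase, collapse whitespace
    if key in START_TYPES:
        return "start"
    if key in COMPLETION_TYPES:
        return "completion"
    return None

def lifecycle_dates(events):
    """Earliest start and completion date per permit.

    events: iterable of (permit_id, raw_type, event_date) tuples.
    """
    result = {}
    for permit_id, raw_type, event_date in events:
        stage = classify_event(raw_type)
        if stage is None:
            continue  # unrelated inspection (plumbing, electrical, ...)
        dates = result.setdefault(permit_id, {})
        if stage not in dates or event_date < dates[stage]:
            dates[stage] = event_date
    return result

# Invented inspection events for illustration.
events = [
    ("P-1", "Foundation", date(2025, 2, 3)),
    ("P-1", "Plumbing Rough", date(2025, 3, 1)),
    ("P-1", "Certificate of Occupancy", date(2025, 8, 15)),
    ("P-2", "  SLAB ", date(2025, 4, 10)),
]
dates_by_permit = lifecycle_dates(events)
```

Taking the earliest matching event per stage guards against re-inspections pushing a start or completion date later than it really was.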

3. Property Transaction Data (ATTOM / CoreLogic)

Cost: $20–40k/yr. Alt: $5–10k per county TPIA.

Commercial property data providers like ATTOM Data and CoreLogic maintain national deed/transaction databases that consolidate county recorder data. Alternatively, file TPIA requests directly with county clerk offices for bulk deed exports.

Unlocks:

  • Total new-home closings (financed + cash) — HMDA covers ~70% (mortgaged); recorder adds the remaining cash transactions
  • Quarterly closing granularity — recorder has daily recording dates vs HMDA’s annual bulk release
  • Sale date linkage for finished-vacant computation — “completed but not yet sold” requires knowing when each unit sold
~90% parity for closing metrics

Residual gap: Texas does not mandate sale price disclosure on deeds, so Houston recorder data often has blank price fields. Must cross-reference with HMDA or MLS for price. Some cash transfers lack clear “new construction” flags.
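The quarterly granularity that recorder data adds comes from bucketing daily recording dates. A minimal sketch, assuming deed records have already been filtered to new-home sales; the function names and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def quarter_label(d):
    """Calendar quarter label such as '2025Q1'."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"

def quarterly_closings(recording_dates):
    """Count deed recordings per calendar quarter, in chronological order."""
    counts = Counter(quarter_label(d) for d in recording_dates)
    return dict(sorted(counts.items()))

# Invented recording dates for illustration.
recording_dates = [
    date(2025, 1, 15),
    date(2025, 3, 31),
    date(2025, 4, 1),
    date(2025, 12, 31),
]
closings = quarterly_closings(recording_dates)
# {'2025Q1': 2, '2025Q2': 1, '2025Q4': 1}
```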

4. County Assessor Parcel GIS

Cost: Free–$5k per county. Timeline: 2–8 weeks.

Most county assessor offices publish parcel GIS data with lot boundaries, frontage/area dimensions, land use codes, and ownership. HCAD (Harris County) offers quarterly GIS downloads. This is nearly always free or very cheap.

Unlocks:

  • Lot-size segmentation — parcel frontage/area fields enable the 40’/50’/60’ lot-width bands CBAS uses
  • Starts/closings by lot size — spatial join permits to parcels, then bin by lot width
  • Finished vacant by lot size — same join applied to inventory computation
  • Parcel-level joins for all metrics — the backbone for connecting permits, deeds, and inspections to physical lots
~95% parity for lot-size segmentation

Residual gap: ~5% of parcels have missing/incomplete frontage fields in some counties. Quality varies.
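Lot-size segmentation is a binning step on the parcel frontage field. The sketch below shows one way to do it. The cutoffs (under 45 ft, 45 to under 55 ft, 55 ft and up) are assumptions chosen to center the 40’/50’/60’ bands; CBAS’s actual band edges are not given in the presentation. Parcels with missing frontage are kept in an explicit "unknown" bucket.

```python
from collections import Counter

def lot_width_band(frontage_ft):
    """Assign a parcel to a nominal lot-width band (assumed cutoffs)."""
    if frontage_ft is None or frontage_ft <= 0:
        return "unknown"   # missing or invalid frontage field
    if frontage_ft < 45:
        return "40'"
    if frontage_ft < 55:
        return "50'"
    return "60'+"

# Invented frontage values (feet) for parcels already joined to starts.
frontages = [41.0, 50.0, 52.5, 62.0, None]
starts_by_band = Counter(lot_width_band(f) for f in frontages)
```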

5. Satellite Imagery (Planet Labs / Maxar / Nearmap)

Cost: $15–50k/yr. Timeline: 4–12 weeks.

Monthly satellite imagery (3m resolution from Planet Labs, or higher from Maxar/Nearmap) enables detection of ground disturbance, site clearing, and construction activity through temporal change analysis. This is the closest substitute for CBAS’s field verification of lot pipeline status.

Unlocks:

  • VDL detection — identify cleared/graded lots that haven’t started vertical construction
  • Future lots with active site work — detect clearing, grading, utility trenching, paving via month-over-month image comparison
  • Construction progress — supplement permit data with visual confirmation of starts and completions
~70–80% parity for lot pipeline metrics

Residual gap: Satellite can detect visible surface changes but misses underground utilities, ownership intent, liens/encumbrances, and lot “readiness to build” status. False positives from non-residential earthwork. Monthly cadence means 2–4 week lag.

6. Builder Website Scraping + Direct Feeds

Cost: $10–30k/yr. Timeline: 8–20 weeks.

Maintain a daily/weekly scrape of builder websites for floorplan catalogs (base prices, specs, community assignments). Supplement with direct data feeds negotiated with major public builders (Lennar, Pulte, Toll Brothers, etc.). The top 10–20 builders typically cover ~60% of metro production volume.

Unlocks:

  • Floorplan QoQ price direction — detect base price changes across the catalog
  • Cheapest/most expensive floorplan profiles — rank active plans by current base price
  • Starts/closings by base price band — link lifecycle events to floorplan base prices
  • Finished vacant by price band
~85% parity for pricing & builder analytics

Residual gap: Base price ≠ list price. CBAS distinguishes genuine price cuts from incentive changes — scraped data conflates them (~20–30% of observed “price changes” are actually incentive shifts). Smaller builders often don’t list base prices publicly. Plan ID normalization when builders rename/retire plans is manual.
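The QoQ price direction stats come from comparing two catalog snapshots. A minimal sketch, assuming each snapshot is a mapping from a stable plan ID to base price; the plan IDs and prices are invented. It does not attempt to separate genuine price cuts from incentive changes, which is the residual gap noted above.

```python
def price_direction_stats(prev, curr):
    """Count plans whose base price rose, fell, or held between snapshots.

    prev, curr: dicts of {plan_id: base_price}. Only plans present in
    both snapshots are compared; new and retired plans are ignored.
    """
    common = prev.keys() & curr.keys()
    counts = {"up": 0, "down": 0, "flat": 0}
    for plan in common:
        if curr[plan] > prev[plan]:
            counts["up"] += 1
        elif curr[plan] < prev[plan]:
            counts["down"] += 1
        else:
            counts["flat"] += 1
    n = len(common)
    shares = {k: (v / n if n else 0.0) for k, v in counts.items()}
    return counts, shares

# Invented snapshots for illustration.
prev_quarter = {"A": 300_000, "B": 350_000, "C": 400_000, "D": 280_000}
curr_quarter = {"A": 305_000, "B": 340_000, "C": 400_000, "E": 500_000}
counts, shares = price_direction_stats(prev_quarter, curr_quarter)
```

Restricting the comparison to plans present in both quarters avoids treating a renamed or retired plan as a price move, at the cost of depending on the manual plan ID normalization.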

7. Community & Submarket Definition

Cost: $10–25k labor per metro. Timeline: 8–20 weeks.

Define communities from plat polygons + permit text field parsing + builder roster curation. Define submarkets/corridors as ZIP code clusters, county subareas, or planning-department market areas. Validate against 2+ years of permit/sales data. This is labor, not a data license — someone has to curate and maintain the reference tables.

Unlocks:

  • Top communities by starts/closings — aggregate permits to community polygons, rank
  • Top builders by market share — permit applicant + manual builder roster
  • All corridor/submarket breakdowns — once polygons defined, any geocoded metric can be sliced by submarket
  • Submarket maps with community locations
~70% parity for corridor & community analytics

Residual gap: CBAS submarket/corridor boundaries are proprietary and informed by market knowledge (traffic patterns, school districts, builder focus areas). Our ZIP-based or plat-based approximations will differ. Community names are marketing constructs that don’t always match legal subdivision boundaries. Builder attribution from permit applicant fields is ~60–70% accurate (GC ≠ builder).
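Builder attribution from permit applicant fields usually means a normalization step plus a curated alias table. The sketch below shows the shape of that step. The alias table and the raw applicant strings are hypothetical examples, and unmatched names return `None` so they can be queued for manual review.

```python
import re
from collections import Counter

# Legal-entity tokens dropped before matching.
LEGAL_TOKENS = {"llc", "inc", "lp", "ltd", "corp", "co"}

# Hypothetical alias table: normalized key -> canonical builder name.
BUILDER_ALIASES = {
    "lennar homes of texas": "Lennar",
    "lennar": "Lennar",
    "pulte homes of texas": "Pulte",
    "pulte homes": "Pulte",
}

def normalize_key(raw):
    """Lowercase, strip punctuation, and drop legal-entity tokens."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", raw.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_TOKENS]
    return " ".join(tokens)

def canonical_builder(raw):
    """Return the curated builder name, or None if the name is unmatched."""
    return BUILDER_ALIASES.get(normalize_key(raw))

# Invented permit applicant strings for illustration.
applicants = [
    "LENNAR HOMES OF TEXAS LLC",
    "Pulte Homes of Texas, L.P.",
    "Lennar",
    "ABC General Contracting Inc",
]
starts_by_builder = Counter(canonical_builder(a) for a in applicants)
match_rate = 1 - starts_by_builder[None] / len(applicants)
```

Tracking the match rate gives a running measure of how much of the roster still needs manual curation.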

8. Field Research & Validation

Cost: $30–100k/yr per metro. Timeline: Ongoing quarterly.

Contract 2–4 field specialists per metro to sample 20–30% of VDL inventory quarterly and validate lot status, site-work conditions, and community boundaries. This is what CBAS does — “visually verified” is their core differentiator.

Unlocks:

  • VDL field verification — confirm that “appears buildable” lots are genuinely ready to build (utilities connected, no liens, owner willing)
  • Site-work condition taxonomy — standardize lot status beyond what satellite/permits show (raw, cleared, utilities, paving)
  • Community/builder ground truth — validate which subdivisions are active, which builders are present, model home counts
Closes the last ~15–20% gap on VDL/lot pipeline metrics

What remains: Even with field sampling, ~10–15% of lot inventory status changes between sampling visits. CBAS likely has higher cadence and deeper builder relationships built over years.

Investment Summary

Total estimated cost to reach ~80–85% CBAS parity for a single metro (Houston), organized by data acquisition:

Data Acquisition | One-Time | Annual | Metrics Unlocked | Parity
Federal spine (existing) | - | - | 14 (8 full + 6 partial) | Done
Permit lifecycle (TPIA/API) | $15–30k | $5–15k | ~10 (pipeline + geo slicing) | ~95%
Parcel GIS (assessor download) | Free–$5k | - | ~4 (lot-size segmentation) | ~95%
Transaction data (ATTOM/recorder) | $5–10k | $20–40k | ~3 (total closings + quarterly) | ~90%
MLS license (HAR / RESO) | - | $8–15k | ~3 (resale + new-home pricing) | ~100%
Builder scraping + feeds | $10–20k | $10–30k | ~5 (pricing, floorplan catalog) | ~85%
Satellite imagery (Planet/Maxar) | - | $15–50k | ~4 (VDL, site work detection) | ~75%
Community/submarket curation | $10–25k | $5–15k | ~12 (corridors, rankings, maps) | ~70%
Field research (sampling) | - | $30–100k | ~6 (VDL validation, conditions) | ~80%
Total (first metro) | $40–90k | $93–265k | ~47 additional metrics | ~80–85%

Subsequent metros cost ~60–70% of the first (reuse connector code, scraping infrastructure, and satellite processing pipelines; marginal cost is mainly per-metro TPIA/API negotiation and community curation labor).
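The totals in the table can be reproduced by summing the line items, with each figure assigned to the one-time or annual column as the per-section cost lines in the playbook indicate. The sketch below does that and then applies the ~60–70% factor for a subsequent metro. Applying the factor to first-year cost (one-time plus annual) is an assumption; the text does not say which cost base it refers to.

```python
# (one_time_low, one_time_high, annual_low, annual_high), in $k.
line_items = {
    "permit_lifecycle":   (15, 30, 5, 15),
    "parcel_gis":         (0, 5, 0, 0),
    "transaction_data":   (5, 10, 20, 40),
    "mls_license":        (0, 0, 8, 15),
    "builder_scraping":   (10, 20, 10, 30),
    "satellite_imagery":  (0, 0, 15, 50),
    "community_curation": (10, 25, 5, 15),
    "field_research":     (0, 0, 30, 100),
}

# Column-wise sums across all line items.
totals = tuple(sum(col) for col in zip(*line_items.values()))
one_time_low, one_time_high, annual_low, annual_high = totals
print(totals)  # (40, 90, 93, 265)

# First-year cost for the first metro, then an assumed 60-70% for the next.
first_year_low = one_time_low + annual_low
first_year_high = one_time_high + annual_high
next_metro_low = 0.60 * first_year_low
next_metro_high = 0.70 * first_year_high
```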

The Accuracy Gap (Why ~80–85%, Not 100%)

Even where we can produce the same metric, our version won’t always match CBAS’s precision. These are the systematic accuracy gaps that investment alone doesn’t fully close:

Key Takeaway

  • Today: Federal data only. 14 metrics, done
  • + Local Lifecycle: TPIA + parcel GIS. +5 metrics, ~$20–35k
  • + Paid Data: MLS, satellite, builder scraping. +12 metrics, ~$50–130k/yr
  • + Curation: Submarket definitions, field validation. +27 metrics, ~$40–125k/yr

The critical insight: permit lifecycle data is the single biggest unlock. For a one-time investment of ~$20–35k (TPIA + parcel GIS), we go from 14 metrics to 19, covering the full construction pipeline (starts, UC, completions, finished vacant) that federal data alone cannot provide.

Adding paid data sources (MLS, satellite, builder scraping) pushes coverage to 31 metrics. The remaining 27 corridor/submarket metrics all become available once community boundaries and builder rosters are curated — labor-intensive work, but every input is obtainable.

Bottom line: 58 of 61 metrics (95%) are reachable. With ~$40–90k one-time and ~$93–265k/year, a single metro can reach ~80–85% parity with CBAS. Only 3 metrics (proprietary forecasts and system-level community tables) have no reproducible path. The gap is investment, not impossibility.