Data Quality & Dashboard Fixes

A comprehensive audit of the Houston dashboard identified 12 issues across data freshness, model correctness, and presentation. All were resolved in a single release.

At a Glance

  • 9 commits across 13 files
  • 48 tests pass (+1 new test added)
  • 3 categories: freshness, model, UI
  • 1 migration: FK + 2 indexes

1. Data Freshness

Pages were statically rendered at build time. Even after loading new data into Supabase, the dashboard showed stale values until the next Vercel deployment.

ISR Revalidation on All Data Pages

Problem: All five market data pages were statically cached at build time. New Supabase data (permits through Dec 2025, HPI through Q3 2025) was invisible to users.

Fix: Added export const revalidate = 3600 to each page, enabling Next.js Incremental Static Regeneration (ISR). Each page is now regenerated at most once per hour (3,600 seconds), so newly loaded data appears without a redeploy.

Pages affected:

  • /market/[cbsa] — Market Overview
  • /market/[cbsa]/permits — Permits
  • /market/[cbsa]/employment — Employment
  • /market/[cbsa]/demographics — Demographics
  • /market/[cbsa]/prices — Prices

Verification: Build output shows all market pages as ƒ (dynamic/ISR) instead of ○ (static).

2. Model & Metric Corrections

The pipeline model and several derived metrics had structural bugs that produced misleading numbers on the dashboard.

Broken “Finished Vacant” Metric Replaced

Problem: The finished_vacant_modeled metric subtracted HMDA purchase originations (~106K/year, all housing types) from new-construction completions (~18K/year, single-family only). Because the subtracted series is nearly 6× larger than the series it is subtracted from, the result was always deeply negative.

Completions (~18K/yr, SF only) − HMDA closings (~106K/yr, all types) = deeply negative (always wrong)
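
The mismatch is easy to check with the approximate annual figures quoted above. The following is a minimal sketch using those rounded numbers only; the variable names are illustrative, and the real metric is computed per quarter inside the pipeline model.

```typescript
// Approximate annual figures quoted in the text (rounded, not exact data).
const completionsPerYear = 18000;   // new-construction completions, single-family only
const hmdaClosingsPerYear = 106000; // HMDA purchase originations, all housing types

// The old metric: completions minus closings.
const finishedVacantApprox = completionsPerYear - hmdaClosingsPerYear;

// How much larger the subtracted series is than the series it is subtracted from.
const scaleMismatch = hmdaClosingsPerYear / completionsPerYear;

console.log(finishedVacantApprox);     // -88000
console.log(scaleMismatch.toFixed(1)); // 5.9
```

Any closings proxy that is several times larger than completions will produce the same sign problem, which is why the fix waits on a new-construction-only source.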

Fix: Replaced the Finished Vacant KPI card with a “Closings (HMDA)” card showing the raw quarterly purchase originations. The underlying finished_vacant_modeled column remains in the database for future work when a better closings proxy (new-construction-only HMDA filter or recorder deed transfers) becomes available.

Incomplete Trailing Quarters Dropped

Problem: When BPS data had a month count not divisible by 3 (e.g., 13 months), the pipeline model produced a final “quarter” with only 1–2 months of data. This partial quarter appeared artificially low on the dashboard.

Fix: Added a check at the end of run_quarterly():

if n_months % 3 != 0 and quarters:
    quarters.pop()

Test: New test test_incomplete_quarter_dropped verifies that 13 months produce exactly 4 full quarters, not 5.

Result: Pipeline now outputs 8 clean quarters for Houston (Jan 2024 – Dec 2025).
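
To illustrate the 13-month case, here is a small TypeScript transliteration of the guard. It is a sketch only: the pipeline itself is Python, toFullQuarters is a hypothetical stand-in for the aggregation inside run_quarterly(), and a plain sum replaces whatever the real model computes per quarter.

```typescript
// Hypothetical stand-in for the aggregation step of run_quarterly().
function toFullQuarters(monthlyUnits: number[]): number[] {
  const quarters: number[] = [];
  for (let i = 0; i < monthlyUnits.length; i += 3) {
    const chunk = monthlyUnits.slice(i, i + 3);
    quarters.push(chunk.reduce((sum, v) => sum + v, 0));
  }
  // Same guard as the fix: a trailing chunk of 1-2 months is not a quarter.
  if (monthlyUnits.length % 3 !== 0 && quarters.length > 0) {
    quarters.pop();
  }
  return quarters;
}

// Thirteen months of illustrative values (100 units each).
const thirteenMonths: number[] = [];
for (let m = 0; m < 13; m++) {
  thirteenMonths.push(100);
}

console.log(toFullQuarters(thirteenMonths)); // [ 300, 300, 300, 300 ]
```

Without the guard, the same input would yield a fifth "quarter" of 100, which is the artificially low bar described in the problem statement.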

Pipeline Query Filters by Model Version

Problem: The getPipelineQuarterly query fetched from mart_pipeline_quarterly without filtering by model_version. If multiple model runs existed, the dashboard would show duplicate quarters.

Fix: Added .eq("model_version", "v1") to the Supabase query.
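
The effect of the filter can be sketched in memory. The example below mirrors the .eq("model_version", "v1") condition with a plain array filter; the row shape, the "v2" run, and all values are assumptions for illustration, and only model_version is a column named in this report.

```typescript
interface PipelineRow {
  ref_quarter: string;   // hypothetical column name
  model_version: string; // column named in the text
  sf_starts: number;     // hypothetical column name
}

// In-memory equivalent of filtering the query on model_version.
function forModelVersion(rows: PipelineRow[], version: string): PipelineRow[] {
  return rows.filter((row) => row.model_version === version);
}

// Illustrative rows: two model runs that both cover the same two quarters.
const allRuns: PipelineRow[] = [
  { ref_quarter: "2025-Q1", model_version: "v1", sf_starts: 9000 },
  { ref_quarter: "2025-Q1", model_version: "v2", sf_starts: 9400 },
  { ref_quarter: "2025-Q2", model_version: "v1", sf_starts: 9100 },
  { ref_quarter: "2025-Q2", model_version: "v2", sf_starts: 9500 },
];

const v1Only = forModelVersion(allRuns, "v1");
console.log(allRuns.length, v1Only.length); // 4 2
```

Note that the version string is hardcoded in the fix, so a future model run would need the query (or a shared constant) updated to become visible.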

Schema Fixes: FK and Indexes

Problem: fact_bls_employment had no foreign key to dim_cbsa, and common dashboard query patterns lacked indexes.

Fix: Migration 013 adds:

  • FK constraint: fact_bls_employment.cbsa_code → dim_cbsa.cbsa_code
  • Index: (cbsa_code, ref_month) on BLS table
  • Index: (cbsa_code, index_type, ref_period) on HPI table

3. Dashboard Presentation

Several charts and labels were misleading or hard to read. These fixes improve clarity without changing the underlying data.

KPI Labels Clarified: SF vs All Permits

Problem: The pipeline model operates on units_1u (single-family only), but KPI cards said “Quarterly Permits,” “Starts,” etc. Meanwhile the “Monthly Permits” card showed units_total (all types). Users couldn’t tell which metric was which.

Before                 After                 Data Source
Quarterly Permits      SF Permits (Qtr)      units_1u
Starts (Modeled)       SF Starts (Qtr)       Pipeline model
Under Construction     SF Under Constr.      Pipeline model
Completions (Modeled)  SF Completions (Qtr)  Pipeline model
Monthly Permits        All Permits (Mo)      units_total

All numeric values also gained .toLocaleString() formatting for consistent thousand-separators.

Permits Chart: Stacked SF + Multifamily

Problem: The permits bar chart showed “Single Family” and “Total Units” as separate side-by-side bars. Since Total includes SF, this visually double-counted single-family permits.

Fix: Replaced with stacked bars showing Single Family (bottom, dark blue) and Multifamily (top, light blue), where Multifamily is computed as units_total − units_1u. The stack total equals the real total.
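
A minimal sketch of the derivation follows. The field names units_1u and units_total come from this report; ref_month on the permits rows, the output shape, and the sample values are assumptions for illustration.

```typescript
interface PermitMonth {
  ref_month: string;   // assumed name for the month key
  units_1u: number;    // single-family units
  units_total: number; // all unit types, including single-family
}

interface StackedBar {
  ref_month: string;
  singleFamily: number; // bottom segment
  multifamily: number;  // top segment, derived
}

// Split each month into two non-overlapping segments so the stack height
// equals units_total instead of counting single-family twice.
function toStackedBars(rows: PermitMonth[]): StackedBar[] {
  return rows.map((row) => ({
    ref_month: row.ref_month,
    singleFamily: row.units_1u,
    multifamily: row.units_total - row.units_1u,
  }));
}

// Illustrative values only.
const bars = toStackedBars([
  { ref_month: "2025-11", units_1u: 4000, units_total: 5500 },
  { ref_month: "2025-12", units_1u: 3800, units_total: 5000 },
]);
console.log(bars[0].multifamily, bars[1].multifamily); // 1500 1200
```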

Demographics: Partial 2020 Excluded

Problem: Census PEP 2020 data covers only April–June (3 months instead of 12). In the “Components of Change” chart, births appeared to collapse from ~90K to ~22K — a misleading artifact of the short coverage period.

Fix: Filtered 2020 from the components chart (ref_year > 2020). The population trend chart retains 2020 since the estimate itself (7.17M) is correct.

Population Change Sign Fix

Problem: The demographics page hardcoded a + prefix on YoY population change and net migration. If a metro shrank, the display would show +-1,234.

Fix: The + prefix is now prepended only when the value is non-negative; negative values keep their own minus sign.
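
One way to express this rule is a small formatting helper. This is a sketch, not the dashboard's actual code: formatSigned is a hypothetical name, and the "en-US" locale is pinned here only to make the output deterministic.

```typescript
// Prefix "+" for zero and positive values; negative values already carry "-".
function formatSigned(value: number): string {
  const formatted = value.toLocaleString("en-US");
  return value >= 0 ? "+" + formatted : formatted;
}

console.log(formatSigned(1234));  // +1,234
console.log(formatSigned(-1234)); // -1,234
console.log(formatSigned(0));     // +0
```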

Mortgage Rate Chart: 5-Year Window

Problem: The mortgage rate chart loaded all weekly FRED data back to 1971 (~2,800 data points). The chart was dense and slow, and historical rates from the 1980s weren’t relevant to current market analysis.

Fix: Added an optional startDate parameter to getFredSeries() and filtered mortgage data to 2020–present.
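
The optional-parameter pattern can be sketched as a pure filter. The real getFredSeries() presumably applies the cutoff in its database query; fromStartDate, the SeriesPoint shape, and the placeholder values below are assumptions for illustration, not real FRED observations.

```typescript
interface SeriesPoint {
  date: string;  // ISO "YYYY-MM-DD", so string comparison matches date order
  value: number;
}

// Omitting startDate keeps the previous behavior (full history).
function fromStartDate(points: SeriesPoint[], startDate?: string): SeriesPoint[] {
  const cutoff = startDate;
  if (cutoff === undefined) {
    return points;
  }
  return points.filter((p) => p.date >= cutoff);
}

// Placeholder values; only the dates matter for this example.
const weekly: SeriesPoint[] = [
  { date: "1985-01-04", value: 1 },
  { date: "2019-12-26", value: 2 },
  { date: "2020-01-02", value: 3 },
  { date: "2024-06-06", value: 4 },
];

console.log(fromStartDate(weekly, "2020-01-01").length); // 2
console.log(fromStartDate(weekly).length);               // 4
```

Keeping the parameter optional means other callers of the function are unaffected by the change.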

Y-Axis Auto-Fit on Rate Charts

Problem: Unemployment rate (3–6%) and mortgage rate (5–8%) charts forced the Y-axis to start at 0. This compressed meaningful variation into the top sliver of the chart area.

Fix: Changed Y-axis domain from [0, "auto"] to ["dataMin - 1", "auto"]. The axis now starts about one percentage point below the lowest data point instead of at 0, so rate movements use more of the chart height.
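
A rough way to see the gain is to compare how much of the axis height the data occupies under each lower bound. The sketch below uses illustrative unemployment values and an assumed upper bound of 6; the charting library may also round the axis to "nice" tick values, so the real bounds can differ slightly.

```typescript
// Fraction of the axis range covered by the data's own min-to-max spread.
function occupiedFraction(values: number[], axisMin: number, axisMax: number): number {
  const dataMin = Math.min(...values);
  const dataMax = Math.max(...values);
  return (dataMax - dataMin) / (axisMax - axisMin);
}

const unemployment = [3.5, 4.0, 4.5, 5.5]; // illustrative rates, in percent
const assumedAxisMax = 6;                  // assumed upper bound for the example

// Before: axis forced to start at 0.
const beforeFraction = occupiedFraction(unemployment, 0, assumedAxisMax);

// After: nominal lower bound of dataMin - 1.
const nominalMin = Math.min(...unemployment) - 1;
const afterFraction = occupiedFraction(unemployment, nominalMin, assumedAxisMax);

console.log(beforeFraction.toFixed(2)); // 0.33
console.log(afterFraction.toFixed(2));  // 0.57
```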

Deferred Issues

These were identified during the audit but intentionally deferred as low-priority or requiring broader design work:

Issue                                                 Reason to Defer
BLS data is NSA (not seasonally adjusted)             Annotation-only, not a data bug
HCAD query downloads all parcels for median           Performance issue, local page only
Permit-to-start weights sum to 1.0 (should be ~0.97)  Minor calibration, doesn’t change user perception
UC stock model starts from zero                       Fundamental model redesign needed
loaded_at not updated on upsert                       Observability improvement, not user-facing
Employment YoY JS timezone edge case                  Unlikely to trigger, needs investigation