Research & Methodology

How we rebuilt the population map.

The technical foundations, validation methodology, and data architecture behind Ether Data's spatio-temporal intelligence layer.

Data provenance

Government statistical records, not device signals

Every feature traces back to federal statistical programs. No mobile panels, no SDK data, and no inferred device locations.

Federal Employment Records (Origin-Destination)

U.S. Federal Statistical Program

Workplace-residence flows matched to federal data.

Federal Census Records (Community Survey)

U.S. Federal Statistical Program

Demographics, income, education, housing, and commuting priors.

Government Labor Statistics (Quarterly Census)

Federal Labor Statistics Agency

Establishment-level employment counts by industry sector.

Federal Time Use Survey

Federal Labor Statistics Agency

Time-use curves by occupation and sector for hourly presence.

Spatial architecture

Proprietary spatial grid

We use a hexagonal tiling system because hexagons tile the plane without gaps, keep every neighboring cell at the same center-to-center distance, and integrate natively with modern analytics stacks.

Resolution

~460m

Edge length per cell

Manhattan coverage

3,360

Complete borough coverage

Features per cell

1,000+

Across 12 attribute domains

Attribute domains

Core identifiers & totals
Age & sex distribution
Race & ethnicity
Education & schooling
Income, poverty & inequality
Housing stock & tenure
Housing structure types
Household & family structure
Labor force & employment
Commuting & travel behavior
Transportation & vehicles
Mobility, migration & language

Product layers

Three intelligence surfaces

Each layer builds on the previous one: presence, attraction, and hourly composition.

01

Daytime Population Proxy

Employment assignment and commute redistribution onto H3 cells.

Validation: $F_1 = 0.988$
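To make "commute redistribution" concrete, here is a minimal sketch of one way origin-destination flows could shift residents from home cells to workplace cells. The cell IDs, the counts, and the function name `daytime_population` are hypothetical, and the simple accounting (residents minus outbound commuters plus inbound commuters) is an assumption for illustration, not Ether's production method.

```python
from collections import defaultdict


def daytime_population(residents, od_flows):
    """Estimate daytime presence per cell (illustrative accounting only).

    residents: dict of cell_id -> resident count
    od_flows:  dict of (home_cell, work_cell) -> commuter count
    """
    daytime = defaultdict(float)
    for cell, count in residents.items():
        daytime[cell] += count
    for (home, work), commuters in od_flows.items():
        if home == work:
            continue  # people who work in their home cell do not move
        daytime[home] -= commuters  # commuters leave the home cell
        daytime[work] += commuters  # and are counted at the workplace cell
    return dict(daytime)


# Hypothetical inputs: two cells and three origin-destination flows.
residents = {"cell_a": 1000, "cell_b": 200}
od_flows = {
    ("cell_a", "cell_b"): 300,
    ("cell_b", "cell_a"): 50,
    ("cell_a", "cell_a"): 100,
}
result = daytime_population(residents, od_flows)
```

In this toy case the total population is conserved: people are only moved between cells, never created or removed.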

02

Ether's Gravity Model

Hub scores, commuter vectors, and economic adjacency mapping.

Structural priors: $R^2 = 0.92$
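For readers unfamiliar with gravity models, the sketch below shows the textbook form, in which the pull between two places grows with their sizes and decays with distance. It is not Ether's Gravity Model: the function name `gravity_scores`, the planar coordinates, the `mass` values, and the exponent `beta` are all assumptions chosen for illustration.

```python
import math


def gravity_scores(mass, coords, beta=2.0):
    """Textbook gravity score per cell.

    For each cell i, sums mass_i * mass_j / distance_ij ** beta
    over all other cells j. Coordinates are treated as planar.
    """
    scores = {}
    for i, (xi, yi) in coords.items():
        total = 0.0
        for j, (xj, yj) in coords.items():
            if i == j:
                continue  # a cell exerts no pull on itself in this sketch
            distance = math.hypot(xi - xj, yi - yj)
            total += mass[i] * mass[j] / distance ** beta
        scores[i] = total
    return scores


# Hypothetical cells on a line, with employment used as "mass".
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.0)}
mass = {"a": 10.0, "b": 20.0, "c": 5.0}
scores = gravity_scores(mass, coords)
hub = max(scores, key=scores.get)
```

Here the largest, most centrally located cell ends up with the highest score, which is the intuition behind a hub score.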

03

Ether's Temporal Engine - Hourly Composition

Sector-level workforce curves per cell by hour.

Lift above baseline: $\Delta R^2 = +0.1525$
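One way to picture sector-level workforce curves is to weight each sector's jobs in a cell by the share of that sector's workers present at a given hour, then normalize. The sketch below does exactly that. The sector names, job counts, presence fractions, and the function name `hourly_composition` are hypothetical, and the calculation is an assumed simplification rather than the Temporal Engine itself.

```python
def hourly_composition(sector_jobs, presence_curves, hour):
    """Share of present workers by sector for one cell at one hour.

    sector_jobs:     dict of sector -> jobs located in the cell
    presence_curves: dict of sector -> {hour: fraction of workers present}
    """
    present = {
        sector: jobs * presence_curves[sector][hour]
        for sector, jobs in sector_jobs.items()
    }
    total = sum(present.values())
    if total == 0:
        return {sector: 0.0 for sector in present}
    return {sector: count / total for sector, count in present.items()}


# Hypothetical cell with two sectors and two hours of time-use data.
sector_jobs = {"finance": 400, "food_service": 100}
presence_curves = {
    "finance": {12: 0.9, 20: 0.1},
    "food_service": {12: 0.6, 20: 0.8},
}
noon = hourly_composition(sector_jobs, presence_curves, 12)
evening = hourly_composition(sector_jobs, presence_curves, 20)
```

With these made-up numbers the same cell is dominated by finance workers at noon and by food-service workers in the evening, even though the job counts never change.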

Validation

Tested against ground truth

We validate on observed outcomes, not internal consistency alone.

7-day holdout prediction

$R^2 = 0.92$

Structural priors only

NYC temporal validation

$\Delta R^2 = +0.1525$

492,316 observations
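A lift reported as $\Delta R^2$ is the difference between the coefficient of determination of a model with the extra features and that of a baseline without them. The sketch below computes it on made-up numbers. The observed values and both prediction lists are hypothetical and do not reproduce the NYC validation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations and two sets of predictions.
observed = [10.0, 20.0, 30.0, 40.0]
baseline_pred = [20.0, 20.0, 30.0, 30.0]  # model without temporal features
temporal_pred = [10.0, 20.0, 30.0, 30.0]  # model with temporal features

r2_baseline = r_squared(observed, baseline_pred)
r2_temporal = r_squared(observed, temporal_pred)
delta_r2 = r2_temporal - r2_baseline
```

Both models are scored against the same observations, so the difference isolates what the added features contribute.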

Restaurant revenue prediction

$R^2_{\text{raw}} = 0.80$

No transaction training data

Daytime population accuracy

$F_1 = 0.988$

Validated on high-density zones
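For reference, $F_1$ is the harmonic mean of precision and recall. The sketch below computes it for a binary labeling, assuming (as an illustration only) that each zone is tagged as high-density or not. The labels are invented and the framing of the task as binary classification is an assumption, not a description of the actual validation set.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)      # true positives
    fp = sum(1 for t, p in pairs if not t and p)  # false positives
    fn = sum(1 for t, p in pairs if t and not p)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = high-density zone, 0 = not.
actual = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
f1 = f1_score(actual, predicted)
```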

Integration

SQL-first. BigQuery-native.

One join key connects Ether data to your existing spatial tables.

-- NYC cells at 12:00 local time where finance_worker_pct exceeds 0.15,
-- ranked by gravity_score (top 100).
SELECT
  cell_id,
  daytime_population,
  top_sector,
  gravity_score,
  finance_worker_pct,
  peak_hour
FROM ether.population_intelligence
WHERE metro = 'NYC'
  AND hour_local = 12
  AND finance_worker_pct > 0.15
ORDER BY gravity_score DESC
LIMIT 100;

Evaluate

Ready to evaluate?

Manhattan is available free in BigQuery. Start querying in minutes.