Research & Methodology

How we rebuilt the population map.

The technical foundations, validation methodology, and data architecture behind Ether Data's spatio-temporal intelligence layer.

Data provenance

Government statistical records, not device signals

Every feature traces back to federal statistical programs. No mobile panels, no SDK data, and no inferred device locations.

Federal Employment Records (Origin-Destination)

U.S. Federal Statistical Program

Workplace-residence flows matched to federal data.

Federal Census Records (Community Survey)

U.S. Federal Statistical Program

Demographics, income, education, housing, and commuting priors.

Government Labor Statistics (Quarterly Census)

Federal Labor Statistics Agency

Establishment-level employment counts by industry sector.

Federal Time Use Survey

Federal Labor Statistics Agency

Time-use curves by occupation and sector for hourly presence.

Spatial architecture

Proprietary spatial grid

We use a hexagonal tiling system because hexagons tile the plane without gaps, every neighboring cell sits at the same center-to-center distance, and hexagonal cell IDs integrate natively with modern analytics stacks.
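The uniform-neighbor-distance property can be checked with a short planar sketch. This is an idealized flat-plane model using axial coordinates, not the production grid: the function names are hypothetical, and the 460 m edge length is taken from the resolution figure on this page.

```python
import math

# Axial (q, r) offsets of the six neighbors of any hexagon.
HEX_NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_center(q, r, edge_m):
    """Planar center (x, y), in meters, of a pointy-top hexagon at axial (q, r)."""
    x = edge_m * math.sqrt(3) * (q + r / 2)
    y = edge_m * 1.5 * r
    return x, y


def neighbor_distances(q, r, edge_m):
    """Center-to-center distance from cell (q, r) to each of its six neighbors."""
    center = hex_center(q, r, edge_m)
    return [
        math.dist(center, hex_center(q + dq, r + dr, edge_m))
        for dq, dr in HEX_NEIGHBORS
    ]


def hex_area_m2(edge_m):
    """Area of a regular hexagon with the given edge length."""
    return 3 * math.sqrt(3) / 2 * edge_m ** 2


# Assumption: a flat plane and an edge length of 460 m.
distances = neighbor_distances(0, 0, 460.0)
area = hex_area_m2(460.0)
```

All six distances equal the edge length times the square root of three, roughly 797 m here. On a square grid, by contrast, diagonal neighbors sit about 41% farther away than edge neighbors, which complicates distance-weighted aggregation.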

Resolution

~460m

Edge length per cell

Manhattan coverage

3,360

Complete borough coverage

Features per cell

1,000+

Across 12 attribute domains

Attribute domains

Core identifiers & totals

Age & sex distribution

Race & ethnicity

Education & schooling

Income, poverty & inequality

Housing stock & tenure

Housing structure types

Household & family structure

Labor force & employment

Commuting & travel behavior

Transportation & vehicles

Mobility, migration & language

Product layers

Three intelligence surfaces

Each layer builds on the previous one: presence, attraction, and hourly composition.

Daytime Population Proxy

Employment assignment and commute redistribution onto H3 cells.

Validation: $F_{1}=0.988$
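As a rough illustration of commute redistribution, the sketch below moves workers from their home cell to their work cell using origin-destination flows. It is a simplified accounting under stated assumptions (every commuter is away from home during the day, and the cell IDs and counts are made up), not a description of Ether's production pipeline.

```python
from collections import defaultdict


def daytime_population(residents, od_flows):
    """Redistribute commuters from home cells to work cells.

    residents: {cell_id: resident count}
    od_flows:  iterable of (home_cell, work_cell, workers)
    """
    daytime = defaultdict(float, residents)
    for home, work, workers in od_flows:
        daytime[home] -= workers  # commuter leaves the home cell
        daytime[work] += workers  # and is counted at the workplace cell
    return dict(daytime)


# Hypothetical toy data: two cells and three commute flows.
residents = {"cell_a": 1000, "cell_b": 200}
od_flows = [
    ("cell_a", "cell_b", 300),
    ("cell_b", "cell_a", 50),
    ("cell_a", "cell_a", 100),  # lives and works in the same cell: no net change
]
result = daytime_population(residents, od_flows)
```

The total population is conserved: people are moved between cells, never created or dropped.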

Ether's Gravity Model

Hub scores, commuter vectors, and economic adjacency mapping.

Structural priors: $R^2=0.92$
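For readers unfamiliar with gravity models, here is a minimal sketch of the classic formulation, in which interaction between two cells grows with their "mass" (for example, job counts) and decays with distance. The hub score used here (the sum of a cell's pairwise interactions), the exponent, and all values are assumptions for illustration; this page does not specify Ether's actual functional form.

```python
def gravity_flow(mass_i, mass_j, distance, beta=2.0, k=1.0):
    """Classic gravity interaction: k * m_i * m_j / d^beta."""
    return k * mass_i * mass_j / distance ** beta


def hub_scores(masses, distances, beta=2.0):
    """Sum each cell's gravity interaction with every other cell.

    masses:    {cell_id: mass}
    distances: {frozenset({cell_i, cell_j}): distance}, symmetric by construction
    """
    scores = {}
    for i in masses:
        scores[i] = sum(
            gravity_flow(masses[i], masses[j], distances[frozenset((i, j))], beta)
            for j in masses
            if j != i
        )
    return scores


# Hypothetical toy data: masses are job counts, distances are in km.
masses = {"a": 100, "b": 50, "c": 10}
distances = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 2.0,
    frozenset(("b", "c")): 1.0,
}
scores = hub_scores(masses, distances)
```

In this toy example cell "b" scores highest even though "a" has more mass, because "b" is close to both other cells.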

Ether's Temporal Engine - Hourly Composition

Sector-level workforce curves per cell by hour.

Lift above baseline: $\Delta R^2=+0.1525$
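One way to picture sector-level hourly composition is to scale each sector's worker count in a cell by a time-use presence curve. The sketch below assumes a curve is a mapping from hour to the fraction of that sector's workers present; the sectors, counts, and fractions are invented for illustration and are not drawn from Ether's data.

```python
def hourly_composition(workers_by_sector, presence_curves, hour):
    """Expected workers present per sector in one cell at the given hour."""
    return {
        sector: count * presence_curves[sector][hour]
        for sector, count in workers_by_sector.items()
    }


def sector_shares(present):
    """Convert present-worker counts into shares of the cell's total."""
    total = sum(present.values())
    return {s: (v / total if total else 0.0) for s, v in present.items()}


# Hypothetical cell: worker counts by sector and presence fractions by hour.
workers = {"finance": 400, "food_service": 100}
curves = {
    "finance": {12: 0.9, 20: 0.1},
    "food_service": {12: 0.6, 20: 0.8},
}
noon = sector_shares(hourly_composition(workers, curves, 12))
evening = sector_shares(hourly_composition(workers, curves, 20))
```

The same cell reads as finance-dominated at noon and food-service-dominated in the evening, which is the kind of shift an hourly layer is meant to expose.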

Validation

Tested against ground truth

We validate on observed outcomes, not internal consistency alone.

7-day holdout prediction

$R^2=0.92$

Structural priors only

NYC temporal validation

$\Delta R^2=+0.1525$

492,316 observations

Restaurant revenue prediction

$R^2_{\text{raw}}=0.80$

No transaction training data

Daytime population accuracy

$F_{1}=0.988$

Validated on high-density zones
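The metrics above follow standard definitions, sketched here for reference. We assume that the reported lift is the difference in R² between a model with temporal features and a baseline without them; the page does not spell out that definition. The toy data is invented and unrelated to the reported results.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def f1_score(actual, predicted):
    """F1 for binary labels: harmonic mean of precision and recall."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not Ether's validation set.
observed = [10, 20, 30, 40]
baseline_pred = [15, 15, 35, 35]  # stand-in for a model without temporal features
full_pred = [11, 19, 31, 39]      # stand-in for a model with temporal features

r2_baseline = r_squared(observed, baseline_pred)
r2_full = r_squared(observed, full_pred)
delta_r2 = r2_full - r2_baseline  # lift above baseline

f1 = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```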

Integration

SQL-first. BigQuery-native.

One join key connects Ether data to your existing spatial tables.

-- Top 100 finance-heavy NYC cells at noon, ranked by gravity score.
SELECT
  cell_id,
  daytime_population,
  top_sector,
  gravity_score,
  finance_worker_pct,
  peak_hour
FROM ether.population_intelligence
WHERE metro = 'NYC'
  AND hour_local = 12            -- 12:00 local time
  AND finance_worker_pct > 0.15  -- finance share above 0.15
ORDER BY gravity_score DESC
LIMIT 100;

Evaluate

Ready to evaluate?

Manhattan is available free in BigQuery. Start querying in minutes.
