Bike-Share Equity & Mobility Need (Manhattan Demo) - Powered by Etherdata's Canonical US Census at H3 Resolution 8
Etherdata.ai makes spatial data trustworthy and usable for every team. We publish open, rigorous, deeply usable datasets that make decision-making fairer and faster.
This page is the first of our free starter queries, built on a free New York sample of our flagship product: the Canonical, Complete US Census Dataset at H3 Resolution 8.
This demo answers a practical question cities, operators, and planners ask constantly: Where is bike-share service aligned with resident mobility needs - and where is it not?
Focus: Equity-first bike-share planning
Geography: Manhattan, New York (H3 R8)
Spatial unit: Canonical H3 grid with census + trips
Output: Need, service, and service gap scores
Why Census is the Backbone (and trips alone are not enough)
Bike trips are an outcome. They reflect where stations exist, where commuters and tourists concentrate, weather, pricing, and operational decisions. If you only map trips, you mostly map the operator's supply footprint and the city's activity hotspots - not resident mobility need.
Census transforms bike-share analysis from "what happened" to "what should happen" by providing:
Denominators (population, households, workers) to convert counts into comparable rates
Structural constraints (car ownership, renter share, rent burden) that shape real transportation choices
Commuting context (public transit share, walk share, work-from-home) to separate commuter/tourist effects from resident mobility
Planning lens
The result is a planning-grade lens: we can quantify whether observed bike service is under-delivering relative to the people and constraints in each neighborhood cell.
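To make the denominator point concrete, here is a minimal Python sketch with made-up numbers (the cell names and counts are hypothetical, not taken from the dataset). It shows how the same raw trip count reads very differently once it is divided by a census denominator, and why a zero denominator should yield "no value" rather than an error or an infinite rate.

```python
from typing import Optional


def safe_rate(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is missing or zero."""
    if not denominator:
        return None
    return numerator / denominator


# Hypothetical cells: identical trip counts, very different resident bases.
cells = {
    "cell_a": {"trips": 12000, "households": 4000},
    "cell_b": {"trips": 12000, "households": 300},
    "cell_c": {"trips": 12000, "households": 0},  # e.g. a park cell with no households
}

trips_per_household = {
    name: safe_rate(c["trips"], c["households"]) for name, c in cells.items()
}

for name, rate in trips_per_household.items():
    print(name, rate)
```

In this toy case the three cells are indistinguishable on a trip map, but the per-household rate separates a residential cell, a sparsely populated cell, and a cell where a resident-based rate is not meaningful at all.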
Datasets and spatial framework
Spatial unit: H3 Resolution 8
We use H3 resolution 8 hexagons (average area about 0.737 km^2) to create a stable, comparable grid across Manhattan. H3 provides consistent adjacency, clean aggregation, and native compatibility with modern GIS pipelines.
Geographic scope: Manhattan (New York County)
We retrieve Manhattan's county geometry, polyfill it into H3 R8 cells, then use membership joins so every metric aligns to the same spatial support. This avoids centroid-in-polygon errors and keeps all joins deterministic and reproducible.
Data sources (BigQuery)
Citi Bike trips: start station point events aggregated into H3 R8
Citi Bike stations: supply proxy (stations + dock capacity) aggregated into H3 R8
Etherdata canonical census (H3 R8): demographic, housing, and commuting variables keyed to the same H3 cells
Methodology (How census converts raw trips into decision signals)
1. Define the analysis grid (H3 R8 Manhattan)
Load Manhattan polygon
Polyfill to H3 R8
Use that H3 list as the authoritative domain for joins
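The "authoritative domain" idea can be sketched in a few lines of Python. The cell ids below are placeholders (real H3 R8 ids come from polyfilling the county polygon), and the dictionaries stand in for query results; the point is only that membership in the grid, not a spatial predicate, decides what is kept.

```python
# Hypothetical cell ids standing in for the polyfilled Manhattan grid.
manhattan_h3 = {"cell_001", "cell_002", "cell_003"}

# Hypothetical trip counts keyed by cell; "cell_999" lies outside the grid.
trips_by_h3 = {"cell_001": 5400, "cell_003": 120, "cell_999": 800}

# Membership (inner) join: keep only cells that belong to the grid.
trips_manhattan = {h3: n for h3, n in trips_by_h3.items() if h3 in manhattan_h3}

# Left join from the grid with zero fill, in the spirit of COALESCE(trip_count, 0).
grid = {h3: trips_manhattan.get(h3, 0) for h3 in sorted(manhattan_h3)}

print(trips_manhattan)
print(grid)
```

Because every dataset is filtered against the same set of keys, the result does not depend on join order or on geometry precision.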
2. Aggregate trips into the grid (observed usage)
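Trips are aggregated by the H3 cell of the start station. The starter query computes total trips plus subscriber, customer, weekend, and weekday counts per cell; shares such as subscriber share and weekend share follow by dividing by the trip total, and they help separate commuting-like usage from leisure-like usage. Average duration can be added in the same way, but the starter query does not compute it.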
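As an illustration of the behavioral slices, the following Python sketch aggregates a handful of invented trips into per-cell counts and shares. The cell ids and trips are hypothetical; the weekend test mirrors the query's Sunday/Saturday logic using Python's own weekday numbering.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trips: (cell id, start time, user type).
trips = [
    ("cell_001", datetime(2018, 6, 2, 10, 0), "Subscriber"),  # Saturday
    ("cell_001", datetime(2018, 6, 4, 8, 30), "Subscriber"),  # Monday
    ("cell_001", datetime(2018, 6, 3, 14, 0), "Customer"),    # Sunday
    ("cell_002", datetime(2018, 6, 5, 9, 0), "Subscriber"),   # Tuesday
]

stats = defaultdict(lambda: {"trips": 0, "subscriber": 0, "weekend": 0})
for h3, start, usertype in trips:
    s = stats[h3]
    s["trips"] += 1
    s["subscriber"] += usertype.lower() == "subscriber"
    s["weekend"] += start.weekday() >= 5  # Python: Saturday=5, Sunday=6

# Shares are counts divided by the cell's trip total (always > 0 here).
shares = {
    h3: {
        "subscriber_share": s["subscriber"] / s["trips"],
        "weekend_share": s["weekend"] / s["trips"],
    }
    for h3, s in stats.items()
}

print(shares)
```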
3. Aggregate station supply into the grid (service capacity)
Stations and dock capacity are aggregated into H3 cells. This is a service proxy: it is not perfect, but it captures where supply exists and its potential throughput.
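A small Python sketch of the supply aggregation, using invented station rows. It follows the same defaults the starter query applies: a missing capacity counts as zero docks, and a missing install flag is treated as installed. The row values are hypothetical.

```python
from collections import defaultdict

# Hypothetical station rows: (cell id, capacity, is_installed). None models a NULL.
stations = [
    ("cell_001", 39, True),
    ("cell_001", None, True),   # unknown capacity contributes 0 docks
    ("cell_002", 27, None),     # unknown install flag is treated as installed
    ("cell_002", 31, False),    # explicitly not installed: excluded
]

supply = defaultdict(lambda: {"station_count": 0, "total_docks": 0})
for h3, capacity, is_installed in stations:
    if is_installed is False:
        continue
    supply[h3]["station_count"] += 1
    supply[h3]["total_docks"] += capacity or 0

print(dict(supply))
```

Note that a station with unknown capacity still counts as a station, so station count and dock total can disagree about how much supply a cell has.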
4. Roll up census into the grid (structural demand)
This is the core value proposition: Etherdata's canonical census layer provides consistent, H3-keyed demographic and mobility variables. We roll them up safely (sums for counts, weighted rollups for income/rent) to ensure one row per H3 cell.
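The rollup rule (sum the counts, weight the medians) can be sketched as follows. The rows are invented tract fragments for a single cell. One deliberate difference from the starter query is labeled here as an assumption: this sketch drops a row's weight when its value is missing, whereas the query's denominator sums all households regardless. A household-weighted average of medians is also only an approximation of the cell's true median.

```python
def weighted_mean(rows, value_key, weight_key):
    """Weighted mean over rows where both the value and the weight are present."""
    num = 0.0
    den = 0.0
    for r in rows:
        v, w = r.get(value_key), r.get(weight_key)
        if v is None or w is None:
            continue  # assumption: skip the weight too when the value is missing
        num += v * w
        den += w
    return num / den if den else None


# Hypothetical tract fragments that fall in the same H3 cell.
rows = [
    {"households": 600, "no_car": 450, "median_rent": 1800},
    {"households": 400, "no_car": 200, "median_rent": 2600},
]

households = sum(r["households"] for r in rows)             # counts: plain sum
no_car_rate = sum(r["no_car"] for r in rows) / households   # rate: sum over sum
median_rent = weighted_mean(rows, "median_rent", "households")

print(households, no_car_rate, median_rent)
```

Computing rates as a ratio of sums, rather than averaging per-row rates, keeps large and small fragments in proportion.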
5. Create indices that support planning decisions
Mobility Need Index (census-driven)
A composite indicator designed to approximate structural need for shared mobility using census variables:
no_car_rate (higher -> more reliance on non-car modes)
public_transit_share (higher -> transit-oriented mobility; complements bikes as first/last-mile)
renter_share and rent_burden (higher -> tighter budgets and higher sensitivity to mobility cost)
wfh_share (lower -> more commute pressure)
Key point: this index is intentionally independent of bike usage. It is a census-only "need surface."
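For readers who want to check the arithmetic, here is a Python transcription of the index using the weights that appear in the starter query (0.40, 0.25, 0.15, 0.10, 0.10). The input values are hypothetical. Dividing rent burden by 50 suggests that the field is expressed in percentage points, so that a 50% burden contributes the full weight; that reading is an assumption, and burdens above 50 would push the term past its nominal weight.

```python
def mobility_need_index(no_car_rate, public_transit_share, renter_share,
                        rent_burden_pct, wfh_share):
    """Census-only need index; missing inputs are treated as 0, as COALESCE does."""
    def z(x):
        return 0.0 if x is None else x

    return (
        0.40 * z(no_car_rate)
        + 0.25 * z(public_transit_share)
        + 0.15 * z(renter_share)
        + 0.10 * z(rent_burden_pct) / 50
        + 0.10 * (1 - z(wfh_share))
    )


# Hypothetical cell: 80% car-free, 60% transit commuters, 70% renters,
# 30% of income spent on rent, 10% working from home.
example = mobility_need_index(0.80, 0.60, 0.70, 30, 0.10)

# A cell with every input missing still scores 0.10 through the work-from-home term.
all_missing = mobility_need_index(None, None, None, None, None)

print(round(example, 3), round(all_missing, 3))
```

The second call is worth noticing when interpreting low scores: because missing values become zeros, a data-poor cell does not score zero.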
Service Index Core (service supply + utilization)
A composite "delivered service" indicator using:
docks (supply)
trips_per_dock (utilization)
We avoid per-capita service metrics in isolation because Manhattan includes many non-residential cells (parks, CBD corridors). That is exactly why census denominators matter: you use them when appropriate (households/workers), and you guard against places where "residential population" is not the correct exposure.
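Service Gap Score (need minus service)
The Mobility Need Index minus the Service Index Core, read per cell as:
Positive: under-served relative to resident need
Negative: well-served or over-served relative to resident need (often tourism/CBD effects)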
NULL: non-comparable cell (typically too few households/workers, parks, water edges)
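The sketch below transcribes the service index and gap arithmetic from the starter query into Python so the NULL behavior is easy to see. The inputs are hypothetical. As written in the query, the supply term is docks per household and the utilization term is raw trips per dock, so the two terms (and the need index) sit on different scales, and the utilization term grows with the length of the trip window. Rescaling the terms to a common range before differencing would be one option, but the source does not specify one.

```python
def safe_divide(numerator, denominator):
    """Return None instead of dividing by a missing or zero denominator."""
    if numerator is None or not denominator:
        return None
    return numerator / denominator


def service_index_core(total_docks, households, trip_count):
    docks_per_household = safe_divide(total_docks, households)
    trips_per_dock = safe_divide(trip_count, total_docks)
    if docks_per_household is None or trips_per_dock is None:
        return None  # mirrors SQL: any NULL term makes the sum NULL
    return 0.60 * docks_per_household + 0.40 * trips_per_dock


def service_gap_score(need, service):
    if need is None or service is None:
        return None
    return need - service


served = service_index_core(total_docks=50, households=2000, trip_count=10000)
no_docks = service_index_core(total_docks=0, households=2000, trip_count=0)

print(served, service_gap_score(0.725, served))
print(no_docks, service_gap_score(0.725, no_docks))
```

One consequence to keep in mind when reading the map: a cell with households but no docks yields a NULL gap in this formulation, so it appears as non-comparable rather than as strongly under-served.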
Starter query (BigQuery / Looker Studio)
The demo is powered by a single query that polyfills Manhattan into H3 R8, aggregates trips and stations, joins Etherdata's canonical census variables, and computes need, service, and gap metrics.
-- Census-first: Mobility Need + Service Gap (Manhattan, H3 R8)
-- Looker Studio compatible (single statement, no DECLARE)
WITH
manhattan AS (
  SELECT county_geom AS geom
  FROM `bigquery-public-data.geo_us_boundaries.counties`
  WHERE state_fips_code = '36'
    AND RIGHT(county_fips_code, 3) = '061'
  LIMIT 1
),
manhattan_h3 AS (
  SELECT h3
  FROM manhattan m,
    UNNEST(bqcarto.h3.ST_ASH3_POLYFILL(m.geom, 8)) AS h3
),
trips_by_h3 AS (
  SELECT
    bqcarto.h3.LONGLAT_ASH3(t.start_station_longitude, t.start_station_latitude, 8) AS h3,
    COUNT(1) AS trip_count,
    COUNTIF(LOWER(t.usertype) = 'subscriber') AS trip_subscriber,
    COUNTIF(LOWER(t.usertype) = 'customer') AS trip_customer,
    COUNTIF(EXTRACT(DAYOFWEEK FROM DATE(t.starttime)) IN (1, 7)) AS trip_weekend,
    COUNTIF(EXTRACT(DAYOFWEEK FROM DATE(t.starttime)) BETWEEN 2 AND 6) AS trip_weekday
  FROM `bigquery-public-data.new_york_citibike.citibike_trips` t
  WHERE DATE(t.starttime) BETWEEN DATE '2017-04-12' AND DATE '2018-09-10'
    AND t.start_station_latitude BETWEEN 40.68 AND 40.88
    AND t.start_station_longitude BETWEEN -74.05 AND -73.90
    AND t.start_station_latitude IS NOT NULL
    AND t.start_station_longitude IS NOT NULL
  GROUP BY 1
),
trips_manhattan AS (
  SELECT t.*
  FROM trips_by_h3 t
  JOIN manhattan_h3 mh USING (h3)
),
stations_by_h3 AS (
  SELECT
    bqcarto.h3.LONGLAT_ASH3(s.longitude, s.latitude, 8) AS h3,
    COUNT(*) AS station_count,
    SUM(COALESCE(s.capacity, 0)) AS total_docks
  FROM `bigquery-public-data.new_york_citibike.citibike_stations` s
  WHERE s.latitude BETWEEN 40.68 AND 40.88
    AND s.longitude BETWEEN -74.05 AND -73.90
    AND s.latitude IS NOT NULL
    AND s.longitude IS NOT NULL
    AND COALESCE(s.is_installed, TRUE) = TRUE
  GROUP BY 1
),
stations_manhattan AS (
  SELECT sb.*
  FROM stations_by_h3 sb
  JOIN manhattan_h3 mh USING (h3)
),
census_manhattan AS (
  SELECT
    c.h3,
    c.total_pop,
    c.households,
    c.workers_16_and_over,
    c.commuters_16_over,
    c.median_income,
    c.bachelors_degree_or_higher_25_64,
    c.pop_25_64,
    c.no_car,
    c.commuters_by_public_transportation,
    c.commuters_by_subway_or_elevated,
    c.walked_to_work,
    c.worked_at_home,
    c.occupied_housing_units,
    c.housing_units_renter_occupied,
    c.percent_income_spent_on_rent,
    c.median_rent
  FROM `v3layer-pro.census_tract_h3r8.nydemo` c
  JOIN manhattan_h3 mh USING (h3)
  WHERE c.state_fips = '36'
    AND c.h3 IS NOT NULL
),
census_rollup AS (
  SELECT
    h3,
    SUM(total_pop) AS total_pop,
    SUM(households) AS households,
    SUM(workers_16_and_over) AS workers_16_and_over,
    SUM(commuters_16_over) AS commuters_16_over,
    SAFE_DIVIDE(SUM(median_income * total_pop), NULLIF(SUM(total_pop), 0)) AS median_income,
    SAFE_DIVIDE(SUM(bachelors_degree_or_higher_25_64), NULLIF(SUM(pop_25_64), 0)) AS bachelors_or_higher_25_64_rate,
    SAFE_DIVIDE(SUM(no_car), NULLIF(SUM(households), 0)) AS no_car_rate,
    SAFE_DIVIDE(SUM(commuters_by_public_transportation), NULLIF(SUM(commuters_16_over), 0)) AS public_transit_share,
    SAFE_DIVIDE(SUM(commuters_by_subway_or_elevated), NULLIF(SUM(commuters_16_over), 0)) AS subway_share,
    SAFE_DIVIDE(SUM(walked_to_work), NULLIF(SUM(workers_16_and_over), 0)) AS walk_share,
    SAFE_DIVIDE(SUM(worked_at_home), NULLIF(SUM(workers_16_and_over), 0)) AS wfh_share,
    SAFE_DIVIDE(SUM(housing_units_renter_occupied), NULLIF(SUM(occupied_housing_units), 0)) AS renter_share,
    SAFE_DIVIDE(SUM(percent_income_spent_on_rent * households), NULLIF(SUM(households), 0)) AS rent_burden,
    SAFE_DIVIDE(SUM(median_rent * households), NULLIF(SUM(households), 0)) AS median_rent
  FROM census_manhattan
  GROUP BY 1
),
base AS (
  SELECT
    c.h3,
    bqcarto.h3.ST_BOUNDARY(c.h3) AS geom,
    ST_Y(ST_CENTROID(bqcarto.h3.ST_BOUNDARY(c.h3))) AS lat,
    ST_X(ST_CENTROID(bqcarto.h3.ST_BOUNDARY(c.h3))) AS lon,
    -- Public behavioral signals (validation)
    COALESCE(t.trip_count, 0) AS trip_count,
    COALESCE(s.station_count, 0) AS station_count,
    COALESCE(s.total_docks, 0) AS total_docks,
    -- Canonical census (hero)
    c.total_pop,
    c.households,
    c.workers_16_and_over,
    c.commuters_16_over,
    c.median_income,
    c.bachelors_or_higher_25_64_rate,
    c.no_car_rate,
    c.public_transit_share,
    c.subway_share,
    c.walk_share,
    c.wfh_share,
    c.renter_share,
    c.rent_burden,
    c.median_rent,
    -- Robust normalizations
    SAFE_DIVIDE(COALESCE(t.trip_count, 0), NULLIF(c.households, 0)) AS trips_per_household,
    SAFE_DIVIDE(COALESCE(t.trip_count, 0), NULLIF(c.workers_16_and_over, 0)) AS trips_per_worker,
    SAFE_DIVIDE(COALESCE(t.trip_count, 0), NULLIF(COALESCE(s.total_docks, 0), 0)) AS trips_per_dock,
    SAFE_DIVIDE(COALESCE(s.total_docks, 0), NULLIF(c.households, 0)) * 1000 AS docks_per_1k_households,
    SAFE_DIVIDE(COALESCE(s.total_docks, 0), NULLIF(c.workers_16_and_over, 0)) * 1000 AS docks_per_1k_workers,
    -- Census-first Mobility Need Index
    (
      0.40 * COALESCE(c.no_car_rate, 0)
      + 0.25 * COALESCE(c.public_transit_share, 0)
      + 0.15 * COALESCE(c.renter_share, 0)
      + 0.10 * COALESCE(c.rent_burden, 0) / 50
      + 0.10 * (1 - COALESCE(c.wfh_share, 0))
    ) AS mobility_need_index,
    -- Service index (NOT pop-based)
    (
      0.60 * SAFE_DIVIDE(COALESCE(s.total_docks, 0), NULLIF(c.households, 0))
      + 0.40 * SAFE_DIVIDE(COALESCE(t.trip_count, 0), NULLIF(COALESCE(s.total_docks, 0), 0))
    ) AS service_index_core
  FROM census_rollup c
  LEFT JOIN trips_manhattan t USING (h3)
  LEFT JOIN stations_manhattan s USING (h3)
  WHERE c.total_pop > 0
),
scored AS (
  SELECT
    *,
    (mobility_need_index - service_index_core) AS service_gap_score,
    CASE
      WHEN mobility_need_index >= 0.60 AND renter_share >= 0.65 THEN 'HIGH_NEED_RENTER'
      WHEN mobility_need_index >= 0.60 AND no_car_rate >= 0.70 THEN 'HIGH_NEED_NO_CAR'
      WHEN mobility_need_index < 0.40 AND median_income >= 120000 THEN 'LOW_NEED_HIGH_INCOME'
      ELSE 'MIXED'
    END AS census_segment
  FROM base
)
SELECT
  *,
  PERCENT_RANK() OVER (ORDER BY mobility_need_index) AS pct_need,
  PERCENT_RANK() OVER (ORDER BY service_gap_score) AS pct_service_gap
FROM scored
ORDER BY pct_service_gap DESC;
Reuse this template for other cities by swapping the county geometry filter, trip and station source tables, and any thresholds that exclude non-residential cells.
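The final SELECT ranks cells with PERCENT_RANK, which is (rank - 1) / (rows - 1), with tied values sharing the lowest rank of the tie. A minimal Python sketch of that formula follows, using hypothetical gap scores. It ranks only non-NULL values; in BigQuery, NULLs sort first in an ascending order, so cells with a NULL gap would land at the bottom of pct_service_gap rather than being excluded.

```python
from bisect import bisect_left


def percent_rank(values):
    """Percent rank of each value: (rank - 1) / (n - 1), ties share the lowest rank."""
    n = len(values)
    if n <= 1:
        return [0.0] * n
    ordered = sorted(values)
    # bisect_left counts the values strictly smaller than v, which equals rank - 1.
    return [bisect_left(ordered, v) / (n - 1) for v in values]


# Hypothetical service gap scores for four cells.
gaps = [0.30, -0.10, 0.30, 0.05]
pct_service_gap = percent_rank(gaps)

print(pct_service_gap)
```

Because the output is a rank, it is comparable across cities even when the underlying score scales differ, but it hides how large the gaps actually are.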
Results & Interpretation (What each chart means)
Mobility Need Index
This map answers: Where do resident and commuter demographics imply high dependence on shared mobility? It is a census-driven layer and should not be interpreted as bike demand. It is the need surface.
How to read it:
Higher index cells typically have higher no-car rates, stronger transit reliance, higher renter share, and higher rent burden.
If the map looks smooth, that is a feature: it reflects structural demographics, not operational volatility.
What we can conclude:
Mobility need is not identical to bike usage; it is a different construct.
This index is useful for equity framing: it describes where residents have fewer transportation options and higher sensitivity to cost.
Need vs Service
This scatter tests whether the system is allocating service where need is highest. If the bike-share system were need-aligned, you would expect a positive relationship: higher need -> higher service.
How to interpret:
High need + low service suggests a candidate under-service area (investigate docks, placement, safety/connectivity, pricing).
Low need + high service often reflects non-residential demand drivers (CBD/tourism/parks). Not bad, but not equity-driven.
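One simple way to put a number on "need-aligned" is a correlation coefficient between the two indices across cells. The sketch below computes a Pearson correlation by hand on invented values. Using Pearson here is an illustrative choice, not something the demo prescribes; a rank-based measure may be more appropriate when the service index is heavily skewed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical per-cell values (cells with a NULL service index already dropped).
need = [0.2, 0.4, 0.6, 0.8]
aligned_service = [1.0, 2.0, 3.0, 4.0]     # service rises with need
misaligned_service = [4.0, 3.0, 2.0, 1.0]  # service falls as need rises

print(pearson(need, aligned_service), pearson(need, misaligned_service))
```

A coefficient near +1 would indicate need-aligned service, a coefficient near zero or below would indicate that service follows something other than resident need.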
Mobility Need vs Observed Bike Usage
This chart asks: Does observed usage increase with structural mobility need? Often the answer is "not strongly," because usage is constrained by supply placement, safety, and network connectivity. That weak correlation is exactly why census is necessary: it reveals where usage is suppressed despite high need.
Service Gap Score
This is the planning output: a spatially explicit measure of mismatch between census-defined need and delivered bike service.
How to read it:
Positive values: under-served - need exceeds service.
Near zero: balanced.
Negative values: well-served relative to resident need (often tourism/CBD/parks).
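The reading guide above can be expressed as a small labeling function. The tolerance that defines "near zero" is a hypothetical parameter introduced for this sketch; the demo does not specify one, and a sensible value depends on how the two indices are scaled.

```python
def read_gap(gap, tolerance=0.05):
    """Label a service gap score; `tolerance` (hypothetical) defines the balanced band."""
    if gap is None:
        return "non-comparable"
    if gap > tolerance:
        return "under-served"
    if gap < -tolerance:
        return "well-served"
    return "balanced"


# Hypothetical scores, including a cell with no comparable service value.
labels = [read_gap(g) for g in (0.30, 0.01, -0.40, None)]

print(labels)
```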
What we can conclude (and what we cannot)
Conclusions we can support
Census materially changes the story. Trips show operations; census shows structural exposure and constraints.
Need is stable and portable. The need surface is not the same as downtown intensity, which is why it generalizes.
Service alignment is measurable. Need vs service dispersion indicates where equity and infrastructure questions should be investigated.
Gap mapping is actionable. Service gap highlights priority cells for expansion, pricing programs, or safety/connectivity improvements.
What we cannot conclude from these data alone
We cannot claim causality (need causes rides) without controls (bike lanes, safety, land use, tourism, pricing, events).
We cannot measure latent demand directly; we infer it from mismatch between census need and observed service.
Negative gap values are not bad - they often reflect non-residential demand.
Why Etherdata's canonical census at H3 R8 is the enabler
This demo is intentionally simple from a modeling standpoint because the product being demonstrated is the data layer: a canonical, complete census surface at H3 R8.
The value is operational:
Deterministic joins (same H3 key across all datasets)
Consistent denominators to convert counts into comparable rates
Auditability (every metric traces to a known census field)
Portability (the same query pattern generalizes across cities and states)
In practice, this means a data scientist or analyst can move from raw events to equity-aware service planning in minutes, without rebuilding geography pipelines or fighting inconsistent tract joins.
Overall Summary
If you only look at trips, Manhattan mostly looks like "Midtown and Downtown win." With Etherdata's canonical H3 census layer, you can ask a more rigorous question: Is the system delivering mobility utility where resident constraints and exposures imply higher need?
This demo shows how the census layer turns operational bike data into a planning instrument:
A stable need surface (census-driven)
A measurable service surface (supply + utilization)
A single actionable output: service gap score
That is the core of trustworthy spatial decision-making: measurable, auditable, and portable.