Citi Bike trips (BigQuery public)
Trip events with rider type (subscriber vs customer) and station coordinates aggregated to H3 R8.
Etherdata.ai makes spatial data trustworthy and usable for every team. We publish open, rigorous, deeply usable datasets that make decision-making fairer and faster.
This demo answers a practical question for operators and cities: where is membership adoption strong or weak, where is casual usage disproportionately high, and how do those patterns relate to neighborhood socio-economics?
Focus
Membership adoption and casual usage mix. Equity-aware neighborhood segmentation using canonical census context.
Geography
Manhattan, New York (NY County), H3 Resolution 8.
Spatial unit
Canonical H3 grid (R8) with deterministic joins between trips and census context.
Outputs
Member penetration, casual-to-member ratio, trip frequency proxy, socio-economic context for segmentation.
Trip logs alone mostly reflect where stations exist and where people pass through (CBD, tourism, parks). They do not explain whether membership adoption is structurally high or low relative to neighborhood context. Etherdata’s canonical census layer provides stable signals that make adoption patterns interpretable and comparable across the city.
These charts do not claim causality. They provide a planning-grade lens to prioritize deeper investigation and intervention design (pricing, partnerships, station placement, outreach).
Trip events with rider type (subscriber vs customer) and station coordinates aggregated to H3 R8.
Income and education attributes keyed to the same H3 cells for consistent context.
Retrieve Manhattan geometry and polyfill to H3 R8 to create the authoritative cell list.
Convert start-station longitude/latitude to H3 indices and keep only cells within Manhattan.
Aggregate trips into subscriber vs customer counts per H3 cell.
Join canonical income and education fields to the same H3 cells for socio-economic interpretation.
Calculate member penetration, casual-to-member ratio, and trip frequency proxies (trips per distinct bikeid, by rider type).
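The metric step can be restated outside SQL to make the definitions concrete. What follows is a minimal Python sketch, not the pipeline itself: the names `safe_divide` and `cell_metrics` and the cell labels are hypothetical, and the input is assumed to be `(h3, usertype, bikeid)` tuples already restricted to Manhattan. It aggregates per cell only and leaves out the per-day grouping.

```python
from collections import defaultdict

# Map raw rider types to the two groups used in the metrics.
RIDER_KIND = {"subscriber": "member", "customer": "casual"}


def safe_divide(numerator, denominator):
    """Return None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def cell_metrics(trips):
    """Compute per-cell adoption metrics from (h3, usertype, bikeid) tuples."""
    acc = defaultdict(lambda: {"total": 0, "member": 0, "casual": 0,
                               "bikes_member": set(), "bikes_casual": set()})
    for h3, usertype, bikeid in trips:
        row = acc[h3]
        row["total"] += 1  # every trip counts toward the total
        kind = RIDER_KIND.get(usertype.lower())
        if kind:
            row[kind] += 1
            row["bikes_" + kind].add(bikeid)

    return {
        h3: {
            "member_penetration_rate": safe_divide(r["member"], r["total"]),
            "casual_to_member_ratio": safe_divide(r["casual"], r["member"]),
            "member_trips_per_bike_proxy": safe_divide(r["member"], len(r["bikes_member"])),
            "casual_trips_per_bike_proxy": safe_divide(r["casual"], len(r["bikes_casual"])),
        }
        for h3, r in acc.items()
    }


# Hypothetical sample: "cell_a" and "cell_b" are placeholders, not real H3 indices.
sample = [
    ("cell_a", "Subscriber", "b1"),
    ("cell_a", "Subscriber", "b1"),
    ("cell_a", "Subscriber", "b2"),
    ("cell_a", "Customer", "b3"),
    ("cell_b", "Customer", "b4"),
]
metrics = cell_metrics(sample)
```

A cell with no member trips, such as `cell_b` here, has an undefined casual-to-member ratio, so the sketch returns `None` rather than a number.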
-- Member vs casual riders
WITH
  -- New York County (Manhattan) boundary, FIPS 36061
  manhattan AS (
    SELECT county_geom AS geom
    FROM `bigquery-public-data.geo_us_boundaries.counties`
    WHERE state_fips_code = '36'
      AND RIGHT(county_fips_code, 3) = '061'
    LIMIT 1
  ),
  -- Authoritative list of H3 R8 cells covering Manhattan
  manhattan_h3 AS (
    SELECT h3
    FROM manhattan m,
      UNNEST(bqcarto.h3.ST_ASH3_POLYFILL(m.geom, 8)) AS h3
  ),
  -- Trips in the study window, with start stations mapped to H3 R8
  trips_raw AS (
    SELECT
      DATE(t.starttime) AS trip_date,
      bqcarto.h3.LONGLAT_ASH3(t.start_station_longitude, t.start_station_latitude, 8) AS h3,
      LOWER(t.usertype) AS usertype_norm,
      SAFE_CAST(t.bikeid AS STRING) AS bikeid
    FROM `bigquery-public-data.new_york_citibike.citibike_trips` t
    WHERE DATE(t.starttime) BETWEEN DATE '2018-04-10' AND DATE '2018-09-10'
      -- Coarse bounding box to prune rows before the H3 join
      AND t.start_station_latitude BETWEEN 40.68 AND 40.88
      AND t.start_station_longitude BETWEEN -74.05 AND -73.90
      AND t.start_station_latitude IS NOT NULL
      AND t.start_station_longitude IS NOT NULL
  ),
  -- Keep only trips that start in a Manhattan cell
  trips_manhattan AS (
    SELECT tr.*
    FROM trips_raw tr
    JOIN manhattan_h3 mh USING (h3)
  ),
  -- Aggregate by (h3, trip_date)
  trips_by_h3_day AS (
    SELECT
      h3,
      trip_date,
      COUNT(*) AS trips_total,
      COUNTIF(usertype_norm = 'subscriber') AS trips_member,
      COUNTIF(usertype_norm = 'customer') AS trips_casual,
      COUNT(DISTINCT IF(usertype_norm = 'subscriber', bikeid, NULL)) AS distinct_bike_member,
      COUNT(DISTINCT IF(usertype_norm = 'customer', bikeid, NULL)) AS distinct_bike_casual
    FROM trips_manhattan
    GROUP BY 1, 2
  ),
  -- Census context rows for Manhattan cells
  census_manhattan AS (
    SELECT
      c.h3,
      c.total_pop,
      c.pop_25_64,
      c.median_income,
      c.bachelors_degree_or_higher_25_64
    FROM `v3layer-pro.census_tract_h3r8.nydemo` c
    JOIN manhattan_h3 mh USING (h3)
    WHERE c.state_fips = '36'
      AND c.h3 IS NOT NULL
  ),
  -- One census row per cell
  census_rollup AS (
    SELECT
      h3,
      SUM(total_pop) AS total_pop,
      -- Population-weighted mean of the input medians (an approximation, not a true median)
      SAFE_DIVIDE(SUM(median_income * total_pop), NULLIF(SUM(total_pop), 0)) AS median_income,
      SAFE_DIVIDE(
        SUM(bachelors_degree_or_higher_25_64),
        NULLIF(SUM(pop_25_64), 0)
      ) AS bachelors_or_higher_25_64_rate
    FROM census_manhattan
    GROUP BY 1
  ),
  -- Metrics, census context, and segment label per (h3, trip_date)
  final AS (
    SELECT
      t.trip_date,
      t.h3,
      bqcarto.h3.ST_BOUNDARY(t.h3) AS geom,
      ST_Y(ST_CENTROID(bqcarto.h3.ST_BOUNDARY(t.h3))) AS lat,
      ST_X(ST_CENTROID(bqcarto.h3.ST_BOUNDARY(t.h3))) AS lon,
      t.trips_total,
      t.trips_member,
      t.trips_casual,
      SAFE_DIVIDE(t.trips_member, NULLIF(t.trips_total, 0)) AS member_penetration_rate,
      SAFE_DIVIDE(t.trips_casual, NULLIF(t.trips_member, 0)) AS casual_to_member_ratio,
      SAFE_DIVIDE(t.trips_member, NULLIF(t.distinct_bike_member, 0)) AS member_trips_per_bike_proxy,
      SAFE_DIVIDE(t.trips_casual, NULLIF(t.distinct_bike_casual, 0)) AS casual_trips_per_bike_proxy,
      c.total_pop,
      c.median_income,
      c.bachelors_or_higher_25_64_rate,
      -- Rules are evaluated top to bottom; the first match wins
      CASE
        WHEN t.trips_total = 0 THEN 'NO_TRIPS'
        WHEN SAFE_DIVIDE(t.trips_member, NULLIF(t.trips_total, 0)) >= 0.70
          AND c.median_income >= 120000 THEN 'HIGH_MEMBER_HIGH_INCOME'
        WHEN SAFE_DIVIDE(t.trips_member, NULLIF(t.trips_total, 0)) < 0.40
          AND c.median_income < 80000 THEN 'LOW_MEMBER_LOWER_INCOME'
        WHEN SAFE_DIVIDE(t.trips_casual, NULLIF(t.trips_member, 0)) >= 1.5 THEN 'CASUAL_HEAVY'
        ELSE 'MIXED'
      END AS adoption_segment
    FROM trips_by_h3_day t
    LEFT JOIN census_rollup c USING (h3)
  )
-- Add within-day percentile ranks for both adoption metrics
SELECT
  *,
  PERCENT_RANK() OVER (PARTITION BY trip_date ORDER BY member_penetration_rate) AS pct_member_penetration,
  PERCENT_RANK() OVER (PARTITION BY trip_date ORDER BY casual_to_member_ratio) AS pct_casual_heaviness
FROM final;
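The `adoption_segment` rules in the query are order-sensitive, so it can help to restate them as a small function. The following is a minimal Python sketch that mirrors the thresholds in the CASE expression. The function name and signature are hypothetical, and `None` stands in for SQL NULL, for example a cell with no census match after the LEFT JOIN.

```python
def adoption_segment(trips_total, trips_member, trips_casual, median_income):
    """Label one cell using the same rule order as the SQL CASE expression."""
    if trips_total == 0:
        return "NO_TRIPS"

    penetration = trips_member / trips_total
    # Undefined when there are no member trips, like SAFE_DIVIDE returning NULL.
    ratio = trips_casual / trips_member if trips_member else None
    has_income = median_income is not None  # a NULL comparison never matches in SQL

    if penetration >= 0.70 and has_income and median_income >= 120000:
        return "HIGH_MEMBER_HIGH_INCOME"
    if penetration < 0.40 and has_income and median_income < 80000:
        return "LOW_MEMBER_LOWER_INCOME"
    if ratio is not None and ratio >= 1.5:
        return "CASUAL_HEAVY"
    return "MIXED"


# Hypothetical inputs chosen to exercise each branch.
examples = [
    adoption_segment(100, 90, 10, 150000),
    adoption_segment(100, 30, 70, 40000),
    adoption_segment(100, 30, 70, 170000),
    adoption_segment(100, 85, 15, None),
    adoption_segment(10, 0, 10, 150000),
]
```

Under these rules as written, a cell with no member trips has an undefined ratio, so it falls through to MIXED unless the low-member, lower-income rule catches it first. A cell without census context can only be labeled CASUAL_HEAVY or MIXED.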
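The scatter suggests that member penetration is not tightly explained by income in this sample. Penetration stays relatively high across a wide range of median income values, which is consistent with broad membership adoption wherever service exists. Because penetration is measured as a share of trips rather than of residents, it describes who rides in a cell, not how many locals hold memberships.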
This chart shows a clear socio-economic structure: education attainment rises with income and forms a strong gradient. It is not an adoption chart by itself, but it establishes the “context surface” that Etherdata provides. In practice, it enables consistent segmentation: you can compare adoption metrics against stable neighborhood context in a deterministic H3 framework.
This map shows the share of trips taken by subscribers (members) in each H3 cell. Visually, the surface is fairly consistent across much of Manhattan, indicating that membership adoption is broadly strong wherever the network is active. A small number of cells diverge noticeably from the borough-wide pattern and are good candidates for deeper investigation.
This map highlights where casual rides are disproportionately high relative to member rides. The distribution is much more extreme than member penetration, which is expected: casual rides concentrate heavily in a smaller set of cells (destination demand).
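The query also converts each metric into a within-day percentile with `PERCENT_RANK()`, defined as (rank - 1) / (rows - 1), where tied values share the lowest rank. A minimal Python sketch of that definition follows. The helper name `percent_rank` is hypothetical, and placing `None` first mirrors BigQuery's default of sorting NULLs first in ascending order.

```python
def percent_rank(values):
    """Return PERCENT_RANK-style percentiles; None sorts before every number."""
    n = len(values)
    if n <= 1:
        return [0.0] * n

    def sort_key(value):
        # (False, ...) sorts before (True, ...), so None lands first.
        return (value is not None, value if value is not None else 0.0)

    ordered = sorted(sort_key(v) for v in values)
    first_position = {}
    for index, key in enumerate(ordered):
        first_position.setdefault(key, index)  # index of first occurrence = rank - 1

    return [first_position[sort_key(v)] / (n - 1) for v in values]


# Hypothetical casual-to-member ratios for five cells on one day.
ratios = [0.1, 0.2, 0.2, 3.0, None]
ranks = percent_rank(ratios)
```

One consequence is that a cell whose ratio is undefined because it has no member trips receives the lowest percentile, even though it is entirely casual. It may be worth handling those cells explicitly before mapping casual heaviness.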
The segment table summarizes Manhattan into interpretable “behavior + context” groups. Based on what’s visible:
| Segment | Bachelor's-or-higher rate | Trips (total) | Member penetration | Casual-to-member ratio | Median income |
|---|---|---|---|---|---|
| HIGH_MEMBER_HIGH_INCOME | 86% | 1,335,484 | 88.7% | 0.14 | $158,971 |
| MIXED | 61% | 926,130 | 82.6% | 0.26 | $74,287 |
| CASUAL_HEAVY | 83% | 10,129 | 28.2% | 3.09 | $170,901 |
| LOW_MEMBER_LOWER_INCOME | 57% | 95 | 35.8% | 1.79 | $38,583 |
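When reading the table, it can help to check how the two adoption columns relate. If both were computed from the same pooled trip counts, and every trip were either member or casual, the ratio would equal (1 - p) / p for member penetration p. The sketch below applies that identity to the table values. The function name is hypothetical, and the source does not state how the table rows were aggregated.

```python
def implied_casual_to_member_ratio(member_penetration):
    """Ratio implied by pooled counts: casual / member = (1 - p) / p."""
    if member_penetration <= 0:
        return None
    return (1.0 - member_penetration) / member_penetration


# (member penetration, reported casual-to-member ratio) copied from the table.
SEGMENTS = {
    "HIGH_MEMBER_HIGH_INCOME": (0.887, 0.14),
    "MIXED": (0.826, 0.26),
    "CASUAL_HEAVY": (0.282, 3.09),
    "LOW_MEMBER_LOWER_INCOME": (0.358, 1.79),
}

implied = {name: implied_casual_to_member_ratio(p) for name, (p, _) in SEGMENTS.items()}

for name, (p, reported) in SEGMENTS.items():
    print(f"{name}: implied {implied[name]:.2f}, reported {reported:.2f}")
```

Only the LOW_MEMBER_LOWER_INCOME row matches the pooled identity closely. The other rows differ, which would be expected if the table averages per-cell, per-day values rather than pooling trips, so the two columns are best read as separate summaries.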
This demo is intentionally simple from a modeling standpoint because the product being demonstrated is the data layer: a canonical, deterministic, H3-keyed census surface that can be joined to operational mobility data in minutes.
The value is operational:
Member vs casual is not just an operational split; it is a lens on adoption, pricing leverage, and destination demand. With Etherdata’s canonical H3 census layer, you can interpret that split in neighborhood context, identify where casual dominance likely reflects destinations, and isolate the pockets where membership may be underperforming relative to local socio-economic structure.