HubCab is an interactive visualization that invites you to explore the ways in which over 170 million taxi trips connect the City of New York in a given year.
This interface provides a unique insight into the inner workings of the city from the previously invisible perspective of the taxi system, at a never-before-seen granularity. HubCab lets you investigate exactly how and when taxis pick up or drop off individuals and identify zones of concentrated pickup and drop-off activity. It allows you to navigate to the places where your taxi trips start and end and to discover how many other people in your area follow the same travel patterns. What do these visualizations tell us about collective mobility? How many of these cabs might you have been able to share with the people around you? And how might entertaining these questions be the first step in building a more efficient and cheaper taxi service?


Technical Development

The basis of the HubCab tool is a data set of over 170 million taxi trips made by all 13,500 Medallion taxis in New York City in 2011. The data set contains the GPS coordinates and corresponding times of all pickup and drop-off points.

Cartographic data of street shapes were obtained from OpenStreetMap. The streets were cut into over 200,000 street segments of 40 m length each with a Python script and the help of the shapely Python library, and imported into a MongoDB database. Pickup and drop-off points were matched to the closest street segments. Street types unlikely to contain taxi pickups or drop-offs, such as footpaths, trunk roads, and service roads, were not used in the matching process. Line widths of the yellow and blue street segments on low zoom levels were styled on a logarithmic scale. The pickup and drop-off points, represented as dots on the high zoom levels, were generated via an ArcPy script that placed them randomly within a box around a given street segment, with the box width again following a logarithmic scale. GPX files of the dots were styled using Maperitive, then merged and amended for different zoom levels. The dot and street line files were layered together with MapBox, which is the platform that streams all the map content.
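The segment-cutting step can be illustrated with a minimal sketch. This is not the original script: it uses only the Python standard library instead of shapely so that it stays self-contained, and it assumes the street is given as a list of (x, y) points in a projected coordinate system measured in metres. The function names cut_polyline and polyline_length are hypothetical.

```python
import math


def polyline_length(points):
    """Return the total length of a polyline given as (x, y) tuples."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def cut_polyline(points, segment_length=40.0):
    """Split a polyline into consecutive pieces of at most segment_length.

    Assumption: coordinates are in metres (projected), not lon/lat degrees.
    The last piece may be shorter than segment_length.
    """
    pieces = []
    current = [points[0]]
    remaining = segment_length  # distance left before the current piece is full
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        edge = math.hypot(x1 - x0, y1 - y0)
        travelled = 0.0  # distance already consumed along this edge
        while edge - travelled > remaining:
            travelled += remaining
            t = travelled / edge
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            pieces.append(current)
            current = [cut]  # the next piece starts at the cut point
            remaining = segment_length
        remaining -= edge - travelled
        current.append((x1, y1))
    if len(current) > 1:
        pieces.append(current)
    return pieces


if __name__ == "__main__":
    street = [(0.0, 0.0), (30.0, 0.0), (30.0, 30.0)]  # an L-shaped street, 60 m long
    for piece in cut_polyline(street):
        print(piece, round(polyline_length(piece), 3))
```

With shapely, the same idea could presumably be expressed by interpolating cut points along a LineString; the pure-Python version above only makes the geometry of the step explicit.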

The data back end of HubCab runs on MongoDB, which stores all street segments and their coordinates, and all flows between each pair of street segments. The number of all possible street segment pairs is 40 billion (200,000 times 200,000) per map. Radius selection is dynamic, using MongoDB’s $near operator to obtain flows from all segments within the radius of the pickup marker to all segments within the radius of the drop-off marker. With nine maps (one for the yearly data, eight for 3-hour time segments on all Fridays/Saturdays) and three selectable radii, there is a total of over one trillion flow combinations that can be explored with HubCab. Communication between MongoDB and the front end is realized via PHP scripts and JavaScript with JSONP.
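The arithmetic behind these figures, together with the general shape of a radius query, can be sketched as follows. This is an illustration under stated assumptions, not HubCab's actual code: the field name "loc", the helper near_filter, and the example coordinates and radius are hypothetical, and the filter uses the GeoJSON form of $near (which needs a 2dsphere index and takes $maxDistance in metres), whereas the original deployment may have used a different form. The sketch only builds the filter document and does not connect to a database.

```python
SEGMENTS = 200_000  # approximate number of street segments (from the text)
MAPS = 9            # one yearly map plus eight 3-hour time slices
RADII = 3           # selectable radii

pairs_per_map = SEGMENTS * SEGMENTS            # ordered segment pairs per map
total_combinations = pairs_per_map * MAPS * RADII


def near_filter(lon, lat, radius_m, field="loc"):
    """Build a MongoDB filter for documents within radius_m of a point.

    Assumption: segments are stored with a GeoJSON point in `field`
    (hypothetical name) that is covered by a 2dsphere index.
    """
    return {
        field: {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": radius_m,
            }
        }
    }


if __name__ == "__main__":
    print(f"{pairs_per_map:,} pairs per map")
    print(f"{total_combinations:,} flow combinations in total")
    # Hypothetical pickup marker in Manhattan with a 100 m radius.
    print(near_filter(-73.9857, 40.7484, 100))
```

A filter like this would be evaluated once for the pickup marker and once for the drop-off marker, and the flows between the two resulting sets of segments would then be looked up.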
