Sunday, 2 March 2014

London maps and bike rental communities, according to Boris Bike journey data

Every time someone in London makes a journey on a Boris Bike (officially, the Barclays Cycle Hire Scheme), the local government body Transport For London (TFL) record that journey. TFL make some of this data available for download, to allow further analysis and experimentation.

Below, you'll find maps of the most popular bike stations and routes in London, created from the TFL data using Gephi, plus a few simple data processing scripts that I threw together. The idea for these maps originated within a project group at a course on Data Visualisation, held at the Guardian last year. We're working on a more publisher friendly form, so thank you to my course mates for giving me the go ahead to include them here.

First, here's a map showing all bike stations and all popular journeys.


Popular Boris Bike journeys and stations. Full version.

The first map shows the most popular routes and bike stations, those with more than ~150 journeys made during the six months of data that TFL make available. The size of each bike station in this map is based on the number of popular journeys that start or end at that station, a measure of the connectedness of the location. Note: the labels just show the rental area, not the specific station name.

Next, a map where the stations have been grouped together into rental areas, as allocated by TFL:


Rental areas and traffic volumes in the Boris Bike network. Full version | Alternative.

The second map is a version of the first map where related bike stations have been grouped together, and the volume of journeys between areas determines the weight of each connection. Colours in the second map are related to distinct communities in the network - more on this later. The position of the rental areas is approximate and calculated by Gephi. So please don't blame me for any geographical inaccuracies in this map ;)

Some interpretation, along with inspection of underlying data shows that:
  • Major entry points for Boris Bike use are via Kings Cross and Waterloo, more than likely due to commuters arriving from the North and South then heading deeper into London for work.
  • The most popular journeys are those around Hyde Park, corresponding to a popular tourist activity. 
  • The most popular journey (by a long way) is from Hyde Park Corner ... to Hyde Park Corner, presumably a nice trip round the park.
  • The most popular commuter route is between Waterloo (station 3) and Holburn, probably via the Waterloo Bridge.
Of course that's just scratching the surface, and just one example of how to vizualize the data. There's much more that can be done, and similar maps have been created before. Here are a couple of my favourites, plus another of my own, afterwards:
  • This delightful video by specialist Jo Wood at City University in London, published by New Scientist also shows popular routes in the network.
  • A recent BMJ article included a street-level map showing predicted routes and volumes, the focus here is on the health impact of bike sharing schemes.
A bit of experimentation with Gephi's community detection tool results in this map:


Rental communities in the Boris Bike network. Full version.

Here, major connected clusters of bike stations are shown in the same colour (red for Waterloo and environs, Green for around Hyde Park, etc). The communities are detected using Gephi's implementation of the Louvain Method, which finds local communities within large networks. This algorithm has a random element, and generates slightly different communities on each run. However it's clear from repeated runs that distinct local communities exist in the network, in particular around Hyde Park, Kings Cross, Waterloo, and Canary Wharf.

The map that shows bike rental areas (rather than stations) was coloured according to these communities, with journeys between different communities having a colour that's a mixture of the two that are involved.

If you fancy playing with the data or trying out some other visualizations, you can find everything in this GitHub repository.

1 comment: