NYC’s Bikeshare is Almost Here

nycbikes

Not long now – less than a month – until the 1 May launch of New York City’s long-awaited and delayed (the hurricane last year didn’t help) bikeshare system, Citi Bike. Stations are starting to be rolled out.

A pilot test is currently being run with some docking stations and bikes, in the Navy Yard area of Brooklyn. I’ve discovered the live data feed for these stations, and combined it with the “planned stations” data feed, to produce a map of the system as it stands, which I’ve added to my Bike Share Map. The live docking stations are in blue/red colours, depending on how full of bikes they currently are, while the planned stations are in grey – dark grey for Phase 1 (launching in May), and light grey for Phase 2 (later this year). The rollout is happening in two phases because flooding from Hurricane Sandy late last year damaged some of the warehoused bikes. Disasters like that kind of put the occasional complaints about London’s own system into perspective!
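
As a rough illustration of the colouring described above, here is a minimal Python sketch. The record fields (status, phase, bikes, empty_docks), the hex greys, and the choice of red for full and blue for empty are all assumptions made for the example, not the real feed schema or the map's actual palette.

```python
def fill_ratio(bikes, empty_docks):
    """Fraction of docks holding a bike; None if the station reports no docks."""
    total = bikes + empty_docks
    if total == 0:
        return None
    return bikes / total


def station_colour(station):
    """Return a hex colour for a station record (hypothetical schema)."""
    if station["status"] == "planned":
        # Dark grey for Phase 1, light grey for Phase 2 (grey values are assumed).
        return "#555555" if station["phase"] == 1 else "#bbbbbb"
    ratio = fill_ratio(station["bikes"], station["empty_docks"])
    if ratio is None:
        return "#000000"  # live station reporting no docks at all
    # Linear blend from blue (empty) to red (full); the direction is an assumption.
    red = round(255 * ratio)
    blue = 255 - red
    return f"#{red:02x}00{blue:02x}"
```

A full live station comes out pure red and an empty one pure blue, with purples in between.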

London and New York share a common base design for both the bikes and docking stations, so in theory if you were to fly a Boris Bike to NY, it would fit in a dock’s slot – although I presume the system would then reject it for having an alien ID! Both cities’ docking stations and bikes have had design modifications from the Montreal BIXI original, with London’s docking stations being concreted in to the ground while NY’s docking stations have a solar “tower” for power, and a credit-card shaped slot as well as the normal key slot, for future integration with transport smartcards.

One of the most promising parts of the Citi Bike website is the System Data tab – right there on the front page. It looks like NYC will potentially be joining London, Minneapolis, Washington DC and Boston (and possibly Paris soon) in making anonymous journey data available to anyone who wants to analyse it.

Incidentally I’m delighted to see that the NYC system has an official blog and it looks like it’s not alone, with the largest non-Chinese system in the world, Paris, also having an official blog. Come on London, get blogging!

London’s Oyster Card Tidal Flow

Here is an animation I created a couple of years ago, one of a number I created for the “Sense and the City” exhibition at the London Transport Museum, which ran from Summer 2011 to Spring 2012. A version of this animation was branded appropriately for the exhibition and shown upstairs in the interactive section. I also created a similar animation of the Barclays Cycle Hire, and colleagues created other map-based visualisations of the moving city.

The animated map shows the touch-ins (going into the network) and touch-outs (leaving the network) of Oyster cards at London’s tube and train stations, including a few beyond the Greater London boundary which still accept Oyster cards. Oyster cards are London’s travel smartcards. As the animation moves forwards in 10-minute intervals during the typical weekday, the balance between touch-ins and touch-outs is shown by a colour scale. Red indicates the great majority of taps are touch-ins, and green indicates mainly touch-outs. White is the “neutral” colour, indicating that roughly as many people are entering the network as leaving it, at that period in time.
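
The colour scale can be sketched in a few lines of Python. This is only one plausible way to do it, assuming a linear fade through white; the function name and the exact RGB values are illustrative and not taken from the animation itself.

```python
def tidal_colour(touch_ins, touch_outs):
    """Map the touch-in/touch-out balance to a red-white-green RGB tuple."""
    total = touch_ins + touch_outs
    if total == 0:
        return (255, 255, 255)  # no taps in this interval: treat as neutral
    # balance runs from -1 (all touch-outs) to +1 (all touch-ins)
    balance = (touch_ins - touch_outs) / total
    fade = round(255 * (1 - abs(balance)))
    if balance >= 0:
        return (255, fade, fade)  # shades towards red as touch-ins dominate
    return (fade, 255, fade)  # shades towards green as touch-outs dominate
```

Calling this once per station for each 10-minute interval would give the frame-by-frame colours.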

This Place

thisplace1

This Place is a visualisation of 2011 Census data for England and Wales, for your local area.

I’ve been meaning to adapt Michal Migurski’s This Tract for the 2011 UK Census, ever since I saw it a couple of years ago showing the 2000 US Census. The clear, clean styling – simply a map of the local area, and a nice table of pie charts – was a world away from the choropleth maps I’ve produced previously. The most striking feature is what’s not there – when you are looking at a particular area, the surrounding areas are blanked out – they don’t distract.

Following the release of fine-grained 2011 Census data at the end of last month, at least for England and Wales, I’ve spent some time getting the data into the equivalent format and also customising the website with UK-specific metrics. The end result is not architected in quite such an elegant way as Michal’s – his version uses geographical information direct from the “official” Census site, courtesy of their web services, and predefined static datafiles, whereas mine makes numerous queries to a local database – so his would scale better, although mine is backed by a decent academic server.

thisplace2

I’ve used different colour ramps for each of the metrics – for ethnicity I used a rainbow-based colour ramp. The intent is that the “colourfulness” of the wheel shows the ethnic diversity of an area. A fully diverse area will have significant proportions of every colour, creating a “wheel” of colour.

Lots of interesting results – for example, parts of London are very diverse while there are plenty of places which are extremely homogeneous – but not always with White British. Sometimes there’s a two-way split. As you might expect, parts of university towns have a young and highly educated population. The centres of major cities have many more men than women living there, and seaside resorts have an old population. Deep in the rural countryside, primary industries such as farming are popular. Liverpool’s large public sector workforce is clear.

One undocumented feature – you can input the MSOA code (found at the bottom of the page) into the search box, or the URL, to create a weblink specifically for that area. At the moment, my smallest unit geography is MSOA – the size is about right, but the boundaries of MSOA can be very arbitrary. If the data is released at ward level I may well switch to that.

The mapping for This Place comes from MapQuest Open, which is a MapQuest-style map based on OpenStreetMap data.

The London Bike Share Marches North

bbike_nexpansion

It’s not just Wandsworth and Fulham that will be getting Barclays Cycle Hire in the next year or so when Phase 3 goes live – Hackney and Islington will be getting a few too. The iconic “Boris Bikes” will be heading up Mare Street towards central Hackney – although not quite getting there – plus there’ll be various new docking stations in Haggerston, just north of the Regent’s Canal. There will also be a docking station on Islington Green, and a few around the Canal Museum on Caledonian Road. In all, if planning permission is forthcoming, there will be up to 15 new docking stations, all north of the Regent’s Canal. It’s a modest increase – 3% – but the communities affected will doubtless enjoy the new facility. It’s still a long way south from myself though!

I’ve adapted my Bike Share Map to show the proposed locations, above. The potential docking stations appear in green.

It’s great to see that the system is continuing to expand in all directions – but now the central London demand is being sated, it would be nice if Transport for London relaxed their requirement for docking stations to be within 300m of each other. The most successful bike share systems generally have a dense core and a well spaced out periphery, which accommodates commuters, tourists and locals equally well. I would much rather have the system properly penetrating Zones 2 and 3, even if there’s a 1km gap between each docking station. Then it becomes more useful for the utility users who, unlike the commuters (going from stations to skyscrapers) and tourists (concentrating on the big parks and markets), act as useful re-distributors in their own right by the nature of their diverse journey directions.

Thanks to Loving Dalston for spotting a planning application for the docking station by London Fields. I had a quick trawl through the Hackney and Islington council planning websites to spot the others.

A Map of Scotland’s Deprivation

newbooth_edinburgh

About this time last year, I created a “Map of the Geodemographics of Great Britain” which included the Output Area classifications (OAC) for GB, based on the 2001 Census, and also included the Index of Multiple Deprivation (IMD) for England, published in 2010. At the time, there was no up-to-date equivalent to the IMD for Scotland. However the 2012 SIMD (Scottish IMD) has recently been published, and I’ve applied the resulting datasets to my map, using the same technique of filling in just the buildings, rather than all the land, in the appropriate colour (a red-yellow-green Colorbrewer ramp from most to least deprived).

The SIMD and IMD are calculated in a similar way – by looking at measurements of poverty for each area across several categories (e.g. education, crime, income) – however the details of the way the measures are taken are slightly different between the two countries. Additionally each index is based on the range of deprivation found in that country. This means that the indices should not be directly compared across the two countries, i.e. a dark green area in Scotland only has the same relative level of deprivation as similarly coloured areas in Scotland, not in England. Accordingly, the website does not show the two IMD maps at the same time – there is a toggle at the bottom to switch between the two (and to the OAC). As an example – just because Edinburgh is largely green does not mean that it has the same level of affluence/deprivation, in absolute terms, as a similarly-coloured city in England.

Nonetheless, comparisons within Scotland are perfectly valid, and the differences between the cities are striking – most notably Edinburgh vs Glasgow. See the whole map here.

As always with classifications, remember that they represent an average throughout the geographical area concerned – in Scotland this area is known as a Data Zone, which is similar to an English Output Area (as an aside, the SIMD is more fine-grained than the IMD – the latter uses a more aggregated measure). This means that the colour covering a house is not a measure for that house, simply that that house is within an area where the average SIMD is that value. Also, non-residential buildings get coloured, as the dataset I’m using for the building (Ordnance Survey Vector Map District, via the OS Open Data releases) does not distinguish building types. The SIMD of buildings that have no occupants is meaningless, and they are not included in the underlying calculation.

newbooth_glasgow

Me, Geolocated on Twitter

tweets_london

I was prompted by the excellent Twitter Tongues map, where geolocated tweets in London (including mine, and those from hundreds of thousands of others) were mined by Ed Manley over the summer, and then mapped by James Cheshire, to see where I had left my own Twitter footprint.

Many people would probably be quite alarmed to learn that the data, on the exact locations they have tweeted at – if they’ve allowed geolocation – is freely accessible to anyone, not just themselves, through the Twitter API.

tweets_chancerylane

It’s a bit of a faff to get the data – Twitter is starting to roll out a “download my Tweets” option which may make the first few steps here easier – but here’s how I did it.

  1. I used the user_timeline call on the Twitter API, repeatedly, to pull in my last 3200 tweets (the maximum) in batches (“pages”) of 200. The current Twitter API (1.1) requires OAuth authentication – not of the person whose tweets you are mining, but simply yourself, so that rate limits can be correctly applied. Registering a dummy application with Twitter gives access to OAuth credentials, and then using the OAuth tool generates a cURL command that can then be run – the result is put in a file ( > pageX.json), and I do this 16 times to get all 3200 tweets, using the count, page and include_rts parameters. For this particular case, I’m interested in the locations of my own account but – to stress again – you can do this for anyone else’s account, unless their account is protected and you are not a follower.
  2. The output is a set of JSON files. Lacking a JSON parser, or indeed the skill, I had to do a bit of manual text processing. Those with a flexible JSON parser can therefore skip a few steps. I then merged together the files (cat *.json > combined.txt), and in a text editor, put a line break between each },{"crea and replaced ," with ,^" with the caret being an otherwise unused character.
  3. I opened up the file as a text file (not CSV!) in Excel and did a text-to-column on the caret. I then extracted three columns – the date/time, tweet text, and the first coordinates column that occurred. These were the 1st(A), 4th (D) and 28th (AB) columns. I did further find/replace and text-to-columns to remove the keys and quotes, and split the coordinates column into two columns – lat and long.
  4. I removed all the rows that didn’t have a lat/long location. Out of 3186 tweets (14 fewer than 3200 due to deleted tweets), 268 had a location and were kept. I also added a header row.
  5. I created a new Google Fusion Table on the Google Drive website, importing in the Excel file from the above step, and assigning the latter two columns to be a two-column location field.
  6. I marked the table as public (viewable with a link). This is necessary as Google doesn’t allow the creation of a map from a private file, except through a paid (business) account. The flip side of course is this gives Google themselves the right of access to the file contents, although I can’t imagine they are particularly interested in this one.
  7. Finally, I added a tab to the Google Fusion Table which was a map tab, and then zoomed in and around and took the screenshots below. The map is zoomable and the points clickable as normal. It should be possible to colour-code the dots by year, if the categories are set appropriately and the appropriate part of the datetime feed is reformatted appropriately in Step 3.
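
For anyone who does have a JSON parser to hand, steps 2 to 4 above can be collapsed into a short Python script. This is a minimal sketch rather than the method I actually used: it assumes each pageX.json file holds a JSON array of tweets as returned by user_timeline, and it reads the GeoJSON-style "coordinates" field (longitude first) rather than whichever coordinates column happened to come first in my spreadsheet. The function names and the output filename are made up for the example.

```python
import csv
import glob
import json


def geolocated_rows(tweets):
    """Yield (created_at, text, lat, lon) for tweets that carry a point location."""
    for tweet in tweets:
        point = tweet.get("coordinates")
        if not point:
            continue  # tweet was not geolocated
        lon, lat = point["coordinates"]  # GeoJSON order: longitude first
        yield tweet["created_at"], tweet["text"], lat, lon


def pages_to_csv(pattern, out_path):
    """Merge every JSON page matching `pattern` into one CSV; return the row count."""
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["created_at", "text", "lat", "lon"])  # header row
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8") as page:
                tweets = json.load(page)
            for row in geolocated_rows(tweets):
                writer.writerow(row)
                count += 1
    return count
```

Running pages_to_csv("page*.json", "geotweets.csv") in the folder holding the downloaded pages should produce a file ready for the Fusion Tables import in step 5.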

The whole process, including some trial-and-error, took a little over an hour – not so bad.

In the images above and below, you can see the results – 268 geolocated tweets over the course of two and a half years from my account – many of them precisely and accurately located.

tweets_nweurope

All screenshots from Google Maps.

Paris Workshop on Bike Sharing Systems

IMG_2856

I attended a one-day workshop last week, hosted by IFSTTAR’s GERI Animatic research group at École des Ponts ParisTech just east of Paris. The workshop was on Bicycle Sharing Systems, and as I have recently been working with a couple of colleagues, Dr Martin Zaltz-Austwick and Dr James Cheshire, on research relating to bicycle sharing data, and mapping the systems currently live in various cities around the world, I was keen to attend, particularly as the agenda was packed with interesting-sounding talks.

My rush-hour commute through Paris proved to be slightly more traumatic than planned (I wonder if Parisian visitors find London Underground stations as confusing as I find those on the Paris metro?) but I arrived at the École des Ponts ParisTech in time to hear the workshop organiser introducing the sessions. First up was Pierre Borgnat talking about network analysis of Lyon’s system. I had seen a paper by him on Lyon before, and the popularity and density of Lyon’s system has allowed for a rich and interesting dataset for mining and community detection. The community detection has been done using both spatial and temporal variables. Pierre’s thorough and technical treatment of the data was backed up with some excellent mapping of the data, which you can see above and below.

IMG_2859

Next up was Jon Froehlich. Jon’s talk was underpinned by a discussion of the different data sources and types available in the field. He focussed on temporal cluster analysis of the Barcelona bicycle sharing system (below) – a particularly interesting city for me as, along with London and Zurich, it is a case study for the EU project I have recently started working on, EUNOIA. Barcelona’s bicycle sharing system is not unlike London’s, in terms of its size, shape and usage characteristics – although the general downward slope of the city causes headaches for its operator. Jon gets bonus points for including not only a quote from this blog on his presentation, but Martin’s beautiful routed bike-flow animation for London, and Dr Jo Wood’s more recent bi-directional flow animation, again of London.

IMG_2887

Etienne Côme, from the hosting school, was next on, with an analysis of the biggest system (outside of China) of all – the Vélib in Paris. The Vélib is perhaps the holy grail of academic research in the field as its size, and Paris’s multiple commercial and residential zones, means that community and network analysis is likely to be eye-opening. Similar to Pierre, Etienne outlined eight detected communities, by looking at temporal variations in the origin-matrix between the 1200-odd stations on the Vélib network.

IMG_2914

After lunch, Vincent Aguilera was first on, with a switch away from bicycle sharing systems but showing some techniques that have potential for the field – Vincent looked at using mobile phone network data to detect station dwell times and true journey durations on a section of the RER metro in Paris. He compared this data with Twitter messages with appropriate hashtags (below), and the real-time running supplied by the operator on its website. The availability and structure of the cell-towers on the network allowed a direct comparison to be made – indeed, such data may actually be of better quality than that currently at the operator’s disposal, allowing more fine-tuned operation and monitoring.

IMG_2925

Neal Lathia was next with a look at London’s system – specifically the effects caused by the addition of casual (i.e. non-key, non-member) availability in December 2010. The additional option did see some changes in the usage of certain docking stations. The comparison was done by clustering the network’s docking stations by time, before and after the transition, and then seeing which stations changed cluster. One of the main areas of change was in the very heart of London, around the Trafalgar Square area, suggesting a slight shift away from the (still dominating) railway station-based usage patterns.

IMG_2948

Fabio Pinelli’s talk was wide-ranging – it included system design, routing for Dublin’s (over)used system, and a look at the reliability of the Vélib fleet.

IMG_2950

Finally, Francis Papon from the hosting school took a step back from the modern electronically managed bicycle sharing systems and mobile/social data sources, and looked at change in uses of urban cycling more generally. His dataset stretched over a hundred years, rather than the typically five-year maximum historical range that bicycle sharing systems have. A key trend is that in the largest French cities studied, including Paris, there is a recent (post-2000) renaissance in urban cycling usage, but this is not matched in many of the country’s smaller cities.

The workshop concluded with a general discussion of the research field to date and its direction. What was particularly interesting was that several bike sharing operators were in attendance, and they were fully engaged with the academic research being carried out, asking questions but also revealing some nuggets of information about how the systems are rebalanced, relative costs of operations and why they thought some systems were more successful than others.

Hopefully there will be more such workshops in the future in Europe – with UCL CASA, Cambridge, City University London and LSHTM all involved in the field, maybe there should be one taking place in London next year?

A Periodic Table for London

Here is a webpage that uses my own CityDashboard API*, to build a Periodic-Table inspired “data artwork” of live London information, as a series of coloured square panels on a website. The squares update regularly with fresh information, and throb red (or blue) if there are particularly extreme values present.

As an artwork, it’s deliberately not 100% clear what it shows. A key on the bottom right will help a bit, but a degree of guesswork will be needed for some of the panels. With a bit of thought, almost all of the panels should be decipherable.

It’s a super-simple webpage. I’m using CSS3 for the animations – no Javascript used. The page is customised to be most relevant to the CASA office here in central London – the chosen weather station, bike share stands, air quality monitor and variable message road sign have been chosen accordingly. A more sophisticated version – which doesn’t currently exist but would be simple to do – would use a combination of the location information in the CityDashboard feeds, and the HTML5 geolocation functionality of many browsers, to show a version more relevant to where in London the viewer is.
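
The location-aware version could work roughly as follows: take the viewer's latitude and longitude (as the browser's geolocation would supply them) and pick the closest weather station, docking station or sensor by great-circle distance. The Python below is a minimal sketch of that selection step only; the "lat" and "lon" keys on each source are an assumed layout, not the actual CityDashboard feed format.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius


def nearest(viewer, sources):
    """Return the source dict closest to the viewer's (lat, lon) tuple."""
    return min(
        sources,
        key=lambda s: haversine_km(viewer[0], viewer[1], s["lat"], s["lon"]),
    )
```

Running nearest once per panel type (weather, bike share, air quality, road sign) would be enough to tailor the whole page.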

As the page is so simple, it displays well on mobile browsers – on my iPhone, the webpage shows four panels on each row. On larger displays, it will rearrange appropriately. See the acknowledgements link on the page to see where the data’s coming from – the same sources as CityDashboard, including TfL, DEFRA, Yahoo! Finance and Mappiness, as well as CASA’s own sensors.

I created the piece for the ODI’s recent Data as Art installation competition – I didn’t win, but decided to do it anyway.

Live version here.

*Strictly, I’m using my Bike Share Map data for the individual docking station information – this could be easily added to the CityDashboard API in due course.

Update to CityDashboard CSV API & iPad Wall!

I’ve made some minor alterations to the CSV API for CityDashboard. The main changes are in the metadata rows (the top two) rather than the subsequent rows. Specifically, the top metadata row has now split out the description, source and source URL – which were previously rather messily combined into a bit of HTML – into three text fields; and the second metadata row now uses properly formatted names for value titles, i.e. including spaces, and units, for example “broken_pc” now becomes “% docks/bikes broken”.
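
A consumer of the revised format might read it along these lines. This is a hedged sketch only: it assumes the first row carries exactly the description, source and source URL in its first three fields, the second row carries the value titles, and everything after that is data. The sample text in the usage note is invented for illustration and is not real API output.

```python
import csv
import io


def parse_module_csv(text):
    """Split a feed into (metadata, titles, records), assuming two metadata rows."""
    rows = list(csv.reader(io.StringIO(text)))
    # First metadata row: description, source, source URL (assumed order).
    description, source, source_url = rows[0][:3]
    metadata = {
        "description": description,
        "source": source,
        "source_url": source_url,
    }
    # Second metadata row: human-readable value titles, used here as dict keys.
    titles = rows[1]
    records = [dict(zip(titles, row)) for row in rows[2:]]
    return metadata, titles, records
```

For example, feeding it "Bike share status,Example source,http://example.com/" followed by a title row of "Docking station,% docks/bikes broken" and a data row of "Example Street,4" gives one record keyed by the properly formatted titles.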

The reason for these changes is to accommodate a new and exciting use of the API here at CASA – our lab hardware specialist has recently been hard at work building an “iPad wall” and one of the visualisations in it is of CityDashboard data. Here’s what the uncompleted – but operational – iPad wall looks like (source):

It’s a physical CityDashboard!

I also took the opportunity to fix a few bugs and typos – mainly just cosmetic, but including a pretty silly one for the Mappiness-sourced data that was over-reporting the true value by a large and variable amount. Entirely my fault. That will serve me right for doing a coding change during a colleague’s Ph.D viva drinks reception! I also handle temporarily unavailable source feeds a little better – they’ll now appear unavailable for one complete update cycle but it means the source server doesn’t get repeatedly hammered until it comes back up again.
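
The feed-handling change amounts to a very simple back-off. The sketch below is one way such a rule could be written, assuming a failed fetch raises OSError and that the caller treats None as “show unavailable”; the class and function names are hypothetical and this is not the CityDashboard code itself.

```python
class FeedState:
    """Remembers whether the next update cycle should skip this feed."""

    def __init__(self):
        self.skip_next = False


def poll(state, fetch):
    """Run one update cycle; return the fetched value, or None if unavailable."""
    if state.skip_next:
        # Sit out one complete cycle without contacting the source server.
        state.skip_next = False
        return None
    try:
        return fetch()
    except OSError:
        # Fetch failed: show unavailable now and skip the following cycle too.
        state.skip_next = True
        return None
```

With this rule a source that is down gets contacted on every other cycle at most, rather than on every one.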

The Electric Tube

In six weeks’ time, London will have a second orbital railway. The Circle Line has been running for just over 100 years, and on 9 December will be joined by the latest addition to Transport for London (TfL)’s Overground network – a link between Clapham Junction in the south-west and Surrey Quays in the south-east. This means that the West London Line, North London Line, East London Line and South London Line will all be linked up (you won’t be able to travel 360 degrees on one train though – you’ll need to change at both Highbury & Islington and Clapham Junction, and often Willesden Junction, to complete a circuit). Should you travel around the complete loop, you’ll pass through areas as varied as Imperial Wharf, Dalston Junction, Whitechapel and Peckham Rye.

Anyway this was a tenuous excuse for me to produce a diagram – above – of London’s TfL-owned network – the Underground, the Overground, the DLR, Tramlink and the Cable Car. Click the graphic for a larger version. My starting principles for the diagram were concentric circles for the orbital sections of the Circle Line and the Overground network, and straight lines for the Central and Piccadilly Lines, with the latter two converging in the centre of the circles. I then squeezed everything else in. I realised that the Northern Line’s Bank branch passed the Circle Line three times so was going to need something special, so I added a sine wave for this section, and extended this north and south as much as possible.

The River Thames is on there – because any tube diagram doesn’t look correct without the river – and the diagram is topologically accurate – everything connects correctly, and features are in an approximately correct geographical position relative to their neighbours, but not to the diagram overall. Only stations that are designated intersections, or have connections with National Rail stations, are shown. I haven’t labelled anything. It’s art.

I was also thinking about physics when creating the diagram – specifically Feynman diagrams, bubble chamber traces, particle physics collisions, magnetic flow lines and electrical circuit diagrams (as was Beck himself). Hence why I’ve called it the Electric Tube.

The work was also inspired by the likes of Francisco Dans (more) and Project Mapping, as well as of course the famous Official Tube Map. 1 November Update – I’ve updated the map slightly to add in Tramlink and a few more connections.