Showing posts from 2015

Visualizing Rebel Alliances in the UK Government

The UK will shortly go to the polls for the 2015 General Election. However, there's currently no clear front-runner, and in fact no clear coalition on the cards for a new government. The "new normal" of hung parliaments and coalition-forming in UK politics appears to be here to stay.

As such, I decided to take a look at the open dataset provided by The Public Whip project, with a view to visualizing the relationships between MPs (Members of Parliament) in the 2010 to 2015 UK government, using a tool called Gephi. The idea was to analyse how MPs are related through their voting patterns in the House of Commons, and in particular how they are related through agreement or rebelliousness.

Also, I'll admit it: I wanted to write an article with "Rebel Alliance" in the title because I like Star Wars.

In the rest of this article, I'll describe several visualizations that were created from the Public Whip dataset. These show various aspects of MP relation…

How to create your own replica of the SureChEMBL patent-chemistry dataset

Introduction: Why replicate SureChEMBL?

SureChEMBL is a patent chemistry dataset and set of web services that provide a rich source of information to the drug discovery research community. It was previously owned, developed, and sold by Macmillan, but was recently handed over to the European Bioinformatics Institute (EMBL/EBI) and is now free for everyone to use.

SureChEMBL can already be accessed online, so why would a locally hosted replica be needed?

To answer that question, I'll give the reasons provided by a pharmaceutical company that recently commissioned me to develop a SureChEMBL data replication facility:
1) Firewall restrictions can be avoided - companies involved in drug discovery are often working with substructures or other related search queries which may lead to highly lucrative discoveries. As such, researchers are often prohibited from using external web services, even secure services such as SureChEMBL, as a risk-mitigation strategy. Downloading data files - e.g…