Jun 23

I posted earlier today about the mountain of cancer data donated by GlaxoSmithKline. While scientists and big pharmaceutical companies are constantly generating more biological data, the bottleneck is mining this data to find useful information.

Over at PIMM, Attila discusses a feature article in Wired magazine and suggests that a potential solution to this data infoglut problem (which is only going to get larger) might be Google, whose mission statement is: “to organize the world’s information and make it universally accessible and useful”. Could Google, with its infrastructure and high concentration of very smart people, make sense of the current, and coming, infoglut of scientific data sets? Very interesting question.


And an update on the personal genomics squabble between the California government, which sent cease and desist orders, and the genetic testing companies is covered here.

Jun 23


Many scientists believe that in the near future there will be breakthroughs due to our ability to generate huge databases and, more importantly, our ability to mine them.

GlaxoSmithKline has done a huge and expensive study on cancer and, as of June 20, 2008, has released a mountain of data for free. They are not releasing all of the data, but the vast majority of it.

The data is mostly in the form of microarray results, which the company has given to the Cancer Biomedical Informatics Grid (caBIG), part of the National Cancer Institute, to house.

Here is a link to where the data is located. It contains the genomic profiles of 300 cancer cell lines. The genomic profiles include the results of both SNP microarrays and microarrays that measure mRNA transcript expression.

Now this is potentially a fantastic resource for the hundreds of thousands of cancer researchers. I am sure the data in this bank could be the start of hundreds of PhD degrees.

I really hope that the new generation of smart open source scientists can harness this data and make gold out of straw (one example would be Shirley Wu).

Taking a slightly cynical route, do you think GlaxoSmithKline, or their shareholders, would be happy with the company giving away information that contains ‘value’? Or is it more likely they have used massive computer power and all their bioinformatics personnel to strip out all the gold before releasing this mountain of data freely to the public? Yes, this move will build some ‘good will’ for GlaxoSmithKline with the public, which is of some worth, especially given the current state of the public’s trust in big pharmaceutical companies. However, it seems unlikely that big pharma is really in the habit of giving away valuable information. Could it be that even after sifting through this mountain of data on cancer cell lines they found only a few nuggets, and that overall the mountain was barren? That would be a scary thought regarding future cancer research.