Open Data and Science

Jerry Sheehan

The growth of web publication of data from governmental and private sources has substantially increased the quantity of freely available data. This availability, coupled with Semantic Web ("Web 3.0") technologies, creates a foundation for bringing decentralized, heterogeneous data sources together into shared public repositories. One example of this trend is Freebase, which bills itself as an open database of the world's information. The database was built using the public information in Wikipedia and has been expanded through individual contributions and automated aggregation. The site covers millions of topics in hundreds of categories, all available for public derivative use via an open Application Programming Interface (API).

Creative Commons has recently focused its efforts on identifying better ways to promote the sharing of information in scientific databases. The legal complexity surrounding this issue is outlined in Databases and Creative Commons.

While the obstacles to free scientific data remain substantial, the potential payoff may be dramatic. The Neurocommons project aims to create an open source knowledge management platform for biological research, promising greater efficiency in biomedical work:

"With this system, scientists will be able to load in lists of genes that come off the lab robots, and get back those lists of genes with relevant information around them based on the public knowledge. They’ll be able to find the papers and pieces of data where that information came from, much faster and more relevant than Google or a full text literature search, because for all the content in our system, we’ve got links back to the underlying sources. And they’ve each got an incentive to put their own papers into the system, or to make their corner of the system more accurate for the better the system models their research, the better results they’ll get."[1]

Free data is essential for science because it allows for the objective evaluation of experimental findings and the replication of experiments. As Michael Gough, an adjunct scholar at the Cato Institute, noted when testifying before the US House of Representatives on The Importance of Data Access for Science and Governance:

"Science depends on skepticism, review, criticism, and replication. Good science and good scientists thrive under those conditions. The science used to support regulations and taxes must be based on publicly available data for review and analysis. Otherwise, government, simply by calling any collection of data, conclusion, and conjecture "science" and refusing to let others see the data, has a free hand to impose taxes and regulations."[2]

Source: 

[1] http://sciencecommons.org/projects/data/
[2] http://www.cato.org/testimony/ct-mg071599.html
