Wikidata can change the way citizen scientists contribute

If you’ve been following discussions on citizen science, you’ve probably realized that researchers are generating so much data that they need extensive help parsing it and making it more useful. For many projects, citizen scientists have answered the call for help and made enormous contributions. Sure, a recent study found that “Most participants in citizen science projects give up almost immediately”, but as Caren Cooper pointed out:

Just by trying, citizen scientists made important contributions regardless of whether or not they chose to continue.

But I digress...
What does citizen science have to do with Wikidata? For that matter, what the heck is Wikidata?

Most citizen science contributions come as some form of data collection (observations; sample collection; taking measurements, pictures, coordinates, etc.) or classification (identification, data entry, etc.), but few citizen scientists participate in analyzing the data.

Figure from ‘Surveying the Citizen Science Landscape’ by Andrea Wiggins and Kevin Crowston (the paper is open access)

Wikidata (a linked, structured database for open data) may change that. Naturally, Wikidata relies on the contributions of volunteers; however, the data incorporated into Wikidata is open for anyone to use. In fact, Wikidata is begging to be used, and citizen scientists and citizen data scientists are welcome to use it. An international group has already put together a grant proposal (open/crowdsourced in the true spirit of Wikipedia) to make Wikidata an open virtual research environment. Dubbed Wikidata for Research, the proposal aims to establish “Wikidata as a central hub for linked open research data more generally, so that it can facilitate fruitful interactions at scale between professional research institutions and citizen science and knowledge initiatives.”
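To give a rough idea of what “open for anyone to use” can look like in practice, here is a minimal Python sketch that builds the address of one Wikidata item’s JSON record and pulls out its English label. The helper names (entity_data_url, english_label, fetch_entity) and the User-Agent string are my own placeholders, and the item Q42 (Douglas Adams) is just a convenient example; treat this as a starting point rather than the recommended way to query Wikidata.

```python
import json
from urllib.request import Request, urlopen

# Public per-item JSON endpoint; {qid} is a Wikidata item ID such as "Q42".
WIKIDATA_ENTITY_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def entity_data_url(qid):
    """Return the URL of the JSON record for one Wikidata item."""
    return WIKIDATA_ENTITY_URL.format(qid=qid)


def english_label(entity_doc, qid):
    """Extract the English label from an already-downloaded entity record."""
    entity = entity_doc["entities"][qid]
    return entity["labels"]["en"]["value"]


def fetch_entity(qid):
    """Download one entity record (needs network access).

    The User-Agent value is a made-up placeholder; replace it with
    something that identifies your own script.
    """
    request = Request(
        entity_data_url(qid),
        headers={"User-Agent": "citizen-science-demo/0.1 (example script)"},
    )
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    doc = fetch_entity("Q42")
    print(english_label(doc, "Q42"))
```

The download step is kept separate from the parsing step so that the parsing can be tried out on a saved copy of the record without touching the network.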

As exciting as this all is, a lot of work still needs to be done to make Wikidata more successful. Although it’s open access, it’s still a bit inaccessible because new users lack clear documentation. It’s not that the information doesn’t exist: there is a ton of information on Wikidata, and a lot of neat tools are already available or in development. You just have to look really hard for it. Fortunately, the Wikidata community is already aware of the key issues it needs to address in order to become more successful.

Researchers have already made considerable efforts to make science more accessible by contributing to science-related articles. There are over 10,000 genes already in Wikipedia thanks (in part) to the Gene Wiki initiative! It makes sense that Wikidata is next. A lot of progress has been made in this arena, but I’ll save that for later.

NoB Hackathon and Mark2Cure status update

The NoB Hackathon starts at 6:00pm today! Don’t forget to register before you go.

The programmer behind Mark2Cure has made incredible progress setting up the tutorial and basic infrastructure of the site and is now busy improving the landing page and overall usability. Plans for the first experiment are already under consideration, but we need more citizen scientist volunteers.

NoB Hackathon

The Network of Biothings is aiming to alleviate the problem of too much information using a number of different approaches such as:

  • Crowdsourcing
  • Natural Language Processing
  • Citizen science
  • Microtask markets
  • Professional biocuration
  • Scientific publishing
  • Open Innovation Challenges

With resources already devoted to (and making incredible progress on) some of the citizen science aspects (i.e., Mark2Cure) and some of the scientific publishing (i.e., GENE/Gene Wiki), it’s time to draw attention to a crowdsourcing/open innovation challenge effort that is already underway: the 2nd NoB Hackathon, taking place from 11/7/2014 to 11/9/2014 at UC San Diego on the 5th floor of the CALIT2 building.
To join in on the fun, you will first need to register, but scholarships are available to cover registration expenses. Details here.

Edit: Here are some ideas for the NoB hackathon; feel free to add your own.

Too much information part 2

Yesterday’s Neat Science Thursday post was about the growing amount of complex biomedical research and omics data that needs to be organized, integrated, and analyzed in order to be effectively applied. Similar issues apply to the growing volume of biomedical research publications, which take an increasing amount of time for researchers to examine thoroughly. The Network of BioThings aims to:

    “structure biological knowledge by annotating BioThings (genes, proteins, mutations, diseases, and drugs) in biomedical research articles. We want to comprehensively annotate the mentions of these BioThings in research articles, and also want to describe the nature of the relationships between these entities. Finally, we want to do it in roughly real-time (within one week of publication)”.
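To make “annotating the mentions of these BioThings” a bit more concrete, here is a toy Python sketch of the simplest possible approach: looking up terms from a small dictionary in a sentence. This is only my illustration, not how the Network of BioThings does it; the lexicon, the annotate function, and the example sentence are all made up for the demo, and real annotation has to cope with synonyms, abbreviations, and ambiguity that plain string matching misses.

```python
import re

# Hypothetical toy lexicon mapping a term to its BioThing type.
LEXICON = {
    "BRCA1": "gene",
    "p53": "protein",
    "breast cancer": "disease",
    "tamoxifen": "drug",
}


def annotate(text, lexicon=LEXICON):
    """Return (start, end, mention, type) tuples for every lexicon hit.

    Matching is case-insensitive and restricted to whole words, and the
    results are sorted by their position in the text.
    """
    annotations = []
    for term, kind in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(0), kind))
    return sorted(annotations)


if __name__ == "__main__":
    sentence = "Tamoxifen is used to treat breast cancer in BRCA1 carriers."
    for start, end, mention, kind in annotate(sentence):
        print(f"{start}-{end}\t{mention}\t{kind}")
```

Describing “the nature of the relationships between these entities”, the second half of the goal, is a much harder problem that this sketch does not attempt.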

Finding buried treasure in shifting sand

The problem of keeping up with scientific literature is not new. In 1986, information scientist Don R. Swanson published an article about mining the wealth of knowledge buried in academic literature. In his article, “Undiscovered public knowledge”, Swanson investigated information that was not readily available simply because individual biomedical research papers were (and in many ways still are today) created “to some degree independently of one another.” By investigating literature that was “logically connected” but otherwise “non-interactive”, Swanson teased out a hypothesis essentially joining two small fields of research.
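The idea of literatures that are “logically connected” but “non-interactive” can be sketched in a few lines of Python: if topic A appears in papers with topic B, and B appears in papers with topic C, but A and C never appear together, then A and C are a candidate hidden connection. The code below is my own simplified illustration and not Swanson’s actual procedure. The function names are hypothetical, and the four “papers” are invented toy records, loosely modeled on the fish oil and Raynaud’s syndrome example usually associated with his work.

```python
from collections import defaultdict


def cooccurrence(papers):
    """Map each term to the set of terms it shares at least one paper with."""
    links = defaultdict(set)
    for terms in papers:
        for a in terms:
            for b in terms:
                if a != b:
                    links[a].add(b)
    return links


def hidden_links(papers, start):
    """Find terms linked to `start` only through an intermediate term.

    Returns {candidate: [intermediate terms]} for every candidate that
    never appears in the same paper as `start`.
    """
    links = cooccurrence(papers)
    direct = links.get(start, set())
    candidates = defaultdict(set)
    for bridge in direct:
        for other in links[bridge]:
            if other != start and other not in direct:
                candidates[other].add(bridge)
    return {term: sorted(bridges) for term, bridges in candidates.items()}


# Invented toy data: each set holds the topics mentioned in one paper.
PAPERS = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's syndrome"},
    {"platelet aggregation", "Raynaud's syndrome"},
]

if __name__ == "__main__":
    print(hidden_links(PAPERS, "fish oil"))
```

With real literature the hard part is everything this sketch skips: extracting the topics reliably from each paper in the first place, and filtering the flood of candidate links down to the few worth testing.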

Since then, researchers have kept trying to develop methods to wade through this ever-growing body of literature, only now about one million new biomedical research articles are published per year, compared to the roughly 350 thousand published in 1986.

Finding the right information is a problem that's only going to get worse unless we do something.
Compound this issue with the growing amount of information that is now contained within biomedical research literature, but is not readily accessible due to lack of appropriate annotations.

You can lead a researcher to cash, but you can’t make them apply

One would think that a $500 prize would be sufficient to motivate a cash-strapped researcher or a starving scientist-in-training to put together an entry for the Network of Biothings Driving Biological Projects contest. Heck, since the contest is open to anyone with an imagination, one would think that someone somewhere would be interested in submitting an entry, especially since NO actual programming is needed for this contest.

The contest simply asks: if the entire body of biomedical literature were easily searchable (if it were annotated for all variables in each article), what biomedical research question would you answer? What variables would you need to answer that question, and why can’t it be answered now?

With a deadline looming in just a few days (the contest closes on June 30th), we could really use some entries (i.e., good ideas). So, what are you waiting for? Enter the contest already. Details can be found here.