Bioinformaticians Have a Leg Up on Data Issue, Standards Needed
Scientists have a problem: far more data is being created than there is structure for managing it. The Chronicle of Higher Education reported on the issue in a Feb. 10, 2011, article by Josh Fischman titled “Dumped On by Data: Scientists Say a Deluge Is Drowning Research.” The article points the finger at “the lack of data libraries, insufficient support from federal research agencies, and the lack of academic credit for sharing data sets.”
Dr. Randy Zauhar, Associate Professor and Graduate Program Director for Bioinformatics at University of the Sciences, understands the issue firsthand:
"In general, Bioinformatics has done a better job of data standardization than other disciplines. There are several reasons for that. First, it included computer scientists at the beginning, and they are much better than biologists (or even physicists) at keeping raw data systematized. (Biologists did a great job with taxonomy, but that developed a long time ago, and it was in Greek and Latin, not binary!) Second, the raw data has a simple format (just text strings).
"That said, Bioinformaticians struggle today in dealing with the sheer volume of data generated by contemporary sequencing methods. A more pressing problem in my view is moving away from the raw data (where standardization is easy) and into the world of annotation, where you attach meaning to the raw data. Here there is much less uniformity, and that has prompted interest in development of ontologies (essentially controlled vocabularies). If standard ontologies can be agreed on, there may be some hope of better systematizing ALL the information being generated. However, given that one of our major resources, NCBI (National Center for Biotechnology Information), has actually become LESS friendly to the end user over the years, I do not feel very optimistic."