Not So Wild a Dream: The Science 2.0 Federated Search Dream Machine

Hope Leman received a $500 prize for achieving second place in the 2nd annual Federated Search Blog contest. Her essay, Not So Wild a Dream: The Science 2.0 Federated Search Dream Machine, is published here in its entirety.

Hope is a research information technologist for Samaritan Health Services in Oregon where she is helping to develop a service to help scientists and public health researchers find professional conferences and places to submit their research papers. Hope’s essay shares her dream of creating a federated search engine to help scientists with some key aspects of research: finding the current state of research on a topic and finding calls for papers and presentations.

Not So Wild a Dream: The Science 2.0 Federated Search Dream Machine -Hope Leman

Many of us love someone who is ill. Most of us have loved someone who has been cut down far too early in life by illness. What can we do to enable scientific researchers to make quicker progress in such fields as cell biology, neuroscience, pharmacology and other realms that will lead to breakthroughs that will help prevent or cure devastating diseases? We can make crucial information easier to find and disseminate. Federated searching (the ability to search multiple databases simultaneously) is one way to do that.

My dream is to create a free online federated search engine that a researcher could use to find everything she needs to begin to solve key questions in science, such as what causes human motor neuron cells to die. She would start with fundamental questions such as what is the current state of research on this topic. Thus, she would need to be able to search databases of the scientific/medical literature such as PubMed. That would be the first database our dream machine would need to be able to search.

Our dream machine is taking shape. It is able to search PubMed at the speed of light.

What next? As I have been thinking hard about what could be done to help people with brain diseases, I have become fascinated by the Science 2.0 movement and its subset, Open Science.

One of the things I have noticed is that many scientists don’t even realize that their peers elsewhere are working on issues similar to their own research interests. Sometimes that is because so much of science crosses so many disciplinary boundaries that a scientist working in one field (say, biochemistry) may not realize that a counterpart in a comparable but different field (say, molecular biology) has made an important discovery or developed a useful tool that has wide-ranging implications.

Often, such discoveries are discussed at scientific conferences. But there doesn’t seem to be a good search engine that would enable scientific researchers to simply enter their research interests and determine which conferences will feature presentations on those interests. What is needed is a search engine that would enable scientists to set up alerts for slated presentations in their fields, the better to develop interdisciplinary scientific networks and thereby advance science.

Ideally, our dream machine would enable users to track the subject matter and proposed agenda of scientific conferences and to determine, if possible, which scientists are to speak at or at least attend such conferences. Users would be able to enter into the database part of the search engine their own plans to attend such conferences and the subject matter of their proposed presentations. Some directories of conferences already exist on the Web, but they are limited in scope and technical sophistication. What is needed is a powerful awareness tool that would truly facilitate the creation of collaborative endeavors in the sciences.

Thus, so far our dream machine would enable scientists to determine what the state of the science is (PubMed) on their topic and determine where that topic is slated to be discussed in coming months (let’s call that part of our dream machine Meeting Matcher). What else does our dream machine need?

Scientists need to publish the results of their research so as to apprise their peers of the progress they are making. I am blessed in my work for a Center for Health Research and Quality to work on developing and providing free Web services for the greater research community. Currently, we are developing a platform (called ResearchRaven) that would include databases that would enable researchers to locate calls for papers for periodicals and calls for papers and presentations for professional meetings in their fields. But those would be simply aggregations of information. It would take our dream machine to render all that information findable and usable.

Okay, so far our dream machine can provide speedy results from PubMed, Meeting Matcher and ResearchRaven.
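For readers who like to see an idea in code, here is a minimal sketch of what "federated" means at this stage: one query is sent to every source at the same time and the answers are gathered into a single list. Everything in it is an assumption made for illustration. The three sources are small in-memory lists of invented titles, and `search_source` and `federated_search` are hypothetical names; a real dream machine would call each service's own search interface instead.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Result:
    source: str
    title: str


# Hypothetical in-memory stand-ins for the three sources named in the essay.
# The titles are invented; a real system would query each service directly.
SOURCES = {
    "PubMed": ["Motor neuron death in ALS", "Protein folding review"],
    "Meeting Matcher": ["Symposium on motor neuron disease"],
    "ResearchRaven": ["Call for papers: motor neuron biology"],
}


def search_source(name, query):
    """Return the records from one source whose title contains the query."""
    needle = query.lower()
    return [Result(name, title) for title in SOURCES[name] if needle in title.lower()]


def federated_search(query):
    """Send the same query to every source at once and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        # pool.map yields one batch per source, in the order of SOURCES.
        batches = pool.map(lambda name: search_source(name, query), SOURCES)
        return [hit for batch in batches for hit in batch]


for hit in federated_search("motor neuron"):
    print(f"{hit.source}: {hit.title}")
```

The thread pool is only one way to run the searches side by side; the point is that the researcher types her question once and the machine does the legwork across every source.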

What else do we want our dream machine to be able to do? Let’s get to an ugly but undeniable fact: science takes money. Thus, we want the dream machine to help researchers find the grant money and fellowships that will fund their projects and support them so that they can remain in science. I am privileged to be able to work on a free online listing of funding opportunities and scholarships in the health sciences, ScanGrants. I list as much as I possibly can in ScanGrants, but our dream machine would mine not only ScanGrants but also other such funding databases, and it would seek out and capture any mention on the Internet of funding in the sciences, from tiny abstract or poster awards for graduate students in toxicology, to $2,000 travel grants for young scientists in the field of biological psychiatry, to million-dollar awards for scientific achievement in cancer research.

Indeed, there are already powerful platforms such as Elsevier’s new SciVal Funding (which also leverages Elsevier’s vast resources vis-à-vis data on the published literature and its array of reference management tools) and Community of Science. But these are commercial products available only to those affiliated with resource-rich research institutions or government agencies. The dream machine would be free online and accessible to everyone from an impecunious graduate student in cognitive psychology in Ohio to a junior faculty member in a chemistry department in Kenya or Honduras.

Okay, our dream machine can now enable a researcher to determine what has been done so far on her topic. She can learn about activity in the journals and at conferences. She can look for funding. And she should be able to determine, à la SciVal Funding and new tools such as ResearchScorecard, who the leading researchers and grant getters are on her topic. It would also be nice if there were a gigantic opt-in listing somewhere of the contact info of every scientist on the planet and a massive repository of all the Open Access material that is currently lying underutilized. WorldWideScience.org is well on its way to becoming that portal, and our dream machine would have to be able to search it superbly.

What else does our dream machine need to be able to do? Well, it would need to be able to find the increasingly large amount of scientific information that is being added to specialty databases such as ChemSpider.

What else? Our dream machine will have to be able to detect the fact that key information is contained in documents that have been uploaded to the increasing number of pre-print services such as Nature Precedings. Our dream machine will not force researchers to wait around for the cataloguers at PubMed to enter an item into it once the traditional (and slow-moving) scientific publishing process is completed. Our dream machine will also be able to capture data on the latest papers published by the growing number of Open Access journals published by such entities as the Public Library of Science (PLOS) and the papers that will be curated by such institutions as Harvard and MIT under their new Open Access mandates.

And given how much key information is now being circulated via such new services as ResearchBlogging.org and in comments on blogs not affiliated with it, our dream machine will have to be able to handle blog search. And what appears in Twitter. And in Google Wave. And in the lively discussions that take place in science social networking sites like the Life Scientists Room of FriendFeed.

And then there is the whole world of Open Science and Open Notebook Science. Brilliant thinkers like the chemist Jean-Claude Bradley of Drexel University and the British biochemist Cameron Neylon are creating new paradigms of how science is done and the results disseminated. More and more research is being conducted via a plethora of online collaborative tools and venues, and our dream machine will have to be able to detect, digest and render usable all of that material. It will have to be able to search in a variety of formats (e.g., wiki pages, slide shows, key phrases in a lecture uploaded to YouTube or in a podcast interview with a notable researcher, comments on blog posts, key words in PDFs). It will have to be able to disambiguate and sort by date and author and enable quick downloads of PDFs and other kinds of documents and files.
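As a rough illustration of that last requirement, the sketch below takes one narrow reading of "disambiguate": treating two records as the same item when their titles and authors match after lowercasing and tidying whitespace. It then sorts what remains with the newest first, and by author within the same date. The `Record` fields, the `normalize` and `merge_records` names, and the matching rule are all assumptions for the sake of the example; telling apart two different scientists who share a name is a much harder problem that this does not attempt.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    title: str
    author: str
    published: date
    source: str  # e.g. "wiki", "blog", "preprint" (hypothetical labels)


def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def merge_records(records):
    """Drop duplicates (same normalized title and author) and sort the rest."""
    unique = {}
    for rec in records:
        key = (normalize(rec.title), normalize(rec.author))
        unique.setdefault(key, rec)  # keep the first copy seen
    # Sort by author first, then by date with the newest first. Python's sort
    # is stable, so records sharing a date stay in author order.
    by_author = sorted(unique.values(), key=lambda r: normalize(r.author))
    return sorted(by_author, key=lambda r: r.published, reverse=True)
```

Called on a list gathered from several formats, `merge_records` would return each item once, so the same lab notebook entry found on a wiki and echoed on a blog appears as a single result.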

It will have to be designed so that those with any sort of disability can use it and be equipped with a translation feature. It would have to be subscribable via RSS or email. It would contain “Tweet this,” “email this item” and social bookmarking buttons. It is imperative that it be optimized for the easiest possible transmission of important news to the scientific community.

Finally, given how much of the new world of Open Science depends on the creation of Web tools, our dream machine should have the capacity to facilitate the adoption (and improvement via the dissemination of news about enhancements in real-world settings) of such new tools. The dream machine should enable users to enter their professional interests (e.g., librarianship or epidemiology) and download free applications suited to those interests. Such a feature of our dream machine would increase awareness and adoption of Open Source tools, thereby benefiting the creators of such tools (who have devoted many hours and applied their skill to creating them) and the researchers and institutions who could save money and master new technical skills by adopting them.

I started out on this journey into the future of federated searching by saying that most of us know someone who is ill. I love someone with the terribly disabling disease amyotrophic lateral sclerosis (Lou Gehrig’s Disease). I don’t have the scientific or medical skills to cure her illness. I can only walk her dog or weed in her yard. But as I do those things, I often daydream about how technology could be employed to create things like our dream machine that would help scientists help my friend. It is not so wild a dream.

First posted on the Federated Search Blog.