Text-mining and intelligent analysis of knowledge in databases

Chairmen: Prof. Andrey Rzhetsky (The University of Chicago, USA), Prof. S.S. Goncharov and N.G. Zagoruiko (IM SB RAS, Novosibirsk, Russia).

Scientists today cannot hope to manually track all of the published scientific papers relevant to their work. A cancer biologist, for instance, can find more than 2 million relevant papers in the PubMed archive and more than 200 million Web pages through a Google search, while databases holding experimental results contain millions of gigabytes of data. This explosion of knowledge is changing the landscape of science. Computers already play an important role in helping scientists to store, manipulate, and analyze data. New capabilities, however, are extending the reach of computers from analysis to hypothesis. Drawing on approaches from artificial intelligence, computer programs are increasingly able to integrate published knowledge with experimental data, search for patterns and logical relations, and enable new hypotheses to emerge with little human intervention. Scientists have used such computational approaches to repurpose drugs, functionally characterize genes, identify elements of cellular biochemical pathways, and highlight essential breaches of logic and inconsistencies in scientific understanding. We predict that within a decade, even more powerful tools will enable automated, high-volume hypothesis generation to guide high-throughput experiments in biomedicine, chemistry, physics, and even the social sciences.