Neil Smalheiser

Informatics of Scientific Discovery

Date: May 1, 2015

Time: 12:30pm – 1:30pm

Room: Wells Library, Rm LI 030


Scientists working in the laboratory or clinic “play” with hypotheses and evidence in a very creative, flexible and sophisticated manner, which is not well captured by existing informatics models of scientific discovery. I will review some of the “Aha!” moments that have occurred in my own laboratory, and illustrate how they can be assisted by tools based on the concepts of “literature-based discovery” (the Arrowsmith A-B-C model), “undiscovered public knowledge”, and other use cases. The Arrowsmith project has spun off text mining models and tools that have taken on lives of their own. Notably, the Author-ity author name disambiguation tool predicts which individual wrote which article (for all authors and articles in Medline), providing a unique dataset for studying scientific innovation and collaboration, and creating new knowledge in and of itself (by linking disparate types of author-related information together). Finally, I will summarize our ongoing effort to create a pipeline of text mining tools to assist experts in assessing published evidence when writing systematic reviews.


Neil R. Smalheiser, MD, PhD, is Associate Professor in Psychiatry at the University of Illinois at Chicago. He has almost 30 years of experience pursuing basic wet-lab research in neuroscience, most recently studying synaptic plasticity and the genomics of small RNAs. He has also directed multi-disciplinary, multi-institutional consortia dedicated to text mining and informatics research. Regardless of the subject matter, one common thread in his research is linking and synthesizing different datasets, approaches and apparently disparate scientific problems to form new concepts and paradigms. Another common thread is identifying scientific frontier areas that have fundamental and strategic importance, yet are currently under-studied, particularly because they fall “between the cracks” of existing disciplines.