Cold/Cozy Mice Paper

2019-07-19T08:03:20Z (GMT) by Helena Deus
Problem Statement: the task facing biomedical scientists hoping to find publications that corroborate or debunk a hypothesis is akin to finding a needle in a haystack that keeps growing. Strategies for mining or summarizing the scientific literature exist, but they have largely focused on recovering named entities (e.g. proteins, cells), on more sophisticated methods that use ontologies to also recover related terms, and, more recently, on machine learning methods where sufficient training data is available.

Our Approach: we describe a use case faced by a biomedical scientist who needs to compare tumor volume/weight results across papers describing mouse experiments in which mice were exposed to the same or similar compounds but housed at different temperatures. In our approach, we extracted annotations of units and measures (U&M) from the scientific literature and combined them with contextual information (e.g. section of the paper, patterns in sentences) to identify the specific entity being measured (Housing Temperature, Compound Dose).

Results and Discussion: from a corpus of 1M open access publications we found 299 relevant papers using the U&M approach combined with the surrounding contextual information. We found a clear prevalence of papers mentioning housing conditions in the range of 20-25 °C, which is the approximate temperature range suggested by NIH guidelines. We also found a small increase in the number of papers describing thermoneutral mouse housing conditions in the period (2014-2016) following the observation that this variable affects how mice respond to chemo-toxic drugs. We discuss how our approach of using U&M combined with contextual information can serve as a starting point for generating large, minimal-intervention datasets for machine learning purposes.

This dataset contains our results.
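
The idea of pairing a units-and-measures match with sentence context can be illustrated with a minimal sketch. It is not the pipeline used to produce this dataset: the regular expression, the cue-word list, and the function name `find_housing_temperatures` are all hypothetical and chosen only for illustration. The sketch accepts a Celsius value or range as a housing temperature only when the same sentence also contains a housing-related cue word.

```python
import re

# Hypothetical U&M pattern: a number, or a numeric range, followed by a
# Celsius unit written as "C" or "°C" (e.g. "22 °C", "20-25C", "30 to 31 °C").
TEMP_PATTERN = re.compile(
    r"(?P<low>\d+(?:\.\d+)?)"                          # first value
    r"\s*(?:(?:-|–|to)\s*(?P<high>\d+(?:\.\d+)?))?"    # optional upper bound
    r"\s*°?\s*C\b"                                     # Celsius unit
)

# Hypothetical contextual cues suggesting the sentence is about housing.
HOUSING_CUES = ("housed", "housing", "maintained", "kept", "acclimated")


def find_housing_temperatures(sentence):
    """Return (low, high) Celsius pairs found in a housing-related sentence.

    A single value is returned as (value, value). Sentences without any
    housing cue yield an empty list, even if they mention a temperature.
    """
    lowered = sentence.lower()
    if not any(cue in lowered for cue in HOUSING_CUES):
        return []
    results = []
    for match in TEMP_PATTERN.finditer(sentence):
        low = float(match.group("low"))
        high = float(match.group("high")) if match.group("high") else low
        results.append((low, high))
    return results


if __name__ == "__main__":
    examples = [
        "Mice were housed at 22 °C under a 12 h light cycle.",
        "Animals were kept at 30-31°C throughout the study.",
        "Samples were centrifuged at 4 °C.",
    ]
    for text in examples:
        print(text, "->", find_housing_temperatures(text))
```

A cue-word filter this simple will still produce false positives (for example, tissue "kept at 4 °C"), which is why contextual signals such as the section of the paper are also needed in practice.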