Specifically, we are looking for projects that bring together global health experts and data scientists in teams that dig into rich data sets to answer the toughest questions about maternal, child, and newborn health in India. We want to take advantage of the data revolution and of computing power that keeps growing exponentially, and we also want to make sure that these quantitative analyses are grounded in deeply contextualized knowledge of disease and local communities. Our hope is that this approach will yield specific solutions to specific problems in the short term and a more general process for solving all sorts of problems in the long term.
This data challenge is sponsored by Grand Challenges-India (GC-India), a partnership between BIRAC and the Gates Foundation, and by the Gates Foundation’s knowledge integration initiative, known as Ki.
Since 2012, GC-India has issued calls for proposals on a range of topics and funded promising early-stage ideas so that innovators can discover whether those ideas might lead to breakthroughs. The point is to encourage a wider variety of innovators, with a wider variety of ideas, to join the project of solving health and development problems.
Since 2014, Ki has promoted data-driven solutions to some of the most confounding problems in maternal, newborn, and child health. Working with research scientists from around the world, Ki has collected data from more than 190 studies representing almost 12 million children and integrated it into a single knowledge base. Working with data scientists, the Ki team has also developed a set of cutting-edge tools to analyze the data and get actionable insights out of it.
Until August 25, BIRAC and the Gates Foundation will be accepting proposals for Grand Challenges-India grants that take a Ki approach (and use Ki data, although investigators can also bring in data sets they or their collaborators collected in the past or use data from public repositories). For additional information about the call, please go to this website. It includes a primer about data science for research scientists, a primer about maternal, newborn, and child health for data scientists, a summary of the data sets available for analysis, and detailed information about the types of proposals we’re looking for.
What do we think investigators can accomplish with these integrated data sets that they can’t accomplish in the usual way? Traditional global health research relies on clinical trials, each of which is typically designed to test a single hypothesis. However, when you take all the data generated by all the trials and integrate it into one big knowledge base, you can start asking many more questions. You can also start seeing connections and patterns across domains that never would have emerged from the standard hypothesis-based approach. For example, you might investigate how sanitation relates to enteric disease, which relates to stunting, which relates to cognitive impairment. What’s more, you can formulate new hypotheses and make sprawling connections in a fraction of the time and at a fraction of the cost.
Take an example from Ki’s work analyzing wasting, the condition of being too thin as a result of acute malnutrition. We know that wasting is devastating—in 2011, it was the cause of 875,000 child deaths. But we know very little about how wasting works—that is, at what age children tend to suffer bouts of wasting, for how long they tend to last, or how well children tend to recover from being wasted.
To start generating some insights that could help prevent and treat wasting and stunting, Ki looked through the 29 India studies covering 409,000 moms and babies and pulled together 11 that include measurements of height and weight to look for patterns. These 11 studies were compiled into a “giant data set” that is tailored to research questions about stunting and wasting in India. The data scientists spent two weeks on iterative data analysis, generating new and important insights regarding seasonality, incidence, and prevalence that were possible only because the composite data set has greater statistical power to ask questions at a new level of granularity. The new learning coming from these analyses is already helping the foundation adjust its investment strategy.
This brand-new information is not in itself a solution to wasting. But it tells experts where to look for solutions. And it was hiding in data that already existed in 11 different places, just waiting to be unlocked by an innovative data-sharing and analysis approach.
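For readers who want to see why a composite data set has more statistical power, here is a minimal Python sketch. It is not Ki's actual pipeline: the study names and measurements are invented for illustration, and the helper functions are hypothetical. It assumes the common WHO convention of classifying a child as wasted when the weight-for-height z-score is below -2, pools the records from several small studies by calendar month, and compares the standard error of the prevalence estimate from one study with the standard error from the pooled data.

```python
import math
from collections import defaultdict

# Hypothetical records per study: (month_of_measurement, weight_for_height_z).
# These values are invented for illustration; they are NOT Ki data.
STUDIES = {
    "study_a": [(6, -2.4), (6, -1.1), (7, -0.3), (7, -2.2)],
    "study_b": [(6, -0.8), (6, -2.6), (7, -1.9), (7, -0.5)],
    "study_c": [(6, -2.1), (6, -1.7), (7, -2.3), (7, -0.9)],
}

# Assumed cutoff: weight-for-height z-score below -2 counts as wasted.
WASTING_CUTOFF = -2.0


def is_wasted(z):
    """Return True if a z-score falls below the assumed wasting cutoff."""
    return z < WASTING_CUTOFF


def prevalence_with_se(z_scores):
    """Return (prevalence, standard error) for a list of z-scores."""
    n = len(z_scores)
    p = sum(is_wasted(z) for z in z_scores) / n
    se = math.sqrt(p * (1 - p) / n)  # binomial standard error
    return p, se


def group_by_month(records):
    """Group one study's records into a mapping: month -> list of z-scores."""
    by_month = defaultdict(list)
    for month, z in records:
        by_month[month].append(z)
    return dict(by_month)


def pool_by_month(studies):
    """Merge every study into one mapping: month -> list of z-scores."""
    pooled = defaultdict(list)
    for records in studies.values():
        for month, z in records:
            pooled[month].append(z)
    return dict(pooled)


if __name__ == "__main__":
    single = group_by_month(STUDIES["study_a"])
    pooled = pool_by_month(STUDIES)
    for month in sorted(pooled):
        p_one, se_one = prevalence_with_se(single[month])
        p_all, se_all = prevalence_with_se(pooled[month])
        print(f"month {month}: one study  p={p_one:.2f} se={se_one:.2f}")
        print(f"month {month}: all pooled p={p_all:.2f} se={se_all:.2f}")
```

Because the standard error shrinks roughly with the square root of the number of children measured, the pooled estimate for each month is tighter than any single study's estimate, which is what makes month-by-month (seasonal) comparisons feasible. A real analysis would also need to account for differences between studies, such as location, age range, and measurement protocol, which this sketch deliberately ignores.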
We believe the India data set can be analyzed in similar ways to yield insights on a wide range of issues that affect vulnerable populations, including:
- Finding patterns shared by individuals who have positive health outcomes despite a high number of risk factors
- Converting correlations into causal hypotheses (e.g., establishing the impact of air pollution on fetal growth)
- Combining data focused on improving child survival with data focused on improving early neurodevelopment
- Determining critical periods for intervention during pregnancy and early childhood
- Pinpointing the relative contributions to health outcomes of diet quantity versus quality