Datafest, a recent event at The Shard in London, marked the culmination of a 12-week effort by students participating in the 2023 Data Science for Social Good (DSSG) Summer Fellowship programme at the University of Warwick, supported by the London Mathematical Laboratory.
These activities are part of the larger international Data Science for Social Good programme, which originated at the University of Chicago in 2013 and aims to equip data scientists with the skills to tackle pressing societal issues.
At Datafest, students reported on the four projects undertaken this summer. This blog post briefly examines progress in one of them, which focused on improving predictions of deforestation in the Brazilian Amazon.
Deforestation in the Brazilian Amazon is a global concern, with devastating consequences for the climate and for biodiversity. Between 2001 and 2020, the Amazon rainforest lost approximately 9% of its area, an expanse roughly the size of France. Recognising the gravity of this issue, the DSSG project partnered with UN-REDD, the United Nations Programme on Reducing Emissions from Deforestation and Forest Degradation in Developing Countries.
LML Fellow Mark Buchanan learned about some of the project activities from Jack Buckingham of the University of Warwick, a Technical Mentor in the project.
Mark Buchanan: “How did this project originate? And what were its aims?”
Jack Buckingham: “Representatives of the UN-REDD programme came to us with the problem of predicting deforestation in Brazil and Indonesia. They wanted predictions more accurate than those already available, and a better understanding of the key drivers behind deforestation.”
“The project’s goal was to develop a high-resolution visualisation tool powered by machine learning, enabling forecasts of deforestation and identification of its drivers at a localised scale. Our first step was to narrow the scope to the Brazilian Amazon alone, recognising that the drivers of deforestation vary from country to country. The objective was to generate insights of value to policymakers, such as identifying critical areas for protection because of their heightened risk of deforestation.”
“In the end, we produced maps predicting how much deforestation could be expected in each of the next three years, which were visualised in a Google Earth Engine app. The interface let users explore contributing factors and understand impacts, such as carbon loss, at a 6 km scale.”
Mark Buchanan: “What were the main challenges in taking the data and going further than what has been done in the past?”
Jack Buckingham: “The biggest challenge was to understand how fine-grained the predictions could be. The data comes in at 30 m resolution, and we first tried to make predictions at that level of granularity, then zoomed out to see at what scale the predictions really made sense. As expected, predicting for larger areas is a much easier problem, but it was still a significant challenge: we had very large datasets and only 16 terabytes of storage space, and it took a lot of computing power to do even simple things.”
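The scale trade-off Buckingham describes, predicting at 30 m versus a coarser grid, amounts to aggregating fine raster cells into larger blocks. A minimal sketch of such aggregation, in which each coarse cell records the fraction of deforested fine cells it contains (the function name, grid sizes, and the 200× factor that takes 30 m cells to a 6 km grid are illustrative, not taken from the project):

```python
import numpy as np

def aggregate_raster(raster: np.ndarray, factor: int) -> np.ndarray:
    """Aggregate a fine-resolution raster into coarser cells by block-averaging.

    Each output cell holds the mean of a factor x factor block of input cells,
    e.g. the fraction of fine cells flagged as deforested. Edge rows/columns
    that do not divide evenly are trimmed.
    """
    h, w = raster.shape
    h, w = h - h % factor, w - w % factor
    blocks = raster[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Example: a 400 x 400 grid of 30 m cells, aggregated by a factor of 200,
# becomes a 2 x 2 grid of 6 km cells.
fine = np.zeros((400, 400))
fine[:200, :200] = 1.0               # top-left 6 km cell fully deforested
coarse = aggregate_raster(fine, 200)
print(coarse.shape)                  # (2, 2)
print(coarse[0, 0], coarse[0, 1])    # 1.0 0.0
```

Coarser cells average over more fine cells, which is one reason prediction becomes easier at larger scales: the target is a smoother fraction rather than a noisy per-pixel label.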
“The data is from satellite images, but we didn’t use the satellite images directly. Other people have pulled out certain features, such as the current forest coverage, or maps of land use and land cover. We could tell when land was currently used for cattle farming, for example, or mining. These are useful features for making predictions.”
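Features like current land use can feed a standard supervised model that both predicts deforestation risk and hints at which drivers matter. The sketch below is purely illustrative: the feature names, the synthetic data, and the choice of a random forest are assumptions on my part, not the team's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical per-cell features derived from land-use/land-cover maps:
# fraction of the cell under forest, cattle pasture, and mining.
n = 1000
forest = rng.uniform(0, 1, n)
pasture = rng.uniform(0, 1 - forest)      # remaining land available for pasture
mining = rng.uniform(0, 0.1, n)

X = np.column_stack([forest, pasture, mining])

# Synthetic label: cells with more pasture and mining deforest more often.
p = 0.1 + 0.6 * pasture + 2.0 * mining
y = rng.uniform(0, 1, n) < p

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Feature importances give a rough (non-causal) ranking of drivers.
for name, imp in zip(["forest", "pasture", "mining"], clf.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

As Buckingham notes below, importances like these point at key drivers without constituting a full causal model of how any one factor produces deforestation.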
“I think we’ve come up with a well-rounded product in the end. I don’t think it will be possible to build a complete causal model of how specific factors lead to specific outcomes, but we have gained further insight into the key drivers contributing to deforestation.”
This project exemplifies the potential of data science to address some of the world’s most complex challenges.