18.09.2024

Monitoring species’ populations helps us understand changes in biodiversity. However, gaps in data collection can lead to misleading results. It’s important to carefully analyze and adjust for these gaps to get accurate insights into species trends. Diana Bowler and Rob Boyd discuss their new paper highlighting key steps to consider…

Indicators of species’ population sizes, or 'biodiversity indicators', are core parts of the evidence base about how our natural world is changing. Biodiversity indicators are typically derived from species monitoring data collected by national monitoring schemes, local projects and community scientists. As such, they highlight one of the many ways in which monitoring data are invaluable for research and policy.

At the same time, there are many gaps in monitoring data, which can affect the interpretation of biodiversity indicators. Gaps in monitoring data occur both in space – some geographic regions and habitats are still not well-sampled – and in time – monitoring effort has increased over time and often has peaks associated with the production of species atlases.

These complexities mean we must think carefully about our data and analysis when producing indicators to ensure they accurately reflect how species are changing. For example, if gaps disproportionately fall in areas where species are faring poorly, then using the available data to create a biodiversity indicator will paint an overly optimistic picture of change.
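
The optimistic-bias scenario described above can be illustrated with a small sketch. The site names and trend values below are invented purely for illustration (they are not from Bowler et al. 2024), and the assumption is that two of the declining sites were never surveyed.

```python
# Hypothetical per-site population trends (% change per year); values are invented.
site_trends = {"A": 2.0, "B": 1.0, "C": -3.0, "D": -4.0, "E": 1.5, "F": -2.5}

# Assumption: the gaps fall on sites C and D, where the species is faring poorly.
surveyed = {"A", "B", "E", "F"}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Trend we would report if every site had been surveyed.
true_mean = mean(site_trends.values())

# Trend we would report from the surveyed sites only.
observed_mean = mean(t for s, t in site_trends.items() if s in surveyed)

print(f"all sites: {true_mean:.2f}, surveyed sites only: {observed_mean:.2f}")
```

In this toy case the surveyed-only average is positive while the average over all sites is negative, which is the kind of overly optimistic picture the paragraph warns about.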

The issue of gaps and biases in monitoring data has become a central topic for the Biodiversity Monitoring and Analysis team at UKCEH, as highlighted by our recent publication (Bowler et al. 2024) in Biological Reviews. The paper draws on ‘missing data theory’, a well-established statistical framework for the general problem of missing data, which affects all scientific disciplines. By using this theory, we hoped to borrow lessons and solutions developed elsewhere and apply them to the specific problems found in biodiversity monitoring data.

Illustrating a scenario of multiple survey visits across sites and years with different types of data gaps in biodiversity data, from Bowler et al. 2024

Using this framework, we established some general principles for when data gaps lead to biased predictions of species trends and identified possible approaches to adjust for them within our statistical analysis. We show that data gaps can be problematic when the causes of the gaps are linked to the factors that affect where species are found; for instance, if data gaps are more common in habitats that a species avoids.

We also show that adjusting for data gaps is only possible when there is a good understanding of the causes underlying the gaps and data available to model them. With the right knowledge and data, we can use methods like weighting (giving more importance to certain data based on their sampling coverage) or imputation (filling in missing data with estimated values) to handle gaps and reduce bias in species trend predictions.
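
As a minimal sketch of the two ideas named above, the example below applies a simple habitat-based weighting and a simple habitat-mean imputation to a toy data set. Everything here is an assumption made for illustration: the habitats, the trend values, and the premise that habitat is the only cause of the gaps. It is not the specific method used in Bowler et al. 2024, and it assumes every habitat has at least one surveyed site.

```python
from collections import defaultdict

# Hypothetical sites as (habitat, trend); None marks a site that was never surveyed.
sites = [
    ("woodland", 1.0), ("woodland", 1.5), ("woodland", 0.5),
    ("farmland", -2.0), ("farmland", -1.0),
    ("farmland", None), ("farmland", None), ("farmland", None),
    ("farmland", None), ("farmland", None),
]


def habitat_summaries(sites):
    """Return total site counts and observed trends, both keyed by habitat."""
    total = defaultdict(int)
    observed = defaultdict(list)
    for habitat, trend in sites:
        total[habitat] += 1
        if trend is not None:
            observed[habitat].append(trend)
    return total, observed


def naive_mean(sites):
    """Average the surveyed sites and ignore the gaps."""
    trends = [t for _, t in sites if t is not None]
    return sum(trends) / len(trends)


def weighted_mean(sites):
    """Weight each surveyed site by (sites in habitat) / (surveyed sites in habitat)."""
    total, observed = habitat_summaries(sites)
    numerator = 0.0
    denominator = 0.0
    for habitat, trends in observed.items():
        weight = total[habitat] / len(trends)
        numerator += weight * sum(trends)
        denominator += weight * len(trends)
    return numerator / denominator


def imputed_mean(sites):
    """Fill each gap with the mean trend of surveyed sites in the same habitat."""
    _, observed = habitat_summaries(sites)
    habitat_mean = {h: sum(t) / len(t) for h, t in observed.items()}
    filled = [t if t is not None else habitat_mean[h] for h, t in sites]
    return sum(filled) / len(filled)


print(naive_mean(sites), weighted_mean(sites), imputed_mean(sites))
```

Because both adjustments here rely only on habitat, they give the same answer in this toy case, and both move the estimate away from the naive average. They are only as good as the assumption that habitat explains the gaps, which is why an understanding of the causes of the gaps matters.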

When we don’t have enough data, we suggest using sensitivity analysis (checking how different assumptions affect the result) to compare model predictions in different scenarios with missing data. Because of these complexities, there’s no one-size-fits-all solution for dealing with gaps in monitoring data. The best approach depends on the specific scientific question, sampling pattern, and species’ ecology.
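
One simple form of sensitivity analysis can be sketched as follows: assume the unsurveyed sites differ from the surveyed ones by some offset, and see how the overall estimate changes as that offset varies. The trend values, the number of missing sites and the offsets (`delta`) are all invented for illustration, and this offset-based approach is one option among many rather than the procedure recommended in the paper.

```python
# Hypothetical trends from surveyed sites (% change per year); values are invented.
observed_trends = [1.0, 1.5, 0.5, -2.0, -1.0]

# Assumption: five further sites exist but were never surveyed.
n_missing = 5


def overall_trend(observed, n_missing, delta):
    """Overall mean trend if unsurveyed sites average (observed mean + delta)."""
    observed_mean = sum(observed) / len(observed)
    assumed_missing_mean = observed_mean + delta
    total = sum(observed) + n_missing * assumed_missing_mean
    return total / (len(observed) + n_missing)


# Each delta is a scenario: negative values assume unsurveyed sites fare worse.
scenarios = {
    delta: overall_trend(observed_trends, n_missing, delta)
    for delta in (-2.0, -1.0, 0.0, 1.0)
}

for delta, estimate in scenarios.items():
    print(f"delta = {delta:+.1f} -> overall trend = {estimate:+.2f}")
```

If the conclusion (for example, whether the species is declining) changes across plausible scenarios, the result is sensitive to the missing data and should be reported with that caveat.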

We hope that our new work, together with past work in this area (e.g. Boyd et al. 2023), encourages further testing and exploration of approaches for dealing with data gaps in analyses of biodiversity change, and helps ensure that we make effective use of large and growing biodiversity data sets for both research and policy decisions.

Diana Bowler and Rob Boyd

Bowler, D E, Boyd, R J, Callaghan, C T, Robinson, R A, Isaac, N J, & Pocock, M J (2024). Treating gaps and biases in biodiversity data as a missing data problem. Biological Reviews. In press.

Boyd, R J, Powney, G D, & Pescott, O L (2023). We need to talk about nonprobability samples. Trends in Ecology & Evolution, 38(6), 521-531.