Meta-What? Statistical Analysis in the Era of Big Data



Symposia will be available on-demand on their scheduled date, then again at the conclusion of the conference.

Studies specific to individual animals, study areas, or points in time provide insight into wildlife ecology and management, but their inference space is often limited in time and to local populations and environmental conditions. In the era of big data, the ability to rapidly and efficiently consolidate large data sets, along with new policies that require publication of data, is increasing the opportunity to conduct meta-analyses of ecological data. A meta-analysis of thematically related wildlife studies can provide a broader scale of inference, increase sample sizes, and yield more robust analyses of wildlife data. Data used for a meta-analysis can originate from published or unpublished work and sometimes disparate sources. Wildlife scientists have amassed large amounts of data that may be suitable for use in a meta-analysis, but discovering available and relevant sources and deciding on the best approaches for moving forward with analysis can be difficult. This symposium will focus on meta-analysis approaches, data reporting, examples of recent meta-analyses of wildlife data, cautions, and recommendations.

What Meta-Analysis Can Do for Wildlife Science and Management
Ryan Nielson
As more wildlife professionals join the effort to understand our natural world and manage the consequences of human actions, the number of wildlife studies pertaining to any one species (e.g., mule deer; Odocoileus hemionus) has grown exponentially. These studies, when viewed alone, have at least one thing in common: unless replicated, their validity is unknown, and inference is limited or tenuous at best. The goal of meta-analysis in wildlife science is to increase replication beyond one specific study area, period, or sample, to see if relationships are consistent across a broad spectrum of environmental conditions. A meta-analysis leverages data from multiple studies, either conducted in unison or independently, for more robust predictions and to measure variability across environmental conditions or replicates (e.g., study areas or animals). In addition, we can use meta-analysis to identify where and when predictions can be applied with confidence. Thus, a meta-analysis can lead to a broader inference space beyond one individual study. Individual studies can be designed to complement each other with a meta-analysis in mind, or a meta-analysis on previously collected data may be more cost-effective. The number of opportunities to conduct a meta-analysis in wildlife science will continue to grow. We are just beginning to discuss ways to conduct meta-analysis, evaluate and interpret results, anticipate difficulties, and account for the inherent variability in data that come from disparate sources. In the era of big wildlife data, meta-analysis will provide more reliable information for wildlife managers and help guide future research.
Meta-Analyses of Observational Studies in Wildlife Conservation: Getting the Basics Right
John R. Fieberg; Dan Larkin; Robert Buck
Meta-analyses are appealing because they provide a formal method for synthesizing results from multiple studies. There are many ways that meta-analysis can go astray, however, potentially leading to faulty conclusions that nonetheless become influential because they appear to distill an entire body of research. Using a few high-profile examples from the published literature, we highlight several challenges that arise when applying meta-analyses to observational studies in wildlife conservation and management. Specifically, we will discuss issues related to choosing an appropriate response metric, selection biases and confounding in observational studies, difficulties with summarizing multiple endpoints where desired effects may be of opposite sign, and appropriate case weighting.
Modeling Landscape Use by Animal Species with Disparate Data: Challenges and Opportunities
Michael J. Wisdom; Ryan M. Nielson; Mary M. Rowland
Models of landscape use based on estimates of distribution, movement, occupancy, resource selection, or habitat use are a common and valued tool in contemporary wildlife research and management. These models provide a means to evaluate and predict habitat choices made by animal species over large areas, and by extension, characterize spatial distributions and spatiotemporal responses of populations to habitat change. The burgeoning volume of telemetry data available for many wildlife species, collected across large areas of a species' range, now facilitates modeling of landscape use with broad spatial inference such as to an ecoregion or biome. It is not clear, however, how well regional models can be developed and validated for broad inference with disparate data sources, often available from multiple study areas and time periods but not collected under a consistent sampling framework. We highlight case examples of successful integration of disparate data to model landscape use across millions of ha of a species' range. We discuss the strengths and weaknesses of such approaches and offer general guidance. Opportunistic use of disparate data sets to model landscape use can be considered a form of meta-analysis, with the same challenges and opportunities for knowledge gain and expansion of inference space. Taking advantage of these opportunities will require a careful understanding of data limitations and innovative methods of integration.
Recursive Bayesian Updating for Ecologists
Mevin B. Hooten
Bayesian models are naturally equipped to provide recursive inference because they can formally reconcile new data and existing scientific information. However, popular use of Bayesian methods often avoids priors that are based on exact posterior distributions resulting from previous data. Recursive Bayesian methods include two main approaches that we refer to as Prior- and Proposal-Recursive Bayes. Prior-Recursive Bayes uses Bayesian updating, fitting models to partitions of data sequentially, and provides a convenient way to accommodate new data as they become available. Prior-Recursive Bayes uses the posterior from the previous stage as the prior in the new stage based on the latest data to update inference and forecasts, but is difficult to implement exactly in practice. By contrast, Proposal-Recursive Bayes is intended for use with hierarchical Bayesian models and relies on a set of transient priors in first stage independent analyses of the data partitions. The second stage of Proposal-Recursive Bayes uses the posterior distributions from the first stage as proposals in a simplified MCMC algorithm that results in computational improvements. We combine Prior- and Proposal-Recursive concepts in a framework that can be used to fit any Bayesian model exactly and efficiently. Our method can be applied to fit a wide range of ecological models and we demonstrate it by analyzing both telemetry and population survey data for various species.
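The Prior-Recursive idea described above can be illustrated with a minimal sketch in the one setting where it is easy to carry out exactly: a normal mean with known observation variance, where the posterior has a closed form. This is not the authors' method for general models. The data values, the observation variance, the vague starting prior, and the names `NormalBelief` and `update` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NormalBelief:
    """Normal distribution over an unknown mean, stored as mean and variance."""
    mean: float
    var: float


def update(belief, data, obs_var):
    """Conjugate update of a normal prior with normal data of known variance."""
    n = len(data)
    if n == 0:
        return belief
    prior_prec = 1.0 / belief.var
    data_prec = n / obs_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * belief.mean + sum(data) / obs_var)
    return NormalBelief(post_mean, post_var)


# Hypothetical data arriving in three partitions (e.g., three field seasons).
partitions = [[2.1, 1.8, 2.5], [2.9, 2.2], [1.7, 2.4, 2.0, 2.6]]
obs_var = 0.5                      # assumed known observation variance
prior = NormalBelief(0.0, 100.0)   # vague starting prior (assumption)

# Prior-recursive: each stage's posterior becomes the next stage's prior.
recursive = prior
for part in partitions:
    recursive = update(recursive, part, obs_var)

# Reference: fit once to all data pooled together.
pooled = [y for part in partitions for y in part]
batch = update(prior, pooled, obs_var)

print(recursive)
print(batch)
```

In this conjugate case the sequential and pooled fits agree up to floating-point error. For most ecological models the stage-one posterior is available only as MCMC samples, which is why exact Prior-Recursive updating is hard in practice.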
Within Study Meta-Analysis: Practical Multistage Estimation for Multi-Component Models
Devin Johnson; Brian Brost; Mevin Hooten; Michelle Lander
In this presentation we examine fitting complex, multi-component ecological models using a variety of different software platforms. When researchers think of using meta-analysis methods, it is usually to amass information from various studies collected over time and space to form a quantitative picture of the accumulated evidence on a topic. In this talk we demonstrate that these same techniques can be used within the same study for estimating parameters in complex ecological models. Traditionally, complex models, such as hierarchical models used to estimate population-level responses or integrated population models that combine different sources of data with common parameters, must be analyzed with bespoke code in an MCMC setting. This can be time consuming or require large amounts of computer processing and memory when the individual-level data are themselves complex or “big.” In addition, software is often already available for analyzing the individual-level data, so recoding the same procedures as submodels in a full hierarchical model is wasted effort. Inspired by recent research on 2-stage inference, we show that approximate Bayesian inference for these complex models can be achieved through linear mixed model analysis of the individual submodel results. To demonstrate these techniques we analyze data from a study on wolf effects on moose browsing and some integrated population data on northern lapwings. In the moose browse example, individual-level models were fitted with standard GLM software and combined into a hierarchical model in the second stage. In the lapwing example, ring-recovery data were fitted with program MARK and population data were fitted with a custom state-space model. Both examples give approximately the same parameter estimates as a simultaneously fit model.
Replication: The Hidden Value of Meta-Analysis
Douglas H. Johnson
Meta-analysis is the analysis of analyses. The term itself dates only to 1976, but the concept of pooling information from multiple sources has a much longer history. One of its objectives is to determine how consistent an estimated effect size is among studies of that effect. The fundamental component of meta-analysis is replication, specifically what I have termed meta-replication: the replication of entire studies. Within the last 10 years, researchers have reported glaring issues in replication in several scientific fields. The poor record of reproducibility of studies is a major concern in science and a factor in current skepticism about science. Guidelines for appropriate replication methods, which depend on the objectives, will be discussed.
Predator Control Is Unlikely to Increase Ungulate Populations: A Formal Meta-Analysis
T.J. Clark; Mark Hebblewhite
Large carnivores are expanding and increasing conflicts with humans via predation on livestock and harvested wildlife. Recent meta-analyses have shown that predator control (hereafter, predator removal) has mixed success in reducing livestock predation. Yet it is unknown how effective predator removal is at decreasing predation on harvested ungulates due to a lack of quantitative synthesis. We quantified the demographic response of ungulates to experimental predator removal and identified the ecological and experimental factors which improve the likelihood of removal increasing ungulates. We conducted a literature review finding 62 experiments from 47 publications. We conducted a meta-analysis to determine the overall effect size and factors which improved ungulate demography during predator removal. Lastly, we tested for evidence of publication bias and lack of experimental rigor for these experiments. We found that predator removal improved ungulate responses by 13% (95% CI = 4.1 – 23%), yet prediction intervals overlapped with 0 (95% PI = -34 – 93%), indicating that future experiments could have negligible effects. Predator removal was more successful in improving the demography of young (e.g., recruitment ES = 44%, 95% CI = 13 – 83%) but equivocal in improving adult survival (ES = 5.4%, 95% CI = -18% – 36%) and ungulate abundance (ES = 13%, 95% CI = -17 – 31%). The low effectiveness of predator removal might be linked to their slow life history and the compensatory mortality of carnivores on ungulates. We identified the experimental design factors which increased ungulate responses to predator removal, including improvements in experimental design. Lastly, we found evidence of publication bias, where experiments with negative effects were underreported. We suggest an open standards framework akin to the “Open Standards for the Practice of Conservation” framework developed for evaluating predator removal practices to increase harvested wildlife populations.
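The contrast above between a confidence interval that excludes zero and a prediction interval that overlaps it can be illustrated with a minimal random-effects sketch. It uses the DerSimonian-Laird estimator as a stand-in, since the abstract does not say which estimator the authors used. The effect sizes and sampling variances are invented (they could be read as log response ratios), and the normal quantile is an assumption made to avoid dependencies; a t quantile is often used for prediction intervals instead.

```python
import math
from statistics import NormalDist


def random_effects_summary(effects, variances, level=0.95):
    """DerSimonian-Laird pooling with confidence and prediction intervals."""
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                     # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    z = NormalDist().inv_cdf(0.5 + level / 2)              # normal approximation
    ci = (mu - z * se, mu + z * se)                        # uncertainty in the mean effect
    half_pi = z * math.sqrt(tau2 + se ** 2)                # adds between-study spread
    pi = (mu - half_pi, mu + half_pi)                      # range for a new study's effect
    return {"mu": mu, "tau2": tau2, "ci": ci, "pi": pi}


# Hypothetical effect sizes and sampling variances from six experiments.
effects = [0.35, 0.02, 0.50, -0.20, 0.18, 0.05]
variances = [0.01, 0.02, 0.03, 0.015, 0.01, 0.02]

summary = random_effects_summary(effects, variances)
print(summary)
```

The confidence interval describes uncertainty about the average effect, while the prediction interval also includes the between-study variance, so it is never narrower and can overlap zero even when the confidence interval does not.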
Using Recursive Bayesian Computation for Population-Level Inference on Snake Movement
Abigail Feuka; Melia Nafus; Amy Yackel Adams; Mevin Hooten; Larissa Bailey
Bayesian hierarchical models can be an intuitive way to achieve population-level inference in ecological studies, as they consist of separate data, process, and parameter models that account for uncertainty at multiple scales. However, these models can become large and unwieldy to fit when the data model is complex and population-level inference is desired for multiple parameters within it. Such is the case with animal movement models that are designed to analyze telemetry data with many relocations per individual and multiple individuals in a study. Two-stage model fitting algorithms can be useful for handling these telemetry data sets because they can be used to analyze each individual trajectory in parallel and then combine the results from the individual-based models to obtain population-level inference. A remaining challenge arises due to the need to supervise and tune Markov chain Monte Carlo (MCMC) algorithms to fit the models to data. To circumvent this issue, we developed an approach to fit a hierarchical state-space movement model in two stages, using a completely unsupervised MCMC algorithm in the first stage on untransformed parameters. We fit this model to daily-resolution Brown Treesnake (Boiga irregularis) telemetry data from an experiment designed to compare the movements of snakes translocated from forest and urban areas to those of resident snakes in forest and urban areas. Using parallel computing, we fit individual-level models to over 100 snake trajectories and combined the results to achieve exact population-level inference on the four treatment groups of the translocation experiment while accounting for individual variation in movement parameters. Our approach is among the first that formally scales from individual- to population-level inference using mechanistic hierarchical movement models for herpetofauna.
A Meta-Analysis of Bald and Golden Eagle Productivity Accounting for Spatial and Temporal Studies
Mark C. Otto
When the direct survey estimates needed cannot be obtained, robust estimates can sometimes be obtained using meta-analysis. As part of a larger effort to update and improve demographic parameters used in eagle population modeling efforts, Brennan and Millsap (2016) compiled a dataset of contemporary productivity information for bald and golden eagles, Haliaeetus leucocephalus and Aquila chrysaetos respectively, across the U.S. from 1995-2014. As in many ecological studies, individual surveys are done over multiple areas and/or multiple years. To obtain a representative estimate over areas and years, the variation over areas and years must be accounted for within the individual studies. Centered random effects accounted for separate spatial and temporal effects within studies so that overall estimates are separate from the individual study area and year variation. A random-effects meta-analysis model estimated the predictive distributions for bald eagle and golden eagle productivity. Models were compared using differences in AIC. Bald eagle productivity differed by region with lower productivity in the Southwest (mean = 0.77, SE = 0.249) than in the rest of the continental U.S. (mean = 1.15, SE = 0.252), whereas golden eagle productivity did not differ by region (mean = 0.55, SE = 0.087). Apart from the fixed stratum differences for bald eagles, the best-supported models included standard errors for the random effects for study, area (bald eagles only), year given study, and overdispersion; the extent to which the random effect credible intervals overlapped zero varied by species.
Methods to Evaluate the Population Impacts of Disease: An Example with Bald Eagles and Lead Poisoning
Brenda Hanley; André A. Dhondt; Elizabeth M. Bunting; Mark A. Pokras; Kevin P. Hynes; María J. Forzán; Ernesto Dominguez-Villegas; Krysten L. Schuler
Poisoning from lead (Pb) ammunition fragments causes death to birds of prey worldwide, but the impact of these mortalities on population dynamics at the ecological scale is unclear. While ingestion of Pb from spent ammunition continues to kill individual bald eagles (Haliaeetus leucocephalus) throughout the United States, eagles have been deemed a wildlife recovery success story after populations rebounded from near extirpation. We pooled veterinary and demographic data from seven states to determine whether eagle deaths from Pb toxicosis altered the dynamics of the population’s recovery in the Northeast United States (NE) over the past three decades. We adapted a combinatorial optimization algorithm (COA) and used it in conjunction with a mathematically symbolic life history to compare population dynamics of bald eagles under current (Pb) and hypothetical (Pb-free or Pb-reduced) scenarios. We found that Pb-associated mortalities depressed the long-term growth rate of eagles in the NE and differentially influenced females. Given slight modifications, the methods may be used to ascertain the population impact of any disease or contaminant on any wildlife species given a detailed understanding of the species’ life history, sufficient annual time series count data, and observational necropsy data from several sources.
Meta-Analyses of Range-Wide Population Trends in Northern Spotted Owls
Alan B. Franklin; Gary C. White
The northern spotted owl (Strix occidentalis caurina; NSO) was federally listed as a threatened species in 1990. To better understand range-wide population trends, seven meta-analyses of NSO demography data have been conducted every 5 years since 1993 to facilitate merging research and management. Each meta-analysis combined data on survival, reproduction and rates of population change, and later territory occupancy, from up to 16 individual study areas. The approach used in these meta-analyses contrasted with traditional meta-analytical methods in that the raw data from each study area were used rather than just the parameter estimates from each study area. These meta-analyses were successful for multiple reasons. First, rigorous protocols included data checking and certification followed by a collaboratively developed analytical framework prior to conducting analyses and viewing results. Second, a team of biometricians was included to incorporate current methods, parcel out analytical tasks, and provide input into developing analytical procedures. Third, combining data from across the entire range of the NSO revealed stronger relationships with covariates such as latitude, climate, habitat, and barred owl presence, and provided range-wide inferences to assist decision making by managers who needed information on NSO populations. Results from each meta-analysis have been published in peer-reviewed monographs and journals. Additional byproducts of these meta-analyses have been collaborations that have resulted in numerous other publications and novel statistical estimators and software, such as program MARK.
How We Combined Three Quarters of a Century of Sage-Grouse Studies into a Range-Wide Demographic Meta-Analysis
Rebecca Taylor; Brett Walker; David Naugle
Sage-grouse demography has been studied since the 1930s in a multitude of locales across the species’ range, but most studies have been only a few years long. We conducted a meta-analysis to provide a comprehensive view of sage-grouse demography and broad-based recommendations to enhance population growth. We applied search criteria to obtain 108 demographic rate studies, and we used information from the 50 studies that met our inclusion criteria. When needed, we applied post hoc corrections to make apparent rate estimates comparable to true rate estimates. We then described the distribution of each demographic rate over space and time with a mean and process variance estimated from a mixed-effects model. These distributions allowed us to conduct a life-stage simulation analysis that identified which demographic rates had the largest per-unit effect on population growth and which explained the most variation in population growth. Both are critical because demographic rates that have the highest per-unit impact on population growth are often the rates that vary least in nature and may therefore be less susceptible to management actions. To maximize population growth, management should simultaneously target female survival, chick survival, and nest success. High spatio-temporal variation in demographic rates indicates that while findings from short-term studies are important, they should be viewed with caution because a rate may be low for a few years as a result of natural variation. Our comprehensive meta-analysis has facilitated conservation of mesic habitats within sagebrush landscapes where females raise chicks because we demonstrated the importance of chick survival to population growth. Finally, because our meta-analysis can include mechanistic linkages between a management action (e.g., pinyon-juniper removal), a demographic rate (e.g., nest success), and population growth, it has inspired new research on sage-grouse habitat management.
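The two quantities a life-stage simulation analysis reports, per-unit effect and variation explained, can be illustrated with a toy sketch. Everything here is hypothetical: the three-rate growth model, the means and standard deviations, the beta distributions, and all function names are assumptions for illustration and are not the authors' model or estimates.

```python
import random


def growth_rate(female_survival, nest_success, chick_survival, daughters_per_nest=2.0):
    """Toy annual growth rate: surviving females plus recruited daughters."""
    return female_survival + nest_success * daughters_per_nest * chick_survival


def beta_draw(rng, mean, sd):
    """Draw from a beta distribution specified by its mean and standard deviation."""
    common = mean * (1 - mean) / sd ** 2 - 1
    return rng.betavariate(mean * common, (1 - mean) * common)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def life_stage_simulation(rates, n_draws=5000, seed=1):
    """Return per-unit effects at the mean rates and variation explained by each rate."""
    rng = random.Random(seed)
    draws = {name: [beta_draw(rng, m, s) for _ in range(n_draws)]
             for name, (m, s) in rates.items()}
    lam = [growth_rate(**{name: draws[name][i] for name in rates})
           for i in range(n_draws)]
    means = {name: m for name, (m, s) in rates.items()}
    base = growth_rate(**means)
    h = 1e-6
    per_unit = {}
    for name in rates:
        bumped = dict(means)
        bumped[name] += h                      # finite-difference sensitivity
        per_unit[name] = (growth_rate(**bumped) - base) / h
    explained = {name: r_squared(draws[name], lam) for name in rates}
    return per_unit, explained


# Hypothetical (mean, process SD) for each demographic rate.
RATES = {
    "female_survival": (0.60, 0.05),
    "nest_success": (0.45, 0.12),
    "chick_survival": (0.40, 0.15),
}

per_unit, explained = life_stage_simulation(RATES)
print(per_unit)
print(explained)
```

The assumed inputs give female survival the largest per-unit effect but the smallest process variance, so the two rankings can disagree, which is the distinction the abstract draws.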
Use My Data! Publishing Tips to Facilitate Future Meta-Analyses
Althea ArchMiller
The ideal meta-analysis is derived from a body of literature with consistent and thoroughly reported statistics. However, in reality, inconsistencies exist from journal to journal and study to study, and reported statistics do not always include key features, such as associated uncertainties or sample sizes. Drawing from personal experience overcoming such obstacles, I will review best publishing practices to help future meta-analyses run smoothly and efficiently. We will discuss the importance of reporting comprehensive statistical data, ways to create graphs that facilitate data extraction, and other aspects related to research reproducibility and transparency. I hope that better publishing practices will increase the likelihood of all of our research becoming meaningful data in future meta-analyses.
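One reason reporting uncertainties and sample sizes matters is that a meta-analyst can often reconstruct a missing quantity from the ones that are reported. The sketch below assumes a symmetric, normal-based confidence interval and a standard error of a mean. The function names and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Back out a standard error from a symmetric, normal-based confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def sd_from_se(se, n):
    """Recover a sample standard deviation from a standard error of the mean and n."""
    return se * math.sqrt(n)


# Hypothetical reported result: 95% CI of 0.50 to 0.74 from n = 40 animals.
se = se_from_ci(0.50, 0.74)
sd = sd_from_se(se, 40)
print(round(se, 4), round(sd, 4))
```

If a paper reports neither an interval nor a sample size, neither conversion is possible and the study may have to be dropped from the synthesis.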
Data Synthesis Techniques, Practical Applications in Wildlife: Notes from a Data Slayer
Robin Russell
Multiple statistical and non-statistical issues arise when combining data across sites, projects, and timescales. I will summarize the main points of the symposium, focus on the logistics of data synthesis including pitfalls and benefits, offer suggestions for data management and for working with collaborators, and provide an overview of data synthesis for the practicing quantitative ecologist. In addition, I will present several examples of data synthesis projects, including survival analyses of robust design data across multiple species of frogs, a similar analysis of prairie dog survival in response to a field trial of an oral sylvatic plague vaccine, an analysis of time-to-event data from lab trials of plague-challenged prairie dogs, and multi-session spatial capture-recapture data. These data sets included multiple collaborators, sites, and species, as well as differing survey designs. I will discuss the trade-offs in biological inference between large-scale analyses of data and smaller-scale, in-depth analyses of single-site data, as well as tips for combining data across multiple studies.

Organizers: Ryan Nielson, Eagle Environmental, Inc., Santa Fe, NM; Mary Rowland, U.S. Forest Service, La Grande, OR; Robin Russell, USGS, Madison, WI
Supported by: Biometrics Working Group

Location: Virtual
Date: September 30, 2020