User Guide

Automated Probabilistic Co-Occurrence Assessment Tool

Version 2.0 - 2023


Table of Contents

  1. Release Notes
  2. Introduction
  3. Software and Data Download
  4. System Requirements
  5. Running the Software
    1. Project Management
    2. Pesticide Use Footprints
    3. Species Distribution Modeling
    4. Co-Occurrence Assessment
  6. Example Assessment
    1. Pesticide Use Footprints Example
    2. Species Distribution Modeling Example
    3. Co-Occurrence Assessment Example
  7. Citation
  8. References

Foreword

 In response to the need for efficient production of advanced geospatial analyses of co-occurrence between pesticide use and species of interest as required by the Endangered Species Act, Stone Environmental, with the support of Syngenta Crop Protection, developed the Automated Probabilistic Co-Occurrence Assessment Tool (APCOAT) in early 2022. APCOAT is designed to produce probabilistic spatial models of both pesticide use and species distributions, and combine the models for co-occurrence assessments. Each of the models may also be run independently. The pesticide use models are represented by probabilistic crop footprints[1, 2] and statistical measures of the Percent Crop Treated (PCT) derived from freely available pesticide usage data[3], or from pesticide usage data provided by the user. The species distribution models (SDMs) are produced using maximum entropy methods[4] analyzing the statistical fit between species presence location records and geographic predictor rasters for environmental variables. Probabilistic co-occurrence between pesticide usage and species distributions is calculated by multiplying the two model output rasters. For planning and conservation purposes, the co-occurrence statistics may be summarized by state, crop reporting district, county, or watershed.



1. Release Notes

Overview
Version 1.0 core functionality - June 2022:

Version 2.0 additional features - April 2023:

Bug Fixes
 If you encounter any bugs in this release, please use the contact form at https://www.stone-env.com/APCOAT to submit a detailed report, including a description of the inputs used, error messages received, and contact information for follow-up communications.


2. Introduction

APCOAT workflow diagram

 More than 1,600 animal, plant, and other species are listed as threatened or endangered under the Endangered Species Act (ESA) by the US Fish and Wildlife Service (USFWS), whose mission is to prevent the extinction of these sensitive species and take actions to allow population recovery and eventual delisting. Under Section 7 of the Act, federal agencies are required to consult with USFWS and the National Marine Fisheries Service (NMFS) to evaluate whether their actions may affect listed species protected by the ESA[12]. For example, the US Environmental Protection Agency (EPA) regulates the sale of more than 1,000 pesticide chemicals, each of which is required by law to go through this consultation process, in which the USFWS and/or NMFS must issue a Biological Opinion regarding potential risks to listed species. Despite this legal requirement, few pesticides have undergone the consultation process, perhaps due to interagency disagreements about interpretation of the law or poorly defined methods. A 2013 National Research Council[13] study described a method for conducting these consultations, known as Biological Evaluations (BE), in which a preliminary determination for each listed species is made, in part, based on the degree of spatial overlap between pesticide usage areas and the species' range or Critical Habitat maps. Following publication of this study, USFWS conducted several preliminary consultations on pesticides regulated by EPA[14, 15, 16], later clarifying that >1% spatial concurrence between pesticide usage areas and listed species habitats was a threshold for a 'may affect' determination[14]. In the event of such a 'may affect' decision, BEs use a weight-of-evidence approach to evaluate exposure and toxicity data in consultation with USFWS.

 The current BE model has not been effective for regulatory agencies or industry and may not be appropriately assessing potential risks to listed species. The method is deterministic (i.e., presence/absence), identifying simple spatial overlap between listed species ranges and crop footprints. The species range datasets are often poorly modeled or of low spatial resolution, and the crop footprints do not reflect variable patterns of crop production or pesticide usage. Such methods may either under- or over-predict pesticide exposure risk, benefiting neither listed species nor the parties involved. In recent work, a team of scientists at Syngenta Crop Protection and Stone Environmental has addressed some of these challenges by developing probabilistic models of pesticide use patterns and species distributions[1, 2, 18]. To address the high number of species protected by the ESA and pesticide products regulated by EPA, in ongoing research in 2019-2020 we apply these methods to suites of listed species co-occurring in delimited geographic areas (i.e., watersheds). USFWS recently published standard operating procedures for characterizing listed species distributions that are expected to improve the rigor of co-occurrence analysis, and follow similar methods to those used in our own work.

 The scope and aim of this software is to provide users with the ability to rapidly implement the methods we have developed to assess probabilistic co-occurrence between pesticide use and species of interest for the purposes of regulatory review. With APCOAT, users are able to use pesticide application rates and species locations to generate automated reports detailing pesticide usage, species distribution modeling, and co-occurrence between the two (Figure 1). This guide walks users through each APCOAT function in a step-by-step manner and includes descriptions of the methods and processing involved in each model. These methods have been peer-reviewed and published in several research manuscripts, and more detailed descriptions of the models are cited in the References section.


3. Software and Data Download

The APCOAT software package, user guide, and associated probabilistic crop footprints can be downloaded in a single package at http://stone-env.com/APCOAT.


4. System Requirements

4.1 Installation


There are three software packages that must be installed before APCOAT can be deployed:
  1. Install 64-bit Java for Windows version 8.x - https://www.java.com/en/download/.
  2. Install 32-bit version of Microsoft Access database driver http://www.microsoft.com/en-us/download/details.aspx?id=13255.
  3. Install APCOAT.

4.2 Memory Requirements




Component Space Required
64-bit Java for Windows 81 MB
32-bit Microsoft Access driver 25 MB
APCOAT (including probabilistic crop footprints and SDM predictor variables) 16.1 GB
TOTAL 16.2 GB

4.3 Processing

APCOAT will run on most 64-bit Windows 10 machines. However, generating use footprints, species distribution models, and co-occurrence summaries may require several hours of processing time. We recommend beginning co-occurrence assessments using a subset of species of interest to determine the general rate of processing time before completing large batches of species assessments.

Example processing times for an analysis of one pesticide applied to two crops, two SDMs, and the four resulting co-occurrence reports:

Computer Pesticide Usage Modeling Species Distribution Modeling Terrestrial Co-Occurrence Modeling
Laptop
Processor: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
RAM: 8.00 GB
70 minutes 80 minutes 36 minutes
Work Station 1
Processor: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
RAM: 16.00 GB
41 minutes 37 minutes 27 minutes
Work Station 2
Processor: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
RAM: 32.00 GB
45 minutes 42 minutes 30 minutes


5. Running the Software

  Running APCOAT to generate co-occurrence reports consists of four separate processes:
    5.1 Project Management to create, save, and load the project database and associated folders
    5.2 Pesticide Use Footprints to generate probabilistic maps of the likelihood of crop cultivation at 30 m² resolution, and that the given crop will be treated with a pesticide of interest
    5.3 Species Distribution Modeling to generate probabilistic maps of habitat quality for a given species
    5.4 Co-Occurrence Assessment to multiply the pesticide use footprints and species distribution models and produce a co-occurrence raster and report summarizing modeling methods, inputs, and outputs


5.1 Project Management

Methods
  APCOAT projects consists of a project folder paired with a project database. Renaming, editing, or moving any files created by APCOAT in this folder will corrupt database connections. To further process or analyze outputs created by APCOAT, create a copy of the data in a new folder. Associated project sub-folders paired with project databases are used to store the raster and statistical outputs for each process.

To create a project database and associated output sub-folders:

  1. - Click the "File" menu on the upper left corner of the window.
  2. - Select "New".
  3. - Select an empty destination folder.
  4. - Provide a project file name. APCOAT will generate the following files and be named as shown below:
    • A project database used to store user inputs
      • [Project Name].mdb
    • A folder used to store results of co-occurrence analyses
      • \CoOccurrence\
    • A folder used to store results of species distribution modeling
      • \SpeciesDistModels\
    • A folder used to store results of pesticide usage modeling
      • \UseFootprints\
  5. - Project progress may be saved at any time that APCOAT is not actively processing data. The file menu is also used to load previously saved projects.

5.2 Pesticide Use Footprints

Methods
 APCOAT pesticide use footprints are generated by multiplying probabilistic crop footprints by estimates of the Percent Crop Treated (PCT)(Figure 1). The probabilistic crop footprints have already been produced and are included in the software package. The method for producing the probabilistic crop footprints incorporates best available information at the time of analysis from the Cropland Data Layer (CDL)[20] for 6 years (2016 - 2021), the 2016 National Land Cover Database (NLCD)[21], and 6 years of the NASS Agricultural Survey (2016 - 2021)[22]. The method also accounts for misclassification of crop classes in the CDL by incorporating accuracy assessment information[23] by state, year, and crop. The NLCD provides additional information to improve the CDL crop probability through an adjustment based on the NLCD accuracy assessment data using the principles of Bayes' Theorem. Finally, annual crop probabilities are scaled at the state level by comparing against NASS surveys of reported planted acres by crop, and the average annual probability is calculated. Due to large discrepancies between the alfalfa acreage reported by CDL and NASS Agricultural Survey data, the Agricultural Survey data adjustment has been omitted for the alfalfa probabilistic crop footprint.

 The PCT rasters are created by first calculating a time series of the maximum potential annual usage in a region. Maximum annual usage is calculated by multiplying the regional crop acreage measured from CDL for 6 years by the specified application rates. Regional pesticide usage data is then divided by the maximum potential usage in each year to generate a time series of annual PCT calculations for each region. Finally, a user-specified statistic is calculated from these regional time series and converted to raster format. The PCT rasters are then multiplied by probabilistic crop footprint rasters.

To create probabilistic pesticide use footprints:
  1. - Click the "Pesticide Use" tab.
  2. - Select one of the following the pesticide use data sources. Note that your selection will modify the appearance of the subsequent input fields and may not match the example screenshot:
    1. - USGS ePest: Usage estimates compiled by USGS for years 2012 - 2019[3]
    2. - 100% Crop Treated: No usage data is used and all usage sites are assumed to be treated
    3. - Custom (CSV file): Usage data provided by the user. See below for Preparing Custom Usage Data
  3. - Select the desired pesticide use statistic. This will be calculated for each region of interest over the specified annual time series.
  4. - Select the desired spatial resolution, which is dependent on the pesticide use data source.
    1. - USGS ePest estimates are published by state, the only available spatial resolution will be "State" if it is selected as the pesticide use data source.
    2. - Custom (CSV file) data may be provided at the county, Crop Reporting District, or state scale, and the resolution will be automatically detected from the file header. If you wish to summarize the data you have provided at a larger spatial scale, select that scale from the Upscale options (not visible in the example screenshot).
  5. - Select or name the pesticide of interest, depending on pesticide use data source:
    1. - USGS ePest select one of the 315 pesticides available from the ePest database.
    2. - Custom (CSV file) enter the name of the selected pesticide
  6. - Specify the units and annual application rates for each crop of interest where the pesticide may be applied.
  7. - Click "Generate Pesticide Use Footprint". APCOAT will generate the following files for each crop in the "\UseFootprints\" folder and be named as shown below:
    • A CSV file containing the time series of Percent Crop Treated (PCT) calculations
      • [Pesticide]_PCT_[CropApplication Rate]_[Usage Data Source]_[Usage Data Resolution]_[UsageStatistic].csv
    • A CSV file containing the statistic calculated from the PCT time series
      • [Pesticide]_PCTstatsRep_[Crop]_[Application Rate]_[Usage Data Source]_[Usage Data Resolution]_[UsageStatistic].csv
    • A raster showing the PCT values applied for each region
      • [Pesticide]_PCT_[Crop]_[ApplicationRate]_[Usage Data Source]_[Usage Data Resolution]_[UsageStatistic].tif
    • Probabilistic crop usage footprints.
      • [Pesticide]_ProbUseFP_[Crop]_[ApplicationRate]_[Usage Data Source]_[Regional Resolution]_[UsageStatistic].tif
      • Downsampled versions of the crop usage footprints for display and review are also created during co-occurrence processing:
        • [Pesticide]_ProbUseFP_[Crop]_[ApplicationRate]_[Usage Data Source]_[Usage Data Resolution]_[UsageStatistic]_[2-128]x.tif

Preparing Custom Usage Data

If you wish to load pesticide use data it must be formatted as a comma-separated values (CSV) file conforming to the following requirements. Templates for county, CRD, and state level pesticide use data files are available and installed by default in "C:\Program Files (x86)\APCOAT\UserDataTemplates".

Field Title Data Type Data Requirements
"State_FIPS" for state resolution
or
"CRD_STASD" for CRD resolution
or
"FIPS" for county resolution
Text
  • County and state FIPS codes must match those published by the US Census Bureau[21].
  • Crop Reporting District, or Agricultural Statistics District codes must match those published by the National Agricultural Statistics Service[22].
  • A lookup table (tbl_FIPS_CRD) joining each of these code sets is available in each project database.
Year Integer May be 2012 - 2021. Years outside of this range will be ignored.
Crop_Code Integer Crop codes must conform to the following USGS crop groups:
 1 - corn
 2 - soybeans
 3 - wheat
 4 - cotton
 6 - rice
 8 - alfalfa
Pesticide_Use_kg Float No blanks, must include 0 for years and regions with no use.

Screenshot of example pesticide use data

5.3 Species Distribution Models

Methods
 APCOAT uses Maxent software[4] to determine which environmental variables may limit species distributions through comparison of species location records to spatially coincident environmental data, then locates areas with similar environmental values as the basis for generating a Species Distribution Model (SDM). Specifically, Maxent uses presence-only species records to "minimize the relative entropy between two probability densities (one estimated from the presence data and one, from the landscape) defined in covariate space"[26] and generate probabilistic models. Here the user provides the species location records as a CSV file and APCOAT provides the environmental variables that comprise the covariate space as either raster images for terrestrial species or vector files for aquatic species.

 This SDM method is employed iteratively in APCOAT, using predictor variable contribution analysis and removal, to select the best-fit model that uses the fewest predictor variables and minimizes correlation among these predictors[18]. Eighty percent of the species location records are used in the initial model training iterations, and the remaining twenty percent of location records are used for evaluation of the final model. Since the probabilistic models vary slightly with each solution, each model iteration is tested using 5 separate model runs with the same selected input variables.

To create probabilistic species distribution models:

  1. - Click the "Species Distribution" tab.
  2. - Specify the habitat type, "Terrestrial", "Aquatic Flowing", or "Aquatic Static"
    • For terrestrial organisms, values of environmental predictor values are joined to each species location record from contiguous raster surfaces covering the continental United States.
    • For aquatic organisms, species location records are joined to the nearest flowing waterbody (streams, rivers) or static waterbody (ponds and lakes, limited to waterbodies 0.01 - 10 ha in size) within 100 m. If a species record is not within 100 m of a waterbody the record is excluded from modeling.

    Species Selection
  3. - Click "Browse ..." and navigate to a folder containing either a single CSV file or multiple CSV files composed of species location data. See below for recommended species location data sources and formatting requirements.
    • As the species location data is loaded, APCOAT will perform quality checks to ensure that there are the requisite minimum number of 5 records per species (80% of records are used for model training and 20% for model validation), that there are no formatting errors, and whether or not duplicates exist.
  4. - Select the species you would like to model which will use the same SDM predictor variables. Hold Ctrl to add single species to the selection, or Shift to add multiple species to the selection.

    Environmental Predictor Variable Selection
  5. - Select the environmental predictor datasets that you would like to evaluate for inclusion in the SDM of the selected species. The available predictor variables for will differ for terrestrial, flowing waterbodies, and static waterbodies. See below for descriptions of the available datasets. By default they will be installed at C:\Program Files (x86)\APCOAT\maxentPredictors. Hold Ctrl to add single variables to the selection, or Shift to add multiple variables to the selection.
    • Whenever possible, take care to select environmental predictor variables that are known to be relevant to the species being modeled, and exclude any variables that may be clearly irrelevant. Here we recommend that predictor variable selection be guided literature review and expert opinion if possible. Whether or not this guidance is available, we recommend a critical review of the final variables that are selected during the automated model iteration and evaluation, as statistical artifacts may persist from bias in species location data collection. For example, results of a SDM may show that a species exhibits a statistical preference for developed areas, when the reality may be that crowd-sourced location data is more likely to come from densely populated areas. In this case, "Frac_Developed" could be excluded from a second round of species distribution modeling.
    • If you are interested in using external environmental predictor variable datasets, they may either be placed in the default predictor folder before project initiation, or selected by clicking "Browse ..." and navigating to a folder containing the SDM predictor variable rasters of interest. See below for formatting requirements for using external rasters and vector datasets.
    • If a custom predictor variable is selected that contains categorical data such as land cover, the user must double click the name of the predictor in the selection window and add '(factor)' to the end of the name. This is done automatically for "landCover_NLCD_2019.tif", the land use/land cover raster, and "soilTexture_gSSURGO_2019tif", the soil texture raster, both included with APCOAT.

  6. Model Parameter Selection
  7. - Specify the correlation threshold. If predictor variables are correlated above this threshold, only the variable with the highest contribution to the distribution model will be kept. All other correlated variables will be excluded.
  8. - Specify the minimum predictor contribution threshold used to exclude SDM predictor variables.
  9. - Specify the minimum species distribution modeling threshold. This value is used to filter out low quality habitat values.

    Modeling Outputs
  10. - Click "Check Predictors and Observation Points". APCOAT will analyze and report on the alignment of the predictor and observation datasets. For terrestrial SDMs, APCOAT will report if observation points are dropped due to multiple observations being located within the same approximately 800 m pixel, or if observation records are dropped due to being located beyond the extent of a predictor raster and assigned null values for that predictor. For aquatic SDMs, APCOAT will report if observation records are dropped due to being located greater than 100 m from an aquatic feature, or if a selected predictor variable value is null for the closest aquatic feature.
     For either type of SDM, the dropped points are compiled for review in tables linked in the species location report.Reviewing these reports can help to decide if the observation points should be retained in modeling by either de-selecting the predictor variable with null values, or editing the location of the record to better align with the predictor variables. If it is determined that the location of the observation records can be reasonably edited to be included in the modeling, the "Clear SDM Results" function must be used before predictor and observation points can be analyzed and reported again.
  11. - Click "Generate Species Distribution Models". APCOAT will generate the following files in the "\SpeciesDistModels\" folder and be named as shown below. If the "Generate Species Distribution Model Report" box is checked, an optional report containing the model inputs and intermediate results will also be generated. Note that especially for terrestrial SDMs the output file sizes can be large for wide-ranging species (> 10 GB):


  12. For terrestrial species
    • Final SDM raster without the modeling threshold applied
      • [Scientific Name]_sdm_final_raster_raw.tif
    • SDM raster showing only values that exceed the modeling threshold
      • [Scientific Name]_sdm_final_raster_nan_[threshold].tif
    • SDM raster classifying locations where final values exceed the modeling threshold as 1
      • [Scientific Name]_sdm_final_raster_msk_[threshold].tif

    • The following files will also be created when a species is included in a co-occurrence assessment:
    • SDM raster rescaled to match the resolution, projection, and cell alignment of the CDL
      • [Scientific Name]_sdm_final_raster_nan_[threshold]_res.tif
    • Polygon shapefile showing the coverage of the co-occurrence regions where the modeling threshold is exceeded
      • [Scientific Name]_[co-occurrence resolution]_[threshold].shp
    • Captions for diagnostic image and text files detailing model iteration and production which are included in the assessment report.

    For aquatic species
    • Shapefile showing the input values used in aquatic features and the statistics of SDM values computed
      • rivers_[Scientific Name]_epsg5070.shp
    • SDM raster showing only values that exceed the modeling threshold
      • [Scientific Name]_sdm_final_raster_nan_[threshold].tif

    • The following files will also be created when a species is included in a co-occurrence assessment:
    • Shapefile showing values of predictor variables analyzed during modeling, and final SDM statistics
      • [rivers/ponds]_[Scientific Name]_epsg5070_[threshold].shp


Screenshot of example species location file

Species Location Data
 The user may provide their own species occurrence records, or collect them from third parties including:

  • GBIF.org - freely available batches of species occurrence records.
  • FESTF.org - batches of species occurrence records may be accessed through membership.
  • iNaturalist.org - individual species occurrence records, locations of threatened species will be obscured and require additional authorization for access at high resolution.
 When compiling species location data into a single CSV file or multiple CSV files, records must conform to the following requirements:

Field Title Data Type Data Requirements
Scientific Name Text
Latitude_DecimalDegrees Float Must be in decimal degrees
Longitude_DecimalDegrees Float Must be in decimal degrees

 A terrestrial and an aquatic example species location dataset[27] are installed by default in C:\Program Files (x86)\APCOAT\SpeciesLocationData\

Species Distribution Predictor Variables

Terrestrial Predictor Variables
The following rasters are included in the APCOAT installation package:

  1. - Average annual solar radiation, 1990 - 2020[5], annualSolar_PRISM.tif
  2. - Bioclimatic variables[28] listed in the reference table to the right, derived from 1990 - 2020 climate normals[5], BIO[XX]_[Variable].tif for all 19 variables
  3. - Distance to saltwater, calculated using NHD flowline and waterbody datasets[7].
  4. - Distance to freshwater, calculated using NHD flowline and waterbody datasets[7].
  5. - Elevation[26], elevation_EDNA_2018.tif
  6. - Land use/land cover[8], landCover_NLCD_2019.tif
  7. - Soil texture[6], based on representative soil profiles generated by soil series and further simplified to 41 texture classes. soilTexture_gSSURGO_2019.tif
  8. - Growing season length[8, 9], CHELSA_gsl_1981_2010.tif
  9. - Köppen climate classification[8, 9], CHELSA_kg0_1981-2010.tif
  10. - Net primary productivity[8, 9], CHELSA_npp_1981-2010.tif
  11. - Potential evapotranspiration during wettest/driest/warmest/coldest quarters[8, 9], CHELSA_pet_MaxPrecipQuarter.tif
  12. - Site water balance[8, 9], CHELSA_swb_2018.tif
  13. - Habitat heterogeneity[10], EarthEnv_EVI_cv.tif
Aquatic Predictor Variables
The following fields and data for SDM of flowing waterbodies are included in the APCOAT installation package:
  1. - BIO[XX]_[Variable], the mean value within the flowline catchment for all 19 BioClim variables
  2. - FCODE, the flowline type
  3. - Frac_[Class], the fraction of the given land use/land cover class found within a 100m buffer of the flowline[8]
  4. - MAXELEVSMO, elevation in meters[7]
  5. - StreamCalc, order of stream [7]
  6. - Tidal, 1 indicates the waterbody is tidal, 0 indicates that it is not[7]
  7. - SLOPE, [7]
  8. - Sinuosity, calculated as sinuousity = [flowline length]/[distance between flowline endpoints]
  9. - QE_MA, mean annual flow rate[7]
  10. - QE_StDev, standard deviation of monthly flow rates[7]
  11. - *VE_MA, mean annual velocity[7]. *Note that tidal flowlines and those impounded by dams are assigned a dummy value of 0.001 so this variable is not recommended for use for species found in those areas.
  12. - VE_StDev, standard deviation of monthly velocity[7]
  13. - D50_mm_, median bed-material particle size[30]
The following fields and data for SDM of static waterbodies are included in the APCOAT installation package:
  1. - Area_sq_km, area of the waterbody in square kilometers[7]
  2. - BIO[XX]_[Variable], the mean value within the waterbody for all 19 BioClim variables
  3. - Elevation_m, minimum elevation detected within the waterbody [26]
  4. - FCODE, waterbody type[7]
  5. - Frac_[Class], the fraction of the given land use/land cover class found within a 100m buffer of the waterbody[8]
  6. - Length_div_Area, the ratio of the waterbody perimeter to area
  7. - ONOFFNET, 1 indicates that the waterbody is hydrologically connected to flowing waterbodies, 0 indicate that it is not[7]
Using Additional Predictor Variables

 Rasters used in APCOAT species distribution modeling must be of identical resolution, projection, extent, and cell alignment. As such, in order to use additional predictor variable rasters the user may need to modify either the additional rasters or the existing ones if they are to be used together. Additional rasters can also be used independently of the rasters provided with APCOAT. Resampling, reprojection, and cell alignment functions are available in GIS software such as ArcGIS or QGIS. The bioclimatic, land cover, soil texture, elevation, and solar radiation raster datasets provided with APCOAT for terrestrial species distribution modeling are of 5 arc minute spatial resolution (approximately 800 m) and projected using the GCS_North_American_1983 projection, WKID: 4269.

 At this time, APCOAT does not support the use of additional vector predictor datasets. However, additional data may be appended to the existing flowline and waterbody datasets using GIS software. Since Maxent evaluates all possible features within the species range, any null values present in the additional data will result in an erroneous model where habitat quality predictions cannot be properly matched to spatial features.

Reference table for Flowing Aquatic FCODEs
FCODEDescription
33400Connector (no attributes)
33600CanalDitch (CanalDitch Type = null)
33601CanalDitch (CanalDitch Type = Aqueduct)
33603CanalDitch (CanalDitch Type = Stormwater)
42800:42823Pipelines
46000StreamRiver (no attributes)
46003StreamRiver (Hydrographic Category = Intermittent)
46006StreamRiver (Hydrographic Category = Perennial)
46007StreamRiver (Hydrographic Category = Ephemeral)
55800Artificial Path (no attributes) - Note that many of these have been filtered out from the NHDPlus database since they are either represented by waterbodies, or modeled predictor values are unavailable or unreliable. This may cause some flowline SDMs to not appear as fully connected hydrologic networks.
Reference table for Land Use/Land Cover Values derived from NLCD 2016
1 - water
2 - developed
3 - barren
4 - forest
5 - shrub/herb
6 - pasture/hay
7 - cultivated
8 - wetland

Reference table for WorldClim bioclimatic variables
BIO1 = Annual Mean Temperature
BIO2 = Mean Diurnal Range (Mean of monthly (max temp - min temp))
BIO3 = Isothermality ((BIO2/BIO7) x100)
BIO4 = Temperature Seasonality (standard deviation x100)
BIO5 = Max Temperature of Warmest Month
BIO6 = Min Temperature of Coldest Month
BIO7 = Temperature Annual Range (BIO5-BIO6)
BIO8 = Mean Temperature of Wettest Quarter
BIO9 = Mean Temperature of Driest Quarter
BIO10 = Mean Temperature of Warmest Quarter
BIO11 = Mean Temperature of Coldest Quarter
BIO12 = Annual Precipitation
BIO13 = Precipitation of Wettest Month
BIO14 = Precipitation of Driest Month
BIO15 = Precipitation Seasonality (Coefficient of Variation)
BIO16 = Precipitation of Wettest Quarter
BIO17 = Precipitation of Driest Quarter
BIO18 = Precipitation of Warmest Quarter
BIO19 = Precipitation of Coldest Quarter

Reference table for gSSURGO Soil Texture variables
Value Texture
1 Artifacts
2 Coarse sand
3 Loam
4 Clay loam
5 Coarse sandy loam
6 Fine sandy loam
7 Silt loam
8 Sandy loam
9 Loamy fine sand
10 Loamy sand
11 Loamy very fine sand
12 Sand
13 Silty clay
14 Silty clay loam
15 Very fine sandy loam
16 Bedrock
17 Boulders
18 Clay
19 Loamy coarse sand
20 Slightly decomposed plant material
21 Fine sand
22 Cobbles
23 Very fine sand
24 Highly decomposed plant material
25 Moderately decomposed plant material
26 Peat
27 Sandy clay loam
28 Muck
29 Material
30 Cinders
31 Consolidated permafrost
32 Fine gypsum material
33 Fragmental material
34 Gravel
35 Silt
36 Gypsiferous material
38 0
39 Sandy clay
40 Stones
41 Unweathered bedrock
42 Variable
43 Water

Reference table for Static Aquatic FCODEs
FCODEDescription
36100Playa
39000Lake/Pond - feature type only: no attributes
39001Lake/Pond - Hydrographic Category|intermittent
39004Lake/Pond - Hydrographic Category|perennial
39005Lake/Pond - Hydrographic Category|intermittent; Stage|high water elevation
39006Lake/Pond - Hydrographic Category|intermittent; Stage|date of photography
39009Lake/Pond - Hydrographic Category|perennial; Stage|average water elevation
39010Lake/Pond - Hydrographic Category|perennial; Stage|normal pool
40307Inundation
43600:43624Reservoirs
46600SwampMarsh (Hydrographic Category = null)
49300Estuary (no attributes)

5.4 Co-Occurrence Assessments

Methods
 With APCOAT, co-occurrence is measured by multiplying batches of SDM rasters by pesticide use footprints and summarizing the resulting rasters by zones of interest for ease of interpretation. When calculating co-occurrence APCOAT will also produce a report detailing the inputs variables, methods, and outputs involved in each step of the analysis.

 For aquatic co-occurrence assessments, the probabilities of co-occurrence are calculated separately for flowing and static waterbodies (Figure 2). Flowing waterbodies that do not receive flow from neighboring HUC12 watersheds are classified as tributaries and assigned the maximum and mean values of the probabilistic pesticide use footprint within their HUC12. A HUC12 connectivity database is then used to assign maximum and mean HUC12 pesticide use probabilities to flowing and static waterbodies that do receive inflow from neighboring HUC12s. For static waterbodies that are not connected to flowing waterbodies receiving inflow from neighboring HUC12s, a drainage area equal to 10 times the surface area is assumed and delineated using a linear buffer, and the maximum and mean pesticide use probabilities are calculated within these drainage areas. Co-occurrence is then calculated and reported separately for flowing and static waterbodies within the co-occurrence report.

To create probabilistic co-occurrence assesments:

  1. - Click the "Co-Occurrence" tab.
  2. - Confirm that the path to the use footprints folder is correctly associated with the current project. If not, click "Browse..." and navigate to the "UseFootprints" folder associated with the currently loaded assessment project database.
  3. - Select the crop footprints of interest.
  4. - Confirm that the path to the SDM folder is correctly associated with the current project. If not, click "Browse..." and navigate to the "SpeciesDistModels" folder associated with the currently loaded assessment project database.
  5. - Select the SDMs of interest.
  6. - (Optional). Select external NatureServe SDMs of interest.
     a - Provide a species name for the external SDM.
     b - Specify the path of the .TIF file of the external SDM.
     c - Specify the desired SDM threshold. Values below this threshold will be excluded from co-occurrence modeling.
     d - Include helpful metadata for the external SDM that will be included in the co-occurrence report.
     e - Click Submit to apply the modeling threshol and load the external SDM into the database.
  7. - Select the desired summary resolution.
  8. - Click "Generate Co-Occurrence Report". APCOAT will generate the following files for each combination of crop and species analyzed in "[Project Folder]\CoOccurrence\[Usage Footprint]\[Species name]\[SDM run number]\..." and be named as shown below:



6. Example Assessment

 This section shows the expected modeling outputs using an example co-occurrence assessment between two endangered species and applications of a pesticide to two crops. For each modeling tab the inputs, intermediate results, and selected report document figures are shown:
    6.1 Pesticide Use Footprints Example
    6.2 Species Distribution Modeling Examples
    6.3 Co-Occurrence Modeling Example

6.1 Pesticide Use Footprint Example

Inputs

  • The desired output is the probabilistic usage of malathion on alfalfa and cotton.
  • The usage data statistic of interest is the 90th percentile of annual usage as measured by USGS ePest estimates.
  • Application rates are provided in imperial units, 7.5 lbs/acre for both alfalfa and cotton.
  • Processing time was 70 minutes using a laptop with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz processor and 8 GB of RAM.

Intermediate Results
* indicates data will be included in co-occurrence report

  • Malathion_PCT_A84064T84064_ePest_state_90thpercentile.csv
    • Pesticide usage, maximum potential usage, and PCT for all crops, years, and regions
  • Malathion_PCTstats_A84064T84064_ePest_state_90thpercentile.csv
    • The annual usage statistic (90th percentile) calculated for each crop and region
  • * Malathion_PCTstatsRep_Alfalfa_84064_ePest_state_90thpercentile.csv
    • A range of statistics from minimum to maximum for a single crop and each region
  • Malathion_PCT_Alfalfa_84064_ePest_state_90thpercentile.tif
    • The national PCT raster generated for the specified usage statistic and crop at 30m resolution. The raster is in integer format and multiplied by a factor of 10,000 such that 100% PCT = 10,000 and 50% PCT = 5,000.
  • Malathion_ProbUseFp_Alfalfa_84064_ePest_state_90thpercentile.tif
    • The national probabilistic usage raster for the specified usage statistic and crop at 30m resolution. The raster is in integer format and multiplied by a factor of 10,000 such that 100% usage probability = 10,000 and 50% usage probability = 5,000. It is produced by multiplying the PCT raster by the probabilistic crop footprint included in the APCOAT installation.
  • Malathion_ProbUseFp_Alfalfa_84064_ePest_state_90thpercentile_64x.tif
    • The national probabilistic usage raster for the specified usage statistic and crop, showing the maximum values from the above raster at 1/64th the original spatial resolution (1,920m). The raster is in integer format and multiplied by a factor of 10,000 such that 100% usage probability = 10,000 and 50% usage probability = 5,000.
Co-Occurrence Report Result

 When a given probabilistic crop footprint is selected for use in a co-occurrence report, the map figure shown here will be automatically generated and included in the report document and co-occurrence folder ([Project Folder]\CoOccurrence\Malathion_ProbUseFp_Alfalfa_84064_ePest_state_90thpercentile.png). This example is for the 90th percentile of annual malathion usage by state as measured by USGS ePest estimates, multiplied by the probability that a given 30m pixel will be planted with alfalfa as measured by the USDA Cropland Data Layer.


Figure 1. Probabilistic use footprint for malathion applications on alfalfa.

6.2 Species Distribution Modeling Example

6.2.1 Terrestrial Species Distribution Modeling Example

Inputs
  • The desired outputs are Species Distribution Models (SDM) for two species, Bombus caliginosus and Hesperia dacotae.
  • The selected species distribution predictor variables include all of the terrestrial predictor variables included with APCOAT.
  • The correlation threshold used to exclude combinations of predictors that are strongly correlated with one another is left at the default value of 0.67.
  • The predictor contribution threshold used to exclude predictors during iterative model evaluation is left at the default value of 1 percent.
  • The species distribution modeling threshold used to filter out regions of low quality habitat is specified at 0.5. Regions of the final SDM with probabilistic habitat quality below 50% will not be included in the output raster.
  • Processing time was 80 minutes using a laptop with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz processor and 8 GB of RAM.
Intermediate Results
* indicates data will be included in co-occurrence report
  • *Bombus caliginosus_run001Parameters.txt
    • Each modeling parameter is listed on a separate line: all species predictor variable file names, discrete predictor variable file names, predictor correlation threshold, predictor contribution threshold
  • *Bombus caliginosus_PredCor.png
    • Correlation between predictor variables
  • Bombus caliginosus_sdm_iter[1...5]_real[1-5].rds
    • Intermediate modeling SDM results. Five model realizations are generated for each iteration of predictor variable combinations. In this example five iterations were run before iteration number 4 was chosen.
  • Bombus caliginosus_Contribution_Iter[1-X]_Real[1-5].png
    • The contribution of each predictor variable for all 5 realizations of each model iteration.
  • Bombus caliginosus_ResTable_Iter[1-5].rds
    • Diagnostic statistics for all 5 realizations of each model iteration.
  • Bombus caliginosus_ResVarsTable_Iter[1-5].rds
    • Table showing the contribution of each predictor variable for all 5 realizations of each iteration.
  • *Bombus caliginosus_ModelFitAUC_iter[3-5].png
    • Model fit as measured by total Area Under Curve for each model iteration and all previous iterations.
  • Bombus caliginosus_sdm_final_iter4_real[1-5].rds
    • Five realizations of the final selected SDM iteration.
  • Bombus caliginosus_sdm_final_raster.png
    • A map showing the average of the five final SDM realizations, and species locations. Be sure to consider any restrictions associated with redistributing species location data when publishing this figure.
  • Bombus caliginosus_sdm_sd_final_raster.png
    • A map showing the standard deviation of the five final SDM realizations, and species locations. Be sure to consider any restrictions associated with redistributing species location data when publishing this figure.
  • Bombus caliginosus_sdm_final_raster_raw.tif
    • The final SDM raster without the species distribution model threshold applied.
  • *Bombus caliginosus_final_stats.csv
    • CSV version of ResTable_Iter[1-5] for the final selected model iteration.
  • *Bombus caliginosus_sdm_final_auc.csv
    • Model fit as measured by Area Under Curve for each of the five realizations of the final model iteration.
  • *Bombus caliginosus_sdm_final_contribution.png
    • Contribution to model of each predictor variable included in the final model iteration.
  • *Bombus caliginosus_sdm_final_contvariablefit.png
    • The distributions of values sampled randomly and from species observations for all continuous variables included in the final model iteration. Since these graphs are produced via kernel density estimation, the initial estimated ranges may extend beyond the true range of the variables (e.g. fractions of land greater than 100 or less than 0). In these cases the ranges are truncated and density is set to 0 outside the true variable range..
  • *Bombus caliginosus_sdm_final_discvariablefit.png
    • The distributions of values sampled randomly and from species observations for all discrete variables included in the final model iteration.
  • Bombus caliginosus_sdm_final_raster_zer_500.tif
    • A raster of the final model iteration with the species distribution model threshold applied. Values below the threshold are set to 0.
  • Bombus caliginosus_sdm_final_raster_nan_res_ext_500.tif
    • The raster described above, resampled and reprojected to match the projection, resolution, and alignment of APCOAT pesticide usage footprints.
  • Bombus caliginosus_sdm_sd_final_raster.tif
    • A raster showing the standard deviation of the five realizations of the final model iteration.
  • Bombus caliginosus__[County, CRD, HUC8, HUC12, State]_[threshold].shp (and associated .dbf, .prj, .shx files)
    • The zones of interest within the extent of the final model iteration. These are generated when a species is selected for inclusion in a co-occurrence report.
Co-Occurrence Report Result

 When a given species crop footprint is selected for use in a co-occurrence report, the map figure shown here will be automatically generated and included in the report document and co-occurrence folder ([Project Folder]\CoOccurrence\HabitatSuitability_200.png). This example is for Bombus caliginosus, modeled with all of the recommended bioclimatic and elevation predictors, as well as the land use/land cover predictor included with APCOAT, and a model threshold of 50%.


Figure 8. Final mean SDM output for Bombus caliginosus.

6.2.2 Aquatic Species Distribution Modeling Example

Inputs
  • The desired outputs are Species Distribution Models (SDM) for two species, Notropis topeka and Scaphirhynchus albus.
  • The selected species distribution predictor variables include all of the static aquatic predictor variables included with APCOAT.
  • The correlation threshold used to exclude combinations of predictors that are strongly correlated with one another is left at the default value of 0.67.
  • The predictor contribution threshold used to exclude predictors during iterative model evaluation is left at the default value of 1 percent.
  • The species distribution modeling threshold used to filter out regions of low quality habitat is specified at 0.5. Regions of the final SDM with probabilistic habitat quality below 50% will not be included in the output raster.
  • Processing time was 60 minutes using a laptop with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz processor and 8 GB of RAM.
Intermediate Results
* indicates data will be included in co-occurrence report
  • Scaphirhynchus_albus.html
    • A summary of the five model realizations conducted (i.e. Scaphyrhynchus_albus_0.html - Scaphyrhynchus_albus_4.html) using Maxent. Includes analyses of omission/commission, response curves selected for each predictor variable, and variable contributions.
  • rivers_epsg[4269/5070].gpkg
    • GeoPackage file showing predictor values of flowlines used within species modeling extent.
  • speciesPoint_epsg[4269/5070].gpkg
    • GeoPackage file showing the species location records used in modeling.
  • background_iter[XX].csv
    • Values collected during background sampling.
  • *maxentResults_iter[XX].csv
    • Iterative statistical evaluation of Maxent modeling results.
  • Scaphirhynchus_albus_[0-4]*.csv.gpkg
    • Intermediate Maxent processing files.
  • *ModelFitAUC.png
    • Model fit as measured by total Area Under Curve for each model iteration.
  • *Scaphirhynchus_albus_PredCor.png
    • Correlation between predictor variables at species locations.
  • *variableFit.png
    • Graphs comparing predictor variable values at species locations and randomly sited background samples. Since these graphs are produced via kernel density estimation, the initial estimated ranges may extend beyond the true range of the variables (e.g. fractions of land greater than 100 or less than 0). In these cases the ranges are truncated and density is set to 0 outside the true variable range.
  • rivers_Scaphirhynchus albus_epsg[4269,5070].shp
    • Shapefile showing flowlines, their predictor values, their final SDM values.
  • *geoextent_bm.tif
    • Image showing the spatial extent of the SDM.
Co-Occurrence Report Result

 When a given species is selected for use in a co-occurrence report, the map figure shown here will be automatically generated and included in the report document and co-occurrence folder (run[XXX]_[speciesThreshold]\[rivers/ponds]_coo_[summaryResolution]_[Species name]__epsg5070.png). This example is for Scaphirhynchus albus, modeled using aquatic flowing predictors included with APCOAT, and a model suitability threshold of 50%.


Figure 8. Final mean SDM output for Scaphirhynchus albus.

6.3 Co-Occurrence Modeling Example

6.3.1 Terrestrial Co-Occurrence Modeling Example

Inputs

  • The desired outputs are reports on probabilistic spatial co-occurrence between applications of atrazine to corn, and areas where the probability of suitable habitat is greater than 50% for three terrestrial species: Arabis serotina (imported from a NatureServe SDM), Bombus caliginosus and Hesperia dacotae, and two aquatic species: Notropis topeka and Scaphirhynchus albus.
  • The spatial co-occurrence is to be summarized by Crop Reporting District for visualization purposes.
  • Processing time was 36 minutes using a laptop with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz processor and 8 GB of RAM.

Intermediate Results
* indicates data will be included in co-occurrence report
  • \[Project Folder]\CoOccurrence\Atrazine\Corn_28020_ePestHigh_state_Maximum\Bombus caliginosus\run001_500
    • Directory for results of a single co-occurrence assessment. Separate directories are created for each combination of species and crop.
  • *Bombus caliginosus_run001Parameters.txt
    • A copy of the SDM input parameters file. Each modeling parameters is listed on a separate line: all species predictor variable file names, discrete predictor variable file names, predictor correlation threshold, predictor contribution threshold.
  • ProbUseFp.tif
    • Probabilistic use footprint raster clipped to the extent of the species distribution model. The raster is in integer format and multiplied by a factor of 10,000 such that 100% usage probability = 10,000 and 50% usage probability = 5,000.
  • Coo_500.tif
    • Raster showing values of probabilistic co-occurrence between pesticide usage raster and SDM raster at 30m resolution
  • *Coo_500.csv
    • Table showing min, mean, and max values of probabilistic co-occurrence between pesticide usage raster and SDM raster at summary zone resolution
  • *Coo_500__CRD_USCensusBureau_org.csv
    • Zonal summary of average co-occurrence raster values
  • Coo_500__CRD_USCensusBureau_org.shp (and associated .dbf, .prj, .shx files)
    • Shapefile containing zonal averages of co-occurrence raster values
  • * SdmOverview.png
    • Map of the extent of the SDM
  • Coo_500__CRD_USCensusBureau_org_wgs84.shp (and associated .dbf, .prj, .shx files)
    • Shapefile containing zonal averages of co-occurrence raster values, projected to the WGS 1984 datum
  • *Coo_500__CRD_USCensusBureau_org.png
    • Map of zonal co-occurrence averages
  • *Atrazine_ProbUseFp_Corn_28020_ePestHigh_state_Maximum.png
    • Map of pesticide usage probability
  • *HabitatSuitability_500.png
    • Map of habitat suitability derived from SDM
  • APCOAT_report_Coo_500__CRD_USCensusBureau_org.html
    • Probabilistic co-occurrence report containing modeling inputs, outputs, and assessment
Co-Occurrence Report Results

 A report containing model inputs and outputs will be generated for each selected combination of species and crop and located in
\[Project Folder]\CoOccurrence\[Pesticide]\[Crop]_[Application Rate]_[Usage Data Source]_[Usage Data Summary Resolution]_[Usage Data Statistic]\[Species]\[SDM Version_Threshold]\.
 The example table shows statistics regarding co-occurrence over the species range. The example figure shows probabilistic co-occurrence between the extent of the SDM of Bombus caliginosus and atrazine applications on corn, summarized at the CRD scale. An example case study composed of ten compiled co-occurrence reports is available at https://stone-env.com/APCOAT.

Table 6. Statistics of probabilistic co-occurrence between the range of Bombus caliginosus and atrazine applications on cotton.
Number of CRD polygons included in assessment 18
Maximum of maximum co-occurrence by CRD 1.79e+01%
Minimum of maximum co-occurrence by CRD 0.00e+00%
Average co-occurrence for species range 1.24e-02%



Figure 9. Map showing probabilistic co-occurrence between the range of Bombus caliginosus and atrazine applications on corn, summarized at the CRD scale.

6.3.2 Aquatic Co-Occurrence Modeling Example

Inputs

  • The desired outputs are reports on probabilistic spatial co-occurrence between applications of atrazine to corn, and areas where the probability of suitable habitat is greater than 25% for two aquatic species: Elliptio spinosa and Lepidurus packardi.
  • The spatial co-occurrence is to be summarized by HUC12 for visualization purposes.
  • Processing time was 24 minutes using a laptop with an Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz processor and 8 GB of RAM.

Intermediate Results

 The intermediate results for an aquatic co-occurrence analysis are similar to those for a terrestrial analysis. The primary difference is that the results of the SDM process and co-occurrence calculations are stored in vector format (lines for flowing aquatic habitats, polygons for static aquatic habitats). The example shown to the right is for a static aquatic analysis, for flowing aquatic analyses the resulting files will be similar, with "rivers" in place of "ponds" in the filenames.

Co-occurrence results at the flow line or pond level are found in the following fields in the [ponds/rivers]_coo_[species]_epsg5070.shp files:
  • comid: Feature identifier
  • avg_habsui: Average habitat suitability
  • local_mean: Local mean usage probability
  • local_max_: Local maximum usage probability
  • upstream_m: Upstream mean usage probability
  • upstream_1: Upstream max usage probability
  • local_me_1: Local mean co-occurrence
  • local_ma_1: Local max co-occurrence
  • upstream_2: Upstream mean co-occurrence
  • upstream_3: Upstream max co-occurrence

The maximum values of the pond or river features are aggregated to the selected summary region in the following fields in the [ponds/rivers]_coo_[region]_[species]_epsg5070.shp files:
  • nmb_lines/ponds: Number of lines considered within region (i.e., number of suitable habitat flow lines)
  • max_avg_ha: Maximum habitat suitability index
  • max_local_: Maximum of local mean usage probability
  • max_loca_1: Maximum of local maximum usage probability
  • max_upstre: Maximum of upstream mean usage probability
  • max_upst_1: Maximum of upstream maximum usage probability
  • max_loca_2: Maximum of local mean co-occurrence
  • max_loca_3: Maximum of local maximum co-occurrence
  • max_upst_2: Maximum of upstream mean co-occurrence
  • max_upst_3: Maximum of upstream maximum co-occurrence
Co-Occurrence Report Results

 A report containing model inputs and outputs will be generated for each selected combination of species and crop and located in
\[Project Folder]\CoOccurrence\[Pesticide]\[Crop]_[Application Rate]_[Usage Data Source]_[Usage Data Summary Resolution]_[Usage Data Statistic]\[Species]\[Habitat Type]\[SDM Version_Threshold]\.

 The example table shows statistics regarding co-occurrence over the species range. The example figure shows probabilistic co-occurrence between the extent of the SDM of Elliptio spinosa and atrazine applications on corn, summarized at the HUC8 scale. An example case study composed of ten compiled co-occurrence reports is available at https://stone-env.com/APCOAT.

Table 7. Statistics of probabilistic co-occurrence between the range of Elliptio spinosa and atrazine applications on cotton.
Number of HUC08 polygons included in assessment 11
Maximum average HUC08 co-occurrence 8.07e+01%
Minimum average HUC08 co-occurrence 1.92e+01%
Local co-occurrence using mean pesticide usage 4.08e-01%
Local co-occurrence using maximum pesticide usage 3.49e+01%
Upstream co-occurrence using mean pesticide usage 2.52e-01%
Upstream co-occurrence using maximum pesticide usage 4.50e+01%



Figure 10. Map showing probabilistic co-occurrence between the range of Elliptio spinosa and atrazine applications on corn, summarized at the HUC8 scale.

7. Citation

Suggested citation for this manual:
Dunne, J., Rathjens, H., Winchell, M. (2023). User Guide Automated Probabilistic Co-Occurrence Assessment Tool Version 2.0. Stone Environmental Inc., Montpelier, Vermont.

Suggested citation for APCOAT software:
Dunne, J., Rathjens, H., Winchell, M. (2023). Automated Probabilistic Co-Occurrence Assessment Tool. Stone Environmental Inc., Montpelier, Vermont.


8. References

1. Budreski, K., Winchell, M., Padilla, L., Bang, J., & Brain, R. A. (2016). A probabilistic approach for estimating the spatial extent of pesticide agricultural use sites and potential co-occurrence with listed species for use in ecological risk assessments. Integrated environmental assessment and management, 12(2), 315-327. https://doi.org/10.1002/ieam.1677

2. Richardson, L., Bang, J., Budreski, K., Dunne, J., Winchell, M., Brain, R. A., & Feken, M. (2019). A probabilistic co-occurrence approach for estimating likelihood of spatial overlap between listed species distribution and pesticide use patterns. Integrated environmental assessment and management, 15(6), 936-947. https://doi.org/10.1002/ieam.4191

3. Wieben, C.M. (2019). Estimated Annual Agricultural Pesticide Use by Major Crop or Crop Group for States of the Conterminous United States, 1992-2017 (ver. 2.0, May 2020): U.S. Geological Survey data release. https://doi.org/10.5066/P9HHG3CT.

4. Phillips SJ, Anderson RP, Dudík M, Schapire RE, Blair ME. (2017). Opening the black box: an open-source release of Maxent. Ecography. 40(7):887-893. https://doi.org/10.1111/ecog.03049

5. PRISM Climate Group, Oregon State University, https://prism.oregonstate.edu, data created November 2021, accessed August 2022.

6. Soil Survey Staff, Natural Resources Conservation Service, United States Department of Agriculture. (2022) Soil Survey Geographic (SSURGO) Database. https://sdmdataaccess.sc.egov.usda.gov. Accessed November 2022.

7. US EPA and USGS. (2012). National Hydrography Dataset Plus - NHDPlus. Accessed May 2019.

8. Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, P., Kessler, M. (2017): Climatologies at high resolution for the Earth land surface areas. Scientific Data. 4 170122. https://doi.org/10.1038/sdata.2017.122

9. Karger D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E, Linder, H.P., Kessler, M. (2018): Data from: Climatologies at high resolution for the earth’s land surface areas. EnviDat. https://doi.org/10.16904/envidat.228.v2.1

10. Tuanmu, M.-N. and W. Jetz. (2015) A global, remote sensing-based characterization of terrestrial habitat heterogeneity for biodiversity and ecosystem modeling. Global Ecology and Biogeography. DOI: 10.1111/geb.12365

11. Multi-Resolution Land Characteristics Consortium. (2021). NLCD 2019 Land Cover (CONUS) | Multi-Resolution Land Characteristics Consortium. (2019). https://www.mrlc.gov/data/nlcd-2019-land-cover-conus

12. USFWS. (2019). Endangered Species Act Section 7 consultation. https://www.fws.gov/endangered/laws-policies/section-7.html.

13. National Research Council. (2013). Assessing risks to endangered and threatened species from pesticides. https://doi:10.17226/18344.

14. US EPA. (2016). Biological evaluation chapters for diazinon ESA assessment. US EPA https://www.epa.gov/endangered-species/biological-evaluation-chapters-diazinon-esa-assessment.

15. US EPA. (2016). Biological evaluation chapters for malathion ESA assessment. US EPA https://www.epa.gov/endangered-species/biological-evaluation-chapters-malathion-esa-assessment

16. US EPA. (2016). Biological evaluation chapters for chlorpyrifos ESA assessment. US EPA https://www.epa.gov/endangered-species/biological-evaluation-chapters-chlorpyrifos-esa-assessment.

17. US EPA. (2020). Revised method for national level endangered species risk assessment process for biological evaluations of pesticides. US EPA https://www3.epa.gov/pesticides/nas/revised/revised-method-march2020.pdf

18. Richardson, L. L., Dunne, J., Feken, M., Brain, R., Ghebremichael, L., & Winchell, M. (2021). Probabilistic co-occurrence assessment for suites of listed species. Integrated environmental assessment and management. https://doi.org/10.1002/ieam.4542

19. Moskwik, M., Mainali, K., Pavelka, M., Nicolaysen, T., Juliusson, L., Chhetri, D., Shultz, M. (2019). USFWS refined range maps for threatened and endangered species--standard operating procedures. V. 3.0. https://ecos.fws.gov/docs/SR_SOP/SDM_SOP_Final_14Nov2019.pdf

20. United States Department of Agriculture, National Agricultural Statistics Service. (2019). Cropland data layer. https://nassgeodata.gmu.edu/CropScape/

21. Multi-Resolution Land Characteristics Consortium. (2021). NLCD 2016 Land Cover (CONUS) | Multi-Resolution Land Characteristics (MRLC) Consortium. https://www.mrlc.gov/data/nlcd-2016-land-cover-conus

22. USDA - National Agricultural Statistics Service - Census of Agriculture. (2021). https://www.nass.usda.gov/AgCensus/

23. United States Department of Agriculture, National Agricultural Statistics Service. (2020). Cropland Data Layer - Metadata. https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/meta.php

24. United States Census Bureau. (2021). ANSI and FIPS codes. https://www.census.gov/library/reference/code-lists/ansi.html

25. USDA National Agricultural Statistics Service. (2018). USDA - National Agricultural Statistics Service - Data and Statistics - County Data FAQs. https://www.nass.usda.gov/Data_and_Statistics/County_Data_Files/Frequently_Asked_Questions/

26. Elith, J., Phillips, S. J., Hastie, T., Dudík, M., Chee, Y. E., & Yates, C. J. (2011). A statistical explanation of MaxEnt for ecologists. Diversity and distributions, 17(1), 43-57. https://doi.org/10.1111/j.1472-4642.2010.00725.x

27. GBIF.org. (2022). GBIF Occurrence Download. https://doi.org/10.15468/dl.yu2nqj

28. Fick SE, Hijmans RJ. (2017). WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology. 37(12):4302-4315. https://doi.org/10.1002/joc.5086

29. USGS. (2018). Elevation Derivatives for National Applications (ENDA) | U.S. Geological Survey. https://www.usgs.gov/special-topics/elevation-derivatives-for-national-applications

30. Abeshu, G. W., Li, H. Y., Zhu, Z., Tan, Z., & Leung, L. R. (2022). Median bed-material sediment particle size across rivers in the contiguous US. Earth System Science Data, 14(2), 929-942.