LAGOS Database overview

LAGOS Research Platforms

A central focus of the LAGOS research program has been the development of LAGOS research platforms that are research-ready to study all lakes within a specified geographic extent, that are intended to integrate with other existing lake databases where possible, and that allow for future open-access additions to the platform (i.e., are extensible). Although lakes are the focal unit of study of the LAGOS research platforms, studying land-water interactions requires not only in situ lake measurements, but also descriptions of the lakes, their watersheds, and their landscape ecological context. Further, each lake’s ecological context can be influenced by processes occurring at a variety of spatial and temporal extents.

Therefore, each LAGOS research platform includes the following features: 

(a) A pre-defined spatial extent for the entire platform; 

(b) A modular design with multiple open-access, version-controlled database modules. Each module groups similar types of features that were compiled or quantified using similar approaches, the modules are interoperable and fully integrated with one another, and the design allows any future user to develop specialized modules;

(c) Observations of not only in situ characteristics of lakes, but also characteristics of the surrounding land and air (e.g., land use, geology, atmospheric deposition, climate) at multiple extents for space (e.g., lake shoreline, lake watershed, ecoregion) and time (e.g., monthly, annual, decade) that are connected to lakes via unique identifiers; 

(d) Representation of all lakes (above a pre-determined surface area threshold) within the spatial extent for population studies of lakes; 

(e) Geospatial datasets of key spatial features for the integrated study of lakes at broad spatial extents;

(f) Detailed documentation and metadata (methods, data sources, data processing steps, and possible data limitations) that make each database module research-ready and allow future users to develop other specialized modules that can be integrated into the LAGOS platform;

(g) Unique identifiers from other common lake databases to enable easy integration with other datasets to further the geospatial and temporal analysis of lake studies across broad spatial extents; and

(h) An R package that facilitates access to and querying of the database modules.

Currently, the LAGOS program has created two stand-alone research platforms. LAGOS-NE was created for 17 US states in the upper Midwest and Northeast regions. LAGOS-US is being created for the conterminous US. Although they are distinct platforms that were developed approximately 5 years apart, many of the methods and some of the data overlap, and both use the same unique identifier for lakes (lagoslakeid), which allows full integration. To help users decide which research platform to use, we describe the high-level differences next. For further descriptions of each research platform, please see the detailed documentation for each as well as the above links.

LAGOS-NE vs LAGOS-US Research Platforms

There are several key differences between the two platforms. We describe them below to guide users of the LAGOS Research Platforms and the specific data modules within them:

Spatial extent: LAGOS-NE includes 17 US states (and additional tribes) from the upper Midwest and northeastern US; LAGOS-US covers the conterminous US.

Time period: LAGOS-NE was created from 2011-2017 and so the source data that underlie the data modules are from this date range; LAGOS-US was created from 2017-2022 and so its source data originate from this time period.

Data sources for the LIMNO data module in LAGOS-NE: Data were obtained from individual sampling programs from each state or tribal agency, multiple community-science datasets obtained from individual programs, university datasets, and non-profit datasets. At the time of data module creation, there were few agency and other datasets from the LAGOS-NE states that were in the Water Quality Portal (WQP); therefore, most of the datasets were individually obtained and made inter-operable by putting them into the LAGOS data model.

Data sources for the LIMNO data module in LAGOS-US: Data were obtained only through downloads from the Water Quality Portal. However, because it is not known at this time whether all of the data that were integrated into LAGOS-NE are currently in the WQP, there may be data in LAGOS-NE that are not in LAGOS-US (but only for the variables that are in LAGOS-NE, which are a subset of the variables in LAGOS-US; see next point).

Lake measurements for the LIMNO data module in LAGOS-NE: Include only selected surface-water quality types of observations (e.g., major nutrients, chlorophyll, Secchi depth).

Lake measurements for the LIMNO data module in LAGOS-US: Include more types of lake observations beyond just basic water quality (e.g., water temperature, lake chemistry, contaminants) and some lake profile data.

Minimum lake size: The minimum lake surface area for LAGOS-NE is 4 ha (and so, lake watersheds are delineated only for lakes ≥ 4 ha), and the minimum lake surface area for LAGOS-US is 1 ha (with lake watersheds for all lakes ≥ 1 ha). In both platforms, the errors associated with the geospatial representation of lakes are similar to the errors present in the NHD datasets that form the basis of all LAGOS data products.

Lake watershed delineations and lakes in LOCUS: The methods and underlying data used for watershed delineations for LAGOS-US have been improved over the methods and data that were available for LAGOS-NE; thus it is recommended that watersheds from LAGOS-US be used over those from LAGOS-NE. Further, because LAGOS-US LOCUS is based on the most recent NHD dataset, it is recommended that users work with the LAGOS-US LOCUS module even when they use the LAGOS-NE LIMNO data module.

Variables in the GEO data module: There are additional variables and additional time periods in the LAGOS-US GEO module; therefore it is recommended that users use the LAGOS-US GEO module, even when using the LAGOS-NE LIMNO data.

Extension modules in LAGOS-US: Building upon the 3 core LAGOS modules in both LAGOS-NE and LAGOS-US (LOCUS, GEO, LIMNO), LAGOS-US has 4 extension modules: RESERVOIR (machine learning-based classification of natural lakes vs. reservoirs), LAKE DEPTH (manually compiled maximum and/or mean depth for over 17,000 lakes), NETWORKS (898 lake networks consisting of 86,511 lakes), and LANDSAT (satellite-estimated water quality for all lakes ≥ 4 ha).
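
The recommendation above to pair the LAGOS-NE LIMNO module with the LAGOS-US LOCUS or GEO modules relies on the shared lagoslakeid key. The following is a minimal sketch in Python of such a pairing, not the LAGOS R package or its API. The rows and every column name other than lagoslakeid (secchi_m, lake_area_ha) are hypothetical and exist only to show the join.

```python
# Hypothetical rows standing in for two module tables.
# Only the key name `lagoslakeid` comes from the LAGOS description;
# `secchi_m` and `lake_area_ha` are invented for illustration.
limno_ne = [
    {"lagoslakeid": 101, "secchi_m": 2.4},
    {"lagoslakeid": 102, "secchi_m": 1.1},
    {"lagoslakeid": 999, "secchi_m": 3.0},  # no matching lake in locus_us below
]
locus_us = [
    {"lagoslakeid": 101, "lake_area_ha": 12.5},
    {"lagoslakeid": 102, "lake_area_ha": 4.8},
    {"lagoslakeid": 103, "lake_area_ha": 1.6},
]


def left_join(left, right, key="lagoslakeid"):
    """Attach the matching right-hand row (if any) to each left-hand row."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key], {})
        # Left-hand values win if both tables share a column name.
        joined.append({**match, **row})
    return joined


joined = left_join(limno_ne, locus_us)

# Lakes with water-quality rows but no lake-characteristics row are worth
# inspecting before analysis rather than silently dropping.
unmatched = [r["lagoslakeid"] for r in joined if "lake_area_ha" not in r]
```

A left join keeps every water-quality row, so identifiers with no match in the second table stay visible in `unmatched` instead of vanishing from the result.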