LAGOS is a multi-scaled database system and a set of tools to study lake water quality at macroscales. Within the LAGOS umbrella, we have created and shared: versioned water quality databases, software and code for analysis of geographic and tabular data, procedures and policies for team science, and research publications. This website provides the background on the creation of LAGOS, the description of the funded projects and research teams that created LAGOS, a summary of the research products, and a description and location of the documentation for users of LAGOS products. For any questions about the LAGOS system, please contact any of the current LAGOS developers.


The goal of LAGOS is to create the infrastructure of people, data, and computer tools needed to study water quality at broad scales for research, management, policy, education, and outreach. We use interdisciplinary approaches from a wide range of fields, including limnology, landscape limnology, geographic information science, ecoinformatics, data mining, machine learning, and statistics. Our approach is built on a foundation of open science, team science, and data-intensive science. By taking an open-science perspective and by combining site-based lake datasets and national geospatial datasets, scientists gain the ability to ask important research questions related to grand environmental challenges that operate at macroscales.

Why study lake water quality at macroscales?

Understanding the factors that affect lake water quality and the ecological services provided by lakes is an urgent global environmental issue. Predicting how lake water quality will respond to global changes requires not only water quality data but also information about the ecological context of individual lakes across broad spatial extents. However, lake water quality is usually sampled in limited geographic regions, often for limited time periods. To study water quality across regions, continents, and the globe, scientists must therefore compile many lake water quality and geographic datasets into an integrated database.

Guiding principles in LAGOS database development to foster open science [adapted from Table 1 in Soranno et al. 2015]

  • LAGOS databases are fully documented, including descriptions of: the original data providers or sources, database design, all data processing steps and code for all data, possible errors or limitations of the data for the integrated dataset and individual datasets, and methods and code for geospatial data processing.
  • To the greatest degree possible, existing community data standards are used to facilitate integration with other efforts.
  • To the greatest degree possible, the provenance of the original data is preserved through to the final data product.
  • LAGOS databases use a versioning system to track different versions of the database for future users and to facilitate reproducibility.
  • LAGOS databases are made publicly accessible in an online data repository with a permanent identifier using non-proprietary data formats.
  • LAGOS data paper(s) will be written with the original data providers as coauthors to ensure recognition of data-providers.
  • LAGOS database-methods paper(s) will be written with the data-integration team as coauthors to ensure recognition of data-integrators.
  • Once a LAGOS database is made available in a data repository and is open access, whether it is static (no further data is added to the database) or ongoing (data continues to be added to it), a set of community policies governs how other scientists use and cite the database.