Welcome to the LAGOS Visualization Blog. Here we will post interesting and fun visualizations of data from LAGOS as well as other posts that we find interesting to share and talk about. In this blog, we focus mostly on data visualizations because it is not easy to fully capture the complexity of macroscale data, which includes a range of environmental characteristics across both space and time. So, this space is devoted to thinking creatively about visualization for macrosystems ecology. Current members of the ‘Continental Limnology’ Team will be posting here.

We Didn’t Start the Fire, but we did research it

By Emily Wasen

In the cold months of spring semester 2022, I was sitting in my 8:30am Conservation Biology lecture dreaming of the warmth of summer and wondering what was in store in just a few short months, when I was told about an undergraduate research assistant job exploring the effects of wildfire. That announcement would soon change the trajectory of my senior year of college. Little did I know at the time, my summer heat would be coming from a Minnesota wildfire.

In June 2022, I was fortunate enough to begin as an undergraduate research assistant with Dr. Ian McCullough in the Data Intensive Landscape Limnology Lab at Michigan State University. Going into this experience, I was very nervous. I had no prior lab experience except in a few of my classes and did not know what to expect. I am a Zoology major with a concentration in Zoo and Aquarium science and minors in Environmental Sustainability Studies and Marine Ecosystem Management. That being said, my previous knowledge about wildfires was fairly limited to what I have been told by Smokey the Bear. With my natural love of animals and constantly learning more about their biology in my classes, the chance to focus more on the environmental side of an ecosystem sparked my interest in applying for this position. Reading more into the project, I thought it was fascinating to look at not only what happens to the environment after a drastic change, but also how that environment can bounce back in the coming months and years.

Sign with Smokey the Bear, showing fire risk in Superior National Forest, Minnesota.

Our project focuses on the repercussions of the 2021 Greenwood wildfire in Isabella, Minnesota. The main goal of this project is to understand the effects wildfires have on lake water quality. This study is the largest of its kind as we are looking at 30 different lakes (15 in burned watersheds, 15 in unburned watersheds, i.e., control) over five months. Lots of wildfire research has been done in the Western US, but not much has been done in the Midwest, where wildfires are becoming increasingly common. The summer of 2021 was marked by especially high temperatures paired with very dry weather. This fire in particular is recorded as one of the largest in state history.

A side-by-side comparison of the differences between the shoreline of a burned lake (Stony Lake) and a control/unburned lake (Flathorn Lake).

I was very excited to be granted the opportunity to go out to the burn site and aid in collecting data. The MSU team and I were able to travel up to Minnesota in both June and July of 2022. In the field we traveled out lake by lake via canoe or pontoon, collecting water samples to take back to the lab and analyze for variables such as phosphorus and nitrogen. We expect phosphorus and nitrogen to increase in burned lakes due to runoff from the burn area. This matters for other measured variables as well, because these nutrients promote primary productivity, as measured by chlorophyll. We also recorded characteristics used to build a profile of each lake, including dissolved oxygen content. This variable is directly correlated with lake stratification and with how nutrient runoff from the burn area affects the lake’s water quality. It was so cool to be out in the field and see how the wildfire has visibly affected the environment, especially seeing the regrowth that had already begun. My favorite part was canoeing out onto each lake. The scenery was beautiful to look at whether burned or unburned; it was so peaceful to be out continuously immersed in nature.

Working alongside Dr. Jennie Brentrup in MN collecting a water sample on Stony Lake.

This position allowed me to gain my first lab experience as well as my first field experience. I learned so much this summer and I am continuing to do so with the help of all the amazing people I have had the privilege of working side by side with. This has been such a great experience so far and I am happy to have the opportunity to be continuing our research this semester. Having been out in the field this summer makes it that much more interesting when we have new data coming in that we get to analyze. I am grateful for all the important skills I have learned over the past six months. Working on this project has opened me up to new possibilities and potential career paths I can explore post-graduation. I am looking forward to continuing my work in the lab and am excited to see how the data will play out and how it will influence other studies in the years to come. 

Earth, Wind and Wildfire

My name is Andrea Paul and in May of 2022, I joined the Data Intensive Landscape Limnology Lab at MSU to work as an undergraduate research assistant with Dr. Ian McCullough. My major is Fisheries and Wildlife with a minor in Environmental Studies and Sustainability. I have had a strong interest in animals since I was born and as I got older I began to appreciate the complex ecosystems behind them. My deep interest in animals combined with the knowledge of the disastrous effects humans and climate change have had on the world has sparked my passion for conservation and sustainability. I didn’t know what direction this would take me in my career, but I knew that I wanted to take advantage of the research opportunities this university provides. I am a strong believer in the idea that you don’t know what something is like until you try it, so I wanted to see first-hand what research was like and if it would be something I could see myself doing in the future. This position in freshwater and fire ecology aligned with my interests and matched my goals of gaining research experience, building technical skills and enhancing my knowledge of ecology.

The research project I joined is studying the effects of wildfires on lakes. Wildfires are becoming more and more prevalent in the American West due to climate change and lots of research has been targeted to that location, but not much has focused on the Midwest. Many research projects on wildfire and limnology (the study of inland waters) have solely looked at rivers and streams, but this project is unique in that it focuses on freshwater lakes. Specifically, we looked at how the 2021 Greenwood fire in Isabella, MN affected the water quality of 30 surrounding lakes in the Superior National Forest. We took two field trips to Isabella, MN where we sampled the lakes for nutrient concentration, acidity, chlorophyll and more. We expected that the forest would have intense burn marks with little of the past vegetation left. Instead, what we found was patches of burnt areas interspersed with unburned areas. It was very surprising to see how mixed the landscape was in terms of fire damage. We also expected that our water quality data would show an increase in nutrient concentration between May and June, as nutrients from the burned area run off into the water after the snow melts. This prediction did hold true and I’m excited to see what the following months show us. That was my first experience doing research in the field and it was so much fun! I learned how to use sampling instruments, I met some amazing researchers and I got to experience what a huge effort it requires to conduct a research study. Now we are using the data we collected, along with data from LAGOS, a public database of all US lakes, to determine the relationships between variables such as burn severity, nutrient concentration and time since fire. We are doing this analysis in R, a freely available statistical analysis platform.

My experience this summer working on this project has been eye-opening and very rewarding. I really enjoyed sampling the lakes, seeing the Superior National Forest and contributing to research on limnology. Learning about historical fire regimes and past studies on how wildfires affect lakes and wildlife, while building my research knowledge and skills, was very enjoyable and I believe it will help me in my future career. I am continuing to work in the lab this semester while the project comes to a close, and I am eager to see how it wraps up and what implications it will have for future research on the effects of wildfires on lakes.

Duck, Here Come the Big Data!

The Backstory of One Young Graduate Student’s Research

By Marcella Domka

In August of 2020, I started my ‘next step’ after college as a graduate student at Michigan State University. I knew from the moment I received an invitation to join the Data Intensive Landscape Limnology (DILL, find more at https://bigdatalimno.org/) lab, a lab that bases many of its projects and endeavors on big data, that I was facing an exciting challenge. Big data was a fairly new concept to me. I had several research experiences during my time as an undergraduate that helped me understand the vastly complex world of ecological interactions and freshwater systems, but none that spanned the extent of the entire United States and included thousands of lakes. It was daunting, to say the least!

However, I knew that I had a strong interest in freshwater ecology and wildlife, and wanted to better understand environmental interactions between biota (living components of an ecosystem) and abiota (non-living components of an ecosystem) at a much larger scale. The DILL lab was the perfect place to embark on this journey of greater understanding. I began my first semester as a master’s student working with my advisor, Dr. Kendra Spence Cheruvelil, a co-founder of the DILL lab. I remember that one of the first ‘tasks’ she assigned me was to brainstorm ideas for what would eventually become my thesis research. I had quite a few thoughts bouncing around in my head already, so I was happy to take this on and dive right into the questions that the fields of ecology, limnology, and big data science have to offer.

Right away, I knew that I wanted to study something about eutrophication. Eutrophication, or the excess concentration of nutrients in freshwater environments, is often caused by point and nonpoint sources of pollution (e.g., septic tanks and fertilizer runoff from agriculture, respectively). This excess of nutrients can lead to algal blooms that remove oxygen from the water. This results in hypoxic conditions and subsequently the death of many aquatic organisms. 

During an internship I had through the NSF-funded LAKES (Linking Applied Knowledge in Environmental Sustainability) program in summer of 2019, I studied the widespread eutrophication and phosphorus pollution of Lake Menomin in Menomonie, Wisconsin. It was a fantastic experience in which I learned a wide variety of laboratory and field techniques, conducted literature reviews and composed a graduate-level research poster, and most of all, learned which elements of freshwater ecology were most intriguing to me.

Lake Menomin on June 18th, 2019. The lake was not yet experiencing algal blooms and no green coloration can be seen (yet!).
Lake Menomin on July 15th, 2019. As you can see, in about 1 month, the lake turned a bright green color due to excessive nutrient content (particularly high levels of phosphorus pollution).

I realized that integrating eutrophication as a major component of my thesis research would allow me to continue studying this concept that I clearly had a passion for. Eutrophication, and the concentrations of nutrients that may indicate eutrophic conditions, would be the foundation of my research. 

Before my first research question could be fully formulated, however, I had to consider an additional component. I knew upon my acceptance to the DILL lab that I would be working with the LAGOS-US RSVR database (find more at https://lagoslakes.org/), which contained hundreds of thousands of data rows about two types of waterbodies: natural lakes (NLs) and reservoirs (RSVRs). A ‘natural lake’ is typically naturally formed, with no apparent flow-altering structures present, whereas a ‘reservoir’ is a lake that is likely to be human-made, or to include some sort of large, flow-altering, human-made structure. Natural lakes and reservoirs differ in a variety of ways, with reservoirs typically being warmer, with larger watershed areas and larger ratios of basin to lake/reservoir surface area. Additionally, reservoirs are not well studied in comparison with natural lakes.

Part of the cornerstone of my research would be to investigate major differences between these two types of lakes, including differences in nutrient concentrations, which is where eutrophication comes back into play. Thus, my first research question is: are natural lakes or reservoirs more likely to have higher concentrations of total phosphorus and chlorophyll-a and lower water transparency (variables that typically indicate eutrophication)?

Yes, I did mention ‘first’ research question. And yes, while I was happy that I would be studying nutrient concentrations across two distinct waterbody types (natural lakes and reservoirs), I knew there was something else missing from the scope of my research. From a very young age, I’d always loved everything about the natural world, but something about wildlife was especially captivating. I knew my master’s thesis wouldn’t feel complete without a wildlife component. 

After bringing this interest up to my advisor and getting feedback from my other DILL lab members, I did some searching to find readily available wildlife data. I decided to use waterfowl (aquatic birds such as ducks, geese, mergansers, etc.) data from the United States Fish and Wildlife Service (find more at Migratory Bird Data Center – About the Atlantic Flyway Breeding Waterfowl Survey). After performing a detailed literature search, I understood that there were myriad factors that may influence waterfowl use of aquatic habitats (such as lakes and reservoirs), so I knew that incorporating these data with the LAGOS nutrient data would form the perfect ‘second’ research question. Thus, I’m asking: are natural lakes or reservoirs more likely to be associated with more species of and more abundant waterfowl?

Once these questions were finalized, I felt that my thoughts and passions were truly fueling my research questions. I came to graduate school to address multifaceted ecological questions, and I feel that I have embraced that process. While brainstorming, writing a thesis proposal, and performing months of data exploration and statistical analysis have proved challenging, I haven’t regretted anything. The only way forward is understanding.

How a walk in the park provided the “spark” I didn’t know I needed

by Ian McCullough

Academic researchers are trained to package their science as neatly manicured manuscripts. We provide an overview of a topic and then describe what we did, what we found and what it all means. Rarely, however, do we hear about how an idea or project came to be in the first place. Today, I would like to pull back the curtain and share an anecdote about an important day in my research career.

It was late June 2016. At the time, I was a graduate student at UC Santa Barbara. On the back end of a road trip to Oregon for a wedding, I stopped in Lassen Volcanic National Park, California, an off-the-beaten-path park I would imagine few Americans have even heard of.

Early in the morning of my only full day in the park, I left my beautiful campsite along Manzanita Lake and headed east along the park’s only major road toward the Cluster Lakes Loop. Having previously lived in Maine, I sorely missed these formerly glaciated, lake-rich landscapes that in California could only be found at the highest elevations. 

One thing I did not miss about Maine was the mosquitoes. Even though it was late June, at an elevation of about 7000 ft (2134 m), the snowpack had not completely melted beneath the shady coniferous canopy, which meant there were plenty of puddles ripe for mosquitoes to breed. Having grown used to the dry climate of Southern California, I clearly was not prepared for this, so my solution was to bundle up and walk quickly.

After basically putting my head down and booking it for a few miles, suddenly the mosquitoes disappeared. I found myself in complete sunlight and the air felt significantly warmer and drier. It turned out I was in the middle of a burned area from the 2012 Reading Fire. As I walked through the skeleton of the pine-fir forest, I couldn’t help but notice how the fire had transformed what would have been a cool, dark, wet, mosquito-infested patch of forest into a warm, bright, dry, mosquito-free patch of forest. Clearly, the fire had had an effect on the landscape, and probably in many more ways than just these.

What a difference a fire can make across the landscape. These forest patches were just a few hundred meters apart, yet their environments were completely different. The top patch was shady, cool, moist and mosquito-infested, whereas the bottom patch was bright, warm, dry and mosquito-free. If fire effects on the terrestrial landscape are this noticeable, what about fire effects on the freshwater landscape?

Before long, however, I came across some lakes with completely burned shorelines. Clearly the fire had affected the landscape, so what about the lakes? Their watersheds probably were completely scorched and all sorts of nutrients and materials could have ended up in the water. Soon after, I encountered more lakes with completely intact shorelines, at least some indication that their watersheds might have avoided the fire. As I finished my hike, I wondered if anyone had ever studied fire effects on lakes. I filed this thought away, but didn’t necessarily expect to revisit it.

In 2016, I encountered several nearby lakes within the same landscape. Some had heavily burned shorelines and watersheds, whereas others largely escaped the fire. What were the ecological implications for the lakes, I wondered?

Fast forward about a year and I’m in a brainstorm session with my new research group at Michigan State University, where I was starting as a postdoctoral researcher. I casually mentioned that I was interested in fire impacts on lakes. One of the professors’ eyes lit up and one of my first projects ended up being a review paper on the fairly limited amount of existing lake-fire research. The amusing title “Do lakes feel the burn?” was inspired by some roadside graffiti I encountered on my aforementioned 2016 trip (it was an election year). Years later, this foundational paper still produces new research collaborations and grant proposals as fires continue to torch many parts of the US.

Fresh off a hike through some burned lake watersheds, some election-year (2016) roadside graffiti inspired the lighthearted title of the 2019 review article “Do lakes feel the burn?” https://doi.org/10.1111/gcb.14732

Even if you don’t study lakes or wildfires, part of my hope in sharing this anecdote is to illustrate how ideas might come to you seemingly at random in life, yet how career- and life-changing they can be. Personally, I rarely come up with very interesting ideas while sifting through spreadsheets and data graphs. So if you’re searching for inspiration, whether for science or otherwise, maybe a “walk in the park” is all you need.

LAGOS Launches First US Data Modules and Twitter Campaign!

New and exciting things are happening for the LAGOS (Spanish for lakes) team. After years of hard work, their first core and extension data modules of LAGOS-US have been published. Even better, they are open access, providing the opportunity for wide and free use.

The LAGOS-US research platform provides data and tools to study lake water quality at the continental scale. LAGOS-US data modules cover the conterminous US, which has 479,950 natural lakes and reservoirs larger than or equal to 1 hectare. This month, the LAGOS-US core data module LOCUS and the extension data module NETWORKS were released. LOCUS contains locational, identifying, and physical characteristics for these nearly half a million lakes and their watersheds. NETWORKS includes 898 lake networks across the US and provides quantitative surface water connectivity metrics for those networks and the 86,511 lakes in those networks. Stay tuned for future data module releases!

Along with the public release of LOCUS and NETWORKS, the LAGOS team is excited to announce the official launch of their first Twitter campaign. The LAGOS-US account can be found using the handle @LAGOS__Lakes (note the double underscores). 

The primary aim of @LAGOS__Lakes is to raise awareness about the availability of LAGOS-US data products and tools for others to use and extend. The LAGOS team hopes to inspire new users to study US lakes at broad scales of space and time. 

Twitter posts via @LAGOS__Lakes will use a weekly #LakesForLunch hashtag, with info about a unique US lake and its lagoslakeid for reference. Be sure to keep an eye out for these posts by following @LAGOS__Lakes and keep up to date with your fellow lake lovers!

Citizen scientists are important contributors to species distribution data

By Patrick Hanly

While citizen scientists are already known to be a vital source of water quality data, they have also been quietly amassing a substantial collection of species records through digital platforms such as the popular iNaturalist. For example, there are 900,000 dragonfly and damselfly records on iNaturalist as of August 2020. Although iNaturalist was created with the goal of connecting people with nature, a fortunate byproduct of this effort is an extensive database of species records with spatial and temporal coverage that vastly exceeds the capacity of the scientific community.

Examples of some of my odonate observations recorded on iNaturalist in Ingham County, Michigan. Dragonflies (top): a blue dasher (Pachydiplax longipennis) on left and a widow skimmer (Libellula luctuosa) on right. Damselflies (bottom): an eastern forktail (Ischnura verticalis) on left and a double-striped bluet (Enallagma basidens) on right.

You may have heard of eBird, a well-established citizen science project run by the Cornell Lab of Ornithology that tracks observations of birds. Similarly, iNaturalist accounts for 66% of the U.S.’s 470,000+ georeferenced records in the Global Biodiversity Information Facility (GBIF), an international organization that focuses on compiling biodiversity data and making it publicly accessible. However, unlike eBird, iNaturalist encompasses all biota and relies primarily on photographic records that can be corroborated by the community. Corroboration is an important verification step that increases the quality of the data and allows researchers to be part of the identification process to fix errors prior to use. Observations can achieve “Research Grade” when they are properly dated and georeferenced, submitted with verifiable evidence, and when greater than two-thirds of users agree on identification.
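The three Research Grade conditions described above can be written as a small check. This is only a sketch of the rule as stated in this post, not iNaturalist's actual implementation, which has more detail. The `Observation` class and its field names are hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    # Hypothetical record layout, not the iNaturalist API schema.
    observed_on: Optional[str]       # date string, or None if undated
    latitude: Optional[float]
    longitude: Optional[float]
    has_evidence: bool               # e.g., a photo was submitted
    identifications: List[str] = field(default_factory=list)  # one taxon name per identifier


def is_research_grade(obs: Observation) -> bool:
    """Apply the simplified rule from the text: dated, georeferenced,
    backed by evidence, and more than two-thirds agreement on one taxon."""
    if obs.observed_on is None:
        return False
    if obs.latitude is None or obs.longitude is None:
        return False
    if not obs.has_evidence:
        return False
    if not obs.identifications:
        return False
    _, top_votes = Counter(obs.identifications).most_common(1)[0]
    # votes / total > 2/3, written with integers to avoid rounding issues
    return 3 * top_votes > 2 * len(obs.identifications)


blue_dasher = Observation(
    observed_on="2020-07-04",
    latitude=42.7,
    longitude=-84.5,
    has_evidence=True,
    identifications=["Pachydiplax longipennis"] * 3 + ["Libellula luctuosa"],
)
print(is_research_grade(blue_dasher))
```

Note that under a strict "greater than two-thirds" rule, two agreeing identifications out of three would not be enough, whereas three out of four would.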

I am developing tools to help people access these important data. All records (Research Grade or not) are freely accessible in an open database through the iNaturalist API. This online tool facilitates downloads into R using a package I am developing called iNatTools that provides data processing tools such as ways to determine sampling efforts for ecological research. Research Grade records are also exported to the biodiversity data compiler GBIF. To date, these GBIF records have generated 738 citations, showing that Research Grade iNaturalist records are an increasingly important source of contemporary distribution data for many taxa.

Georeferenced records of the blue dasher (Pachydiplax longipennis) on the Global Biodiversity Information Facility (GBIF) from iNaturalist (left) and from all other sources excluding iNaturalist (right). iNaturalist accounts for 22,781 records since 2010 compared to just 203 from other sources.

Although vouchered specimens from museums and universities offer a wide breadth of species for many taxonomic groups, citizen science is an important source of recent and geographically widespread data for easily documented species such as dragonflies. These data will be essential for understanding biogeography and other investigations into species and ecological communities. Despite the large and growing number of observations, the biodiversity of many areas remains poorly documented. You can help fill these gaps — get started as an observer, identifier, or both.

Number of iNaturalist species observations within 500 meters of all Michigan LAGOS-NE lakes > 4 hectares. As of July 2020, 10,710 of the 15,569 lakes lack observations entirely for any taxonomic groups.
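A count like the one in this map can be sketched as a distance filter. The version below is a simplified illustration, not the method used for the figure: it assumes each lake is represented by a single point (the real lakes are polygons, so distance to the shoreline would be more appropriate), and the function names and coordinates are made up for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def observations_near_lakes(lakes, observations, radius_m=500.0):
    """lakes: {lake_id: (lat, lon)}; observations: list of (lat, lon).
    Returns {lake_id: number of observations within radius_m}."""
    return {
        lake_id: sum(
            1
            for obs_lat, obs_lon in observations
            if haversine_m(lat, lon, obs_lat, obs_lon) <= radius_m
        )
        for lake_id, (lat, lon) in lakes.items()
    }


# Made-up example points (assumption: lakes reduced to one point each).
lakes = {"lake_A": (42.7, -84.5), "lake_B": (45.0, -85.0)}
observations = [(42.7, -84.5), (42.703, -84.5), (42.71, -84.5)]

counts = observations_near_lakes(lakes, observations)
lakes_without_observations = [lake for lake, n in counts.items() if n == 0]
print(counts, lakes_without_observations)
```

Checking every observation against every lake is fine for a small example; for thousands of lakes and many records, a spatial index or a GIS buffer operation would be the more practical choice.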



Research Experience for Undergraduates (REU) Summer 2020: An environmental justice approach to data-intensive lake research

By Jessica Díaz Vázquez

I joined the Data Intensive Landscape Limnology Lab in October 2018 to gain research experience in the general field of ecology. As I learned more about the database LAGOS and the openness of the lab for interdisciplinary research, I saw an opportunity to incorporate my interest in environmental justice.

I grew up in Northeast Houston, Texas in a predominantly Latinx and low-income community that is adjacent to petrochemical plants and oil refineries. Living in a ‘frontline’ or environmental justice community means that the topics of health, racial/ethnic identity, economic status, and natural environment are extremely interconnected. Just like any other community, we love our backyard gardens, neighborhood parks, and local bayous. However, the disproportionate burden of air and water pollution makes outdoor activities much less pleasant or healthy. From my lived experiences and as a rising senior in MSU’s Department of Fisheries & Wildlife, I seek to improve the habitat of wildlife and expose and correct environmental injustices. I am excited to apply my combined knowledge in fisheries & wildlife and environmental justice through this REU position.

The overall goals of this REU position are to integrate information about lake watersheds and lake water quality with human demographics and apply an environmental justice lens. I hope to answer the question: Are people and communities within marginalized demographics (e.g., low income, people of color, younger/older people) disproportionately affected by low water quality lakes and their watersheds?

For my research, I am using lake and watershed data from the LAGOS database that covers the conterminous U.S. Therefore, the human demographic data used must be compatible with this large scale. I am using tract-level data from the 2010 Decennial Census and the American Community Survey (ACS). The main variables that I will focus on for lakes are those that together serve as a measure of water quality: water clarity, phosphorus, and nitrogen. For the human demographic variables, I will choose those of interest in the environmental field, such as median household income, race/ethnicity, population, and sex. Figure 1 is an example of a visual output resulting from linking watersheds and median household income for LAGOS-NE.

Although I expect challenges to arise from working with two unique databases (LAGOS and ACS), I look forward to bringing a new perspective to the research group. Stay tuned for an update at the conclusion of my summer 2020 REU!

Highlight on Research Experiences for Undergraduates (REU): MSU math major applies his skills to data-intensive lake research

By Sam Polus

I was motivated to apply for this particular REU position because, growing up in northern Michigan, I have always been interested in nature and ecology, and I wanted to apply my math degree in areas that would allow me to pursue these interests. It has been an amazing learning opportunity for me to apply things I have been learning in my math, computer science, and statistics classes to areas where I did not expect to apply them. Being able to work in such a diverse research group has helped me greatly in learning how to translate mathematical skills into useful applications.

The project that I mainly focused on over the course of the summer involved classifying lakes in LAGOS-NE (www.lagoslakes.org) into two categories: natural lakes and reservoirs. Since this involved such a large number of lakes (~50,000), much of my work revolved around training a deep-learning algorithm with the help of a computer science REU student, Laura Danilla. I manually classified a subset of the lakes using GIS layers and satellite imagery. We then used this subset of lakes with confirmed types to train our deep-learning model to identify lake types based solely on the shape of the lake.

Throughout the course of the summer, I created a training set containing 5334 lakes, roughly half natural lakes and half reservoirs. Using these lakes of known types, we estimated the performance of our model as we prepared to apply it to all lakes in LAGOS-NE. After this testing, we estimated our model to have around 80%-85% accuracy when determining lake type for a given lake in LAGOS-NE. Then, we applied our model to the ~45,000 remaining unclassified lakes in LAGOS-NE and obtained the following results: 63% of lakes in LAGOS-NE (28,733) are natural lakes, and 37% of lakes (16,864) are reservoirs. For reservoirs, the average predictive confidence was 45% and for natural lakes the average confidence was 61%. This metric of confidence is estimated by the model as it is determining the type of a lake. It develops a probability of each lake being from either category (natural lake or reservoir). The confidence metric is the absolute value of the difference between the probabilities of a lake being in each category.
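The confidence metric described above is simple enough to write out. The snippet below is a minimal sketch rather than the code used in the project. It assumes the model outputs two probabilities that sum to 1, and the function names are hypothetical.

```python
def confidence(p_natural_lake: float, p_reservoir: float) -> float:
    """Absolute difference between the two class probabilities."""
    return abs(p_natural_lake - p_reservoir)


def classify(p_natural_lake: float):
    """Return (predicted type, confidence) for one lake.

    Assumption: the two probabilities sum to 1, so only one is needed.
    """
    p_reservoir = 1.0 - p_natural_lake
    label = "natural lake" if p_natural_lake >= p_reservoir else "reservoir"
    return label, confidence(p_natural_lake, p_reservoir)


print(classify(0.75))   # a fairly clear natural lake
print(classify(0.5))    # the model cannot tell the two types apart
print(classify(0.25))   # a fairly clear reservoir
```

Under this definition, a confidence of 0 means the model considers both types equally likely, and a confidence of 1 means it is certain. A lake with probabilities of 0.75 and 0.25 has a confidence of 0.5.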

Figures 1 and 2 show the distributions of the model’s reservoir predictions and natural lake predictions, respectively.  Note that natural lakes tend to appear in clusters, whereas reservoirs are more evenly distributed.  Also note that regions with many lakes such as Minnesota have high concentrations of both reservoirs and natural lakes.  Both of these results are very promising for our model because they match up well with what we expect.

For instance, we expect natural lakes to be found in clusters from processes such as glaciation, and we expect reservoirs to be more evenly distributed as they can form anywhere that we can pool water. Finally, Figure 3 shows the count of lakes in each category for each state.

Working in this REU position over the summer has been a great experience for me. I’m still working in the lab this semester, hoping to extend my work to the entire conterminous US. After graduating in May 2020, I hope to continue to apply the skills I have developed working in the lab in similar areas. I have become particularly interested in deep-learning algorithms after working so closely with one, and I hope to find a position where I can continue to pursue this interest.



Water quality observations through time in LAGOS-NE by Nicole Smith


This animation shows the accumulation of water quality observations for each of the lakes in the LAGOS-NE database. For each year and lake, the cumulative count of water quality observations to date is shown by color. The first field observation in the database was recorded in 1933 from Lake Pepin (WI/MN). The lake with the most data points across the time period is Lake Champlain (NY/VT/QC). Approximately 12,000 lakes have at least one observation and appear as points on this map. Another 39,000 lakes are included in LAGOS-NE that do not have water quality observations, but have a large range of GIS-derived ecological context variables and watersheds calculated.
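The per-lake, per-year cumulative count behind an animation like this can be sketched in a few lines. This is an illustrative example only, not the code used to make the animation. The function name, the input layout (one `(lake_id, year)` pair per observation), and the sample records are assumptions.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_counts(observations, first_year, last_year):
    """observations: iterable of (lake_id, year) pairs, one per observation.
    Returns {lake_id: {year: observations recorded up to and including that year}}.
    Observations outside [first_year, last_year] are ignored."""
    per_year = defaultdict(lambda: defaultdict(int))
    for lake_id, year in observations:
        per_year[lake_id][year] += 1

    years = range(first_year, last_year + 1)
    result = {}
    for lake_id, counts in per_year.items():
        # Running total over every year, including years with no sampling.
        running = accumulate(counts.get(year, 0) for year in years)
        result[lake_id] = dict(zip(years, running))
    return result


# Made-up records for illustration only.
records = [("Pepin", 1933), ("Pepin", 1935), ("Pepin", 1935), ("Champlain", 1934)]
totals = cumulative_counts(records, 1933, 1936)
print(totals["Pepin"])
print(totals["Champlain"])
```

Each animation frame would then color a lake by its value for that frame's year, so a lake's color can only stay the same or intensify over time.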

LAGOS-NE: The people behind the scenes to create an open database

We are thrilled to announce that the LAGOS-NE data paper is published, which means that the underlying data are live:  https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/gix101/4555226


Creating something like LAGOS-NE takes a wide range of contributions, expertise, and types of work. We want to extend a HUGE thanks to everyone who contributed. This effort could not have happened without the willingness of people to work within an open-science perspective–to share their data, skills, and tools openly. I am also particularly struck by the important contributions of so many early-career scientists who provided so many creative and novel approaches and ideas to make this effort happen. Specifically, it took 5 major efforts and types of contributions to create LAGOS-NE, and all individuals played a key role:

  1. The data providers spent time sharing their data and documentation with us to create the database, and fielded numerous questions about the data.
  2. The data integrators spent time manipulating the data and authoring metadata for the individual datasets and were the point of contact for the data providers.
  3. The geo-data creators developed the GIS tools to create the large number of metrics calculated from the many national-scale geographic datasets that are part of LAGOS-NE.
  4. The information managers designed and created the database.
  5. The data-accessibility managers designed the strategy for sharing the data, preparing the data and metadata tables in a format that could be made publicly available, and designed and wrote an R package for working with the LAGOS-NE data.


I would like to extend a personal thanks to each and every one of the following individuals:

1.  Data providers

  • Provided water quality datasets by finding funding for data collection, sampling, entering data, conducting quality-control,  writing documentation and metadata, and sharing  — Linda Bacon, Michael Beauchene, Karen Bednar, Marvin Boyer, Mary Tate Bremigan, Steve Carpenter, Jamie Carr, Kendra S Cheruvelil, Matt Claucherty, Joseph Conroy, John Downing, Jed Dukett, Chris Filstrup, Clara Funk, Maria Gonzalez, Linda Green, John Halfman, Steve Hamilton, Paul Hanson, Elizabeth Herron, Celeste Hockings, James Jackson, Kari Jacobson-Hedin, Lorraine Janus, William Jones, Jack Jones, Caroline Keson, Scott Kishbaugh, Barbara Lathrop, Jo Latimore, Yuehlin Lee, Noah Lottig, Jason Lynch, Leslie Mathews, William McDowell, Karen Moore, Brian Neff, Sarah Nelson, Mike Pace, Donald Pierson, Amina Pollard, David Post, Paul Reyes, Donald Rosenberry, Karen Roy, Lars Rudstam, Orlando Sarnelle, Nancy Schuldt, Pat Soranno, Nick Spinelli, Emily Stanley, John Stoddard, Jason Tallant, Anthony Thorpe, Mike Vanni, Gretchen Watkins, Kathie Weathers, Kathy Webster, Jeff White, and Marcy Wilmes

2.  Data integrators

  • Authored metadata and prepared individual datasets — Mary Tate Bremigan, Claire Boudreau, Kendra S. Cheruvelil, Sarah Collins, C. Emi Fergus, Chris Filstrup, Emily N. Henry, Noah Lottig, Sam Oliver, Nick Skaff, Pat Soranno, Emily Stanley, Kathy Webster
  • Prepared the integrated metadata document for LAGOS-NE — C. Emi Fergus
  • Prepared EML metadata for water quality datasets — C. Emi Fergus
  • Prepared EML metadata for some water quality datasets — Claire Boudreau
  • Designed and implemented the quality-control analysis for the water quality data — Noah Lottig
  • Wrote parts of the technical documentation for LAGOS-NE that was part of the documentation article — Ed Bissell, Mary Bremigan, Kendra S. Cheruvelil, Sarah Collins, C. Emi Fergus, Corinna Gries, Noah Lottig, Caren Scott, Nick Skaff, Nicole Smith, Scott Stopyak, Pat Soranno, Craig Stow, Ty Wagner, Kathy Webster
  • Edited the technical documentation for LAGOS-NE that was part of the documentation article — Jean-Francois Lapierre

3.  Geo-data creators

  • Developed geospatial tools and performed geospatial analyses — Scott Stopyak, Nicole J. Smith
  • Developed methods for delineating lake watersheds — Scott Stopyak
  • Developed freshwater metrics — Scott Stopyak, C. Emi Fergus, Nicole J. Smith, Patricia Soranno
  • Created LAGOS-NE_LOCUS and conducted quality control — Ed Bissell
  • Designed the quality-control analysis for the geo-data — Sarah Collins, Caren Scott
  • Conducted quality-control analysis for the geo-data — Sarah Collins, Caren Scott, C. Emi Fergus, Nick Skaff, Kathy Webster
  • Authored geospatial metadata — Nicole J. Smith
  • Prepared geodatabase for sharing — Nicole J. Smith

4.  Information managers

  • Database design, database creator, database manager — Ed Bissell
  • Database design — Pang-Ning Tan
  • Database design — Corinna Gries
  • Database design contributor — Patricia Soranno
  • Wrote R code to import water quality datasets into LAGOS data model — Ed Bissell, Sam Christel, Noah Lottig, Shuai Yuan 

5.  Data-accessibility-managers 

  • Designed the strategy for sharing the data and prepared the data and metadata tables in a format that could be made publicly available — Corinna Gries, Colin Smith (Environmental Data Initiative)
  • Wrote the LAGOS-NE R package to make LAGOS-NE accessible to users — Jem Stachelek, Sam Oliver


Without all of these individuals, LAGOS-NE could never have happened. Thanks to you all.

—  Pat Soranno, October 19, 2017

Some pictures of the CSI-Limnology team that worked together to integrate the data into LAGOS-NE:

CSI-Limnology Annual Workshop Meeting – Year 3, Archbold Field Station, January 2014

CSI-Limnology Annual Workshop Meeting – Year 2, Trout Lake Station, January 2013

CSI-Limnology Annual Workshop – Year 1, Michigan State University, June 2011