This event has ended.
For over 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, thus forming a community dedicated to making Earth observations more discoverable, accessible and useful to researchers, practitioners, policymakers, and the public. The theme of this year’s meeting is Leading Innovation in Earth Science Data Frontiers.
ALL SESSION RECORDINGS CAN NOW BE FOUND ON THE ESIP YOUTUBE CHANNEL.
Climate
Monday, July 19
 

1:30pm EDT

Delivering Trusted Data to Real Users & Decision Makers
Since the ESIP Winter meeting, the Disaster Lifecycle Cluster has further evolved its approach to Leading Innovation in Earth Science Data Frontiers to put more Earth science data to work in decision-making environments. The ‘ecosystem of innovation’ places ESIP clusters on the pathway of user engagement through real-time collaboration and an expanding user base. This session looks to attract all ESIP Clusters that have a mission to put more data, processes, analytics and/or machine learning to work and to add ORL rankings to their data, so as to serve decision makers through improved data quality, data processing and data use.

Our session will show examples of incorporating data to support recovery efforts to ‘get back to business’ after disaster hits. A specific need is for tree canopy data to aid utility operators and enable them to assess the risk of power line breaks due to an approaching storm. We encourage participants to join the Disasters cluster to mature their products or services through the ecosystem of innovation for evaluation by end users such as the All Hazards Consortium and others.

This session will identify how the ecosystem of innovation works and how ESIP clusters and members can plug into this process by introducing their products, services and methods into the pathway for real users and decision makers to experience. To prepare for the session please consider what use cases and data and/or products you may have that could be valuable to the disasters community.


Organizers & Speakers

Dave Jones

CEO, StormCenter Communications, Inc.
GeoCollaborate is an SBIR Phase III technology (yes, it's a big deal) that enables real-time data access through web services, sharing and collaboration across multiple platforms. We call GeoCollaborate a 'Collaborative Common Operating Picture' that empowers decision making, situational...

Karen Moe

Cheverly Green Infrastructure Committee, NASA Retired
Managing an air quality monitoring project for my town just outside of Washington DC and looking for free software!! Enjoying citizen science roles in environmental monitoring and sustainable practices in my town. Recipient of an ESIP 2022 Funding Friday grant with Dr Qian Huang to...


Monday July 19, 2021 1:30pm - 3:00pm EDT
TBA

4:00pm EDT

Innovations in EnviroSensing Technology and Practice
This session, sponsored by the ESIP EnviroSensing Cluster, will include a series of presentations on emerging and proven approaches that further the collection, management, and exchange of in situ environmental monitoring and observation data. The EnviroSensing Cluster fosters collaborative, across-discipline exchange on technology and practices that facilitate innovation in sensor-based science and data. All are welcome to attend!

Presentations:
  • SWEX: Sundowner Winds Experiment - Gert-Jan Duine 
  • Improving HydroMet data availability - Christel Valentine
  • Using GCE Toolbox in data processing workflows: An example from H.J. Andrews Experimental Forest - Stephanie Schmidt
  • Low cost environmental sensing using hobby-level technology, opportunities and trade-offs - John Porter
  • High Altitude Soil Testing (HAST): Increasing accessibility of data in remote locations - Justin Rubalcaba
  • Ultra-low power LoRa-embedded microcontrollers that simplify eco-enviro-sensing designs - Daniel Fuka

Organizers & Speakers

Renée F. Brown

Information Manager, McMurdo Dry Valleys LTER
dryland ecosystem ecology, biogeochemical cycles, global change, research data management, environmental sensor networks

Scotty Strachan

Director of Cyberinfrastructure, University of Nevada, Reno
Institutional cyberinfrastructure, sensor-based science, mountain climate observatories!

Joseph Bell

Hydrologist, USGS

Kristina Fauss

University of California, Santa Barbara

Stephanie Schmidt

Information Manager, US Forest Service/H.J. Andrews Experimental Forest & LTER

John Porter

Res. Assoc. Prof., University of Virginia

Dan Fuka

Scientist, Virginia Tech

Justin Rubalcaba

Montana Tech

Gert-Jan Duine

University of California, Santa Barbara



Monday July 19, 2021 4:00pm - 5:30pm EDT
TBA
 
Tuesday, July 20
 

11:00am EDT

CARE Principles for ESIP Data Repositories
The ESIP cluster “Sustainable Data Management” promotes mechanisms for repositories to collaborate to preserve their holdings (https://wiki.esipfed.org/Sustainable_Data_Management). Our current project is to produce recommendations for member repositories on implementing guidance principles within frameworks like FAIR (https://doi.org/10.1038/sdata.2016.18) and TRUST (https://doi.org/10.1038/s41597-020-0486-7). We are also including a third framework: the CARE Principles for Indigenous Data Governance (http://doi.org/10.5334/dsj-2020-043), where CARE stands for Collective Benefit, Authority to Control, Responsibility, and Ethics. These principles extend data management concerns to be more people- and purpose-oriented, and to respect Indigenous sovereignty. As stated in the Data Science Journal paper, “The ‘CARE Principles for Indigenous Data Governance’ empower Indigenous Peoples by shifting the focus from regulated consultation to value-based relationships that position data approaches within Indigenous cultures and knowledge systems to the benefit of Indigenous Peoples”. This session will present the cluster’s recent examination of the CARE principles, how they relate to repository activities, and how they extend FAIR and TRUST. Introductory material on CARE and the cluster’s work will be presented, followed by discussion.


Organizers & Speakers

Ruth Duerr

Research Scholar, Ronin Institute for Independent Scholarship

Margaret O'Brien

Data Specialist, University of California
My academic background is in biological oceanography. Today, I am a data specialist working with the Environmental Data Initiative (EDI) plus ecosystem-level projects conducting primary research, like the LTER network, and a marine Biodiversity Observation Network. My primary data...

Shelley Stall

Vice President, Open Science Leadership, American Geophysical Union
Shelley Stall is the Vice President of the American Geophysical Union’s Open Science Leadership Program. She works with AGU’s members, their organizations, and the broader research community to improve data and digital object practices with the ultimate goal of elevating how research...


Tuesday July 20, 2021 11:00am - 12:30pm EDT
TBA

1:00pm EDT

Teacher Workshop: Exploring Earth, Wind and Fire via Earth Science Data
The Earth Science Information Partners (ESIP) Education Committee will host a virtual workshop for 50 educators on Tuesday, July 20 and Wednesday, July 21 (1:00 to 5:00pm EDT on both days). ESIP members will share an educational resource and lead participants through an activity using Earth science data to explore phenomena via different types of data. Tools and resources include:
  • The NOAA CrowdMag app,
  • NASA’s Earth System Data Explorer,
  • UNAVCO Velocity Viewer,
  • NOAA CIMSS satellite data activities,
  • NASA SEDAC Hazards Mapper and HazPop App,
  • En-ROADS Climate Decision Model,
  • The Concord Consortium Wildfire Module, and
  • The “Out 2 Lunch” archive: Earth Science webinar demonstrations of data tools and resources
Participating STEM educators will also be eligible to apply for $500 implementation grants!
What better way to inspire innovation in Earth science data frontiers than training the teachers who educate our youth?

Agenda/Teacher Road Map: https://docs.google.com/document/d/1FGACsWSHPTXS8nEAXaTjpHB_-201xkfYsJDuOkAY9Rc/edit?usp=sharing

Organizers & Speakers

Shelley Olds

Science Education Specialist, UNAVCO
Data visualization tools, Earth science education, human dimensions of natural hazards, disaster risk reduction (DRR), resilience building.

Elizabeth Joyner

Community Coordinator, SSAI, Goddard Space Flight Center, NASA
Elizabeth Joyner joined the Earth Science Data Systems (ESDS) Program Communications Team in 2022 as the Community Coordinator and works across the program to promote the use of NASA data and resources with end users. She previously served as the Senior Outreach Coordinator for NASA...

Trinity Foreman

Comms Consultant, ICMS LLC
Trinity Foreman supports the educational outreach and social media output of NOAA's NCEI. NCEI hosts and provides public access to one of the most significant archives for environmental data on Earth, and Trinity Foreman works to increase the accessibility of NCEI's data tools and...

Tamara Ledley

STEM Consultant & Adjunct Professor, Sustaining Science Consulting & Bentley University
I am interested in moving ESIP forward in broadening the reach of “making data matter” into communities and organizations for whom Earth science data and information is essential to their decision making processes. Much of my work has focused on making Earth and climate science...

Carla McAuliffe

Educational Researcher and Curriculum Developer, TERC

Margaret Mooney

Education Director, NOAA's Cooperative Institute for Meteorological Satellite Studies

Robert R. Downs

Sr. Digital Archivist, Columbia University
Dr. Robert R. Downs serves as the senior digital archivist and acting head of cyberinfrastructure and informatics research and development at CIESIN, the Center for International Earth Science Information Network, a research and data center of the Columbia Climate School of Columbia...

Becky Reid

Faculty, Cuesta College
I discovered ESIP in the summer of 2009 when I was teaching science in Santa Barbara and attended the Summer meeting there. Ever since then, I have been volunteering with the ESIP Education Committee in various capacities, serving as Chair in 2013, 2019, and 2020.



Tuesday July 20, 2021 1:00pm - 5:00pm EDT
TBA

4:00pm EDT

SWEET Governance and Roadmapping working session
In this session we will openly discuss and strive to achieve consensus on how best to govern SWEET as a longstanding, domain level, semantic web resource. As such, there are several outstanding questions which either need to be addressed or past decisions confirmed and then documented (likely on the SWEET wiki). Examples, though not comprehensive, include:
1. How are SWEET issues or proposals raised?
2. What are the criteria used to evaluate SWEET issues or proposals?
3. Who, or whom, evaluates the issues or proposals?
4. Is there a ‘statute of limitations’ for any such issues or proposals?
5. How does the community arrive at a decision?
6. How is that decision recorded and/or documented for the community?
7. How or what is put in place to help ensure every member is abiding by those decisions?
8. Based on discussion of previous items, is a SWEET manager required or can the community self manage under a more specific set of guidelines?


Organizers & Speakers

Bruce Caron

Executive Director, New Media Studio

Brandon Whitehead

Environmental Data Scientist, Manaaki Whenua - Landcare Research


Tuesday July 20, 2021 4:00pm - 5:30pm EDT
TBA
 
Wednesday, July 21
 

11:00am EDT

Identifying technology capabilities that meet wildfire science and practitioner requirements
What.  This session is co-organized by the Agriculture and Climate Cluster and the Semantic Harmonization Cluster (hereafter collectively referred to as the “Clusters”).  The PDF poster on ESIP's figshare account gives the big-picture schematic of how this session relates to data-science topics like AI/ML, semantic technology, graph database technology, etc.

Why.  Environmental risks are increasingly resulting in disasters that cost the taxpayer dearly in terms of lives lost, incurred damages, and future liabilities. A recent study on the comprehensive cost of the 2018 California wildfires estimated damages at $150B and the loss of thousands of lives. In this proposed session, the Clusters will lead transdisciplinary-oriented discussions focused on both science and technology topics for managing such environmental risks. Wildfire data and information should ideally be reusable and repurposable across different fire management phases (e.g. prediction, pre-fire planning, during fire, after-fire, recovery). For example, infrastructure that is vulnerable to wildfire-induced floods, identified during the active fire-fighting phase, should be easily discoverable to city managers weeks or even months later, when heavy rains on burn areas may trigger catastrophic debris flows that threaten lives.  Features (e.g. buildings, vegetation patches, ridgelines) identified by AI/ML algorithms from UAS imagery data and used for mitigation planning should be made discoverable for fire managers making tactical fire-fighting decisions.

How.  The proposed session addresses the following question: how can we apply data and knowledge management technologies to fulfill the needs of wildfire mitigation and response? 

In this session, you will be invited to contribute your expertise to sketch out technical solutions that can be deployed to meet the speakers' stated needs.  Your ideas will be openly accessible to individuals who may use those ideas to apply for ESIP Lab and ESIP FUNding Friday projects.

Agenda
  • [11 am] Workshop begins
  • Introduction
    • Big-picture schematic of how this session relates to data-science topics like AI/ML, semantic technology, graph database, etc.
  • Slido poll: Which of the following wildfire experiences apply to you?
  • [11:10 am] Wildfire problem statement, requirements, and some focus on planning by polygon
    • Everett Hinkley, US Forest Service, Geospatial Management Office National Remote Sensing Program Manager
      • Wildfire Mapping--Leveraging AI/ML for needed improvements: Faster delivery, improved consistency, reduced subjectivity
    • Dave Zader, Wildland Fire Administrator for The City of Boulder, CO Fire Department (retired); Wildland Fire Policy Committee member for the International Association of Fire Chiefs
      • Wildfire management and planning by polygon, a tool for improved decision-making and resources usage
    • Pier Buttigieg, Helmholtz Metadata Collaboration
      • Representing and aligning knowledge about wildfires - the need and challenge of semantic harmonization
  • [12:05 pm] Slido poll: Rank the following values-at-risk that are important to *YOUR* community: from most important (rank #1) to least important (rank #6)
  • [12:10 pm] Breakouts Part 1
    • Breakout group #1: Knowledge representation for wildfire planning and execution (Focus on Polygons)
    • Breakout group #2: Technological solutions for wildfire planning and execution
  • Short break / transition (10 min)
  • [~12:45 pm] Breakouts Part 2
    • Breakout group #1: Knowledge representation for wildfire planning and execution (Focus on Values-at-Risk)
    • Breakout group #2: Technological solutions for Wildfire Planning and Execution
  • [1:10 pm] Report out from breakout groups
  • [1:20 pm] Wrap up
  • [1:30 pm] Workshop ends


Organizers & Speakers

Brian Wee

Founder and Managing Director, Massive Connections, LLC
Transdisciplinary scientist invested in the use of environmental data and information for science, education, and decision-making for challenges at the nexus of global environmental change, natural resources, and society. Strategized and executed initiatives to engage the US Congress...

Bill Teng

NASA GES DISC (ADNET)

Ruth Duerr

Research Scholar, Ronin Institute for Independent Scholarship

Pier Luigi Buttigieg

Digital Architect & Senior Data Scientist, Alfred Wegener Institute / Helmholtz

Everett Hinckley

Geospatial Management Office National Remote Sensing Program Manager, US Forest Service

Dave Zader

International Association of Fire Chiefs


Wednesday July 21, 2021 11:00am - 1:30pm EDT
TBA

11:00am EDT

The Saga Continues: Cloud-Optimized Data Formats
Open science is the ability to share and reproduce analysis without sharing a computer. We recognize users have limited resources, such as network bandwidth and memory, and often this prevents them from thinking outside the box when it comes to scaling and sharing science. Open science presents a clear need to standardize on and deliver more cloud-friendly data formats and services. During this session, we highlight advances in cloud-friendly data and services and strive to answer some ongoing research questions about how these formats and services will support new scales of science and do so openly.

Cloud-friendly data formats and services are central to delivering new innovation in Earth science. With cloud-optimized data formats and services, Earth scientists can achieve new scales of analyses and deliver reproducible research output and information products.
The conversation about data formats is not one that will be “closed” with a decision on “one format to rule them all”. We propose a session centered around discussions which surface new advances in data formats and standards which specifically support sharing and scaling science on the cloud. Many call these “cloud-friendly” or “cloud-optimized” formats.

Putting data on the cloud in cloud-friendly formats is a starting point. Necessary to the utility of this data are the metadata, tools and services which support users accessing these datasets. There have been new advances in cloud-friendly services as well; however, there is a lot of room for improvement. During this session, we focus not just on the data formats themselves, but on the usability of those formats made possible by the support system around using them.
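
To make the idea concrete, here is a minimal sketch of why chunked layouts suit cloud storage: a reader can work out which byte ranges cover a requested window and fetch only those, rather than downloading the whole file. The layout is a simplifying assumption (uncompressed square tiles stored back to back in row-major order); real cloud-optimized formats usually compress chunks and locate them through metadata, so chunk sizes vary. All function names and numbers below are hypothetical.

```python
def tile_byte_ranges(width, tile, bytes_per_pixel, window):
    """Return (offset, length) byte ranges for the tiles overlapping `window`.

    Assumed layout: square tiles stored back to back in row-major tile
    order, with edge tiles padded to the full tile size.
    `window` is (row_start, row_stop, col_start, col_stop), stops exclusive.
    """
    row_start, row_stop, col_start, col_stop = window
    tiles_across = -(-width // tile)  # ceiling division
    tile_bytes = tile * tile * bytes_per_pixel
    ranges = []
    for tile_row in range(row_start // tile, (row_stop - 1) // tile + 1):
        for tile_col in range(col_start // tile, (col_stop - 1) // tile + 1):
            index = tile_row * tiles_across + tile_col
            ranges.append((index * tile_bytes, tile_bytes))
    return ranges


def tiled_file_size(width, height, tile, bytes_per_pixel):
    """Total size in bytes of the assumed padded, uncompressed tiled file."""
    tiles_across = -(-width // tile)
    tiles_down = -(-height // tile)
    return tiles_across * tiles_down * tile * tile * bytes_per_pixel


# Hypothetical raster: 10,000 x 10,000 pixels, 512-pixel tiles, 4 bytes each.
raster = dict(width=10_000, height=10_000, tile=512, bytes_per_pixel=4)
window = (1000, 1500, 2000, 2600)  # a 500 x 600 pixel area of interest

ranges = tile_byte_ranges(
    raster["width"], raster["tile"], raster["bytes_per_pixel"], window
)
fetched = sum(length for _, length in ranges)
total = tiled_file_size(**raster)
print(f"{len(ranges)} range requests, {fetched / total:.1%} of the file")
```

Each `(offset, length)` pair could be issued as an HTTP range request against object storage, which is what lets a client with limited bandwidth and memory work with a small part of a very large dataset.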

Agenda (150 minutes):

Part 1: Lightning Talks - Provide a "lay of the land" and fodder for discussion:
  • Aimee Barciauskas, Welcome to this session: What do we mean by cloud-optimized and why does it matter?
  • 60 minutes of 7-10 minute lightning talks
    • Trevor Skaggs, Element84, will speak on Entwine Point Tile store generated for ATL06
    • Joe Roberts, NASA JPL, will speak about the Metadata Raster Format (MRF) and how it supports the features of NASA GIBS
    • Charles Stern will talk about motivations and progress made on pangeo-forge
    • Stavros Papadopoulos, creator of TileDB, will present "Time to depart from file formats and focus on engines and APIs"
    • Steve Olson and Shane Mill will talk about NOAA's EDR API and how it enables programmatic access to both conventional and cloud-optimized data formats
    • Aaron Friesz will talk about the platform NASA LPDAAC has built to leverage cloud-optimized data formats.
  • 15 minutes: organize into 3-4 sub groups for continuing the conversation on a specific topic or presentation.
15 minutes: Break 

Part 2: Small Group Discussions
- Attendees and speakers will use this time to dive into discussions, questions and expertise on a sub-topic or specific question.
  • 30 minutes: Small groups meet in a virtual sub-space. A session organizer or ESIP coordinator will meet with each small group to facilitate conversation and take notes.
  • 25 minutes: Small groups present back to the larger group what was discussed
  • 5 minutes: Wrap-up


Organizers & Speakers

Aimee Barciauskas

Tech Lead / Engineer, Development Seed

Rich Signell

Research Oceanographer, USGS

Robert Casey

Deputy Director of Cyberinfrastructure, IRIS Data Services
Rob currently serves as Deputy Director of Cyberinfrastructure at the Incorporated Research Institutions for Seismology (IRIS) Data Management Center (DMC) in Seattle, WA. His responsibilities include management of software development and data services activities as well as leading...

Aaron Friesz

LP DAAC/USGS

Steve Olson

Physical Scientist, NWS/STI/WIAD
I work for the National Weather Service (NWS) Meteorological Development Laboratory (MDL). MDL conducts applied research and development for the improvement of diagnostic and prognostic weather information; data depiction and utilization; warning and forecast product preparation...

Shane Mill

Senior Web Developer, Guidehouse/NOAA - National Weather Service
Shane Mill has been an Application Developer within the Weather Information and Applications Division of the Meteorological Development Lab of the National Weather Service since September of 2018. Since joining MDL, Shane has prototyped ways that existing standards can enhance operational...

Joe Roberts

Science Data Visualization, Technical Lead, NASA JPL

Trevor Skaggs

Element 84

Charles Stern

Data Infrastructure Engineer, Lamont-Doherty Earth Observatory


Wednesday July 21, 2021 11:00am - 1:30pm EDT
TBA

1:00pm EDT

Teacher Workshop: Exploring Earth, Wind and Fire via Earth Science Data



Wednesday July 21, 2021 1:00pm - 5:00pm EDT
TBA

2:30pm EDT

New Frontiers in AI for Earth and Space: Big Data and Parallel Computing
AI is lauded as a powerful tool for gaining insights and producing knowledge from the massive datasets we have access to today in the Earth sciences. One of the major challenges of integrating AI practices in the Earth and Space Sciences is the immense size of environmental and climate data. Intensive computational power is required for AI to efficiently learn from such massive amounts of data. The key question here, then, is what are the best strategies to make AI work and what kind of infrastructural constraints does the community face as a result? There are many parallel computing frameworks and tools (e.g., GPU, Dask, Spark, Hadoop, CUDA, JobLib, ipyparallel, dispy, Ray) to assist with this challenge today. But which one is suitable for different use cases in Earth and Space sciences? On various deployment platforms such as HPC, Azure, AWS, GCP, institutional clusters, individual servers, or even personal computers, what is the best way to configure the environment for carrying out AI tasks on large spatial datasets?
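
As a rough illustration of the pattern these frameworks share, the sketch below splits a large array into chunks, reduces each chunk in a worker, and combines the partial results. It uses only Python's standard library (`concurrent.futures`) as a stand-in and is not taken from any of the talks; the data, chunk size, and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Yield consecutive slices of `values` holding at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def partial_stats(chunk):
    """Reduce one chunk to (count, sum) so partial results can be merged."""
    return len(chunk), sum(chunk)


def parallel_mean(values, chunk_size=1000, workers=4):
    """Compute the mean by reducing chunks in workers, then combining."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_stats, chunked(values, chunk_size)))
    total_count = sum(count for count, _ in parts)
    total_sum = sum(subtotal for _, subtotal in parts)
    return total_sum / total_count


# Hypothetical stand-in for a large gridded temperature field, flattened.
temperatures = [280.0 + (i % 50) * 0.1 for i in range(10_000)]
mean_temperature = parallel_mean(temperatures)
```

Threads are used here only to keep the sketch portable. For CPU-bound work in Python, the same split, reduce, and combine structure would more typically run on separate processes or on a cluster, which is the part the frameworks listed above take care of.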

This series consists of two sessions. The first session will invite speakers with experience implementing AI at scale to share and communicate with the ESIP community working with parallel computing. We will accumulate a series of key strategies these speakers have used to move our research forward on AI4Earth&Space. In the second session, we will conduct a thorough step-by-step tutorial, from environment setup (e.g., Dask-ML) to training and testing AI using parallel computing on large datasets, to equip the Earth and Space science community with some hands-on experience.

Session 1: Talks (2.30 - 3.30pm)

1. Tom Augspurger, Microsoft
Title: Scalable Geospatial Analysis
Working with geospatial data can be challenging, regardless of the scale. We'll see how Microsoft's Planetary Computer is using STAC and Dask to facilitate large-scale geospatial data analysis. We'll use the Planetary Computer's STAC catalog to find the data matching some conditions, and a Dask cluster to process the data in parallel.

2. James Bednar, Director of Technical Consulting, Anaconda, Inc.
Title: How reproducible do you want your code to be?
Unless your simulation or analysis is reproducible, you can't be sure your results mean anything. But how reproducible does it need to be, across hardware, software environments, people, organizations, and time? I'll present a quick overview of the levels to choose from, along with a suggested way to achieve each one using Conda environments with Python.

3. Ryan McGranaghan, Data Scientist/Aerospace Engineering Scientist, ASTRA LLC
Title: A survey of Cloud solutions for the Earth and Space Sciences
The Cloud has the potential to transform the way we collaborate and share science and to push the boundaries of what is possible with scientific computing. Cloud-based data science platforms are now being used to address challenges in the field of AI. Indeed, the Earth and Space Sciences are in an intense period of experimentation applying these platforms to more capably use AI for prediction and discovery. We will explore selected existing Cloud-based environments for the Earth and Space Sciences, particularly for the myriad components of the AI project lifecycle. We will use the survey of solutions to surface the gaps and trends in this rapidly evolving landscape.

4. Ziheng Sun, Research Assistant Professor, George Mason University
Title: ESIP Geoweaver Update, Machine Learning Cluster Activity Overview & Future Plan
The automation of full-stack workflows has become vital as Earth data volume increases exponentially and the complexity of Earth system models and algorithms becomes more difficult to manage. The latest developments in AI/ML techniques bring many new opportunities to significantly improve accuracy, increase model resilience and intelligence, and reduce overall cost. However, managing and automating AI experiments is a grand challenge for the entire Earth science community. Geoweaver is software developed to tackle this problem. We will show how to use Geoweaver to create AI workflows in one place and run the processes on various distributed platforms, separate code from computing resources for resilience, record the provenance of every workflow execution, and share and reuse workflows to boost knowledge accumulation and discovery.

5. Cindy Lin, Postdoctoral Fellow, Cornell University 
Title: AI Ethics in Context
It has been broadly established by computer scientists working on AI in the environmental sciences that physical and computer science researchers pay more attention to the performance of AI-based models and less to how end users trust AI models (McGovern 2020). Accordingly, a lot of what makes an AI model usable depends on its trustworthiness; what is considered trustworthy may differ according to the needs of end user groups such as private industry and government. In this talk, I will discuss how a conundrum of political and socioeconomic factors, apart from the needs of end users, enable the establishment of AI trustworthiness in Indonesia. In particular, I provide an ethnographic account of a public-private partnership between an American IT firm and one of Indonesia’s leading engineering agencies where new AI technologies are developed to address one of the world’s largest environmental concerns: tropical peatland fires.

Session 2: Demos (3.45 - 5.00pm)
1. Tom Augspurger, Microsoft
Demo Title: Scalable Geospatial Machine Learning with Dask and STAC
Abstract: In this workshop, attendees will work through several exercises to train a deep learning model to predict crop types using satellite imagery. We’ll work on a JupyterHub deployed to Azure, and will access data from the Microsoft Planetary Computer’s data catalog.

Preparation: Attendees do not need to prepare anything ahead of time. They will be provided with credentials to log into a JupyterHub during the session. 

The materials will all be at https://github.com/TomAugspurger/esip-summer-2021-geospatial-ml

2. James Bednar, Director of Technical Consulting, Anaconda, Inc.

Demo title: Using hvPlot for interactive plotting of Xarray, Pandas, and Dask data in Jupyter
Xarray and Pandas support calling .plot() to get basic matplotlib plots, and here we'll show you how to use the same commands to explore even the largest cloud or remote datasets fully interactively. hvPlot makes it easy to get small multiples, overlays, layouts, and categorical plots, with dynamic regridding of large datasets so that you can explore them in any browser. New hvPlot features now also let you replace just about any number or string in an xarray or pandas method or expression with a widget, so that you can quickly try out the effect of various parameters or dynamically filter your data to help you understand it.

Preparation: Please follow the installation instructions at https://holoviz.org/installation.html

3. Ziheng Sun, Research Assistant Professor...

Organizers & Speakers

James Bednar

Director of Technical Consulting, Anaconda, Inc.
I work on HoloViz.org and PyViz.org, and am happy to chat about anything to do with visualizing data in Python.

Tom Augspurger

Microsoft
Tom is a software engineer working at Microsoft on the Planetary Computer and is a member of the Pangeo Steering Council. Tom helps maintain several open-source libraries in the scientific Python ecosystem, including pandas and Dask.

Annie Burgess

Lab Director, ESIP

Julien Chastang

Software Engineer, UCAR - Unidata
Scientific software developer at UCAR-Unidata.

Douglas Rao

Research Scientist, NESDIS/NCEI/CSSD/CSB
I am currently a Research Scientist at North Carolina Institute for Climate Studies, affiliated with NOAA National Centers for Environmental Information. My current research at NCICS focuses on generating a blended near-surface air temperature dataset by integrating in situ measurements...

Ziheng Sun

Research Associate Professor, George Mason University
My research interests are mainly on geospatial cyberinfrastructure and machine learning in atmospheric and agricultural sciences.

Ryan McGranaghan

Data Scientist/Aerospace Engineering Scientist, ASTRA LLC
Space scientist, engineer, data scientist, designer, podcast host. Observer of beauty in liminal spaces. I believe in being led around by your curiosity.

Cindy Lin

Postdoctoral Fellow, Cornell University
Cindy Lin is a Postdoctoral Fellow at the Atkinson Center for Sustainability, affiliated with the Department of Information Science. In Fall 2022, she will be an assistant professor at Pennsylvania State University’s College of Information Sciences and Technology. Her current research...



Wednesday July 21, 2021 2:30pm - 5:00pm EDT
TBA
 
Thursday, July 22
 

11:00am EDT

Foraging for Dataset-Usage Relationships
Over the last year, the Discovery Cluster has been developing an innovative search paradigm called Usage-Based Discovery (UBD). UBD allows users to examine the datasets used in applications and research similar to the user’s own purpose. The database underpinning UBD needs a robust population of dataset-usage relationships. Please join us in providing relationships in real-time that will serve as a core population. These high quality relationships will also serve as training data for further machine-learning-based harvesting. Training, examples, and coaching will be provided to participants.
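
To make the idea of a dataset-usage relationship concrete, here is a minimal sketch of how such relationships could be stored and queried by topic. It is not the Discovery Cluster's actual schema or database: the `Usage` and `UsageIndex` classes, the topic tags, and the dataset identifiers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Usage:
    """Hypothetical record of one application or study that used data."""
    title: str
    topics: FrozenSet[str]


class UsageIndex:
    """Toy in-memory store of dataset-usage relationships (illustrative only)."""

    def __init__(self) -> None:
        self._datasets_by_usage: Dict[Usage, Set[str]] = defaultdict(set)

    def add_relationship(self, dataset_id: str, usage: Usage) -> None:
        # One relationship = "this dataset was used in this application".
        self._datasets_by_usage[usage].add(dataset_id)

    def datasets_for_topic(self, topic: str) -> List[str]:
        # Collect every dataset used by any usage tagged with the topic.
        found: Set[str] = set()
        for usage, dataset_ids in self._datasets_by_usage.items():
            if topic in usage.topics:
                found.update(dataset_ids)
        return sorted(found)


# Made-up usages and dataset identifiers.
flood_study = Usage("Flood extent mapping", frozenset({"flood", "hydrology"}))
crop_study = Usage("Crop yield forecast", frozenset({"agriculture"}))

index = UsageIndex()
index.add_relationship("precip-dataset-A", flood_study)
index.add_relationship("elevation-dataset-B", flood_study)
index.add_relationship("precip-dataset-A", crop_study)

print(index.datasets_for_topic("flood"))
```

A user whose own purpose is flood-related would look up the "flood" topic and see which datasets comparable work relied on. A real system would presumably need richer similarity measures than exact topic matches.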

Organizers & Speakers

Christopher Lynnes

Researcher, Self
Christopher Lynnes recently retired from NASA as System Architect for NASA’s Earth Observing System Data and Information System, known as EOSDIS. He worked on EOSDIS for 30 years, during which he worked on multiple generations of data archive systems, search engines and interfaces...

Doug Newman

Data Systems Deputy Technical Manager, NASA ESDIS


Thursday July 22, 2021 11:00am - 12:30pm EDT
TBA

4:00pm EDT

AI Data Readiness: What Does ML Training Data Interoperability Mean to You? Examples and Use Cases
The results from machine learning are only as good as their training data. At the same time, it’s difficult and time consuming to develop quality training data. It would be valuable if we could reuse training data in other contexts. What is necessary to make that happen? 

In this session we explore several examples of preparing and sharing ML training data and then explore whether there are certain attributes or processes that we can standardize in order to make training data more interoperable.

Presentations:
David Roy, Univ. Mich., on the reuse of burned area data from Landsat
Gabriel Tseng, Univ. MD, on the collection and later sharing of training data on agricultural conditions
Christian Schroeder de Witt, Univ. Oxford, on a benchmark data set for precipitation prediction

Community exercise to refine most effective ways to enhance ML training data reusability and readiness.

Organizers & Speakers

Mark Parsons

Research Scientist, University of Alabama in Huntsville

Aleksandar Jelenek

The HDF Group

Tyler Christensen

Data Management Architect, NOAA

Douglas Rao

Research Scientist, NESDIS/NCEI/CSSD/CSB
I am currently a Research Scientist at North Carolina Institute for Climate Studies, affiliated with NOAA National Centers for Environmental Information. My current research at NCICS focuses on generating a blended near-surface air temperature dataset by integrating in situ measurements...


Thursday July 22, 2021 4:00pm - 5:30pm EDT
TBA

4:00pm EDT

Dynamic Soil Information Systems: where we are, where we need to go, and why
Soil observations are as old as agriculture and even more relevant under a carbon-dioxide driven climate. With the promise of new AI and machine learning methods, accurate and timely data become even more valuable for scientific insight and data-driven policy work. Yet there remain substantial barriers to soil data discovery, access, integration, and reuse. Many of these challenges are driven by the diversity in measurements, methods, and scales inherent in soils. In this session we will hear about current efforts to address these challenges. From new ontologies and semantic tools, to data formatting, to what data measurements are poised to drive novel insights, this session will focus on efforts in the US and around the world to create a dynamic soil information system for the 21st century.

Organizers & Speakers

Kathe Todd-Brown

Assistant Professor, University of Florida
I'm a computational biogeochemist who uses data and mathematics to study how dirt breathes.

Luís de Sousa

Federal University of Rio Grande do Sul



Thursday July 22, 2021 4:00pm - 5:30pm EDT
TBA

4:00pm EDT

GeoScience Ontology Landscape
Community adoption of some representation of geoscience concepts and relationships is an important step towards streamlining data integration and interoperability of geoscience data across domains. This session is intended to understand the requirements and applications that motivate several geoscience-related ontologies, and to promote conversation on the differences and similarities between them. The session will include overview presentations of some current geoscience and related ontologies: 
1) OntoGeonous ontology (Lombardo et al. 2018): an implementation of the GeoSciML conceptual model, with application to geologic maps.
2) GeoCore ontology (Garcia et al. 2020, https://doi.org/10.1016/j.cageo.2019.104387): a geoscience ontology with applications in petroleum exploration.
3) GeoScience Ontology (https://github.com/Loop3D/GKM): developed for the Loop3D project to provide background knowledge support for implicit generation of 3D geologic models (Brodaric, Richard, GSC OFR https://doi.org/10.4095/328296).
4) SWEET and ENVO: two widely used ontologies with broad scope; SWEET is managed by an ESIP Cluster, and efforts have been under way to update this ontology and align it with ENVO.

The major discussion point is whether the use cases that have motivated these ontologies require different solutions, or whether they can converge into a more integrated and harmonized set of shared knowledge representations that promotes interoperability.

Organizers & Speakers

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...

Brandon Whitehead

Environmental Data Scientist, Manaaki Whenua - Landcare Research

Luan Fonseca Garcia

Researcher, UFRGS
I'm a computer scientist focused on the development of ontologies for geosciences.

Alizia Mantovani

Consiglio Nazionale delle Ricerche -IGG



Thursday July 22, 2021 4:00pm - 5:30pm EDT
TBA

6:00pm EDT

Vocabularies for rock type categories
The goal of this session is to analyze and compile use cases for a standardized rock type (lithology) vocabulary, and to learn about existing vocabularies in use. The session will start with presentations on existing vocabularies (e.g. CGI Simple Lithology, BGS Rock Names, EarthChem, MINDAT, Geological Survey of Queensland), focusing on their design requirements, how they are currently being used, and how they are accessed. This is an outstanding problem for data integration in geoscience and the time is ripe to look for convergence between the various activities.

Vocabularies to be discussed:

Discussion Questions:
  1. Do existing vocabularies meet requirements? If not, what is missing and what do we have to do next?
  2. Who should govern a lithology vocabulary?
  3. How can the vocabulary be sustained? 

During the session, please keep an eye on the Rock Vocabulary session jam board, and post notes (use the sticky note tool) with your thoughts and questions. We'll review the board during the discussion time after presentations.

AGENDA:
  • 10 min: Welcome, overview, get organized.
  • 10 min (5 min each): Kerstin and Lesley on their interest in and experience with lithology vocabularies.
  • 10 min: (Steve) CGI vocabularies: simple lithology, regional lithotectonic units, USGS GEMS ‘General Lithology’ and State Geologic Map Compilation (SGMC) vocabularies.
  • 10 min: (Jolyon Ralph) MINDAT rock vocabulary
  • 10 min: (Tim McCormick) BGS lithology SKOS resource, https://data.bgs.ac.uk/id/EarthMaterialClass/RockName/PA_RSD
  • 10 min: (Vance Kelly) Geological Survey of Queensland lithology vocabulary
  • 20 min: Q & A, Discussion

Organizers & Speakers

Kerstin Lehnert

Doherty Senior Research Scientist, Columbia University
Kerstin Lehnert is Doherty Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the Interdisciplinary Earth Data Alliance that operates EarthChem, the System for Earth Sample Registration, and the Astromaterials Data System. Kerstin...

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...

Lesley Wyborn

Data Strategist, Australian Research Data Commons

Vance Kelly

Principal Data Manager, Geological Survey of Queensland

Jolyon Ralph

MinDat.org and Hudson Institute of Mineralogy

Tim McCormick

British Geological Survey



Thursday July 22, 2021 6:00pm - 7:30pm EDT
TBA
 
Friday, July 23
 

1:30pm EDT

Designing a Public Portal for Participatory Environmental Governance
Help us push the frontiers of democratic participation in environmental governance by joining this design workshop on a new data portal that enables members of environmental advocacy groups to ask geography-based questions about environmental enforcement!

Background: Vital data about federal enforcement actions against facilities that pollute the soil, air, and water is currently available but largely inaccessible in the U.S. Environmental Protection Agency’s (EPA) Enforcement and Compliance History Online (ECHO) database. We have been working for 1.5 years with data analysts, nonprofits, and community groups—and now with ESIP Lab funding—to develop well-documented and open source cloud-based Jupyter Notebooks that make ECHO data readily accessible and reportable by zip code, hydrologic unit code (to assess watersheds), state, and congressional district. However, we now have so many tools and reports that they can be hard to navigate and access!

What we’re making: We are now building a web portal to share our tools and reports. Our vision is an intuitive map-centric interface for three types of public interaction:
  1. Accessing already-generated reports
  2. Accessing our Jupyter Notebooks to generate custom reports (e.g. Clean Water Act violations in the Niagara River watershed)
  3. Sharing these custom reports and some context about why the findings are important or how they are surprising.
Where you come in: Are there best practices we should know about for displaying these kinds of reports and tools? What similar projects should we look at during the design process? Examples include EPA’s How’s My Waterway tool, justicemap.org, and DataONE (possible integration?).

This workshop will take place in two parts:
  • Part 1 is an introduction to the reports and tools. We will familiarize participants with the project through both a presentation and hands-on use of a Notebook.
  • Part 2 is a design workshop exploring ideas for the web portal: a structured, facilitated discussion focused on developing user scenarios to inform web development.

Organizers & Speakers

Kelsey Breseman

Attendee, Head Weaver
Tlingit, forest person, engineer, and activist. Working on climate research & communication on tribal lands with Sealaska and The Nature Conservancy. Always interested in how tech tools and the stories we tell shift the balance of power.


Friday July 23, 2021 1:30pm - 3:00pm EDT
TBA

3:30pm EDT

HDF Town Hall
Several petabytes of Earth Science data are already in the HDF file formats and the collections are still growing. The HDF Group is strategically committed to supporting producers and users of these data as their access patterns vary throughout the data use lifecycle with evolving applications, computing frameworks, and backend storage systems. This is especially important for a seamless transition from on-premises, filesystem-based computing to cloud computing. This commitment is reflected in HDF Group’s long-time collaboration with the netCDF library developers, and more recent work to support efficient access to HDF5 files for Zarr-based applications. Both of these data formats are popular in the ESIP community.

• Elena Pourmal (NASA EED-2 / HDF Group): HDF - Current Status and Future Directions
• J. P. Swinski (NASA): H5Coro: The HDF5 Cloud-Optimized Read-Only Library

NASA’s migration of science data products and services to AWS has sparked a debate on the best way to access science data stored in the cloud. Given that a large portion of NASA’s science data is in the HDF5 format or one of its derivatives, a growing number of efforts are looking at ways to efficiently access H5 files residing in S3. This presentation describes one of those efforts, H5Coro, and argues for the creation of a standardized subset of the HDF5 specification targeting cloud environments. H5Coro is an open-source C++ module written from scratch that implements a performant HDF5 reader for H5 files that reside in S3. It targets high latency/high throughput environments by minimizing the number of I/O operations through caching and intelligent range GETs. H5Coro is currently available as a C library and includes Python bindings.
• John Readey (The HDF Group): HDF for the Cloud - Serverless HDF

The HDF Server (HSDS) provides a convenient method for running HDF applications in the cloud (utilizing scalable compute and object-based storage), but sometimes setting up a server is just too much to deal with due to cost, time, or management concerns. In this talk we'll discuss two alternative ways to use HSDS technology while leaving the server aspect behind. The first, HSDS for AWS Lambda, supports the HDF REST API but runs entirely using Lambda functions. The second approach is "HSDS Direct Access", a client-side library that enables HSDS-like features exclusively on the client: object storage read and write, multi-threading support, and SQL-style queries.
• Ellen Johnson (MathWorks): MATLAB Modernization on HDF5 1.10, Support for SWMR and VDS, and Cloud Data Access

This talk presents our effort at MathWorks toward modernizing on HDF5 1.10.7 and adding support for the much-requested Single-Writer/Multiple-Reader and Virtual Dataset features. We will discuss our updated 1.10.7 HDF5 functionality available today for MATLAB users in the R2021b prerelease (with R2021b full release planned for September) and would like to hear early feedback from the community. We will also discuss MATLAB capabilities for working with HDF5 data hosted on S3, Azure, and Hadoop introduced in R2020b which we have now enabled for Virtual Datasets. We will wrap up with performance and compatibility considerations plus our tentative roadmap for future HDF5 enhancements.
 • Joe Lee (NASA EED-2 / HDF Group): HDFEOS.org User Analysis, Updates, and Future
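
The H5Coro abstract above mentions minimizing I/O operations through caching and intelligent range GETs. As a rough illustration of the range-GET half of that idea, the sketch below merges overlapping or nearby byte ranges so that fewer HTTP requests are needed. It is not H5Coro's actual algorithm or API: the function names and the `max_gap` parameter are assumptions made for this example.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # (start, end) offsets, end exclusive


def coalesce_ranges(ranges: List[ByteRange], max_gap: int = 0) -> List[ByteRange]:
    """Merge ranges that overlap or sit within max_gap bytes of each other.

    Reading a few unneeded bytes in the gap is often cheaper than paying
    the latency of a separate request (an assumption of this sketch).
    """
    merged: List[ByteRange] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def range_header(byte_range: ByteRange) -> str:
    """Format an HTTP Range header value; HTTP byte ranges use an inclusive end."""
    start, end = byte_range
    return f"bytes={start}-{end - 1}"


# Hypothetical reads needed to fetch metadata and one chunk of a file.
wanted = [(0, 512), (600, 1024), (4096, 8192), (512, 560)]
plan = coalesce_ranges(wanted, max_gap=64)

for byte_range in plan:
    print(range_header(byte_range))
```

Here four requested reads collapse into two range requests. A larger `max_gap` trades extra bytes transferred for fewer round trips, which tends to suit high-latency, high-throughput object storage.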

Organizers & Speakers

Aleksandar Jelenek

The HDF Group

Elena Pourmal

Engineering Director, HDF Group
HDF

John Readey

Developer, The HDF Group



Friday July 23, 2021 3:30pm - 5:00pm EDT
TBA
 