For over 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, thus forming a community dedicated to making Earth observations more discoverable, accessible and useful to researchers, practitioners, policymakers, and the public. The theme of this year’s meeting is Leading Innovation in Earth Science Data Frontiers.
ALL SESSION RECORDINGS CAN NOW BE FOUND ON THE ESIP YOUTUBE CHANNEL.
Semantic Technologies
Tuesday, July 20
 

1:30pm EDT

AI Data Readiness: Designing A Community-Driven Road Map for Data Standards and Tools
As artificial intelligence transforms scientific discovery in the Earth and space sciences, there is an urgent need to ensure that Earth and space science data is ready for AI applications. As a collaborative community that bridges government agencies, academia, private industries, and international initiatives, ESIP is a natural space to advance the development of data standards and tools to support the transformation of Earth and space science data for AI applications. The Data Readiness Cluster, created after the 2021 ESIP Winter Meeting, aims to steward community efforts on the topic of AI-ready data, from definitions and standards to tools and capacity building.

As a new cluster, the Data Readiness Cluster invites all members of the Earth and space science community to design a road map to guide the development of data standards and tools for AI data readiness. This session will build on the landscape analysis of data standards, tools, and research that are relevant to AI ready data. The community will co-design a path forward and identify major milestones for developing data standards and tools for AI ready data. We invite individuals and groups who are interested in the topic to join the session and contribute to the design of a road map that guides the cluster activities for the next two years.

Session agenda:
13:30–13:35 Welcome & session overview
13:36–14:00 "Celebrity Interview" - Data Readiness Cluster Workplan & Community Feedback on the workplan
14:01–14:50 Breakout Room (two rounds) - Data AI-readiness Checklist 

Session materials:
- Data Readiness Cluster Problem Statement
- Data Readiness Cluster Workplan
- Sample checklist for data AI-readiness

Relevant sessions during ESIP Summer Meeting 2021


Organizers & Speakers

Tyler Christensen

Data Management Architect, NOAA

Eric Kihn

Division Chief OGSSD, NESDIS/NCEI/COGSD

Douglas Rao

Research Scientist, NESDIS/NCEI/CSSD/CSB
I am currently a Research Scientist at the North Carolina Institute for Climate Studies (NCICS), affiliated with the NOAA National Centers for Environmental Information. My current research at NCICS focuses on generating a blended near-surface air temperature dataset by integrating in situ measurements...

Rob Redmon

Scientist, NOAA Center for AI
Dr. Rob Redmon is a senior scientist with NOAA's National Centers for Environmental Information (NCEI). He is the Lead for NOAA's Center for Artificial Intelligence (NCAI, noaa.gov/ai), and the Space Weather Follow On (SWFO) Science Center.


Tuesday July 20, 2021 1:30pm - 3:00pm EDT
TBA

1:30pm EDT

Science-on-Schema.org - Gathering Feedback for ESIP Assembly Endorsement

Slides: https://docs.google.com/presentation/d/1voih_wYRgP9plkbu31WmYAdjeWj7Mu6stR9Mn1JExR0/edit#slide=id.gafa808a5c8_0_39

Across geoinformatics, many initiatives deserve careful consideration and adoption to promote interoperability across our shared missions and projects. Because there are so many specifications for our discipline to evaluate and adopt, documentation and guidelines that streamline this process are highly beneficial. The ESIP “Science on Schema.org” cluster has been developing and testing a set of guidelines that help data repositories and other content providers adopt the schema.org vocabulary for publishing metadata about data resources in their HTML web documents. The goal of these guidelines is to document shared conceptualizations surrounding the description of scientific datasets and their respective data repositories for the purpose of providing consistent, machine actionable metadata using web publishing standards. In doing so, adopters achieve greater discovery of scientific datasets across the web from large scale search providers to local, domain specific metadata aggregators.
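As a rough illustration of what publishing schema.org metadata in an HTML page can look like, the sketch below maps a repository record onto a minimal schema.org `Dataset` description and wraps it in a JSON-LD script element. The record, its field names, and the helper functions are hypothetical and are not taken from the guidelines themselves; only the schema.org terms (`Dataset`, `name`, `description`, `url`, `identifier`) are real vocabulary, and the guidelines should be consulted for the recommended properties.

```python
import json

# Hypothetical repository record; every value here is a placeholder.
record = {
    "title": "Example sea surface temperature dataset",
    "summary": "Placeholder description of a hypothetical dataset.",
    "landing_page": "https://example.org/datasets/sst-001",
    "doi": "https://doi.org/10.0000/example",
}


def to_schema_org(rec):
    """Map a repository record onto a minimal schema.org Dataset description."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": rec["title"],
        "description": rec["summary"],
        "url": rec["landing_page"],
        "identifier": rec["doi"],
    }


def to_script_tag(doc):
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


jsonld = to_schema_org(record)
print(to_script_tag(jsonld))
```

A harvester or search crawler can then read the script element from the landing page without parsing the human-facing HTML, which is the discovery benefit the abstract describes.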

At the 2021 ESIP Winter Meeting, the community examined whether an ESIP Assembly Endorsement of Science-on-Schema.org guidelines documentation was useful and applicable. After discussion, it was decided that the Schema.org cluster would submit an updated version of the guidelines plus supporting materials for ESIP endorsement at or soon after the 2021 ESIP Summer Meeting. This session will: 1) present the ESIP endorsement package of the latest Science-on-Schema.org guidelines v.1.3; 2) host presentations from guidelines adopters regarding their experiences; and 3) gather feedback from attendees on ways to improve the submission package.

Guidelines: https://science-on-schema.org 

Endorsement Issues to Resolve:


Organizers & Speakers

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...

Ruth Duerr

Research Scholar, Ronin Institute for Independent Scholarship

Matt Jones

Director of Informatics R&D, NCEAS / DataONE / UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Cyberinfrastructure

Mark Schildhauer

Senior Technology Fellow, NCEAS/UCSB
Data semantics, Ecoinformatics training, Arctic data, LTER data, Ecological synthesis

Adam Shepherd

Technical Director, BCO-DMO, Woods Hole Oceanographic Institution
Architecting adaptive and sustainable data infrastructures. Co-chair of the ESIP schema.org cluster. Knowledge Graphs | Data Containerization | Declarative Workflows | Provenance | schema.org

Dave Vieglais

Research Professor, University of Kansas


Tuesday July 20, 2021 1:30pm - 3:00pm EDT
TBA

4:00pm EDT

Machine-Readable Descriptors for Heterogeneous Tabular Data
Many Earth science observation datasets are inherently tabular in nature: rows and columns of numbers and text providing measurements of particular quantities at specified times and locations. Often these data are stored as plain text files containing comma-separated values (CSV) or values delimited by other separators. Such files are easy for humans to load into a spreadsheet or pandas DataFrame, either interactively or using ad hoc code that understands the structure of a particular file.

Unfortunately, tabular data files are heterogeneous. There are no mandatory standards or schema for important characteristics such as the presence of header rows, the naming and ordering of columns, the units used, and so forth. Even if there were a standard approach, a data archive facility may be obligated to accept data as submitted rather than converting to another format. The end result of this file variety is that human intervention is required to inspect and understand the contents of any new instance; automated data ingestion and verification are not easily done.

To solve this problem, a number of approaches have been proposed for machine-readable descriptors that provide metadata about the syntax and semantics of the rows of data. Examples include the World Wide Web Consortium (W3C) CSV on the Web (CSVW) technical recommendation (which uses JSON format), Table Schema (also in JSON), NOAA ERDDAP's NCCSV and the British Atmospheric Data Centre's BADC-CSV (both of which use CSV text), CSV YAML (CSVY), the NASA Ames Format Specification (text), possibly NcML (an XML format not designed for this purpose but perhaps adaptable), and doubtless others. In each case the descriptor is either a separate sidecar file or comprises additional lines of metadata in the data file itself, prior to the actual CSV-style rows of data values.
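The sidecar idea can be sketched in a few lines of Python. The example below is a simplified illustration, not a conforming CSVW processor: the descriptor uses a small subset of CSVW terms (`url`, `tableSchema`, `columns`, `name`, `datatype`), the file name and column names are hypothetical, the header is compared against `name` rather than the `titles` property CSVW actually uses for that purpose, and `dateTime` values are left as strings.

```python
import csv
import io
import json

# Hypothetical sidecar descriptor using a small subset of CSVW terms.
descriptor_text = """
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "stations.csv",
  "tableSchema": {
    "columns": [
      {"name": "station", "datatype": "string"},
      {"name": "time", "datatype": "dateTime"},
      {"name": "air_temp_c", "datatype": "number"}
    ]
  }
}
"""

# Hypothetical contents of the data file, with a single header row.
csv_text = (
    "station,time,air_temp_c\n"
    "KAVL,2021-07-20T13:30:00Z,27.4\n"
    "KRDU,2021-07-20T13:30:00Z,31.0\n"
)

# Assumed mapping from descriptor datatypes to Python casts (simplified).
CASTS = {"string": str, "dateTime": str, "number": float, "integer": int}


def read_with_descriptor(csv_source, descriptor):
    """Check the header against the descriptor, then cast each value."""
    columns = descriptor["tableSchema"]["columns"]
    expected = [col["name"] for col in columns]
    reader = csv.reader(csv_source)
    header = next(reader)
    if header != expected:
        raise ValueError(f"header {header} does not match descriptor {expected}")
    return [
        {col["name"]: CASTS[col["datatype"]](value) for col, value in zip(columns, raw)}
        for raw in reader
    ]


rows = read_with_descriptor(io.StringIO(csv_text), json.loads(descriptor_text))
print(rows[0])
```

Because the column names and types come from the descriptor rather than from code written for one file, the same reader can ingest and verify any file that ships with such a sidecar, which is the automation the abstract is after.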

This session will invite discussion of various approaches and their benefits or limitations including ease of creation, actual machine-readability, level of standardization, availability of tools, and breadth of community adoption.

Agenda:
  • Welcome and overview - Jeff de La Beaujardière/NCAR (15 min)
  • W3C CSV on the Web (CSVW) at Italian Ministry of Transportation - Paolo Starace/SciamLab (15 min)
  • ERDDAP's datasets.xml as a File Description System - Bob Simons/NOAA NMFS (15 min)
  • CSV YAML (CSVY) at ICARUS - Tran Nguyen/UC Davis (15 min)
  • Open discussion (30 min)

Organizers & Speakers

Jeff de La Beaujardiere

Director, Information Systems Division, NCAR
I am the Director of the NCAR/CISL Information Systems Division. My focus is on the entire spectrum of geospatial data usability: ensuring that Earth observations and model outputs are open, discoverable, accessible, documented, interoperable, citable, curated for long-term preservation...

Bridget Thrasher

Data Stewardship Coordinator, NCAR

Eric Nienhouse

SE / Product Owner, UCAR

Bob Simons

IT Specialist, NMFS Environmental Research Division
I work on ERDDAP, a free and open source data server that gives you a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP has been installed and used by more than 70 organizations around the...

Paolo Starace

Solution Architect & Co-founder, Sciamlab


Tuesday July 20, 2021 4:00pm - 5:30pm EDT
TBA

6:00pm EDT

Plenary in honor of Dr. Peter Fox: X-informatics - Lessons Learned from Data and Information in Research
Informatics efforts emerged largely in isolation across a number of disciplines. This discipline, generally cast as the science and engineering of information systems, originated in the middle of the last century, has undergone many adaptations, and in the last two decades has flourished into discipline-specific fields such as geoinformatics, bioinformatics, astroinformatics, and more. Recently, certain core elements of informatics have been recognized as applicable across disciplines. Hence, efforts at systematizing the common (or core, i.e. discipline-neutral) aspects of informatics have been successful; use cases, human-centered design, iterative approaches, and information models are some of the key elements. Dr. Peter Fox has been instrumental in convening the Earth Science Informatics community, in defining Informatics and Data Science in the Earth Sciences, and in advancing his vision of “X-informatics” and the evolution of these fields as interdisciplinary research becomes widely accepted and new challenges arise from the increased attention to data-intensive approaches in general. These challenges include creating or adapting informatics to address data that are high-dimensional, heterogeneous, sparse, or of uncertain quality. We would like to dedicate this session to Dr. Peter Fox, a visionary, a champion, and an avid explorer of boundaries when it comes to Informatics and its benefits in scientific research. This session will showcase the field of Informatics, its history, current research, and visions for the future, as well as the role Dr. Peter Fox has played in shaping these ideas and approaches.

Featured presentations:
  • Mineral Informatics: Analytics, Visualization, and the Legacy of Peter Fox (Robert Hazen)
  • X-informatics: making data science down to earth in the real world (Xiaogang (Marshall) Ma)

Organizers & Speakers

Robert Hazen

Senior Scientist, Carnegie Institution for Science
I am a mineralogist who, in 2015, was mesmerized by Peter Fox's vision of data-driven discovery. In the past 6 years, working closely with Peter and his students, we have been attempting to usher in an era of mineral informatics. We have been constructing large data resources and...

Marshall Ma

Associate Prof, University of Idaho
Xiaogang (Marshall) Ma is an associate professor of computer science at the University of Idaho. He received his Ph.D. degree in Earth Systems Science and GIScience from the University of Twente, Netherlands, in 2011, and then completed postdoctoral training in Data Science at Rensselaer...

Mark Parsons

Research Scientist, University of Alabama in Huntsville

Susan Shingledecker

Executive Director, ESIP
Susan is Executive Director of ESIP, Earth Science Information Partners, a global community of Earth science data professionals who come together to find solutions and advance data management to enable and empower the use of data to solve some of our planet's greatest challenges...


Tuesday July 20, 2021 6:00pm - 7:30pm EDT
TBA
 
Wednesday, July 21
 

11:00am EDT

Graph-Based Data Science
This tutorial introduces graph-based data science work, where machine learning approaches can be combined with complementary knowledge graph work. The tutorial uses `kglab`, an open source library that integrates RDFlib, OWL-RL, pySHACL, NetworkX, iGraph, pslpython, node2vec, PyVis, and more, to show how to use a wide range of graph-based approaches that blend smoothly into data science workflows and work efficiently with popular data engineering practices.

Within this space of open source graph libraries in Python, there are several camps: semantic graphs, probabilistic graphs, graph algorithms, graph ML, interactive visualization, etc. Previously these "camps" did not collaborate much and the libraries were difficult to integrate. We'll show how to write brief Python code to build complementary "Hybrid AI" workflows, which are well suited to strategies such as self-supervised learning. All of the training material is available as Jupyter notebooks.
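To give a feel for how two of these camps can meet, the sketch below uses only the Python standard library and does not use `kglab` or any of the libraries it integrates. It keeps a small set of subject-predicate-object triples, answers a semantic-style pattern query over them, and then projects the same triples onto an undirected graph for a graph algorithm (breadth-first search). All identifiers and the `ex:` terms are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples standing in for an RDF graph.
triples = [
    ("ex:dataset1", "ex:measures", "ex:air_temperature"),
    ("ex:dataset2", "ex:measures", "ex:air_temperature"),
    ("ex:dataset2", "ex:collectedBy", "ex:station_A"),
    ("ex:dataset3", "ex:collectedBy", "ex:station_A"),
]


def match(pattern):
    """Semantic-style lookup: None acts as a wildcard in (s, p, o)."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def adjacency():
    """Project the triples onto an undirected graph, dropping predicates."""
    graph = defaultdict(set)
    for subject, _, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def hops(graph, start, goal):
    """Breadth-first search returning the number of edges between two nodes."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected


graph = adjacency()
print(match((None, "ex:measures", None)))
print(hops(graph, "ex:dataset1", "ex:dataset3"))
```

The same data serves both views: the pattern query treats it as labeled statements, while the search treats it as plain connectivity, which is the kind of blending the tutorial describes at much larger scale.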


Organizers & Speakers

Paco Nathan

Managing Partner, Derwen, Inc.
Known as a "player/coach", with core expertise in data science, cloud computing, natural language, graph technologies; ~40 years tech industry experience, ranging from Bell Labs to early-stage start-ups. Advisor for Amplify Partners, Recognai, KUNGFU.AI. Lead committer Py...


Wednesday July 21, 2021 11:00am - 1:30pm EDT
TBA

11:00am EDT

Identifying technology capabilities that meet wildfire science and practitioner requirements
What. This session is co-organized by the Agriculture and Climate Cluster and the Semantic Harmonization Cluster (hereafter collectively referred to as the “Clusters”). The PDF poster on ESIP's figshare account gives a big-picture schematic of how this session relates to data-science topics such as AI/ML, semantic technology, and graph database technology.

Why. Environmental risks are increasingly resulting in disasters that cost the taxpayer dearly in terms of lives lost, incurred damages, and future liabilities. A recent study on the comprehensive cost of the 2018 California wildfires estimated damages at $150B and the loss of thousands of lives. In this proposed session, the Clusters will lead transdisciplinary-oriented discussions focused on both science and technology topics for managing such environmental risks. Wildfire data and information should ideally be reusable and repurposable across different fire management phases (e.g. prediction, pre-fire planning, during fire, after-fire, recovery). For example, infrastructure that is vulnerable to wildfire-induced floods and is identified during the active fire-fighting phase should be easily discoverable to city managers weeks or even months later, when heavy rains on burn areas may trigger catastrophic debris flows that threaten lives. Features (e.g. buildings, vegetation patches, ridgelines) identified by AI/ML algorithms from UAS imagery data that are used for mitigation planning should be made discoverable for fire managers making tactical fire-fighting decisions.

How.  The proposed session addresses the following question: how can we apply data and knowledge management technologies to fulfill the needs of wildfire mitigation and response? 

In this session, you will be invited to contribute your expertise to sketch out technical solutions that can be deployed to meet the speakers' stated needs.  Your ideas will be openly accessible to individuals who may use those ideas to apply for ESIP Lab and ESIP FUNding Friday projects.

Agenda
  • [11 am] Workshop begins
  • Introduction
    • Big-picture schematic of how this session relates to data-science topics like AI/ML, semantic technology, graph database, etc.
  • Slido poll: Which of the following wildfire experiences apply to you?
  • [11:10 am] Wildfire problem statement, requirements, and some focus on planning by polygon
    • Everett Hinkley, US Forest Service, Geospatial Management Office National Remote Sensing Program Manager
      • Wildfire Mapping--Leveraging AI/ML for needed improvements: Faster delivery, improved consistency, reduced subjectivity
    • Dave Zader, Wildland Fire Administrator for The City of Boulder, CO Fire Department (retired); Wildland Fire Policy Committee member for the International Association of Fire Chiefs
      • Wildfire management and planning by polygon, a tool for improved decision-making and resources usage
    • Pier Buttigieg, Helmholtz Metadata Collaboration
      • Representing and aligning knowledge about wildfires - the need and challenge of semantic harmonization
  • [12:05 pm] Slido poll: Rank the following values-at-risk that are important to *YOUR* community: from most important (rank #1) to least important (rank #6)
  • [12:10 pm] Breakouts Part 1
    • Breakout group #1: Knowledge representation for wildfire planning and execution (Focus on Polygons)
    • Breakout group #2: Technological solutions for wildfire planning and execution
  • Short break / transition (10 min)
  • [~12:45 pm] Breakouts Part 2
    • Breakout group #1: Knowledge representation for wildfire planning and execution (Focus on Values-at-Risk)
    • Breakout group #2: Technological solutions for Wildfire Planning and Execution
  • [1:10 pm] Report out from breakout groups
  • [1:20 pm] Wrap up
  • [1:30 pm] Workshop ends


Organizers & Speakers

Brian Wee

Founder and Managing Director, Massive Connections, LLC
Transdisciplinary scientist invested in the use of environmental data and information for science, education, and decision-making for challenges at the nexus of global environmental change, natural resources, and society. Strategized and executed initiatives to engage the US Congress...

Bill Teng

NASA GES DISC (ADNET)

Ruth Duerr

Research Scholar, Ronin Institute for Independent Scholarship

Pier Luigi Buttigieg

Digital Architect & Senior Data Scientist, Alfred Wegener Institute / Helmholtz

Everett Hinckley

Geospatial Management Office National Remote Sensing Program Manager, US Forest Service

Dave Zader

International Association of Fire Chiefs


Wednesday July 21, 2021 11:00am - 1:30pm EDT
TBA
 
Thursday, July 22
 

4:00pm EDT

GeoScience Ontology Landscape
Community adoption of some representation of geoscience concepts and relationships is an important step towards streamlining data integration and interoperability of geoscience data across domains. This session is intended to understand the requirements and applications that motivate several geoscience-related ontologies, and to promote conversation on the differences and similarities between them. The session will include overview presentations of some current geoscience and related ontologies: 
1) OntoGeonous ontology (Lombardo et al., 2018): an implementation of the GeoSciML conceptual model, with application to geologic maps.
2) GeoCore ontology (Garcia et al., 2020, https://doi.org/10.1016/j.cageo.2019.104387): a geoscience ontology with applications in petroleum exploration.
3) GeoScience Ontology (https://github.com/Loop3D/GKM): developed for the Loop3D project to provide background knowledge support for implicit generation of 3D geologic models (Brodaric, Richard, GSC OFR, https://doi.org/10.4095/328296).
4) SWEET and ENVO: two widely used ontologies with broad scope; SWEET is managed by an ESIP Cluster, and efforts have been under way to update this ontology and align it with ENVO.

The major discussion point is whether the use cases that have motivated these ontologies require different solutions, or whether some convergence is possible so that a more integrated and harmonized set of shared knowledge representations can be developed to promote interoperability.


Organizers & Speakers

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...

Brandon Whitehead

Environmental Data Scientist, Manaaki Whenua - Landcare Research

Luan Fonseca Garcia

Researcher, UFRGS
I'm a computer scientist focused on the development of ontologies for geosciences.

Alizia Mantovani

Consiglio Nazionale delle Ricerche -IGG



Thursday July 22, 2021 4:00pm - 5:30pm EDT
TBA
 
Friday, July 23
 

3:30pm EDT

Science-on-Schema.org - Submitting the Guidelines for ESIP Assembly Endorsement
This is a working session for members of the Schema.org Cluster and other interested participants to finalize and submit the Science-on-Schema.org Guidelines v1.3 for ESIP Assembly Endorsement. Preparing a new version of the guidelines involves many tasks: reviewing pull requests, committing to GitHub, preparing DOI metadata, and coordinating all these changes for a successful release on GitHub and Zenodo. This short session lets us work through those tasks in real time to ensure all goes smoothly and nothing is missed. Finally, it gives cluster members an opportunity to celebrate their work together.


Organizers & Speakers

Stephen Richard

Geoinformatics consultant, Independent
Stephen Richard is an independent contractor working from Tucson, Arizona. He is currently involved in projects developing a cross-domain metadata scheme for describing physical samples (SESAR, iSamples), metadata for the Deep-time Digital Earth (DDE) program, the CODATA Cross-Domain...

Ruth Duerr

Research Scholar, Ronin Institute for Independent Scholarship

Matt Jones

Director of Informatics R&D, NCEAS / DataONE / UC Santa Barbara
DataONE | Arctic Data Center | Open Science | Provenance and Semantics | Cyberinfrastructure

Mark Schildhauer

Senior Technology Fellow, NCEAS/UCSB
Data semantics, Ecoinformatics training, Arctic data, LTER data, Ecological synthesis

Adam Shepherd

Technical Director, BCO-DMO, Woods Hole Oceanographic Institution
Architecting adaptive and sustainable data infrastructures. Co-chair of the ESIP schema.org cluster. Knowledge Graphs | Data Containerization | Declarative Workflows | Provenance | schema.org

Dave Vieglais

Research Professor, University of Kansas


Friday July 23, 2021 3:30pm - 5:00pm EDT
TBA
 
2021 ESIP Summer Meeting, Jul 19-23, 2021