Data and other research artifacts such as software, samples, and ontologies need to be recognized as first-class objects in scientific discourse. As such, they must be fairly and appropriately credited. The Research Artifact Citation Cluster has been exploring what roles should be credited for different artifacts and how those roles should be credited. Similar work has been done for literature (notably through CRediT), but we have found that those approaches are only partially useful for data and other artifacts. We have learned that there are critical roles that deserve greater recognition and that citation is only one, limited mechanism for providing it. People often say that we need something like film credits (the long list of people and roles shown at the end of a movie) to describe the work that goes into producing a useful dataset. What is sometimes lost in that analogy is how contested and highly negotiated film credits are. Defining roles and credit is complex and sensitive.
In this session, we will review the work of the group over the last year, the lessons learned, and initial conclusions on what roles are important and how they should be credited for five major research artifacts: data, software, samples, semantic objects, and complex learning objects. We will then hold a set of breakouts on the specific question of how credit for data should be characterized. We will explore multiple role taxonomies, including CRediT, ISO 19115, DataCite, and possibly Rescognito. The goal is to develop or adopt a defined and consistent set of roles that can be acknowledged and captured in a citation as well as in other places, such as a dataset landing page, documentation, and elsewhere in the metadata.