Making data available – even well-managed data – does not mean that data can or will be reused. We argue that data must first be discovered, evaluated, and understood before they can be reused for particular purposes.
These practices of data discovery require different types of work. Data creators must document and share their data; data curators and repository managers must clean, enrich and select data; and ‘data seekers’ must find and make sense of data for reuse.
In other words, data discovery involves both making data discoverable and discovering data. Both perspectives need to be taken into account to facilitate the reuse of data.
We recently conducted a workshop for the Data Curation Network (DCN), a community of data curators from various academic and non-profit data repositories in the United States, to explore how understanding data discovery practices can inform curatorial work. Taking our recent short book as a starting point, we presented theoretical and empirical work exploring the concepts of ‘data needs’, data-centric sensemaking, and different conceptions of data quality. We discussed how these are situated within different types of data communities and hence not always easy to define.
In interactive group discussions, we explored how data curators and repositories choose where to focus their curation and design efforts. Themes included the question of which metrics meaningfully measure data reuse, the curation skills required, and how repository design facilitates reuse. Participants highlighted the human side of managing data, as well as barriers to capturing feedback from data reusers.
Two further topics emerged as important: building relationships with the data communities that both produce and reuse data, and communicating the value of data curation and reuse to different stakeholders. With regard to data description, monitoring, and curation, we also speculated about the promises and risks of AI solutions in this space.
The workshop gave data curators an opportunity to learn more about our research, and it gave us insight into curation practice, which we will take forward in our work.
Research Group Visualization & Data Analysis
University of Vienna
Sensengasse 6, 1090 Vienna