The Data Centre for deep geothermal energy (CDGP, https://cdgp.u-strasbg.fr) was launched in 2016 by the LabEx G-EAU-THERMIE PROFONDE (http://labex-geothermie.unistra.fr) to preserve, archive and distribute data acquired on geothermal sites in Alsace. More than 30 years of data have been collected at the Soultz-sous-Forêts research site, providing an invaluable legacy.
CDGP is part of the EPOS European platform EPOS-IP Anthropogenic Hazards (TCS-AH, https://tcs.ah-epos.eu/). The platform provides Episodes, each of which is a comprehensive set of data describing a geophysical process (e.g. deformation) induced or triggered by technological activity, which under certain circumstances can become hazardous for people, infrastructure and the environment. At CDGP, datasets related to the stimulation episodes of 2004 and 2005 have recently been added. Episodes from 1993, 2000 and 2003 are also available, and datasets related to the 2010 circulation will be available soon. The EPOS TCS-AH brings together a broad community interested in Anthropogenic Hazards related to induced seismicity. It is designed as a functional e-research infrastructure that provides access to a large set of relevant data and allows free experimentation in a virtual laboratory, promoting interdisciplinary collaboration between stakeholders (the scientific community, industrial partners and society).
From the very start of the repository, we decided to follow international requirements for data management, and applied the FAIR recommendations so that the distributed data are Findable, Accessible, Interoperable and Reusable.
Whether legacy or more recent, academic or industrial, the data consist mainly of seismological and hydraulic data acquired during stimulation and circulation phases. They are collected from data providers, curated, converted into standardized (community-shared) formats and documented with metadata. Data are identified with a DOI and are findable via a local geo-catalogue; metadata are also harvested by the EPOS TCS-AH platform.
As industrial partners provide some of the data, we set up an AAAI (authentication, authorization, and accounting infrastructure); data are distributed according to (1) the affiliation of the user (academic, industrial, etc.) and (2) the distribution rules set by the data providers. Interoperability is promoted through the use of open or community-shared data formats: SEED, CSV, PDF, etc. Open data are released under a Creative Commons license (CC-BY or CC-BY-NC) to allow their broad use.
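The two distribution criteria above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CDGP implementation: the class names, the `is_authorized` and `request_download` functions, the example DOIs and the assumption that openly licensed data are granted to every authenticated user are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

# Licenses named in the text for open data.
OPEN_LICENSES = frozenset({"CC-BY", "CC-BY-NC"})


@dataclass(frozen=True)
class User:
    login: str
    affiliation: str  # e.g. "academic" or "industrial" (hypothetical labels)


@dataclass(frozen=True)
class Dataset:
    doi: str
    license: Optional[str] = None  # None means no open license attached
    # Distribution rule set by the data provider (hypothetical representation).
    allowed_affiliations: FrozenSet[str] = frozenset()


# Accounting: one (login, doi, granted) record per request.
access_log: List[Tuple[str, str, bool]] = []


def is_authorized(user: User, dataset: Dataset) -> bool:
    """Authorization: open data for everyone, otherwise the provider's rule."""
    if dataset.license in OPEN_LICENSES:
        return True  # assumption: open data need no affiliation check
    return user.affiliation in dataset.allowed_affiliations


def request_download(user: User, dataset: Dataset) -> bool:
    """Decide on a request and record the decision for accounting."""
    granted = is_authorized(user, dataset)
    access_log.append((user.login, dataset.doi, granted))
    return granted


# Hypothetical example objects (placeholder DOIs, not real CDGP identifiers).
open_data = Dataset(doi="10.0000/example-open", license="CC-BY")
restricted = Dataset(
    doi="10.0000/example-restricted",
    allowed_affiliations=frozenset({"academic"}),
)
academic = User(login="alice", affiliation="academic")
industrial = User(login="bob", affiliation="industrial")
```

In this sketch, `request_download(industrial, restricted)` is refused because the provider's rule lists only academic users, while the same user can obtain `open_data`; every decision, granted or not, is appended to `access_log`.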
Although CDGP was set up recently, some of the data are vintage: we had to deal with obsolete tapes and formats in order to convert the data and archive them on modern electronic media. Identifying the owners is sometimes difficult, but it is necessary in order to obtain the distribution rules. A Data Management Plan (DMP) defines all the tasks and the rules for performing them. We are currently preparing for CoreTrustSeal certification.
Examples of the use of the distributed data will be presented. One of them is the reinvestigation of the development of micro-seismicity during the 1993 stimulation. It shows that in areas where aseismic slip on pre-existing faults has been evidenced, only small rupture sizes are observed, whereas in the part of the reservoir where seismicity is related to the creation of new fractures, a wider distribution and larger rupture sizes are promoted. This has implications for detecting the transition between events related to pre-existing faults and the onset of fresh fractures.