CMIP6 Data Citation and Long-Term Archival
- cmip6cite.wdc-climate.de -¶
I. Data Citation / Data References¶
Motivation and Concept¶
For the evolving CMIP6 data disseminated by the ESGF (Earth System Grid Federation), WGCM CMIP6 has requested an early citation reference. These data references have to meet the requirements of Force 11’s ‘Joint Declaration of Data Citation Principles’. The major publishers in Earth Sciences have signed the Commitment to Enabling FAIR Data in the Earth, Space, and Environmental Sciences and are changing their author guidelines accordingly. For the CMIP6 data subset transferred to the long-term archive of the IPCC DDC (Data Distribution Centre) AR6, the established DataCite data publication is planned. Data citations are provided at model/MIP and simulation granularities to meet different citation requirements in the literature.
Details of the concept are described in the WIP White Paper.
Survey Results (2021): https://doi.org/10.5281/zenodo.5534136
Figure 1: Citation Counts by publisher (Source: https://corpus.datacite.org/dashboard, 2024-09-25).
Statistics of DOI registrations: http://bit.ly/CMIP6_DOI_Statistic
DataCite statistics for repository ESGF: https://commons.datacite.org/repositories/8orcv25
IPCC WGI AR6 Usage of CMIP6 Data: https://bit.ly/CMIP6_in_IPCC
Information for Data Users¶
Find Data References (see also the blog post at https://cmip6cite.blogspot.com/2020/04/how-to-find-cmip6-data-citations.html): Core citation information is displayed in the ESGF portal, including a link to the landing page with complete citation information. CMIP6 data citations are also discoverable via:
- CMIP6 Data Citation Search: http://bit.ly/CMIP6_Citation_Search
- CMIP6 Data Citation Search API (documentation: https://www.wdc-climate.de/ui/cmip-api-docs/): http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search
Sample Calls:
- Data references on experiment (fine) granularity for a given source_id and activity_id:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=CMIP&sourceId=HadGEM3-GC31-MM&granularity=exp
- Update the data references of the first call with data references published at or after a given date:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=CMIP&sourceId=HadGEM3-GC31-MM&granularity=exp&gePublicationDate=2020-01-01
- Data references on model/MIP (coarse) granularity contributing to an activity_id available at a given snapshot date:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=ScenarioMIP&granularity=model&lePublicationDate=2020-03-31
- DataCite Commons
Query example for MPI-M: https://commons.datacite.org/doi.org?query=client.uid:dkrz.esgf+MPI-M
- Google's Dataset Search.
The auto-completion supports DRS_ids.
- CMIP6 overview tables per MIP and per experiment
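The sample calls for the CMIP6 Data Citation Search API differ only in their query parameters, so they can be assembled programmatically. The following is a minimal Python sketch, not part of the citation service itself: the function name `build_search_url` and its argument names are hypothetical, while the endpoint and the parameter names (`mipEra`, `activityId`, `sourceId`, `granularity`, `gePublicationDate`, `lePublicationDate`) are taken from the sample calls. It only builds the URLs and does not contact the service.

```python
from urllib.parse import urlencode

# Endpoint as listed on this page.
CMIP6_SEARCH_URL = "http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search"


def build_search_url(activity_id, source_id=None, granularity="exp",
                     published_from=None, published_until=None,
                     mip_era="CMIP6"):
    """Assemble a cmip6search URL (hypothetical helper).

    Only the query parameters that appear in the sample calls are used.
    Dates are passed through as YYYY-MM-DD strings.
    """
    params = {"mipEra": mip_era, "activityId": activity_id}
    if source_id is not None:
        params["sourceId"] = source_id
    params["granularity"] = granularity  # "exp" (fine) or "model" (coarse)
    if published_from is not None:
        params["gePublicationDate"] = published_from   # published at or after
    if published_until is not None:
        params["lePublicationDate"] = published_until  # snapshot date
    return f"{CMIP6_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # The three sample calls, in the order they are listed.
    print(build_search_url("CMIP", source_id="HadGEM3-GC31-MM"))
    print(build_search_url("CMIP", source_id="HadGEM3-GC31-MM",
                           published_from="2020-01-01"))
    print(build_search_url("ScenarioMIP", granularity="model",
                           published_until="2020-03-31"))
```

The resulting URLs can be opened in a browser or passed to any HTTP client. The response format is described in the API documentation linked above and is not assumed here.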
Information for Modeling Centers¶
- General CMIP6 Guidance document for modeling centers: https://pcmdi.llnl.gov/CMIP6/Guide/modelers.html
- How to get started: WIP_letter_to_provide_citation_managers
- Maintaining citation information (see User Guide [PDF]):
- GUI for insert/update of core information and model/MIP citation information: http://cera-www.dkrz.de/citeXA
The GUI is based on Oracle Application Express (APEX) and makes extensive use of 'Interactive Grids' (see chapter 2 of the APEX End User's Guide).
- API client for script-based insert/update of experiment citation information (outdated; shut down on 2022-11-01)
- Best Practices
- FAQ
- Quality and curation measures: https://bit.ly/CMIP6_Citation_Quality
Information for ESGF Data Node Managers and other infrastructure developers¶
Available APIs are documented at https://www.wdc-climate.de/ui/cmip-api-docs/. This partly replaces the slightly outdated specification documentation of all technical components of the citation service and the long-term archival in the DDC: http://bit.ly/1XsVOoz [outdated].
Data Node Managers should check the completeness of the data reference information. The citation service provides a specific API for this:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations?
Other APIs are provided for machine access to the CMIP6 data citations.
CMIP6 Data Citation Service Development¶
Figure 2: Citation Workflows and API for Data Citation Information Access.
The integration of the service in the ESGF was co-ordinated by the ESGF-QCWT (Quality Control Working Team).
Components:
- Citation Information collected from Data Creators via Oracle APEX GUI: http://cera-www.dkrz.de/citeXA and API client (User Guide [PDF])
- Citation Information Storage in Oracle Database and HTML Landing Page:
Landing page example for CMIP6: http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.CMIP.IPSL.IPSL-CM6A-LR
- API for Citation Information Access with Display Option (HTML Landing Page) and Data Option (DataCite XML, JSON):
XML example for CMIP6: http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.CMIP.IPSL.IPSL-CM6A-LR.xml
JSON example for CMIP6: http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.CMIP.IPSL.IPSL-CM6A-LR.json
API Client (Python): https://swiftbrowser.dkrz.de/public/dkrz_11279a46963f4201bae564a253d528cc/Citation_API_Client/
- Citation Information provided on OAI Server as DataCite XMLs
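The landing page, XML and JSON examples above share one base URL and differ only in their suffix. The following Python sketch illustrates that pattern; it is not the API client linked above. The names `citation_urls` and `fetch_citation_json` are hypothetical, and the URL pattern is inferred from the three examples only. The structure of the JSON response is not assumed: the parsed document is returned unchanged, and the request will fail if the service is not reachable.

```python
import json
from urllib.request import urlopen

# Base URL inferred from the landing page, XML and JSON examples above.
META_BASE_URL = "http://cera-www.dkrz.de/WDCC/meta/CMIP6"


def citation_urls(drs_id):
    """Return the display (HTML) and data (XML, JSON) URLs for a DRS id.

    Hypothetical helper; the suffix pattern is an assumption based on
    the examples given for CMIP6.CMIP.IPSL.IPSL-CM6A-LR.
    """
    return {
        "html": f"{META_BASE_URL}/{drs_id}",
        "xml": f"{META_BASE_URL}/{drs_id}.xml",
        "json": f"{META_BASE_URL}/{drs_id}.json",
    }


def fetch_citation_json(drs_id, timeout=30):
    """Download and parse the JSON citation record for a DRS id.

    Performs a network request; no field names of the response are assumed.
    """
    with urlopen(citation_urls(drs_id)["json"], timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    for kind, url in citation_urls("CMIP6.CMIP.IPSL.IPSL-CM6A-LR").items():
        print(kind, url)
```

Keeping URL construction separate from the download makes the pattern easy to check without network access.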
II. CMIP6 Long-Term Archival and IPCC DDC AR6¶
IPCC DDC AR6 Reference Data Archive (http://www.ipcc-data.org)¶
The long-term archival in the DDC adds a level of curation and long-term availability to the data according to the TRUST principles (Transparency, Responsibility, User Focus, Sustainability, Technology; Lin et al., 2020, https://doi.org/10.1038/s41597-020-0486-7). This ensures that FAIR data remain FAIR. It implements the IPCC FAIR Guidelines (Pirani et al., 2022). Tasks are:
- Metadata Enrichment with information on model, simulation, experiment, errata, citation, quality based on the CMIP6 Controlled Vocabulary and ancillary metadata resources available via the ESGF index
- Long-Term Archival (LTA) of CMIP6 snapshot data at DKRZ and DataCite DOI registration
- Technical Quality Control checks for
- intermediate datasets
- Quality_Checks of CMIP6 input datasets
- Transfer of LTA data into IPCC DDC AR6 and documentation on web pages
- An overview of the DDC activities at DKRZ is provided at: https://ipcc.wdc-climate.de
- Data download statistics are provided at: https://cera-www.dkrz.de/ui/statistics_index (API access is documented at: https://www.wdc-climate.de/ui/cmip-api-docs/ )
- Guidance documents: https://zenodo.org/communities/ipcc-ar6/
- Overview of software and data used in the IPCC WGI AR6: https://github.com/IPCC-WG1
- DDC Data Catalog: https://ipcc-browser.ipcc-data.org/
- Documentation of the archival process and the content of the AR6 Reference Data Archive: https://github.com/IPCC-WG1/DDC-AR6-CMIP6-Data-Archival
IPCC DDC support of IPCC AR6 Working Groups and authors¶
The DDC supports the IPCC authors during the preparation of the AR6 via:
- DDC Support Activities
- Virtual Workspace at DKRZ
- Additional non-CMIP6 input/source data that are not long-term archived in a TRUSTed repository can be archived in the IPCC DDC as supplementary data.
(CMIP6 input data for AR6: initial data content suggestion from 2017-06-12: http://bit.ly/2oNXdNs ; NOAA set up another spreadsheet collecting data requirements from IPCC WGI chapters: https://goo.gl/tVaGko )
References¶
Publications¶
P. J. Durack, K. E. Taylor, V. Eyring, S. K. Ames, T. Hoang, D. Nadeau, C. Doutriaux, M. Stockhause, P. J. Gleckler (2018): Toward standardized data sets for climate model experimentation. Eos, 99. doi:10.1029/2018EO101751 . Published on 02 July 2018.
D. Lin, J. Crabtree, I. Dillo et al. (2020): The TRUST Principles for digital repositories. Sci Data 7, 144. doi:10.1038/s41597-020-0486-7 .
M. Mizielinski, M. S., Durack, P. J., Taylor, K. E., Dunne, J., Ellis, D., Juckes, M., Stockhause, M., & Turner, B. (2024): CMIP6Plus Whitepaper. Zenodo. https://doi.org/10.5281/zenodo.13768495 .
R. Petrie, Denvil, S., Ames, S., Levavasseur, G., Fiore, S., Allen, C., Antonio, F., Berger, K., Bretonnière, P.-A., Cinquini, L., Dart, E., Dwarakanath, P., Druken, K., Evans, B., Franchistéguy, L., Gardoll, S., Gerbier, E., Greenslade, M., Hassell, D., Iwi, A., Juckes, M., Kindermann, S., Lacinski, L., Mirto, M., Nasser, A. B., Nassisi, P., Nienhouse, E., Nikonov, S., Nuzzo, A., Richards, C., Ridzwan, S., Rixen, M., Serradell, K., Snow, K., Stephens, A., Stockhause, M., Vahlenkamp, H., and Wagner, R. (2021): Coordinating an operational data distribution network for CMIP6 data, Geosci. Model Dev., 14, 629–644, doi:10.5194/gmd-14-629-2021 .
Anna Pirani, Andrés Alegria, Alaa Al Khourdajie, Wawan Gunawan, José Manuel Gutiérrez, Kirstin Holsman, David Huard, Martin Juckes, Michio Kawamiya, Nana Klutse, Volker Krey, Robin Matthews, Adam Milward, Charlotte Pascoe, Gerard van der Shrier, Alessandro Spinuso, Martina Stockhause, Xiaoshi Xing (2022): The implementation of FAIR data principles in the IPCC AR6 assessment process. Zenodo. https://doi.org/10.5281/zenodo.6504469 .
Pirani, A., Cammarano, D., Fisher, E., Krüss, B., Matthews, R., Pascoe, C., Sitz, L., & Stockhause, M. (2022): Experience in the Implementation of FAIR Data Principles in the WGI AR6 Assessment (Version 1). Zenodo. https://doi.org/10.5281/zenodo.6992173 .
S. Stall, G. Bilder, M. Cannon, N. C. Hong, S. Edmunds, C. C. Erdmann, M. Evans, R. Farmer, P. Feeney, M. Friedman, M. Giampoala, R. B. Hanson, M. Harrison, D. Karaiskos, D. S. Katz, V. Letizia, V. Lizzi, C. MacCallum, A. Muench, K. Perry, H. Ratner, U. Schindler, B. Sedora, M. Stockhause, R. Townsend, J. Yeston, T. Clark. Journal Production Guidance for Software and Data Citations. Sci Data 10, 656 (2023). https://doi.org/10.1038/s41597-023-02491-7
M. Stockhause, Ames, S., Dingley, B., Lawrence, B., Liang, H.-C., Liu, Y., Parton, G., & Radhakrishnan, A. (2024): Recommendations for a sustainable data citation service for CMIP. Zenodo. https://doi.org/10.5281/zenodo.13748704 .
M. Stockhause (2021): CMIP6 Citation Service Survey Results. Zenodo. doi:10.5281/zenodo.5534136 .
M. Stockhause, M. Juckes, R. Chen, W. Moufouma Okia, A. Pirani, T. Waterfield, X. Xing, R. Edmunds (2019): Data Distribution Centre Support for the IPCC Sixth Assessment. Data Science Journal, 18(1), p.20. doi:10.5334/dsj-2019-020 .
M. Stockhause and M. Lautenschlager (2022): Twenty-five years of the IPCC Data Distribution Centre at the DKRZ and the Reference Data Archive for CMIP data, Geosci. Model Dev., 15, 6047–6058, doi:10.5194/gmd-15-6047-2022 .
M. Stockhause and M. Lautenschlager (2017): CMIP6 Data Citation of Evolving Data. Data Science Journal. 16, p.30. doi:10.5334/dsj-2017-030 .
M. Stockhause, F. Toussaint, M. Lautenschlager (2015): CMIP6 Data Citation and Long-Term Archival. WIP white paper. Zenodo. doi:10.5281/zenodo.35178 .
M. Stockhause, H. Höck, F. Toussaint, and M. Lautenschlager (2012): Quality assessment concept of the World Data Center for Climate and its application to CMIP5 data. Geosci. Model Dev., 5, 1023-1032. doi:10.5194/gmd-5-1023-2012 .
Intergovernmental Panel on Climate Change. (2023): TG-Data Recommendations for AR7 (1.0). Zenodo. https://doi.org/10.5281/zenodo.10059282 .
Talks¶
M. Stockhause, Pirani, A., Sitz, L., Krüss, B., Pascoe, C., MacRae, M., Anderson, E., & Fisher, E. (2023, October 25): Implementation of the IPCC FAIR Guidelines into the Sixth Assessment Report (AR6): benefit, challenges and recommendations for AR7. International Data Week 2023 (IDW23), Salzburg, Austria. Zenodo. https://doi.org/10.5281/zenodo.10039597 .
M. Stockhause, Matthews, Robin, Pirani, Anna, Treguier, Anne Marie, and Yelekci, Ozge (2021): CMIP6 data documentation and citation in IPCC's Sixth Assessment Report (AR6). Presented at the vEGU2021, Zenodo. doi:10.5281/zenodo.4837277 .
M. Stockhause, A. Al Khourdajie, A. Alegria, R. Chen, D. Huard, M. Juckes, C. Pascoe, A. Pirani, R. Matthews, E. Poloczanska, S. Vicuna, X. Xing, Ö. Yelekçi (2020): IPCC Sixth Assessment approaches towards FAIR data and an enhanced data reuse. AGU 2020, virtual. Earth and Space Science Open Archive. doi:10.1002/essoar.10504799.1 .
M. Stockhause, M. Lautenschlager (2019): The importance of data references in CMIP6 data usage and IPCC climate assessments. CMIP6 Model Analysis Workshop, 25-28 March 2019, Barcelona, Spain. Zenodo. doi:10.5281/zenodo.2621084 .
M. Stockhause, M. Lautenschlager (2017): Data citation in climate sciences: Improvements in CMIP6 compared to CMIP5. AGU 2017, New Orleans, USA. Zenodo. doi:10.5281/zenodo.1148953 .
M. Stockhause, M. Lautenschlager (2017): CMIP6 Data Citation and IPCC Data Distribution Centre Services. PICO presentation at EGU 2017, Vienna, Austria. Zenodo. doi:10.5281/ZENODO.5696466 .
M. Stockhause (2016): Long-term archiving Workflow for CMIP6. Zenodo. doi:10.5281/ZENODO.157358 .
K. Berger, G. Levavasseur, M. Stockhause, and M. Lautenschlager (2015): Integration of external metadata into the Earth System Grid Federation (ESGF). PICO presentation at EGU 2015, Vienna, Austria. http://presentations.copernicus.org/EGU2015-8404_presentation.pptx .
M. Stockhause (2014): Long-term archiving workflow in CMIP5 - a first review. Zenodo. doi:10.5281/zenodo.29104 .
Materials¶
Data Citation: k204082_ESGF2015_DCLTA_poster.pdf (ESGF Conference 2015, Monterey, CA, USA)
Data Citation: ESGF2016_DC_poster.pdf (ESGF Conference 2016, Washington DC, USA)
IPCC DDC / LTA: ESGF2016_DDC_poster.pdf (ESGF Conference 2016, Washington DC, USA)
Data Citation, IPCC DDC: Poster_CMIP6WS_datacitation_stockhause_lautenschlager.pdf, Poster_CMIP6WS_IPCCDDC_Juckes_etal.pdf (CMIP6 Model Analysis Workshop, 25-28 March 2019, Barcelona, Spain)
Blog¶
CMIP6 Citation Service: https://cmip6cite.blogspot.com/
OGC Climate Change Special Session: https://www.ogc.org/blog/4653
Links¶
CMIP5 QC: http://cmip5qc.wdc-climate.de
CMIP6: https://pcmdi.llnl.gov/CMIP6/
CMIP6 Citation (this page): http://cmip6cite.wdc-climate.de
CMIP6 Data Citation Search: http://bit.ly/CMIP6_Citation_Search
CMIP6 DOI Statistics: http://bit.ly/CMIP6_DOI_Statistic
DataCite: http://datacite.org
ESGF: https://esgf.llnl.gov/
input4MIPs: https://pcmdi.llnl.gov/mips/input4MIPs/
IPCC AR6 FAIR: https://zenodo.org/communities/ipcc-ar6
IPCC AR6 WGI Data and Code: https://github.com/IPCC-WG1
IPCC DDC: http://www.ipcc-data.org
IPCC DDC at DKRZ: http://ipcc.wdc-climate.de
WGCM CMIP6: http://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6
Acknowledgement¶
Martina Stockhause thanks Sasha Ames and the Lawrence Livermore National Laboratory (LLNL) for providing access to LLNL's ESGF Node in mid-2024 on short notice, thus enabling the continuation of the CMIP6 Citation Service as a joint DKRZ-LLNL service until its coordinated finalization at the end of 2024.