Biogeosciences An interactive open-access journal of the European Geosciences Union
© Author(s) 2019. This work is distributed under
the Creative Commons Attribution 4.0 License.

Submitted as: research article | 25 Jul 2019

Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Biogeosciences (BG).

A robust data cleaning procedure for eddy covariance flux measurements

Domenico Vitale1, Gerardo Fratini2, Massimo Bilancia3, Giacomo Nicolini1, Simone Sabbatini1, and Dario Papale1,4
  • 1Department for Innovation in Biological, Agro-Food and Forest Systems (DIBAF), University of Tuscia, via San Camillo de Lellis, 01100 Viterbo, Italy
  • 2LI-COR Biosciences Inc., Lincoln, Nebraska 68504, USA
  • 3Ionian Department of Law, Economics and Environment, University of Bari Aldo Moro, Via Lago Maggiore angolo Via Ancona, 74121 Taranto, Italy
  • 4Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC), 01100 Viterbo, Italy

Abstract. Integration of long-term eddy covariance (EC) flux datasets over regional and global scales requires a high degree of comparability among flux data measured at different stations. This entails not only similarly performing instrumentation and its appropriate deployment, but also standardized and reproducible data processing and quality control (QC) procedures. This work focuses on the latter and, in particular, on the development of a robust data cleaning procedure. The proposed strategy includes a set of tests aimed at detecting specific sources of systematic error in the data, as well as an outlier detection procedure aimed at identifying aberrant flux values. Results from the tests and from outlier detection are integrated so as to leave a large degree of flexibility in the choice of tests and of test threshold values without loss of efficacy and, at the same time, to avoid subjective criteria in the decision rule that specifies whether to retain or reject flux data of dubious quality. Test development was rooted in advanced time series analysis techniques that consider the stochastic properties of both raw, high-frequency EC data and flux time series, such as complex dynamics, high persistence and the possible presence of stochastic trends. The performance of each proposed test was evaluated by means of Monte Carlo simulations on synthetic datasets, whereas their impact on observed time series was evaluated on a selection of EC datasets distributed by the ICOS research infrastructure. Simulation results showed that the proposed tests perform better than existing alternative QC routines, with lower false positive and false negative error rates. Applying the proposed tests to real datasets led to an effective cleaning of EC flux data while retaining the maximum number of good-quality data.
Although there is still room for improvement, in particular through the development of new QC tests, we think that the proposed data cleaning procedure can serve as a basis for a unified QC strategy for EC datasets which i) includes only completely data-driven routines and is therefore suitable for automatic and centralized data processing pipelines, ii) guarantees reproducibility of results and iii) is flexible and scalable enough to accommodate new and additional tests, which also makes the approach suitable for other greenhouse gases.
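
The abstract does not spell out the decision rule that combines test results with outlier detection. As a purely illustrative sketch, and not the authors' implementation, one way such a rule could avoid subjective criteria is shown below. The three evidence levels, the `Evidence` and `retain_flux` names, and the rule itself (reject on severe evidence, reject dubious data only when they are also outliers) are all assumptions made for this example.

```python
from enum import IntEnum
from typing import Iterable


class Evidence(IntEnum):
    """Hypothetical three-level outcome of a single QC test (an assumption)."""
    NONE = 0      # no evidence of systematic error
    MODERATE = 1  # data of dubious quality
    SEVERE = 2    # clear evidence of systematic error


def retain_flux(test_outcomes: Iterable[Evidence], is_outlier: bool) -> bool:
    """Return True if a flux value is kept under the assumed rule.

    The worst outcome across all tests drives the decision, so tests can be
    added or removed without changing the rule itself.
    """
    worst = max(test_outcomes, default=Evidence.NONE)
    if worst == Evidence.SEVERE:
        return False           # always rejected
    if worst == Evidence.MODERATE:
        return not is_outlier  # rejected only if also flagged as an outlier
    return True                # no test raised a concern
```

Because the rule depends only on the worst test outcome and on a separate outlier flag, changing the set of tests or their thresholds does not require changing the rule, which is the kind of flexibility the abstract describes.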

Interactive discussion
Status: final response (author comments only)
Total article views: 463 (including HTML, PDF, and XML)
  • HTML: 347
  • PDF: 113
  • XML: 3
  • Total: 463
  • Supplement: 23
  • BibTeX: 7
  • EndNote: 9
Latest update: 12 Dec 2019
Short summary
This work describes a data cleaning procedure for the detection of eddy covariance fluxes affected by systematic errors. We believe that the proposed procedure can serve as a basis for a unified quality control strategy suitable for centralized data processing pipelines, where completely data-driven and scalable procedures that guarantee high quality standards and reproducibility of the released products are an essential prerequisite.