Climate Data Processing Software

The following is by Dennis Shea (NCAR)

A common question: "What is the best software to use for climate data processing?"



There is no simple answer. All software tools and languages have strengths and weaknesses. For large-scale data processing across a variety of data sets, data formats and project requirements, it is unlikely that a perfect tool or language exists. Often, a combination of software tools and languages will be needed. Climate data processing involves three components: (1) file handling (I/O); (2) processing (data manipulation and computations); and (3) graphics (visualization). Three categories of software are used for climate data processing and visualization: (1) compiled languages (e.g., Fortran, C, C++); (2) command line operators and viewers (NCO, CDO, ncview, Panoply); and (3) interpreted languages (NCL, GrADS, Ferret, R, Generic Mapping Tools (GMT), Perl Data Language (PDL), Python [CDAT/PyNIO/PyNGL/Numpy/matplotlib], and the commercial products Matlab, IDL and, to a lesser extent, PV-Wave).

 

Compiled languages can be much faster than interpreted languages for large, computation-bound tasks. Compilers analyze and optimize code and generate machine-specific instructions. As a result, they can execute loops (iterations) faster. For example, weather forecast and climate models are often written in Fortran (usually f90). However, compiled languages lack built-in support for the different data formats used in climate studies, and they have no built-in graphics. Further, programming in compiled languages can be tedious.
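
The looping overhead mentioned above can be illustrated with a minimal Python sketch. It is not a benchmark of any particular climate tool: it simply computes the same average twice, once with an explicit interpreted loop and once with the built-in sum(), whose loop runs in compiled C code in the standard CPython interpreter. The temperature values and function names are made up for illustration, and the measured times will vary by machine.

```python
import timeit


def mean_with_python_loop(values):
    """Average computed with an explicit loop executed by the interpreter."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def mean_with_builtin(values):
    """Same average; the loop runs inside the compiled built-in sum()."""
    return sum(values) / len(values)


# Hypothetical data: one year of daily temperatures (K) at 1000 grid points,
# flattened into a single list.
temperatures = [273.15 + (i % 40) * 0.5 for i in range(365 * 1000)]

loop_seconds = timeit.timeit(
    lambda: mean_with_python_loop(temperatures), number=5
)
builtin_seconds = timeit.timeit(
    lambda: mean_with_builtin(temperatures), number=5
)

print(f"interpreted loop: {loop_seconds:.3f} s for 5 runs")
print(f"compiled built-in: {builtin_seconds:.3f} s for 5 runs")
```

Interpreted languages commonly narrow this gap in the same way, by handing the inner loops to compiled library routines rather than running them one iteration at a time in the interpreter.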



Command line operators (CLOs) are tools that can be executed directly at the system command prompt. There are many NCO and CDO operators, and there is some functional overlap between them. Each operator is designed to perform a specific task efficiently. For example, the NCO operator "ncra" can read one or more netCDF files, compute time averages (means) of all or selected variables in the file(s), and save the results to a netCDF file. It is not uncommon to use one NCO/CDO operator to accomplish a specific task and then feed the output file to a different CDO/NCO operator. Ncview is a commonly used visual browser for netCDF format files.
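
As a rough sketch of the chaining described above, the following Python snippet assembles an "ncra" command followed by a CDO "fldmean" (area mean) command that takes the first command's output as its input. The file names, the variable names T and PS, and the helper functions are hypothetical. The snippet only prints the commands; the run_pipeline helper executes them only when called, and only if NCO and CDO are installed.

```python
import shutil
import subprocess


def ncra_command(input_files, output_file, variables=None):
    """Build an ncra call that time-averages the records in input_files."""
    cmd = ["ncra", "-O"]  # -O: overwrite an existing output file
    if variables:
        cmd += ["-v", ",".join(variables)]  # -v: restrict to these variables
    return cmd + list(input_files) + [output_file]


def cdo_fldmean_command(input_file, output_file):
    """Build a cdo call that computes the area mean of each field."""
    return ["cdo", "fldmean", input_file, output_file]


def run_pipeline(commands):
    """Run the commands in order; stop and return False if a tool is missing."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            print(f"{cmd[0]} not found on PATH; skipping: {' '.join(cmd)}")
            return False
        subprocess.run(cmd, check=True)
    return True


# Hypothetical input files; replace with real netCDF files before running.
yearly_files = ["model_1990.nc", "model_1991.nc"]

pipeline = [
    ncra_command(yearly_files, "time_mean.nc", variables=["T", "PS"]),
    cdo_fldmean_command("time_mean.nc", "area_mean.nc"),
]

for cmd in pipeline:
    print(" ".join(cmd))
```

Calling run_pipeline(pipeline) would execute the two steps in order, with time_mean.nc serving as the intermediate file passed from NCO to CDO.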



Interpreted languages are general purpose software tools. They can read and write assorted data formats, provide many built-in computational functions, and create visualizations. These tools have all the capabilities of the CLOs and ncview and can do much more. However, they do require users to enter commands interactively or via a script.

Within NCAR's Climate Analysis Section, terabytes of model output, observationally based data sets such as the reanalysis products (ERA-Interim, MERRA, NCEP-NCAR, JRA, ...) and satellite data are analyzed, evaluated and used as the basis for publications. The data are in a variety of formats, including netCDF-3/4, GRIB-1/2, HDF4, HDF4-EOS, HDF5 and HDF5-EOS. The primary post-processing tools used are NCL and the NCO. In some cases, data created by NCL/NCO are input to R for certain statistical methods not available within NCL. Depending upon the application, the CDO, IDL, Python and Matlab are also used.

 

Recommendation: If a desired operation can be performed by a CLO (CDO or NCO), we recommend that the appropriate operator be used. Why? Only because it can be more convenient, since no programming is necessary. However, as with programming in compiled or interpreted languages, using a CLO can sometimes require users to experiment to find the appropriate options.

Cite this page

National Center for Atmospheric Research Staff (Eds). Last modified 03 Nov 2017. "The Climate Data Guide: Climate Data Processing Software." Retrieved from https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/climate-data-processing-software.

Acknowledgement of any material taken from this page is appreciated. On behalf of experts who have contributed data, advice, and/or figures, please cite their work as well.