netCDF Overview

The following is by Dennis Shea (NCAR):

NetCDF (Network Common Data Form) is designed to facilitate access to array-oriented scientific data. NetCDF is a portable, "self-describing" format: each file has a header that describes the layout of the rest of the file, in particular the data arrays, and that also holds arbitrary file metadata in the form of name/value attributes. This additional information about a file or variable is commonly called "metadata" (information about the data). In some cases, the file's contents were written following a standard netCDF convention, which assures users and automated software that certain rules were followed when the file was created. The two most commonly used netCDF conventions for climate data are the COARDS and CF conventions. The COARDS convention, created in 1995, is simple and easy to read. The CF convention, created in 2003 to address the evolution of models and data sets, is a superset of the COARDS convention. Its language is quite precise; unfortunately, the CF documentation can read like a legal document. A netCDF file can be checked for CF compliance with the NCAS-CMS cf-checker.

Many commercial and public software tools support netCDF. Further, there are APIs (Application Programming Interfaces) for a number of compiled languages such as Fortran, C and C++.

netCDF is the format most commonly used for climate model generated data. The following is a sample 'dump' of a typical netCDF file created by a climate model. It has several components: (a) dimension names; (b) the sizes of the named dimensions; (c) the variables on the file, which often include additional information about each variable and temporal/spatial coordinates; and (d) global attributes, which contain information about the file's contents. This sample file contains coordinate variables (CVs), which are defined as one-dimensional variables with the same name as a dimension. CVs should not have any missing data (for example, no _FillValue or missing_value attributes) and must be strictly monotonic (values strictly increasing or strictly decreasing). In the following, time(time), lat(lat), lon(lon) and lev(lev) are classified as coordinate variables, while PS, T and LSMASK are classified as variables.
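The coordinate-variable rules above can be written as a small check. The following is a minimal sketch in plain Python, not part of any netCDF library: the function name `is_coordinate_variable` and its arguments are hypothetical, and the variable's values are assumed to be already loaded into a list.

```python
def is_coordinate_variable(name, dims, values, fill_value=None):
    """Return True if a variable satisfies the coordinate-variable rules.

    name       -- variable name, e.g. "lat"
    dims       -- tuple of the variable's dimension names, e.g. ("lat",)
    values     -- sequence of the variable's numeric values
    fill_value -- optional value that marks missing data (hypothetical argument)
    """
    # Rule 1: one-dimensional, and the dimension has the same name as the variable.
    if len(dims) != 1 or dims[0] != name:
        return False
    # Rule 2: no missing data (no NaN, no value equal to the fill value).
    for v in values:
        if v != v or (fill_value is not None and v == fill_value):
            return False
    # Rule 3: strictly monotonic (strictly increasing or strictly decreasing).
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing
```

For example, `is_coordinate_variable("lat", ("lat",), [-90.0, 0.0, 90.0])` returns `True`, while a variable dimensioned `("time", "lat", "lon")` such as PS, or a longitude array with a repeated value, returns `False`.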

The UNLIMITED dimension size is used when a variable can grow to any length along that dimension. The unlimited dimension index is like a record number in conventional record-oriented files. Some software tools (e.g., the netCDF Operators) require an unlimited dimension to perform certain operations. It is recommended that netCDF files be created with an unlimited dimension.

There are two versions of netCDF: netCDF-3 and netCDF-4. The netCDF-3 data model was used for many years and is often referred to as netCDF-classic. However, as datasets became larger, grids became more complicated, and users wanted more flexibility, the limitations of the netCDF-3 data model became apparent. For example, netCDF-3 does not support compression, string variables or parallel processing. To address these issues, netCDF-4 (nc4) was created. An nc4 file is a hybrid: a subset of HDF5 with netCDF-3 style API interfaces used to create and access the data. Hence, an nc4 file is actually an HDF5 file 'under the hood'. Many of the commonly used post-processing tools can readily handle netCDF-4 files.
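Because an nc4 file is an HDF5 file underneath, the two versions can usually be told apart from the first few bytes of the file. The sketch below uses only the Python standard library; the function names are hypothetical. It assumes the HDF5 signature sits at the very start of the file (HDF5 also permits it at later offsets), and an HDF5 signature alone does not prove that the file was written through the netCDF API.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte signature at the start of an HDF5 file

# A classic-format netCDF file starts with b"CDF" followed by a version byte.
CLASSIC_VERSIONS = {
    1: "netCDF-3 classic",
    2: "netCDF-3 64-bit offset",
    5: "CDF-5 (64-bit data)",
}


def netcdf_kind(first_bytes):
    """Classify a file from its first 8 bytes and return a short label."""
    if first_bytes[:8] == HDF5_SIGNATURE:
        return "HDF5 (possibly netCDF-4)"
    if first_bytes[:3] == b"CDF" and len(first_bytes) >= 4:
        return CLASSIC_VERSIONS.get(first_bytes[3], "unknown")
    return "unknown"


def netcdf_kind_of_file(path):
    """Read the first 8 bytes of the file at 'path' and classify them."""
    with open(path, "rb") as f:
        return netcdf_kind(f.read(8))
```

Calling `netcdf_kind_of_file("sample.nc")` on a file (the file name here is only a placeholder) returns one of the labels above.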

 


dimensions:
        lat = 96 ;                                          <=== sizes of named dimensions
        lon = 144 ;
        lev = 30 ;
        time = UNLIMITED ; // (1 currently)

variables:        

        double time(time) ;                                 <=== coordinate variable of type 'double'
                time:long_name = "time" ;                   <=== attribute (COARDS, CF)
                time:units = "days since 0001-01-01 00:00:00" ;   <=== attribute (COARDS, CF; udunits)
                time:calendar = "noleap" ;                  <=== attribute (COARDS, CF)
                time:bounds = "time_bnds" ;                 <=== attribute (COARDS, CF)


        double lat(lat) ;                                   <=== coordinate variable
                lat:long_name = "latitude" ;
                lat:units = "degrees_north" ;
        double lon(lon) ;                                   <=== coordinate variable
                lon:long_name = "longitude" ;
                lon:units = "degrees_east" ;
        float lev(lev) ;                                    <=== coordinate variable of type 'float'
                lev:long_name = "hybrid level at midpoints (1000*(A+B))" ;
                lev:units = "level" ;
                lev:positive = "down" ;

        float PS(time, lat, lon) ;                          <=== variable
                PS:units = "Pa" ;
                PS:long_name = "Surface pressure" ;
                PS:cell_methods = "time: mean" ;

        float T(time, lev, lat, lon) ;
                T:units = "K" ;
                T:long_name = "Temperature" ;
                T:cell_methods = "time: mean" ;

        int LSMASK(lat, lon) ;
                LSMASK:information = "0-sea; 1-land, 2-ice shelf; 3-inland water" ;
                LSMASK:long_name = "Land-Sea mask" ;

// global attributes:
                :Conventions = "CF-1.0" ;

                :Comment = "This is a sample comment designed to illustrate that arbitrary text can be included." ;
                :source = "FOO" ;
                :case = "FOO_uw02r" ;
                :title = "Sample FOO File" ;
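The time coordinate in the dump above is stored as a number whose meaning comes from its units ("days since 0001-01-01 00:00:00") and calendar ("noleap", where every year has 365 days). Standard date libraries generally assume the Gregorian calendar, so the arithmetic is sketched here by hand. This is an illustration only: the function name `noleap_date` is hypothetical, the base date is assumed to be 1 January of `base_year`, and the time of day is ignored.

```python
import math

# Month lengths in the "noleap" calendar: February always has 28 days.
DAYS_PER_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def noleap_date(days_since_base, base_year=1):
    """Convert 'days since <base_year>-01-01 00:00:00' to (year, month, day)
    in a 365-day calendar. Any fractional part of a day is dropped."""
    whole_days = math.floor(days_since_base)
    year = base_year + whole_days // 365
    day_of_year = whole_days % 365  # 0-based day within the year
    for month, length in enumerate(DAYS_PER_MONTH, start=1):
        if day_of_year < length:
            return year, month, day_of_year + 1
        day_of_year -= length


print(noleap_date(0))         # (1, 1, 1)
print(noleap_date(59))        # (1, 3, 1)
print(noleap_date(730000.5))  # (2001, 1, 1)
```

The third value works out because 730000 = 2000 x 365, so exactly 2000 model years have elapsed since year 1.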

 

Often, netCDF files contain coordinate variables and, additionally,  arrays that contain coordinates. As previously noted, the former must be monotonic and one-dimensional. The latter are multidimensional numeric arrays that describe (say) curvilinear grids. An example:

netcdf FOO2 {
dimensions:
        nlat = 384 ;                                        <=== sizes of named dimensions
        nlon = 320 ;
        time = UNLIMITED ; // (1200 currently)              <=== 1200 times on file
        z_t = 40 ;

variables:
        double time(time) ;                                 <=== coordinate variable (one-dimensional; monotonic; named dimension same as variable name)
                time:long_name = "time" ;
                time:units = "days since 0000-01-01 00:00:00" ;
                time:bounds = "time_bound" ;
                time:calendar = "noleap" ;

        double z_t(z_t) ;                                                                         <=== coordinate variable
                z_t:long_name = "depth from surface to midpoint of layer" ;   
                z_t:units = "centimeters" ;
                z_t:positive = "down" ;

        float TLAT(nlat, nlon) ;                            <=== array that contains coordinates
                TLAT:long_name = "array of t-grid latitudes" ;
                TLAT:units = "degrees_north" ;

        float TLONG(nlat, nlon) ;                           <=== array that contains coordinates
                TLONG:long_name = "array of t-grid longitudes" ;
                TLONG:units = "degrees_east" ;

        float TEMP(time, z_t, nlat, nlon) ;                 <=== variable
                TEMP:long_name = "Potential Temperature" ;
                TEMP:units = "degC" ;
                TEMP:coordinates = "TLONG TLAT z_t time" ;                      <=== CF convention 'coordinates' attribute
                TEMP:cell_methods = "time: mean" ;
                TEMP:_FillValue = 9.96921e+36f ;
                TEMP:missing_value = 9.96921e+36f ;
}
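On a curvilinear grid such as the one above, latitude and longitude depend on both grid indices, so a location cannot be found by searching a one-dimensional lat array and a one-dimensional lon array independently. The sketch below shows a brute-force nearest-point lookup over two-dimensional coordinate arrays like TLAT and TLONG. It is an illustration only: the function name is hypothetical, the arrays are tiny made-up lists rather than values from the file, and the distance is a crude squared difference in degrees rather than a great-circle distance.

```python
def nearest_grid_index(tlat, tlong, lat, lon):
    """Return the (j, i) index of the grid point closest to (lat, lon).

    tlat, tlong -- 2-D lists (rows of equal length) of grid latitudes/longitudes
    lat, lon    -- target location in degrees
    """
    best = None
    best_dist = float("inf")
    for j, (lat_row, lon_row) in enumerate(zip(tlat, tlong)):
        for i, (glat, glon) in enumerate(zip(lat_row, lon_row)):
            # Wrap the longitude difference into [-180, 180) degrees.
            dlon = (glon - lon + 180.0) % 360.0 - 180.0
            dist = (glat - lat) ** 2 + dlon ** 2
            if dist < best_dist:
                best, best_dist = (j, i), dist
    return best


# Made-up 2 x 3 grid in which both coordinates vary with both indices.
TLAT = [[10.0, 10.5, 11.0],
        [20.0, 20.7, 21.5]]
TLONG = [[100.0, 110.0, 120.0],
         [101.0, 112.0, 123.0]]

print(nearest_grid_index(TLAT, TLONG, 20.5, 111.0))  # (1, 1)
```

The returned pair indexes the (nlat, nlon) dimensions, so the matching data value would be taken from TEMP at [time, z_t, j, i].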

Cite this page

National Center for Atmospheric Research Staff (Eds). Last modified 31 Jul 2014. "The Climate Data Guide: netCDF Overview." Retrieved from https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/netcdf-overview.

Acknowledgement of any material taken from this page is appreciated. On behalf of experts who have contributed data, advice, and/or figures, please cite their work as well.