cdfsst - convert Reynolds SST binary data to Candis format
cdfsst reference-year reference-month reference-day mask_file_name
Cdfsst converts a Reynolds SST data file from binary format to Candis. The data consist of weekly averaged sea-surface temperatures over the whole globe on a 1 degree by 1 degree grid defined on grid centers. The Candis file has one variable slice per week of data.
The first three input parameters specify the year (4 digits), month, and day that mark Julian day zero in the output. The fourth parameter is the name of the file that contains the land-sea mask as a function of latitude and longitude. A copy of this file, called "land_sea_mask.dat", comes with the Candis distribution and is located in candis/general.
Each variable slice contains a starting date, a reference date, a Julian date relative to the reference date, and the number of days included in the average. These are scalar fields. The sea surface temperatures are contained in a two-dimensional variable whose dimensions are lat and lon.
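As an illustration of the relative Julian date, the following minimal Python sketch assumes it is simply the number of whole days from the reference date to the slice's starting date. That interpretation and the function name days_since_reference are assumptions made for this example; they are not taken from the cdfsst source.

```python
from datetime import date


def days_since_reference(start, reference):
    """Whole days from the reference date to the start date (assumed meaning)."""
    return (start - reference).days


# Reference date as it would be given on the cdfsst command line: 1997 4 1
reference = date(1997, 4, 1)
start = date(1997, 4, 6)

offset = days_since_reference(start, reference)
print(offset)  # 5
```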
The static slice contains a variable called lsmask. This variable is a two-dimensional variable whose dimensions are lat and lon. The value of lsmask is 1 if the corresponding latitude and longitude is over water and 0 if it is over land. This variable is derived from the land_sea_mask.dat file. The Reynolds SST data contain sea surface temperatures over land. This is to facilitate interpolation. However, if so desired, the land values may be set to zero by multiplying lsmask by the sst field with cdfmath.
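The effect of that multiplication can be shown with a small Python sketch. It does not use cdfmath or read a Candis file; the 2 by 2 arrays and the function name mask_land are made up purely to show the element-by-element product.

```python
def mask_land(sst, lsmask):
    """Multiply each temperature by its mask value (1 = water, 0 = land)."""
    return [
        [t * m for t, m in zip(sst_row, mask_row)]
        for sst_row, mask_row in zip(sst, lsmask)
    ]


# Hypothetical 2 x 2 example, indexed [lat][lon]
sst = [[25.0, 26.5], [24.0, 27.25]]
lsmask = [[1, 0], [1, 1]]

masked = mask_land(sst, lsmask)
print(masked)  # [[25.0, 0.0], [24.0, 27.25]]
```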
jay:~/$ cdfsst 1997 4 1 land_sea_mask.dat < infile > outfile
Cdfsst expects the data in a very particular binary format (see below). If the input file is not in this format or has been truncated, cdfsst will fail with an error message.
Since there is little other documentation of the binary format, a few words about it here should help programmers who need to modify this program in the future. The data were written by a Fortran program running on a big-endian machine. The Fortran program inserted extra formatting bytes (record-length markers) into the data.
The first four bytes of each week's data are a number (32) specifying how many bytes follow. After these four bytes there are 32 bytes which contain the begin date (year, month, day), the end date (year, month, day), the number of days, and the index. These are eight long integers of four bytes each. Then there is another set of four bytes specifying that 32 bytes preceded it.
The next four bytes specify another number (129,600) of bytes to follow. Then follow 129,600 bytes. This chunk contains the actual temperatures as short ints (2 bytes each), one for each latitude and longitude (360*180 = 64,800 values). The temperatures are in units of degrees Celsius * 100. Following this piece of actual data is one more set of four bytes, again telling you that 129,600 bytes of data just went by.
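Adding up the pieces described above gives the size of one week of data, which can be used to check whether a file has been truncated. The Python sketch below is only a sanity check under that layout; the names RECORD_BYTES and weeks_in_file are invented for this example.

```python
MARKER_BYTES = 4               # each Fortran formatting marker
HEADER_BYTES = 8 * 4           # eight 4-byte integers (dates, day count, index)
GRID_BYTES = 360 * 180 * 2     # 64,800 temperatures, 2 bytes each = 129,600

# Four markers per week: before and after the header, before and after the grid
RECORD_BYTES = 4 * MARKER_BYTES + HEADER_BYTES + GRID_BYTES  # 129,648


def weeks_in_file(size_bytes):
    """Return the number of whole weeks, or raise if the size does not fit."""
    weeks, leftover = divmod(size_bytes, RECORD_BYTES)
    if leftover:
        raise ValueError(f"{leftover} stray bytes: file truncated or wrong format")
    return weeks


print(RECORD_BYTES)                     # 129648
print(weeks_in_file(3 * RECORD_BYTES))  # 3
```

The file size could be obtained with os.path.getsize before piping the file into cdfsst.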
Any program that manipulates the binary data files must account for the 16 bytes of formatting in each weekly data slice (four markers of four bytes each), or the actual data will come out badly garbled. In other words, to read data from the binary files the following procedure is used:
Read four bytes. Throw them away. This is just formatting (the value 32).
Read 32 bytes. This is data in long integer format (four bytes each).
Read four bytes. Throw them away. This is just formatting (the value 32 again).
Read four bytes. Throw them away. Still just formatting (the value 129,600).
Read 129,600 bytes. This is data in short integer format (two bytes each).
Read four bytes. Again, throw them away; they are just formatting (the value 129,600 again).
This procedure is repeated for every week in the dataset. For more information about the binary data see the file oiinfo.asc.
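The reading procedure above can be sketched in Python with the standard struct module. This is an illustration, not the cdfsst implementation. It assumes that the temperatures are signed short integers (sea surface temperatures can be below zero) and that the eight header integers appear in the order listed above. The order of the grid points within the 64,800 values is not described here, so the sketch returns them as a flat list. The names read_exact and read_week are made up for this example.

```python
import struct

NLON, NLAT = 360, 180
HEADER_BYTES = 32                # eight 4-byte integers
GRID_BYTES = NLON * NLAT * 2     # 129,600


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError if the stream is truncated."""
    buf = stream.read(n)
    if len(buf) != n:
        raise EOFError(f"expected {n} bytes, got {len(buf)}")
    return buf


def read_week(stream):
    """Return (header, temps_celsius) for one week, or None at end of file."""
    first = stream.read(4)                    # formatting, discarded
    if not first:
        return None                           # clean end of file
    if len(first) != 4:
        raise EOFError("truncated formatting marker")
    # ">" selects big-endian; "8i" is eight 4-byte signed integers
    header = struct.unpack(">8i", read_exact(stream, HEADER_BYTES))
    read_exact(stream, 4)                     # formatting, discarded
    read_exact(stream, 4)                     # formatting, discarded
    # "h" is a 2-byte signed integer (signedness is an assumption)
    raw = struct.unpack(f">{NLON * NLAT}h", read_exact(stream, GRID_BYTES))
    read_exact(stream, 4)                     # formatting, discarded
    temps = [value / 100.0 for value in raw]  # stored as degrees Celsius * 100
    return header, temps
```

A caller would open the data file in binary mode and call read_week repeatedly until it returns None. The header tuple holds the begin year, month, and day, the end year, month, and day, the number of days, and the index.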