Posted on January 23, 2014 By Henry Newman


You might think that there is no relationship between archival data storage and weather forecasting, but without storage – and I mean lots of archival storage – our forecasts would not be improving much.

Consider archival storage using HPSS, which is well known in the archival community as the most scalable HSM (hierarchical storage management) system. Note that four of the top ten archive sites in the world are weather sites: ECMWF, NOAA, UK Met, and DKRZ. The reason is that each day these sites archive all of the input data to the weather forecast.

This includes everything from satellite input from all kinds of different collections to temperatures and wind velocities at various altitudes, buoy information from the ocean, and lots of other information. Combine that with ground stations, airplanes, and shipboard sensors, and you are talking about many terabytes of input data.

Then the forecast is run, sometimes a few times a day, and the output of each run is saved. This goes on for months or years until a new forecast model is developed, and then the new model has to be validated.

To validate it, the weather sites rerun the forecast with all of the old input data and create a new forecast. Sometimes the new model takes new inputs because, for example, a new satellite has been put into service, but the models can always be run without that new data.

The new model output is compared to the old model output, and statistical analysis is done to make sure that the new model provides a better solution than the old model. This is especially important when a forecast is just plain wrong. The sites make sure that the new model does a better job at prediction than the old model that got the forecast wrong. Weather forecasting is yet another example of an application that has seen an explosion of data with no end in sight.
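To make the validation step concrete, here is a minimal sketch of how an old and a new model's reruns might be scored against what actually happened. It is an illustration only: the article does not say which statistics the weather sites use, so root-mean-square error (RMSE) is an assumption, as are the function names `rmse` and `compare_models` and the made-up temperature values.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between a forecast series and the observations."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def compare_models(old_forecast, new_forecast, observed):
    """Score both models on the same archived period (hypothetical helper)."""
    old_err = rmse(old_forecast, observed)
    new_err = rmse(new_forecast, observed)
    return {
        "old_rmse": old_err,
        "new_rmse": new_err,
        "new_is_better": new_err < old_err,
    }


# Made-up values for illustration: archived observations and the two
# models' forecasts for the same four time steps.
observed = [10.0, 12.0, 15.0, 11.0]
old_forecast = [12.0, 14.0, 13.0, 9.0]
new_forecast = [11.0, 12.0, 14.0, 11.0]

result = compare_models(old_forecast, new_forecast, observed)
print(result)
```

A real validation would cover months or years of archived inputs and many variables, which is exactly why the archive has to keep all of that data around.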

