We all think about availability in 9s.
For data, we all think about 100 percent data integrity, which is totally unrealistic. We should look at data integrity in terms of 9s also. Think about this:
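
As a rough illustration of what counting integrity in 9s could mean, here is a minimal sketch. It assumes a deliberately simple model in which N nines of integrity means a fraction 10^-N of the stored bytes is expected to be lost. The function name and the data volumes are hypothetical and chosen only for illustration. A real datapath has many more failure modes than this captures.

```python
from fractions import Fraction

PETABYTE = 10 ** 15  # decimal petabyte, in bytes
EXABYTE = 10 ** 18   # decimal exabyte, in bytes


def expected_bytes_lost(total_bytes: int, nines: int) -> Fraction:
    """Expected bytes lost, assuming a loss fraction of 10**-nines (simplified model)."""
    return Fraction(total_bytes, 10 ** nines)


# Print the expected loss for a few illustrative volumes and levels of integrity.
for nines in (9, 12, 15, 20):
    for label, size in (("1 PB", PETABYTE), ("1 EB", EXABYTE)):
        lost = expected_bytes_lost(size, nines)
        print(f"{nines:>2} nines, {label}: ~{float(lost):.3g} bytes expected lost")
```

Under this model the expected loss is never zero for any finite number of 9s, and it grows in direct proportion to the amount of data kept.
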
Clearly, with large amounts of archival data you are still going to have data loss even at 20 9s. How do you even calculate the potential for data loss with all of the hardware and software in the datapath? I believe it is time for the discussion to change from our current thinking, that you are not going to lose any data, to the realization that data loss is going to happen, and to the question of how we are going to address it and deal with it.
The problem is that you do not know what you are going to lose, as there is neither an emphasis on nor a standard for defining the level of importance of the data. There are, of course, proprietary frameworks that allow you to define the importance and potentially save more copies of the file, but then you are locked into a single vendor.
Another problem is that file formats do not allow for any recovery. Take something as simple as a JPEG image. If a few bytes flip in the header, the whole image can be lost unless you have some complex recovery software. I once worked on a project where multiple copies of the header were written for a file in case the header got corrupted. The problem is that most standards bodies assume the header will never get corrupted; otherwise they would require multiple copies.
Maybe someday everyone will finally come to the realization that digital data is not and never will be 100 percent reliable, and deal with the problem.
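
The idea of writing multiple copies of a header, mentioned above, can be sketched roughly as follows. This is not the format used on that project and not any standard. It is a hypothetical layout that assumes a fixed-length header, a CRC32 appended to each copy, and three copies written back to back before the payload. The function names are made up for the example.

```python
import zlib

CRC_LEN = 4  # bytes used to store the CRC32 after each header copy


def write_with_header_copies(header: bytes, payload: bytes, copies: int = 3) -> bytes:
    """Lay out `copies` identical header+CRC32 records, followed by the payload."""
    record = header + zlib.crc32(header).to_bytes(CRC_LEN, "big")
    return record * copies + payload


def read_header(blob: bytes, header_len: int, copies: int = 3):
    """Return the first header copy whose CRC32 matches, or None if every copy is bad."""
    record_len = header_len + CRC_LEN
    for i in range(copies):
        record = blob[i * record_len:(i + 1) * record_len]
        if len(record) < record_len:
            break  # file is truncated; no further copies to try
        header = record[:header_len]
        stored_crc = int.from_bytes(record[header_len:], "big")
        if zlib.crc32(header) == stored_crc:
            return header
    return None


# Usage: flip bits in the first copy; the reader falls back to the second copy.
blob = bytearray(write_with_header_copies(b"HDR1", b"image data"))
blob[0] ^= 0xFF
recovered = read_header(bytes(blob), header_len=4)
```

The cost is a few extra bytes per file. The fixed header length matters here, because a corrupted length field would otherwise make the later copies impossible to locate.
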
Labels: high availability, Storage, data integrity
posted by: Henry Newman