Henry Newman's Storage Blog Archives for June 2013

Chinese Supercomputer vs. Blue Waters

The hype about the new Chinese supercomputer is deafening. Do a Google news search if you doubt it.

And in about the same week, reports started to appear about a major announcement on solving a mystery of HIV. One of the articles was titled "GPUs Help Researchers Uncover New Approach to Combating HIV Virus" (read the article, but more importantly watch the video). This news got none of the hype that the Chinese supercomputer running the Top500 LINPACK test did, and yet it actually solved something that could have a dramatic impact worldwide.

So I am asking myself: why did the press care more about some new petaflop monster than about a system that will actually help people on our planet? Is this the press, or is it something more sinister? People seem to want the USA to compete in the petaflop race without understanding what applications will run well on the system and what restructuring will be required.

The Blue Waters team working on the HIV project has yet to run and submit LINPACK results. And what would be the value, given the fine work that they have done and will continue to do?

Building supercomputers for the sake of running the Top500 LINPACK test and having the number one system in the world does not mean that the machine has enough memory, communications bandwidth or storage bandwidth and space to do leading-edge scientific research. The Chinese machine is a good example of that, with limited memory, a high-latency interconnect and minimal storage bandwidth and space.

In my opinion, we should be highlighting the accomplishments of the scientific community on systems designed for scientific leadership rather than listening to the hype in the press about a machine that cannot do much of anything other than the Top500 tests.

Labels: China, supercomputers, HPC, TOP500, high-performance computing

posted by: Henry Newman

They Keep Losing? Why?

There's a great article by Chris Mellor in The Register about how the traditional storage vendors are in decline. Dell, HP, IBM, NetApp: name your major vendor, and it is in decline. Chris states, and I think he is correct, that the newer, smaller vendors are taking away the sales and the profit.

The connectivity vendors are having the same problem. QLogic's CEO just resigned.

This reminds me of the late 1990s, when the nimble vendors were making money and many of them got purchased. Sun and EMC were big consumers of these smaller vendors. I honestly cannot think of a single vendor that Sun purchased from, say, 1996 to 2004 that was successful inside Sun. On the other hand, many of the vendors that EMC purchased were successful. We know how this worked out for both Sun and EMC.

As I am a big believer that history repeats itself, I think we are ripe for the smaller vendors getting plucked. Of course, times are quite a bit different now than in the late 1990s. We just exited a major recession, and companies are not flush with cash and high-value stock. So what I think will be different this time is that the big vendors will be far more choosy than the kid-in-a-candy-store approach of the 1990s, when they were buying everything they could see.

It is clear that first on the agenda are the SSD companies, where consolidation has been going on for a while. I think many big vendors learned their lessons on how to integrate and how not to integrate.

If I were one of the big vendors, I would be looking for companies that complement my products, are operating independently today, and are profitable with a good balance sheet. It might be an investment opportunity.

Labels: IBM, EMC, Dell, Sun, HP, NetApp, QLogic, Storage, merger and acquisition activity

posted by: Henry Newman

Quantum Computing

There has been a lot of news on quantum computing recently, from reports on NPR to Nature to the usual technical trade publications. The hype is enormous given the potential for changing the landscape, but I think there are a number of outstanding questions.

One of the biggest issues I see is that we have around 60 years of knowledge and understanding of programming languages. We have FORTRAN, COBOL, PL/I, C, Python, Java and a myriad of other languages, all revolving around programming as we know it today. If quantum computing is to become successful, there has to be an interface that allows it to be programmed easily. What will that interface be?

There will always be instances where people will, for the sake of performance, write machine code. Yes, it still happens today, but for quantum computing, given what I have read, there will have to be a paradigm shift in how things are programmed.
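
To make that gap concrete, here is a minimal sketch in Python of what a gate-level quantum "program" might look like. It is purely illustrative: the class and method names are hypothetical, it does not reflect any real vendor's interface, and it only simulates a single qubit on a classical machine.

    import numpy as np

    class ToyQuantumProgram:
        # A single qubit simulated as a 2-element state vector.
        # Common single-qubit gates, expressed as 2x2 unitary matrices:
        H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard: creates superposition
        X = np.array([[0, 1], [1, 0]])                # Pauli-X: the quantum "NOT"

        def __init__(self):
            self.state = np.array([1.0, 0.0])  # start in the |0> state

        def apply(self, gate):
            # A "program" is a sequence of unitary operations on the state.
            self.state = gate @ self.state
            return self

        def measure_probabilities(self):
            # The chance of reading 0 or 1 is the squared amplitude of each component.
            return np.abs(self.state) ** 2

    prog = ToyQuantumProgram().apply(ToyQuantumProgram.H)
    print(prog.measure_probabilities())  # roughly [0.5, 0.5]: the qubit is in superposition

The point of the sketch is that the "program" is a sequence of unitary operations followed by a probabilistic measurement, which is exactly the kind of thing today's sequential languages were never designed to express.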

Quantum computing, given the cryogenics involved, is at best going to be limited to large customers that have the proper facilities. That is not a great deal different from what happened in the 1950s, when only the largest organizations had computers.

I think the key to the success of this completely disruptive technology (assuming that the technology matches the marketing spec sheet) is going to be the interface and training the right people in the programming method needed to utilize the machine. In the beginning, this will be very basic, but the thing to watch, I think, will be how quickly the infrastructure is developed.

Of course, you are going to need to get data in and out of the machine, communicate with the machine and do all the other things that we take for granted today. How fast these things come together will determine the success or failure of the technology, I believe.

Labels: programming languages, quantum computing

posted by: Henry Newman

Archives and Big Data

My good friend Rich Brueckner over at InsideHPC posted a talk I did at the IDC HPC User Forum on why archives are important to big data analysis.

I believe I made the case that organizations are going to need to keep far more archive data than they think they will, and they are going to need to keep the raw data, not just the processed data. Organizations are going to have to plan for this data with the right budgets and the right people to manage it because this data is important to their future.

As the saying goes, we do not know what we do not know about the data we have. And we are going to have to go back to the original data and reprocess it to extract new information. This has been done for decades in the oil and gas industry, as new algorithms are developed to better understand where to find new oil and gas. It has also been done for decades for genetics and medical data.
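
As a minimal sketch of what that reprocessing looks like in practice, consider the Python below. The file layout and the two extraction functions are hypothetical; the point is simply that a newer algorithm can only be run if the raw records, not just the old derived results, are still in the archive.

    import json
    from pathlib import Path

    def extract_v1(record):
        # Original processing: keep only a coarse summary.
        samples = record["samples"]
        return {"id": record["id"], "mean": sum(samples) / len(samples)}

    def extract_v2(record):
        # A newer algorithm pulls more information out of the same raw samples.
        samples = record["samples"]
        mean = sum(samples) / len(samples)
        return {
            "id": record["id"],
            "mean": mean,
            "peak": max(samples),
            "variance": sum((s - mean) ** 2 for s in samples) / len(samples),
        }

    def reprocess_archive(archive_dir, extractor):
        # Only possible because the raw samples were archived, not just v1's output.
        for path in Path(archive_dir).glob("*.json"):
            record = json.loads(path.read_text())
            yield extractor(record)

    # results = list(reprocess_archive("/archive/raw", extract_v2))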

I make the case in the talk that it will be critical for all kinds of businesses to keep their data to themselves and not outsource it to a cloud provider, given network performance and the need to get at the data to extract new information. I think this will be needed in all types of industries, from retail to sciences to manufacturing.

Planning for archives will become more important in the future, not less important, as archives are going to be critical to the future of many, many industries from retail to medical to you name it.

I hope you enjoy the video.

Labels: archive, big data, Storage

posted by: Henry Newman

Fibre Channel's Decline?

I recently read an article in The Register on declining Fibre Channel revenues. The volume of Fibre Channel switch ports, HBAs and connections has been steadily declining since the disk drive manufacturers introduced SAS. The first thing to go was the tens of thousands of disk connections. With SAS performance moving up the food chain as the baseline backend interconnect, and with prices dropping fast enough to be more competitive than Fibre Channel, why would anyone want to continue with Fibre Channel for new environments?

There are, of course, a few good reasons why sites still buy Fibre Channel. First, if the storage they want to buy has a Fibre Channel interface, they need to use Fibre Channel. Good examples here are enterprise tape drives, which use Fibre Channel interfaces, as do LTO drives.

Second, Fibre Channel supports longer distances than SAS. Though the distances drop with each performance increase, they are still longer than SAS supports.

Almost four years ago I speculated that 10 Gigabit Ethernet price drops would doom Fibre Channel. Well, the Ethernet vendors never got their act together, and FCoE did not take over the world. But appliances are replacing block storage, and that spells doom.

There are not many file systems running on block storage that can scale to today's requirements and run today's applications. That was not true in the heyday of Fibre Channel. The appliance revolution is in full swing, with big data appliances starting to appear on the horizon, and little by little Fibre Channel volume drops.

Fibre Channel had a good run compared with other technologies: FDDI, FW-SCSI, U-SCSI and a host of others. At one point it even tried to challenge Ethernet with an IP stack and other features, but it was never cost effective, and the price/performance did not make sense. There are a few more years left, but sadly Fibre Channel is on the down side of the hill.

Labels: SAS, fibre channel, Storage

posted by: Henry Newman

Amazon vs. Best Buy and the Storage Industry

After I wrote my last blog on IBM selling the x86 division, I heard from a friend of mine who added a few points about the U.S. economy. This got me thinking about the big box stores vs. online retailers and how in some ways this is no different than what IBM faced.

We all know that IBM has lots of smart, innovative people, and smart, innovative people are expensive. For IBM to keep lots of smart people on staff configuring low-margin commodity systems is, in a general way, not much different from Amazon competing with Best Buy.

IBM has salespeople, presales people and architects all working to configure complex systems that the customer could easily buy from a lower-margin competitor if they chose. They just take the configuration and the parts, which are all commodity, go online and click, click, click to buy the equipment from someone else who may not have as many smart, innovative people. That is what commodity computing is all about.

Hardware is a commodity, and so is Linux, for the most part. What is not a commodity is support and applications. Best Buy did not do well with the Geek Squad for a number of reasons, but support is also a pretty low-margin business.

The part of the datacenter industry with good margins today is not hardware or commodity software but information generation. Something like IBM's Watson system for medicine will be a non-commodity system.

I have been talking about the appliance model for storage for a long time, and I've been talking about the movement to appliances for data analysis for a while. I think IBM's move could be the signal for the rest of the industry. The value in the future is in solving problems, not selling hardware.

Labels: Amazon, IBM, hardware, Best Buy, Storage, commodity servers

posted by: Henry Newman

IBM Sell Off

As we have all heard, IBM looks to be selling its x86 server division to Lenovo.

Lots of my customers are very unhappy about the potential sale, as they do not want their servers controlled by a Chinese company. I am not going to make a value judgment about whether this is good or bad. The real issue is that we have no one to blame.

I see this as no different from when other industries went offshore. We have seen textiles and steel leave, and we almost saw the auto industry go offshore. Of course, the reason all of these industries left was cost. We as a society did not want to pay extra to have a domestic supplier, and/or, in the case of the auto industry, we did not want to have to buy a lower-quality product.

The commoditization of the computer industry is a reality. We all heard rumors not long ago that HP wanted to sell its PC division. So we have two big companies trying to sell off their x86 divisions.

It is also clear to me that Intel has been taking more and more of the system margin in the x86 market over the years. The motherboard today has far more Intel product on it than it did ten years ago. For the most part, in commodity products, those who add the most value get the most profit. As things move down the value chain, the profit margins get lower and lower.

If the rumor is true, I think IBM is making the right move for the long term. Like it or not, no one will pay extra for a commodity product.

Labels: IBM, x86 servers, Lenovo

posted by: Henry Newman

Clouds Down Again

I saw an article a few weeks ago about Google Drive being down. This was a problem for someone I know who was depending on Google Drive to share a file for a group edit. My friend was not happy at all.

It is clear to me that the cloud's reliability has not kept up with cloud applications for group activities. Name your major cloud vendor, and it has gone down. And many of them have lost data.

The problem seems to be getting worse, not better. The user community, which includes companies and regular people, seems to be getting used to the fact that things go down. Just a few years ago, if a company had a companywide outage, there would be heads rolling in the IT department. Today, a cloud goes down and it is an "oh well."

Part of the reason is there is no one to blame. All of the cloud agreements I have seen allow for downtime. None of the agreements I have seen say anything in the contract about data integrity.

This was not the case when IT departments ran their own data centers. But today, companies' IT departments are under attack from the bean counters (partially, I think, because of the recession), who do not seem to understand downtime and data integrity. If the accounting department is not impacted by the downtime, data integrity and/or data loss is no big deal.

The big deal is coming, though. It will not be long until someone important outside of IT needs something or loses something. With the number of clouds and the amount of downtime, that day is coming.

Every time a cloud goes down, it is for a different reason. At least that should give some people solace that providers are not making the same mistakes twice.

Labels: cloud computing, reliability, downtime, Google Drive

posted by: Henry Newman

SSD Reality

There was a great presentation from FAST 2013 about SSDs. It details critical issues that have not been addressed by many vendors. The presentation is worth watching for sure.

Now, power faults might not be common, but I think the issues point to the differences in resiliency between disk drives and SSDs. What was most interesting to me was that some supposed "enterprise" SSDs did not behave in an enterprise way during the power fault tests. Sadly, and not unexpectedly, the authors did not call out which vendors failed and which vendors passed. I would love to know the answers, so my plan is to ask storage vendors that often test a variety of SSDs if they know.

The real question, and I have been asking this for a number of years, is how SSD vendors really test their products. There is lots of talk about IOPS and performance, but as I have said time and time again, IOPS are not the big issue in my opinion; quality is. If an SSD loses its virtual block mapping, it becomes nothing more than a large paperweight.

Disk drive vendors have decades of experience making drives that are reliable under a variety of conditions. For this test, only two disk drives were tested. One did well, and the other was perfect. I am pretty sure that all enterprise disk drives would have passed this demanding test, given what I have seen of their testing procedures.
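
For readers wondering what that kind of testing involves, here is a rough Python sketch of the verification half of a power-fault test: write numbered blocks with embedded checksums, fsync each one so the drive has acknowledged it, cut power externally, then check which acknowledged blocks survived intact. This is not the FAST authors' harness, just an illustration of the idea; the file path, block size and logging are placeholders.

    import hashlib
    import os

    BLOCK_SIZE = 4096

    def make_block(seq):
        # 8-byte sequence number + random payload + 32-byte SHA-256 checksum.
        payload = seq.to_bytes(8, "big") + os.urandom(BLOCK_SIZE - 8 - 32)
        return payload + hashlib.sha256(payload).digest()

    def write_phase(path, count):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT)
        try:
            for seq in range(count):
                os.write(fd, make_block(seq))
                os.fsync(fd)            # the drive has acknowledged this block
                print("acked", seq)     # log what must survive the power cut
        finally:
            os.close(fd)

    def verify_phase(path, count):
        # Run after power is restored: any acknowledged block that fails its
        # checksum represents lost or corrupted data the drive claimed was safe.
        bad = []
        with open(path, "rb") as f:
            for seq in range(count):
                block = f.read(BLOCK_SIZE)
                payload, digest = block[:-32], block[-32:]
                if hashlib.sha256(payload).digest() != digest:
                    bad.append(seq)
        return bad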

The SSD vendors likely need to do more testing than they currently do. But since I think there will continue to be consolidation among SSD vendors, the big three disk companies might end up doing the testing.

Labels: testing, SSD, disk drives, Storage

posted by: Henry Newman

Helium and Disk Drives

Did anyone see this opinion piece on the Federal Helium Reserve and the helium shortfall?

This got me thinking about WD's plans for making helium-filled drives. I, of course, do not want hydrogen-filled disk drives or a Hindenburg-like disaster in a computer room.

I have been hearing about the impending issues with helium. Each disk drive needs only a small amount of helium, but with the thousands of drives being shipped, I am sure this adds up to a significant amount of the gas. This makes me ask the question: what were they thinking?

WD needs to provide a lot more information about how this is all going to work. This is not the first case, nor will it be the last, where a vendor says something that makes sense on the surface but, once you learn a few other things, raises lots of questions about how things will really work together. The issues with the helium shortfall and the price increases have been well known for a long time. I have seen many reports on this, so why hasn't the IT media asked WD how this will all work? It is a fair question, and there could easily be a reasonable explanation.

The IT news media needs to start asking hard questions of vendors, not just reporting on what they say. This is just one example. I am not a reporter, so maybe I do not know the rules of engagement when talking with vendors. But in-depth reporting would provide everyone with more valuable information.

Labels: disk drives, WD, helium

posted by: Henry Newman