When you upload your pictures and posts to Facebook, all that data is being stored somewhere. Facebook’s storage infrastructure is massive, but it’s not entirely a mystery, at least on the server side.
Facebook is the leader of the Open Compute Project, an effort that kicked off in April of 2011. With Open Compute, the goal is to open source the specifications behind Facebook’s data center designs. The effort initially released server-related specifications, but it is now looking at storage as well. That’s where the Open Vault effort comes into play.
Amir Michael, Facebook’s system engineering team lead, told InfoStor that Open Vault is one of the most interesting projects within Open Compute today.
“Open Vault is Open Compute for a storage platform,” Michael said.
Open Vault is based on the concept of JBOD (Just a Bunch of Disks). The JBOD is then connected to Facebook’s Open Compute server platform.
“This is the first time that there will be an open source storage-type of technology behind there [Open Compute] and it’s based primarily around the SAS protocol,” Michael said.
SAS (Serial Attached SCSI) has become increasingly popular in recent years as hardware vendors embrace it to accelerate storage performance.
Open Vault connects a large number of drives together in a very dense way, according to Michael. He noted that most storage hardware today is front-facing, so it consumes a lot of space at the front of a server rack. That limits the number of storage drives that can be deployed inside a rack. Open Vault takes a different approach.
“We went ahead and stacked the drives deep inside the server as well, which allows us to have more drive density,” Michael said.
As such, Facebook is able to pack more storage behind every compute node. Michael said that with Open Vault, Facebook is able to deploy storage in chunks of 15 drives at a time. That can then scale up incrementally to 50 drives or more; the drives can either be traditional spinning disks or SSDs.
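The chunked scaling Michael describes comes down to simple arithmetic. The sketch below is illustrative only: the 15-drive chunk size comes from the article, but the function names and the assumption that capacity grows strictly in whole chunks are hypothetical and not taken from the Open Vault specification.

```python
import math

DRIVES_PER_CHUNK = 15  # chunk size quoted in the article


def chunks_needed(drives_wanted: int) -> int:
    """Number of 15-drive chunks required to reach at least drives_wanted.

    Hypothetical helper: assumes drives are only added in whole chunks.
    """
    return math.ceil(drives_wanted / DRIVES_PER_CHUNK)


def drives_deployed(drives_wanted: int) -> int:
    """Total drives actually installed once whole chunks are deployed."""
    return chunks_needed(drives_wanted) * DRIVES_PER_CHUNK


if __name__ == "__main__":
    for target in (15, 30, 50):
        print(target, chunks_needed(target), drives_deployed(target))
```

Under that whole-chunk assumption, a target of 50 drives would round up to four chunks.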
Air Flow
Packing a large number of drives into a server enclosure poses a number of challenges. Among them is the issue of drive cooling, since so many drives packed closely together will generate lots of heat.
Facebook is addressing the issue by carefully modeling how cooling air flows through server storage enclosures. Michael said that Facebook has a lab that spends a lot of time looking at the thermodynamic properties of servers.
The Facebook approach involves both software and hardware to provide the right level of cooling. On the hardware side, it’s about air management and how engineers duct and guide air through systems and around drives. On the software side, it’s about creating algorithms that control the fans in an energy efficient manner.
“Some approaches are to just flow more air through the box and be sure that it’s cool,” Michael said. “We do it in a very scientific way that allows us to only use the required amount of air, because if you’re moving more air than you need, you’re wasting energy.”
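To make the software side concrete, here is a minimal sketch of a fan controller that moves only as much air as the hottest drive requires. It is not Facebook's algorithm, which the article does not describe. The function name, target temperature, gain and duty-cycle limits are all assumptions chosen for illustration.

```python
def fan_duty(temps_c, target_c=40.0, min_duty=20.0, max_duty=100.0, gain=8.0):
    """Return a fan duty cycle (percent) from a list of drive temperatures.

    Hypothetical proportional controller: every parameter value here is an
    illustrative assumption, not a figure from Open Vault.
    """
    hottest = max(temps_c)
    # Add airflow only in proportion to how far the hottest drive
    # exceeds the target; at or below target, stay at the minimum.
    excess = max(0.0, hottest - target_c)
    duty = min_duty + gain * excess
    # Never request more than the fan can deliver.
    return min(max_duty, duty)


if __name__ == "__main__":
    print(fan_duty([35.0, 38.0]))  # all drives below target
    print(fan_duty([35.0, 45.0]))  # one drive 5 degrees over target
    print(fan_duty([60.0]))        # far over target, capped at max_duty
```

A controller of this shape keeps fans at a low baseline when drives are cool and ramps up only as needed, in contrast to the approach Michael describes of simply flowing more air through the box.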