Unlimited cloud storage provider uses old tech to boost upload speeds: a descendant of Intel's Robson makes a modest return as an SSD cache, but on a much, much larger scale
Backblaze claims that uploads to its B2 Cloud Storage platform are now up to 30% faster than uploads to AWS S3, thanks to a ‘shard stash’ built on solid-state drives (SSDs).
Previously, when customers uploaded small files to Backblaze B2, the data was written to several hard drives, and all of those writes had to complete before a response was sent back to the customer. Now the data is written both to HDDs and to a set of SSDs known as the ‘shard stash’. The copies on the SSDs are kept only until the hard drives have received all the data, at which point they are deleted. Because writing to an SSD is much faster than writing to an HDD, the upload is acknowledged sooner.
The company benchmarked the new design by uploading a 256 KB file and a 1 MB file to the Backblaze B2 US East servers and to the AWS S3 equivalents. It found that B2 was 30% faster than S3 for the former and 10% faster for the latter.
Taking inspiration from Intel’s Robson
When a client application uploads a file, Backblaze normally uses a coordinator pod to split it into 16 data shards plus four parity shards. These 20 shards are then written to 20 different HDDs.
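The arithmetic behind the 16 + 4 layout can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the parity shards are counted but not computed, and the article does not say which erasure code Backblaze uses or how it pads the final shard (zero-padding is assumed here).

```python
DATA_SHARDS = 16    # from the article
PARITY_SHARDS = 4   # from the article


def split_into_data_shards(payload: bytes, data_shards: int = DATA_SHARDS) -> list[bytes]:
    """Split payload into equally sized data shards (hypothetical helper).

    The last shard is zero-padded, which is an assumption made for this sketch.
    """
    shard_size = -(-len(payload) // data_shards)  # ceiling division
    padded = payload.ljust(shard_size * data_shards, b"\0")
    return [padded[i * shard_size:(i + 1) * shard_size] for i in range(data_shards)]


def storage_overhead(data_shards: int = DATA_SHARDS,
                     parity_shards: int = PARITY_SHARDS) -> float:
    """Raw bytes stored per byte of user data, ignoring padding."""
    return (data_shards + parity_shards) / data_shards


# A 256 KB upload, the smaller of the two benchmark sizes in the article.
payload = bytes(256 * 1024)
shards = split_into_data_shards(payload)
print(len(shards), len(shards[0]), storage_overhead())
```

For a 256 KB file, each of the 16 data shards holds 16 KB, and storing 20 shards instead of 16 costs 1.25 times the raw capacity.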
With HDDs, most of the time it takes to write a file is spent waiting for the platter to spin to the correct position. The data is first written to an in-memory cache and only later written to the physical disk. Until that happens, the user does not receive a “success” response.
Even the best HDDs are much slower than SSDs, and the company’s engineers found a way to bring SSDs into the upload process and improve performance without increasing costs too much. In the updated B2 Cloud Storage, the coordinator still splits files under 1 MB into 20 shards and sends them to HDDs as normal, but it also sends copies of the shards to servers equipped with ten Micron SSDs.
Because writing to SSDs is much faster, these ‘shard stash’ servers can act as a temporary but safe home for the data, letting the service send a ‘success’ response to the user much sooner than under the previous model. Once the data has been safely written to the HDDs, it is deleted from the SSDs and the space can be reused.
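The write path described above can be modelled roughly as follows. This is a simplified sketch, not Backblaze's implementation: the class and function names are hypothetical, the latencies are made-up placeholders, and error handling (for example a failed HDD write) is left out.

```python
import asyncio

# Illustrative latencies only; these figures are assumptions, not from the article.
SSD_WRITE_SECONDS = 0.005
HDD_WRITE_SECONDS = 0.05


class ShardStash:
    """Hypothetical stand-in for the SSD-backed shard stash servers."""

    def __init__(self):
        self.shards = {}

    async def write(self, file_id, shards):
        await asyncio.sleep(SSD_WRITE_SECONDS)
        self.shards[file_id] = shards

    def evict(self, file_id):
        self.shards.pop(file_id, None)


class HddStore:
    """Hypothetical stand-in for the 20 HDDs that hold the permanent shards."""

    def __init__(self):
        self.shards = {}

    async def write(self, file_id, shards):
        await asyncio.sleep(HDD_WRITE_SECONDS)
        self.shards[file_id] = shards


async def upload(file_id, shards, stash, hdds):
    """Acknowledge once the stash holds the shards; finish the HDD write later."""
    hdd_task = asyncio.create_task(hdds.write(file_id, shards))
    await stash.write(file_id, shards)

    async def finalize():
        await hdd_task          # wait until the HDDs have all the data
        stash.evict(file_id)    # then free the SSD copy for reuse

    return "success", asyncio.create_task(finalize())


async def demo():
    stash, hdds = ShardStash(), HddStore()
    status, cleanup = await upload("file-1", [b"shard"] * 20, stash, hdds)
    acked_before_hdd = "file-1" not in hdds.shards
    await cleanup
    return status, acked_before_hdd, "file-1" in hdds.shards, "file-1" in stash.shards
```

Running `asyncio.run(demo())` shows the intended ordering: the upload is acknowledged while the HDD write is still pending, and the stash copy is removed only after the HDD write has finished.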
The approach is similar to Intel’s Turbo Memory, codenamed Robson, which Intel developed in the 2000s and which was built into high-end laptops of the time to help them boot much faster.