At the exascale, factors like core count, storage bandwidth, spindle count, and a trend toward increasingly random I/O mean that current storage architectures are going to face significant problems. Collaborating with Kove offered the ALCF a unique opportunity to explore an innovative I/O solution to further the facility's mission of enabling breakthrough science at the extreme scale.
According to ALCF Director Michael Papka, "I/O is a huge issue for HPC. Through this active partnership between the ALCF and Kove, we are investigating the use of a RAM-based storage system that has the potential to not only speed up scientific discovery, but also to significantly trim time-to-solution for our researchers and to pave the way for studies that were previously impossible due to time constraints."
As an early proof of concept, the collaborators conducted a side-by-side visualization rendering exercise using Kove's XPD2 RAM-based storage system on one screen, and local disk on another. For subject matter, the team used the visualization of an astrophysics study that details the growth of density perturbations in both gas and dark matter using vL3, an Argonne-developed parallel volume rendering framework for visualization of large datasets, including those derived on Argonne's Blue Gene/P.
The results of this initial RAM-based test are in, and the news is good: the team realized a 6x overall speedup of the application and an 8x speedup of the I/O.
"We were optimistic that we would see a significant speedup, and a 6x application speedup out of the box on sequential I/O is impressive," said Bill Allcock, ALCF director of operations. "The potential is there for speedups of hundreds of times in latency-critical, random I/O scenarios."
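The relationship between the 8x I/O speedup and the 6x overall application speedup can be checked with a back-of-envelope calculation. The sketch below applies Amdahl's law as an assumption (the article does not state how the 6x figure decomposes); under that model, the reported numbers imply roughly 95% of the original runtime was spent in I/O.

```python
def overall_speedup(io_fraction, io_speedup):
    """Amdahl's law: only the I/O fraction of runtime is accelerated;
    the remaining (compute) fraction runs at its original speed."""
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)

# Solving 1 / ((1 - f) + f/8) = 6 for the I/O fraction f gives
# f = 20/21, i.e. ~95% of runtime in I/O (a hypothetical decomposition).
f = 20.0 / 21.0
print(round(overall_speedup(f, 8.0), 2))  # -> 6.0
```

This illustrates why I/O-bound workloads like the rendering exercise benefit so dramatically from faster storage: when nearly all of the runtime is I/O, the overall speedup tracks the I/O speedup closely.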
Building on the successes of the initial trial results, the collaborators' next steps include random I/O testing and developing a tiered exascale storage architecture that uses extremely high bandwidth, low-latency storage, like the Kove XPD2, as the "tier 0," with disk and tape tiers behind it. Such an architecture would allow storage-system bandwidth, latency, and capacity requirements to be tailored to a given system's workload. A storage upgrade for the facility's next-generation IBM Blue Gene/Q incorporating the results of this work is planned for 2014.