The Parallel Log Structured File System (PLFS) was developed at the Los Alamos National Laboratory (LANL) to improve shared-file write performance. PLFS transparently transforms writes so that each process, while logically writing to a shared file, physically writes to its own unique file. By removing this concurrency, PLFS improved the write performance of many applications by multiple orders of magnitude. This was demonstrated on PanFS, Lustre, and GPFS but was not reproduced on PVFS. However, reconstructing the logical file from the multitude of physical files has proven difficult. To alleviate this issue, we developed several collective techniques to aggregate information from the multiple component pieces. This enables PLFS to maintain its large write improvements without sacrificing read performance for many workloads. Other workloads, however, remain challenging. Los Alamos is currently developing a scalable HPC key-value store to address these remaining challenges. Additionally, the transformative properties of PLFS have also recently been leveraged to improve the metadata performance of a parallel file system. Finally, I will discuss some preliminary ideas about using PLFS to improve storage availability.
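The write transformation and the read-side reconstruction described above can be illustrated with a small in-memory sketch. This is not the PLFS implementation: the names `Container`, `IndexEntry`, `write`, and `read_all` are hypothetical, and the single shared counter used to order overlapping writes is a simplifying assumption. Each writer appends to its own data log and records where the bytes belong in the logical file; a read merges all index entries to rebuild the logical view.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    # Hypothetical record mapping a logical extent to a writer's private log.
    logical_offset: int
    length: int
    writer: int
    physical_offset: int
    sequence: int  # assumed global ordering; used to resolve overlapping writes


@dataclass
class Container:
    # One append-only data log and one index log per writer (illustrative layout).
    data_logs: dict = field(default_factory=dict)
    index_logs: dict = field(default_factory=dict)
    clock: int = 0

    def write(self, writer: int, logical_offset: int, payload: bytes) -> None:
        """Append payload to the writer's own log; no other writer's log is touched."""
        log = self.data_logs.setdefault(writer, bytearray())
        entries = self.index_logs.setdefault(writer, [])
        self.clock += 1
        entries.append(
            IndexEntry(logical_offset, len(payload), writer, len(log), self.clock)
        )
        log.extend(payload)

    def read_all(self) -> bytes:
        """Rebuild the logical file by merging every writer's index entries."""
        merged = sorted(
            (e for entries in self.index_logs.values() for e in entries),
            key=lambda e: e.sequence,
        )
        size = max((e.logical_offset + e.length for e in merged), default=0)
        out = bytearray(size)
        for e in merged:  # later writes overwrite earlier ones
            start = e.physical_offset
            chunk = self.data_logs[e.writer][start:start + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
        return bytes(out)


if __name__ == "__main__":
    c = Container()
    c.write(0, 0, b"AAAA")   # writer 0, logical bytes 0..3
    c.write(1, 4, b"BBBB")   # writer 1, logical bytes 4..7
    c.write(0, 8, b"CC")     # writer 0 appends to its own log again
    print(c.read_all())
```

The sketch also shows why reads are the harder side: every read must consult the index entries of all writers, which is the cost that aggregation techniques aim to reduce.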
Short Bio:
Dr. Adam Manzanares is currently a Nicholas C. Metropolis postdoctoral fellow at the Los Alamos National Laboratory (LANL). He was appointed to this position in November 2010 after joining LANL in July 2010 as a postdoctoral researcher. Dr. Manzanares received his Ph.D. from Auburn University in May 2010 with a focus on energy-efficient storage systems. He is currently focused on storage systems for high-performance computing applications and develops middleware layers to improve the performance of HPC storage systems. He is also researching compression techniques and data formatting libraries for scientific data sets.