Geomancy: Automated Performance Enhancement through Data Layout Optimization

Appeared in Proceedings of the Conference on Mass Storage Systems and Technologies (MSST '20).

Abstract

Large distributed storage systems, such as the high-performance computing (HPC) systems used by national or international laboratories, require sufficient performance and scale for demanding scientific workloads and must handle shifting workloads with ease. Ideally, data is placed in locations that optimize performance, but the size and complexity of large storage systems inhibit rapid, effective restructuring of data layouts to maintain performance as workloads shift.

To address these issues, we have developed Geomancy, a tool that models the placement of data within a distributed storage system and reacts to drops in performance. Using a combination of machine learning techniques suitable for temporal modeling, Geomancy determines when and where bottlenecks may occur due to changing workloads and suggests changes in the layout that mitigate or prevent them. Our approach to optimizing throughput offers benefits for storage systems such as avoiding potential bottlenecks and increasing overall I/O throughput by 11% to 30%.
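
As a rough illustration of the idea in the abstract (forecast per-location throughput from recent history, then suggest a move when another location looks better), the following is a minimal sketch. It is not Geomancy's model: it substitutes a simple moving average for the machine learning techniques the paper uses, and the names `predict_throughput`, `suggest_move`, `mount_a`, `mount_b`, the window size, and the 10% gain threshold are all hypothetical.

```python
from statistics import mean


def predict_throughput(history, window=3):
    """Forecast the next throughput sample as the mean of the last `window` samples.

    Assumption: a moving average stands in for a learned temporal model.
    """
    return mean(history[-window:])


def suggest_move(current, histories, window=3, min_gain=0.10):
    """Return a location expected to outperform `current`, or None to stay put.

    `histories` maps a location name to its observed throughput samples (MB/s).
    `min_gain` is a hypothetical threshold: the relative improvement required
    before a relocation is suggested.
    """
    forecasts = {loc: predict_throughput(h, window) for loc, h in histories.items()}
    best = max(forecasts, key=forecasts.get)
    if best == current:
        return None
    if forecasts[current] <= 0:
        # The current location is forecast to deliver nothing; any move helps.
        return best
    gain = forecasts[best] / forecasts[current] - 1.0
    return best if gain >= min_gain else None


# Illustrative, made-up samples: mount_a is degrading, mount_b is steady.
histories = {
    "mount_a": [120.0, 110.0, 90.0, 70.0],
    "mount_b": [100.0, 105.0, 100.0, 95.0],
}
suggestion = suggest_move("mount_a", histories)
```

With these made-up samples, the forecast for `mount_a` is 90 MB/s and for `mount_b` is 100 MB/s, a relative gain of about 11%, so the sketch suggests moving to `mount_b`. Data already on `mount_b` gets no suggestion.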

 
 

Publication date:
October 2020

Authors:
Oceane Bel
Kenneth Chang
Nathan Tallent
Dirk Duellmann
Ethan L. Miller
Faisal Nawab
Darrell D. E. Long

Projects:
Scalable High-Performance QoS
Prediction and Grouping
Storage QoS

Available media

Full paper text: PDF

Bibtex entry

@inproceedings{bel-msst20,
  author       = {Oceane Bel and Kenneth Chang and Nathan Tallent and Dirk Duellmann and Ethan L. Miller and Faisal Nawab and Darrell D. E. Long},
  title        = {Geomancy: Automated Performance Enhancement through Data Layout Optimization},
  booktitle    = {Proceedings of the Conference on Mass Storage Systems and Technologies (MSST '20)},
  month        = oct,
  year         = {2020},
}
Last modified 5 Aug 2020