Fermilab Computing Division

Towards Data-Intensive Extreme-Scale Computing

Full Title: Towards Data-Intensive Extreme-Scale Computing
Date & Time: 08 Jan 2015 at 13:00
Event Location: Comitium WH2SE
Event Topic(s): Computing Techniques Seminar
Event Moderator(s):
Event Info:
Speaker:
Dr. Ioan Raicu, Illinois Institute of Technology

Abstract:
The state-of-the-art, yet decades-old, architecture of high-performance computing systems keeps computation and storage separate. This design has shown its limits for today's data-intensive applications, because every I/O operation must be transferred over the network between the compute and storage cliques. This work aims to design, implement, and evaluate a new distributed storage system for extreme-scale data-intensive computing. We propose a distributed storage layer local to the compute nodes, which handles most of the I/O operations and avoids an extreme amount of data movement between compute and storage resources. We have designed and implemented a distributed file system, FusionFS, for HPC compute nodes to support metadata-intensive and write-intensive operations. It supports a variety of data-access semantics, from POSIX-like interfaces for generality to relaxed semantics for increased scalability. FusionFS has numerous advanced features to improve performance (e.g., caching and compression), reliability (e.g., replication and erasure codes), and functionality (e.g., provenance capture and query). FusionFS has been deployed and evaluated on up to 16K compute nodes of an IBM BlueGene/P supercomputer, showing orders-of-magnitude improvement in metadata and I/O performance. We have compared FusionFS with other leading distributed storage systems such as GPFS, PVFS, HDFS, S3, Cassandra, Memcached, and DynamoDB, and FusionFS has always come out ahead in performance, functionality, or both. We have also carried out a detailed performance evaluation with various scientific applications. An extensive simulation-based evaluation of FusionFS shows near-linear scalability up to two million nodes.
The long-term goal of FusionFS is to scale to exascale levels with millions of nodes, billions of cores, petabytes-per-second I/O rates, and billions of operations per second, on real systems, accelerating real data-intensive scientific applications at extreme scales.

Biography:
Dr. Ioan Raicu is an assistant professor in the Department of Computer Science (CS) at Illinois Institute of Technology (IIT), as well as a guest research faculty member in the Math and Computer Science Division (MCS) at Argonne National Laboratory (ANL). He is also the founder (2011) and director of the Data-Intensive Distributed Systems Laboratory (DataSys) at IIT. He has received the prestigious NSF CAREER award (2011 - 2015) for his innovative work on distributed file systems for exascale computing. He is also the recipient of the IIT Junior Faculty Research Award in 2013. He was an NSF/CRA Computation Innovation Fellow at Northwestern University in 2009 - 2010, and obtained his Ph.D. in Computer Science from the University of Chicago under the guidance of Dr. Ian Foster in March 2009. He is a 3-year award winner of the GSRP Fellowship from NASA Ames Research Center. His research work and interests are in the general area of distributed systems. His work focuses on the relatively new paradigm of Many-Task Computing (MTC), which aims to bridge the gap between two predominant paradigms in distributed systems, High-Throughput Computing (HTC) and High-Performance Computing (HPC). His work has focused on defining and exploring both the theoretical and practical aspects of realizing MTC across a wide range of large-scale distributed systems. He is particularly interested in resource management in large-scale distributed systems, with a focus on many-task computing, data-intensive computing, cloud computing, grid computing, and many-core computing. Over the past decade, he has co-authored over 100 peer-reviewed articles, book chapters, books, theses, and dissertations, which have received over 4576 citations, with an H-index of 27. His work has been funded by the NASA Ames Research Center, the DOE Office of Advanced Scientific Computing Research, the NSF/CRA CIFellows program, and the NSF CAREER program.
He has also founded and chaired several workshops, such as the ACM Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS), the IEEE Int. Workshop on Data-Intensive Computing in the Clouds (DataCloud), and the ACM Workshop on Scientific Cloud Computing (ScienceCloud). He is on the editorial board of the IEEE Transactions on Cloud Computing (TCC), the Springer Journal of Cloud Computing Advances, Systems and Applications (JoCCASA), and the Springer Cluster Computing Journal (Cluster). He has held leadership roles in several high-profile conferences, such as HPDC, CCGrid, Grid, eScience, Cluster, and ICAC. He is a member of the IEEE and ACM. More information can be found at http://www.cs.iit.edu/~iraicu/.


