Fermilab Computing Division

Challenges and Opportunities in Large-Scale Storage Systems


Full Title: Challenges and Opportunities in Large-Scale Storage Systems
Date & Time: 29 Mar 2012 at 14:00
Event Location: FCC1 Conference Room
Speaker: Dr. Ioan Raicu, Assistant Professor in the Department of Computer Science at Illinois Institute of Technology

Abstract:
Exascale computers will enable the unraveling of significant scientific mysteries. Predictions are that 2019 will be the year of exascale, with systems comprising millions of compute nodes and billions of threads of execution. The current architecture of high-end computing systems is decades old and has persisted as we scaled from gigascale to petascale. In this architecture, storage is completely segregated from the compute resources and is connected via a network interconnect. This approach will not scale by the several orders of magnitude needed in concurrency and throughput, and will thus prevent the move from petascale to exascale. At exascale, basic functionality at high concurrency levels will suffer poor performance, which, combined with a system mean time to failure measured in hours, will lead to a performance collapse for large-scale heroic applications. Storage has the potential to be the Achilles' heel of exascale systems. We propose that future high-end computing systems be designed with non-volatile memory on every compute node, allowing every compute node to actively participate in metadata and data management, and leveraging many-core processors and the high bisection bandwidth of torus networks. This presentation discusses a revolutionary new distributed storage architecture that will make exascale computing more tractable, touching virtually all disciplines in high-end computing and fueling scientific discovery.
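The abstract's core idea, every compute node holding local non-volatile memory and sharing responsibility for metadata, can be illustrated with a toy consistent-hashing sketch: each node owns a slice of the metadata key space, so a lookup resolves to an owner node without any central metadata server. This is a generic illustration of the general approach, not the speaker's actual design; all names here are hypothetical.

```python
import hashlib
from bisect import bisect_right

class MetadataRing:
    """Toy consistent-hash ring: each compute node owns a slice of the
    metadata key space, so lookups need no central metadata server."""

    def __init__(self, nodes, vnodes=64):
        # Place many virtual points per node on the ring to smooth load.
        self._ring = sorted(
            (self._h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes)
        )
        self._keys = [k for k, _ in self._ring]

    @staticmethod
    def _h(s):
        # Hash a string to an integer position on the ring.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def owner(self, path):
        # The first ring point clockwise from the path's hash owns its metadata.
        i = bisect_right(self._keys, self._h(path)) % len(self._ring)
        return self._ring[i][1]

# Hypothetical cluster of 1024 compute nodes.
nodes = [f"node{i:04d}" for i in range(1024)]
ring = MetadataRing(nodes)
print(ring.owner("/project/sim/output/step-000123.dat"))
```

Because the mapping depends only on the hash, adding or removing a node remaps only a small fraction of keys, which is one reason schemes of this family suit the fully decentralized, node-local storage model the talk argues for.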

Speaker Bio:
Dr. Ioan Raicu is an assistant professor in the Department of Computer Science (CS) at Illinois Institute of Technology (IIT), as well as a guest research faculty member in the Mathematics and Computer Science Division (MCS) at Argonne National Laboratory (ANL). He is also the founder and director of the Data-Intensive Distributed Systems Laboratory (DataSys) at IIT. He received the prestigious NSF CAREER award (2011 - 2015) for his innovative work on distributed file systems for exascale computing. He was an NSF/CRA Computing Innovation Fellow at Northwestern University in 2009 - 2010, and obtained his Ph.D. in Computer Science from the University of Chicago under the guidance of Dr. Ian Foster in March 2009. He is a 3-year award winner of the GSRP Fellowship from NASA Ames Research Center. His research work and interests are in the general area of distributed systems. His work focuses on a relatively new paradigm, Many-Task Computing (MTC), which aims to bridge the gap between the two predominant paradigms in distributed systems, High-Throughput Computing (HTC) and High-Performance Computing (HPC). His work has focused on defining and exploring both the theory and the practical aspects of realizing MTC across a wide range of large-scale distributed systems. He is particularly interested in resource management in large-scale distributed systems, with a focus on many-task computing, data-intensive computing, cloud computing, grid computing, and many-core computing. His work has been funded by the NASA Ames Research Center, the DOE Office of Advanced Scientific Computing Research, the NSF/CRA CIFellows program, and the NSF CAREER program. He is a member of the IEEE and ACM. More information can be found at http://www.cs.iit.edu/~iraicu/, http://datasys.cs.iit.edu/, or http://www.linkedin.com/in/ioanraicu.



Other documents for this event

CS-doc-#  Title                                                         Author(s)            Topic(s)                        Last Updated
5847-v1   HEPiX 2016 Fall Workshop - Fermilab Site Report               Glenn Cooper et al.  Communications & Outreach       17 Oct 2016
4701-v1   Challenges and opportunities in large-scale storage systems   Ioan Raicu           Computing Techniques Seminars   02 Apr 2012
