[AI Hardware & Systems Design Track]: Cloud Resiliency in the Age of High-Performance Computing

As high-performance computing (HPC) and artificial intelligence (AI) drive unprecedented advances, a sound cloud strategy becomes vital. With cloud infrastructure increasingly integral to demanding computational workloads, maintaining the availability and robustness of these systems is paramount.

This panel will delve into the critical intersection of HPC/AI and cloud technology, spotlighting strategies for ensuring uninterrupted operations in the face of emerging challenges. The session brings together leading experts to examine architectural design paradigms that foster robustness, along with redundancy trade-offs, load balancing, intelligent fault detection, and predictive monitoring mechanisms. They will share best practices for optimizing resource allocation, orchestrating seamless workload migrations, and deploying resilient cloud-native solutions. By exploring real-world cases and emerging trends, this discussion aims to equip data center and cloud professionals with practical guidance to elevate their resiliency strategies amid evolving computational demands.

Speaker(s):

Moderator

Alam Akbar

Director, Product Marketing

proteanTecs

Alam Akbar is a veteran of the semiconductor industry with experience spanning multiple engineering, product management, and product marketing roles. He holds a Bachelor of Science degree in Electrical Engineering from Texas A&M and an MBA from Santa Clara University.

Alam began his career at Synopsys as an Application Consultant, where he helped grow the company's market share in the signoff domain. He then joined the business management team at Cadence, where he helped launch a new physical verification solution. After Cadence, Alam joined Intel Foundry Services as a design kit program manager, and then moved into the client compute group as director of product marketing. There, he helped scale Intel's storage business and developed product strategy for new memory solutions for the PC market.

At proteanTecs, he's part of a team that's bringing greater insight into the health and performance of semiconductors across the value chain, from the design stage to in-field operation and every step in between.

Panellists

Venkat Ramesh

Hardware Systems Engineer

Yun Jin

Engineering Director

Paolo Faraboschi

Vice President and HPE Fellow; Director, AI Research Lab

Hewlett Packard Labs, HPE

Paolo Faraboschi is a Vice President and HPE Fellow who directs the Artificial Intelligence Research Lab at Hewlett Packard Labs. Paolo has been at HP/HPE for three decades and has worked on a broad range of technologies, from embedded printer processors to exascale supercomputers. He previously led exascale computing research (2017-2020) and the hardware architecture of "The Machine" project (2014-2016), pioneered low-energy servers with HP's Project Moonshot (2010-2014), drove scalable system-level simulation research (2004-2009), and was the principal architect of a family of embedded VLIW cores (1994-2003) widely used in video SoCs and HP's printers. Paolo is an IEEE Fellow (2014) for "contributions to embedded processor architecture and system-on-chip technology", the author of over 100 publications and the book "Embedded Computing: A VLIW Approach", and holds 70 granted patents. He received a Ph.D. in EECS from the University of Genoa, Italy.