Authors/Presenters: Yazhou Zu, Alireza Ghaffarkhah, Hoang-Vu Dang, Brian Towles, Steven Hand, Safeen Huda, Adekunle Bello, Alexander Kolbasov, Arash Re…
First seen on securityboulevard.com
Jump to article: securityboulevard.com/2024/10/usenix-nsdi-24-resiliency-at-scale-managing-googles-tpuv4-machine-learning-supercomputer/