I have shared OFA workshop materials on my WeChat official account twice before:
"RDMA and HPC Materials: 2019 OpenFabrics Alliance Workshop"
"OpenFabrics Alliance Workshop Materials (2021, 2020)"
For the history of the OFA itself, though, an earlier article by Tang Jie, "SoC Part 3: AWS Elastic Fabric Adapter", covers it better:
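“
RDMA technology is driven by OFA [7], an industry organization led by Mellanox...
OFA is an industry organization founded in 2004, at the time the HPC industry as a whole was moving from Myrinet [8] to IB. In 2005, Myrinet held a 28% share of the TOP500; after that it declined steadily and was replaced by IB. For a technology born in the specialized HPC field, usability has always been a big problem: HPC does everything for performance, with no virtualization and no general-purpose operating systems or architectures, and every supercomputer would practically like to be a system unto itself. A look at Mellanox's family of Linux drivers shows how complicated this gets.
”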
BTW, Intel has been one of the more active OFA members in recent years; for example, CXL 3.1 is also among the topics of this workshop.
2024 OFA Virtual Workshop materials, Baidu Netdisk download
Link: https://pan.baidu.com/s/1WPIT1LqEegAlAEjcoNZTuQ?pwd=4dfy
Extraction code: 4dfy
Official site: https://www.openfabrics.org/2024-ofa-virtual-workshop-agenda/ (it also has the videos, which are hosted outside the Great Firewall)
Talk topics
Session 1
"OFI 2.0 Update"
Jianxin Xiong, Intel
Session 2
"Status of OpenFabrics Interfaces (OFI) Support in MPICH"
Yanfei Guo, Argonne National Laboratory
Session 3
"Accelerating MPI AllReduce Communication with Efficient GPU-Based Compression Schemes on Modern GPU Clusters"
Hari Subramoni and Qinghua Zhou, The Ohio State University
Session 4
"High Performance & Scalable MPI library over Broadcom RoCE"
Mustafa Abduljabbar, The Ohio State University; Hemal Shah, Broadcom Inc; and Shulei Xu, The Ohio State University
Session 5
"Scaling Large Language Model Training using Hybrid GPU-based Compression in MVAPICH"
Aamir Shafi and Lang Xu, The Ohio State University
Session 6
"OFI Integrated Shared Memory Offload"
Alexia Ingerson, Intel; Shi Jin, Amazon; and Amir Shehata, Oak Ridge National Laboratory
Session 7
"Managing Composable Disaggregated Infrastructure With OFA Sunfish"
Christian Pinto, IBM Research Europe; Michael Aguilar, Sandia National Laboratories; Phil Cayton, Intel; Russ Herrell, Hewlett Packard Enterprise; and Brian Pan, H3 Platform