* Design Guidelines for Correct, Efficient, and Scalable Synchronization Using One-Sided RDMA

This is the source code for our SIGMOD 2023 paper.

** Paper Abstract
Remote data structures built with one-sided Remote Direct Memory Access (RDMA) are at the heart of many disaggregated database management systems today. Concurrent access to these data structures by thousands of remote workers necessitates a highly efficient synchronization scheme. Remarkably, our investigation reveals that existing synchronization schemes display substantial variations in performance and scalability. Even worse, some schemes do not correctly synchronize, resulting in rare and hard-to-detect data corruption. Motivated by these observations, we conduct the first comprehensive analysis of one-sided synchronization techniques and provide general principles for correct synchronization using one-sided RDMA. Our research demonstrates that adherence to these principles not only guarantees correctness but also results in substantial performance enhancements.

** Citation

#+begin_src bibtex
@article{10.1145/3589276,
  author = {Ziegler, Tobias and Nelson-Slivon, Jacob and Leis, Viktor and Binnig, Carsten},
  title = {Design Guidelines for Correct, Efficient, and Scalable Synchronization Using One-Sided RDMA},
  year = {2023},
  url = {https://doi.org/10.1145/3589276},
  doi = {10.1145/3589276},
  journal = {Proc. ACM Manag. Data},
}
#+end_src

** Benchmarks
The benchmarks and lock implementations can be found in `frontend`.
The experiment scripts can be found in `distexperiments/experiments`.

** Setup

*** Cluster Setup
All experiments were conducted on a 5-node cluster running Ubuntu 18.04.1 LTS with the Linux 4.15.0 kernel.
Each node is equipped with two Intel(R) Xeon(R) Gold 5120 CPUs (14 cores each), 512 GB of main memory split between both sockets, and four Samsung SSD 980 Pro M.2 1 TB drives attached via one ASRock Hyper Quad M.2 PCIe card.
The nodes are connected by an InfiniBand network using one Mellanox ConnectX-5 MT27800 NIC (InfiniBand EDR 4x, 100 Gbps) per node.

*** Mellanox RDMA
We used the following Mellanox OFED installation:

**** ofed_info
#+begin_src shell
MLNX_OFED_LINUX-5.1-2.5.8.0 (OFED-5.1-2.5.8):
Installed Packages:
-------------------
ii ar-mgr 1.0-0.3.MLNX20200824.g8577618.51258 amd64 Adaptive Routing Manager
ii dapl2-utils 2.1.10.1.mlnx-OFED.51258 amd64 Utilities for use with the DAPL libraries
ii dpcp 1.1.0-1.51258 amd64 Direct Packet Control Plane (DPCP) is a library to use Devx
ii dump-pr 1.0-0.3.MLNX20200824.g8577618.51258 amd64 Dump PathRecord Plugin
ii hcoll 4.6.3125-1.51258 amd64 Hierarchical collectives (HCOLL)
ii ibacm 51mlnx1-1.51258 amd64 InfiniBand Communication Manager Assistant (ACM)
ii ibdump 6.0.0-1.51258 amd64 Mellanox packets sniffer tool
ii ibsim 0.9-1.51258 amd64 InfiniBand fabric simulator for management
ii ibsim-doc 0.9-1.51258 all documentation for ibsim
ii ibutils2 2.1.1-0.126.MLNX20200721.gf95236b.51258 amd64 OpenIB Mellanox InfiniBand Diagnostic Tools
ii ibverbs-providers:amd64 51mlnx1-1.51258 amd64 User space provider drivers for libibverbs
ii ibverbs-utils 51mlnx1-1.51258 amd64 Examples for the libibverbs library
ii infiniband-diags 51mlnx1-1.51258 amd64 InfiniBand diagnostic programs
ii iser-dkms 5.1-OFED.5.1.2.5.3.1 all DKMS support fo iser kernel modules
ii isert-dkms 5.1-OFED.5.1.2.5.3.1 all DKMS support fo isert kernel modules
ii kernel-mft-dkms 4.15.1-100 all DKMS support for kernel-mft kernel modules
ii knem 1.1.4.90mlnx1-OFED.5.1.2.5.0.1 amd64 userspace tools for the KNEM kernel module
ii knem-dkms 1.1.4.90mlnx1-OFED.5.1.2.5.0.1 all DKMS support for mlnx-ofed kernel modules
ii libdapl-dev 2.1.10.1.mlnx-OFED.51258 amd64 Development files for the DAPL libraries
ii libdapl2 2.1.10.1.mlnx-OFED.51258 amd64 The Direct Access Programming Library (DAPL)
ii libibmad-dev:amd64 51mlnx1-1.51258 amd64 Development files for libibmad
ii libibmad5:amd64 51mlnx1-1.51258 amd64 Infiniband Management Datagram (MAD) library
ii libibnetdisc5:amd64 51mlnx1-1.51258 amd64 InfiniBand diagnostics library
ii libibumad-dev:amd64 51mlnx1-1.51258 amd64 Development files for libibumad
ii libibumad3:amd64 51mlnx1-1.51258 amd64 InfiniBand Userspace Management Datagram (uMAD) library
ii libibverbs-dev:amd64 51mlnx1-1.51258 amd64 Development files for the libibverbs library
ii libibverbs1:amd64 51mlnx1-1.51258 amd64 Library for direct userspace use of RDMA (InfiniBand/iWARP)
ii libibverbs1-dbg:amd64 51mlnx1-1.51258 amd64 Debug symbols for the libibverbs library
ii libopensm 5.7.3.MLNX20201102.e56fd90-0.1.51258 amd64 Infiniband subnet manager libraries
ii libopensm-devel 5.7.3.MLNX20201102.e56fd90-0.1.51258 amd64 Developement files for OpenSM
ii librdmacm-dev:amd64 51mlnx1-1.51258 amd64 Development files for the librdmacm library
ii librdmacm1:amd64 51mlnx1-1.51258 amd64 Library for managing RDMA connections
ii mlnx-ethtool 5.4-1.51258 amd64 This utility allows querying and changing settings such as speed,
ii mlnx-iproute2 5.6.0-1.51258 amd64 This utility allows querying and changing settings such as speed,
ii mlnx-ofed-kernel-dkms 5.1-OFED.5.1.2.5.8.1 all DKMS support for mlnx-ofed kernel modules
ii mlnx-ofed-kernel-utils 5.1-OFED.5.1.2.5.8.1 amd64 Userspace tools to restart and tune mlnx-ofed kernel modules
ii mpitests 3.2.20-5d20b49.51258 amd64 Set of popular MPI benchmarks and tools IMB 2018 OSU benchmarks ver 4.0.1 mpiP-3.3 IPM-2.0.6
ii mstflint 4.14.0-3.51258 amd64 Mellanox firmware burning application
ii openmpi 4.0.4rc3-1.51258 all Open MPI
ii opensm 5.7.3.MLNX20201102.e56fd90-0.1.51258 amd64 An Infiniband subnet manager
ii opensm-doc 5.7.3.MLNX20201102.e56fd90-0.1.51258 amd64 Documentation for opensm
ii perftest 4.4+0.5-1 amd64 Infiniband verbs performance tests
ii rdma-core 51mlnx1-1.51258 amd64 RDMA core userspace infrastructure and documentation
ii rdmacm-utils 51mlnx1-1.51258 amd64 Examples for the librdmacm library
ii sharp 2.2.2.MLNX20201102.b26a0fd-1.51258 amd64 SHArP switch collectives
ii srp-dkms 5.1-OFED.5.1.2.5.3.1 all DKMS support fo srp kernel modules
ii srptools 51mlnx1-1.51258 amd64 Tools for Infiniband attached storage (SRP)
ii ucx 1.9.0-1.51258 amd64 Unified Communication X
#+end_src
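
To check whether another machine runs the same OFED release, the header line of the listing above can be compared with the local installation. The following is a minimal sketch, assuming `ofed_info -s` is on the path and prints the release string on one line. The function name `same_ofed_release` is hypothetical and not part of this repository.

#+begin_src shell
#!/usr/bin/env bash
# Expected release, copied from the ofed_info header in this README.
EXPECTED_OFED="MLNX_OFED_LINUX-5.1-2.5.8.0"

# same_ofed_release ACTUAL: succeeds if ACTUAL starts with the expected release.
# Hypothetical helper for illustration; it is not part of this repository.
same_ofed_release() {
  case "$1" in
    "$EXPECTED_OFED"*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v ofed_info >/dev/null 2>&1; then
  # "ofed_info -s" prints only the release line of the installation.
  installed="$(ofed_info -s)"
  if same_ofed_release "$installed"; then
    echo "OFED matches: $installed"
  else
    echo "OFED differs: found $installed, README lists $EXPECTED_OFED"
  fi
else
  echo "ofed_info not found; MLNX_OFED does not appear to be installed"
fi
#+end_src

A different release does not necessarily break the experiments; the check only makes deviations from the documented environment visible.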

*** Libraries
- gflags
- lib_aio
- ibverbs
- tabulate
- rdma cm

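Before building, it can help to confirm that the development headers of these libraries are present. The sketch below is an assumption-laden helper, not part of this repository. It assumes that `lib_aio` refers to libaio, that `tabulate` is the header-only p-ranav/tabulate project, and that headers live under `/usr/include` or `/usr/local/include`. The function name `missing_headers` is hypothetical.

#+begin_src shell
#!/usr/bin/env bash
# One header per library from the list above (assumed paths, see the text).
HEADERS="gflags/gflags.h libaio.h infiniband/verbs.h tabulate/table.hpp rdma/rdma_cma.h"

# missing_headers DIR...: print each header that is found in none of the
# given include directories. Hypothetical helper for illustration.
missing_headers() {
  for header in $HEADERS; do
    found=no
    for dir in "$@"; do
      if [ -f "$dir/$header" ]; then
        found=yes
        break
      fi
    done
    [ "$found" = yes ] || echo "$header"
  done
}

missing="$(missing_headers /usr/include /usr/local/include)"
if [ -z "$missing" ]; then
  echo "all library headers found"
else
  echo "missing headers:"
  echo "$missing"
fi
#+end_src

With MLNX_OFED installed as listed above, the ibverbs and rdma cm headers would normally come from the `libibverbs-dev` and `librdmacm-dev` packages of that installation.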