First look at two dual-socket EPYC 7773X servers with BlueField-2 DPUs

The group has taken delivery of new hardware, and this time we are trying something different: each server carries an NVIDIA BlueField-2 DPU SmartNIC supporting 200 Gb/s InfiniBand. Between them, the two servers hold four AMD EPYC 7773X processors, whose combined compute exceeds that of a dual-socket 9684X system. Will these two machines running in parallel actually beat a single dual-socket 9684X? We shall see. Today's goals: 1) set up LVM on the drives; 2) install the SmartNIC driver; 3) update the SmartNIC firmware; 4) benchmark the SmartNICs and the servers.

First, a good look at the server itself.

This server, like the one we tested previously (https://www.cfdem.cn/amd9654es-benchmark/), is an ASUS machine, and the two look much alike.

After racking came configuration and testing. Once the basics were done (IP addresses, users, SSH, package mirrors), we set up the two Samsung PM1733 7.68 TB drives and assembled them into an LVM logical volume. Both drives sit in server 1 and are shared with the other server over NFS. For the detailed steps see the ChatGPT transcript (https://chatgpt.com/share/1d25edce-44f7-406c-943f-91044324a64e); they are not repeated here.
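The LVM + NFS setup follows the standard pattern; here is a minimal sketch. The device names /dev/nvme0n1 and /dev/nvme1n1, the volume names, and the export subnet are assumptions — verify your devices with lsblk before running anything.

```shell
# Assumed device names -- verify with `lsblk` before running.
sudo pvcreate /dev/nvme0n1 /dev/nvme1n1          # register both NVMe drives as physical volumes
sudo vgcreate vg_data /dev/nvme0n1 /dev/nvme1n1  # pool them into a single volume group
sudo lvcreate -l 100%FREE -n lv_data vg_data     # one logical volume spanning all free space
sudo mkfs.ext4 /dev/vg_data/lv_data
sudo mkdir -p /data
sudo mount /dev/vg_data/lv_data /data

# Share the volume with the second server over NFS (subnet is an assumption):
echo '/data 169.254.0.0/16(rw,async,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra
```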

Installing the NIC driver and updating the firmware was a winding road, so here is a detailed record that may save others some trouble. This NIC is an engineering sample: NVIDIA does not publish firmware for this exact model, and a Bilibili uploader warns against flashing firmware from similar retail products, which can easily brick the card. The card also does not appear to support IB mode and defaults to Ethernet mode; with no configuration at all, Ubuntu 20.04 recognizes it as a 200 Gbps Ethernet NIC, so it works out of the box. But we were not satisfied and wanted to unlock IB mode. First, find the card's PCI addresses with lspci | grep Mellanox:

21:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
21:00.1 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)
61:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
61:00.1 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)

Then query the NIC's configuration parameters with mstconfig -d 21:00.0 q:

Device #1:
----------

Device type:    BlueField2
Name:           MBF2M345A-VENOT_ES_Ax
Description:    NVIDIA BlueField-2 E-Series Eng. sample DPU; 200GbE single-port QSFP56; PCIe Gen4 x16; Secure Boot Disabled; Crypto Enabled; 16GB on-board DDR; 1GbE OOB management
Device:         21:00.0

Configurations:                              Next Boot
         MEMIC_BAR_SIZE                      0
         MEMIC_SIZE_LIMIT                    _256KB(1)
         HOST_CHAINING_MODE                  DISABLED(0)
         HOST_CHAINING_CACHE_DISABLE         False(0)
         HOST_CHAINING_DESCRIPTORS           Array[0..7]
         HOST_CHAINING_TOTAL_BUFFER_SIZE     Array[0..7]
         INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)
         FLEX_PARSER_PROFILE_ENABLE          0
         PROG_PARSE_GRAPH                    False(0)
         FLEX_IPV4_OVER_VXLAN_PORT           0
         ROCE_NEXT_PROTOCOL                  254
         ESWITCH_HAIRPIN_DESCRIPTORS         Array[0..7]
         ESWITCH_HAIRPIN_TOT_BUFFER_SIZE     Array[0..7]
         PF_BAR2_SIZE                        3
         NON_PREFETCHABLE_PF_BAR             False(0)
         VF_VPD_ENABLE                       False(0)
         PF_NUM_PF_MSIX_VALID                False(0)
         PER_PF_NUM_SF                       False(0)
         STRICT_VF_MSIX_NUM                  False(0)
         VF_NODNIC_ENABLE                    False(0)
         NUM_PF_MSIX_VALID                   True(1)
         NUM_OF_VFS                          8
         NUM_OF_PF                           1
         PF_BAR2_ENABLE                      True(1)
         HIDE_PORT2_PF                       False(0)
         SRIOV_EN                            True(1)
         PF_LOG_BAR_SIZE                     5
         VF_LOG_BAR_SIZE                     0
         NUM_PF_MSIX                         63
         NUM_VF_MSIX                         11
         INT_LOG_MAX_PAYLOAD_SIZE            AUTOMATIC(0)
         PCIE_CREDIT_TOKEN_TIMEOUT           0
         ACCURATE_TX_SCHEDULER               False(0)
         PARTIAL_RESET_EN                    False(0)
         RESET_WITH_HOST_ON_ERRORS           False(0)
         NVME_EMULATION_ENABLE               False(0)
         NVME_EMULATION_NUM_VF               0
         NVME_EMULATION_NUM_PF               1
         NVME_EMULATION_VENDOR_ID            5555
         NVME_EMULATION_DEVICE_ID            24577
         NVME_EMULATION_CLASS_CODE           67586
         NVME_EMULATION_REVISION_ID          0
         NVME_EMULATION_SUBSYSTEM_VENDOR_ID  0
         NVME_EMULATION_SUBSYSTEM_ID         0
         NVME_EMULATION_NUM_MSIX             0
         NVME_EMULATION_MAX_QUEUE_DEPTH      0
         PCI_SWITCH_EMULATION_NUM_PORT       0
         PCI_SWITCH_EMULATION_ENABLE         False(0)
         VIRTIO_NET_EMULATION_ENABLE         False(0)
         VIRTIO_NET_EMULATION_NUM_VF         0
         VIRTIO_NET_EMULATION_NUM_PF         0
         VIRTIO_NET_EMU_SUBSYSTEM_VENDOR_ID  6900
         VIRTIO_NET_EMULATION_SUBSYSTEM_ID   1
         VIRTIO_NET_EMULATION_NUM_MSIX       2
         VIRTIO_BLK_EMULATION_ENABLE         False(0)
         VIRTIO_BLK_EMULATION_NUM_VF         0
         VIRTIO_BLK_EMULATION_NUM_PF         0
         VIRTIO_BLK_EMU_SUBSYSTEM_VENDOR_ID  6900
         VIRTIO_BLK_EMULATION_SUBSYSTEM_ID   2
         VIRTIO_BLK_EMULATION_NUM_MSIX       2
         PCI_DOWNSTREAM_PORT_OWNER           Array[0..15]
         CQE_COMPRESSION                     BALANCED(0)
         IP_OVER_VXLAN_EN                    False(0)
         MKEY_BY_NAME                        False(0)
         PRIO_TAG_REQUIRED_EN                False(0)
         UCTX_EN                             True(1)
         REAL_TIME_CLOCK_ENABLE              False(0)
         RDMA_SELECTIVE_REPEAT_EN            False(0)
         PCI_ATOMIC_MODE                     PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
         TUNNEL_ECN_COPY_DISABLE             False(0)
         LRO_LOG_TIMEOUT0                    6
         LRO_LOG_TIMEOUT1                    7
         LRO_LOG_TIMEOUT2                    8
         LRO_LOG_TIMEOUT3                    13
         LOG_TX_PSN_WINDOW                   7
         LOG_MAX_OUTSTANDING_WQE             7
         TUNNEL_IP_PROTO_ENTROPY_DISABLE     False(0)
         ICM_CACHE_MODE                      DEVICE_DEFAULT(0)
         TLS_OPTIMIZE                        False(0)
         TX_SCHEDULER_BURST                  0
         ROCE_CC_LEGACY_DCQCN                True(1)
         LOG_DCR_HASH_TABLE_SIZE             11
         DCR_LIFO_SIZE                       16384
         ROCE_CC_PRIO_MASK_P1                255
         CLAMP_TGT_RATE_AFTER_TIME_INC_P1    True(1)
         CLAMP_TGT_RATE_P1                   False(0)
         RPG_TIME_RESET_P1                   300
         RPG_BYTE_RESET_P1                   32767
         RPG_THRESHOLD_P1                    1
         RPG_MAX_RATE_P1                     0
         RPG_AI_RATE_P1                      5
         RPG_HAI_RATE_P1                     50
         RPG_GD_P1                           11
         RPG_MIN_DEC_FAC_P1                  50
         RPG_MIN_RATE_P1                     1
         RATE_TO_SET_ON_FIRST_CNP_P1         0
         DCE_TCP_G_P1                        1019
         DCE_TCP_RTT_P1                      1
         RATE_REDUCE_MONITOR_PERIOD_P1       4
         INITIAL_ALPHA_VALUE_P1              1023
         MIN_TIME_BETWEEN_CNPS_P1            4
         CNP_802P_PRIO_P1                    6
         CNP_DSCP_P1                         48
         LLDP_NB_DCBX_P1                     False(0)
         LLDP_NB_RX_MODE_P1                  OFF(0)
         LLDP_NB_TX_MODE_P1                  OFF(0)
         DCBX_IEEE_P1                        True(1)
         DCBX_CEE_P1                         True(1)
         DCBX_WILLING_P1                     True(1)
         KEEP_ETH_LINK_UP_P1                 True(1)
         KEEP_IB_LINK_UP_P1                  False(0)
         KEEP_LINK_UP_ON_BOOT_P1             False(0)
         KEEP_LINK_UP_ON_STANDBY_P1          False(0)
         DO_NOT_CLEAR_PORT_STATS_P1          False(0)
         AUTO_POWER_SAVE_LINK_DOWN_P1        False(0)
         NUM_OF_VL_P1                        _4_VLs(3)
         NUM_OF_TC_P1                        _8_TCs(0)
         NUM_OF_PFC_P1                       8
         VL15_BUFFER_SIZE_P1                 0
         DUP_MAC_ACTION_P1                   LAST_CFG(0)
         UNKNOWN_UPLINK_MAC_FLOOD_P1         False(0)
         SRIOV_IB_ROUTING_MODE_P1            LID(1)
         IB_ROUTING_MODE_P1                  LID(1)
         PF_TOTAL_SF                         0
         PF_SF_BAR_SIZE                      0
         PF_NUM_PF_MSIX                      63
         ROCE_CONTROL                        ROCE_ENABLE(2)
         PCI_WR_ORDERING                     per_mkey(0)
         MULTI_PORT_VHCA_EN                  False(0)
         PORT_OWNER                          True(1)
         ALLOW_RD_COUNTERS                   True(1)
         RENEG_ON_CHANGE                     True(1)
         TRACER_ENABLE                       True(1)
         IP_VER                              IPv4(0)
         BOOT_UNDI_NETWORK_WAIT              0
         UEFI_HII_EN                         True(1)
         BOOT_DBG_LOG                        False(0)
         UEFI_LOGS                           DISABLED(0)
         BOOT_VLAN                           1
         LEGACY_BOOT_PROTOCOL                PXE(1)
         BOOT_RETRY_CNT                      NONE(0)
         BOOT_INTERRUPT_DIS                  False(0)
         BOOT_LACP_DIS                       True(1)
         BOOT_VLAN_EN                        False(0)
         BOOT_PKEY                           0
         P2P_ORDERING_MODE                   DEVICE_DEFAULT(0)
         EXP_ROM_VIRTIO_NET_PXE_ENABLE       True(1)
         EXP_ROM_VIRTIO_NET_UEFI_x86_ENABLE  True(1)
         EXP_ROM_VIRTIO_BLK_UEFI_x86_ENABLE  True(1)
         EXP_ROM_NVME_UEFI_x86_ENABLE        True(1)
         ATS_ENABLED                         False(0)
         DYNAMIC_VF_MSIX_TABLE               False(0)
         EXP_ROM_UEFI_ARM_ENABLE             True(1)
         EXP_ROM_UEFI_x86_ENABLE             True(1)
         EXP_ROM_PXE_ENABLE                  True(1)
         ADVANCED_PCI_SETTINGS               False(0)
         SAFE_MODE_THRESHOLD                 10
         SAFE_MODE_ENABLE                    True(1)


For background on the QSFP56 form factor, see https://community.fs.com/cn/article/introduction-to-qsfp56-form-factor.html

Inspecting the NICs with the ibstat command gives:

CA 'mlx5_0'
        CA type: MT41686
        Number of ports: 1
        Firmware version: 24.31.0356
        Hardware version: 1
        Node GUID: 0xb8cef60300fd0d80
        System image GUID: 0xb8cef60300fd0d80
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 200
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x00010000
                Port GUID: 0xbacef6fffefd0d80
                Link layer: Ethernet
CA 'mlx5_1'
        CA type: MT41686
        Number of ports: 1
        Firmware version: 24.31.0356
        Hardware version: 1
        Node GUID: 0xb8cef60300f661e6
        System image GUID: 0xb8cef60300f661e6
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 200
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x00010000
                Port GUID: 0xbacef6fffef661e6
                Link layer: Ethernet

NVIDIA's latest BlueField drivers ship as part of the DOCA development kit. Download the deb package for the matching release and install it with dpkg / apt-get. DOCA conflicts with a previously installed OFED stack, so remove the old drivers first:

for f in $( dpkg --list | grep doca | awk '{print $2}' ); do echo $f ; apt remove --purge $f -y ; done
/usr/sbin/ofed_uninstall.sh --force
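With the old stack removed, the install itself is short. A sketch of the usual sequence follows — the deb filename below is a placeholder, not the actual file we used; take the doca-host package matching your exact Ubuntu release from NVIDIA's download page:

```shell
# Placeholder filename -- substitute the doca-host deb matching your Ubuntu release.
sudo dpkg -i doca-host_*-ubuntu2004_amd64.deb   # registers NVIDIA's apt repository
sudo apt-get update
sudo apt-get install -y doca-all                # drivers, firmware tools, and userspace stack
```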

After installation, confirm that the driver is running with systemctl status rshim:

● rshim.service - rshim driver for BlueField SoC
Loaded: loaded (/lib/systemd/system/rshim.service; disabled; vendor preset: enabled)
Active: active (running) since Wed 2024-07-17 15:46:33 CST; 1s ago
Docs: man:rshim(8)
Process: 86070 ExecStart=/usr/sbin/rshim $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 86071 (rshim)
Tasks: 6 (limit: 154304)
Memory: 3.2M
CPU: 811ms
CGroup: /system.slice/rshim.service
└─86071 /usr/sbin/rshim

Jul 17 15:46:33 ps systemd[1]: Started rshim driver for BlueField SoC.
Jul 17 15:46:33 ps rshim[86071]: Probing pcie-0000:61:00.1(vfio)
Jul 17 15:46:33 ps rshim[86071]: Create rshim pcie-0000:61:00.1
Jul 17 15:46:33 ps rshim[86071]: Fall-back to uio
Jul 17 15:46:33 ps rshim[86071]: rshim pcie-0000:61:00.1 enable
Jul 17 15:46:34 ps rshim[86071]: rshim0 attached
Jul 17 15:46:34 ps rshim[86071]: Probing pcie-0000:21:00.1(vfio)
Jul 17 15:46:34 ps rshim[86071]: Create rshim pcie-0000:21:00.1
Jul 17 15:46:34 ps rshim[86071]: Fall-back to uio
Jul 17 15:46:34 ps rshim[86071]: rshim pcie-0000:21:00.1 enable

If it is not running correctly, a reboot may fix it. Install doca-all on the other server the same way. The installation itself should go smoothly, but we misread the OS version and wasted a round of troubleshooting, hence this warning: DOCA versions are named by release date, just like Ubuntu versions, so carefully distinguish the DOCA version from the Ubuntu version and do not download the wrong one. A mismatched version produces tangled dependency errors and is hard to uninstall completely.

The stock firmware on this ES card does not support InfiniBand, while the corresponding retail models do. Flashing retail firmware might unlock InfiniBand, but the risk is high, so we start with simple tests and will cover firmware flashing in a later update. 200 Gb Ethernet is already plenty fast, so we will first run MPI in parallel over Ethernet. At this bandwidth the bottleneck is usually latency rather than throughput, which is exactly where InfiniBand has the edge over Ethernet.
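For reference, on retail ConnectX-6 / BlueField-2 cards the port protocol is normally switched with mstconfig rather than a firmware flash. Note that no LINK_TYPE_P1 parameter appears in the mstconfig query output earlier, which is consistent with this ES firmware being Ethernet-only; the sketch below shows what the switch would look like on a retail card and is expected to fail on this sample.

```shell
# On a retail card this switches port 1 to InfiniBand (IB=1, ETH=2).
# On this ES sample LINK_TYPE_P1 is not exposed, so expect an error here.
sudo mstconfig -d 21:00.0 set LINK_TYPE_P1=1

# Apply with a firmware reset (or simply reboot):
sudo mlxfwreset -d 21:00.0 reset
```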

After assigning IP addresses, measure throughput with iperf3. Start the iperf server on one machine:

iperf3 -s

and start the iperf client on the other:

iperf3 -c 169.254.174.32 -P 16 -t 30

An excerpt of the results:

Connecting to host 169.254.174.32, port 5201
[  5] local 169.254.16.66 port 59382 connected to 169.254.174.32 port 5201
[  7] local 169.254.16.66 port 59394 connected to 169.254.174.32 port 5201
[  9] local 169.254.16.66 port 59408 connected to 169.254.174.32 port 5201
[ 11] local 169.254.16.66 port 59422 connected to 169.254.174.32 port 5201
[ 13] local 169.254.16.66 port 59434 connected to 169.254.174.32 port 5201
[ 15] local 169.254.16.66 port 59436 connected to 169.254.174.32 port 5201
[ 17] local 169.254.16.66 port 59440 connected to 169.254.174.32 port 5201
[ 19] local 169.254.16.66 port 59454 connected to 169.254.174.32 port 5201
[ 21] local 169.254.16.66 port 59460 connected to 169.254.174.32 port 5201
[ 23] local 169.254.16.66 port 59464 connected to 169.254.174.32 port 5201
[ 25] local 169.254.16.66 port 59472 connected to 169.254.174.32 port 5201
[ 27] local 169.254.16.66 port 59480 connected to 169.254.174.32 port 5201
[ 29] local 169.254.16.66 port 59496 connected to 169.254.174.32 port 5201
[ 31] local 169.254.16.66 port 59504 connected to 169.254.174.32 port 5201
[ 33] local 169.254.16.66 port 59514 connected to 169.254.174.32 port 5201
[ 35] local 169.254.16.66 port 59526 connected to 169.254.174.32 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   157 MBytes  1.32 Gbits/sec    0    314 KBytes
[  7]   0.00-1.00   sec   161 MBytes  1.35 Gbits/sec    0    482 KBytes
[  9]   0.00-1.00   sec   160 MBytes  1.35 Gbits/sec    0    520 KBytes
[ 11]   0.00-1.00   sec   159 MBytes  1.34 Gbits/sec    0    454 KBytes
[ 13]   0.00-1.00   sec   159 MBytes  1.33 Gbits/sec    0    452 KBytes
[ 15]   0.00-1.00   sec   161 MBytes  1.35 Gbits/sec    0    650 KBytes
[ 17]   0.00-1.00   sec   160 MBytes  1.34 Gbits/sec    0    513 KBytes
[ 19]   0.00-1.00   sec   156 MBytes  1.31 Gbits/sec    0    303 KBytes
[ 21]   0.00-1.00   sec   158 MBytes  1.33 Gbits/sec    0    646 KBytes
[ 23]   0.00-1.00   sec   158 MBytes  1.33 Gbits/sec    0    400 KBytes
[ 25]   0.00-1.00   sec   161 MBytes  1.35 Gbits/sec    0    561 KBytes
[ 27]   0.00-1.00   sec   158 MBytes  1.32 Gbits/sec    0    410 KBytes
[ 29]   0.00-1.00   sec   158 MBytes  1.32 Gbits/sec    0    397 KBytes
[ 31]   0.00-1.00   sec   160 MBytes  1.34 Gbits/sec    0    467 KBytes
[ 33]   0.00-1.00   sec   157 MBytes  1.32 Gbits/sec    0    419 KBytes
[ 35]   0.00-1.00   sec   158 MBytes  1.32 Gbits/sec    0    444 KBytes
[SUM]   0.00-1.00   sec  2.48 GBytes  21.3 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -

... 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  3.74 GBytes  1.07 Gbits/sec    0             sender
[  5]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[  7]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[  7]   0.00-30.02  sec  3.75 GBytes  1.07 Gbits/sec                  receiver
[  9]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[  9]   0.00-30.02  sec  3.75 GBytes  1.07 Gbits/sec                  receiver
[ 11]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 11]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 13]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 13]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 15]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[ 15]   0.00-30.02  sec  3.76 GBytes  1.07 Gbits/sec                  receiver
[ 17]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[ 17]   0.00-30.02  sec  3.75 GBytes  1.07 Gbits/sec                  receiver
[ 19]   0.00-30.00  sec  3.74 GBytes  1.07 Gbits/sec    0             sender
[ 19]   0.00-30.02  sec  3.73 GBytes  1.07 Gbits/sec                  receiver
[ 21]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 21]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 23]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 23]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 25]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[ 25]   0.00-30.02  sec  3.75 GBytes  1.07 Gbits/sec                  receiver
[ 27]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 27]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 29]   0.00-30.00  sec  3.75 GBytes  1.07 Gbits/sec    0             sender
[ 29]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 31]   0.00-30.00  sec  3.76 GBytes  1.08 Gbits/sec    0             sender
[ 31]   0.00-30.02  sec  3.75 GBytes  1.07 Gbits/sec                  receiver
[ 33]   0.00-30.00  sec  3.74 GBytes  1.07 Gbits/sec    0             sender
[ 33]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[ 35]   0.00-30.00  sec  3.74 GBytes  1.07 Gbits/sec    0             sender
[ 35]   0.00-30.02  sec  3.74 GBytes  1.07 Gbits/sec                  receiver
[SUM]   0.00-30.00  sec  60.0 GBytes  17.2 Gbits/sec    0             sender
[SUM]   0.00-30.02  sec  59.9 GBytes  17.1 Gbits/sec                  receiver

iperf Done.

The 30 s average bitrate is 17.2 Gbps, far below the nominal 200 Gbps. Raising the number of parallel streams to 32 (-P 32) only nudges it up to 19.9 Gbps. The bottleneck may lie elsewhere in the hardware, or in iperf's methodology, so we set the iperf numbers aside for now and test RDMA performance and OpenMPI instead.
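One methodology issue worth ruling out: each iperf3 process is effectively single-threaded, so -P multiplies streams but not CPU. A common workaround is to run several independent server/client pairs on different ports and sum their [SUM] lines. A sketch, reusing the IP from the test above:

```shell
# Server side: four independent iperf3 daemons on separate ports.
for p in 5201 5202 5203 5204; do
    iperf3 -s -p "$p" -D
done

# Client side: four clients in parallel, 8 streams each;
# add up the four [SUM] lines by hand for the aggregate bitrate.
for p in 5201 5202 5203 5204; do
    iperf3 -c 169.254.174.32 -p "$p" -P 8 -t 30 &
done
wait
```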

Similarly, test RDMA write bandwidth with ib_write_bw; the results:

---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 TX depth        : 128
 CQ Moderation   : 1
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 3
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0026 PSN 0x48d983 RKey 0x184ded VAddr 0x007afee78fb000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:169:254:16:66
 remote address: LID 0000 QPN 0x0026 PSN 0xedbefd RKey 0x201dbd VAddr 0x00781cdea40000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:169:254:174:32
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
Conflicting CPU frequency values detected: 1500.000000 != 3518.545000. CPU Frequency is not max.
 65536      5000             17078.53            16826.73                  0.269228
---------------------------------------------------------------------------------------
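For reference, the invocation behind this output follows the usual perftest pattern: start the server side first with no host argument, then point the client at it. The --report_gbits flag (which we did not use here, hence the MB/sec units above) prints Gbps directly:

```shell
# Server (on 169.254.174.32):
ib_write_bw -d mlx5_0

# Client, pointed at the server; add --report_gbits to get Gbps directly:
ib_write_bw -d mlx5_0 169.254.174.32
```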

The peak bandwidth converts to a bitrate of 136.63 Gbps, the same order of magnitude as the nominal 200 Gbps, which is quite reasonable.
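The unit conversion as a one-liner (taking MB as 10^6 bytes, as the figure above does):

```shell
# 17078.53 MB/s peak -> Gbps: multiply by 8 bits/byte, divide by 1000 MB/GB
awk 'BEGIN { printf "%.2f\n", 17078.53 * 8 / 1000 }'   # -> 136.63
```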

The BlueField-2 has a great deal of potential: whether for CPU-parallel or GPU-parallel workloads, it offers serious communication capability. Today was basic setup and benchmarking; more detailed explorations to follow.


Author: 常恭

Knows a little OpenFOAM
