Does anybody know how to set the speed/rate on the Infiniband Subnet Manager?

Post by alpha754293 » 2019/09/06 22:41:41

I have a Mellanox MSB-7890 externally managed switch (36 ports, 4X EDR InfiniBand) and five nodes, each with a Mellanox ConnectX-4 dual-port 4X EDR 100 Gbps card (MCX456A-ECAT).

One of the nodes runs the subnet manager.

All of the nodes are running CentOS 7.6.1810 with the 'Infiniband Support' package group installed.

When I run ibdiagnet, it shows:

Code:

I---------------------------------------------------
I- IPoIB Subnets Check
I---------------------------------------------------
I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

However, when I run iblinkinfo, this is what it shows:

Code:

CA: aes0 mlx5_0:
      0x7cfe9003004431f8      1    1[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       3    5[  ] "SwitchIB Mellanox Technologies" ( )
CA: aes2 mlx5_0:
      0x248a0703002b1ec6      4    1[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       3    3[  ] "SwitchIB Mellanox Technologies" ( )
CA: aes3 mlx5_0:
      0x248a0703002b1eca      2    1[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       3    2[  ] "SwitchIB Mellanox Technologies" ( )
CA: aes4 mlx5_0:
      0x248a0703002b1ece      6    1[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       3    1[  ] "SwitchIB Mellanox Technologies" ( )
Switch: 0xec0d9a0300224e70 SwitchIB Mellanox Technologies:
           3    1[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       6    1[  ] "aes4 mlx5_0" ( )
           3    2[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       2    1[  ] "aes3 mlx5_0" ( )
           3    3[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       4    1[  ] "aes2 mlx5_0" ( )
           3    4[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       5    1[  ] "aes1 mlx5_0" ( )
           3    5[  ] ==( 4X      25.78125 Gbps Active/  LinkUp)==>       1    1[  ] "aes0 mlx5_0" ( )

How do I set the subnet manager's rate to something higher than 10 Gbps?

I looked through the documentation and searched online, and it doesn't appear that anybody else has run into this before.

(I found this out while trying to find out what the MTU was on the switch. All of the cards are set to 'connected mode' with an MTU of 4092 in /etc/sysconfig/network-scripts/ifcfg-ib0.)

Your help is greatly appreciated.

Thank you.

Post by chemal » 2019/09/07 03:06:49

alpha754293 wrote:
2019/09/06 22:41:41

Code:

I---------------------------------------------------
I- IPoIB Subnets Check
I---------------------------------------------------
I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

I get this too, but it refers to OpenSM's defaults for multicast groups. Are you using multicast groups? I don't.

alpha754293 wrote:
All of the cards are set to 'connected mode' with a MTU of 4092

Why? The default MTU is 65520.
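
For reference, the MTU and rate that ibdiagnet reports for the IPoIB group are normally set where OpenSM defines the partition, and they are written as InfiniBand enum codes rather than bytes or Gbps. The sketch below is only an illustration: it maps a few of those codes and prints a line in OpenSM's partition-file syntax. The helper name partition_line, the ALL=full membership, and the file location (commonly /etc/rdma/partitions.conf on RHEL/CentOS 7) are assumptions, not something confirmed in this thread.

```python
# Sketch only: build an OpenSM partition definition line for the IPoIB group.
# The code tables below cover just a subset of the InfiniBand enum values.

MTU_CODES = {256: 1, 512: 2, 1024: 3, 2048: 4, 4096: 5}   # bytes -> enum
RATE_CODES = {10: 3, 20: 6, 40: 7, 56: 12, 100: 16}       # Gbps  -> enum


def partition_line(mtu_bytes, rate_gbps, pkey=0x7FFF, name="Default"):
    """Return one partition line; 'name' and ALL=full are illustrative choices."""
    mtu = MTU_CODES[mtu_bytes]      # KeyError if the MTU is not a valid IB MTU
    rate = RATE_CODES[rate_gbps]    # KeyError if the rate is not in the subset
    return f"{name}={pkey:#06x}, ipoib, mtu={mtu}, rate={rate} : ALL=full;"


if __name__ == "__main__":
    # What ibdiagnet reported in this thread: MTU 2048, rate 10 Gbps.
    print(partition_line(2048, 10))
    # A hypothetical replacement: MTU 4096, rate 40 Gbps.
    print(partition_line(4096, 40))
```

Whether a given OpenSM build accepts the newer rate codes (for example 16 for 100 Gbps) should be checked against its own documentation, and OpenSM has to be restarted before a changed partition file takes effect.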

Post by alpha754293 » 2019/09/08 03:56:02

I'm not sure if I am using MC groups.

I'm not sure how to check that.
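
One thing worth noting here: IPoIB itself relies on a broadcast multicast group per partition, and that is the group the ibdiagnet "IPoIB Subnets Check" line describes, so any node with an IPoIB interface up is normally a member of at least that one group. A minimal sketch for pulling the group parameters out of the ibdiagnet lines quoted in this thread follows. The regular expressions are written against those two lines only and may not match other ibdiagnet versions; the function name parse_ipoib_check is made up for this example.

```python
import re

# Patterns written against the two ibdiagnet lines quoted in this thread.
SUBNET_RE = re.compile(
    r"Subnet:\s*(?P<proto>\S+)\s+PKey:(?P<pkey>0x[0-9a-fA-F]+)"
    r"\s+QKey:(?P<qkey>0x[0-9a-fA-F]+)\s+MTU:(?P<mtu>\d+)Byte"
    r"\s+rate:(?P<rate>\d+(?:\.\d+)?)Gbps\s+SL:(?P<sl>0x[0-9a-fA-F]+)"
)
WARN_RE = re.compile(
    r"Lowest member rate:(?P<member>\d+(?:\.\d+)?)Gbps\s*>\s*"
    r"group-rate:(?P<group>\d+(?:\.\d+)?)Gbps"
)


def parse_ipoib_check(text):
    """Return one dict per IPoIB group found in ibdiagnet output."""
    groups = []
    for line in text.splitlines():
        m = SUBNET_RE.search(line)
        if m:
            groups.append({
                "proto": m["proto"],
                "pkey": int(m["pkey"], 16),
                "mtu_bytes": int(m["mtu"]),
                "rate_gbps": float(m["rate"]),
                "lowest_member_gbps": None,  # filled in if a warning follows
            })
            continue
        w = WARN_RE.search(line)
        if w and groups:
            groups[-1]["lowest_member_gbps"] = float(w["member"])
    return groups


if __name__ == "__main__":
    sample = (
        "I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00\n"
        "W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps\n"
    )
    for g in parse_ipoib_check(sample):
        print(g)
```

On a live fabric, saquery -g (from infiniband-diags) lists the multicast groups the subnet manager knows about, which is a more direct way to see them than parsing the ibdiagnet report.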

re: MTU
When I try to set it to anything higher than 4092, it won't take. I don't know whether that's a limitation of the MSB-7890 externally managed switch or something else.

Code:

# ibv_devinfo
hca_id: mlx5_0
        transport:                      InfiniBand (0)
        fw_ver:                         12.24.1000
        node_guid:                      248a:0703:002b:1ed6
        sys_image_guid:                 248a:0703:002b:1ed6
        vendor_id:                      0x02c9
        vendor_part_id:                 4115
        hw_ver:                         0x0
        board_id:                       DEL2190110032
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 2
                        port_lid:               6
                        port_lmc:               0x00
                        link_layer:             InfiniBand

hca_id: mlx5_1
        transport:                      InfiniBand (0)
        fw_ver:                         12.24.1000
        node_guid:                      248a:0703:002b:1ed7
        sys_image_guid:                 248a:0703:002b:1ed6
        vendor_id:                      0x02c9
        vendor_part_id:                 4115
        hw_ver:                         0x0
        board_id:                       DEL2190110032
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_DOWN (1)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               65535
                        port_lmc:               0x00
                        link_layer:             InfiniBand

(see max_mtu)

Since it's an externally managed switch, I'm not sure how to change the MTU on the switch itself.
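
A note on the numbers: the max_mtu of 4096 that ibv_devinfo prints is the InfiniBand link MTU, not the IPoIB interface MTU. In datagram mode the IPoIB MTU is the InfiniBand MTU minus the 4-byte IPoIB header, which gives 4092 for a 4096-byte link; in connected mode the interface MTU can go up to 65520. A ceiling of exactly 4092 might therefore mean the interface is effectively running in datagram mode, though that is only a guess from the figures in this thread. The sketch below shows the arithmetic and reads the current mode and MTU from sysfs; the interface name ib0 comes from the ifcfg file mentioned earlier, and the helper names are made up for this example.

```python
from pathlib import Path

IB_MTU_ENUM = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}  # ibv_devinfo "(5)" -> 4096
IPOIB_HEADER_BYTES = 4           # IPoIB encapsulation header
CONNECTED_MODE_MAX_MTU = 65520   # upper limit for connected mode


def max_ipoib_mtu(ib_mtu_bytes, mode):
    """Largest IPoIB interface MTU for a given IB MTU and IPoIB mode."""
    if mode == "connected":
        return CONNECTED_MODE_MAX_MTU
    if mode == "datagram":
        return ib_mtu_bytes - IPOIB_HEADER_BYTES
    raise ValueError(f"unknown IPoIB mode: {mode!r}")


def read_iface(ifname="ib0"):
    """Read (mode, mtu) from sysfs; only works on a host with that interface."""
    base = Path("/sys/class/net") / ifname
    mode = (base / "mode").read_text().strip()
    mtu = int((base / "mtu").read_text())
    return mode, mtu


if __name__ == "__main__":
    print(max_ipoib_mtu(IB_MTU_ENUM[5], "datagram"))   # 4096-byte IB MTU
    print(max_ipoib_mtu(IB_MTU_ENUM[5], "connected"))
    try:
        print(read_iface("ib0"))
    except OSError as exc:
        print(f"could not read ib0 from sysfs: {exc}")
```

If read_iface reports datagram on a node whose ifcfg-ib0 asks for connected mode, the mode setting is not being applied, and that would be worth investigating before looking at the switch.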


Post by alpha754293 » 2019/09/09 03:16:29

Thanks. Yeah, I read the Red Hat documentation on configuring the subnet manager, but it doesn't really say why the rate doesn't change.

The Bugzilla entry says the fix is planned for RHEL 7.7 and 8.1, which presumably means that until CentOS catches up, it will remain a bug here with no fix available (although that bug was specifically about MTU, not the SM rate).

I'll have to try changing /etc/rdma/opensm.conf again and see if it works this time.

Thanks.
