(Deprecated) InfiniBand priority configuration HOWTOs
This page is deprecated: we could not get OpenSM to apply our SL2VL/VLArb configuration to our IB switches. In addition, `smpquery pi` reports that our switches support only 1 or 4 VLs, whereas the spec sheets say they should all support 8 VLs.
How do OpenSM and its daemon work?
Where does OpenSM pick up the config file?
The default location is `/etc/opensm/opensm.conf`. Use the `--config` option to point OpenSM at a different file.
How to configure the SL2VL mapping table and VL arbitration table of all the devices in our IB network?
todo: describe the qos policy file
todo: how does the credit-based flow control work? What if the credit limit is not enough to send a complete packet?
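Until the policy file is described here, the following is only a rough sketch of what an SL2VL mapping for hardware with few VLs could look like. It assumes a simple round-robin assignment of the 16 service levels onto the first N data VLs; `make_sl2vl` is a hypothetical helper name, and whether a round-robin mapping suits our traffic classes is an open question. The generated OpenSM config template should contain a `qos_sl2vl` line that takes a comma-separated list in this shape, but check the template on your system before relying on that.

```shell
#!/bin/sh
# Print a 16-entry, comma-separated SL-to-VL list that spreads
# SL0..SL15 round-robin over the first N data VLs.
# Assumption: N is the number of data VLs the port reports (e.g. 1 or 4).
make_sl2vl() {
    nvls=$1
    sl=0
    out=""
    while [ "$sl" -lt 16 ]; do
        # Append a comma only when the list is already non-empty.
        out="${out}${out:+,}$((sl % nvls))"
        sl=$((sl + 1))
    done
    printf '%s\n' "$out"
}
```

For example, `make_sl2vl 4` prints `0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3`.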
How to generate the default OpenSM config file template?
The default configuration is hardcoded in opensm and can always be regenerated. This is useful if you have broken the configuration file and have no backup.
[root@rcmaster ~]# opensm --config <some empty file> --create-config <config file name>
`--config <some empty file>` instructs opensm not to pick up the configuration from the default location, `/etc/opensm/opensm.conf`, so that the generated file contains only the built-in defaults.
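To avoid losing a hand-edited file when regenerating, the command above can be wrapped so that a backup is taken first. This is a minimal sketch, not a tested procedure: `regen_opensm_conf` and the `OPENSM` override variable are hypothetical names introduced here (the override only exists so the function can be dry-run with a stub), and the backup suffix format is arbitrary.

```shell
#!/bin/sh
# Back up an existing config file (if any), then have opensm write its
# built-in defaults to the same path.
# OPENSM is a hypothetical override for dry runs; it defaults to "opensm".
regen_opensm_conf() {
    conf=$1
    # An empty temporary file plays the role of <some empty file>.
    empty=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        cp -p "$conf" "$conf.bak.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    "${OPENSM:-opensm}" --config "$empty" --create-config "$conf"
    rc=$?
    rm -f "$empty"
    return "$rc"
}
```

Usage would be `regen_opensm_conf /etc/opensm/opensm.conf`, run as root.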
How to display all the devices in our IB network?
Use `ibnetdiscover` to perform an IB subnet discovery and output a human-readable topology file. Use `ibnodes` to print only the IB nodes (CAs and switches).
root@rcmaster:~# ibnodes
Ca : 0x0002c903000cb454 ports 1 "rc06 HCA-1"
...
Ca : 0x0002c9030057f1b4 ports 1 "rc50 HCA-1"
Ca : 0x0002c9030057f118 ports 1 "rc80 mlx4_0"
Ca : 0x0002c903000cb38c ports 1 "rcmonster HCA-1"
Ca : 0x0002c9030009eff0 ports 1 "rcnfs HCA-1"
Ca : 0x0002c9030057f208 ports 1 "rc79 mlx4_0"
Ca : 0x0002c9030009efd0 ports 1 "rcmaster HCA-1"
Switch : 0x0002c902004239a0 ports 36 "Infiniscale-IV Mellanox Technologies" base port 0 lid 41 lmc 0
Switch : 0x0002c9020041dde0 ports 36 "MF0;ib-leaf-switch3:IS5030/U1" enhanced port 0 lid 2 lmc 0
Switch : 0x0002c903005b7900 ports 36 "MF0;ib-leaf-switch2:SX60XX/U1" enhanced port 0 lid 74 lmc 0
Switch : 0x0002c90300652310 ports 36 "MF0;ib-core-switch1:SX60XX/U1" enhanced port 0 lid 76 lmc 0
Switch : 0x0002c90300652390 ports 36 "MF0;ib-core-switch2:SX60XX/U1" enhanced port 0 lid 77 lmc 0
Switch : 0x0002c90300652410 ports 36 "MF0;ib-leaf-switch1:SX60XX/U1" enhanced port 0 lid 78 lmc 0
Note that the second column is the node GUID and NOT the port GUID. To obtain the port GUID, use `iblinkinfo` (for the entire network) or `ibstat` / `ibstatus` (for the local device) instead.
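Several of the commands on this page take a LID, so it can be handy to pull the switch LIDs out of the `ibnodes` listing. The filter below is a small sketch that assumes the line layout shown above (each switch line ends in `lid <n> lmc <n>`); `switch_lids` is a hypothetical helper name.

```shell
#!/bin/sh
# Read ibnodes output on stdin and print the LID of every switch,
# one per line. Assumes switch lines end with "lid <n> lmc <n>".
switch_lids() {
    awk 'NF >= 4 && $1 ~ /^Switch/ && $(NF - 3) == "lid" { print $(NF - 2) }'
}
```

Typical use is `ibnodes | switch_lids`, which for the listing above would print 41, 2, 74, 76, 77 and 78.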
How to enable QoS policy configuration?
First, enable QoS at the OFED driver level. Edit `/etc/modprobe.d/mlnx.conf` and add the line `options mlx4_core enable_qos=1`. Then stop `/etc/init.d/opensmd`, restart `/etc/init.d/openibd`, and check that the module parameter reads `Y`.
root@rcmaster:~# cat /etc/modprobe.d/mlnx.conf
# Module parameters for MLNX_OFED kernel modules
options mlx4_core enable_qos=1
root@rcmaster:~# /etc/init.d/opensmd stop
Stopping opensmd (via systemctl): opensmd.service.
root@rcmaster:~# /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Loading HCA driver and Access Layer: [ OK ]
root@rcmaster:~# cat /sys/module/mlx4_core/parameters/enable_qos
Y
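If this step has to be repeated on several hosts, the edit can be made idempotent. The sketch below appends the option line only when that exact line is not already present; `ensure_enable_qos` is a hypothetical helper name, and the sketch does not try to merge with an existing `options mlx4_core ...` line that carries other parameters.

```shell
#!/bin/sh
# Append the mlx4_core QoS option to a modprobe config file unless the
# exact line is already there. Does not merge with other mlx4_core options.
ensure_enable_qos() {
    conf=$1
    line='options mlx4_core enable_qos=1'
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```

Usage would be `ensure_enable_qos /etc/modprobe.d/mlnx.conf`, followed by the driver restart shown above.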
Second, enable QoS in the subnet manager. In the QoS section of the OpenSM config file, change "qos FALSE" to "qos TRUE". Restart `/etc/init.d/opensmd`. Then use `smpquery` to verify that your SL2VL and VLArb settings have been applied on the fabric.
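The config change can also be scripted. This is a minimal sketch that assumes the setting appears on a line of its own as `qos FALSE`, as in our generated file; `set_opensm_qos` is a hypothetical helper name. It leaves other lines that merely start with `qos_` untouched and returns non-zero if no `qos TRUE` line exists afterwards.

```shell
#!/bin/sh
# Flip "qos FALSE" to "qos TRUE" in an OpenSM config file.
# Succeeds only if the file contains a "qos TRUE" line afterwards.
set_opensm_qos() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Require whitespace after "qos" so that other qos_* options do not match.
    sed 's/^qos[[:space:]][[:space:]]*FALSE[[:space:]]*$/qos TRUE/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"    # write back in place, keeping owner and mode
    rc=$?
    rm -f "$tmp"
    [ "$rc" -eq 0 ] && grep -q '^qos TRUE$' "$conf"
}
```

Usage would be `set_opensm_qos /etc/opensm/opensm.conf && /etc/init.d/opensmd restart`.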
How many virtual lanes do our devices (i.e., switches, HCAs) support?
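Use `smpquery pi <lid> | grep .VL` to show the `VLCap` and `OperVLs` fields of a port. Note that `smpquery pi` takes an optional port number that defaults to 0, which on a switch is the management port, so the values below describe port 0 only; an external port can be queried with `smpquery pi <lid> <port>` and may report different values.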
root@rcmaster:~# smpquery pi 78 | grep .VL
VLCap:...........................VL0
OperVLs:.........................VL0
root@rcmaster:~# smpquery pi 77 | grep .VL
VLCap:...........................VL0
OperVLs:.........................VL0
root@rcmaster:~# smpquery pi 76 | grep .VL
VLCap:...........................VL0
OperVLs:.........................VL0
root@rcmaster:~# smpquery pi 74 | grep .VL
VLCap:...........................VL0
OperVLs:.........................VL0
root@rcmaster:~# smpquery pi 2 | grep .VL
VLCap:...........................VL0-3
OperVLs:.........................VL0-3
root@rcmaster:~# smpquery pi 1 | grep .VL
VLCap:...........................VL0-3
OperVLs:.........................VL0-3
`VLCap` specifies the number of VLs implemented in the port's link layer while `OperVLs` is used by the SM to limit the number of VLs that the port is permitted to use.
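To compare many ports at a glance, the two fields can be condensed into one line per query. The filter below is a sketch that assumes the `Name:....value` layout shown in the output above; `vl_summary` is a hypothetical helper name.

```shell
#!/bin/sh
# Read "smpquery pi" output on stdin and print VLCap and OperVLs on a
# single line. Assumes fields are formatted as "Name:.....value".
vl_summary() {
    awk -F'[:.]+' '
        $1 == "VLCap"   { cap = $2 }
        $1 == "OperVLs" { oper = $2 }
        END { printf "VLCap=%s OperVLs=%s\n", cap, oper }'
}
```

A loop such as `for lid in 78 77 76 74 2; do printf '%s: ' "$lid"; smpquery pi "$lid" | vl_summary; done` would then give one summary line per LID.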
According to the following spec sheets, however, all our switches should support VL0-7:
IS5025: http://www.mellanox.com/related-docs/prod_ib_switch_systems/IS5025.pdf
IS5030: http://www.mellanox.com/related-docs/prod_ib_switch_systems/IS5030_35.pdf
MSX6036: http://www.mellanox.com/related-docs/prod_ib_switch_systems/SX6036.pdf
IS50XX User Manual: http://www.mellanox.com/related-docs/user_manuals/IS50XX_user_manual.pdf
How to read and display the current SL2VL mapping table and VL arbitration table of a device?
Use `smpquery sl2vl <lid>` and `smpquery vlarb <lid>` respectively.
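When checking whether a configuration change reached the fabric, it helps to dump both tables for a list of LIDs in one go. The function below is a small sketch around the two commands above; `dump_qos_tables` and the `SMPQUERY` override variable are hypothetical names (the override only exists so the function can be exercised without IB hardware).

```shell
#!/bin/sh
# Print the SL2VL and VL arbitration tables for each LID given as an
# argument. SMPQUERY is a hypothetical override; it defaults to "smpquery".
dump_qos_tables() {
    for lid in "$@"; do
        echo "== lid $lid: SL2VL =="
        "${SMPQUERY:-smpquery}" sl2vl "$lid"
        echo "== lid $lid: VLArb =="
        "${SMPQUERY:-smpquery}" vlarb "$lid"
    done
}
```

For example, `dump_qos_tables 2 78 > before.txt` before restarting opensmd and again into `after.txt` afterwards, followed by `diff before.txt after.txt`, would show whether anything changed.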
How can we be certain that our settings of the SL2VL and VLArb tables are working as expected?
todo: since all but one of our switches report support for only 1 VL, we must be careful to use the right switch when writing the test...
Information Sources:
https://community.mellanox.com/docs/DOC-1326
https://wiki.archlinux.org/index.php/InfiniBand
http://www.mellanox.com/page/products_dyn?product_family=26&menu_section=34
https://docs.oracle.com/cd/E18476_01/doc.220/e18478/fabric.htm#ELMOG76114
http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_User_Manual_v2.3-1.0.1.pdf
https://pkg-ofed.alioth.debian.org/howto/infiniband-howto.html
https://www.opsdash.com/blog/network-performance-linux.html
http://www.ieee802.org/1/files/public/docs2014/new-dcb-crupnicoff-ibcreditstutorial-0314.pdf
https://www.openfabrics.org/downloads/OFED/archive/ofed-1.4/OFED-1.4-docs/OFED_tips.txt
HowTo Get Started with Mellanox Switches
HowTo Upgrade MLNX-OS Software on Mellanox switch systems