Intermittent 40Gbit Link on Mikrotik CRS326-24S+2Q+

Zarathustra[H] · Extremely [H] · Joined Oct 29, 2000 · Messages: 38,878
Hi Everyone,

Just picked up a new CRS326-24S+2Q+ and I am struggling a little bit.

I'm using it as a layer 2 switch + VLANs with no routing, so I opted to boot and use it in SwOS mode.

I am using 3M Molex DAC cables to connect the Fortville (Intel XL710-QDA2) NICs in my servers to the switch. I went with Molex cables because - at least in the 10gbit SFP+ varieties - they have always worked in Mikrotik switches for me, I needed them quickly, and I could not find Mikrotik-branded cables locally.

These immediately establish a link when connected directly, NIC to NIC, suggesting the cables are fine, but when plugged into the switch there is no link light.

Last night, after repeatedly unplugging and re-plugging the cable and waiting, a link light eventually appeared. Once it did, the connection was solid and worked perfectly until the server rebooted, at which point there was - again - no link light and no connection.

I have now been trying for 20-30 minutes, and have been unable to get another link light on the QSFP+ port. I have no idea what finally got it to link last night, but I cannot seem to replicate it again today.

It's notable that the switch detects that the DAC cable is connected, and displays the information on the Link page as follows:

Code:
QSFP+2.1    Molex     1002971301         215021347     22-05-30    3m copper

...it just doesn't result in a link.
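
For what it's worth, the port and the module can also be sanity checked from the Linux side with ethtool, something like this (the interface name here is just an example of what the XL710 might enumerate as):

Code:
# link state and negotiated speed
ethtool enp65s0f0

# dump the QSFP+ module EEPROM (vendor, part number, cable length)
ethtool -m enp65s0f0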

Does anyone have any suggestions here?

1.) Do I potentially have a bad switch?

2.) Was I just silly to try to use a Molex DAC cable instead of the Mikrotik branded one?

3.) I have read that in some cases disabling auto-negotiation helps with the 40Gbit ports on Mikrotik hardware. Is this something I should try? If so, how do I do it? I clicked around the settings menu, but if I disable auto-negotiation I don't get a 40Gbit option, only 10M, 100M and 1G. (For the host side, I've sketched the ethtool equivalent at the end of this post.)

4.) Anything else I can try?


I appreciate any suggestions!
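
Regarding 3.), the host-side equivalent would presumably be something like the ethtool invocation below; whether the i40e driver actually honors a forced 40Gbit speed on these QSFP+ ports is something I'd have to test, so treat it as a sketch (interface name is just an example):

Code:
# turn off auto-negotiation and force 40Gbit on the NIC end (driver support varies)
sudo ethtool -s enp65s0f0 autoneg off speed 40000 duplex full

# check what actually got applied
ethtool enp65s0f0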
 
Last edited:
No experience with this level of awesomeness of equipment, but have you tried plugging the NIC end of the DAC in first/last, after giving the other end time to 'settle'? Just going off my gut of what I would try. Obviously they will work if the scenario is right, and you've found that scenario once.
 
No experience with this level of awesomeness of equipment,

The equipment is finally entering the sphere where decommissioned stuff is affordable to us mere mortals on eBay. A dual port 40gig NIC runs just south of $200 on eBay now, which makes them possible to play with as a normal person. 40Gbit QSFP+ is pretty much dead moving forward, being replaced by 25Gbit SFP28 and 100Gbit QSFP28, but it is still very usable if you don't care about future proofing.

The Mikrotik switch is surprisingly affordable too.

but have you tried plugging the NIC end of the DAC in first/last, after giving the other end time to 'settle'? Just going off my gut of what I would try. Obviously they will work if the scenario is right, and you've found that scenario once.

In my case, there is no transceiver involved. I am using DAC (Direct Attach Copper) cables from port to port, so the cable is either in, or it is not.

But maybe that is a good point. Maybe I should get optical transceivers and a short optical patch cable instead of these DAC cables. That might give me a better chance of ending up with modules that are compatible.

I'll have to think about that, and check eBay to see if transceivers are affordable yet. A 3M Molex QSFP+ cable will run you $50-$60 right now, so it doesn't take much for transceivers to be more affordable.
 
I'll have to think about that, and check eBay to see if transceivers are affordable yet. A 3M Molex QSFP+ cable will run you $50-$60 right now, so it doesn't take much for transceivers to be more affordable.

So, in order to make this work, I'd need an SR4 transceiver on each side: one compatible with the Intel NICs (they usually take anything, but Finisar's models are a sure bet, as Intel's own transceivers are usually just relabeled Finisar units). These are reasonably cheap ($15-$30) on eBay. The Mikrotik side is a little dicier. I used to say Mikrotik will take "almost anything", and that was true of their 10gig SFP+ ports, but the QSFP+ ports seem to be a little pickier.

One way to make sure you have a supported configuration would be to get the actual Mikrotik-branded transceiver. Their 40gig SR4 transceivers are discontinued, but can be found on eBay for like $60-$65 each. Their newer 100gig SR4 transceivers are also listed as being backwards compatible with 40gig, but those go for like $90 to $120. I guess that's the cost of future proofing right there.

I'd also need a short 2-3M fiber patch cable. I'm having a difficult time figuring out which fiber connectors they take. The fiber type is well documented (OM3 or OM4, with different max ranges), but are they the same duplex LC-LC connectors I am used to from 10gig SFP+?

If so, I'd need something like this:
- Finisar SR4 transceiver for Intel NIC side: ~$29
- Short 3M OM3 or OM4 patch cable: ~$10
- Mikrotik-branded SR4 transceiver for switch side: ~$64

$103 for a 3M cable run.

A little steep, but at least this ought to be officially supported and should work.

I feel like this is the rationale that got us to crazy high Enterprise hardware pricing :p
 
I decided to just suck it up and order transceivers.

I usually have reservations about made-in-China brands, but if IT departments at large corporations with way more resources and knowledge than I have use FS.com parts, are comfortable with them, and have not spotted any suspicious traffic, who am I to worry?

1708720808924.png


Fingers crossed that when this gets here (looks like Monday*?) it works.

I will keep the thread posted for posterity if anyone else ever needs this info.

Edit:
*Monday was with expensive shipping. Tuesday with free shipping. Free is my favorite price. I'll go for that.
 
I'm not sure this is directly helpful, but it may help someone in the future. I just ran into a similar 10Gb SFP+ scenario with Intel X710-DA2 NICs and HP 2910AL switches. HP compatible DAC cables from 10Gtek would result in no link on the Intel NIC, and nothing I tried could change it. An Intel compatible ipolex/10Gtek DAC would get a link but not work: the switch logged a non-compatible module error, and the HP/Aruba command to allow foreign modules, "allow-unsupported-transceiver", would not work on the latest firmware available for the switch. It's old. The Intel NIC could connect to its own second port no problem with the Intel cable. The HP DAC connects with other HP switches fine.

I've found transceivers and fiber cables very flexible between brands and have personally connected HP, Aruba, Netgear, Dell, and TP-Link to each other with short range (SR) transceivers without issue. DACs feel like a minefield of compatibility issues in comparison.
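
For anyone searching later: on the HP/Aruba models that do support it, that command is entered from config context, roughly as below. Exact syntax (and whether the command exists at all) varies by model and firmware, as I found out.

Code:
switch# configure terminal
switch(config)# allow-unsupported-transceiver
switch(config)# write memory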

1708720733976.png
 
Last edited:
In my case, there is no transceiver involved. I am using DAC (Direct Attach Copper) cables from port to port, so the cable is either in, or it is not.
No, I got that you were using DACs, but maybe plugging in one end first, waiting, and then the other is better than just plugging both in for some reason.
 
I'm not sure this is directly helpful, but it may help someone in the future. I just ran into a similar 10Gb SFP+ scenario with Intel X710-DA2 NICs and HP 2910AL switches. HP compatible DAC cables from 10Gtek would result in no link on the Intel NIC, and nothing I tried could change it. An Intel compatible ipolex/10Gtek DAC would get a link but not work: the switch logged a non-compatible module error, and the HP/Aruba command to allow foreign modules, "allow-unsupported-transceiver", would not work on the latest firmware available for the switch. It's old. The Intel NIC could connect to its own second port no problem with the Intel cable. The HP DAC connects with other HP switches fine.

I've found transceivers and fiber cables very flexible between brands and have personally connected HP, Aruba, Netgear, Dell, and TP-Link to each other with short range (SR) transceivers without issue. DACs feel like a minefield of compatibility issues in comparison.

View attachment 637291
Thank you for sharing! This is great to know!
 
I decided to just suck it up and order transceivers.

I usually have reservations about made-in-China brands, but if IT departments at large corporations with way more resources and knowledge than I have use FS.com parts, are comfortable with them, and have not spotted any suspicious traffic, who am I to worry?

View attachment 637292

Fingers crossed that when this gets here (looks like Monday*?) it works.

I will keep the thread posted for posterity if anyone else ever needs this info.

Edit:
*Monday was with expensive shipping. Tuesday with free shipping. Free is my favorite price. I'll go for that.


Lol. I'm learning as I go along. I ordered LC-LC cables. The spec pages for these things all seem to call out the fiber spec, but none seem to tell you the connector type.

Turns out I need MTP connectors, not LC connectors, and for whatever reason this increases the price almost by an order of magnitude.

So I have now ordered the proper OM4 MTP cables (not to be mistaken for the much cheaper MPO cables) for another $100.

There are a lot of hidden costs with QSFP beyond what I expected when I found affordable NICs and an affordable switch :p
 
Last edited:
I've thought about getting a Mikrotik CRS504-4XQ-IN for a long while now, but compatibility concerns have kept me away. 100gbps QSFP28 ports and some QSFP28 to SFP28 breakout cables. Seems like for around $1000-$1300 w/ NICs you could have 25-100 gbps networking.
Might be worth considering a return if the project is getting close in price to that. Then you can let me know what compatibility issues you fix so I don't have to :)
 
what are you doing at home that you need 40 and 100gb networking? 🤔

"Need" is such an ugly word :p

I've been using 10gig networking for my NAS for almost a decade now, starting when I bought some decommissioned Brocade BR1020 NICs back in 2014. When the opportunity to buy a couple of Intel XL710-QDA2 NICs for under $200 came up, I decided to see if I could get a little better performance from the storage server, running a dedicated line between it and the desktop on a separate subnet.

I did wind up getting a little better performance out of it, but not much. I was able to max out the 10gig interfaces, but I only get 10-15% more performance out of 40gig, so the bottleneck lies elsewhere. I'd probably see bigger gains if I had a more parallelized load, but I don't. I knew this was likely going to be the case going in, but I still wanted to play with it. At least this way I have removed the network as the limiting factor and can move on to other things :p

That said, I still decided to grab a Mikrotik switch with two 40gig ports. This way I could use just the dual port 40gig NICs: one direct link between the server and the workstation, one from the server to the switch, and one from the desktop to the switch, since they were already in there with an empty port anyway.

The biggest motivator for this approach was simply to be able to remove the old 10gig NICs and free up PCIe slots. A single 40gig link from the server to the switch should be more than enough to cover the combined load of the NAS and all the VMs on the server at the same time, and I can free up the slots used by the 10gig NICs.

Turns out that even though I only get marginal performance improvements at 40gig, the XL710-QDA2 NICs are still better, as they run much cooler and offload more CPU load than the 10gig X520s they are replacing.

Would this be worth the cost and downtime in a production environment? Probably not.

But in the end, this is a hobby, and playing and experimenting with this is just as much what it is about as having what you "need" working. Others enjoy going to the beach. To me this is way more fun.
 
Last edited:
what are you doing at home that you need 40 and 100gb networking? 🤔
For me, nothing to reasonably justify it. For fun and learning. Plenty of people have toys that cost multiples of projects like this. A high performance Mikrotik is significantly cheaper than a slower big name switch. A new Dell PowerSwitch N2248X-ON was $4,905. A used Aruba 2540 w/ 4x10 gbps is $500-$800 on eBay. For home I'd rather have down and dirty raw performance.

By the time used 40gbps gear got cheap enough and I was paying attention, 25/50/100 was already being broadly deployed. I've been impatiently waiting for cheap switches with a couple 25gbps SFP28 ports but now 100 gbps gear is getting just about cheap enough.

1708976163813.png

1708976304099.png
 
and here is me using LACP to bond 2x 1gb NICs for my proxmox box😆 but I don't even need 1gb. Unifi user, so I want to see them come out with a mixed switch with some 2.5GbE, 10Gb and PoE in one
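
For reference, an LACP bond like that on Proxmox is typically just a few lines in /etc/network/interfaces, something along these lines (interface names and addresses are placeholders, and the switch side needs a matching 802.3ad/LACP group):

Code:
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.5/24
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0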
 
and here is me using LACP to bond 2x 1gb NICs for my proxmox box😆 but I don't even need 1gb. Unifi user, so I want to see them come out with a mixed switch with some 2.5GbE, 10Gb and PoE in one

I love mixed switches; unfortunately they seem pretty rare, as all the enterprise types seem to like dedicated port types. Occasionally you get 2-4 uplink ports that are of a higher speed (which can also be used for server/storage).

The CRS326-24S+2Q+RM has - as its name would suggest - 24x 10gig SFP+ ports and two QSFP+ 40Gbit ports. I just got a bunch of compatible gigabit copper transceiver adapters for the things in the rack that need copper. Going gigabit keeps the price down. If you need multigig/10gig transceiver adapters it gets pricier.

It's a bit ugly (and yes, my cable management could be better) but it does work....

PXL_20240226_220553645.PORTRAIT.jpg


Before I had one 16 port 10gig SFP+ switch and one 24 port gigabit switch (with two SFP+ uplink ports) in the rack. I like the one switch solution better.


Now I can move the 16 port 10gig switch to my office, link aggregate it with a couple of ports, and have easy local 10gig access. The 24-port gigabit switch is a little overkill for the purpose, but it will be going in the living room now.
 
So, I got my transceivers and one of my two lengths of fiber today.

At first I thought I was going to have to wait longer, as I ordered two Intel compatible (36281) and two Mikrotik compatible (84681) transceivers, and they sent two Intel (36281) and two Generic (75298).

I contacted their customer service, and they said the generic and the Mikrotik ones were the same. I was a little bit concerned, as my generic DAC had just failed to work, but they assured me they would take them back opened if it didn't work out, so I gave it a try.

Et voilà!

As soon as I connected it, I got a link and it worked beautifully.

For whatever reason Amazon sent my two MTP fiber patch cables in separate packages. One arrived today, and one is arriving tomorrow, so I am 50% there right now.

As uncomfortable as it makes me to order networking products from China, I can see why Enterprise IT types like them. They just know their shit and it works. I guess if they planted something that was dialing home in the transceivers someone would have found it by now. I sure don't see anything in my firewall logs. Still doesn't hurt to be vigilant.
 
Once you get them all set up I'd like to see what your transfer speeds are. Maybe some iperf tests if you're willing. Never used 40gbps stuff. I've read things saying it's really 4x10 gbps bonded in some way, and other things saying it should work like a single 40 gbps link.
The 25 gbps stuff I've worked on gets to a little over 21 gbps w/ some tweaking in iperf3. 43 gbps w/ a team of 2x25gbps.
iperf3.exe -P 8
iperf3.exe -P 8 -w 512M
 
Last edited:
Once you get them all set up I'd like to see what your transfer speeds are. Maybe some iperf tests if you're willing. Never used 40gbps stuff. I've read things saying it's really 4x10 gbps bonded in some way, and other things saying it should work like a single 40 gbps link.
The 25 gbps stuff I've worked on gets to a little over 21 gbps w/ some tweaking in iperf3. 43 gbps w/ a team of 2x25gbps.
iperf3.exe -P 8
iperf3.exe -P 8 -w 512M

I can only give you iperf numbers between two direct connected linux boxes right now. My delayed MTP cable has not yet arrived, so I don't have a second 40gig box to iperf to.

I've also never used iperf3 before, just the regular iperf that comes with pretty much every distribution. I just installed iperf3 for testing.

All of that said, with the direct connected box it looks like this:

iperf3 -c 10.0.2.10 -P 8:

Code:
$ iperf3 -c 10.0.2.10 -P 8
Connecting to host 10.0.2.10, port 5201
[  5] local 10.0.2.116 port 55074 connected to 10.0.2.10 port 5201
[  7] local 10.0.2.116 port 55080 connected to 10.0.2.10 port 5201
[  9] local 10.0.2.116 port 55084 connected to 10.0.2.10 port 5201
[ 11] local 10.0.2.116 port 55100 connected to 10.0.2.10 port 5201
[ 13] local 10.0.2.116 port 55110 connected to 10.0.2.10 port 5201
[ 15] local 10.0.2.116 port 55118 connected to 10.0.2.10 port 5201
[ 17] local 10.0.2.116 port 55126 connected to 10.0.2.10 port 5201
[ 19] local 10.0.2.116 port 55136 connected to 10.0.2.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec   266 MBytes  2.22 Gbits/sec  383    239 KBytes    
[  7]   0.00-1.01   sec   260 MBytes  2.17 Gbits/sec  291    168 KBytes    
[  9]   0.00-1.01   sec   258 MBytes  2.16 Gbits/sec  388    502 KBytes    
[ 11]   0.00-1.01   sec   135 MBytes  1.13 Gbits/sec  403    211 KBytes    
[ 13]   0.00-1.01   sec   262 MBytes  2.19 Gbits/sec  307    645 KBytes    
[ 15]   0.00-1.01   sec   258 MBytes  2.15 Gbits/sec  154    215 KBytes    
[ 17]   0.00-1.01   sec   259 MBytes  2.16 Gbits/sec  289    184 KBytes    
[ 19]   0.00-1.01   sec   263 MBytes  2.19 Gbits/sec  494    127 KBytes    
[SUM]   0.00-1.01   sec  1.92 GBytes  16.4 Gbits/sec  2709          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.01-2.00   sec   235 MBytes  1.98 Gbits/sec  220    137 KBytes    
[  7]   1.01-2.00   sec   239 MBytes  2.01 Gbits/sec  184    161 KBytes    
[  9]   1.01-2.00   sec   252 MBytes  2.12 Gbits/sec  170    478 KBytes    
[ 11]   1.01-2.00   sec   237 MBytes  2.00 Gbits/sec  437    156 KBytes    
[ 13]   1.01-2.00   sec   230 MBytes  1.94 Gbits/sec  467    235 KBytes    
[ 15]   1.01-2.00   sec   227 MBytes  1.91 Gbits/sec  360    105 KBytes    
[ 17]   1.01-2.00   sec   253 MBytes  2.13 Gbits/sec  567    158 KBytes    
[ 19]   1.01-2.00   sec   234 MBytes  1.97 Gbits/sec  324    232 KBytes    
[SUM]   1.01-2.00   sec  1.86 GBytes  16.1 Gbits/sec  2729          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec   237 MBytes  1.99 Gbits/sec  272    170 KBytes    
[  7]   2.00-3.00   sec   227 MBytes  1.91 Gbits/sec  406    247 KBytes    
[  9]   2.00-3.00   sec   244 MBytes  2.05 Gbits/sec  436    167 KBytes    
[ 11]   2.00-3.00   sec   237 MBytes  1.99 Gbits/sec  269    311 KBytes    
[ 13]   2.00-3.00   sec   240 MBytes  2.01 Gbits/sec  430    163 KBytes    
[ 15]   2.00-3.00   sec   246 MBytes  2.06 Gbits/sec  281    228 KBytes    
[ 17]   2.00-3.00   sec   238 MBytes  2.00 Gbits/sec  443    334 KBytes    
[ 19]   2.00-3.00   sec   238 MBytes  2.00 Gbits/sec  295    173 KBytes    
[SUM]   2.00-3.00   sec  1.86 GBytes  16.0 Gbits/sec  2832          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec   232 MBytes  1.94 Gbits/sec  381    322 KBytes    
[  7]   3.00-4.00   sec   234 MBytes  1.95 Gbits/sec  282   91.9 KBytes    
[  9]   3.00-4.00   sec   232 MBytes  1.94 Gbits/sec  347    146 KBytes    
[ 11]   3.00-4.00   sec   242 MBytes  2.02 Gbits/sec  243    479 KBytes    
[ 13]   3.00-4.00   sec   231 MBytes  1.93 Gbits/sec  283    191 KBytes    
[ 15]   3.00-4.00   sec   240 MBytes  2.01 Gbits/sec  284    233 KBytes    
[ 17]   3.00-4.00   sec   235 MBytes  1.96 Gbits/sec  242    130 KBytes    
[ 19]   3.00-4.00   sec   233 MBytes  1.95 Gbits/sec  206    119 KBytes    
[SUM]   3.00-4.00   sec  1.84 GBytes  15.7 Gbits/sec  2268          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec   268 MBytes  2.25 Gbits/sec  627   96.2 KBytes    
[  7]   4.00-5.00   sec   270 MBytes  2.27 Gbits/sec  623    136 KBytes    
[  9]   4.00-5.00   sec   263 MBytes  2.21 Gbits/sec  760   69.3 KBytes    
[ 11]   4.00-5.00   sec   264 MBytes  2.22 Gbits/sec  560   76.4 KBytes    
[ 13]   4.00-5.00   sec   261 MBytes  2.19 Gbits/sec  340    129 KBytes    
[ 15]   4.00-5.00   sec   265 MBytes  2.22 Gbits/sec  1113    243 KBytes    
[ 17]   4.00-5.00   sec   254 MBytes  2.14 Gbits/sec  363   36.8 KBytes    
[ 19]   4.00-5.00   sec   256 MBytes  2.16 Gbits/sec  454   84.8 KBytes    
[SUM]   4.00-5.00   sec  2.05 GBytes  17.7 Gbits/sec  4840          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   5.00-6.00   sec   269 MBytes  2.26 Gbits/sec  707    191 KBytes    
[  7]   5.00-6.00   sec   215 MBytes  1.81 Gbits/sec  353   39.6 KBytes    
[  9]   5.00-6.00   sec   269 MBytes  2.26 Gbits/sec  524    235 KBytes    
[ 11]   5.00-6.00   sec   269 MBytes  2.26 Gbits/sec  495    252 KBytes    
[ 13]   5.00-6.00   sec   252 MBytes  2.12 Gbits/sec  636    252 KBytes    
[ 15]   5.00-6.00   sec   269 MBytes  2.26 Gbits/sec  500    206 KBytes    
[ 17]   5.00-6.00   sec   264 MBytes  2.22 Gbits/sec  641    146 KBytes    
[ 19]   5.00-6.00   sec   268 MBytes  2.25 Gbits/sec  535    359 KBytes    
[SUM]   5.00-6.00   sec  2.03 GBytes  17.4 Gbits/sec  4391          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec  119    270 KBytes    
[  7]   6.00-7.00   sec   242 MBytes  2.03 Gbits/sec  207    223 KBytes    
[  9]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec  208    173 KBytes    
[ 11]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec  205    174 KBytes    
[ 13]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec   50    181 KBytes    
[ 15]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec  141    174 KBytes    
[ 17]   6.00-7.00   sec   251 MBytes  2.10 Gbits/sec   76    403 KBytes    
[ 19]   6.00-7.00   sec   250 MBytes  2.09 Gbits/sec  111    209 KBytes    
[SUM]   6.00-7.00   sec  1.95 GBytes  16.7 Gbits/sec  1117          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   7.00-8.00   sec   254 MBytes  2.13 Gbits/sec  805    368 KBytes    
[  7]   7.00-8.00   sec   245 MBytes  2.06 Gbits/sec  601    182 KBytes    
[  9]   7.00-8.00   sec   266 MBytes  2.23 Gbits/sec  564    150 KBytes    
[ 11]   7.00-8.00   sec   236 MBytes  1.98 Gbits/sec  515    205 KBytes    
[ 13]   7.00-8.00   sec   230 MBytes  1.93 Gbits/sec  191    284 KBytes    
[ 15]   7.00-8.00   sec   229 MBytes  1.92 Gbits/sec  323    215 KBytes    
[ 17]   7.00-8.00   sec   240 MBytes  2.01 Gbits/sec  799    156 KBytes    
[ 19]   7.00-8.00   sec   229 MBytes  1.93 Gbits/sec  399    256 KBytes    
[SUM]   7.00-8.00   sec  1.88 GBytes  16.2 Gbits/sec  4197          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   8.00-9.00   sec   228 MBytes  1.91 Gbits/sec  123    225 KBytes    
[  7]   8.00-9.00   sec   221 MBytes  1.86 Gbits/sec  152    191 KBytes    
[  9]   8.00-9.00   sec   229 MBytes  1.92 Gbits/sec  275    485 KBytes    
[ 11]   8.00-9.00   sec   226 MBytes  1.89 Gbits/sec  171    366 KBytes    
[ 13]   8.00-9.00   sec   220 MBytes  1.84 Gbits/sec   56    238 KBytes    
[ 15]   8.00-9.00   sec   226 MBytes  1.90 Gbits/sec  166   76.4 KBytes    
[ 17]   8.00-9.00   sec   222 MBytes  1.86 Gbits/sec  140    216 KBytes    
[ 19]   8.00-9.00   sec   220 MBytes  1.85 Gbits/sec  112    123 KBytes    
[SUM]   8.00-9.00   sec  1.75 GBytes  15.0 Gbits/sec  1195          
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec   251 MBytes  2.11 Gbits/sec  147    230 KBytes    
[  7]   9.00-10.00  sec   238 MBytes  2.00 Gbits/sec  133   14.1 KBytes    
[  9]   9.00-10.00  sec   251 MBytes  2.11 Gbits/sec  153    216 KBytes    
[ 11]   9.00-10.00  sec   250 MBytes  2.10 Gbits/sec  209    199 KBytes    
[ 13]   9.00-10.00  sec   251 MBytes  2.11 Gbits/sec  105    260 KBytes    
[ 15]   9.00-10.00  sec   246 MBytes  2.07 Gbits/sec  213    122 KBytes    
[ 17]   9.00-10.00  sec   251 MBytes  2.11 Gbits/sec   50    240 KBytes    
[ 19]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec  105    191 KBytes    
[SUM]   9.00-10.00  sec  1.94 GBytes  16.7 Gbits/sec  1115          
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.43 GBytes  2.09 Gbits/sec  3784             sender
[  5]   0.00-10.00  sec  2.43 GBytes  2.09 Gbits/sec                  receiver
[  7]   0.00-10.00  sec  2.33 GBytes  2.01 Gbits/sec  3232             sender
[  7]   0.00-10.00  sec  2.33 GBytes  2.00 Gbits/sec                  receiver
[  9]   0.00-10.00  sec  2.46 GBytes  2.11 Gbits/sec  3825             sender
[  9]   0.00-10.00  sec  2.46 GBytes  2.11 Gbits/sec                  receiver
[ 11]   0.00-10.00  sec  2.29 GBytes  1.97 Gbits/sec  3507             sender
[ 11]   0.00-10.00  sec  2.29 GBytes  1.97 Gbits/sec                  receiver
[ 13]   0.00-10.00  sec  2.37 GBytes  2.04 Gbits/sec  2865             sender
[ 13]   0.00-10.00  sec  2.37 GBytes  2.04 Gbits/sec                  receiver
[ 15]   0.00-10.00  sec  2.40 GBytes  2.06 Gbits/sec  3535             sender
[ 15]   0.00-10.00  sec  2.40 GBytes  2.06 Gbits/sec                  receiver
[ 17]   0.00-10.00  sec  2.41 GBytes  2.07 Gbits/sec  3610             sender
[ 17]   0.00-10.00  sec  2.41 GBytes  2.07 Gbits/sec                  receiver
[ 19]   0.00-10.00  sec  2.38 GBytes  2.05 Gbits/sec  3035             sender
[ 19]   0.00-10.00  sec  2.38 GBytes  2.05 Gbits/sec                  receiver
[SUM]   0.00-10.00  sec  19.1 GBytes  16.4 Gbits/sec  27393             sender
[SUM]   0.00-10.00  sec  19.1 GBytes  16.4 Gbits/sec                  receiver

iperf Done.

Yikes, that's a lot more output than I am used to.

The -w 512M gives me an error message that the socket buffer size is not set correctly.
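
Presumably I'd have to raise the kernel's socket buffer caps before a window that large is allowed, something like the following (the values are just an example, not a recommendation):

Code:
# allow larger socket buffers so iperf3 -w can actually set them
sudo sysctl -w net.core.rmem_max=268435456
sudo sysctl -w net.core.wmem_max=268435456

# then retry with an explicit (smaller) window
iperf3 -c 10.0.2.10 -P 8 -w 64M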



I've used regular iperf for most of my testing, which has looked something like this:

Code:
$ iperf -c 10.0.2.10 -d -P 8
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  6] local 10.0.2.116 port 34760 connected with 10.0.2.10 port 5001
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  8] local 10.0.2.116 port 34800 connected with 10.0.2.10 port 5001
[  3] local 10.0.2.116 port 34796 connected with 10.0.2.10 port 5001
[  7] local 10.0.2.116 port 34798 connected with 10.0.2.10 port 5001
[  4] local 10.0.2.116 port 34746 connected with 10.0.2.10 port 5001
[  2] local 10.0.2.116 port 34768 connected with 10.0.2.10 port 5001
[  1] local 10.0.2.116 port 34730 connected with 10.0.2.10 port 5001
[  9] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37944
[ 10] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37950
[ 11] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37960
[ 12] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37966
[ 13] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37980
[ 14] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37982
[ 15] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37994
[ 16] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 38008
[  5] local 10.0.2.116 port 34782 connected with 10.0.2.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0058 sec  1.66 GBytes  1.43 Gbits/sec
[  6] 0.0000-10.0058 sec  1.68 GBytes  1.44 Gbits/sec
[  2] 0.0000-10.0058 sec  1.78 GBytes  1.53 Gbits/sec
[  4] 0.0000-10.0058 sec  1.57 GBytes  1.35 Gbits/sec
[  5] 0.0000-10.0058 sec  1.74 GBytes  1.50 Gbits/sec
[ 10] 0.0000-10.0051 sec  5.61 GBytes  4.81 Gbits/sec
[  8] 0.0000-10.0058 sec  1.69 GBytes  1.46 Gbits/sec
[  7] 0.0000-10.0058 sec  1.53 GBytes  1.31 Gbits/sec
[  3] 0.0000-10.0065 sec  1.75 GBytes  1.50 Gbits/sec
[  9] 0.0000-10.0102 sec  5.98 GBytes  5.13 Gbits/sec
[ 12] 0.0000-10.0042 sec  5.91 GBytes  5.08 Gbits/sec
[ 14] 0.0000-10.0049 sec  5.35 GBytes  4.59 Gbits/sec
[ 16] 0.0000-10.0031 sec  4.60 GBytes  3.95 Gbits/sec
[ 11] 0.0000-10.0051 sec  5.42 GBytes  4.65 Gbits/sec
[ 15] 0.0000-10.0037 sec  4.53 GBytes  3.89 Gbits/sec
[ 13] 0.0000-10.0038 sec  5.46 GBytes  4.69 Gbits/sec
[SUM] 0.0000-10.0098 sec  56.3 GBytes  48.3 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.123/0.167/0.243/0.044 ms (tot/err) = 8/0

If I just do a single thread we can see that to properly use this bandwidth you really need a lot of parallelism, as a single thread seems to peak at about 17.6 or 19.3 Gbit/s (depending on direction, probably due to different CPU's on each side):

Code:
$ iperf -c 10.0.2.10
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 10.0.2.116 port 41822 connected with 10.0.2.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0134 sec  20.5 GBytes  17.6 Gbits/sec

$ iperf -Rc 10.0.2.10
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 10.0.2.116 port 60754 connected with 10.0.2.10 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *1] 0.0000-10.0026 sec  22.4 GBytes  19.3 Gbits/sec

This is between my workstation and my main server as follows:
- Workstation: Threadripper 3960X, 64GB non-ECC UDIMMs, quad channel DDR4-3600, running Linux Mint 21.3
- Main Server: Epyc 7543 with 512GB octa-channel registered ECC DDR4-3200, running Proxmox VE based on Debian Bookworm

In both of these machines, the NICs get their full 8x Gen3 lanes going directly to the CPU (so no chipset shenanigans going on here).

The Intel XL710-QDA2 NICs are dual port 40gig NICs though, and an astute observer will notice that 8x Gen3 is not enough PCIe bandwidth to saturate both ports at the same time. Gen3 is 985MB/s per lane, and we have 8 lanes, so 7880MB/s, which is roughly 63 Gbit/s. So even before you factor in protocol overhead across the PCIe bus, we are limited to only about 1.5 ports worth of bandwidth. While doing these tests the other ports were at near idle though, so I don't think that was a factor.

I had hoped that a 40gig NIC would be a way for me to get near-native NVMe performance off of remote storage across the network, but that doesn't look like it will happen. Raw iperf - as noted - seems to have a worst case performance of 16-17 Gbit/s, so ~2.2GB/s. Once you use a network file sharing protocol like SMB or NFS this drops further to about 1.6GB/s. Complaining about this is definitely a first world problem, but it is evident I will not be seeing near-native remote M.2 performance this way.
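
One thing I still want to try on the NFS side is larger transfer sizes and multiple TCP connections per mount, since the limit seems to be per-stream. Roughly like this on the client (nconnect needs a reasonably recent kernel; the export path is made up):

Code:
# mount with 1MB read/write sizes and 8 TCP connections to the server
sudo mount -t nfs -o nconnect=8,rsize=1048576,wsize=1048576 10.0.2.10:/tank/share /mnt/nas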

I'm not quite sure what the limiting factor is, but something is kicking in and limiting things before we saturate 40Gbit Ethernet bandwidth. QSFP+ is essentially hardware link aggregation, done in a way that is supposed to eliminate the issues with traditional link aggregation. I'd blame that, but my single threaded performance is higher than what one of the four 10Gbit links could muster on its own, so that is obviously not it. I tried messing with jumbo frames, but this does not appear to have accomplished anything at all.
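
For completeness, "messing with jumbo frames" just means the usual MTU bump on both ends, roughly as below (interface name is an example, and the switch has to be configured to pass jumbo frames as well):

Code:
# set a 9000 byte MTU on the 40gig interface
sudo ip link set dev enp65s0f0 mtu 9000

# verify jumbo frames survive end to end (8972 = 9000 minus 28 bytes of IP/ICMP headers)
ping -M do -s 8972 10.0.2.10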

I mean, I bought the switch, so I am going to be using these as-is one way or another for some time, but I'd be curious how 40Gbit QSFP+ performance compares to single link SFP28 (25Gbit) performance in this regard. I wonder if the latter is actually a little faster, or if they have the same limitations, which would suggest the bottleneck is somewhere else (PCIe subsystem? Software not optimized for these transfer rates? Kernel/OS? Something else?).

If you are curious how having the Mikrotik switch in the middle of things impacts performance, I'd be happy to post that once the second cable arrives.

But yeah, for the direct host to host connection it very much seems like anything above 15-20 Gbit networking really requires some serious parallelism.
 
Last edited:
But yeah, for the direct host to host connection it very much seems like anything above 15-20 Gbit networking really requires some serious parallelism.

I wonder if instead of SMB/NFS I might be able to make better use of the 40gig NIC by doing something like creating an iSCSI target. I vaguely remember reading that this has less overall CPU load associated with it.

Maybe that will widen whatever undetermined bottleneck we are dealing with here that is preventing more complete low-parallelism use of the interface.
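
If I do try it, the server side would presumably be a LIO target set up with targetcli, roughly along these lines (the device path, IQNs and initiator name are all made up for illustration):

Code:
$ sudo targetcli
/> backstores/block create name=nvmeshare dev=/dev/nvme0n1
/> iscsi/ create iqn.2024-02.lan.server:nvmeshare
/> iscsi/iqn.2024-02.lan.server:nvmeshare/tpg1/luns create /backstores/block/nvmeshare
/> iscsi/iqn.2024-02.lan.server:nvmeshare/tpg1/acls create iqn.2024-02.lan.workstation:init
/> saveconfig
/> exit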
 
I wonder if instead of SMB/NFS I might be able to make better use of the 40gig NIC by doing something like creating an iSCSI target. I vaguely remember reading that this has less overall CPU load associated with it.

Maybe that will widen whatever undetermined bottleneck we are dealing with here that is preventing more complete low-parallelism use of the interface.

I found this monologue on the TrueNAS forums:

https://www.truenas.com/community/t...uboptimal-on-nvme-pool-over-40gbe-nic.111407/

Looks like even with iSCSI, storage workloads are very dependent on single-threaded CPU performance, and likely what is happening is that performance is being held back by the CPUs on either side.

Not sure why this impacts remote storage more than - say - a local NVMe drive, but it seems to.

My dream of a client gaming machine with a boatload of fast remote storage (as long as I can find 8x PCIe lanes for the NIC) may just not be in the cards.
 
Your single thread performance seems impressive to me. It's higher than I remember seeing on 25gbps connections, however I always had a pair of switches in the middle. Will be interesting to see how yours performs with a switch in the mix. The best results I ever got were from:
iperf3.exe -c hostname -P 20 -w 512M
 
Your single thread performance seems impressive to me. It's higher than I remember seeing on 25gbps connections, however I always had a pair of switches in the middle. Will be interesting to see how yours performs with a switch in the mix. The best results I ever got were from:
iperf3.exe -c hostname -P 20 -w 512M

Yeah, these high bandwidth networking solutions are definitely more intended for highly parallel loads, where they excel.

I primarily bought them to experiment with low-parallelism direct performance between my desktop and my NAS, which was a little bit of a disappointment, though I knew going in it probably would be. I wanted to see just how much I could get out of them.

Being dual port adapters though, I have port 2 on both adapters direct linked together with a DAC between the workstation and the all-in-one NAS server.

Port one on each goes to the switch.

This is great for the server, as the single 40gig link more than covers the largely parallel load I used to handle with two LAG'd 10gig connections.

On the desktop, the main link to the switch is largely wasted, but it does allow me to pull the other NIC out, free up a PCIe slot, and save some power, as these XL710 adapters are much more efficient than the old X520s I was using.

I was originally planning on using one direct link at 40gig between the two systems and then using a breakout to four 10gig links to my old 10gig switch, but it turns out Intel doesn't allow for that.

You can configure the card as 2x40gig, as 4x10gig on one port with the second port disabled, or as 2x10gig on each port. Mixed - one port at 40gig and one with a 4x10gig breakout - is not allowed for some reason.

So I decided to pick up a new switch with a couple of 40gig ports, because why not? :p
 
Your single thread performance seems impressive to me. It's higher than I remember seeing on 25gbps connections, however I always had a pair of switches in the middle. Will be interesting to see how yours performs with a switch in the mix. The best results I ever got were from:
iperf3.exe -c hostname -P 20 -w 512M


Here are the same tests again, but this time through the switch:

Code:
iperf3 -c 10.0.1.10 -P 8
Connecting to host 10.0.1.10, port 5201
[  5] local 10.0.1.116 port 43830 connected to 10.0.1.10 port 5201
[  7] local 10.0.1.116 port 43842 connected to 10.0.1.10 port 5201
[  9] local 10.0.1.116 port 43850 connected to 10.0.1.10 port 5201
[ 11] local 10.0.1.116 port 43860 connected to 10.0.1.10 port 5201
[ 13] local 10.0.1.116 port 43876 connected to 10.0.1.10 port 5201
[ 15] local 10.0.1.116 port 43892 connected to 10.0.1.10 port 5201
[ 17] local 10.0.1.116 port 43904 connected to 10.0.1.10 port 5201
[ 19] local 10.0.1.116 port 43914 connected to 10.0.1.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   261 MBytes  2.19 Gbits/sec   16    189 KBytes       
[  7]   0.00-1.00   sec   261 MBytes  2.19 Gbits/sec   67    147 KBytes       
[  9]   0.00-1.00   sec   261 MBytes  2.19 Gbits/sec    0    141 KBytes       
[ 11]   0.00-1.00   sec   261 MBytes  2.19 Gbits/sec   38    165 KBytes       
[ 13]   0.00-1.00   sec   260 MBytes  2.18 Gbits/sec    0    202 KBytes       
[ 15]   0.00-1.00   sec   260 MBytes  2.18 Gbits/sec    0    201 KBytes       
[ 17]   0.00-1.00   sec   260 MBytes  2.18 Gbits/sec   79    182 KBytes       
[ 19]   0.00-1.00   sec   261 MBytes  2.19 Gbits/sec    3    543 KBytes       
[SUM]   0.00-1.00   sec  2.04 GBytes  17.5 Gbits/sec  203             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    189 KBytes       
[  7]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    147 KBytes       
[  9]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    141 KBytes       
[ 11]   1.00-2.00   sec   269 MBytes  2.25 Gbits/sec    0    165 KBytes       
[ 13]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    215 KBytes       
[ 15]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    214 KBytes       
[ 17]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec    0    182 KBytes       
[ 19]   1.00-2.00   sec   270 MBytes  2.26 Gbits/sec   18    379 KBytes       
[SUM]   1.00-2.00   sec  2.11 GBytes  18.1 Gbits/sec   18             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    189 KBytes       
[  7]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    147 KBytes       
[  9]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    141 KBytes       
[ 11]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    165 KBytes       
[ 13]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    215 KBytes       
[ 15]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    214 KBytes       
[ 17]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    182 KBytes       
[ 19]   2.00-3.00   sec   268 MBytes  2.24 Gbits/sec    0    379 KBytes       
[SUM]   2.00-3.00   sec  2.09 GBytes  17.9 Gbits/sec    0             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    189 KBytes       
[  7]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    147 KBytes       
[  9]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    141 KBytes       
[ 11]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    165 KBytes       
[ 13]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    215 KBytes       
[ 15]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    214 KBytes       
[ 17]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    182 KBytes       
[ 19]   3.00-4.00   sec   266 MBytes  2.24 Gbits/sec    0    379 KBytes       
[SUM]   3.00-4.00   sec  2.08 GBytes  17.9 Gbits/sec    0             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    189 KBytes       
[  7]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    147 KBytes       
[  9]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec   12    141 KBytes       
[ 11]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    165 KBytes       
[ 13]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    215 KBytes       
[ 15]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    214 KBytes       
[ 17]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    182 KBytes       
[ 19]   4.00-5.00   sec   266 MBytes  2.24 Gbits/sec    0    379 KBytes       
[SUM]   4.00-5.00   sec  2.08 GBytes  17.9 Gbits/sec   12             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    189 KBytes       
[  7]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    189 KBytes       
[  9]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    141 KBytes       
[ 11]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec   78    192 KBytes       
[ 13]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec   90    158 KBytes       
[ 15]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    239 KBytes       
[ 17]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    182 KBytes       
[ 19]   5.00-6.00   sec   266 MBytes  2.23 Gbits/sec    0    380 KBytes       
[SUM]   5.00-6.00   sec  2.08 GBytes  17.8 Gbits/sec  168             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec   36    132 KBytes       
[  7]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec   92    133 KBytes       
[  9]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec    1    180 KBytes       
[ 11]   6.00-7.00   sec   263 MBytes  2.20 Gbits/sec  136    143 KBytes       
[ 13]   6.00-7.00   sec   263 MBytes  2.21 Gbits/sec  113    192 KBytes       
[ 15]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec  104    175 KBytes       
[ 17]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec   40    130 KBytes       
[ 19]   6.00-7.00   sec   269 MBytes  2.25 Gbits/sec    0    382 KBytes       
[SUM]   6.00-7.00   sec  2.09 GBytes  17.9 Gbits/sec  522             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    132 KBytes       
[  7]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    133 KBytes       
[  9]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    180 KBytes       
[ 11]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    143 KBytes       
[ 13]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    192 KBytes       
[ 15]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    175 KBytes       
[ 17]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    130 KBytes       
[ 19]   7.00-8.00   sec   268 MBytes  2.25 Gbits/sec    0    382 KBytes       
[SUM]   7.00-8.00   sec  2.09 GBytes  18.0 Gbits/sec    0             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    132 KBytes       
[  7]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    133 KBytes       
[  9]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    180 KBytes       
[ 11]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    143 KBytes       
[ 13]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    192 KBytes       
[ 15]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    175 KBytes       
[ 17]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    0    130 KBytes       
[ 19]   8.00-9.00   sec   269 MBytes  2.25 Gbits/sec    1    382 KBytes       
[SUM]   8.00-9.00   sec  2.10 GBytes  18.0 Gbits/sec    1             
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    132 KBytes       
[  7]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec   14    133 KBytes       
[  9]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    180 KBytes       
[ 11]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    143 KBytes       
[ 13]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    192 KBytes       
[ 15]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    175 KBytes       
[ 17]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    130 KBytes       
[ 19]   9.00-10.00  sec   269 MBytes  2.26 Gbits/sec    0    382 KBytes       
[SUM]   9.00-10.00  sec  2.10 GBytes  18.0 Gbits/sec   14             
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec   52             sender
[  5]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[  7]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec  173             sender
[  7]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[  9]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec   13             sender
[  9]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[ 11]   0.00-10.00  sec  2.60 GBytes  2.23 Gbits/sec  252             sender
[ 11]   0.00-10.00  sec  2.60 GBytes  2.23 Gbits/sec                  receiver
[ 13]   0.00-10.00  sec  2.60 GBytes  2.23 Gbits/sec  203             sender
[ 13]   0.00-10.00  sec  2.60 GBytes  2.23 Gbits/sec                  receiver
[ 15]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec  104             sender
[ 15]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[ 17]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec  119             sender
[ 17]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[ 19]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec   22             sender
[ 19]   0.00-10.00  sec  2.61 GBytes  2.24 Gbits/sec                  receiver
[SUM]   0.00-10.00  sec  20.9 GBytes  17.9 Gbits/sec  938             sender
[SUM]   0.00-10.00  sec  20.9 GBytes  17.9 Gbits/sec                  receiver

iperf Done.

Code:
$ iperf -c 10.0.1.10 -d -P 8
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  2] local 10.0.1.116 port 45042 connected with 10.0.1.10 port 5001
[  5] local 10.0.1.116 port 45058 connected with 10.0.1.10 port 5001
[  3] local 10.0.1.116 port 45072 connected with 10.0.1.10 port 5001
[  4] local 10.0.1.116 port 45076 connected with 10.0.1.10 port 5001
[  7] local 10.0.1.116 port 45088 connected with 10.0.1.10 port 5001
------------------------------------------------------------
Client connecting to 10.0.1.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  8] local 10.0.1.116 port 45102 connected with 10.0.1.10 port 5001
[  1] local 10.0.1.116 port 45030 connected with 10.0.1.10 port 5001
[  6] local 10.0.1.116 port 45078 connected with 10.0.1.10 port 5001
[  9] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52020
[ 10] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52036
[ 11] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52046
[ 12] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52050
[ 13] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52064
[ 14] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52068
[ 15] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52084
[ 16] local 10.0.1.116 port 5001 connected with 10.0.1.10 port 52066
[ ID] Interval       Transfer     Bandwidth
[ 11] 0.0000-10.0073 sec  5.31 GBytes  4.56 Gbits/sec
[ 13] 0.0000-10.0039 sec  4.96 GBytes  4.26 Gbits/sec
[ 15] 0.0000-10.0028 sec  5.10 GBytes  4.38 Gbits/sec
[  4] 0.0000-10.0067 sec  1.74 GBytes  1.50 Gbits/sec
[  5] 0.0000-10.0068 sec  1.68 GBytes  1.44 Gbits/sec
[  2] 0.0000-10.0067 sec  1.70 GBytes  1.46 Gbits/sec
[ 16] 0.0000-9.9631 sec  4.55 GBytes  3.92 Gbits/sec
[  9] 0.0000-10.0069 sec  4.84 GBytes  4.15 Gbits/sec
[ 14] 0.0000-10.0040 sec  5.67 GBytes  4.87 Gbits/sec
[  1] 0.0000-10.0067 sec  1.66 GBytes  1.43 Gbits/sec
[  7] 0.0000-10.0068 sec  1.63 GBytes  1.40 Gbits/sec
[  3] 0.0000-10.0068 sec  1.58 GBytes  1.36 Gbits/sec
[  8] 0.0000-10.0068 sec  1.73 GBytes  1.48 Gbits/sec
[ 10] 0.0000-10.0056 sec  5.88 GBytes  5.05 Gbits/sec
[  6] 0.0000-10.0067 sec  1.57 GBytes  1.35 Gbits/sec
[ 12] 0.0000-10.1244 sec  6.07 GBytes  5.15 Gbits/sec
[SUM] 0.0000-10.1286 sec  55.7 GBytes  47.2 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.105/0.163/0.259/0.055 ms (tot/err) = 8/0

Code:
$ iperf -c 10.0.1.10
------------------------------------------------------------
Client connecting to 10.0.1.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 10.0.1.116 port 45978 connected with 10.0.1.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0140 sec  20.9 GBytes  18.0 Gbits/sec

$ iperf -c 10.0.1.10 -R
------------------------------------------------------------
Client connecting to 10.0.1.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 10.0.1.116 port 33930 connected with 10.0.1.10 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *1] 0.0000-10.0026 sec  24.7 GBytes  21.2 Gbits/sec


Oddly enough, the numbers seem higher through the switch, which is confusing the shit out of me.

Maybe there is a benefit to actually using optical transceivers? I had heard this posited before, but it never made sense to me, because you'd think there would at least be some sort of latency penalty in the transceivers, but...

Here are NIC -> DAC -> NIC pings:
Code:
--- 10.0.2.10 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 101361ms
rtt min/avg/max/mdev = 0.061/0.091/0.118/0.013 ms

And here are NIC -> transceiver -> fiber -> transceiver -> switch -> transceiver -> fiber -> transceiver -> NIC pings:
Code:
--- 10.0.1.10 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 101353ms
rtt min/avg/max/mdev = 0.054/0.085/0.122/0.012 ms

Going through the switch using four optical transceivers and two fibers actually has both a lower mean latency and a (slightly) lower standard deviation.

That's nuts.
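
(For anyone wanting to reproduce the comparison: the stats above are just plain 100-count pings over each path. Something like the following, using the same addresses as above; the exact invocation here is just an example.)

Code:
# Direct NIC -> DAC -> NIC path
ping -c 100 10.0.2.10

# Switched path: NIC -> transceiver -> fiber -> switch -> fiber -> transceiver -> NIC
ping -c 100 10.0.1.10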
 
It's not that surprising to me, as DAC cables seem to be a bonus, a perk, rather than the intended purpose of SFPs. I've been using SFP optical transceivers for about 19 years and have only had a use case for DACs for a couple of those; I didn't even discover they were an option until then.
 
It's not that surprising to me, as DAC cables seem to be a bonus, a perk, rather than the intended purpose of SFPs. I've been using SFP optical transceivers for about 19 years and have only had a use case for DACs for a couple of those; I didn't even discover they were an option until then.
Ah,

I started using DAC cables for all of my shorter runs (within rack, etc.) due to cost, and because, being passive, they seemed like simpler devices that would perform better.

It wasn't until years later that I found the more complex transceivers actually perform better, which makes absolutely no sense to me from a science/engineering perspective.

But you are probably right. It's probably because the devices (switches, NICs, etc.) are optimized for the transceivers.
 
A new issue I have since going 40gbit is that downstream devices are now issuing pause frames like crazy when flow control is on.

40gbit seemed like a great idea on the server. I don't need it often, but on the rare occasion that the planets align and all the different VMs and other activity on the server are running at the same time, it is great to not have a constriction. One of the many VMs on the server runs a MythTV (PVR) backend, recording live TV on my schedule directly to the ZFS pool. If the network bandwidth was exhausted, it would result in failed or stuttery recordings. This wouldn't happen often at 10gbit, but when it did happen it was a big annoyance, and I often didn't notice until weeks later when going to view the recording.

(Well, the better half's recordings usually, as I am not a big TV watcher.)

Now I've replaced that problem with a switch buffer overflow and flow control issue at times of lower bandwidth use. Can't win 'em all I guess.

At least the pause frames don't seem to be having too much of a negative impact. Pings remain sub-0.1 ms even when pause frames are being issued to an interface (which is honestly quite surprising, and suggests flow control has gotten a hell of a lot better since I last had to deal with it). It used to result in crazy high latency, forcing me to choose between latency with flow control on and packet loss with flow control off. But that was a long (long) time ago.

I'm used to viewing pause frames as a problem as a result, but if they aren't negatively impacting latency, why worry? :p
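
(In case it's useful to anyone else poking at the same thing: on the Linux side, flow control settings and pause counters can be inspected with ethtool, roughly like the following. The interface name is just a placeholder, and the exact counter names depend on the NIC driver.)

Code:
# Show the current flow control (pause frame) settings for the interface
ethtool -a enp3s0f0

# Disable pause frame handling for RX and TX (use "on" to re-enable)
ethtool -A enp3s0f0 rx off tx off

# Dump NIC statistics and look for pause-related counters
# (counter names vary by driver; Intel NICs typically expose xon/xoff counts)
ethtool -S enp3s0f0 | grep -iE 'pause|xon|xoff'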
 
Can you throttle the network connections of the VMs to less than 40Gbit? Perhaps balance it so you aren't fully maxing out the switch but still gain significantly over your 10Gbit. I have the option in Hyper-V.

 
Can you throttle the network connections of the VMs to less than 40Gbit? Perhaps balance it so you aren't fully maxing out the switch but still gain significantly over your 10Gbit. I have the option in Hyper-V.


Good idea! I'm pretty sure I can. I'll have to look at it.

I wanted to let it run free to take advantage of all the bandwidth it can get when available, but if it's just going to result in pause frames anyway, what's the point, right?

Limiting it to a max of 5-10Gbit per VM probably wouldn't hurt, and may get rid of the pause frames.
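
(Rough sketch only: if the host happens to be libvirt/KVM, which this thread doesn't actually establish, per-interface caps can be set at runtime with virsh. libvirt takes the limits in KB/s, so about 625000 KB/s is roughly a 5Gbit cap; the domain and interface names below are placeholders. On Hyper-V, it's the per-adapter bandwidth limit option mentioned above.)

Code:
# List the guest's network interfaces to find the target device (placeholder names)
virsh domiflist mythtv-vm

# Cap both directions to roughly 5 Gbit/s (values in KB/s) without restarting the guest
virsh domiftune mythtv-vm vnet0 --live --inbound 625000 --outbound 625000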
 
A new issue I have since going 40gbit is that downstream devices are now issuing pause frames like crazy when flow control is on.

Yeah, I never liked Ethernet flow control. IMHO packet loss is preferable, but that was with 1/10Gbit links and almost all TCP flows.

One of the many VMs on the server runs a MythTV (PVR) backend, recording live TV on my schedule directly to the ZFS pool.
Not sure if I've asked you about MythTV before, but what are you using for a frontend these days? Also, have you done anything with ATSC 3.0? My MythTV has gone pretty much dormant; I lost interest as broadcast quality dropped (both in terms of content and encoding).
 
Yeah, I never liked Ethernet flow control. IMHO packet loss is preferable, but that was with 1/10Gbit links and almost all TCP flows.


Not sure if I've asked you about MythTV before, but what are you using for a frontend these days? Also, have you done anything with ATSC 3.0? My MythTV has gone pretty much dormant; I lost interest as broadcast quality dropped (both in terms of content and encoding).
Agreed. I only use it for live sports, news, and specials, and that's it. I download all my wife's shows through Usenet + Sonarr and Plex. Wayyy more reliable.
 