PCIe Bifurcation

One last shot for the road. Installed in the case.

Well this has been an interesting journey. I've now got M.2 SSD and 10GbE - a previously impossible combination with this case / mobo. I actually can't remember the last time I designed a PCB that didn't have to be revised. These were made blue because I expected they'd be a prototype, and the final was going to be black. So much for that.

A big thanks to C_Payne for the tips and many renders of his designs which inspired mine ;-)

installed.jpg
 
Impressive work with the cable. I would have lost all patience soldering it. It looks pretty decent quality too; one can see some experience there.
 
Are bifurcation options limited to ITX boards, or are there micro-ATX (mATX) boards known to support it too? I read here that while some series support splitting, it's not necessarily exposed in the UEFI settings, so I'm not sure whether there's a way of checking without someone's first-hand experience.

I was planning a potential build around AMD and trying to find a way of splitting two x16 slots (available on some motherboards) into x8/x8 each, for a total of four usable x8 slots. However, it depends on how feasible this is.

Also, on adapters: apart from adding ribbon risers, are there any workarounds for the hard right-angle offset of the bifurcation adapters? I understand they're mostly used in SFF cases with vertical GPU brackets, but for cases without such an arrangement it could pose challenges for where the hardware can be positioned/mounted. It also seemed from reading smallformfactor.net that very few ribbon/flexible risers are considered high quality (3M appears to be the most reliable?), although that concern might be overblown; I'm not familiar enough with it.
 
Note that on Ryzen, two x16 (mechanical) slots means they are already split into x8 each.
You can't do x8/x8 per slot then, of course, only x4/x4.

Bifurcation cards and risers in combination are fine for Gen3. Gen4 poses significant signal-integrity challenges; careful consideration is necessary when planning a Gen4 build.
 
Note that on Ryzen, two x16 (mechanical) slots means they are already split into x8 each.
You can't do x8/x8 per slot then, of course, only x4/x4.
Hmm, I'm a bit confused. When looking earlier I saw this post showing the UEFI settings for a B550M Pro4 (the board has one PCIe 4.0 x16 slot plus another PCIe 3.0 x16 slot), which listed x16, x8x8, x8x4x4 and x4x4x4x4 options, while this post from 2018 mentioned that the X370/X470 series has x8x8 bifurcation.

Maybe I'm just misunderstanding how it works.

Bifurcation cards and risers in combination are fine for Gen3. Gen4 poses significant signal-integrity challenges; careful consideration is necessary when planning a Gen4 build.
Would splitting PCIe 4.0 be an issue if I weren't using it for a GPU? I only really need it for a network card and two drive controllers (all of which use x8; two of the cards are PCIe 2.0 and one is PCIe 3.0). If any GPU were used, it'd only be some older card for occasional video out.
 
When using Gen3/Gen2 devices you're good.

Ryzen has 24 usable lanes (4 to the chipset, 4 for M.2, 16 for slots 1/2).
When you use only slot 1, you can use x16, x8x8, x8x4x4 or x4x4x4x4 in slot 1.
When you use slots 1 and 2, you can use x8 or x4x4 in each of them.
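
If it helps, here is the same breakdown as a quick sanity-check script (plain Python, purely illustrative; the lane counts are the ones described above):

Code:
# Rough sketch of how the 16 CPU-attached slot lanes on mainstream Ryzen
# can be divided, per the description above (illustrative only).

TOTAL_SLOT_LANES = 16  # of the 24 CPU lanes: 4 go to the chipset, 4 to M.2

# Bifurcation patterns typically exposed when only slot 1 is populated
single_slot_options = [(16,), (8, 8), (8, 4, 4), (4, 4, 4, 4)]

# With both slots populated they get 8 lanes each, so x8 or x4x4 per slot
dual_slot_options = [(8,), (4, 4)]

def fmt(widths):
    return "x" + "x".join(str(w) for w in widths)

for opt in single_slot_options:
    assert sum(opt) == TOTAL_SLOT_LANES
    print("slot 1 only:", fmt(opt))

for opt in dual_slot_options:
    assert sum(opt) == TOTAL_SLOT_LANES // 2
    print("slots 1 + 2, each:", fmt(opt))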
 
When using Gen3/Gen2 devices you're good.

Ryzen has 24 usable lanes (4 to the chipset, 4 for M.2, 16 for slots 1/2).
When you use slots 1 and 2, you can use x8 or x4x4 in each of them.
I see now. That's a pity; I'll have to rethink this. I found a post suggesting that with a Threadripper CPU (with its 64 lanes) and a compatible motherboard, four slots at that bandwidth should be possible, although those parts are much pricier.
 
The DHL van has been and gone, and it's new riser day again today. This is my first physical build of the other riser I showed on the previous page. The Mezzanine PCB is shown in position but isn't electrically attached yet.

I'm hoping this one will be a slam dunk. My previous riser's schematic and footprints were derived from this one, and it's also 4-layer, so SE (single-ended) impedance will be closer to spec. It should just work, right? ;-)


t9120a.jpg

t9120b.jpg

t9120c.jpg
 
Off to a good start: x8 and M.2 are working well. Booting from my sacrificial Sabrent, with a SAS 2308 in the x8 slot.

incase.jpg windows.png

Now we get to the tricky bit: running an x4 from the left-hand 3.5" bay. It's going to be a bit of a job to build all of that. Watch this space...
 
And there we go. Finally...

bc1.jpg bc2.pg.jpg sc3.jpg

Works brilliantly! Two PCIe cards in a case designed for one. Quite a bit of metalwork in all of that, but it all worked out well in the end. The mezzanine PCB has a 0.5mm offset error (not too surprising, as I hadn't decided on the exact position of the card).

As I said previously, I have spares of these, so could make them available if anyone else wanted them, although the metalwork here is a one-off.
 
New board coming soon. Similar to C_Payne's larger and better-designed slimline adapters, but for non-GPU use (apologies to anyone still reading and getting bored of these posts ;-)

It's a compact OCuLink x4 to PCIe x8 (mechanical) adapter. I don't have all the 3D models, so some imagination is required. I actually don't have any specific use for it right now, but designing PCIe adapters is quite addictive, and my ASRock Rack boards are bristling with OCuLink ports crying out to have something connected to them.

oculinkadapter.png
 
It's built! Soldering down the OCuLink connector was a bit more trouble than I was expecting, but I figured it out eventually. It smashed the benchmark: full Gen3 performance achieved (I don't have anything newer, but I doubt it'd work anyway).

Using an OCuLink-to-PCIe adapter is a pretty interesting alternative to bifurcation, which makes me a bit off-topic here, I admit. I'm quite surprised how few products there are that do this, given how many boards come with OCuLink/U.2 connectors fitted.

assembled.jpg


fullsetup.jpg ocuperf.png windowsocu.png
 
One more test: I doubled the cable length to 75cm and replaced the active carrier with a passive one, just in case I was hanging on by a thread.

Still got full gen 3 speeds ;-)

passivetest.jpg
 
Hi,

Does anyone with an Ncase splitter have images showing how it is placed?

I can't visualize how it would look, or how I would have to mount things if I bought one for a two-slot GPU plus a capture card.

thanks
 
My apologies if I'm asking something that was covered previously. I've skimmed most of the thread, but it is 28 pages, so I may have missed it. If I were looking to run 8 GPUs at PCIe x8, what are my options? Preferably leaning towards the less expensive side. It seems like an X399 motherboard with something like a Threadripper 1900X? Maybe a low-end EPYC setup? A lot of the information I've found online is pretty old, but it seems like a lot of boards only allow x4/x4/x4/x4 bifurcation instead of splitting an x16 slot into two x8 slots.

I would be using it to run BOINC projects and Folding@home, and a PCIe x1 or x4 slot can bottleneck some of those projects. I appreciate any input anybody has! I would probably be using this card from C_Payne https://c-payne.com/products/pcie-bifurcation-card-x8x8-3w
 
My apologies if I'm asking something that was covered previously. I've skimmed most of the thread, but it is 28 pages, so I may have missed it. If I were looking to run 8 GPUs at PCIe x8, what are my options? Preferably leaning towards the less expensive side. It seems like an X399 motherboard with something like a Threadripper 1900X? Maybe a low-end EPYC setup? A lot of the information I've found online is pretty old, but it seems like a lot of boards only allow x4/x4/x4/x4 bifurcation instead of splitting an x16 slot into two x8 slots.

I would be using it to run BOINC projects and Folding@home, and a PCIe x1 or x4 slot can bottleneck some of those projects. I appreciate any input anybody has! I would probably be using this card from C_Payne https://c-payne.com/products/pcie-bifurcation-card-x8x8-3w
I would suggest something like this: https://www.asrockrack.com/general/productdetail.asp?Model=ROMED4ID-2T#Specifications - frankly, there isn't anything else quite like it.

Then use C_Payne's slimline cards for the GPUs https://c-payne.com/products/slimsas-pcie-device-adapter-2-8i-to-x16 (assuming the clock comes off the connector serving lanes 1-8)

Basically a scaled up version of what I just demonstrated. It would be built like a mining rig, GPUs all cabled off the motherboard.
 
Preferably leaning towards the less expensive side.
I don't think you're going to end up on the inexpensive side no matter which path you take, but just how far away from it you land depends on a few things.

I think the main questions you need to think about and answer are:
  1. What's my budget?
  2. When will I be bottlenecked by PCIe Gen3?
  3. Will I go beyond eight GPUs?
  4. Will I use the CPU for more than just orchestrating the GPUs? With what kind of workloads?

Looking at the ROMED4ID-2T that inaxeon linked above, I see a few potential issues already:
  • It doesn't support 1st-gen EPYC, and there's a massive gulf in price between 1st-gen and newer chips.
  • Two of the six slimline connectors are low-profile. Like the author of the linked article, I couldn't find a proper cable for these connectors.
  • There is no room for expansion w.r.t. 8-lane slots; eight is your limit on this board.
    If two of the slimline ports can't be used reliably for PCIe due to cabling, then the board fails your basic requirements.
 
For a minimal open-air system based on the ASRock ROMED8-2T with eight fat GPUs, I get about $191 per x8 slot for gen3, $273/slot for gen4.

The board slot configuration would look like this:
Code:
Slot  Device
----------------
1     GPU1
2     Empty (Blocked by GPU1)
3     Empty (Blocked by GPU1)
4     Bifurcation Host Adapter 1 -> GPU2, GPU3
5     Bifurcation Host Adapter 2 -> GPU4, GPU5
6     Bifurcation Host Adapter 3 -> GPU6, GPU7
7     GPU8

You can slowly expand up to 14 x8 slots on this board. Fully bifurcated, I get about $149/slot for gen3, $237/slot for gen4.

Gen3 is bifurcated using MaxCloudOn Kits.
Gen4 is bifurcated using C_Payne's Slimline host/device adapters and recommended 3M cable.


Powering this many GPUs/risers at once is left as an exercise for the reader (and is completely unaccounted for in the figures above).
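
For anyone pricing this out, the totals implied by those per-slot figures (just multiplying them out; only the per-slot numbers come from the breakdown above, the rest is arithmetic):

Code:
# Implied rough totals from the per-slot figures quoted above.
# Per-slot prices come from this post; the rest is plain arithmetic.

configs = {
    "8 x8 slots, gen3": (8, 191),
    "8 x8 slots, gen4": (8, 273),
    "14 x8 slots, gen3": (14, 149),
    "14 x8 slots, gen4": (14, 237),
}

for name, (slots, per_slot) in configs.items():
    print(f"{name}: ~${slots * per_slot} total (power delivery excluded)")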
 
I appreciate the information from both of you. I may just stick to 4-5 GPUs per machine, which would allow me to stick with less expensive hardware, or I may look into something dual-socket. I think a dual-socket Haswell Xeon would have plenty of PCIe lanes, but I'd need to do some research to find a board that supports bifurcation. To answer the questions, though:

  1. What's my budget? - Ultimately, I guess unlimited, but I'd prefer to spend as little as possible without bottlenecking the GPUs. I would favor an older platform that uses more power over a more efficient platform that's far more expensive. I'm looking to build at least a couple of these, so EPYC Rome hardware might exceed what I want to spend. I was hoping a Naples EPYC or 1st-gen Threadripper could be a good option, or something like a Haswell Xeon.
  2. When will I be bottlenecked by PCIe Gen3? - I don't think I will, ever. Even x4 PCIe Gen3 is probably sufficient for what I'm running (see the quick bandwidth sketch after this list), but I wanted to stick to x8 just to make sure I don't cause any bottlenecks.
  3. Will I go beyond eight GPUs? - I will, but in additional machines. Eight seemed like a solid choice per machine, but maybe 6 or so would make more sense.
  4. Will I use the CPU for more than just orchestrating the GPUs? With what kind of workloads? - It will not, but the BOINC GPU projects I run do require a fairly decent processor core per GPU to avoid bottlenecking. Anything pre-Haswell is likely to bottleneck the GPU. I have other machines without GPUs for running anything that is CPU-intensive.
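
On point 2 above, the theoretical link bandwidth (line rate times encoding efficiency, ignoring packet overhead) works out roughly as below, which is why I think x8 Gen3 leaves plenty of headroom:

Code:
# Theoretical PCIe throughput per link, ignoring packet/protocol overhead.
# Gen2 uses 8b/10b encoding, Gen3/4 use 128b/130b.

GENS = {
    # gen: (transfer rate in GT/s per lane, encoding efficiency)
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}

def link_gbps(gen: int, lanes: int) -> float:
    """Approximate usable bandwidth of a PCIe link in GB/s."""
    rate, efficiency = GENS[gen]
    return rate * efficiency * lanes / 8  # GT/s -> GB/s per lane, times lanes

for gen in (2, 3, 4):
    for lanes in (1, 4, 8, 16):
        print(f"gen{gen} x{lanes}: ~{link_gbps(gen, lanes):.1f} GB/s")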
 
Wow, that's a non-trivial amount of CPU load. You really should test for bottlenecks with like hardware before going any further.

For Zen, adding additional GPUs via bifurcation means you may end up with more than one worker per CCX. The specific slots you bifurcate may matter in this regard. You'll also want to investigate cross-CCX latency. You may find out your workers can eat cross-CCX penalties just fine, or that you need to budget for a DIMM per channel per host.

For Haswell, the mitigations for Meltdown/Spectre crush I/O performance. It's probably worth finding out whether this penalty is flat or scales with the GPU count.
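
As a starting point for the Haswell question, recent Linux kernels expose the active mitigations under sysfs, so it's easy to see what a given box is running with before and after any tuning; a small Linux-only sketch:

Code:
# Quick look at which CPU vulnerability mitigations are active on a Linux
# host (the sysfs path exists on any reasonably recent kernel).
from pathlib import Path

vuln_dir = Path("/sys/devices/system/cpu/vulnerabilities")

if vuln_dir.is_dir():
    for entry in sorted(vuln_dir.iterdir()):
        print(f"{entry.name}: {entry.read_text().strip()}")
else:
    print("sysfs vulnerabilities directory not found (old kernel?)")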
 
Wow, that's a non-trivial amount of CPU load. You really should test for bottlenecks with like hardware before going any further.
Yeah, that's probably what I need to do. I have a bunch of hardware here already, and I'm sure at least some of it supports bifurcation; I should just order some of those C_Payne adapters and try it out. I'll post back to this thread if I get a functioning setup and report how it worked out. It's fairly challenging to test for bottlenecks with BOINC, though, since there are so many different projects with different requirements. Some projects barely use any CPU at all to run the GPU, others will use a whole CPU core; some need PCIe lanes, others run fine on an x1 riser, etc.
 
Hello, I wish to move my NAS to a more compact box like the Node 304.
I saw that an Ncase bifurcation cable was offered to one person earlier, but I can't understand how you plug in the splitter and cards and still fit them in the slot.
Can you please advise?
I want to fit an LSI card and a single-slot GPU.
 
Hi guys! I need a bifurcation adapter for two U.2 drives on a PCIe x8 slot, but with a vertical layout (I can only fit a half-length card there). It will be used in my homelab server.
I found two options, and both are no-name Chinese products (no official documentation, tech sheets, good reviews, etc.), so I can only choose by the pictures.

1. Option 1 (PCIe Gen3 x8)
This option has no electronic components at all - just the PCIe lanes split out, and that's all. Strangely, it is more expensive than the next one. High-quality pics are available here.

2. Option 2 (maybe Gen4, but it's not clear)
This one has some components installed. I can't tell whether it's Gen4 or not, and I don't really care (my motherboard and disks are Gen3 anyway). I cannot find any high-res pics of this one, but it looks like it has one capacitor per disk plus some integrated circuit in the middle. My disks are PLP-enabled, so again I don't care about the capacitors.

Which one would you choose, and why? I.e., should I pay more for the architecturally simpler card or not?
I'm aiming for minimal additional latency and good reliability.
 
malexejev

1. Option one has no clock distribution. The drives AND the mainboard/CPU must support the SRIS (Separate Refclk Independent Spread) clocking architecture.
2. This one most likely has a Gen3 clock buffer on board.

Both cards may or may not work in Gen4 mode; that depends on too many things to make a reliable forecast.
Choose option 2.
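
Whichever you choose, it's worth verifying afterwards what the drives actually trained at. A minimal Linux sketch (assumes lspci is installed; the device address below is only an example, substitute your drive's own):

Code:
# Minimal sketch: check what speed/width a device actually trained at,
# by parsing lspci -vv output on Linux. The BDF address below is an
# example only; substitute your own U.2 drive's address.
import re
import subprocess

BDF = "0000:41:00.0"  # hypothetical address of one of the U.2 drives

out = subprocess.run(
    ["lspci", "-s", BDF, "-vv"], capture_output=True, text=True, check=True
).stdout

match = re.search(r"LnkSta:\s*Speed\s+([\d.]+GT/s)[^,]*,\s*Width\s+(x\d+)", out)
if match:
    speed, width = match.groups()
    print(f"{BDF} trained at {speed}, {width}")
else:
    print("LnkSta line not found; is this a PCIe device and are you root?")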
 
Hi,

I’m looking for a Mini-ITX motherboard supporting 4x4 (x4+x4+x4+x4) bifurcation because I want to use a Hyper card to add 4 extra NVMe SSD drives.

All I can find are motherboards supporting x8+x4+x4, which (if I'm correct) means only 3 of the card's extra M.2 slots would be usable.

Thanks!
 
Yes, your understanding is correct. The latest Intel mainstream platforms only support x8/x8 at most.

The Gigabyte Aorus X570 does support x4x4x4x4, though.

Also check the ASRock Rack X570, as it also boasts 2x OCuLink 4i for even more storage.

The craziest option would be the ASRock Rack deep mini-ITX board, the ROMED4ID-2T, which has 64 available Gen4 lanes.

C.
 
Yes, your understanding is correct. The latest Intel mainstream platforms only support x8/x8 at most.

The Gigabyte Aorus X570 does support x4x4x4x4, though.

Also check the ASRock Rack X570, as it also boasts 2x OCuLink 4i for even more storage.

The craziest option would be the ASRock Rack deep mini-ITX board, the ROMED4ID-2T, which has 64 available Gen4 lanes.

C.

Thanks. I did indeed see some AM4 motherboards, like the Gigabyte one you mentioned.

Are there any Intel Mini-ITX motherboards with 4x4 support?

The other options you mentioned are sadly too expensive.
 
Thanks. I did indeed see some AM4 motherboards, like the Gigabyte one you mentioned.

Are there any Intel Mini-ITX motherboards with 4x4 support?

The other options you mentioned are sadly too expensive.
Only the ASRock X99 and ASRock X299 boards, as well as their ASRock Rack counterparts.
Intel mainstream CPUs don't support x4x4x4x4 at all.
 
Only the ASRock X99 and ASRock X299 boards, as well as their ASRock Rack counterparts.
Intel mainstream CPUs don't support x4x4x4x4 at all.
Not even motherboards in other form factors?

If not, I guess I'm stuck with either Intel x8+x4+x4 (3 usable slots) or AMD…
 
It's the CPU, not the motherboard.

If you want a budget NVMe system, consider using an old Xeon v2 platform.
I can suggest one of the Chinese X79 boards; you can fit two Hyper cards.
I think the Atermiter X79 Turbo could work for you.

Another option is to use an x8x8 bifurcation card and then two ASM2824 NVMe adapter cards. (Those have a PCIe packet switch on board and don't require bifurcation support.) You could use one directly as well, but they only have an x8 uplink.
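
To put rough numbers on that x8-uplink caveat, assuming you hung all four Gen3 x4 drives off a single switch card (a sketch, theoretical rates only):

Code:
# Rough oversubscription math for an NVMe adapter with a PCIe packet switch:
# four x4 drives downstream sharing a single x8 uplink (gen3 figures).

GEN3_GBPS_PER_LANE = 8.0 * (128 / 130) / 8  # ~0.985 GB/s per lane

drives = 4
lanes_per_drive = 4
uplink_lanes = 8

downstream = drives * lanes_per_drive * GEN3_GBPS_PER_LANE
uplink = uplink_lanes * GEN3_GBPS_PER_LANE

print(f"downstream aggregate: ~{downstream:.1f} GB/s")
print(f"uplink ceiling:       ~{uplink:.1f} GB/s")
print(f"oversubscription:     {downstream / uplink:.0f}:1 "
      "(only matters when several drives stream at once)")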
 
Another option is to use an x8x8 bifurcation card and then two ASM2824 NVMe adapter cards. (Those have a PCIe packet switch on board and don't require bifurcation support.) You could use one directly as well, but they only have an x8 uplink.

What do you mean by this? Sorry, I’m a total n00b as my title says.
 
I bought a MaxCloudON 1x16->2x8 riser from Bulgaria. I'll write about it later. I'm still waiting for other computer parts and not sure when I'll test it out; it might take another month to finish the computer.
 
Hello everyone,
Do the bifurcation cards from C_Payne work well with Supermicro motherboards? I have an X10SDV.
 