Building a 2U AMD Ryzen server (Proxmox GPU Passthrough / OBS or Xsplit in VM)

As I mentioned previously, one of the reasons for this server is to use it as a live streaming server for some of the IP cameras we use at our LAN parties. Since we now have 8MP/4K IP cameras, I also wanted to stream in 4K, preferably at 60 FPS, so that scrolling text looks crisp.

Content Index:

This build has spawned several articles:

  1. The build hardware and a ZFS storage how-to
  2. How to quiet the server down using different fans and PWM control
  3. This Article: A how-to on GPU passthrough within Proxmox for running Xsplit/OBS with GPU encoding and streaming to YouTube/Twitch

Proxmox/KVM GPU Passthrough

I use either OBS or Xsplit for streaming, but OBS is usually the more efficient of the two with the given resources. I’ve tested this in a Windows virtual machine before, but without GPU assistance for compositing and stream decoding/encoding the amount of CPU power needed is insane: about 80% of the 8-core Ryzen 1700x just to run 4K/30.

With the GTX1050 doing the compositing in passthrough and NVENC doing the encoding, CPU usage drops to about 10% of the 4 assigned cores and the GPU sits at around 60% utilization while sending 4K/60. That is a very big improvement and frees up the CPU for other tasks. 🙂

But how do you configure this? Sadly, it’s not (yet) possible to configure it from the GUI alone.

The information below is gathered from a lot of helpful threads on the Proxmox forum and all over the internet! Sadly there didn’t seem to be one place that lists everything, so I’m going to give it a try.

The following posts helped me:

PCI passthrough proxmox wiki

GPU passthrough tutorial/reference

Testing repository

At the time of writing it’s smart to enable the test repository within Proxmox. Especially for AMD Ryzen, the newer kernel has certain patches which make virtualization and passthrough work a lot more smoothly and efficiently than before!

The kernel I’m running at the time of writing is:

Linux duhmedia 4.13.13-5-pve #1 SMP PVE 4.13.13-38 (Fri, 26 Jan 2018 10:47:09 +0100) x86_64 GNU/Linux

Change boot parameters

Inside of /etc/default/grub change the following line to include:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on video=efifb:off"

This changes the GRUB boot parameters to enable IOMMU for AMD in PT (passthrough) mode. It also passes a video parameter that is essential to making the first slot available for GPU passthrough.

After changing the line run a:

update-grub

to make sure the changes are taken into account after you reboot. After the command completes, do a reboot.

After the reboot, check whether IOMMU is now active (make sure IOMMU is also enabled in your BIOS):

root@duhmedia:/etc/default# dmesg | grep -e DMAR -e IOMMU
[    0.594621] AMD-Vi: IOMMU performance counters supported
[    0.596624] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    0.597487] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[   14.039859] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

Looking good!
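
If you also want to confirm that the boot parameters themselves ended up on the kernel command line, a small sketch like the one below can help. The function name missing_params is made up for this example; it only compares a command line string against the parameters from the GRUB line and prints whichever ones are absent.

```shell
# Hypothetical helper (not part of Proxmox): print every expected parameter
# that does not appear as a whole word in the given kernel command line.
missing_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                # present, nothing to report
            *) printf '%s\n' "$param" ;;    # absent, print it
        esac
    done
}
```

On the host you would call it as follows; it prints nothing when all three parameters are active:

missing_params "$(cat /proc/cmdline)" iommu=pt amd_iommu=on video=efifb:off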

Blacklist drivers from loading

I specifically spec’ced the server to include an AMD Radeon video card and an NVidia video card. Since we want to use the AMD Radeon card as our console card, we need to let Proxmox load any necessary drivers for it. The NVidia card, however, is the one we want to pass through, so the host shouldn’t load any drivers for it or initialize it.

To accomplish this enter the following:

echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf

When that is done, enter the following commands to make sure the VFIO modules are loaded during kernel initialization:

echo vfio >> /etc/modules
echo vfio_iommu_type1 >> /etc/modules
echo vfio_pci >> /etc/modules
echo vfio_virqfd >> /etc/modules

When that is done run a:

update-initramfs -u
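
One thing to watch out for: the echo commands above append blindly, so running them a second time leaves duplicate lines in the files. A minimal sketch of a safer variant is shown below; append_once is a name I made up for illustration, and it only adds a line when that exact line is not there yet.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (so repeating the step does not create duplicates).
append_once() {
    file=$1
    line=$2
    touch "$file"                                   # make sure the file exists
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For the module list this would look like:

for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do append_once /etc/modules "$m"; done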

Once that is done, we’re going to find the PCI address of the NVidia GPU we want to pass through. Run a:

lspci -v

which will output a whole lot of information about all the PCI cards in your system. Find the NVidia card:

08:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd GP107 [GeForce GTX 1050]
        Flags: fast devsel, IRQ 354
        Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] #19
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau

08:00.1 Audio device: NVIDIA Corporation Device 0fb9 (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3747
        Flags: fast devsel, IRQ 355
        Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

and note its PCI address; in my case it’s 08:00 that we need. Remember, if you ever change anything in your device configuration (enable/disable sound card, USB, add a NIC, etc.) the order of the PCI devices can change and the following will need to be changed accordingly.

Run a:

lspci -n -s 08:00

to find the vendor and device IDs. We are going to need them to assign a different driver to the card. In my case the output was:

root@duhmedia:/etc/default# lspci -n -s 08:00
08:00.0 0300: 10de:1c81 (rev a1)
08:00.1 0403: 10de:0fb9 (rev a1)
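
To avoid copying the IDs by hand, the output can be turned into the comma-separated list that the vfio-pci option expects. This is just a sketch that assumes the output format shown here, where the vendor:device pair is the third field on every line; pci_ids_csv is a made-up name.

```shell
# Hypothetical helper: read `lspci -n -s <address>` output on stdin and print
# the vendor:device pairs (third field of each line) as one comma-separated list.
pci_ids_csv() {
    awk '{ printf "%s%s", sep, $3; sep = "," } END { if (NR) print "" }'
}
```

Usage on the host:

lspci -n -s 08:00 | pci_ids_csv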

Now that we know that, do the following:

echo options vfio-pci ids=10de:1c81,10de:0fb9 disable_vga=1 > /etc/modprobe.d/vfio.conf

Once that is done, reboot your system.

Once the reboot is complete, use lspci -v to check that the NVidia card is now using the vfio-pci driver instead of any NVidia driver, as can be seen in my example above.
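
Since lspci -v prints a lot, a short filter makes the check easier to read. The sketch below assumes the layout from my lspci output above (device lines start at the first column, detail lines are indented); drivers_in_use is a made-up name.

```shell
# Hypothetical helper: read `lspci -v` (or `lspci -k`) output on stdin and
# print "<address> <driver>" for every device that has a driver bound.
drivers_in_use() {
    awk '
        /^[0-9a-f]/ { addr = $1 }                     # device header line, e.g. "08:00.0 VGA ..."
        /Kernel driver in use:/ { print addr, $NF }   # last field is the driver name
    '
}
```

Usage on the host, where both functions of the card should report vfio-pci:

lspci -v -s 08:00 | drivers_in_use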

Creating the VM

Create a new virtual machine inside of Proxmox. During the wizard make sure to select these things:

  • Create the VM using “SCSI” as the Hard Disk controller
  • Under CPU select type “Host”
  • Under Network select Model “VirtIO”

After the wizard completes, we need to change a few things:

  • In addition to linking the default DVD-ROM drive to a Windows 10 ISO (if you are passing through to Windows), create a second DVD-ROM drive and link the VirtIO driver ISO to it; you will need it while installing Windows. Make sure the second DVD-ROM drive is assigned IDE 0
  • Under Options, change the BIOS to “OVMF (UEFI)”
  • Then under Hardware click Add and select “EFI Disk”

The next step I only know how to do using the CLI. For this, open an SSH session to the Proxmox host and run the following:

cd /etc/pve/qemu-server
ls

Here you will find the config files for your VMs. We need to add a line to the config of the VM you just built. In the web GUI you can see the ID, such as 100 or 103. Once you know the right ID, run the following (using your own ID instead of 100):

echo machine: q35 >> /etc/pve/qemu-server/100.conf

Now return to the web GUI and install Windows as normal. Because of the Hard Disk controller we selected, the install wizard won’t find a drive to install to at first. When that happens, point it at the ISO we linked to the second DVD-ROM drive and select the “vioscsi” directory to load the storage driver.

Make sure you can remote control

Once Windows is installed and updated, there is one thing you need to make sure of: that you can log in to the VM remotely. The reason is that if you boot the VM with GPU passthrough, it disables the standard VGA adapter, after which the built-in VNC/Spice console from Proxmox will no longer work.

This means you need to enable RDP and set a static IP or install any other form of remote control (VNC, Teamviewer, etc.).

Once that is done, shut down the VM; we need to make another config change!

Passing through the GPU

Now that we have everything prepared, we need to add the passthrough information:

echo hostpci0: XX:XX,x-vga=on,pcie=1 >> /etc/pve/qemu-server/100.conf

Replace the XX:XX with the PCI address we found earlier. In my case this was “08:00”.
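
Before starting the VM it can be worth checking that all the settings from this guide actually landed in the active part of the config file. As far as I know, Proxmox stores snapshots as “[name]” sections at the end of the same file, so a line appended with echo can end up inside a snapshot section instead of the current config. The sketch below is only an illustration: missing_vm_settings is a made-up name, and it assumes the lines are written exactly as in this guide.

```shell
# Hypothetical helper: report which passthrough settings are missing from the
# current (pre-snapshot) part of a Proxmox VM config file.
missing_vm_settings() {
    conf=$1
    for want in 'machine: q35' 'bios: ovmf' 'hostpci0:'; do
        awk -v want="$want" '
            /^\[/ { exit }                              # first snapshot section: stop reading
            index($0, want) == 1 { found = 1; exit }    # line starts with the wanted text
            END { exit !found }                         # status 0 only when found
        ' "$conf" || printf 'missing: %s\n' "$want"
    done
}
```

Usage on the host, which prints nothing when everything is in place:

missing_vm_settings /etc/pve/qemu-server/100.conf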

Now see if the VM starts. If it does, great, you are probably done! Windows should see the GPU and start installing a driver from Windows Update. Let it do that for a few minutes and, if you want, you can then replace the driver with a freshly downloaded one.

It’s not working, now what?

Well, this is a bit complex, but whether GPU passthrough will work depends on your PCIe layout and a few other things.
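
One way to look at the PCIe layout is to list the IOMMU groups the kernel has created. Generally, passthrough goes smoothest when the GPU and its audio function sit in a group of their own. The sketch below walks the standard sysfs directory for this; list_iommu_groups is a made-up name, and the optional argument exists only so the function can be pointed at another directory.

```shell
# Hypothetical helper: print "group <n>: <pci address>" for every device found
# under an iommu_groups directory (defaults to the kernel's sysfs location).
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue        # glob matched nothing: no groups present
        group=${dev%/devices/*}          # strip "/devices/<address>" from the path
        printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}
```

Run list_iommu_groups without arguments on the host and look for the lines containing your GPU’s address (08:00.0 and 08:00.1 in my case). If it prints nothing at all, IOMMU is most likely not active.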

One of the things that is important is your GPU BIOS. Some motherboards are more difficult to deal with than others, but if you are trying to pass through the same GPU that also displayed your BIOS information, you might find that it’s not working in that slot but does work in another slot or as a “second” video card.

If that’s the case, you need to get a copy of your GPU video BIOS and add it into the config file.

GPU passthrough ROM loading

As mentioned above, sometimes you need to give KVM your video ROM to be able to boot the VM correctly with GPU passthrough. It has something to do with having to re-initialize the video card to “disconnect” it from your motherboard’s BIOS control.

Download or Extract your video BIOS

Although there are guides on how to extract your video BIOS under Linux, none of these worked for me.

I looked around on TechPowerUp for a BIOS matching my card (GTX1050) to download, but I couldn’t find the correct one.

So I proceeded to install the GPU as a second video card (primary will work too) in another PC and booted Windows. There I used GPU-Z to extract the BIOS to a file.

Adding the ROM file

Get the ROM file downloaded to your Proxmox host and put it in “/usr/share/kvm/”.

Once that is done, we need to add the ROM to the line we added to the config earlier. So open /etc/pve/qemu-server/100.conf again and change the line to something like this:

hostpci0: XX:XX,x-vga=on,pcie=1,romfile=GTX1050.rom

Save it and try to start the VM. Hopefully now it works!

For me, it still didn’t though. Turns out there is another little trick to this!

Stripping the ROM file

NVidia has some sort of lock in some video ROMs to disable usage inside of a virtual machine. They want you to buy their more expensive cards, which do enable this functionality.

But we’re not trying to abuse the GPU for anything like it would be used for in the Enterprise, we just want to pass it through to a VM.

Recently, someone made a little Python tool to strip off the first part of the Video ROM to make it work again!

The Python script is called NVIDIA-vBIOS-VFIO-Patcher and can be found at the link.

Run the script on your Video ROM and upload the stripped/patched version to your Proxmox host (or run the script on there after installing Python). Put the file in the “/usr/share/kvm/” directory again and edit /etc/pve/qemu-server/100.conf to now include the following:

hostpci0: XX:XX,x-vga=on,pcie=1,romfile=patchedGTX1050.rom

Once that is done, try starting your VM again. For me, that fixed the issue and all was working!
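
If you want a quick sanity check on a ROM file before pointing the VM at it: a PCI option ROM image starts with the two signature bytes 0x55 0xAA. My understanding is that a raw GPU-Z dump can carry extra data in front of that signature and that this is the part that gets stripped, but treat that as an assumption. The sketch below only looks at the first two bytes and says nothing else about whether the ROM is usable; rom_has_signature is a made-up name.

```shell
# Hypothetical helper: succeed (exit status 0) if the file begins with the
# PCI option ROM signature bytes 0x55 0xAA, fail otherwise.
rom_has_signature() {
    sig=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')   # first two bytes as hex
    [ "$sig" = "55aa" ]
}
```

Usage on the host:

rom_has_signature /usr/share/kvm/patchedGTX1050.rom && echo "signature at start of file"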

Still having problems?

If you are still having problems, there could be numerous causes. This is also by no means a complete guide. For instance, I did not talk about any BIOS switches you need to set because I believe those to be obvious (such as IOMMU, etc.).

The best place to start resolving any issues you are still having is googling around for more KVM and GPU passthrough guides. As mentioned earlier, there are a lot of excellent posts on the Proxmox forum. But posts about, for instance, “UnRAID” from limetech or other operating systems such as Fedora can also apply. It’s all using Qemu/KVM, and as such you might find a hint on where to look or what to change to make it work!

Currently I have one remaining problem. When I boot the Proxmox host and start the VM, all is well: GPU usage is as expected and I only lose around 1%-3% compared to bare metal. But once I shut down the VM and start it again, GPU usage for the same tasks (for me that’s video encoding and compositing using OBS) suddenly more than doubles (~35% to 85%)! I’m not yet sure what is causing this, but a workaround is rebooting the host.

As always, questions and comments are welcome! Let me know if you got it to work!


4 thoughts on “Building a 2U AMD Ryzen server (Proxmox GPU Passthrough / OBS or Xsplit in VM)”

  1. hi quindor,

    are you using both the built-in gpu of the 1700x and the 1050 for the pass through?

    I would like to have this kind of setup but i’m not too familiar with linux, do you think in the future that gpu pass through would be easier to do?

    can you post a tutorial or vm for home automation?

    1. No, the Ryzen 1700/1700x CPU’s do not have a built-in GPU. That’s why I have a cheap low-powered AMD card in there and the NVidia GTX1050. I use the low-powered card for console duty when needed (only works AFTER booting OS!) and as I mentioned, I passed the GTX1050 to a VM for GPU encoding.

      Although it might seem daunting and it can take a bit of work to get it all working, it’s very possible to do it right now. Just use my guide and follow other guides and forum posts and you’ll be able to figure it out.

      If there will be GUI enhancements to make it easier, I don’t know, some config variables are a bit complex to fit into a GUI.

      Actually, I have some tutorials on setting up a Domoticz server in a (Fedora) VM and connecting it to QuinLED! Check them out here: http://blog.quindorian.org/2017/02/esp8266-led-lighting-using-quinled-with-domoticz.html/

  2. i thought there is gpu in amd cpu, thanks for the guide, I added it to my resources.

    thanks for the automation tutorial, i will study it.

  3. Hi there,

    i _really_ wanted to thank you.
    I have some Threadripper 1950x CPU’s and tried to get that working for months now.

    First i had to wait for a new BIOS, because Version 3x for the Gigabyte Aorus X399 Gaming 7 had A.G.E.S.A Version 1.0.0.5 which seemed *very* broken for IOMMU things.
    The new BIOS (Version 10) released about two weeks ago.

    I installed it and found your guide providing the needed tweaks.

    It all works perfectly now and i’m quite happy with that.

    I’ll write a guide for Threadripper with this particular motherboard myself (Because i think it is very very good for doing especially this type of things)

    Anyway, thanks a lot!
