Porting nvidia-drm.ko to FreeBSD

What I learned while using the linuxkpi to port NVIDIA's DRM driver to FreeBSD.

As a side project, I wanted to try rendering to the screen without X11. Using EGL without X11 on the NVIDIA driver requires DRM, which is supported through the nvidia-drm.ko module. I decided to try porting it to FreeBSD.

The linuxkpi, or linux kernel programming interface, allows linux drivers to run in a FreeBSD kernel. Its main consumer is the linux-based DRM stack (drm-kmod) on FreeBSD.

I have seen a couple of references to my GitHub issue on the mailing lists and elsewhere. This will (probably) serve as my final status update on this port.

Update (12/10/19)

Here's some pretty colors being displayed using only libdrm on a gtx1070 without X11! EGL doesn't work for reasons that will be mentioned later, but I finally got around to finding a test program to demonstrate that this works.

nvidia-drm proof of concept

The program running is a slightly modified version of this.

Goal

Port nvidia-drm.ko to FreeBSD using the linuxkpi.

Andy Ritger has a great repository demonstrating how to use EGL and DRM on NVIDIA cards. I filed this issue when I first tried using X11-less EGL on FreeBSD and ran into problems. I tried to run the code from that repository to test my DRM port.

I'd definitely like to thank Andy Ritger and Miguel A Vico Moya for patiently answering all of my questions on that thread. They were an enormous help and I appreciate them taking time out of their day to point me in the right direction when I encountered closed source portions of the code.

Does it work?

Well yes but actually no

As shown by the picture above, nvidia-drm.ko does work. I'll explain in more depth later, but the DRM interface is functional: the screen changes as kernel modesetting happens, and all of the libdrm calls succeed.

The problem is that NVIDIA's FreeBSD libEGL implementation does not support DRM. Because of this lack of support I can't actually render anything. This is okay though. Adding these DRM bits into libEGL for FreeBSD would be a lot of bloat for something that nobody except me really wants. I am definitely against this being added. A better idea is to add Vulkan support for FreeBSD and use that instead.

Did you actually do anything useful?

While doing this I found a couple issues that I reported:

Add full support for PCI_ANY_ID when matching PCI IDs in the LinuxKPI.
- The linuxkpi needed changing to respect values of PCI_ANY_ID in the module table for device detection. This could have been my first real contribution to FreeBSD. The lesson here: always submit your local patch when you report an issue. (Also, thanks to hselasky for fixing this.)
Duplicate lock acquisition in nvidia.ko when WITNESS(4) is enabled
- Witness would cause a panic when nvidia.ko tried to re-acquire a lock it already held.
- NVIDIA internal bug 2507077
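
To make the PCI_ANY_ID issue concrete, here is a rough userspace sketch of wildcard matching against an ID table. It is not the linuxkpi code: the sketch_ names and the trimmed-down struct are made up for illustration, and the only thing borrowed from the real world is the idea that an "any" value in a table field should match every probed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard, standing in for PCI_ANY_ID (all bits set). */
#define SKETCH_PCI_ANY_ID UINT32_MAX

/* Hypothetical, trimmed-down stand-in for struct pci_device_id. */
struct sketch_pci_id {
	uint32_t vendor;
	uint32_t device;
};

/* A table field matches if it is the wildcard or equals the probed value. */
static bool
sketch_field_matches(uint32_t table_value, uint32_t probed_value)
{
	return (table_value == SKETCH_PCI_ANY_ID ||
	    table_value == probed_value);
}

/*
 * Return the first table entry that matches the probed vendor/device
 * pair, or NULL if no entry matches.
 */
static const struct sketch_pci_id *
sketch_match_id(const struct sketch_pci_id *table, size_t count,
    uint32_t vendor, uint32_t device)
{
	assert(table != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (sketch_field_matches(table[i].vendor, vendor) &&
		    sketch_field_matches(table[i].device, device))
			return (&table[i]);
	}
	return (NULL);
}
```

With a table entry like { 0x10de, SKETCH_PCI_ANY_ID }, any device from the NVIDIA vendor ID (0x10de) matches, while a device from another vendor does not. A matcher that compares the fields with a plain == would never match such an entry, which is roughly the behavior I ran into.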

Was it a success?

Definitely

"Why on earth would you work on a closed source driver?"

-- a valid question I asked myself

I got bit pretty hard by the closed source portions of the driver not working. It took about a month and a half to port this, and I have no cool demos to show anyone (unless you count turning my screen black a demo). So from a developer standpoint this was pretty much a failure.

But I am not just a developer, I am really a student. And from a student's perspective this was a total blast. I had never done any sort of graphics driver development before, and it's kind of hard to find introductory projects to learn with. I also got familiar with the linuxkpi and had a wild tour of the virtual memory system. I got to write a GEM buffer page fault handler, and got better at using dtrace to debug the kernel. All in all, I loved it. Sometimes the best way to test if you really love a subject is to just dig in and see how far you can go. I definitely recommend attempting bad ideas and impossible tasks for the sake of personal development.

Porting

The following code bits are taken from the source, and aren't too pretty on their own. Most of the development takes place in this directory of the repository. The start of the FreeBSD specific bits can be found in nvidia-drm-freebsd-lkpi.c.

Obviously I'm going to leave out quite a bit of detail because, let's be honest, you're probably not that interested.

There were a variety of minor changes that had to happen, most of them involving how to make the module loadable through the linuxkpi. Declaring our module looks like this:

LKPI_DRIVER_MODULE(nvidia_drm, nv_drm_init, nv_drm_exit);
MODULE_DEPEND(nvidia_drm, linuxkpi, 1, 1, 1);
MODULE_DEPEND(nvidia_drm, drmn, 2, 2, 2);
MODULE_DEPEND(nvidia_drm, nvidia, 1, 1, 1);
MODULE_DEPEND(nvidia_drm, nvidia_modeset, 1, 1, 1);

If you've ever written kernel modules for FreeBSD, this will look familiar. LKPI_DRIVER_MODULE is used as a sort of 'linux version' of DRIVER_MODULE(9). It sets methods to be called when the module is loaded or unloaded. We then specify our dependencies on drm, along with the other nvidia modules so that they are loaded too.

devclass_t nv_drm_devclass;
struct pci_driver nv_drm_pci_driver = {
        .name = "nvidia-drm-pci",
        .id_table = nv_pci_table,
        .probe = nv_drm_bsd_probe,
        /* .bsdclass = nv_drm_devclass */
};

Now we make a struct pci_driver to register with the linux subsystem. This is done in nv_drm_init when the module is first loaded:

linux_pci_register_drm_driver(&nv_drm_pci_driver);

Using linux_pci_register_drm_driver is important. Real linux drivers will use pci_register_drm_driver, but this caused problems for me.

The linux version of nvidia-drm.ko expects a list of detected devices including some important information about them. There is no such list in the FreeBSD driver, so I have to use my own probe function (nv_drm_bsd_probe) to re-detect all GPUs and register them. This is demonstrated in the following simplified code:

for (int i = 0; i < NV_MAX_DEVICES; i++) {
	sc = devclass_get_softc(nvidia_devclass, i);
	if (sc == NULL)
		continue;	/* no nvidia device attached at this unit */
	nv = sc->nv_state;

	/*
	 * Compare vendor/device IDs to find the matching GPU,
	 * then register it.
	 *
	 * 'ent' is the pci_device_id entry we are probing.
	 */
	if (nv->pci_info.vendor_id == ent->vendor &&
	    nv->pci_info.device_id == ent->device) {
		/* register this device with drm */
	}
}

Internally, GPUs are identified by an ID placed in an nv_gpu_info_t structure. We need this ID to be correct as it will be used later. Once we have an info structure we can register with the DRM subsystem:

/* this calls drm_dev_alloc */
nv_drm_register_drm_device(&gpu_info);

So what does all of this do and where does it get us? The code above outlines how to initialize a linuxkpi based module and create devfs paths that we can open in userspace applications. We've registered these devfs nodes with the drm subsystem, so that when we open the device it will initialize the GPU. We can then set the output mode so we can display things.

While most of my changes were to nvidia-drm.ko, I did have to add some kernel API (kapi) functions to nvidia-modeset.ko. These functions are only used in nvidia-drm, and were excluded from nvidia-modeset.ko when it was first ported. These kapi functions all begin with nvkms_.

nvkms_open_gpu			<-- common on all platforms
   |
    -- nvidia_open_dev_kernel	<-- FreeBSD specific

Once I finished loading the module and fleshed out the kapi support it was pretty smooth sailing. The final problem was implementing a FreeBSD version of the GEM buffer page fault handler.
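
I won't reproduce the handler here, but the core bookkeeping in any handler of this kind is turning a faulting address into an index into the buffer's backing pages. Below is a minimal userspace sketch of just that arithmetic. Everything in it is an assumption for illustration: the sketch_ names, the 4 KiB page size, and the struct layout are mine, not the port's or the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 4 KiB pages; a real kernel uses its own PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical description of one userspace mapping of a buffer. */
struct sketch_mapping {
	uintptr_t start;	/* first mapped virtual address */
	size_t    page_count;	/* number of backing pages */
};

/*
 * Translate a faulting address into the index of the backing page.
 * Returns false if the address lies outside the mapping.
 */
static bool
sketch_fault_to_page(const struct sketch_mapping *map, uintptr_t fault_addr,
    size_t *page_index)
{
	assert(map != NULL && page_index != NULL);
	if (fault_addr < map->start)
		return (false);

	uintptr_t offset = fault_addr - map->start;
	size_t index = (size_t)(offset >> SKETCH_PAGE_SHIFT);
	if (index >= map->page_count)
		return (false);

	*page_index = index;
	return (true);
}
```

For a four-page mapping starting at 0x100000, a fault at 0x102fff lands in page 2 and a fault at 0x104000 is out of range. A real handler then has to hand the matching physical page back to the VM system, which is where the FreeBSD and linux versions differ.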

The linuxkpi is quite well done. It gets a lot of negative attention in the community, so it was a pleasant surprise that using it was relatively painless. Because drm-kmod is dependent on it, there is plenty of active development and maintenance. Linux based DRM is a necessary evil due to developer constraints in FreeBSD, and I'm very happy with the way they have done it.

If you've lasted this long, thanks for reading! If you see any errors with the content please let me know so I can keep things accurate.