Victor is a full stack software engineer who loves travelling and building things. Most recently created Ewolo, a cross-platform workout logger.
Compiling the NVIDIA 415.27 drivers on Linux kernel 5.x

My primary workstation is a beat-up Dell Precision M3800 that has been going strong for the last 4 years. It features an NVIDIA Quadro K1100M graphics adapter and, unfortunately for me, the latest 5.0.0-31 kernel update borked my display (I was running Ubuntu 19.04 Disco Dingo at the time). Prior to this kernel update, my system was happily running on the proprietary nvidia 390.x drivers.

The kernel update and the resulting black screen led me down a rabbit hole of Xorg configuration until I figured out that it was the nvidia driver that was busted. One issue that threw me off track was that I had set up my display manager (GDM) to start via Xorg and not Wayland. Changing this setting back to Wayland let GDM start and clued me in as to what the problem could be.

At the time of writing, the suggested driver for my graphics card is nvidia 418.88, available from the nvidia driver downloads website. However, this driver did not work with my Quadro K1100M graphics adapter. Moreover, the 418.x line is now deprecated in favour of 430.x, and installing the 418.x driver in Ubuntu leads to 430.x being installed. After trying various driver versions, 415.27 seemed to support my graphics card with one minor glitch: the DKMS installation failed with the following error:


...
error: implicit declaration of function 'do_gettimeofday'; did you mean 'efi_gettimeofday'? [-Werror=implicit-function-declaration]
do_gettimeofday(&tm);
^~~~~~~~~~~~~~~
efi_gettimeofday
cc1: some warnings being treated as errors
...

DKMS stands for Dynamic Kernel Module Support, which automatically rebuilds and installs kernel modules whenever a new kernel is installed. Normally I would not mind manually installing the display driver after a kernel update, but the dkms failure meant that any time I touched the software repository, the installation/update would report a failure because of the dkms errors.

Anyway, enough of the prologue. I managed to find someone's patch which works around the functions (such as do_gettimeofday above) that the nvidia drivers depend upon but that are missing in 5.x kernels. However, that patch needs to be applied with the -p2 option and there is no way to make dkms do that automatically. Therefore, here is a pre-stripped version: https://gist.github.com/wheresvic/389f2dd4d6367158e5f4585311bdc089
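For the curious, here is a minimal sketch of what such pre-stripping could look like. It is not necessarily how the gist above was produced. It assumes that dkms applies patches with -p1 and that the patch is a plain unified diff; the function name and the file paths in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: drop the first directory component from the
# '--- ' and '+++ ' header lines of a unified diff read on stdin.
# Headers such as '--- /dev/null' are left alone because they start with '/'.
# Caveat: a removed line whose content happens to start with '-- ' also
# begins with '--- ' and would be rewritten too, so review the result.
strip_one_component() {
    sed -E 's#^(---|\+\+\+) [^/[:space:]]+/#\1 #'
}

# Example with made-up paths:
printf '%s\n' \
    '--- a/kernel/nvidia/example.c' \
    '+++ b/kernel/nvidia/example.c' \
    '@@ -1 +1 @@' \
    '-old' \
    '+new' | strip_one_component
```

After this rewrite, applying the result with -p1 resolves to the same file (nvidia/example.c in this made-up case) as applying the original with -p2.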

If you have already installed the nvidia drivers, then on Ubuntu you should have a /usr/src/nvidia-415.27 directory. Put the patch in its patches subdirectory and call it buildfix_kernel_5.0.patch. Then edit /usr/src/nvidia-415.27/dkms.conf and add these lines to configure the patch to be applied:


PATCH[1]="buildfix_kernel_5.0.patch"
PATCH_MATCH[1]="^5."  

This should make dkms automatically apply the patch, but only when building the module for 5.x kernels. Force dkms to recompile and install it with this command: sudo dkms install -k 5.0.0-31-generic/x86_64 nvidia/415.27.
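To get a feel for which kernels a PATCH_MATCH value selects, here is a small sketch. It assumes dkms treats the value as an extended regular expression tested against the kernel version string; patch_applies is a hypothetical helper, not part of dkms, and the 4.15 version is just an example of a non-5.x kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms): succeeds when the kernel version
# in $2 matches the extended regular expression in $1.
patch_applies() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

for kernelver in 4.15.0-65-generic 5.0.0-31-generic 5.3.0-23-generic; do
    if patch_applies '^5.' "$kernelver"; then
        echo "$kernelver: patch would be applied"
    else
        echo "$kernelver: patch would be skipped"
    fi
done
```

Note that the dot in "^5." matches any character, so the pattern would also match a version string starting with "50". Escaping it as "^5\." is stricter, though for the kernels discussed here it makes no practical difference.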

While I was a happy camper with the above patch, the latest Ubuntu 19.10 brought along the Linux kernel 5.3.0-x. This patch worked fine all the way up to 5.3.0-19, but dkms broke again on kernel version 5.3.0-23 with the following error:


...
error: redefinition of 'list_is_first'
static inline int list_is_first(const struct list_head *list,
^~~~~~~~~~~~~
...
note: previous definition of 'list_is_first' was here
static inline int list_is_first(const struct list_head *list,
^~~~~~~~~~~~~
...

The solution to this was found here, and I've made a gist out of the p2 version here. After updating the dkms.conf file to include this patch and re-running the dkms compilation via sudo dkms install -k 5.3.0-23-generic/x86_64 nvidia/415.27, I got the next error:


In file included from /var/lib/dkms/nvidia/415.27/build/nvidia/nv_uvm_interface.c:21:
/var/lib/dkms/nvidia/415.27/build/nvidia/nv_uvm_interface.c: In function 'nvUvmInterfaceDeRegisterUvmOps':
/var/lib/dkms/nvidia/415.27/build/common/inc/nv-linux.h:733:21: error: void value not ignored as it ought to be
  733 |         int __ret = on_each_cpu(func, info, 1);        \
      |                     ^~~~~~~~~~~
/var/lib/dkms/nvidia/415.27/build/nvidia/nv_uvm_interface.c:991:5: note: in expansion of macro 'NV_ON_EACH_CPU'
  991 |     NV_ON_EACH_CPU(flush_top_half, NULL);
      |     ^~~~~~~~~~~~~~

This was a tricky one: googling did not turn up many answers and I contemplated attempting a patch of my own. Fortunately, I did find a patch on someone's website. I could not understand the accompanying text, but I could still try the patch that they provided. After updating dkms.conf to apply this patch and rebuilding the module with dkms, I got the next error:


/var/lib/dkms/nvidia/415.27/build/nvidia-uvm/uvm8_tools.c:209:13: error: conflicting types for 'put_user_pages'
  209 | static void put_user_pages(struct page **pages, NvU64 page_count)
      |             ^~~~~~~~~~~~~~
In file included from /var/lib/dkms/nvidia/415.27/build/common/inc/nv-pgprot.h:17,
                 from /var/lib/dkms/nvidia/415.27/build/common/inc/nv-linux.h:20,
                 from /var/lib/dkms/nvidia/415.27/build/nvidia-uvm/uvm_linux.h:39,
                 from /var/lib/dkms/nvidia/415.27/build/nvidia-uvm/uvm_common.h:48,
                 from /var/lib/dkms/nvidia/415.27/build/nvidia-uvm/uvm8_tools.c:23:
./include/linux/mm.h:1062:6: note: previous declaration of 'put_user_pages' was here
 1062 | void put_user_pages(struct page **pages, unsigned long npages);
      |      ^~~~~~~~~~~~~~

A fix for this was quickly found here. I've created a gist as well just for completeness. Here are all the changes to my dkms.conf:


PATCH[1]="buildfix_kernel_5.0.patch"
PATCH[2]="buildfix_kernel_5.1.patch"
PATCH[3]="buildfix_kernel_5.3.patch"
PATCH[4]="buildfix_kernel_5.3-a.patch"
#PATCH_MATCH[0]="^4.[6-7]"
PATCH_MATCH[0]="^5."

As you can see, I was just about resigning myself to the fact that this would be an endless series of patches but it somehow stopped at this point. Let's see what the next kernel update brings!
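With this many patches in play, it is easy to list one in dkms.conf and forget to copy the file into place. Here is a rough sketch of a sanity check, assuming the PATCH[n]="..." line format used in this post and the patches subdirectory layout described earlier; missing_patches is a hypothetical helper of my own naming.

```shell
#!/bin/sh
# Hypothetical helper: print every PATCH[n] entry from dkms.conf whose file
# is absent from the module's patches/ directory.
# $1: module source directory, e.g. /usr/src/nvidia-415.27
missing_patches() {
    # Only uncommented lines of the form PATCH[n]="name" are considered;
    # PATCH_MATCH[n] lines do not match because '[' must follow 'PATCH'.
    sed -n 's/^PATCH\[[0-9]*\]="\([^"]*\)".*/\1/p' "$1/dkms.conf" |
    while IFS= read -r patch_name; do
        [ -f "$1/patches/$patch_name" ] || printf '%s\n' "$patch_name"
    done
}

# Example (path from this post):
# missing_patches /usr/src/nvidia-415.27
```

If the function prints nothing, every listed patch file is present. It parses the file with sed rather than executing it, so it only understands the simple one-assignment-per-line form.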

A final note of warning: please go through the patch contents yourself before applying them to kernel modules willy-nilly. You can also test whether this process works in a virtual machine before attempting it on your main system.
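One low-risk way to try a patch before letting dkms loose on it is a dry run with GNU patch, which reports whether the hunks would apply without touching any files. The sketch below assumes GNU patch is installed and that -p1 is the strip level dkms uses; patch_dry_run is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a patch would apply cleanly with -p1
# inside a source tree, without modifying any file.
# $1: source directory, $2: path to the patch file
patch_dry_run() {
    # --dry-run: print the result of applying the patch, change nothing.
    # -N: do not offer to reverse a patch that looks already applied.
    # The patch is fed on stdin because -d changes directory first, which
    # would otherwise break a relative patch path.
    patch -p1 -N --dry-run -d "$1" < "$2"
}

# Example (paths from this post):
# patch_dry_run /usr/src/nvidia-415.27 \
#     /usr/src/nvidia-415.27/patches/buildfix_kernel_5.0.patch
```

A non-zero exit status means at least one hunk would fail. A clean dry run only shows that the patch applies, not that the module will then compile.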

A big thanks to this reddit user and their initial post for pointing me in the right direction.