Bug 6963

Summary: nvidia-390xx-kmod-390.157-9.fc41.src.rpm fails to build on kernel 5.14.0-427.20.1.el9_4.x86_64 (re:post as Bug)
Product: Fedora EPEL Reporter: NevilleDNZ <NevilleD.rpmfusion>
Component: nvidia-390xx-kmod Assignee: Nicolas Chauvet <kwizart>
Status: NEW ---    
Severity: normal CC: nerijus
Priority: P1    
Version: 9   
Hardware: x86_64   
OS: GNU/Linux   

Description NevilleDNZ 2024-06-12 03:28:28 CEST
The current drivers at both nvidia.com and https://rpmfusion.org/ are either broken (they do not build) or missing. (The RHEL 9.0 kernel was working.)

* https://www.nvidia.com/Download/driverResults.aspx/196213/en-us
Linux x64 (AMD64/EM64T) Display Driver:
Version:    390.157 / Release Date:   2022.11.22
Operating System:   Linux 64-bit

However, for Fedora 39, 40 and 41 the driver actually builds and is available:
* nvidia-390xx-kmod-390.157-9.fc40.src.rpm etc.

I have discovered (what I suspect is) the reason nvidia-390xx-kmod-390.157 isn't building on RHEL9.4's kernel 5.x.

In short: it looks like a kernel 5.x backport/build problem. The v6.4 version of drm_mode_config.h appears to have been backported to 5.14.0-427.20.1.el9_4, which breaks the build of the NVIDIA nvidia-390 graphics driver on RHEL.

https://github.com/torvalds/linux/blob/v6.1-rc8/include/drm/drm_mode_config.h#545 => resource_size_t fb_base; goes AWOL in v6.2 ...

When I try to rpmbuild, I get errors such as the following.

File: nvidia-drm-drv.c (the code that uses fb_base from drm_mode_config.h):
// Rel. commit "drm: Remove drm_mode_config::fb_base" (Zack Rusin, 18 Oct 2022)
#if defined(CONFIG_FB) && (LINUX_VERSION_CODE < KERNEL_VERSION(6, 2, 0))
    /* Currently unused. Update when needed. */
    dev->mode_config.fb_base = 0;
#endif

nvidia-drm-drv.c:244:21: error: 'struct drm_mode_config' has no member named 'fb_base'
nvidia-drm-drv.c:765:18: error: 'struct drm_driver' has no member named 'dumb_destroy'

The good news is that I think I can "fix" the problem by simply #ifdef-ing the offending lines.

But question 1: how do I #ifdef the nvidia code for kernel 5.14.0-427.20.1 to indicate that it carries a backport from KERNEL_VERSION(6, 2, 0)? E.g., is there a macro like KERNEL_BACKPORT(5,14,0,427,20,1)?

APPENDIX1:

It turns out my local kernel [5.14.0-427.20.1.el9_4.x86_64 drm_mode_config.h](/usr/src/kernels/5.14.0-427.20.1.el9_4.x86_64/include/drm/drm_mode_config.h) file does not match [v5.14-rc7 drm_mode_config.h](https://github.com/torvalds/linux/blob/v5.14-rc7/include/drm/drm_mode_config.h).  

Indeed my local version is identical to [v6.4](https://github.com/torvalds/linux/blob/v6.4/include/drm/drm_mode_config.h) ...

APPENDIX2:

$ grep -E "^ID=|^VERSION=" /etc/os-release ; uname -r
VERSION="9.4 (Plow)"
ID="rhel"
5.14.0-427.20.1.el9_4.x86_64

$ rpmbuild -ra ~/Downloads/nvidia-390xx-kmod-390.157-9.fc41.src.rpm 

~/rpmbuild/BUILD/nvidia-390xx-kmod-390.157/_kmod_build_5.14.0-362.8.1.el9_3.x86_64/nvidia-drm/nvidia-drm-drv.c: In function 'nv_drm_init_mode_config':
~/rpmbuild/BUILD/nvidia-390xx-kmod-390.157/_kmod_build_5.14.0-362.8.1.el9_3.x86_64/nvidia-drm/nvidia-drm-drv.c:247:21: error: 'struct drm_mode_config' has no member named 'fb_base'
  247 |     dev->mode_config.fb_base = 0;
      |                     ^

$ rpmbuild -ra ~/Downloads/nvidia-390xx-kmod-390.157-9.fc40.src.rpm

~/rpmbuild/BUILD/nvidia-390xx-kmod-390.157/_kmod_build_5.14.0-362.8.1.el9_3.x86_64/nvidia-drm/nvidia-drm-drv.c: In function 'nv_drm_init_mode_config':
~/rpmbuild/BUILD/nvidia-390xx-kmod-390.157/_kmod_build_5.14.0-362.8.1.el9_3.x86_64/nvidia-drm/nvidia-drm-drv.c:247:21: error: 'struct drm_mode_config' has no member named 'fb_base'
  247 |     dev->mode_config.fb_base = 0;
      |                     ^
Comment 1 NevilleDNZ 2024-06-12 06:05:53 CEST
See also: [Request NVIDIA 390xx for EL9](https://bugzilla.rpmfusion.org/show_bug.cgi?id=6921)
Comment 2 Nerijus Baliūnas 2024-06-12 15:43:43 CEST
> But question 1: how do I #ifdef the nvidia code for kernel 5.14.0-427.20.1 to indicate that it carries a backport from KERNEL_VERSION(6, 2, 0)? E.g., is there a macro like KERNEL_BACKPORT(5,14,0,427,20,1)?

Another project (the blackmagic drivers) uses a RHEL_RELEASE_OR_LATER macro, for example:

#if KERNEL_VERSION_OR_LATER(5, 15, 0) || RHEL_RELEASE_OR_LATER(8, 9)
Comment 3 Nicolas Chauvet 2024-06-12 16:10:13 CEST
(In reply to Nerijus Baliūnas from comment #2)
...
> #if KERNEL_VERSION_OR_LATER(5, 15, 0) || RHEL_RELEASE_OR_LATER(8, 9)

These are likely custom macros from this vendor. One should use these instead
(taken from a centos-stream kernel, but they should apply to RHEL and derivative kernels):

cat /usr/src/kernels/5.14.0-452.el9.x86_64/include/generated/uapi/linux/version.h
#define LINUX_VERSION_CODE 331264
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
#define LINUX_VERSION_MAJOR 5
#define LINUX_VERSION_PATCHLEVEL 14
#define LINUX_VERSION_SUBLEVEL 0
#define RHEL_MAJOR 9
#define RHEL_MINOR 5
#define RHEL_RELEASE_VERSION(a,b) (((a) << 8) + (b))
#define RHEL_RELEASE_CODE 2309
#define RHEL_RELEASE "452"
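
As a worked check of the listing above, here is a minimal userspace sketch that recomputes both codes from the macros. The helper names (check_listed_codes, upstream_code_reaches_6_2) and the LISTED_* constants are hypothetical and exist only for this illustration. It also shows why LINUX_VERSION_CODE alone cannot reveal a 6.x backport: the code for 5.14.0 stays below KERNEL_VERSION(6, 2, 0) no matter what RHEL has backported.

```c
#include <assert.h>

/* Macros copied from the version.h listing above. */
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
#define RHEL_RELEASE_VERSION(a,b) (((a) << 8) + (b))

/* Values quoted in the listing (the LISTED_* names are hypothetical). */
#define LISTED_LINUX_VERSION_CODE 331264
#define LISTED_RHEL_RELEASE_CODE  2309

/* Hypothetical helper: aborts if the listed codes disagree with the macros. */
void check_listed_codes(void)
{
    /* 5.14.0: (5 << 16) + (14 << 8) + 0 = 327680 + 3584 = 331264 */
    assert(KERNEL_VERSION(5, 14, 0) == LISTED_LINUX_VERSION_CODE);
    /* RHEL 9.5: (9 << 8) + 5 = 2304 + 5 = 2309 */
    assert(RHEL_RELEASE_VERSION(9, 5) == LISTED_RHEL_RELEASE_CODE);
}

/* Hypothetical helper: does the upstream version code alone reach 6.2?
 * Returns 0 here, because 331264 < KERNEL_VERSION(6, 2, 0) = 393728. */
int upstream_code_reaches_6_2(void)
{
    return LISTED_LINUX_VERSION_CODE >= KERNEL_VERSION(6, 2, 0);
}
```

Calling check_listed_codes() from a small main() should pass silently, and upstream_code_reaches_6_2() returning 0 is the reason a RHEL-specific condition is needed in addition to the upstream version test.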


We would welcome anyone who volunteers to maintain or co-maintain any driver with respect to RHEL compatibility support.

Feel free to suggest working (tested) patches. Testing on a centos-stream kernel would be fine but is not required (at least it may help to add support for the RHEL N+1 kernel).
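
For anyone preparing such a patch, the version.h macros quoted above could be combined into a guard roughly like the untested sketch below. Everything prefixed NV_ or demo_ is hypothetical and not taken from the NVIDIA or RPM Fusion sources. The RHEL 9.3 threshold is only an assumption drawn from the build log in the description, which fails against an el9_3 kernel tree; the first release that actually dropped fb_base has not been verified. The fallback definitions simulate the reporter's RHEL 9.4 kernel so the sketch compiles in userspace; in a real module they come from linux/version.h, and the existing CONFIG_FB condition would be kept alongside.

```c
#include <assert.h>

/* Fallbacks for a userspace build only; values simulate RHEL 9.4 / 5.14.0. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(5, 14, 0)
#endif
#ifndef RHEL_RELEASE_VERSION
#define RHEL_RELEASE_VERSION(a,b) (((a) << 8) + (b))
#endif
#ifndef RHEL_RELEASE_CODE
#define RHEL_RELEASE_CODE RHEL_RELEASE_VERSION(9, 4)
#endif

/* Hypothetical wrapper; evaluates to 0 on kernels without RHEL macros. */
#if defined(RHEL_RELEASE_CODE)
#define NV_RHEL_RELEASE_OR_LATER(a,b) (RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(a,b))
#else
#define NV_RHEL_RELEASE_OR_LATER(a,b) 0
#endif

/* ASSUMPTION (unverified): fb_base is absent from RHEL 9.3 onwards. */
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(6, 2, 0)) || NV_RHEL_RELEASE_OR_LATER(9, 3)
#define NV_DRM_MODE_CONFIG_HAS_FB_BASE 0
#else
#define NV_DRM_MODE_CONFIG_HAS_FB_BASE 1
#endif

/* Stand-in for struct drm_mode_config, so the guard can be exercised here. */
struct demo_mode_config {
#if NV_DRM_MODE_CONFIG_HAS_FB_BASE
    unsigned long fb_base;
#endif
    int initialized;
};

/* Mirrors the guarded assignment in nv_drm_init_mode_config. */
void demo_init_mode_config(struct demo_mode_config *cfg)
{
    assert(cfg != 0);
#if NV_DRM_MODE_CONFIG_HAS_FB_BASE
    /* Only compiled where the member still exists. */
    cfg->fb_base = 0;
#endif
    cfg->initialized = 1;
}
```

With the simulated RHEL 9.4 values, NV_DRM_MODE_CONFIG_HAS_FB_BASE evaluates to 0 and the fb_base assignment is compiled out. The dumb_destroy error reported in the description would need a separate guard of the same shape.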