| Summary: | gdm will not start unless vt is switched back and forth | ||
|---|---|---|---|
| Product: | Fedora | Reporter: | Julian Sikorski <belegdol> |
| Component: | nvidia-kmod | Assignee: | Nicolas Chauvet <kwizart> |
| Status: | RESOLVED EXPIRED | ||
| Severity: | enhancement | CC: | leigh123linux, leigh123linux |
| Priority: | P1 | ||
| Version: | f32 | ||
| Hardware: | x86_64 | ||
| OS: | GNU/Linux | ||
| namespace: | |||
| Attachments: | /var/log/journal; journal with nvidia-fallback.service masked; journal with lightdm | ||
|
Description
Julian Sikorski
2020-09-08 13:01:03 CEST
Why nvidia-drm.modeset=0 in the cmdline?

Because I get a stack trace on this machine when setting modeset to 1:
https://forums.developer.nvidia.com/t/stack-trace-when-attempting-to-use-kms-on-fedora-31-with-geforce-680m/46353

There is also that:
wrz 06 08:31:35 snowball2 kernel: resource sanity check: requesting [mem 0x000e0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000e0000-0x000e3fff window]
wrz 06 08:31:35 snowball2 kernel: caller _nv030454rm+0x58/0xa0 [nvidia] mapping multiple BARs
wrz 06 08:31:35 snowball2 kernel: ACPI Warning: \_SB.PCI0.PEG0.MXM3._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20200528/nsarguments-59)
wrz 06 08:31:35 snowball2 kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000d3fff window]
wrz 06 08:31:35 snowball2 kernel: caller _nv000745rm+0x1af/0x200 [nvidia] mapping multiple BARs

Which seems related to resource reservation conflicts with the nvidia card. You might look for a BIOS update from your Clevo laptop vendor. You might also need to forward the issue to nvidia (devtalk.nvidia.com), attaching the archive produced by nvidia-bug-report.sh.

This laptop is about 10 years old and has not had a bios update in years. I do not really mind having modeset disabled, it is experimental according to the documentation anyway.
Isn't the fact that the nvidia kernel module only gets loaded after gdm fails to start once not a packaging problem?

(In reply to Julian Sikorski from comment #4)
> Isn't the fact that the nvidia kernel module only gets loaded after gdm
> fails to start once not a packaging problem?

IMO it's gdm that should wait for X before trying to start.

(In reply to Julian Sikorski from comment #4)
> This laptop is about 10 years old and has not had a bios update in years. I
> do not really mind having modeset disabled, it is experimental according to
> the documentation anyway.
> Isn't the fact that the nvidia kernel module only gets loaded after gdm
> fails to start once not a packaging problem?

I'm not reproducing on gdm/gnome f31 f32...
But having nvidia-drm.modeset=0 is certainly the root cause of such a race.

Where have you forwarded the nvidia-bug-report.sh archive to devtalk.nvidia.com?

(In reply to leigh scott from comment #5)
> IMO it's gdm that should wait for X before trying start.

Not at all (gdm starts X, not the other way), until gdm starts wayland...

(In reply to Nicolas Chauvet from comment #6)
> (In reply to Julian Sikorski from comment #4)
> > This laptop is about 10 years old and has not had a bios update in years. I
> > do not really mind having modeset disabled, it is experimental according to
> > the documentation anyway.
> > Isn't the fact that the nvidia kernel module only gets loaded after gdm
> > fails to start once not a packaging problem?
>
> I'm not reproducing on gdm/gnome f31 f32...
> But having nvidia-drm=modeset=0 is certainly the root cause of such a race.
>
> Where have you forwarded the nvidia-bug-report.sh archive to
> devtalk.nvidia.com ?

I did, years ago (see the linked forum thread, posts #1 and #4). Nothing happened unfortunately.
It appears to be a race condition indeed. I added vga=0x34d just to see what happens and gdm starts as expected.

(In reply to Nicolas Chauvet from comment #6)
> (In reply to Julian Sikorski from comment #4)
...
> Where have you forwarded the nvidia-bug-report.sh archive to
> devtalk.nvidia.com ?

Please provide a recent nvidia-bug-report archive to this thread:
https://forums.developer.nvidia.com/t/stack-trace-when-attempting-to-use-kms-on-fedora-31-with-geforce-680m/46353/10

Please also check the BIOS options related to display and/or PCI resources that you can enable or disable.

I have looked at the log and I'm wondering why gdm is attempting to use wayland. We used to disable it until gdm added their blacklisting for nvidia.
/usr/lib/udev/rules.d/61-gdm.rules
https://pkgs.rpmfusion.org/cgit/nonfree/xorg-x11-drv-nvidia.git/commit/?h=f32&id=6f7f9a3cbb4d2f0d957a2a8f21cb2ca50238666f

Maybe try disabling gdm wayland using the old method.

wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) NOUVEAU driver for NVIDIA chipset families :
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: RIVA TNT (NV04)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: RIVA TNT2 (NV05)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 256 (NV10)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 2 (NV11, NV15)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 4MX (NV17, NV18)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 3 (NV20)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 4Ti (NV25, NV28)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce FX (NV3x)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 6 (NV4x)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 7 (G7x)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce 8 (G8x)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce GTX 200 (NVA0)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: GeForce GTX 400 (NVC0)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) modesetting: Driver for Modesetting Kernel Drivers: kms
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) FBDEV: driver for framebuffer: fbdev
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) VESA: driver for VESA chipsets: vesa
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) [drm] Failed to open DRM device for pci:0000:01:00.0: -19
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) open /dev/dri/card0: No such file or directory
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (WW) Falling back to old probe method for modesetting
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) open /dev/dri/card0: No such file or directory
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Loading sub module "fbdevhw"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) LoadModule: "fbdevhw"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Module fbdevhw: vendor="X.Org Foundation"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: compiled for 1.20.8, module version = 0.0.2
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: ABI class: X.Org Video Driver, version 24.1
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Unable to find a valid framebuffer device
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (WW) Falling back to old probe method for fbdev
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Loading sub module "fbdevhw"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) LoadModule: "fbdevhw"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) Module fbdevhw: vendor="X.Org Foundation"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: compiled for 1.20.8, module version = 0.0.2
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: ABI class: X.Org Video Driver, version 24.1
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) open /dev/fb0: No such file or directory
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: vesa: Ignoring device with a bound kernel driver
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Screen 0 deleted because of no matching config section.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) UnloadModule: "modesetting"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Screen 0 deleted because of no matching config section.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) UnloadModule: "fbdev"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) UnloadSubModule: "fbdevhw"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Screen 0 deleted because of no matching config section.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (II) UnloadModule: "vesa"
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Device(s) detected, but none match those in the config file.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: Fatal server error:
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) no screens found(EE)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: Please consult the Fedora Project support
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: at http://wiki.x.org
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: for help.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE)
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1150]: (EE) Server terminated with error (1). Closing log file.
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1148]: Unable to run X server
wrz 06 08:28:29 snowball2 gdm-launch-environment][1144]: pam_unix(gdm-launch-environment:session): session closed for user gdm
wrz 06 08:28:29 snowball2 audit[1144]: USER_END pid=1144 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_keyinit,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_umask acct="gdm" exe="/usr/libexec/gdm-session-worker" hostname=snowball2 addr=? terminal=/dev/tty1 res=success'
wrz 06 08:28:29 snowball2 audit[1144]: CRED_DISP pid=1144 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_permit acct="gdm" exe="/usr/libexec/gdm-session-worker" hostname=snowball2 addr=? terminal=/dev/tty1 res=success'
wrz 06 08:28:29 snowball2 gdm[991]: Child process -1148 was already dead.
wrz 06 08:28:29 snowball2 systemd[1]: session-c2.scope: Succeeded.
wrz 06 08:28:29 snowball2 systemd-logind[921]: Session c2 logged out. Waiting for processes to exit.
wrz 06 08:28:29 snowball2 systemd-logind[921]: Removed session c2.
wrz 06 08:28:30 snowball2 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 450.66 Wed Aug 12 19:42:48 UTC 2020
wrz 06 08:28:30 snowball2 systemd-udevd[667]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255'' failed with exit code 1.
wrz 06 08:28:30 snowball2 kernel: nvidia-uvm: Loaded the UVM driver, major device number 236.
wrz 06 08:28:30 snowball2 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 450.66 Wed Aug 12 19:37:58 UTC 2020
wrz 06 08:28:30 snowball2 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
wrz 06 08:28:30 snowball2 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
wrz 06 08:28:30 snowball2 systemd[1]: Created slice system-systemd\x2dbacklight.slice.
wrz 06 08:28:30 snowball2 systemd[1]: Condition check resulted in Fallback to nouveau as nvidia did not load being skipped.

(In reply to Nicolas Chauvet from comment #7)
> (In reply to leigh scott from comment #5)
>
> > IMO it's gdm that should wait for X before trying start.
> Not at all (gdm starts X, not the other way), until gdm starts wayland...

Looking at the log, the nvidia module takes too long to load, so gdm loads nouveau, which fails; then on the second attempt it uses nvidia.

Maybe we should revert?
https://pkgs.rpmfusion.org/cgit/nonfree/xorg-x11-drv-nvidia.git/commit/?h=f32&id=6f7f9a3cbb4d2f0d957a2a8f21cb2ca50238666f

(In reply to leigh scott from comment #11)
> (In reply to Nicolas Chauvet from comment #7)
> > (In reply to leigh scott from comment #5)
> >
> > > IMO it's gdm that should wait for X before trying start.
> > Not at all (gdm starts X, not the other way), until gdm starts wayland...
>
> Looking at the log, nvidia module takes too long to load so gdm loads
> nouveau which fails, then on the second attempt it uses nvidia.

nouveau isn't loaded by gdm or Xorg when it is blacklisted in the cmdline. It is more likely loaded by nvidia-fallback.service.
Try to reproduce with systemctl mask nvidia-fallback.service

> Maybe we should revert?

NO !?

Created attachment 2225 [details]
journal with nvidia-fallback.service masked

nvidia-fallback.service seems to have no effect. The last entry before switching back to vt1 is at 17:34:18.

Regarding the ACPI warning, this appears to be a known, harmless problem:
https://askubuntu.com/questions/842134/acpi-warning-argument-4-type-mismatch

Does this issue occur with another DM? Try reproducing the issue with lightdm.

Created attachment 2227 [details]
journal with lightdm
With lightdm X appears to start without need for VT switch, but lightdm itself appears not to work (only white screen and cursor is shown).
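
The ordering discussed in this thread (X giving up before the nvidia module announces itself) can be checked mechanically against a journal excerpt. The following is only a rough sketch: the helper names `first_seen` and `x_failed_before_driver` are hypothetical, the three journal lines are copied from the log quoted in this report, and the one-second timestamp resolution means lines logged in the same second cannot be ordered this way.

```python
import re

# Three lines copied from the journal quoted in this report.
JOURNAL = """\
wrz 06 08:28:29 snowball2 /usr/libexec/gdm-x-session[1148]: Unable to run X server
wrz 06 08:28:30 snowball2 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 450.66 Wed Aug 12 19:42:48 UTC 2020
wrz 06 08:28:30 snowball2 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
"""

# The first HH:MM:SS on a journal line is the log timestamp.
TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def first_seen(journal, needle):
    """Return seconds since midnight for the first line containing needle, or None."""
    for line in journal.splitlines():
        if needle in line:
            match = TIME_RE.search(line)
            if match:
                hours, minutes, seconds = map(int, match.groups())
                return hours * 3600 + minutes * 60 + seconds
    return None


def x_failed_before_driver(journal):
    """True if X reported failure strictly before the nvidia module logged its load."""
    x_failure = first_seen(journal, "Unable to run X server")
    driver_load = first_seen(journal, "NVRM: loading NVIDIA")
    if x_failure is None or driver_load is None:
        return False
    return x_failure < driver_load
```

Calling `x_failed_before_driver(JOURNAL)` on the excerpt above returns `True`, which matches the reading that the first X attempt ran before the nvidia module was loaded. It says nothing about why the module loaded late.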
(In reply to Julian Sikorski from comment #13)
> Created attachment 2225 [details]
> journal with nvidia-fallback.service masked
>
> nvidia-fallback.service seems to have no effect. The last entry before

What do you mean by no effect? Does systemctl mask nvidia-fallback prevent nouveau from loading in a "racy" condition?

As a side note, I have a similar issue on my ARM devices: I need to restart gdm before it can display anything. There seems to be a race between the display manager and the graphics driver stack, and disabling modeset makes the error easier to trigger.

If you remove nvidia-drm.modeset=1 from grub, can you replace it with rd.driver.pre=nvidia-drm instead? Does it work around your problem?

It helped once and didn't once, I need to test more. I have upgraded to F33 since, in case it matters.

Is this still reproducible? Was a gdm report made?

I have retired the machine affected by this bug, meaning I can no longer test it. Sorry.

No problem, let's close the bug then. |
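
For anyone retesting the kernel command line options mentioned in this thread, a small helper can report which of them are actually set. This is a sketch under assumptions: `parse_cmdline` and `summarize` are hypothetical names, only `nvidia-drm.modeset` and `rd.driver.pre` (both taken from the comments above) are inspected, and `rd.driver.pre` is assumed to take a comma-separated driver list.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: [values]}; bare flags get None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def summarize(cmdline):
    """Report the nvidia-drm modeset value and whether nvidia-drm is pre-loaded."""
    params = parse_cmdline(cmdline)
    modeset_values = params.get("nvidia-drm.modeset", [])
    preloaded = []
    for value in params.get("rd.driver.pre", []):
        if value:
            preloaded.extend(value.split(","))
    return {
        # The last occurrence is reported when the option is repeated.
        "modeset": modeset_values[-1] if modeset_values else None,
        "preloads_nvidia_drm": "nvidia-drm" in preloaded,
    }
```

On a live system the input would come from `/proc/cmdline`, for example `summarize(open("/proc/cmdline").read())`. Note that a mistyped option such as `nvidia-drm=modeset=0` is parsed as a different key, so `modeset` is reported as `None` in that case.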