Bug 360

Summary: RFE: Logging out after an nvidia update leaves X broken
Product: Fedora
Reporter: Roderick Johnstone <roderick.johnstone>
Component: nvidia-kmod
Assignee: Nicolas Chauvet <kwizart>
Status: RESOLVED FIXED
Severity: enhancement
CC: kwizart
Priority: P5    
Version: unspecified   
Hardware: All   
OS: GNU/Linux   
namespace:

Description Roderick Johnstone 2009-02-04 12:40:57 CET
When users log out, X does not restart automatically if yum has (automatically) updated the nvidia drivers.
A reboot seems to be required.

Either the new kmod for the new nvidia version should be installed or it should be built on update from the akmod if it is installed. Presumably one or other should be required as a dependency of the update.

X should reload the kmod when starting to make sure it has the right version.

Users are then not left with a broken system after logging out following an nvidia update.
Comment 1 Stewart Adam 2009-02-04 13:45:24 CET
There isn't much we can do to get around this - rpmfusion-config-display will attempt to reload modules once we use it, but there's still many other problems that can occur if you don't completely reboot.
Comment 2 Nicolas Chauvet 2009-02-04 14:02:54 CET
I'm perfectly aware of the problem. This is why I tend to only update the driver once a new kernel is here. But the real fix would be to update the driver only on shutdown and/or to recommend a restart.

In the past, I've experienced the same kind of issue while updating firefox when it was in use.
The same goes for glibc, which usually needs a complete reboot.

> X should reload the kmod when starting to make sure it has the right version.
For now, there is no way to assume that the right version will be there.
Of course we could still rebuild an updated kmod for every kernel that was released for Fedora, but that seems like a lot of work. Only the latest and the previous kernel would be enough.
Then, we should check that the akmod gets rebuilt once the new version is installed. But that just moves from one problem to another: what if end-users want to revert to the previous nvidia driver, assuming that the previous kernel will have the matching previous module? We could have a mechanism that checks whether the module produced by akmod has the same version as the running replacement libraries (Nvidia's libGL and xorg).
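
A minimal sketch of such a version check, in Python. It assumes the kernel module is named nvidia and that the user-land libraries ship in the xorg-x11-drv-nvidia package (the name used later in this thread); the function names are hypothetical and not part of any existing RPM Fusion tool.

```python
import subprocess


def run_and_strip(cmd):
    """Run a command and return its stripped stdout, or None on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def kmod_version(module="nvidia"):
    """Version of the module file that modprobe would load for the running kernel."""
    return run_and_strip(["modinfo", "-F", "version", module])


def userland_version(package="xorg-x11-drv-nvidia"):
    """Version of the installed user-land driver package (package name is an assumption)."""
    return run_and_strip(["rpm", "-q", "--qf", "%{VERSION}", package])


def versions_match(kmod, userland):
    """True only when both versions are known and identical."""
    return kmod is not None and userland is not None and kmod == userland
```

A display manager hook or an init script could call versions_match(kmod_version(), userland_version()) before starting X and fall back to another driver when it returns False. Where that hook would live is left open here.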


Comment 3 Thorsten Leemhuis 2009-02-04 14:05:40 CET
(In reply to comment #1)
> There isn't much we can do to get around this - rpmfusion-config-display will
> attempt to reload modules once we use it,

If that doesn't work reliably then...

> but there's still many other problems
> that can occur if you don't completely reboot.

...we work towards a solution where the user gets told to reboot once an update has happened. The drivers should be disabled if the user restarts X without rebooting; another pop-up should tell the user why that happened.

Comment 4 Stewart Adam 2009-02-04 19:30:06 CET
(In reply to comment #3)
> (In reply to comment #1)
> > There isn't much we can do to get around this - rpmfusion-config-display will
> > attempt to reload modules once we use it,
> 
> If that doesn't work reliably then...
> 
> > but there's still many other problems
> > that can occur if you don't completely reboot.
> 
> ...we work towards a solution where the user gets told to reboot once an update
> has happened. The drivers should be disabled if the user restarts X without
> rebooting; another pop-up should tell the user why that happened.
I wonder if it's possible to add the same metadata Bodhi adds to updates so that PK requests a system restart...
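
A rough sketch of what reading that metadata could look like, in Python. It assumes the repository's updateinfo.xml marks packages with a reboot_suggested element inside each package entry, as Fedora's update metadata is understood to do; the exact schema should be checked against a real updateinfo.xml, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def packages_suggesting_reboot(updateinfo_xml):
    """Return the sorted names of packages flagged with reboot_suggested.

    The element name and its placement under <package> are assumptions
    about the updateinfo.xml layout, not something confirmed in this bug.
    """
    root = ET.fromstring(updateinfo_xml)
    names = set()
    for package in root.iter("package"):
        flag = package.findtext("reboot_suggested", default="")
        name = package.get("name")
        if name and flag.strip().lower() in ("true", "1"):
            names.add(name)
    return sorted(names)
```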
Comment 5 Thorsten Leemhuis 2009-02-07 11:46:37 CET
(In reply to comment #4)
>
> I wonder if it's possible to add the same metadata Bodhi adds to updates so
> that PK requests a system restart...

Might be a good idea and should be possible somehow. But that only solves part of the problem, as users might not do what PK tells them...
Comment 6 Stewart Adam 2009-02-07 15:02:05 CET
(In reply to comment #5)
> (In reply to comment #4)
> >
> > I wonder if it's possible to add the same metadata Bodhi adds to updates so
> > that PK requests a system restart...
> 
> Might be a good idea and should be possible somehow. But that only solves part
> of the problem, as users might not do what PK tells them...
kwizart knows more in this area than I do, but IIRC there's nothing we can really do apart from strongly suggesting the user reboots _right away_... We can do our best to reload the kernel modules, but a variety of other problems can still pop up if they choose not to reboot (for example, the old libGL would still be in use by some programs, while all newly started programs use the new libGL).
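
To illustrate the libGL case: on Linux, a library file that was replaced while a process still has it mapped shows up in /proc/<pid>/maps with a " (deleted)" suffix. The following is a minimal Python sketch that picks such entries out of the text of a maps file; the function name and the default "libGL" filter are illustrative assumptions.

```python
DELETED_SUFFIX = " (deleted)"


def stale_libraries(maps_text, needle="libGL"):
    """Return sorted paths of deleted-but-still-mapped files containing needle."""
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        path = fields[5].strip()
        if path.endswith(DELETED_SUFFIX) and needle in path:
            stale.add(path[: -len(DELETED_SUFFIX)])
    return sorted(stale)
```

Feeding it the contents of each readable /proc/<pid>/maps would list the programs still running against the old library, which are the ones that need restarting.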
Comment 7 Roderick Johnstone 2009-02-08 22:06:57 CET
As in my original report, the problem for us is that yum automatically updated the nvidia driver. The user didn't know anything was wrong until they logged out, at which point they had a broken system. I understand that reloading modules is not guaranteed to work in all circumstances, but if it helps in a lot of cases, then it's worth doing. Having a broken system after an automatic update is quite poor.
Comment 8 Stewart Adam 2009-02-08 23:22:45 CET
(In reply to comment #7)
> As in my original report, the problem for us is that yum automatically updated
> the nvidia driver.
I'm a bit confused by what you mean here - what do you mean by "automatic"? Did the user update via the graphical interface (PackageKit), or does the system have another script?

> The user didn't know anything was wrong until they logged
> out, at which point they had a broken system. I understand that reloading
> modules is not guaranteed to work in all circumstances, but if it helps in a
> lot of cases, then it's worth doing. Having a broken system after an automatic
> update is quite poor.
Hopefully we can add the metadata that will suggest a reboot if you're using PackageKit, but ultimately we can't force a user to reboot. The only alternative is disabling the nvidia driver when updating, which (unless I'm mistaken) could be equally damaging if they don't log out, since the nvidia driver will be in use by Xorg but libGL.so.1 will point to the one provided by Mesa.
Comment 9 Roderick Johnstone 2009-02-08 23:50:02 CET
So, yum update runs nightly from a cron job. If the nvidia drivers are updated, the next time the user logs out X will be broken.
Comment 10 Stewart Adam 2009-02-09 04:08:37 CET
(In reply to comment #9)
> So, yum update runs nightly from a cron job. If the nvidia drivers are updated,
> the next time the user logs out X will be broken.
As I've explained we can't force a system reboot, so it's ultimately up to the user. If you don't want breakage, you should add the --exclude={xorg-x11-drv,kmod}-nvidia argument to yum and update only when you know it's safe to reboot afterwards.
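
For a nightly cron job like the one described, this could look roughly like the Python sketch below. In a shell, the brace pattern above expands to two separate --exclude arguments, which the sketch spells out explicitly; the -y flag and the function name are assumptions about how the job is run.

```python
import subprocess

# Package names taken from the --exclude pattern above.
EXCLUDED_PACKAGES = ["xorg-x11-drv-nvidia", "kmod-nvidia"]


def yum_update_command(excluded=EXCLUDED_PACKAGES):
    """Build a non-interactive 'yum update' command that skips the given packages."""
    cmd = ["yum", "-y", "update"]
    cmd += ["--exclude=" + name for name in excluded]
    return cmd


def run_nightly_update():
    """Run the update; intended to be called from the cron job as root."""
    return subprocess.run(yum_update_command(), check=False).returncode
```

The nvidia packages can then be updated by hand, without the excludes, at a time when a reboot is acceptable.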

Don't get me wrong, there are still issues that we need to fix... I'm going to look into the metadata so that users are prompted to reboot when updating from PackageKit, but if you're updating manually, from yum or from a script, we can't do much more than reload the modules and hope everything turns out OK.
Comment 11 Nicolas Chauvet 2011-01-25 23:11:43 CET
I expect that the problem is solved in the current release.
If not, reopen and reassign to me.
Comment 12 Roderick Johnstone 2011-02-04 17:44:52 CET
I still see the same broken behaviour updating to 260.19.29, or 260.19.36 in rpmfusion-nonfree-updates-testing. Update the nvidia drivers, log out, and you are left at the console, not the X login screen.

Reopening.

Updated Version to 14.

Nicolas: I'm unable to change the Assigned To field to be you.
Comment 13 Roderick Johnstone 2011-02-04 17:59:07 CET
Ah, I wonder if the way to work around this is to have kdm just reset the X server at session exit, rather than restarting it. It's so long ago now, I can't remember why we changed the default to restart the X server on session exit.
Comment 14 Nicolas Chauvet 2011-02-04 19:58:18 CET
I'm afraid that corner case cannot be handled.

The reason is related not only to the Xorg server, but also to the nvidia kernel module.
When doing this, the driver is never reloaded and you end up with updated user-land binaries alongside the old kernel module. So there is a version mismatch.

Not every driver hard-codes the kernel module version against the Xorg driver version that way. But even if your display manager handles reloading the kernel module, not all memory will be freed, and nvidia.ko needs a lot of contiguous memory IIRC.

So the best option is to restart with a fresh boot on an nvidia driver upgrade.


Now, it would be interesting as an experiment to try to reload the driver in such a case. That kind of hack would also matter for 'optimus' support. (And that is the correct place to handle driver re-configuration IMO.)


Comment 15 Nicolas Chauvet 2014-04-11 11:07:01 CEST
I still don't foresee fixing this issue in a simple and reliable way in the nvidia packaging. That being said, the way this is currently solved, from the general and end-user point of view, is to apply updates during a reboot into a specific update.target. This is the way it's done with current F-20 and systemd.

Hence I consider this very old issue fixed.