uevent / call_usermodehelper-related breakage on suspend to disk [Was: Re: PCMCIA-related breakage on suspend to disk]

Patrick Mochel mochel at digitalimplant.org
Tue Dec 20 14:53:31 EST 2005


Apologies for the delay in responding; full email preserved for context.

On Tue, 6 Dec 2005, Dominik Brodowski wrote:

> Hi,
>
> On Thu, Nov 24, 2005 at 05:29:08AM +0000, Matthew Garrett wrote:
> > In the Ubuntu suspend to disk script, we eject all PCMCIA cards before
> > suspend in order to minimise problems caused by drivers that aren't
> > terribly happy with being suspended. This seems to cause the following
> > issue:
> >
> > 1) Userspace ejects the card
> > 2) Userspace triggers suspend to disk
> > 3) Userspace is frozen
> > 4) The kernel suspends all hardware
> > 5) An atomic copy of memory is made
> > 6) The kernel resumes the hardware in order to write the image to disk
> > 7) socket_resume is called
> > 8) socket_resume calls socket_insert
> > 9) The machine hangs
> >
> > It's fairly clear /why/ socket_insert is called - it's possible that a
> > card has been inserted between suspend and resume (we call cardctl
> > insert from userspace in order to deal with this case). However, it
> > seems to cause problems in this specific case, and I don't understand
> > why. The following patch avoids the issue, but plainly isn't the correct
> > solution.
>
> No, it doesn't seem so...
>
> The actual hang appears in this call trace (many functions left out...)
>
> drivers/pcmcia/cs.c:	socket_resume
> drivers/pcmcia/cs.c:	socket_insert
> drivers/pcmcia/cs.c:	pcmcia_device_add
> drivers/base/core.c:	device_register
> drivers/base/core.c:	device_add
> 		(additional info:
> 			if ((error = device_pm_add(dev)))
> 		 is never reached, as the pr_debug() didn't show up...)
> lib/kobject_uevent.c:	kobject_hotplug
>
> 	pr_debug ("%s: %s %s seq=%llu %s %s %s %s %s\n",
> 		  __FUNCTION__, argv[0], argv[1], (unsigned long long)seq,
> 		  envp[0], envp[1], envp[2], envp[3], envp[4]);
>
> 	is still shown, it then hangs either in
>
> --->	send_uevent(action_string, kobj_path, envp, GFP_KERNEL);
>
> 	or
>
> --->	retval = call_usermodehelper (argv[0], argv, envp, 0);
>
>
> What to do?
> a) fix send_uevent or call_usermodehelper to not lock up in the
> 	resume-when-suspending-to-disk-path
> b) make device_add fail if in
> 	resume-when-suspending-to-disk-path
> c) disable device-adding in drivers/pcmcia/cs.c:socket_resume() if in
> 	resume-when-suspending-to-disk-path

I think that (c) is the best approach. If it can happen in the subsystems,
that's great, though perhaps there is core infrastructure that must be in
place for it to work in general.
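
To make (c) concrete, here is a rough userspace sketch of the idea, not a
patch against drivers/pcmcia/cs.c. Everything in it is hypothetical: the
sketch_socket struct, the writing_suspend_image flag, and the sketch_*
functions are stand-ins I made up for illustration, and the real code would
need some way to learn that it is in the resume-to-write-the-image path,
which is the core infrastructure I am unsure exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-socket state; not the real PCMCIA struct. */
struct sketch_socket {
    bool card_present;      /* hardware reports a card in the slot */
    bool device_registered; /* a device structure exists for the card */
    bool insert_deferred;   /* insertion postponed until a real resume */
};

/*
 * Hypothetical flag: true while hardware has been resumed only so the
 * suspend image can be written, i.e. while userspace is still frozen.
 */
static bool writing_suspend_image;

/* Stand-in for the path that ends in device_add() and the hotplug event. */
static void sketch_socket_insert(struct sketch_socket *s)
{
    s->device_registered = true;
    s->insert_deferred = false;
}

/* Resume the socket, but do not add devices while writing the image. */
static void sketch_socket_resume(struct sketch_socket *s)
{
    /* Nothing to add if the slot is empty or the device already exists. */
    if (!s->card_present || s->device_registered)
        return;

    if (writing_suspend_image) {
        /* Adding a device here would need userspace; remember it instead. */
        s->insert_deferred = true;
        return;
    }

    sketch_socket_insert(s);
}
```

Calling sketch_socket_resume() once with the flag set leaves the card
unregistered and marked as deferred; calling it again with the flag clear
registers it.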

We (the core) should resume all devices that are present, then discard all
devices that have data structures but are not present, then add all new
devices.

Maybe we can eventually do all of those things simultaneously, but I see
no problem with serializing them, even long-term.
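
The serialized ordering above could look roughly like the following. This is
only a userspace sketch under my own assumptions: sketch_dev,
sketch_resume_all() and the action enum are invented names, and the "known"
and "present" flags stand in for whatever the core would really use to tell
which devices have data structures and which have hardware behind them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the sketch did to a device, recorded for inspection. */
enum sketch_action { SKETCH_NONE, SKETCH_RESUMED, SKETCH_DISCARDED, SKETCH_ADDED };

/* Hypothetical device record; not a real driver-core structure. */
struct sketch_dev {
    bool known;                /* a data structure existed before suspend */
    bool present;              /* hardware is there at resume time */
    enum sketch_action action; /* what was done to this device */
    int order;                 /* position in the overall sequence, 0 = untouched */
};

/* Run the three phases strictly one after another; return the step count. */
static int sketch_resume_all(struct sketch_dev *devs, size_t n)
{
    int step = 0;
    size_t i;

    /* Phase 1: resume every device that is known and still present. */
    for (i = 0; i < n; i++) {
        if (devs[i].known && devs[i].present) {
            devs[i].action = SKETCH_RESUMED;
            devs[i].order = ++step;
        }
    }

    /* Phase 2: discard devices that have data structures but no hardware. */
    for (i = 0; i < n; i++) {
        if (devs[i].known && !devs[i].present) {
            devs[i].action = SKETCH_DISCARDED;
            devs[i].known = false;
            devs[i].order = ++step;
        }
    }

    /* Phase 3: add devices that are present but were not known before. */
    for (i = 0; i < n; i++) {
        if (!devs[i].known && devs[i].present) {
            devs[i].action = SKETCH_ADDED;
            devs[i].known = true;
            devs[i].order = ++step;
        }
    }

    return step;
}
```

The point of the separate loops is that no device is added until every
existing device has been resumed or discarded, so the add path (and its
hotplug event) always runs last.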


Thanks,


	Patrick



