Lockdep warnings on kexec (virtio_blk, hrtimers)

David Woodhouse dwmw2 at infradead.org
Fri Dec 13 12:16:51 PST 2024


On 13 December 2024 19:06:52 GMT, "Rafael J. Wysocki" <rafael at kernel.org> wrote:
>On Fri, Dec 13, 2024 at 6:32 PM Rafael J. Wysocki <rafael at kernel.org> wrote:
>>
>> On Fri, Dec 13, 2024 at 6:05 PM Thomas Gleixner <tglx at linutronix.de> wrote:
>> >
>> > On Fri, Dec 13 2024 at 14:07, David Woodhouse wrote:
>> > > On Fri, 2024-12-13 at 14:23 +0100, Thomas Gleixner wrote:
>> > >> That's only true for the case where the new kernel takes over.
>> > >>
>> > >> In the case KEXEC_JUMP=n and kexec_image->preserve_context == true, then
>> > >> it is supposed to align with suspend/resume and if you look at the code
>> > >> then it actually mimics suspend/resume in the most dilettanteish way.
>> > >
>> > > Did you mean KEXEC_JUMP=y there?
>> >
>> > Yes, of course.
>> >
>> > > I spent a while the other week trying to understand the case where
>> > > CONFIG_KEXEC_JUMP=n and kexec_image->preserve_context=true, and came to
>> > > the conclusion that it was a mirage. Userspace can't *actually* set the
>> > > KEXEC_PRESERVE_CONTEXT bit when setting up the image, if KEXEC_JUMP=n.
>> > >
>> > > The whole of the code path for that case is dead code. It's confusing
>> > > because as discussed elsewhere, we don't just #ifdef out the whole of
>> > > that dead code path, but only the bits which don't actually *compile*
>> > > (like references to restore_processor_state() etc.).
>> >
>> > Yes, I had to stare at it quite a while. :)
>> >
>> > >> It's a patently bad idea to clobber the kernel with kexec jump "fixes"
>> > >> instead of using the well tested and established suspend/resume
>> > >> machinery.
>> > >>
>> > >> All it takes is to:
>> > >>
>> > >>     1) disable the wakeup logic
>> > >>
>> > >>     2) provide a mechanism to invoke machine_kexec() instead of the
>> > >>        actual suspend mechanism.
>> > >>
>> > >> No?
>> > >
>> > > Agreed. The hacky proof of concept I posted earlier invoking
>> > > machine_kexec() instead of suspend_ops->enter() works fine. I'll look
>> > > at cleaning it up and making it not invoke all the ACPI hooks for
>> > > *actual* suspend to RAM, etc.
>> >
>> > Something like the below? It survived an hour of loop testing.
>>
>> I think that this KEXEC_JUMP thing can be dropped entirely and forgotten.
>>
>> I'm not aware of anyone actually using it.
>
>And now I've been made aware that it's used.  Oh well.
>
>As discussed with Dave over IRC, the current implementation isn't
>actually that bad.  It might use PMSG_THAW instead of PMSG_RESTORE in
>kernel_kexec(), but other than this it reflects the code flow around
>the jump from the restore kernel to the image one during resume from
>hibernation.
>
>This means that hibernation and kexec jump could be unified somewhat.

Fair enough. I'm happy to do whatever cleanups or consolidation make sense, if we have a consensus.

There remains the question of why the blk-mq thing explodes on the way down for both the kexec jump case and, apparently, even the plain kexec case.
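
For reference, the point quoted above (that userspace can't set KEXEC_PRESERVE_CONTEXT when KEXEC_JUMP=n) can be sketched in standalone C. This is only a simplified model of the flag validation on the kexec_load() path, not the kernel's code. The flag values are the ones from the uapi kexec header as I remember them; allowed_flags() and check_load_flags() are hypothetical names invented for this sketch; the real kernel picks the allowed mask at compile time from CONFIG_KEXEC_JUMP rather than through a runtime parameter, and newer kernels accept additional flags that are left out here.

```c
#include <errno.h>

/* Flag values as defined in the uapi kexec header (assumed, from memory). */
#define KEXEC_ON_CRASH          0x00000001UL
#define KEXEC_PRESERVE_CONTEXT  0x00000002UL
#define KEXEC_ARCH_MASK         0xffff0000UL

/*
 * Hypothetical helper: the set of flags userspace may pass.
 * In the kernel this is a compile-time choice driven by CONFIG_KEXEC_JUMP;
 * a runtime parameter is used here only so both cases can be exercised.
 */
static unsigned long allowed_flags(int kexec_jump_enabled)
{
	if (kexec_jump_enabled)
		return KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT;
	return KEXEC_ON_CRASH;
}

/*
 * Hypothetical helper: reject any non-architecture flag bit that is not in
 * the allowed set. Returns 0 if the flags are acceptable, -EINVAL otherwise.
 */
static int check_load_flags(unsigned long flags, int kexec_jump_enabled)
{
	unsigned long requested = flags & ~KEXEC_ARCH_MASK;

	if ((flags & allowed_flags(kexec_jump_enabled)) != requested)
		return -EINVAL;
	return 0;
}
```

With the jump support modelled as disabled, check_load_flags(KEXEC_PRESERVE_CONTEXT, 0) returns -EINVAL, so an image with preserve_context set never gets loaded and the code handling it is unreachable. With it enabled, the same flags are accepted.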


