[PATCH 0/3] drm/radeon kexec fixes

Alex Deucher alexdeucher at gmail.com
Mon Sep 9 09:04:01 EDT 2013


On Mon, Sep 9, 2013 at 5:21 AM, Markus Trippelsdorf
<markus at trippelsdorf.de> wrote:
> On 2013.09.08 at 17:32 -0700, Eric W. Biederman wrote:
>> Markus Trippelsdorf <markus at trippelsdorf.de> writes:
>>
>> > Here are a couple of patches that get kexec working with radeon devices.
>> > I've tested this on my RS780.
>> > Comments or flames are welcome.
>> > Thanks.
>>
>> A couple of high level comments.
>>
>> This looks promising for the usual case.
>>
>> Removing the printk at the end of the kexec path seems a little dubious,
>> what of other cpus, interrupt handlers, etc.  Basically estabilishing a
>> new rule on when printk is allowed seems a little dubious at this point,
>> even if it is a useful debugging trick.
>
> OK. I will drop this patch. It doesn't seem to be necessary, because I
> cannot reproduce the printk related hang anymore.
>
>> Having a clean shutdown of the radeon definitely seems worth doing,
>> because the cases where we care abouty video are when a person is in
>> front of the system.
>
> Yes. But please note that even with radeon_pci_shutdown implemented, I
> still get ring test failures on roughly every eighth kexec boot:
>
>  [drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
>  radeon 0000:01:05.0: disabling GPU acceleration
>
> That's definitely better than the current state of affairs, with ring
> test failures on every second boot. But I haven't figured out the reason
> for these failures yet. It's curious that once a ring test failure
> occurs, it will reliably fail after each kexec invocation, no matter how
> often repeated. Only a reboot brings the machine back to normal.
>
>> I don't know if you want to remove the sanity checks.  They seem cheap
>> and safe regardless.  Are they expensive or ineffective?  Moreover if
>> they work a reasonable amount of the time that means that the kexec on
>> panic case (where we don't shut anything down) can actually use the
>> video, and that in general the driver will be more robust.  I don't
>> expect anyone much cares as kexec on panic is mostly used to just write
>> a core file to the network, or the local disk.  But if it is easy to
>> keep that case working most of the time, why not.
>
> IIRC Alex said the sanity checks are expensive and boot-time could be
> improved by dropping them. Maybe he can chime in?

They shouldn't be necessary with a proper shutdown, but in this
particular case, they are not very expensive.  What is expensive is
having a separate sanity check functions for all the various hw blocks
to teardown everything on startup prior to starting it up in case
kexec, etc. left the system in a bad state.  It ends up amounting to a
full tear down sequence followed by a full start up sequence every
time you load the driver.

I can't really comment on the first patch, but the rest seem fine.

Alex



More information about the kexec mailing list