[RFC PATCH 0/4] purgatory: Add basic support for IPMI command execution

河合英宏 / KAWAI,HIDEHIRO hidehiro.kawai.ez at hitachi.com
Thu Jan 21 20:39:18 PST 2016


Hello,

Thanks for your comments.

> I understand what you are trying to accomplish here, but I'm not sure of
> the wisdom of this approach.  I'll give some more information and the
> kexec maintainers can decide, I suppose.
> 
> The KCS interface given here probably covers ~70% of the systems out there
> right now.

I decided to use the KCS interface because I believe it covers a
relatively large share of servers, and it is the only system interface
that one of our target machines supports.

>  Other systems have:
>    * KCS interfaces at a different port or in a different place like
>      memory, PCI, and with different register sizes and spacing.

At a minimum, I'm going to add an option to change the base I/O port.
I'm not sure about the other cases because I don't know how much
supporting them would raise the coverage, and I don't have hardware
to test them on.
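
As a rough illustration only (this is not code from the patch series),
a configurable base port could be modelled as below.  The struct and
function names are hypothetical; the default base 0xCA2, the
data/status register layout, and the status bits follow the KCS
definition in the IPMI specification.  The register spacing field is
an assumption about how non-default layouts might be described.

```c
#include <stdint.h>

/* Hypothetical configuration; names are illustrative, not from the patch. */
struct kcs_config {
    uint16_t base;     /* I/O port of the data register (0xCA2 by default) */
    uint8_t  spacing;  /* distance in ports between consecutive registers */
};

/* KCS status register bits as defined by the IPMI specification. */
#define KCS_STATUS_OBF  0x01  /* output buffer full: BMC has a byte for us */
#define KCS_STATUS_IBF  0x02  /* input buffer full: BMC has not consumed our byte */
#define KCS_STATE(s)    (((s) >> 6) & 0x03)  /* state lives in bits 7:6 */

enum kcs_state { KCS_IDLE = 0, KCS_READ = 1, KCS_WRITE = 2, KCS_ERROR = 3 };

/* Data_In/Data_Out register: first register of the interface. */
static inline uint16_t kcs_data_port(const struct kcs_config *c)
{
    return c->base;
}

/* Status (read) / Command (write) register: second register. */
static inline uint16_t kcs_status_port(const struct kcs_config *c)
{
    return (uint16_t)(c->base + c->spacing);
}
```

With base 0xCA2 and spacing 1 this yields the usual 0xCA2/0xCA3 pair;
memory-mapped or PCI-attached interfaces would need more than this.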

>    * Other standard interfaces.  SMIC (probably not relevant), BT
>      (faster, it does block transfers) and SSIF (which is IPMI over I2C).
>    * Those other standard interfaces can be in different places, just
>      like KCS.  Hundreds of I2C interfaces exist.
>    * Non-standard interfaces.  Power systems have their own IPMI interfaces,
>      for instance.  Some systems have IPMI over serial ports, though
>      hopefully that has pretty much gone away.
> 
> I'd guess that over half of the IPMI SI driver is discovering and
> handling all the various interface types, locations from all the
> sources it can come from.
> 
> As time goes on that 70% number is decreasing in favour of other faster
> and more convenient interfaces.  I expect that SSIF will become much more
> popular over time because it has block transfer capability and all the
> hardware is already there on systems.

Thanks for this information.  I'll look at the BMC implementations
of other servers and then decide whether I should support SSIF, too.

> This is no different, of course, than any other common hardware
> interface out there.  USB, ATA, etc.  But it makes it hard to cover
> all the possibilities in something like purgatory.

Devices of that kind sometimes cause dump failures through stray DMA
or interrupts, so we shouldn't use them.  Also, I don't think
purgatory is a suitable place to implement their drivers.

Other possible options are to use a serial device or NVRAM (like
the pstore driver).  However, a serial port is not always available
for this purpose (this is actually our case), and NVRAM is not
always present.  Furthermore, we'd like to start/stop
the BMC's watchdog timer before starting the 2nd kernel.  This
can be done only through IPMI.
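
For readers unfamiliar with the commands involved, here is a minimal
sketch of how the watchdog requests could be laid out as a KCS byte
stream.  It is not taken from the patch series, and the function names
are hypothetical.  The NetFn/command numbers and the Set Watchdog
Timer field layout follow the IPMI specification: Set Watchdog Timer
with the "don't stop" bit clear stops a running timer, and Reset
Watchdog Timer starts (or restarts) the countdown.

```c
#include <stddef.h>
#include <stdint.h>

#define IPMI_NETFN_APP         0x06  /* Application network function */
#define IPMI_CMD_RESET_WDT     0x22  /* Reset Watchdog Timer */
#define IPMI_CMD_SET_WDT       0x24  /* Set Watchdog Timer */

#define WDT_USE_SMS_OS         0x04  /* timer use: SMS/OS */
#define WDT_USE_DONT_STOP      0x40  /* keep a running timer running */
#define WDT_ACTION_NONE        0x00
#define WDT_ACTION_HARD_RESET  0x01

/*
 * Build a Set Watchdog Timer request.  'countdown' is in 100 ms units.
 * Returns the request length, or 0 if the buffer is too small.
 */
static size_t ipmi_build_set_wdt(uint8_t *buf, size_t len, uint8_t timer_use,
                                 uint8_t action, uint16_t countdown)
{
    if (len < 8)
        return 0;
    buf[0] = (uint8_t)(IPMI_NETFN_APP << 2);  /* NetFn in bits 7:2, LUN 0 */
    buf[1] = IPMI_CMD_SET_WDT;
    buf[2] = timer_use;
    buf[3] = action;
    buf[4] = 0;                               /* pre-timeout interval (s) */
    buf[5] = 0;                               /* expiration flags to clear */
    buf[6] = (uint8_t)(countdown & 0xff);     /* initial countdown, LSB */
    buf[7] = (uint8_t)(countdown >> 8);       /* initial countdown, MSB */
    return 8;
}

/* Build a Reset Watchdog Timer request (no data bytes). */
static size_t ipmi_build_reset_wdt(uint8_t *buf, size_t len)
{
    if (len < 2)
        return 0;
    buf[0] = (uint8_t)(IPMI_NETFN_APP << 2);
    buf[1] = IPMI_CMD_RESET_WDT;
    return 2;
}
```

For example, a 60 second hard-reset watchdog would be a Set Watchdog
Timer request with countdown 600 followed by a Reset Watchdog Timer
request; actually sending the bytes through the KCS state machine is
left out here.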

> I know how valuable this information can be.  It has saved my butt on
> occasions, which is why I go through the inconvenience of handling it
> in the IPMI driver.
> But it seems to me that the failure rate of doing this in the crashing
> kernel should be pretty low.  Not zero of course.  But I have no idea
> what it is.

At first, I tried to use crash_kexec_post_notifiers to call
the IPMI driver's callback before jumping to the 2nd kernel.
But that feature has a couple of bugs, which I tried to fix:

https://lkml.org/lkml/2015/7/10/316
https://lkml.org/lkml/2015/7/23/864

However, no one supported these fixes and the discussion made no
progress, because calling notifiers before kdump is not
reliable.  So I proposed using purgatory instead.

Regards,

--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



