[Xen-devel] [PATCHv10 0/9] Xen: extend kexec hypercall for use with pv-ops kernels
dslutz at verizon.com
Fri Nov 8 09:22:26 EST 2013
On 11/08/13 09:01, Andrew Cooper wrote:
> On 08/11/13 13:19, Jan Beulich wrote:
>>>>> On 08.11.13 at 14:13, David Vrabel <david.vrabel at citrix.com> wrote:
>>> Sorry, forgot to CC you on this series.
>>> Can we have your opinion on whether this kexec series can be merged?
>>> And if not, what further work and/or testing is required?
>> Just to clarify - unless I missed something, there was still no
>> review of this from Daniel or someone else known to be
>> familiar with the subject. If Keir gave his ack, formally this
>> could go in, but I wouldn't feel too comfortable with that (all the
>> more so because, apart from not having reviewed it, Daniel also
>> seems to continue to have problems with it).
If I am following this correctly, Jan is testing this by running Xen
under QEMU. All my testing has been on bare metal.
> Can I have myself deemed to be familiar with the subject as far as this
> is concerned?
> A noticeable quantity of my contributions to Xen have been in the kexec
> / crash areas, and I am the author of the xen-crashdump-analyser.
> I do realise that I am certainly not impartial as far as this series is
> concerned, being a co-developer.
> David's statement of "the current implementation is so broken and
> useless that..." is completely accurate. It is frankly a miracle
> that the current code ever worked at all (and from XenServer's point of
> view, failed far more often than it worked).
> For reference, XenServer 6.2 shipped with approximately v7 of this
> series, and an appropriate kexec-tools and xen-crashdump-analyser.
> Since we put the code in, we have not had a single failure-to-kexec in
> automated testing (both specific crash tests, and from unexpected host
> crashes), whereas previously we were seeing reliable failures to crash on most of
> our test infrastructure.
Verizon is also using an older version backported to 4.2.1, and we have
yet to see a failure in getting into the crash kernel via kexec (it is a
very small sample size: ~6 Dom0 crashes so far). I have only done 10
crashes so far with v10+ (soon to be v11).
> In stark contrast to previous versions of XenServer, we have not had a
> single customer reported host crash where the kexec path has failed.
> There was one systematic failure where the HPSA driver was unhappy with
> the state of the hardware, resulting in no root filesystem to write logs
> to, and a repeated panic and Xen deadlock in the queued invalidation