[PATCH 0/6] kexec: A new system call to allow in kernel loading

Wed Dec 4 14:34:31 EST 2013

On Fri, Nov 22, 2013 at 07:23:39PM -0800, Eric W. Biederman wrote:
> 
> > [..]
> >> >> There is also a huge missing piece of this in that your purgatory is not
> >> >> checking a hash of the loaded image before jumping to it.  Without that
> >> >> this is a huge regression at least for the kexec on panic case.  We
> >> >> absolutely need to check that the kernel sitting around in memory has
> >> >> not been corrupted before we let it run very far.
> >> >
> >> > Agreed. This should not be hard. It is just a matter of calculating
> >> > digest of segments. I will store it in kimage and verify digest again
> >> > before passing control to control page. Will fix it in next version.
> >> 
> >> Nak.  The verification needs to happen in purgatory. 
> >> 
> >> The verification needs to happen in code whose runtime environment
> >> does not depend on random parts of the kernel.  Anything else is a
> >> regression in maintainability and reliability.
> >> 
> >> It is the wrong direction to add any code to what needs to run in the
> >> known broken environment of the kernel when a panic happens.
> >> 
> >> Which means that you almost certainly need to go to the trouble of
> >> supporting the complexity needed to support purgatory code written in C.
> >> 
> >> (For those just tuning in purgatory is our term for the code that runs
> >> between the kernels to do those things that can not happen a priori).
> >
> > In general, I agree with not using kernel parts after crash.
> >
> > But what protects against purgatory itself having been scribbled over?
> > IOW, how different is purgatory memory compared to kernel memory where
> > digest routines are stored? They have an equal probability of being
> > scribbled over, and if that's the case, one is not better than the other.
> >
> > And if they both have equal probability of getting corrupted, then there does
> > not seem to be an advantage in moving digest verification inside
> > purgatory.
> 
> The primary reason is that maintenance of code in the kernel that is
> safe during a crash dump is hard.  That is why we boot a second kernel
> after all.  If the code to do the signature verification resides in
> machine_kexec on the kexec on panic code path in the kernel that has
> called panic it is almost a given that at some point or other someone
> will add an option that will add a weird dependency that makes the code
> unsafe when the kernel is crashing.  I have seen it happen several times
> on the existing kexec on panic code path.  I have seen it on other code
> paths like netconsole.  Which can currently on some kernels I have
> running cause the kernel to go into an endless printk loop if you call
> printk from interrupt context.  So what we really gain by moving the
> verification into purgatory is protection from inappropriate code reuse.
> 
> So having a completely separate piece of code may be a little harder to
> write initially but the code is much simpler and more reliable to
> maintain.  Essentially requiring no maintenance effort.  Further getting
> to the point where purgatory is written in C makes small changes much
> more approachable.

Hi Eric,

So you want separate purgatory code, and that purgatory should be
self-contained and should not share any code with the rest of the kernel.
No inclusion of header files, no linking against kernel libraries? That
means even re-implementing sha256 functions separately (like user space)?

If code maintenance is a concern, then I think I can reimplement some
of the functions to calculate sha256 in separate crash files and invoke
those to reduce code sharing with the rest of the kernel. Then we should
be able to link against the kernel and not have to create a separate
relocatable purgatory object and relocate it.

IOW, does purgatory still have to be a relocatable object? I think
user space had no choice, but given that we are implementing things
in the kernel, I should be able to implement my own hash calculation
and segment verification code, link it into the existing kernel, and
invoke these outside purgatory. Anyway, we already call many other
functions after a crash to stop cpus, save registers, etc.
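
For illustration only, a minimal sketch of what a self-contained
"own hash calculation and segment verification" could look like. It
depends on nothing but fixed-width integer types. The names struct seg,
segments_digest() and segments_verify() are hypothetical and stand in
for the real segment list; this is not the code from the patch series
or from kexec-tools.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor, a stand-in for the loaded image segments. */
struct seg { const uint8_t *buf; size_t len; };

struct sha256 { uint32_t h[8]; uint8_t blk[64]; uint64_t total; };

static const uint32_t K[64] = {
	0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
	0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
	0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
	0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
	0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
	0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
	0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
	0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
};

#define ROR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))

/* Compress one 64-byte block into the running state h. */
static void sha256_block(uint32_t h[8], const uint8_t *p)
{
	uint32_t w[64], s[8], t1, t2;
	int i;

	for (i = 0; i < 16; i++)	/* big-endian message words */
		w[i] = (uint32_t)p[4 * i] << 24 | (uint32_t)p[4 * i + 1] << 16 |
		       (uint32_t)p[4 * i + 2] << 8 | (uint32_t)p[4 * i + 3];
	for (; i < 64; i++)
		w[i] = w[i - 16] + w[i - 7] +
		       (ROR(w[i - 15], 7) ^ ROR(w[i - 15], 18) ^ (w[i - 15] >> 3)) +
		       (ROR(w[i - 2], 17) ^ ROR(w[i - 2], 19) ^ (w[i - 2] >> 10));
	for (i = 0; i < 8; i++)
		s[i] = h[i];
	for (i = 0; i < 64; i++) {
		t1 = s[7] + (ROR(s[4], 6) ^ ROR(s[4], 11) ^ ROR(s[4], 25)) +
		     ((s[4] & s[5]) ^ (~s[4] & s[6])) + K[i] + w[i];
		t2 = (ROR(s[0], 2) ^ ROR(s[0], 13) ^ ROR(s[0], 22)) +
		     ((s[0] & s[1]) ^ (s[0] & s[2]) ^ (s[1] & s[2]));
		s[7] = s[6]; s[6] = s[5]; s[5] = s[4]; s[4] = s[3] + t1;
		s[3] = s[2]; s[2] = s[1]; s[1] = s[0]; s[0] = t1 + t2;
	}
	for (i = 0; i < 8; i++)
		h[i] += s[i];
}

static void sha256_init(struct sha256 *c)
{
	static const uint32_t iv[8] = {
		0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,
		0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19
	};
	int i;

	for (i = 0; i < 8; i++)
		c->h[i] = iv[i];
	c->total = 0;
}

/* Byte-at-a-time update: slow, but has no dependency on memcpy(). */
static void sha256_update(struct sha256 *c, const uint8_t *p, size_t n)
{
	while (n--) {
		c->blk[c->total++ % 64] = *p++;
		if (c->total % 64 == 0)
			sha256_block(c->h, c->blk);
	}
}

static void sha256_final(struct sha256 *c, uint8_t out[32])
{
	uint64_t bits = c->total * 8;	/* message length before padding */
	uint8_t b = 0x80;
	int i;

	sha256_update(c, &b, 1);
	for (b = 0; c->total % 64 != 56; )
		sha256_update(c, &b, 1);
	for (i = 7; i >= 0; i--) {	/* 64-bit big-endian bit count */
		b = (uint8_t)(bits >> (8 * i));
		sha256_update(c, &b, 1);
	}
	for (i = 0; i < 32; i++)
		out[i] = (uint8_t)(c->h[i / 4] >> (24 - 8 * (i % 4)));
}

/* One digest over all segments, in order. */
void segments_digest(const struct seg *segs, size_t n, uint8_t out[32])
{
	struct sha256 c;
	size_t i;

	sha256_init(&c);
	for (i = 0; i < n; i++)
		sha256_update(&c, segs[i].buf, segs[i].len);
	sha256_final(&c, out);
}

/* Returns 0 if the segments still match the digest stored at load time. */
int segments_verify(const struct seg *segs, size_t n, const uint8_t want[32])
{
	uint8_t got[32];
	int i, diff = 0;

	segments_digest(segs, n, got);
	for (i = 0; i < 32; i++)
		diff |= got[i] ^ want[i];
	return diff ? -1 : 0;
}
```

The digest would be computed once at load time and stored, then
segments_verify() would be called on the crash path before jumping to
the new kernel. Whether this code lives in purgatory or is linked into
the kernel is exactly the open question above; the sketch is the same
either way.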

Thanks
Vivek