Question regarding intel64 arch and page table setup
Eric W. Biederman
ebiederm at xmission.com
Wed Aug 11 23:21:33 EDT 2010
Neil Horman <nhorman at redhat.com> writes:
> On Wed, Aug 11, 2010 at 01:02:10PM -0700, H. Peter Anvin wrote:
>> On 08/11/2010 12:47 PM, Neil Horman wrote:
>> > Hey all-
>> > I've got a question regarding x86_64 and how Linux uses the paging
>> > hardware. I'm tinkering with ways to get kexec to boot a new kernel on panic
>> > without leaving long mode. The idea being that if we can do that, then we don't
>> > need to store the new kdump kernel below the 4G physical limit for 32 bit
>> > systems. In doing this though, I figured I would have to reinitialize the page
>> > table with an identity mapped set of page tables to cover all of ram and load
>> > that into cr3. My question is: is it safe to do so while paging is enabled?
>> > The docs I've read are unclear on that, and if I have to disable paging, that
>> > automatically drops me out of long mode, which is bad. I would think it's safe
>> > to do, since I imagined we had to do it on context switches in the scheduler, but
>> > the __switch_to implementation for x86_64 seems to do nothing but update the task
>> > register. Intel vol 3a says we need to update cr3, but I don't see where that
>> > happens, so I'm not sure if there's some automated bit that does a cr3 update
>> > safely when we write tr.
>> > Anywho, any guidance, clarification would be appreciated. Thanks!
>> > Neil
>> It is definitely safe to load a new CR3 while paging is enabled; it is done
>> all the time. The currently executing page needs to be mapped to the
>> same physical and virtual address in most kernels.
>> However, there are a *LOT* of issues with having a kernel that is
>> completely above 4 GiB. For one thing, a lot of device drivers simply
>> will not work if there is no memory below 4 GiB available to the
>> kernel. As such, I don't think you will be successful in this project.
> Thanks for all the info, guys. I hadn't considered that we couldn't access the
> 64 bit startup point for the bzImage. I just figured we could jump to
> startup_32 + 0x200 in the bzImage header once I had the page table bit set up.
> I hadn't considered the problems we might encounter with driver issues loading
> above 4 GiB and what have you, nor the starting of APs.
> Regardless, I'll keep tinkering. One more question. When setting up the page
> table in the panic boot case, is it sufficient to set up an identity map for the
> pages in the reserved crashkernel range, or do we need to identity map the
> entire range of ram?
You should be able to get away with simply using the page tables the
crashing/initial kernel sets up, as those should map all of memory,
and definitely all of memory the kernel you will be booting needs to
run in (the memory areas we tell it we are using).
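
For illustration, here is a minimal sketch, assuming 4-level long-mode
paging with 2 MiB pages, of what identity mapping a region such as the
crashkernel range would involve if you did build your own tables. All
names here (idmap_2m_entries, fill_identity_pd, the PTE_* macros) are
hypothetical and not taken from the kernel or kexec sources. It only
computes table indices and entry values in ordinary C; it does not
load CR3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the entry bits used below. */
#define PAGE_2M     (1ULL << 21)
#define PTE_PRESENT (1ULL << 0)
#define PTE_WRITE   (1ULL << 1)
#define PTE_PS      (1ULL << 7)   /* page-size bit: PD entry maps a 2 MiB page */

/* Each level of 4-level paging consumes 9 bits of the address. */
static inline unsigned pml4_index(uint64_t addr) { return (addr >> 39) & 0x1ff; }
static inline unsigned pdpt_index(uint64_t addr) { return (addr >> 30) & 0x1ff; }
static inline unsigned pd_index(uint64_t addr)   { return (addr >> 21) & 0x1ff; }

/* Number of 2 MiB page-directory entries needed to cover [start, end). */
static uint64_t idmap_2m_entries(uint64_t start, uint64_t end)
{
    assert(start < end);
    uint64_t first = start & ~(PAGE_2M - 1);                /* round down */
    uint64_t last  = (end + PAGE_2M - 1) & ~(PAGE_2M - 1);  /* round up   */
    return (last - first) / PAGE_2M;
}

/*
 * Fill one page directory (512 entries, 1 GiB of address space) so that
 * each entry maps a 2 MiB page at the same physical address as its
 * virtual address. gib_base must be 1 GiB aligned.
 */
static void fill_identity_pd(uint64_t pd[512], uint64_t gib_base)
{
    assert((gib_base & ((1ULL << 30) - 1)) == 0);
    for (unsigned i = 0; i < 512; i++)
        pd[i] = (gib_base + (uint64_t)i * PAGE_2M)
              | PTE_PRESENT | PTE_WRITE | PTE_PS;
}
```

With these helpers, a 128 MiB reservation starting at 16 MiB needs
idmap_2m_entries(16 MiB, 144 MiB) = 64 page-directory entries, and an
address at 4 GiB falls under pdpt_index 4 of pml4_index 0.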
I didn't do it much but I did test the 64-bit kernel entry point ages ago
when I did the first round of implementing everything.