[PATCH] Implement support for mem command line parameter

Vivek Goyal vgoyal at redhat.com
Mon Jun 2 23:04:40 EDT 2008


On Tue, Jun 03, 2008 at 01:29:44AM +0200, Bernhard Walle wrote:
> When the kernel is booted with the "mem" kernel command line (see
> Documentation/kernel-parameters.txt of kernel source tree), the /proc/iomem
> is not modified. Instead, it shows the whole memory space as "System RAM".
> I consider that as correct because the file is named "iomem", and for I/O,
> the behaviour makes sense.
> 

IIUC, in the past the behavior of /proc/iomem differed between i386 and
x86_64. One arch used to truncate it and the other did not. I don't remember
which one did what.

I had a quick look at the current code and it looks like truncation of the
e820 map takes place before request_resource(). In that case we should
see the truncated map. But I am not sure and will test it tomorrow...
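
One way to check which behavior a given kernel shows is to sum the
top-level "System RAM" entries of /proc/iomem and compare the total with
the mem= value. The following is only a rough sketch: the helper names
and the sample text are made up for illustration, and it assumes the
usual "start-end : name" line layout with nested entries indented.

```python
import re

# Matches top-level lines such as "00100000-7fffffff : System RAM".
# Indented (nested) entries, e.g. "  00200000-005fffff : Kernel code",
# do not match because the pattern is anchored at the start of the line.
_LINE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def system_ram_ranges(iomem_text):
    """Return inclusive (start, end) pairs of top-level System RAM entries."""
    ranges = []
    for line in iomem_text.splitlines():
        m = _LINE.match(line)
        if m and m.group(3).strip() == "System RAM":
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges


def total_ram_bytes(ranges):
    """Sum the sizes of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


# Hypothetical sample, not taken from a real machine.
SAMPLE = """\
00000000-0009fbff : System RAM
000a0000-000bffff : Video RAM area
00100000-7fffffff : System RAM
  00200000-005fffff : Kernel code
e0000000-efffffff : PCI Bus 0000:00
"""

if __name__ == "__main__":
    for start, end in system_ram_ranges(SAMPLE):
        print(f"{start:#012x}-{end:#012x}")
    print(total_ram_bytes(system_ram_ranges(SAMPLE)), "bytes of System RAM")
```

On a live system the text would come from reading /proc/iomem instead of
SAMPLE.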

I am not sure which one is the right behavior, as one can argue for both.
So it essentially boils down to two interfaces. One view corresponds to the
BIOS view and the other corresponds to the kernel view (the BIOS view as
overridden by user options). I think we need to export both to user space.
 
We need the BIOS view so that pure "kexec" can pass it to the new kernel,
irrespective of the view seen by the first kernel. For example, if the
system has 4G of memory but the user passed mem=2G, the first kernel uses
only 2G. If we then kexec a new kernel, it should see the full 4G of memory
as reported by the BIOS.

We need the user view for "kdump" purposes, as we don't want to dump memory
not used by the first kernel (you have already explained this).
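
To make the two views concrete, here is a small sketch of the 4G/mem=2G
example. It is an illustration only, not kexec-tools code: the function
names are hypothetical, ranges are half-open [start, end), the limit is
treated as an address limit (the x86 semantics discussed in this
thread), and the 4G of RAM is simplified to one contiguous range.

```python
GIB = 1 << 30


def kernel_view(bios_ram, mem_limit):
    """Clamp half-open RAM ranges so that nothing lies at or above mem_limit."""
    clamped = []
    for start, end in bios_ram:
        if start >= mem_limit:
            continue  # range lies entirely above the limit
        clamped.append((start, min(end, mem_limit)))
    return clamped


def map_for(mode, bios_ram, mem_limit):
    """Pick the view each use case needs (hypothetical helper).

    "kexec": the new kernel should see everything the BIOS reported.
    "kdump": only memory the first kernel actually used is worth dumping.
    """
    if mode == "kexec":
        return list(bios_ram)
    if mode == "kdump":
        return kernel_view(bios_ram, mem_limit)
    raise ValueError(f"unknown mode: {mode}")


# Simplified: one contiguous 4G range, first kernel booted with mem=2G.
bios_view = [(0, 4 * GIB)]

if __name__ == "__main__":
    print("kexec:", map_for("kexec", bios_view, 2 * GIB))
    print("kdump:", map_for("kdump", bios_view, 2 * GIB))
```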

> However, when the kernel is booted with the "mem" parameter, the user expects
> the crashdump to be as small as the system memory, not containing the whole
> unused system RAM. To implement this, there are several options:
> 
>  1. Modify /proc/iomem.
>  2. Add a new /proc/iomem_used or something like that, i.e. a new kernel
>     interface.
>  3. Parse /proc/meminfo to read the system RAM.
>  4. Parse /proc/cmdline to read the command line.
> 
> I chose the 4th possibility for several reasons.
> 
>  - The /proc/iomem interface should be stable and not modified. That may break
>    other stuff we don't know. It may also be difficult to convince kernel
>    maintainers.

We probably will not touch /proc/iomem. We need to create a new interface
that changes based on user options. That should not break any user space
applications, should it?

>  - We should not add yet another interface between kernel and userspace for
>    a feature 99 % of the people don't need and don't even know about.
> 
> The semantics of mem is different on different architectures. i386 and x86_64
> (x86) treat the limit specified on the command line as physical address limit

You mean the system RAM limit? PCI devices are still mapped at higher
physical addresses, so it is not really a physical address limit?
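
To illustrate the distinction, the sketch below parses a mem=nn[KMG]
option from a command line string and clamps only the "System RAM"
entries, leaving other resources (such as a PCI window) where they are.
Everything here is an assumption made for illustration: the helper
names, the sample map, the decision to honor the last mem= occurrence,
and the restriction to decimal values with an optional K, M or G suffix.

```python
import re

_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
GIB = 1 << 30


def parse_mem_limit(cmdline):
    """Return the byte value of the last mem=nn[KMG] option, or None."""
    limit = None
    for word in cmdline.split():
        # fullmatch ensures that e.g. "memmap=exactmap" is not picked up.
        m = re.fullmatch(r"mem=(\d+)([KMGkmg]?)", word)
        if m:
            limit = int(m.group(1)) * _SUFFIX[m.group(2).upper()]
    return limit


def apply_limit(regions, limit):
    """Clamp only "System RAM" entries; other resources pass through.

    regions holds (start, end, name) tuples with an exclusive end.
    """
    out = []
    for start, end, name in regions:
        if name == "System RAM":
            if start >= limit:
                continue  # RAM entirely above the limit is dropped
            end = min(end, limit)
        out.append((start, end, name))
    return out


# Hypothetical map: RAM below 3G, a PCI window, and more RAM above 4G.
REGIONS = [
    (0, 3 * GIB, "System RAM"),
    (3 * GIB, 4 * GIB, "PCI Bus 0000:00"),
    (4 * GIB, 5 * GIB, "System RAM"),
]

if __name__ == "__main__":
    limit = parse_mem_limit("root=/dev/sda1 ro mem=2G")
    for start, end, name in apply_limit(REGIONS, limit):
        print(f"{start:#x}-{end - 1:#x} : {name}")
```

With mem=2G, the PCI entry between 3G and 4G is still present in the
result even though it lies above the limit, which is why "system RAM
limit" seems the more accurate term.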

> while IA64 count the real memory. That is because of different practises of
> memory mapping on PC architecture vs. "new" architectures.
> 
> However, on x86 (which that implementation covers) it's most easy to read
> the /proc/cmdline and the mem parameter. That parameter should be very stable
> since bootloaders need to parse it, so no fancy features are likely to be
> added in future. So we can use that.
> 
> The new function limit_system_memory() now reads the memory map kexec built for

What happens if the user booted the first kernel with a user-specified map
(using memmap=exactmap)? What will /proc/iomem look like? I think it will
show the user-specified IO regions and ignore the BIOS map?

So let's say a system has 4G of RAM, and for testing purposes a user
boots with a user-specified map which says 1G of RAM is available. Now
we kexec into a second kernel. Should the second kernel see 4G of RAM or
1G of RAM? I feel the second kernel should see 4G of RAM.

Hence I feel that we need to create two views. /proc/iomem can serve
as the unmodified IO resource view as reported by the BIOS, and
/proc/iomem_used can serve as the modified view as seen by the kernel
(due to user options).

I think it's the hard way of doing things, as it might break something, but
I feel it will also make the semantics much clearer than patching things
up in user space.

Thanks
Vivek


