[PATCH 01/14] x86, boot: align the .bss section in the decompressor

Eric Dumazet dada1 at cosmosbay.com
Fri May 8 04:18:07 EDT 2009


Sam Ravnborg wrote:
> On Thu, May 07, 2009 at 03:26:49PM -0700, H. Peter Anvin wrote:
>> From: H. Peter Anvin <hpa at zytor.com>
>>
>> Aligning the .bss section makes it trivially faster, and makes using
>> larger transfers for the clear slightly easier.
>>
>> [ Impact: trivial performance enhancement, future patch prep ]
>>
>> Signed-off-by: H. Peter Anvin <hpa at zytor.com>
>> ---
>>  arch/x86/boot/compressed/vmlinux.lds.S |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
>> index 0d26c92..27c168d 100644
>> --- a/arch/x86/boot/compressed/vmlinux.lds.S
>> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
>> @@ -42,6 +42,7 @@ SECTIONS
>>  		*(.data.*)
>>  		_edata = . ;
>>  	}
>> +	. = ALIGN(32);
> 
> Where does this magic 32 come from?
> I would assume the better choice would be:
> 	. = ALIGN(L1_CACHE_BYTES);
> 
> So we match the relevant CPU.
> 
> In general, for alignment of output sections I see the need for:
> 1) Function call
> 2) L1_CACHE_BYTES
> 3) PAGE_SIZE
> 4) 2*PAGE_SIZE
> 
> But I see magic constants used here and there that do not match
> the above (when looking at all archs).
> So I act when I see a new 'magic' number.
> 

I totally agree

gcc itself has a strange 32-byte alignment rule (unless using -Os) for
objects of size >= 32 bytes. Did you know that?

$ cat try.c
char foo[32] = {1};
$ gcc -O -S try.c
        .file   "try.c"
.globl foo
        .data
        .align 32            <<< HERE , what a mess >>
        .type   foo, @object
        .size   foo, 32
foo:
        .byte   1
        .zero   31
        .ident  "GCC: (GNU) 4.4.0"
        .section        .note.GNU-stack,"",@progbits


This leaves many kernel .o files with a 2**5 alignment on their .data or per-CPU data sections.
At link time, this creates many holes.
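
To make the holes concrete, here is a minimal sketch in C of what happens
when input sections are laid out back to back. It is only an illustration:
struct input_section, align_up() and total_padding() are hypothetical names,
and the sizes used in the note below are made up, not measured from a kernel
build.

```c
#include <assert.h>
#include <stddef.h>

/* One input section contributed by an object file (illustrative model). */
struct input_section {
    size_t size;   /* bytes of payload */
    size_t align;  /* required alignment in bytes, a power of two */
};

/* Round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align)
{
    /* The mask trick below is only valid for powers of two. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Place the sections one after another and return the bytes lost to padding. */
static size_t total_padding(const struct input_section *s, size_t n)
{
    size_t offset = 0;
    size_t padding = 0;

    for (size_t i = 0; i < n; i++) {
        size_t start = align_up(offset, s[i].align);

        padding += start - offset;   /* hole in front of this section */
        offset = start + s[i].size;  /* next free byte */
    }
    return padding;
}
```

For three hypothetical sections of 36, 40 and 33 bytes, total_padding()
reports 52 bytes of holes when each asks for 32-byte alignment, and none when
each asks for 4-byte alignment.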

In my opinion, gcc should have an option for this that is separate from -Os, as -Os has
too expensive side effects on code speed.

I can save a lot of data space if I patch gcc-4.4.0/config/i386/i386.c

as follows:

/* Compute the alignment for a static variable.
   TYPE is the data type, and ALIGN is the alignment that
   the object would ordinarily have.  The value of this function is used
   instead of that alignment to align the object.  */

int
ix86_data_alignment (tree type, int align)
{
-  int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
+  int max_align = BITS_PER_WORD;

  if (AGGREGATE_TYPE_P (type)
      && TYPE_SIZE (type)
      && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
      && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
          || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
      && align < max_align)
    align = max_align;

  /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
     to 16byte boundary.  */
  if (TARGET_64BIT)
    {
      if (AGGREGATE_TYPE_P (type)
           && TYPE_SIZE (type)
           && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
           && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
               || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
        return 128;
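
The effect of the one-line change can be sketched with a small stand-alone
model of the first test in the function above. This is not GCC code: it
assumes a 32-bit target (BITS_PER_WORD taken as 32, so the TARGET_64BIT branch
is left out), assumes MIN (256, MAX_OFILE_ALIGNMENT) evaluates to 256, and only
models aggregates of known size. data_alignment_model() and its parameters are
hypothetical names.

```c
#include <stdbool.h>

/* Assumed values for a 32-bit x86 target; all quantities are in bits. */
enum { MODEL_BITS_PER_WORD = 32, MODEL_MAX_ALIGN = 256 };

/*
 * Simplified model of the first alignment test for an aggregate.
 * size_bits:     size of the object
 * align_bits:    alignment the object would ordinarily have
 * optimize_size: true when compiling with -Os
 * patched:       true to model the proposed one-line change
 */
static unsigned data_alignment_model(unsigned size_bits, unsigned align_bits,
                                     bool optimize_size, bool patched)
{
    unsigned max_align = (patched || optimize_size)
                             ? MODEL_BITS_PER_WORD
                             : MODEL_MAX_ALIGN;

    /* Objects at least max_align bits large are raised to max_align. */
    if (size_bits >= max_align && align_bits < max_align)
        align_bits = max_align;

    return align_bits;
}
```

For char foo[32] (256 bits, natural alignment 8 bits) the model gives 256
bits, i.e. the 32-byte .align seen in the listing, for a plain -O build, and
32 bits (4 bytes) either with -Os or with the patch applied.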




