traversing vma on nommu

Liam R. Howlett Liam.Howlett at oracle.com
Wed Nov 6 09:23:49 PST 2024


* Hajime Tazaki <thehajime at gmail.com> [241105 18:48]:
> 
> Hello Liam,
> 
> Thanks for taking the time to give me useful information, despite my
> insufficient report of the situation; I apologize for my laziness.
> 

...

> I pulled/rebased v6.12-rc6 and the issue still persists.

Thanks.

> 
> Thread 1 "vmlinux" received signal SIGSEGV, Segmentation fault.
> acct_collect (exitcode=exitcode@entry=256, group_dead=group_dead@entry=1) at ../kernel/acct.c:565
> 565      vsize += vma->vm_end - vma->vm_start;
> (gdb) p vma
> $4 = (struct vm_area_struct *) 0xe
> (gdb) p vmi
> $5 = {
>   mas = {
>     tree = 0x703e8708,
>     index = 1886371840,
>     last = 1886371839,
>     node = 0x70bef30c,
>     min = 0,
>     max = 1886371839,
>     alloc = 0x0,
>     status = ma_active,
>     depth = 1 '\001',
>     offset = 15 '\017',
>     mas_flags = 0 '\000',
>     end = 14 '\016',
      ^^^^^^^^

>     store_type = wr_invalid
>   }
> }

...

> 
> (gdb) p vma
> $4 = (struct vm_area_struct *) 0xe

0xe is 14, or the end of the node.  offset 15 is the last slot in a
node.  We use the last slot to store the end of the node if there isn't
any data to store there.  So what is happening is the maple state is
incorrect and ends up returning the metadata as the vma.
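
To illustrate how the metadata can come back looking like a pointer, here
is a minimal userspace sketch.  It is a simplified model and not the
kernel's node layout: model_node, model_meta, read_slot_raw() and
demo_last_slot() are hypothetical names, the 16-slot count is taken from
"offset 15 is the last slot" above, and the result assumes a
little-endian machine.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_SLOTS 16	/* assumed: offset 15 is the last slot */

/* Hypothetical stand-in for the per-node metadata (end of data, gap). */
struct model_meta {
	unsigned char end;
	unsigned char gap;
};

/* Simplified model: the metadata shares storage with the last slot. */
struct model_node {
	union {
		void *slot[MODEL_SLOTS];
		struct {
			void *pad[MODEL_SLOTS - 1];
			struct model_meta meta;	/* overlays slot[15] */
		};
	};
};

/* Read a slot without checking the offset against meta.end. */
static uintptr_t read_slot_raw(const struct model_node *n, unsigned int offset)
{
	uintptr_t v;

	memcpy(&v, &n->slot[offset], sizeof(v));
	return v;
}

/* Build an otherwise empty node whose end is 14 and read the last slot. */
static uintptr_t demo_last_slot(void)
{
	struct model_node n;

	memset(&n, 0, sizeof(n));
	n.meta.end = 14;
	return read_slot_raw(&n, MODEL_SLOTS - 1);
}
```

In this model, demo_last_slot() yields 0xe on a little-endian build, which
matches the bogus vma value in the gdb output.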

This is very helpful.

The loop in acct_collect() has the correct locking.  Iterating with
for_each_vma() will eventually call mas_find() with an upper limit of
ULONG_MAX.

This leads to the following code path: in mas_find(), mas_find_setup()
will quickly return false, and it will then try to get the next entry
with mas_next_slot().

The offset would be equal to or larger than the end, so it would proceed
to mas_next_node().

mas_next_node() should walk up the tree following the node's parent
pointer (note that the node above is encoded and would actually be at
0x70bef300), move to the next node, and return so the search for the
next entry can continue, with offset set to zero and
mas->index = mas->min.
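
As a small sketch of the node decoding mentioned above: the mask value
below is an assumption (nodes aligned so that the low 8 bits of the
encoded pointer are free to carry type information), and
MODEL_NODE_MASK and model_to_node() are hypothetical names rather than
the kernel's helpers.

```c
#include <stdint.h>

/* Assumption: the low 8 bits of an encoded node pointer are not address bits. */
#define MODEL_NODE_MASK	0xffUL

/* Strip the encoded low bits to recover the node's address. */
static uintptr_t model_to_node(uintptr_t enode)
{
	return enode & ~MODEL_NODE_MASK;
}
```

With the node value from the dump, model_to_node(0x70bef30c) gives
0x70bef300.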

What I see above seems to show that mas->index was updated but the offset
was not, and neither were the rest of the settings from that code block.
mas->last is wrong as it is less than index.  I am not sure how the
iterator arrived at this state.
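
The two inconsistencies can be written down as checks against the dumped
values.  This is only a sketch: model_state and the helper names are
hypothetical, and treating "offset greater than end" as invalid is my
reading of the state above, not a check lifted from the maple tree code.

```c
#include <stdbool.h>

/* Hypothetical subset of the iterator state shown in the gdb dump. */
struct model_state {
	unsigned long index;
	unsigned long last;
	unsigned char offset;
	unsigned char end;
};

/* The range being iterated must not be inverted. */
static bool range_ok(const struct model_state *s)
{
	return s->index <= s->last;
}

/* Assumed sanity rule: the offset should not point past the end of the data. */
static bool offset_ok(const struct model_state *s)
{
	return s->offset <= s->end;
}

/* The values printed by gdb in the report. */
static struct model_state dumped_state(void)
{
	struct model_state s = {
		.index = 1886371840UL,
		.last = 1886371839UL,
		.offset = 15,
		.end = 14,
	};

	return s;
}
```

Both range_ok() and offset_ok() return false for dumped_state(): last is
one less than index, and offset 15 is past end 14.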

I cannot see a way for the code to get to offset = 15; it is set to zero
on return from mas_next_node(), and if we rewalk then the entire maple
state will be reset through mas_start().

Can you add some (totally untested) debug to the exit path in
kernel/acct.c?

+#include <linux/maple_tree.h>

...
		mmap_read_lock(mm);
+		mt_dump(&mm->mm_mt, mt_dump_hex);
+		mt_validate(&mm->mm_mt);
		for_each_vma(vmi, vma)
			vsize += vma->vm_end - vma->vm_start;
		mmap_read_unlock(mm);

You will need to enable CONFIG_DEBUG_MAPLE_TREE on the kernel.

Thanks,
Liam


