[PATCH v2 1/4] mm: Add optional close() to struct vm_special_mapping

Nathan Chancellor nathan at kernel.org
Mon Aug 19 12:51:20 PDT 2024


On Mon, Aug 19, 2024 at 12:29:34PM -0700, Linus Torvalds wrote:
> On Mon, 19 Aug 2024 at 11:53, Nathan Chancellor <nathan at kernel.org> wrote:
> >
> >
> > Modules linked in:
> > Pid: 24, comm: mount Not tainted 6.11.0-rc4-next-20240819
> > RIP: 0033:0x68006f6c
> > RSP: 000000006c8bfc68  EFLAGS: 00010206
> > RAX: 0000000068006f6c RBX: 0000000068a0aa18 RCX: 00000000600d8b09
> > RDX: 0000000000000000 RSI: 0000000068a0aa18 RDI: 0000000068805120
> > RBP: 000000006c8bfc70 R08: 0000000000000001 R09: 0000000068ae0308
> > R10: 000000000000000e R11: ffffffffffffffff R12: 0000000000000001
> > R13: 0000000068a0aa18 R14: 0000000000000015 R15: 0000000068944a88
> > Kernel panic - not syncing: Segfault with no mm
> > CPU: 0 UID: 0 PID: 24 Comm: mount Not tainted 6.11.0-rc4-next-20240819 #1
> > Stack:
> >  600caeff 6c8bfc90 600d8b2a 68944a80
> >  00000047 6c8bfda0 600cbfd9 6c8bfd50
> >  68944ad0 68944a88 7f7ffff000 7f7fffffff
> > Call Trace:
> >  [<600caeff>] ? special_mapping_close+0x16/0x19
> 
> Hmm. No "Code:" line? Did you just edit that out, or maybe UML doesn't
> print one out?

Nope, no editing, it is straight from my terminal. I guess UML just doesn't
print one.

> Anyway, for me that special_mapping_close() disassembles to
> 
> 
>  <+0>:  mov    %rdi,%rsi
>  <+3>:  mov    0x78(%rdi),%rdi
>  <+7>:  mov    0x20(%rdi),%rax
>  <+11>: test   %rax,%rax
>  <+14>: je     0x600caa11 <special_mapping_close+24>
>  <+16>: push   %rbp
>  <+17>: mov    %rsp,%rbp
>  <+20>: call   *%rax
>  <+22>: pop    %rbp
>  <+23>: ret
>  <+24>: ret
> 
> which may just match yours, because special_mapping_close+0x16 is
> obviously that +22, and it's the return point for that call.

Yeah seems like it, objdump -dr shows:

0000000000000027 <special_mapping_close>:
      27:   48 89 fe                mov    %rdi,%rsi
      2a:   48 8b 7f 78             mov    0x78(%rdi),%rdi
      2e:   48 8b 47 20             mov    0x20(%rdi),%rax
      32:   48 85 c0                test   %rax,%rax
      35:   74 08                   je     3f <special_mapping_close+0x18>
      37:   55                      push   %rbp
      38:   48 89 e5                mov    %rsp,%rbp
      3b:   ff d0                   call   *%rax
      3d:   5d                      pop    %rbp
      3e:   c3                      ret
      3f:   c3                      ret

> And your %rax value does match that invalid %rip value of 0x68006f6c.
> 
> So it does look like it's jumping off to la-la-land, and the problem is the code
> 
>         const struct vm_special_mapping *sm = vma->vm_private_data;
> 
>         if (sm->close)
>                 sm->close(sm, vma);
> 
> where presumably 'vm_private_data' isn't a "struct vm_special_mapping *" at all.
> 
> And I think I see the problem.
> 
> When we have that 'legacy_special_mapping_vmops', then the
> vm_private_data field actually points to 'pages'.
> 
> So the 'legacy_special_mapping_vmops' can *only* contain the '.fault'
> handler, not the other handlers.
> 
> IOW, does something like this fix it?
> 
>   --- a/mm/mmap.c
>   +++ b/mm/mmap.c
>   @@ -2095,7 +2095,6 @@ static const struct vm_operations_struct
> special_mapping_vmops = {
>    };
> 
>    static const struct vm_operations_struct legacy_special_mapping_vmops = {
>   -       .close = special_mapping_close,
>           .fault = special_mapping_fault,
>    };

Yes, that appears to fix it for me. I don't have much to say about the
rest but others might :)

Cheers,
Nathan



More information about the linux-um mailing list