[PATCH] mm: vmalloc: simplify vread/vwrite to use existing mappings
Ard Biesheuvel
ard.biesheuvel at linaro.org
Thu Jun 8 09:15:13 PDT 2017
On 8 June 2017 at 16:06, Russell King - ARM Linux <linux at armlinux.org.uk> wrote:
> On Wed, Jun 07, 2017 at 06:20:52PM +0000, Ard Biesheuvel wrote:
>> The current safe path iterates over each mapping page by page, and
>> kmap()'s each one individually, which is expensive and unnecessary.
>> Instead, let's use kern_addr_valid() to establish on a per-VMA basis
>> whether we may safely dereference its pages, and do so via their
>> mapping in the VMALLOC region. This is safe because we are holding
>> the vmap_area_lock spinlock.
>
> This doesn't sound correct if you look at the definition of
> kern_addr_valid(). For example, x86-32 has:
>
> /*
> * kern_addr_valid() is (1) for FLATMEM and (0) for
> * SPARSEMEM and DISCONTIGMEM
> */
> #ifdef CONFIG_FLATMEM
> #define kern_addr_valid(addr) (1)
> #else
> #define kern_addr_valid(kaddr) (0)
> #endif
>
> The majority of architectures simply do:
>
> #define kern_addr_valid(addr) (1)
>
That is interesting, thanks for pointing it out.
The function read_kcore() [which is where the issue I am trying to fix
originates] currently has this logic:
	if (kern_addr_valid(start)) {
		unsigned long n;

		/*
		 * Using bounce buffer to bypass the
		 * hardened user copy kernel text checks.
		 */
		memcpy(buf, (char *) start, tsz);
		n = copy_to_user(buffer, buf, tsz);
		/*
		 * We cannot distinguish between fault on source
		 * and fault on destination. When this happens
		 * we clear too and hope it will trigger the
		 * EFAULT again.
		 */
		if (n) {
			if (clear_user(buffer + tsz - n, n))
				return -EFAULT;
		}
	} else {
		if (clear_user(buffer, tsz))
			return -EFAULT;
	}
and the implementation I looked at [on arm64] happens to be the only
one that does something non-trivial.
> So, the result is that on the majority of architectures, we're now
> going to simply dereference 'addr' with very little in the way of
> checks.
>
Indeed.
> I think this makes these functions racy - the point at which the
> entry is placed onto the vmalloc list is quite different from the
> point where the page table entries for it are populated (which
> happens with the lock dropped.) So, I think this is asking for
> an oops.
>
Fair enough. I will try to find a different approach then.