[PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults

Andreas Gruenbacher agruenba at redhat.com
Thu Nov 25 13:40:54 PST 2021


On Thu, Nov 25, 2021 at 10:02 PM Matthew Wilcox <willy at infradead.org> wrote:
> On Thu, Nov 25, 2021 at 08:43:57PM +0000, Catalin Marinas wrote:
> > > I really believe that the fix is to make the read/write probing just
> > > be more aggressive.
> > >
> > > Make the read/write probing require that AT LEAST <n> bytes be
> > > readable/writable at the beginning, where 'n' is 'min(len,ALIGN)', and
> > > ALIGN is whatever size that copy_from/to_user_xyz() might require just
> > > because it might do multi-byte accesses.
> > >
> > > In fact, make ALIGN be perhaps something reasonable like 512 bytes or
> > > whatever, and then you know you can handle the btrfs "copy a whole
> > > structure and reset if that fails" case too.
> >
> > IIUC what you are suggesting, we still need changes to the btrfs loop
> > similar to willy's but that should work fine together with a slightly
> > more aggressive fault_in_writable().
> >
> > A probing of at least sizeof(struct btrfs_ioctl_search_key) should
> > suffice without any loop changes and 512 would cover it but it doesn't
> > look generic enough. We could pass a 'probe_prefix' argument to
> > fault_in_exact_writeable() to only probe this and btrfs would just
> > specify the above sizeof().
>
> How about something like this?
>
> +++ b/mm/gup.c
> @@ -1672,6 +1672,13 @@ size_t fault_in_writeable(char __user *uaddr, size_t size)
>
>         if (unlikely(size == 0))
>                 return 0;
> +       if (SUBPAGE_PROBE_INTERVAL) {
> +               while (uaddr < PAGE_ALIGN((unsigned long)uaddr)) {
> +                       if (unlikely(__put_user(0, uaddr) != 0))
> +                               goto out;
> +                       uaddr += SUBPAGE_PROBE_INTERVAL;
> +               }
> +       }
>         if (!PAGE_ALIGNED(uaddr)) {
>                 if (unlikely(__put_user(0, uaddr) != 0))
>                         return size;
>
> ARM then defines SUBPAGE_PROBE_INTERVAL to be 16 and the rest of us
> leave it as 0.  That way we probe all the way to the end of the current
> page and the start of the next page.
>
> Oh, that needs to be checked to not exceed size as well ... anyway,
> you get the idea.

Note that we don't need this additional probing when accessing the
buffer with byte granularity. That's probably the common case.

Andreas




More information about the linux-arm-kernel mailing list