[PATCH v2 09/17] mm: Add unsafe_follow_pfn

Mauro Carvalho Chehab mchehab+huawei at kernel.org
Fri Oct 9 06:34:21 EDT 2020


Hi,

On Fri,  9 Oct 2020 09:59:26 +0200
Daniel Vetter <daniel.vetter at ffwll.ch> wrote:

> Way back it was a reasonable assumption that iomem mappings never
> change the pfn range they point at. But this has changed:
> 
> - gpu drivers dynamically manage their memory nowadays, invalidating
> ptes with unmap_mapping_range when buffers get moved
> 
> - contiguous dma allocations have moved from dedicated carveouts to
> cma regions. This means that if we miss the unmap, the pfn might contain
> pagecache or anon memory (well, anything allocated with __GFP_MOVABLE)
> 
> - even /dev/mem now invalidates mappings when the kernel requests that
> iomem region if CONFIG_IO_STRICT_DEVMEM is set; see 3234ac664a87
> ("/dev/mem: Revoke mappings when a driver claims the region")
> 
> Accessing pfns obtained from ptes without holding all the locks is
> therefore no longer a good idea.
> 
> Unfortunately there are some users where this is not fixable (like v4l
> userptr of iomem mappings) or involves a pile of work (vfio type1
> iommu). For now annotate these as unsafe and splat appropriately.
> 
> This patch adds an unsafe_follow_pfn, which later patches will then
> roll out to all appropriate places.
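
In other words, later patches in the series mechanically convert each
call site that cannot hold the required locks, along these lines (an
illustrative hunk, not one taken verbatim from the series):

-	ret = follow_pfn(vma, vaddr, &pfn);
+	ret = unsafe_follow_pfn(vma, vaddr, &pfn);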

NACK, as this breaks an existing userspace API on media.

While I agree that userptr support on media is something that new
drivers may not offer, as DMABUF is a better way of handling it,
changing this for existing drivers is a big no, as it may break
userspace.

The right approach here would be to allow fine-tuning support for it
on a per-driver basis, e.g. disabling the feature only for drivers
that would use movable pages.

The media subsystem already has a way to disable USERPTR support in
VB2; the right approach would be to ensure that newer drivers only
enable it when they won't use movable pages, along the lines of the
sketch below.
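
A minimal sketch of that per-driver opt-out, assuming the usual VB2
queue setup (all mydrv_* names are hypothetical):

#include <media/videobuf2-v4l2.h>
#include <media/videobuf2-dma-contig.h>

struct mydrv_dev {
	struct vb2_queue queue;
	struct mutex lock;
	/* ... */
};

struct mydrv_buffer {
	struct vb2_v4l2_buffer vb;	/* must be the first member */
	struct list_head list;
};

/* mydrv_qops (queue_setup, buf_queue, ...) assumed defined elsewhere */

static int mydrv_init_queue(struct mydrv_dev *dev)
{
	struct vb2_queue *q = &dev->queue;

	q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	/* No VB2_USERPTR here: userspace is limited to MMAP and DMABUF */
	q->io_modes = VB2_MMAP | VB2_DMABUF;
	q->drv_priv = dev;
	q->buf_struct_size = sizeof(struct mydrv_buffer);
	q->ops = &mydrv_qops;
	q->mem_ops = &vb2_dma_contig_memops;
	q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
	q->lock = &dev->lock;

	return vb2_queue_init(q);
}

As USERPTR is opt-in via vb2_queue->io_modes, VIDIOC_REQBUFS with
V4L2_MEMORY_USERPTR then simply fails for such a driver, with no
change to the core follow_pfn() semantics.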

Regards,
Mauro

> 
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> Cc: Jason Gunthorpe <jgg at ziepe.ca>
> Cc: Kees Cook <keescook at chromium.org>
> Cc: Dan Williams <dan.j.williams at intel.com>
> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: John Hubbard <jhubbard at nvidia.com>
> Cc: Jérôme Glisse <jglisse at redhat.com>
> Cc: Jan Kara <jack at suse.cz>
> Cc: linux-mm at kvack.org
> Cc: linux-arm-kernel at lists.infradead.org
> Cc: linux-samsung-soc at vger.kernel.org
> Cc: linux-media at vger.kernel.org
> Cc: kvm at vger.kernel.org
> ---
>  include/linux/mm.h |  2 ++
>  mm/memory.c        | 32 +++++++++++++++++++++++++++++++-
>  mm/nommu.c         | 17 +++++++++++++++++
>  security/Kconfig   | 13 +++++++++++++
>  4 files changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2a16631c1fda..ec8c90928fc9 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1653,6 +1653,8 @@ int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
>  		   pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp);
>  int follow_pfn(struct vm_area_struct *vma, unsigned long address,
>  	unsigned long *pfn);
> +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> +		      unsigned long *pfn);
>  int follow_phys(struct vm_area_struct *vma, unsigned long address,
>  		unsigned int flags, unsigned long *prot, resource_size_t *phys);
>  int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index f7cbc4dde0ef..7c7b234ffb24 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4821,7 +4821,12 @@ EXPORT_SYMBOL(follow_pte_pmd);
>   * @address: user virtual address
>   * @pfn: location to store found PFN
>   *
> - * Only IO mappings and raw PFN mappings are allowed.
> + * Only IO mappings and raw PFN mappings are allowed. Note that callers must
> + * ensure coherency with pte updates by using a &mmu_notifier to follow updates.
> + * If this is not feasible, or the access to the @pfn is only very short term,
> + * use follow_pte_pmd() instead and hold the pagetable lock for the duration of
> + * the access. Any caller not following these requirements must use
> + * unsafe_follow_pfn() instead.
>   *
>   * Return: zero and the pfn at @pfn on success, -ve otherwise.
>   */
> @@ -4844,6 +4849,31 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
>  }
>  EXPORT_SYMBOL(follow_pfn);
>  
> +/**
> + * unsafe_follow_pfn - look up PFN at a user virtual address
> + * @vma: memory mapping
> + * @address: user virtual address
> + * @pfn: location to store found PFN
> + *
> + * Only IO mappings and raw PFN mappings are allowed.
> + *
> + * Return: zero and the pfn at @pfn on success, -ve otherwise.
> + */
> +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> +	unsigned long *pfn)
> +{
> +#ifdef CONFIG_STRICT_FOLLOW_PFN
> +	pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n");
> +	return -EINVAL;
> +#else
> +	WARN_ONCE(1, "unsafe follow_pfn usage\n");
> +	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> +
> +	return follow_pfn(vma, address, pfn);
> +#endif
> +}
> +EXPORT_SYMBOL(unsafe_follow_pfn);
> +
>  #ifdef CONFIG_HAVE_IOREMAP_PROT
>  int follow_phys(struct vm_area_struct *vma,
>  		unsigned long address, unsigned int flags,
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 75a327149af1..3db2910f0d64 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -132,6 +132,23 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
>  }
>  EXPORT_SYMBOL(follow_pfn);
>  
> +/**
> + * unsafe_follow_pfn - look up PFN at a user virtual address
> + * @vma: memory mapping
> + * @address: user virtual address
> + * @pfn: location to store found PFN
> + *
> + * Only IO mappings and raw PFN mappings are allowed.
> + *
> + * Return: zero and the pfn at @pfn on success, -ve otherwise.
> + */
> +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> +	unsigned long *pfn)
> +{
> +	return follow_pfn(vma, address, pfn);
> +}
> +EXPORT_SYMBOL(unsafe_follow_pfn);
> +
>  LIST_HEAD(vmap_area_list);
>  
>  void vfree(const void *addr)
> diff --git a/security/Kconfig b/security/Kconfig
> index 7561f6f99f1d..48945402e103 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -230,6 +230,19 @@ config STATIC_USERMODEHELPER_PATH
>  	  If you wish for all usermode helper programs to be disabled,
>  	  specify an empty string here (i.e. "").
>  
> +config STRICT_FOLLOW_PFN
> +	bool "Disable unsafe use of follow_pfn"
> +	depends on MMU
> +	help
> +	  Some functionality in the kernel follows userspace mappings to iomem
> +	  ranges in an unsafe manner. Examples include v4l userptr for zero-copy
> +	  buffer sharing.
> +
> +	  If this option is switched on, such access is rejected. Only disable
> +	  this option when you must run userspace which requires this.
> +
> +	  If in doubt, say Y.
> +
>  source "security/selinux/Kconfig"
>  source "security/smack/Kconfig"
>  source "security/tomoyo/Kconfig"



Thanks,
Mauro


