[PATCH 00/19] mm: Support huge pfnmaps

Kefeng Wang wangkefeng.wang at huawei.com
Mon Aug 19 06:14:26 PDT 2024



On 2024/8/16 22:33, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 11:05:33AM +0800, Kefeng Wang wrote:
>>
>>
>> On 2024/8/16 3:20, Peter Xu wrote:
>>> On Wed, Aug 14, 2024 at 09:37:15AM -0300, Jason Gunthorpe wrote:
>>>>> Currently, only x86_64 (1G+2M) and arm64 (2M) are supported.
>>>>
>>>> There is definitely interest here in extending ARM to support the 1G
>>>> size too, what is missing?
>>>
>>> Currently PUD pfnmap relies on THP_PUD config option:
>>>
>>> config ARCH_SUPPORTS_PUD_PFNMAP
>>> 	def_bool y
>>> 	depends on ARCH_SUPPORTS_HUGE_PFNMAP && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>>>
>>> Arm64 unfortunately doesn't yet support dax 1G, so not applicable yet.
>>>
>>> Ideally, pfnmap is too simple comparing to real THPs and it shouldn't
>>> require to depend on THP at all, but we'll need things like below to land
>>> first:
>>>
>>> https://lore.kernel.org/r/20240717220219.3743374-1-peterx@redhat.com
>>>
>>> I sent that first a while ago, but I didn't collect enough inputs, and I
>>> decided to unblock this series from that, so x86_64 shouldn't be affected,
>>> and arm64 will at least start to have 2M.
>>>
>>>>
>>>>> The other trick is how to allow gup-fast working for such huge mappings
>>>>> even if there's no direct sign of knowing whether it's a normal page or
>>>>> MMIO mapping.  This series chose to keep the pte_special solution, so that
>>>>> it reuses similar idea on setting a special bit to pfnmap PMDs/PUDs so that
>>>>> gup-fast will be able to identify them and fail properly.
>>>>
>>>> Make sense
>>>>
>>>>> More architectures / More page sizes
>>>>> ------------------------------------
>>>>>
>>>>> Currently only x86_64 (2M+1G) and arm64 (2M) are supported.
>>>>>
>>>>> For example, if arm64 can start to support THP_PUD one day, the huge pfnmap
>>>>> on 1G will be automatically enabled.
>>
>> A draft patch to enable THP_PUD on arm64, only passed with DEBUG_VM_PGTABLE,
>> we may test pud pfnmaps on arm64.
> 
> Thanks, Kefeng.  It'll be great if this works already, as simple.
> 
> Might be interesting to know whether it works already if you have some
> few-GBs GPU around on the systems.
> 
> Logically as long as you have HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD selected
> below, 1g pfnmap will be automatically enabled when you rebuild the kernel.
> You can double check that by looking for this:
> 
>    CONFIG_ARCH_SUPPORTS_PUD_PFNMAP=y
> 
> And you can try to observe the mappings by enabling dynamic debug for
> vfio_pci_mmap_huge_fault(), then map the bar with vfio-pci and read
> something from it.


I don't have such a device, but we wrote a test driver which calls
vmf_insert_pfn_pmd/pud from its huge_fault handler,

static const struct vm_operations_struct test_vm_ops = {
	.huge_fault = test_huge_fault,
	...
};

and read/write the mapping after mmap(,2M/1G,test_fd,...). It works as
expected. Since the THP_PUD enablement could also be used by dax, let's
send it separately.
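
For reference, the userspace side of such a test can be sketched roughly as
below. This is only an illustration, not the actual test code: the helper
name touch_mapping() and the pattern values are made up here, and the fd is
assumed to come from open() on the test driver's device node. It maps the
range MAP_SHARED, writes the first and last 64-bit word, and reads them
back. It does not check which page table level ended up backing the
mapping; whether a PMD/PUD entry is installed also depends on the alignment
of the address the kernel picks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map len bytes of fd, write a pattern to the first
 * and last 64-bit word, and read both back.
 * Returns 0 if both reads match, -1 on mmap failure or mismatch.
 */
static int touch_mapping(int fd, size_t len)
{
	const uint64_t first_pat = 0x1122334455667788ULL; /* arbitrary */
	const uint64_t last_pat = 0x8877665544332211ULL;  /* arbitrary */
	volatile uint64_t *p;
	size_t last;
	int ok;

	assert(len >= sizeof(uint64_t));

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	/* Touch both ends so the first and last page of the range fault in. */
	last = len / sizeof(*p) - 1;
	p[0] = first_pat;
	p[last] = last_pat;
	ok = (p[0] == first_pat) && (p[last] == last_pat);

	munmap((void *)p, len);
	return ok ? 0 : -1;
}
```

A caller would pass the opened test fd together with a length of 2M
(1UL << 21) or 1G (1UL << 30) and treat a non-zero return as a failure.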
