[EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero

Florian Fainelli florian.fainelli at broadcom.com
Thu Jan 4 10:28:38 PST 2024


On 1/4/24 04:38, Elad Nachman wrote:
> 
> 
>> -----Original Message-----
>> From: Florian Fainelli <florian.fainelli at broadcom.com>
>> Sent: Wednesday, January 3, 2024 8:32 PM
>> To: Will Deacon <will at kernel.org>; Elad Nachman
>> <enachman at marvell.com>
>> Cc: catalin.marinas at arm.com; thunder.leizhen at huawei.com;
>> bhe at redhat.com; akpm at linux-foundation.org; yajun.deng at linux.dev;
>> chris.zjh at huawei.com; linux-kernel at vger.kernel.org; linux-arm-
>> kernel at lists.infradead.org
>> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above
>> zero
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> On 1/3/24 09:45, Will Deacon wrote:
>>> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
>>>> From: Elad Nachman <enachman at marvell.com>
>>>>
>>>> Some SoCs, like the Marvell AC5/X/IM, have a combination
>>>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
>>>> limited to 31 or 32 bits of addressing.
>>>> This requires arranging ZONE_DMA and ZONE_DMA32 properly for
>>>> these SoCs, so that swiotlb and coherent DMA allocation work
>>>> correctly.
>>>> Change the initialization so the device tree DMA zone bits are
>>>> taken as a function of the offset from the DRAM start, and, when
>>>> calculating the maximal zone physical RAM address for physical
>>>> DDR starting above 32-bit, combine the physical start address
>>>> with the zone mask passed as parameter.
>>>> This creates the proper zone split for these SoCs:
>>>> 0..2GB for ZONE_DMA
>>>> 2GB..4GB for ZONE_DMA32
>>>> 4GB..8GB for ZONE_NORMAL
>>>>
>>>> Signed-off-by: Elad Nachman <enachman at marvell.com>
>>>> ---
>>>>    arch/arm64/mm/init.c | 20 +++++++++++++++-----
>>>>    1 file changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 74c1db8ce271..8288c778916e 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -115,20 +115,21 @@ static void __init
>> arch_reserve_crashkernel(void)
>>>>
>>>>    /*
>>>>     * Return the maximum physical address for a zone accessible by the
>> given bits
>>>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
>>>> - * available memory, otherwise cap it at 32-bit.
>>>> + * limit. If DRAM starts above 32-bit, expand the zone to the available
>> memory
>>>> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
>>>>     */
>>>>    static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>>>>    {
>>>>    	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>>>>    	phys_addr_t phys_start = memblock_start_of_DRAM();
>>>> +	phys_addr_t phys_end = memblock_end_of_DRAM();
>>>>
>>>>    	if (phys_start > U32_MAX)
>>>> -		zone_mask = PHYS_ADDR_MAX;
>>>> +		zone_mask = phys_start | zone_mask;
>>>>    	else if (phys_start > zone_mask)
>>>>    		zone_mask = U32_MAX;
>>>>
>>>> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
>>>> +	return min(zone_mask, phys_end - 1) + 1;
>>>>    }
>>>>
>>>>    static void __init zone_sizes_init(void)
>>>> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
>>>>
>>>>    #ifdef CONFIG_ZONE_DMA
>>>>    	acpi_zone_dma_bits =
>> fls64(acpi_iort_dma_get_max_cpu_address());
>>>> -	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
>>>> +	/*
>>>> +	 * When calculating the DMA zone bits from the device tree, subtract
>>>> +	 * the DRAM start address, in case it does not start at address
>>>> +	 * zero. This way, we pass only the zone-size-related bits to
>>>> +	 * max_zone_phys(), which will add them to the base of the DRAM.
>>>> +	 * This prevents miscalculations on arm64 SoCs which combine
>>>> +	 * DDR starting above 4GB with memory controllers limited to
>>>> +	 * 32 bits or less:
>>>> +	 */
>>>> +	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) -
>> memblock_start_of_DRAM());
>>>>    	zone_dma_bits = min3(32U, dt_zone_dma_bits,
>> acpi_zone_dma_bits);
>>>>    	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>>>>    	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>>>
>>> Hmm, I'm a bit worried this could regress other platforms since you seem
>>> to be assuming that DMA address 0 corresponds to the physical start of
>>> DRAM. What if that isn't the case?
>>
>> All of our most recent Set-top-box SoCs map DRAM starting at PA
>> 0x4000_0000 FWIW. We have the following memory maps:
> 
> That is below 4GB.

Right, I was just making sure that there would be no regressions with 
non-zero base addresses for DRAM, not that there should be, but getting 
Tested-by tags for such changes is always a good thing IMHO.

> 
>>
>> 1. DRAM mapped at PA 0x0000_0000
>> 2. DRAM mapped at PA 0x4000_0000 with another memory controller
>> starting
>> at 0x3_0000_0000
>> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>>
>> Here is the before/after diff with debugging prints introduced to print
>> start, end, min, mask and the dt_zone_dma_bits value:
>>
>> Memory map 1. with 2GB -> no change in output
>>
>> Memory map 2. with 2+2GB -> no change in output
>>
>> Memory map 3. with 4GB:
>>
>> @@ -39,7 +39,7 @@
>>    [    0.000000] OF: reserved mem:
>> 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>>    [    0.000000] OF: reserved mem:
>> 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable
>> SRR at fe000000
>>    [    0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>> -[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
>> +[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>>    [    0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>>    [    0.000000] Zone ranges:
>>    [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
>> @@ -88,243 +88,243 @@
>>
>> Memory map 3. with 2GB:
>>
>> @@ -39,11 +39,11 @@
>>    [    0.000000] OF: reserved mem:
>> 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>>    [    0.000000] OF: reserved mem:
>> 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable
>> SRR at be000000
>>    [    0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> -[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>> -[    0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> +[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
>> +[    0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
>>    [    0.000000] Zone ranges:
>> -[    0.000000]   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
>> -[    0.000000]   DMA32    empty
>> +[    0.000000]   DMA      [mem 0x0000000040000000-0x000000007fffffff]
>> +[    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000bfffffff]
>>    [    0.000000]   Normal   empty
>>    [    0.000000] Movable zone start for each node
>>    [    0.000000] Early memory node ranges
>> @@ -54,7 +54,7 @@
>> --
>> Florian
> 
> Do you have an example where this works when DDR memory starts above 4GB?
> The code in max_zone_phys() has separate if clauses for memory starting above 4GB,
> and for memory starting above zone mask but below 4GB...

I am afraid I don't have such a system handy. The one with 2+2GB with 
the second 2GB starting at 12GB might be a tad difficult to start 
without the first 2GB due to a number of critical reserved memory 
regions being there.
-- 
Florian
