Suspicious error for CMA stress test

Vlastimil Babka vbabka at suse.cz
Sat Mar 19 15:11:24 PDT 2016


On 03/19/2016 08:24 AM, Hanjun Guo wrote:
> On 2016/3/18 22:10, Vlastimil Babka wrote:
>>>>
>>>> Oh, ok, will try to send proper patch, once I figure out what to write in
>>>> the changelog :)
>>> Thanks in advance!
>>>
>> OK, here it is. Hanjun can you please retest this, as I'm not sure if you had
>
> I tested this new patch with stress for more than one hour, and it works!

That's good news, thanks!

> Since Lucas has comments on it, I'm willing to test further versions if needed.
>
> One minor comments below,
>
>> the same code due to the followup one-liner patches in the thread. Lucas, see if
>> it helps with your issue as well. Laura and Joonsoo, please also test and review
>> and check changelog if my perception of the problem is accurate :)
>>
>> Thanks
>>
> [...]
>> +	if (max_order < MAX_ORDER) {
>> +		/* If we are here, it means order is >= pageblock_order.
>> +		 * We want to prevent merge between freepages on isolate
>> +		 * pageblock and normal pageblock. Without this, pageblock
>> +		 * isolation could cause incorrect freepage or CMA accounting.
>> +		 *
>> +		 * We don't want to hit this code for the more frequent
>> +		 * low-order merging.
>> +		 */
>> +		if (unlikely(has_isolate_pageblock(zone))) {
>
> In the first version of your patch, it's
>
> +		if (IS_ENABLED(CONFIG_CMA) &&
> +				unlikely(has_isolate_pageblock(zone))) {
>
> Why remove the IS_ENABLED(CONFIG_CMA) in the new version?

Previously I thought the problem was CMA-specific, but after more detailed look 
I think it's not, as start_isolate_page_range() releases zone lock between 
pageblocks, so unexpected merging due to races can happen also between isolated 
and non-isolated non-CMA pageblocks. This function is called from memory hotplug 
code, and recently also alloc_contig_range() itself is outside CONFIG_CMA for 
allocating gigantic hugepages. Joonsoo's original commit 3c60509 was also not 
restricted to CMA, and same with his patch earlier in this thread.

Hmm I guess another alternate solution would indeed be to modify 
start_isolate_page_range() and undo_isolate_page_range() to hold zone->lock 
across MAX_ORDER blocks (not whole requested range, as that could lead to 
hardlockups). But that still wouldn't help Lucas, IUUC.


> Thanks
> Hanjun
>
>




More information about the linux-arm-kernel mailing list