[PATCH] arm64: clear_page: use stnp non-temporal instruction for performance optimizing

Catalin Marinas catalin.marinas at arm.com
Tue Nov 16 10:17:12 PST 2021


On Tue, Nov 16, 2021 at 11:08:14PM +0800, Guanghui Feng wrote:
> When clear page mem, there is no need to alloc cache for storing these
> mem value.

In theory, DC ZVA is supposed to trigger write streaming mode, so all
writes go directly to memory, avoiding cache allocation.
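
For reference, a minimal sketch of how the DC ZVA block size follows from
DCZID_EL0, and how many DC ZVA operations a page then takes. The register
layout (BS in bits [3:0], DZP in bit 4) is architectural; the helper names
and the sample values in the note below are only illustrative, and this is
not the kernel's actual clear_page code.

```c
#include <stdint.h>

/*
 * Decode a DCZID_EL0 value (illustrative helper, not a kernel API).
 * BS, bits [3:0], is log2 of the block size in 4-byte words.
 * DZP, bit 4, set means DC ZVA is prohibited.
 * Returns the block size in bytes, or 0 if DC ZVA must not be used.
 */
static unsigned int dczid_block_bytes(uint64_t dczid)
{
	if (dczid & (1u << 4))
		return 0;
	return 4u << (dczid & 0xf);
}

/*
 * Number of DC ZVA operations needed to zero one page of page_size
 * bytes, or 0 if DC ZVA is prohibited. Assumes page_size is a
 * multiple of the block size.
 */
static unsigned int zva_ops_per_page(uint64_t dczid, unsigned int page_size)
{
	unsigned int bs = dczid_block_bytes(dczid);

	return bs ? page_size / bs : 0;
}
```

With a hypothetical register value of 0x4 (BS = 4), the block is 64 bytes
and a 4K page takes 64 DC ZVA operations.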

> And the copy_page.S have used stnp instruction for optimizing.
> So I rewrite the clear_page.S with stnp. At the same time, I have tested it
> with stnp instruction which will get about twice the performance improvement.

On which CPU implementation? Is the same improvement seen on a wider
range of CPUs?

-- 
Catalin


