[PATCH RFC 10/15] x86: add an arch helper function to invalidate all cache for nvdimm

Dan Williams dan.j.williams at intel.com
Wed Aug 10 13:06:18 PDT 2022


Mark Rutland wrote:
> On Tue, Aug 09, 2022 at 02:47:06PM -0700, Dave Jiang wrote:
> > 
> > On 8/3/2022 10:37 AM, Jonathan Cameron wrote:
> > > On Tue, 19 Jul 2022 12:07:03 -0700
> > > Dave Jiang <dave.jiang at intel.com> wrote:
> > > 
> > > > On 7/17/2022 10:30 PM, Davidlohr Bueso wrote:
> > > > > On Fri, 15 Jul 2022, Dave Jiang wrote:
> > > > > > The original implementation to flush all caches after unlocking an
> > > > > > nvdimm
> > > > > > resides in drivers/acpi/nfit/intel.c. This was a temporary stop gap until
> > > > > > nvdimms with security operations arrived on other archs. With CXL
> > > > > > pmem now supporting security operations, specifically "unlock" of a dimm,
> > > > > > the need
> > > > > > for an arch-supported helper function to invalidate all CPU caches for
> > > > > > nvdimm has arrived. Remove the original implementation from acpi/nfit
> > > > > > and add
> > > > > > cross-arch support for this operation.
> > > > > > 
> > > > > > Add a CONFIG_ARCH_HAS_NVDIMM_INVAL_CACHE Kconfig option, allow x86_64
> > > > > > to opt in,
> > > > > > and provide the support via a wbinvd_on_all_cpus() call.
> > > > > So the 8.2.9.5.5 bits will also need wbinvd - and I guess arm64 will need
> > > > > its own semantics (iirc there was a flush all call in the past). Cc'ing
> > > > > Jonathan as well.
> > > > > 
> > > > > Anyway, I think this call should not be defined in any place other
> > > > > than core
> > > > > kernel headers, and not in pat/nvdimm. I was trying to make it fit in
> > > > > smp.h,
> > > > > for example, but conveniently we might be able to hijack
> > > > > flush_cache_all()
> > > > > for our purposes as of course neither x86-64 nor arm64 uses it :)
> > > > > 
> > > > > And I see this as safe (wrt not adding a big hammer on unaware
> > > > > drivers) as
> > > > > the 32-bit archs that define the call are mostly contained within their
> > > > > arch/,
> > > > > and the few in drivers/ are still specific to those archs.
> > > > > 
> > > > > Maybe something like the below.
> > > > Ok. I'll replace my version with yours.
> > > Careful with flush_cache_all(). The stub version in
> > > include/asm-generic/cacheflush.h has a comment above it that would
> > > need updating at very least (I think).
> > > Note there 'was' a flush_cache_all() for ARM64, but:
> > > https://patchwork.kernel.org/project/linux-arm-kernel/patch/1429521875-16893-1-git-send-email-mark.rutland@arm.com/
> > 
> > 
> > flush_and_invalidate_cache_all() instead, given it calls wbinvd on x86? I
> > think on other archs, at least ARM, those are separate instructions, aren't
> > they?
> 
> On arm and arm64 there is no way to perform maintenance on *all* caches; it has
> to be done in cacheline increments by address. It's not realistic to do that
> for the entire address space, so we need to know the relevant address ranges
> (as per the commit referenced above).
> 
> So we probably need to think a bit harder about the generic interface, since
> "all" isn't possible to implement. :/
> 

I expect the interface would not be in the "flush_cache_" namespace
since those functions are explicitly for virtually tagged caches that
need maintenance on TLB operations that change the VA to PA association.
In this case the cache needs maintenance because the data at the PA
changes. That also means that putting it in the "nvdimm_" namespace is
wrong, because there are provisions in the CXL spec where volatile
memory ranges can also change contents at a given PA; for example, caches
might need to be invalidated if software resets the device, but not the
platform.

Something like:

    region_cache_flush(resource_size_t base, resource_size_t n, bool nowait)

...where internally that function can decide whether it can rely on an
instruction like wbinvd, use set / way based flushing (if set / way
maintenance can be made to work, which sounds like a no for arm64), or map
the range into VA space and loop. If it needs to fall back to that
VA-based loop, the caller might want to just fail the security op rather
than suffer the loop latency.


