[RFC 3/8] ARM64: Base TLB functions

Mon Mar 4 10:15:25 PST 2024

Hi Christoph,

I don't think any of these patches made it to the list for some reason,
only the cover letter. I'll add some general comments below in case you
plan to repost.

On Wed, Dec 06, 2023 at 07:57:06PM -0800, Christoph Lameter wrote:
> Add a series of TLB function that allow TLB flushing in a variety of
> ways depending on the mode encoded in enum tlb_state:
> 
> TLB_BROADCOAST	-> Use single TLBI propagated to all units via the mesh
> TLB_LOCAL	-> Use TLBI that only perform local flushes
> TLB_IPI		-> Use TLBIs that perform local flushes on multiple cpus
> TLB_NONE	-> Suppress TLBI because there are no users of the address space

TLB_IPI - I really don't want this on arm64, irrespective of the
performance improvement on specific hardware. IPIs need additional care
to avoid deadlocks, e.g. an IPI-based TLBI must not be issued with
interrupts disabled. I'd rather not have to care about this.

TLB_LOCAL and TLB_NONE - these are potentially useful _but_ they don't
take the SMMU with SVA into account (another potential issue below).
mm_cpumask() does not track non-CPU contexts such as the SMMU, but they
still need the broadcast TLBI.

> +static enum tlb_state tlbstat(struct cpumask *mask)
> +{
> +	unsigned int weight = cpumask_weight(mask);
> +	bool present = cpumask_test_cpu(smp_processor_id(), mask);

Is this guaranteed to be called in a non-preemptible context? My biggest
concern with the cpumask tracking is that the thread could be migrated
in the middle of a TLB flush and would then wrongly assume that only a
local TLBI is needed. On arm64 we don't flush the TLBs on context
switch, so the CPU it migrated from would keep its stale entries. I
don't think we can solve this without additional locking around
mm_cpumask() or at least preemption disabling.
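
To make the migration concern concrete, here is a minimal userspace
model. It is only a sketch, not the code from the patch: the names
(tlb_mode, classify(), cpus_flushed(), stale_cpus()) are hypothetical,
the 64-bit mask merely stands in for mm_cpumask(), and the mapping from
weight/present to a mode is my assumption about what tlbstat() intends.
TLB_IPI is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the flush modes; not the enum from the patch. */
enum tlb_mode { MODE_NONE, MODE_LOCAL, MODE_BROADCAST };

/* One bit per CPU, standing in for mm_cpumask(). */
typedef uint64_t cpumask_model_t;

/* Number of CPUs set in the mask (models cpumask_weight()). */
static unsigned int mask_weight(cpumask_model_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Assumed decision: pick a flush mode from the mask as seen on this_cpu. */
static enum tlb_mode classify(cpumask_model_t mask, unsigned int this_cpu)
{
	unsigned int weight;
	bool present;

	assert(this_cpu < 64);
	weight = mask_weight(mask);
	present = (mask >> this_cpu) & 1;

	if (weight == 0)
		return MODE_NONE;
	if (weight == 1 && present)
		return MODE_LOCAL;
	return MODE_BROADCAST;
}

/* CPUs whose TLB entries go away when a flush of 'mode' runs on exec_cpu. */
static cpumask_model_t cpus_flushed(enum tlb_mode mode, unsigned int exec_cpu)
{
	switch (mode) {
	case MODE_BROADCAST:
		return ~(cpumask_model_t)0;
	case MODE_LOCAL:
		return (cpumask_model_t)1 << exec_cpu;
	default:
		return 0;
	}
}

/*
 * CPUs in 'mask' left with stale entries when the mode is sampled on
 * sample_cpu but the TLBI ends up executing on exec_cpu (i.e. the
 * thread was migrated between the two steps).
 */
static cpumask_model_t stale_cpus(cpumask_model_t mask,
				  unsigned int sample_cpu,
				  unsigned int exec_cpu)
{
	enum tlb_mode mode = classify(mask, sample_cpu);

	return mask & ~cpus_flushed(mode, exec_cpu);
}
```

With only CPU 0 in the mask, stale_cpus(1, 0, 0) is empty, but
stale_cpus(1, 0, 1) still contains CPU 0: the mode was sampled as local
on CPU 0 and the local TLBI then ran on CPU 1. In the kernel, the
sampling and the TLBI would at least have to sit in the same
preempt_disable()/preempt_enable() section, and even that may not be
sufficient on its own.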

-- 
Catalin