[PATCH v2 0/4] kasan: Fix ordering between MTE tag colouring and page->flags

Qun-wei Lin (林群崴) Qun-wei.Lin at mediatek.com
Sun Feb 12 17:56:26 PST 2023


On Thu, 2023-02-09 at 22:19 -0800, Peter Collingbourne wrote:
> On Wed, Feb 08, 2023 at 05:41:45AM +0000, Qun-wei Lin (林群崴) wrote:
> > On Fri, 2023-02-03 at 18:51 +0100, Andrey Konovalov wrote:
> > > On Fri, Feb 3, 2023 at 4:41 AM Kuan-Ying Lee (李冠穎)
> > > <Kuan-Ying.Lee at mediatek.com> wrote:
> > > > 
> > > > > Hi Kuan-Ying,
> > > > > 
> > > > > There recently was a similar crash due to incorrectly implemented
> > > > > sampling.
> > > > > 
> > > > > Do you have the following patch in your tree?
> > > > > 
> > > > > https://android.googlesource.com/kernel/common/+/9f7f5a25f335e6e1484695da9180281a728db7e2
> > > > > 
> > > > > If not, please sync your 6.1 tree with the Android common
> > > > > kernel.
> > > > > Hopefully this will fix the issue.
> > > > > 
> > > > > Thanks!
> > > > 
> > > > Hi Andrey,
> > > > 
> > > > Thanks for your advice.
> > > > 
> > > > I see that this patch fixes ("kasan: allow sampling page_alloc
> > > > allocations for HW_TAGS").
> > > > 
> > > > But our 6.1 tree doesn't currently have the following two commits:
> > > > ("FROMGIT: kasan: allow sampling page_alloc allocations for
> > > > HW_TAGS")
> > > > ("FROMLIST: kasan: reset page tags properly with sampling")
> > > 
> > > Hi Kuan-Ying,
> > > 
> > 
> > Hi Andrey,
> > I'll stand in for Kuan-Ying as he's out of office.
> > Thanks for your help!
> > 
> > > Just to clarify: these two patches were applied twice: once here on
> > > Jan 13:
> > > 
> > > https://android.googlesource.com/kernel/common/+/a2a9e34d164e90fc08d35fd097a164b9101d72ef
> > > https://android.googlesource.com/kernel/common/+/435e2a6a6c8ba8d0eb55f9aaade53e7a3957322b
> > 
> > Our codebase does not contain these two patches.
> > 
> > > but then reverted here on Jan 20:
> > > 
> > > https://android.googlesource.com/kernel/common/+/5503dbe454478fe54b9cac3fc52d4477f52efdc9
> > > https://android.googlesource.com/kernel/common/+/4573a3cf7e18735a477845426238d46d96426bb6
> > > 
> > > And then once again via the link I sent before together with a fix on
> > > Jan 25.
> > > 
> > > It might be that you still have the former two patches in your tree if
> > > you synced it before the revert.
> > > 
> > > However, if this is not the case:
> > > 
> > > Which 6.1 commit is your tree based on?
> > 
> > https://android.googlesource.com/kernel/common/+/53b3a7721b7aec74d8fa2ee55c2480044cc7c1b8
> > 
> > (53b3a77 Merge 6.1.1 into android14-6.1) is the latest commit in our
> > tree.
> > 
> > > Do you have any private MTE-related changes in the kernel?
> > 
> > No, all the MTE-related code is the same as Android Common Kernel.
> > 
> > > Do you have userspace MTE enabled?
> > 
> > Yes, we have enabled MTE for both EL1 and EL0.
> 
> Hi Qun-wei,
> 
> Thanks for the information. We encountered a similar issue internally
> with the Android 5.15 common kernel. We tracked it down to an issue
> with page migration, where the source page was a userspace page with
> MTE tags, and the target page was allocated using KASAN (i.e. having
> a non-zero KASAN tag). This caused tag check faults when the page was
> subsequently accessed by the kernel as a result of the mismatching tags
> from userspace. Given the number of different ways that page migration
> target pages can be allocated, the simplest fix that we could think of
> was to synchronize the KASAN tag in copy_highpage().
> 
> Can you try the patch below and let us know whether it fixes the issue?
> 
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index 24913271e898c..87ed38e9747bd 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -23,6 +23,8 @@ void copy_highpage(struct page *to, struct page *from)
>  
>  	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
>  		set_bit(PG_mte_tagged, &to->flags);
> +		if (kasan_hw_tags_enabled())
> +			page_kasan_tag_set(to, page_kasan_tag(from));
>  		mte_copy_page_tags(kto, kfrom);
>  	}
>  }
> 

Thank you so much, this patch has solved the problem.
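
For anyone following the thread, here is a toy userspace model of the
mismatch described above and of what the patch changes. It is only a
sketch and not kernel code: struct toy_page, the toy_* functions, the
4-granule page and the 0xF match-all value are all hypothetical
stand-ins, and it assumes the userspace source page carries a match-all
KASAN tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_GRANULES 4
#define TOY_TAG_MATCH_ALL 0xF /* assumption: stands in for the kernel's match-all tag */

/* Hypothetical stand-in for struct page plus its MTE allocation tags. */
struct toy_page {
	uint8_t kasan_tag;              /* models the KASAN tag kept in page->flags */
	bool mte_tagged;                /* models PG_mte_tagged */
	uint8_t mem_tags[TOY_GRANULES]; /* models per-granule allocation tags */
};

/* Models a kernel access whose pointer tag is derived from the page's KASAN tag. */
static bool toy_kernel_access_ok(const struct toy_page *p)
{
	if (p->kasan_tag == TOY_TAG_MATCH_ALL)
		return true;
	for (int i = 0; i < TOY_GRANULES; i++)
		if (p->mem_tags[i] != p->kasan_tag)
			return false;
	return true;
}

/* Models copy_highpage(); sync_kasan_tag selects the behaviour with the patch. */
static void toy_copy_highpage(struct toy_page *to, const struct toy_page *from,
			      bool sync_kasan_tag)
{
	if (from->mte_tagged) {
		to->mte_tagged = true;
		if (sync_kasan_tag)
			to->kasan_tag = from->kasan_tag;
		memcpy(to->mem_tags, from->mem_tags, sizeof(to->mem_tags));
	}
}

/* Migrates a tagged user page into a KASAN-tagged target, then accesses it. */
static bool toy_migrate_and_access(bool sync_kasan_tag)
{
	/* Source: user page, match-all KASAN tag, user-chosen memory tags. */
	struct toy_page from = {
		.kasan_tag = TOY_TAG_MATCH_ALL,
		.mte_tagged = true,
		.mem_tags = { 0x7, 0x7, 0x2, 0x2 },
	};
	/* Target: fresh page that KASAN tagged 0x3 at allocation time. */
	struct toy_page to = {
		.kasan_tag = 0x3,
		.mte_tagged = false,
		.mem_tags = { 0x3, 0x3, 0x3, 0x3 },
	};

	assert(toy_kernel_access_ok(&to)); /* consistent before the copy */
	toy_copy_highpage(&to, &from, sync_kasan_tag);
	return toy_kernel_access_ok(&to);
}
```

In this model, toy_migrate_and_access(false) returns false (the modelled
tag check fault, since the copied memory tags no longer match the
target's KASAN tag), while toy_migrate_and_access(true) returns true.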

> Catalin, please let us know what you think of the patch above. It
> effectively partially undoes commit 20794545c146 ("arm64: kasan: Revert
> "arm64: mte: reset the page tag in page->flags""), but this seems okay
> to me because the mentioned race condition shouldn't affect "new" pages
> such as those being used as migration targets. The smp_wmb() that was
> there before doesn't seem necessary for the same reason.
> 
> If the patch is okay, we should apply it to the 6.1 stable kernel. The
> problem appears to be "fixed" in the mainline kernel because of
> a bad merge conflict resolution on my part; when I rebased commit
> e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
> past commit 20794545c146, it looks like I accidentally brought back the
> page_kasan_tag_reset() line removed in the latter. But we should align
> the mainline kernel with whatever we decide to do on 6.1.
> 
> Peter


