[PATCH v2 0/4] kasan: Fix ordering between MTE tag colouring and page->flags

Peter Collingbourne pcc at google.com
Thu Feb 9 22:19:20 PST 2023


On Wed, Feb 08, 2023 at 05:41:45AM +0000, Qun-wei Lin (林群崴) wrote:
> On Fri, 2023-02-03 at 18:51 +0100, Andrey Konovalov wrote:
> > On Fri, Feb 3, 2023 at 4:41 AM Kuan-Ying Lee (李冠穎)
> > <Kuan-Ying.Lee at mediatek.com> wrote:
> > > 
> > > > Hi Kuan-Ying,
> > > > 
> > > > There recently was a similar crash due to incorrectly implemented
> > > > sampling.
> > > > 
> > > > Do you have the following patch in your tree?
> > > > 
> > > > 
> > > 
> > > 
> https://urldefense.com/v3/__https://android.googlesource.com/kernel/common/*/9f7f5a25f335e6e1484695da9180281a728db7e2__;Kw!!CTRNKA9wMg0ARbw!hUjRlXirPMSusdIWe0RIPt0PNqIHYDCJyd7GSd4o-TgLMP0CKRUkjElH-jcvtaz42-sgE2U58964rCCbuNTJE5Jx$
> > > > 
> > > > 
> > > > If not, please sync your 6.1 tree with the Android common kernel.
> > > > Hopefully this will fix the issue.
> > > > 
> > > > Thanks!
> > > 
> > > Hi Andrey,
> > > 
> > > Thanks for your advice.
> > > 
> > > I saw this patch is to fix ("kasan: allow sampling page_alloc
> > > allocations for HW_TAGS").
> > > 
> > > But our 6.1 tree doesn't have following two commits now.
> > > ("FROMGIT: kasan: allow sampling page_alloc allocations for
> > > HW_TAGS")
> > > (FROMLIST: kasan: reset page tags properly with sampling)
> > 
> > Hi Kuan-Ying,
> > 
> 
> Hi Andrey,
> I'll stand in for Kuan-Ying as he's out of office.
> Thanks for your help!
> 
> > Just to clarify: these two patches were applied twice: once here on
> > Jan 13:
> > 
> > 
> https://urldefense.com/v3/__https://android.googlesource.com/kernel/common/*/a2a9e34d164e90fc08d35fd097a164b9101d72ef__;Kw!!CTRNKA9wMg0ARbw!kE1XiSmunRcQb9rTpKGkFc1EFJA57qr1cj7v9EZAjUBzXcSzMl-ofCI2mdtEQsxn3J4n7Lkgxb0_G745_3oO-3k$ 
> >  
> > 
> https://urldefense.com/v3/__https://android.googlesource.com/kernel/common/*/435e2a6a6c8ba8d0eb55f9aaade53e7a3957322b__;Kw!!CTRNKA9wMg0ARbw!kE1XiSmunRcQb9rTpKGkFc1EFJA57qr1cj7v9EZAjUBzXcSzMl-ofCI2mdtEQsxn3J4n7Lkgxb0_G745sDEOYWY$ 
> >  
> > 
> 
> Our codebase does not contain these two patches.
> 
> > but then reverted here on Jan 20:
> > 
> > 
> https://urldefense.com/v3/__https://android.googlesource.com/kernel/common/*/5503dbe454478fe54b9cac3fc52d4477f52efdc9__;Kw!!CTRNKA9wMg0ARbw!kE1XiSmunRcQb9rTpKGkFc1EFJA57qr1cj7v9EZAjUBzXcSzMl-ofCI2mdtEQsxn3J4n7Lkgxb0_G745Bl77dFY$ 
> >  
> > 
> https://urldefense.com/v3/__https://android.googlesource.com/kernel/common/*/4573a3cf7e18735a477845426238d46d96426bb6__;Kw!!CTRNKA9wMg0ARbw!kE1XiSmunRcQb9rTpKGkFc1EFJA57qr1cj7v9EZAjUBzXcSzMl-ofCI2mdtEQsxn3J4n7Lkgxb0_G745K-J8O-w$ 
> >  
> > 
> > And then once again via the link I sent before together with a fix on
> > Jan 25.
> > 
> > It might be that you still have to former two patches in your tree if
> > you synced it before the revert.
> > 
> > However, if this is not the case:
> > 
> > Which 6.1 commit is your tree based on?
> 
> 
> https://android.googlesource.com/kernel/common/+/53b3a7721b7aec74d8fa2ee55c2480044cc7c1b8
> (53b3a77 Merge 6.1.1 into android14-6.1) is the latest commit in our
> tree.
> 
> > Do you have any private MTE-related changes in the kernel?
> 
> No, all the MTE-related code is the same as Android Common Kernel.
> 
> > Do you have userspace MTE enabled?
> 
> Yes, we have enabled MTE for both EL1 and EL0.

Hi Qun-wei,

Thanks for the information. We encountered a similar issue internally
with the Android 5.15 common kernel. We tracked it down to an issue
with page migration, where the source page was a userspace page with
MTE tags, and the target page was allocated using KASAN (i.e. having
a non-zero KASAN tag). This caused tag check faults when the page was
subsequently accessed by the kernel as a result of the mismatching tags
from userspace. Given the number of different ways that page migration
target pages can be allocated, the simplest fix that we could think of
was to synchronize the KASAN tag in copy_highpage().

Can you try the patch below and let us know whether it fixes the issue?

diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 24913271e898c..87ed38e9747bd 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -23,6 +23,8 @@ void copy_highpage(struct page *to, struct page *from)
 
 	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
 		set_bit(PG_mte_tagged, &to->flags);
+		if (kasan_hw_tags_enabled())
+			page_kasan_tag_set(to, page_kasan_tag(from));
 		mte_copy_page_tags(kto, kfrom);
 	}
 }

Catalin, please let us know what you think of the patch above. It
effectively partially undoes commit 20794545c146 ("arm64: kasan: Revert
"arm64: mte: reset the page tag in page->flags""), but this seems okay
to me because the mentioned race condition shouldn't affect "new" pages
such as those being used as migration targets. The smp_wmb() that was
there before doesn't seem necessary for the same reason.

If the patch is okay, we should apply it to the 6.1 stable kernel. The
problem appears to be "fixed" in the mainline kernel because of
a bad merge conflict resolution on my part; when I rebased commit
e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
past commit 20794545c146, it looks like I accidentally brought back the
page_kasan_tag_reset() line removed in the latter. But we should align
the mainline kernel with whatever we decide to do on 6.1.

Peter



More information about the linux-arm-kernel mailing list