Regression on linux-next (next-20240625)
Sidhartha Kumar
sidhartha.kumar at oracle.com
Fri Jun 28 07:53:53 PDT 2024
On 6/27/24 9:45 PM, Borah, Chaitanya Kumar wrote:
> +intel-gfx
>
> Gentle Reminder.
>
Hello,
This patch will be dropped from mm-unstable and will no longer be in linux-next
after that. I am working on a fix to include in the next version of this series.
Thanks,
Sid
> From: Borah, Chaitanya Kumar
> Sent: Wednesday, June 26, 2024 8:52 PM
> To: sidhartha.kumar at oracle.com
> Cc: Liam.Howlett at oracle.com; akpm at linux-foundation.org; linux-mm at kvack.org; maple-tree at lists.infradead.org; Nikula, Jani <jani.nikula at intel.com>; Saarinen, Jani <jani.saarinen at intel.com>; Kurmi, Suresh Kumar <Suresh.Kumar.Kurmi at intel.com>
> Subject: Regression on linux-next (next-20240625)
>
> Hello Sidhartha,
>
> Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
>
> This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
>
> Since next-20240625 [2], we have been seeing the following regression:
>
> `````````````````````````````````````````````````````````````````````````````````
> <3>[ 2.336948] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:337
> <3>[ 2.336974] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 95, name: kdevtmpfs
> <3>[ 2.336989] preempt_count: 1, expected: 0
> <3>[ 2.336998] RCU nest depth: 0, expected: 0
> <4>[ 2.337006] 3 locks held by kdevtmpfs/95:
> <4>[ 2.337015] #0: ffff888100d2c3f0 (sb_writers){.+.+}-{0:0}, at: filename_create+0x5d/0x160
> <4>[ 2.337041] #1: ffff888100800840 (&type->i_mutex_dir_key/1){+.+.}-{3:3}, at: filename_create+0x9d/0x160
> <4>[ 2.337065] #2: ffff888100800658 (&simple_offset_lock_class){+.+.}-{2:2}, at: mtree_alloc_cyclic+0x71/0xf0
> <3>[ 2.337089] Preemption disabled at:
> <3>[ 2.337091] [<0000000000000000>] 0x0
> <4>[ 2.337105] CPU: 13 UID: 0 PID: 95 Comm: kdevtmpfs Not tainted 6.10.0-rc5-next-20240625-next-20240625-g0fc4bfab2cd4+ #1
> <4>[ 2.337126] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
> <4>[ 2.337141] Call Trace:
> <4>[ 2.337147] <TASK>
> <4>[ 2.337152] dump_stack_lvl+0xb0/0xd0
> <4>[ 2.337163] __might_resched+0x194/0x2b0
> <4>[ 2.337175] kmem_cache_alloc_noprof+0x20c/0x280
> <4>[ 2.337186] ? mas_alloc_nodes+0x173/0x230
> <4>[ 2.337197] mas_alloc_nodes+0x173/0x230
> <4>[ 2.337207] mas_alloc_cyclic+0x27b/0x550
> <4>[ 2.337220] mtree_alloc_cyclic+0x92/0xf0
> `````````````````````````````````````````````````````````````````````````````````
> The detailed log can be found in [3].
>
> After bisecting the tree, the following patch [4] seems to be the first "bad"
> commit:
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
> maple_tree: remove mas_destroy() from mas_nomem()
>
> Separate call to mas_destroy() from mas_nomem() so we can check for no
> memory errors without destroying the current maple state in
> mas_store_gfp(). We then add calls to mas_destroy() to callers of
> mas_nomem().
>
> Link: https://lkml.kernel.org/r/20240618204750.79512-6-sidhartha.kumar@oracle.com
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar at oracle.com>
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
>
> We could not revert the patch because of merge conflicts, but resetting to the parent of the commit seems to fix the issue.
>
> Could you please check why the patch causes this regression and provide a fix if necessary?
>
> Thank you.
>
> Regards
>
> Chaitanya
>
> [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240625
> [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20240625/bat-rpls-4/boot0.txt
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=187827d2dc3749d66546696b78584ee4c54687b0