[syzbot] linux-next boot error: general protection fault in add_mtd_device
Vanessa Page
Vebpe at outlook.com
Sat Jul 23 18:37:58 PDT 2022
You can’t prevent it dumb ass
Sent from my iPhone
> On Jul 23, 2022, at 9:36 PM, Tetsuo Handa <penguin-kernel at i-love.sakura.ne.jp> wrote:
>
> On 2022/07/22 11:51, Christian Marangi wrote:
>>> On Fri, Jul 22, 2022 at 11:34:57AM +0900, Tetsuo Handa wrote:
>>> mtd_check_of_node() was added by commit ad9b10d1eaada169 ("mtd: core:
>>> introduce of support for dynamic partitions").
>>>
>>> I guess that sometimes (depending on probe timing) mtd->parent is NULL.
>>> Please check what mtd->parent == NULL means.
>>>
>>> + /* Check if a partitions node exist */
>>> + parent = mtd->parent;
>>> + parent_dn = dev_of_node(&parent->dev);
>>>
>>
>> Currently there is thix [1].
>
> OK. Then for now can we try that "prevent the crash" patch (and defer
> examining the cause of being NULL) ?
>
>>
>> Anyway you comment means a device may probe defer and have the parent
>> still set to NULL? How can we check that?
>
> My comment is just a guess. I have zero experience in this area.
>
> If this problem is not timing dependent, syzbot would have already
> reported this problem for thousands times. But since syzbot reported
> this problem only 84 times in 23 days, I think that this problem is
> timing dependent (i.e. some race condition).
>
>>
>> Return PROBE_DEFER always when no mtd parent is found?
>
> We could try tracing what value is assigned to mtd->parent, by
> introducing a validation wrapper function (which validates that
> the address is a valid kernel address) like
>
> ----------------------------------------
> diff --git a/include/linux/bug.h b/include/linux/bug.h
> index 348acf2558f3..6b1626776b95 100644
> --- a/include/linux/bug.h
> +++ b/include/linux/bug.h
> @@ -91,4 +91,12 @@ static inline __must_check bool check_data_corruption(bool v) { return v; }
> corruption; \
> }))
>
> +extern void check_valid_kernel_ptr(void *arg, const char *name);
> +#define valid_kernel_ptr(ptr) \
> + ({ \
> + typeof(ptr) tmp = (ptr); \
> + check_valid_kernel_ptr(tmp, __stringify(ptr)); \
> + tmp; \
> + })
> +
> #endif /* _LINUX_BUG_H */
> diff --git a/lib/bug.c b/lib/bug.c
> index c223a2575b72..6555766134cf 100644
> --- a/lib/bug.c
> +++ b/lib/bug.c
> @@ -231,3 +231,10 @@ void generic_bug_clear_once(void)
>
> clear_once_table(__start___bug_table, __stop___bug_table);
> }
> +
> +void check_valid_kernel_ptr(void *arg, const char *name)
> +{
> + WARN(!arg, "%s is NULL\n", name);
> + WARN(IS_ERR(arg), "%s is %pe\n", name, arg);
> +}
> +EXPORT_SYMBOL(check_valid_kernel_ptr);
> ----------------------------------------
>
> and wrapping all locations that modifies mtd->parent like
>
> ----------------------------------------
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..2e81539ddfd8 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -81,9 +81,9 @@ static struct mtd_info *allocate_partition(struct mtd_info *parent,
> * distinguish between the parent and its partitions in sysfs.
> */
> child->dev.parent = IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER) || mtd_is_partition(parent) ?
> - &parent->dev : parent->dev.parent;
> + &parent->dev : valid_kernel_ptr(parent->dev.parent);
> child->dev.of_node = part->of_node;
> - child->parent = parent;
> + child->parent = valid_kernel_ptr(parent);
> child->part.offset = part->offset;
> INIT_LIST_HEAD(&child->partitions);
>
> ----------------------------------------
>
> . If none of all wrapped assignments triggers, we would need to check
> locations that explicitly assign NULL. (That is, possibility of hitting
> a race that the parent was once non-NULL but then became NULL due to
> error or cleanup path.)
>
> We can add some CONFIG_ option for controlling whether check_valid_kernel_ptr()
> should become
>
> #define valid_kernel_ptr(ptr) (ptr)
>
> so that we can avoid overhead when this problem if fixed.
>
> Linus, can we have macros like valid_kernel_ptr() ? Macros like
> valid_kernel_ptr(), valid_kernel_ptr_or_null(), valid_kernel_ptr_or_err() can
> serve for documentation purpose without runtime cost for non-debug builds.
>
>>
>> [1] https://patchwork.ozlabs.org/project/linux-mtd/patch/20220703095631.16508-1-ansuelsmth@gmail.com/
>>
>>> On 2022/06/30 18:32, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: 6cc11d2a1759 Add linux-next specific files for 20220630
>>>> git tree: linux-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1640f850080000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=54f75b620e3845dd
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=fe013f55a2814a9e8cfd
>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+fe013f55a2814a9e8cfd at syzkaller.appspotmail.com
>>
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
More information about the linux-mtd
mailing list