[syzbot] linux-next boot error: general protection fault in add_mtd_device

Vanessa Page Vebpe at outlook.com
Sat Jul 23 18:37:58 PDT 2022


You can’t prevent it dumb ass

Sent from my iPhone

> On Jul 23, 2022, at 9:36 PM, Tetsuo Handa <penguin-kernel at i-love.sakura.ne.jp> wrote:
> 
> On 2022/07/22 11:51, Christian Marangi wrote:
>>> On Fri, Jul 22, 2022 at 11:34:57AM +0900, Tetsuo Handa wrote:
>>> mtd_check_of_node() was added by commit ad9b10d1eaada169 ("mtd: core:
>>> introduce of support for dynamic partitions").
>>> 
>>> I guess that sometimes (depending on probe timing) mtd->parent is NULL.
>>> Please check what mtd->parent == NULL means.
>>> 
>>> +    /* Check if a partitions node exist */
>>> +       parent = mtd->parent;
>>> +       parent_dn = dev_of_node(&parent->dev);
>>> 
>> 
>> Currently there is thix [1].
> 
> OK. Then for now can we try that "prevent the crash" patch (and defer
> examining the cause of being NULL) ?
> 
>> 
>> Anyway you comment means a device may probe defer and have the parent
>> still set to NULL? How can we check that?
> 
> My comment is just a guess. I have zero experience in this area.
> 
> If this problem is not timing dependent, syzbot would have already
> reported this problem for thousands times. But since syzbot reported
> this problem only 84 times in 23 days, I think that this problem is
> timing dependent (i.e. some race condition).
> 
>> 
>> Return PROBE_DEFER always when no mtd parent is found?
> 
> We could try tracing what value is assigned to mtd->parent, by
> introducing a validation wrapper function (which validates that
> the address is a valid kernel address) like
> 
> ----------------------------------------
> diff --git a/include/linux/bug.h b/include/linux/bug.h
> index 348acf2558f3..6b1626776b95 100644
> --- a/include/linux/bug.h
> +++ b/include/linux/bug.h
> @@ -91,4 +91,12 @@ static inline __must_check bool check_data_corruption(bool v) { return v; }
>        corruption;                         \
>    }))
> 
> +extern void check_valid_kernel_ptr(void *arg, const char *name);
> +#define valid_kernel_ptr(ptr)                        \
> +    ({                                \
> +        typeof(ptr) tmp = (ptr);                \
> +        check_valid_kernel_ptr(tmp, __stringify(ptr));        \
> +        tmp;                            \
> +    })
> +
> #endif    /* _LINUX_BUG_H */
> diff --git a/lib/bug.c b/lib/bug.c
> index c223a2575b72..6555766134cf 100644
> --- a/lib/bug.c
> +++ b/lib/bug.c
> @@ -231,3 +231,10 @@ void generic_bug_clear_once(void)
> 
>    clear_once_table(__start___bug_table, __stop___bug_table);
> }
> +
> +void check_valid_kernel_ptr(void *arg, const char *name)
> +{
> +    WARN(!arg, "%s is NULL\n", name);
> +    WARN(IS_ERR(arg), "%s is %pe\n", name, arg);
> +}
> +EXPORT_SYMBOL(check_valid_kernel_ptr);
> ----------------------------------------
> 
> and wrapping all locations that modifies mtd->parent like
> 
> ----------------------------------------
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..2e81539ddfd8 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -81,9 +81,9 @@ static struct mtd_info *allocate_partition(struct mtd_info *parent,
>     * distinguish between the parent and its partitions in sysfs.
>     */
>    child->dev.parent = IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER) || mtd_is_partition(parent) ?
> -                &parent->dev : parent->dev.parent;
> +        &parent->dev : valid_kernel_ptr(parent->dev.parent);
>    child->dev.of_node = part->of_node;
> -    child->parent = parent;
> +    child->parent = valid_kernel_ptr(parent);
>    child->part.offset = part->offset;
>    INIT_LIST_HEAD(&child->partitions);
> 
> ----------------------------------------
> 
> . If none of all wrapped assignments triggers, we would need to check
> locations that explicitly assign NULL. (That is, possibility of hitting
> a race that the parent was once non-NULL but then became NULL due to
> error or cleanup path.)
> 
> We can add some CONFIG_ option for controlling whether check_valid_kernel_ptr()
> should become
> 
> #define valid_kernel_ptr(ptr) (ptr)
> 
> so that we can avoid overhead when this problem if fixed.
> 
> Linus, can we have macros like valid_kernel_ptr() ? Macros like
> valid_kernel_ptr(), valid_kernel_ptr_or_null(), valid_kernel_ptr_or_err() can
> serve for documentation purpose without runtime cost for non-debug builds.
> 
>> 
>> [1] https://patchwork.ozlabs.org/project/linux-mtd/patch/20220703095631.16508-1-ansuelsmth@gmail.com/
>> 
>>> On 2022/06/30 18:32, syzbot wrote:
>>>> Hello,
>>>> 
>>>> syzbot found the following issue on:
>>>> 
>>>> HEAD commit:    6cc11d2a1759 Add linux-next specific files for 20220630
>>>> git tree:       linux-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1640f850080000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=54f75b620e3845dd
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=fe013f55a2814a9e8cfd
>>>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>> 
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+fe013f55a2814a9e8cfd at syzkaller.appspotmail.com
>> 
> 
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/


More information about the linux-mtd mailing list