[PATCH v16 01/14] power: reset: reboot-mode: Synchronize list traversal

Mukesh Ojha mukesh.ojha at oss.qualcomm.com
Mon Nov 3 00:16:01 PST 2025


On Wed, Oct 29, 2025 at 11:18:52PM +0530, Shivendra Pratap wrote:
> 
> 
> On 10/28/2025 9:04 AM, Bjorn Andersson wrote:
> > On Wed, Oct 15, 2025 at 10:08:16AM +0530, Shivendra Pratap wrote:
> >> List traversals must be synchronized to prevent race conditions
> >> and data corruption. The reboot-mode list is not protected by a
> >> lock currently, which can lead to concurrent access and race.
> > 
> > Is it a theoretical future race or something that we can hit in the
> > current implementation?
> > 
> >>
> >> Introduce a mutex lock to guard all operations on the reboot-mode
> >> list and ensure thread-safe access. The change prevents unsafe
> >> concurrent access on reboot-mode list.
> > 
> > I was under the impression that these lists where created during boot
> > and then used at some later point, which at best would bring a
> > theoretical window for a race... Reviewing the code supports my
> > understanding, but perhaps I'm missing something?
> > 
> >>
> >> Fixes: 4fcd504edbf7 ("power: reset: add reboot mode driver")
> >> Fixes: ca3d2ea52314 ("power: reset: reboot-mode: better compatibility with DT (replace ' ,/')")
> >>
> > 
> > Skip this empty line, please.
> > 
> > 
> > And given that you have fixes here, I guess this is a problem today. In
> > which case, this shouldn't have been carried for 16 versions - but have
> > sent and been merged on its own already.
> > 
> > So please, if this is a real issue, start your commit message with a
> > descriptive problem description, to make it clear that this needs to be
> > merged yesterday - or drop the fixes.
> > 
> >> Signed-off-by: Shivendra Pratap <shivendra.pratap at oss.qualcomm.com>
> >> ---
> >>  drivers/power/reset/reboot-mode.c | 96 +++++++++++++++++++++------------------
> >>  include/linux/reboot-mode.h       |  4 ++
> >>  2 files changed, 57 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/drivers/power/reset/reboot-mode.c b/drivers/power/reset/reboot-mode.c
> >> index fba53f638da04655e756b5f8b7d2d666d1379535..8fc3e14638ea757c8dc3808c240ff569cbd74786 100644
> >> --- a/drivers/power/reset/reboot-mode.c
> >> +++ b/drivers/power/reset/reboot-mode.c
> >> @@ -29,9 +29,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
> >>  	if (!cmd)
> >>  		cmd = normal;
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		if (!strcmp(info->mode, cmd))
> >> -			return info->magic;
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			if (!strcmp(info->mode, cmd))
> >> +				return info->magic;
> >> +	}
> >>  
> >>  	/* try to match again, replacing characters impossible in DT */
> >>  	if (strscpy(cmd_, cmd, sizeof(cmd_)) == -E2BIG)
> >> @@ -41,9 +43,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
> >>  	strreplace(cmd_, ',', '-');
> >>  	strreplace(cmd_, '/', '-');
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		if (!strcmp(info->mode, cmd_))
> >> -			return info->magic;
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			if (!strcmp(info->mode, cmd_))
> >> +				return info->magic;
> >> +	}
> >>  
> >>  	return 0;
> >>  }
> >> @@ -78,46 +82,50 @@ int reboot_mode_register(struct reboot_mode_driver *reboot)
> >>  
> >>  	INIT_LIST_HEAD(&reboot->head);
> >>  
> >> -	for_each_property_of_node(np, prop) {
> >> -		if (strncmp(prop->name, PREFIX, len))
> >> -			continue;
> >> -
> >> -		info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
> >> -		if (!info) {
> >> -			ret = -ENOMEM;
> >> -			goto error;
> >> -		}
> >> -
> >> -		if (of_property_read_u32(np, prop->name, &info->magic)) {
> >> -			dev_err(reboot->dev, "reboot mode %s without magic number\n",
> >> -				info->mode);
> >> -			devm_kfree(reboot->dev, info);
> >> -			continue;
> >> -		}
> >> -
> >> -		info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
> >> -		if (!info->mode) {
> >> -			ret =  -ENOMEM;
> >> -			goto error;
> >> -		} else if (info->mode[0] == '\0') {
> >> -			kfree_const(info->mode);
> >> -			ret = -EINVAL;
> >> -			dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
> >> -				prop->name);
> >> -			goto error;
> >> +	mutex_init(&reboot->rb_lock);
> >> +
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> > 
> > I don't see how this can race with anything, reboot_mode_register() is
> > supposed to be called from some probe function, with reboot_mode_driver
> > being a "local" object.
> > 
> > The guard here "protects" &reboot->head, but that is not a shared
> > resources at this point.
> > 
> >> +		for_each_property_of_node(np, prop) {
> >> +			if (strncmp(prop->name, PREFIX, len))
> >> +				continue;
> >> +
> >> +			info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
> >> +			if (!info) {
> >> +				ret = -ENOMEM;
> >> +				goto error;
> >> +			}
> >> +
> >> +			if (of_property_read_u32(np, prop->name, &info->magic)) {
> >> +				dev_err(reboot->dev, "reboot mode %s without magic number\n",
> >> +					info->mode);
> >> +				devm_kfree(reboot->dev, info);
> >> +				continue;
> >> +			}
> >> +
> >> +			info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
> >> +			if (!info->mode) {
> >> +				ret =  -ENOMEM;
> >> +				goto error;
> >> +			} else if (info->mode[0] == '\0') {
> >> +				kfree_const(info->mode);
> >> +				ret = -EINVAL;
> >> +				dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
> >> +					prop->name);
> >> +				goto error;
> >> +			}
> >> +
> >> +			list_add_tail(&info->list, &reboot->head);
> >>  		}
> >>  
> >> -		list_add_tail(&info->list, &reboot->head);
> >> -	}
> >> -
> >> -	reboot->reboot_notifier.notifier_call = reboot_mode_notify;
> >> -	register_reboot_notifier(&reboot->reboot_notifier);
> >> +		reboot->reboot_notifier.notifier_call = reboot_mode_notify;
> >> +		register_reboot_notifier(&reboot->reboot_notifier);
> > 
> > Once register_reboot_notifier() has been called, &reboot->head is
> > visible outside the specific driver instance.
> > 
> > So, there's no reason to lock in reboot_mode_register().
> > 
> >>  
> >> -	return 0;
> >> +		return 0;
> >>  
> >>  error:
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		kfree_const(info->mode);
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			kfree_const(info->mode);
> >> +	}
> >>  
> >>  	return ret;
> >>  }
> >> @@ -133,8 +141,10 @@ int reboot_mode_unregister(struct reboot_mode_driver *reboot)
> >>  
> >>  	unregister_reboot_notifier(&reboot->reboot_notifier);
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		kfree_const(info->mode);
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> > 
> > get_reboot_mode_magic() is only called from reboot_mode_notify(), which
> > is only invoked by blocking_notifier_call_chain().
> > 
> > blocking_notifier_call_chain() takes a read semaphore.
> > unregister_reboot_notifier() take a write semaphore.
> > 
> > So, if we're racing with a shutdown or reboot, I see two possible
> > things:
> > 
> > 1) blocking_notifier_call_chain() happens first and calls
> >    reboot_mode_notify(), blocking unregister_reboot_notifier(). Once it
> >    returns, the unregister proceeds and we enter case #2
> > 
> > 2) unregister_reboot_notifier() happens first (or after the
> >    blocking_notifier_call_chain() returns). Our reboot object is removed
> >    from the list and blocking_notifier_call_chain() will not invoke
> >    reboot_mode_notify().
> > 
> > In either case, the list has a single owner here.
> > 
> > 
> > As far as I can see, the only race left is if multiple concurrent calls
> > happens to blocking_notifier_call_chain(), the behavior of
> > reboot->write() might be undefined. But I think that is reasonable.
> > 
> > 
> > Please let me know if I'm missing something.
> 
> Thank you for the detailed review. Tried to summarize below:
> 
> The mutex lock was introduced in v13 following earlier discussions about
> whether the issue was a theoretical future race or something that could
> occur in the current implementation.
> 
> At the time (prior to v13), we concluded that while no race condition was
> observable in the current code, there could be a potential in future
> changes or usage patterns — making it a theoretical concern.
> 
> During review, there was further discussion around whether it's acceptable
> to leave the list unprotected simply because no race is currently suspected.
> The general consensus was that it's better practice to protect shared data
> structures like lists with a mutex to ensure correctness and future-proofing.
> 
> Based on that feedback, the mutex lock introduced in v13.

I agree with Bjorn here, we do not need a lock here..

> 
> Later in v15, the reboot-mode maintainer suggested that the patch should
> include a Fixes tag, which was incorporated accordingly.
> 
> So both the mutex addition in v13 and the Fixes tag in v15 were made in
> response to upstream review comments.
> 
> Need some guidance on how to take this forward or is it ok to protect all
> list operation, as done in this patch and keep the fixes tag as suggested
> in earlier reviews?

We should drop this patch., 

> 
> thanks,
> Shivendra

-- 
-Mukesh Ojha



More information about the linux-arm-kernel mailing list