[PATCH] tracing: Fix tracing_marker may trigger page fault during preempt_disable

Luo Gengkun luogengkun at huaweicloud.com
Mon Sep 1 20:47:32 PDT 2025


On 2025/9/1 23:56, Masami Hiramatsu (Google) wrote:
> On Fri, 29 Aug 2025 08:26:04 -0400
> Steven Rostedt <rostedt at goodmis.org> wrote:
>
>> [ Adding arm64 maintainers ]
>>
>> On Fri, 29 Aug 2025 16:29:07 +0800
>> Luo Gengkun <luogengkun at huaweicloud.com> wrote:
>>
>>> On 2025/8/20 1:50, Steven Rostedt wrote:
>>>> On Tue, 19 Aug 2025 10:51:52 +0000
>>>> Luo Gengkun <luogengkun at huaweicloud.com> wrote:
>>>>   
>>>>> Both tracing_mark_write and tracing_mark_raw_write call
>>>>> __copy_from_user_inatomic during preempt_disable. But in some case,
>>>>> __copy_from_user_inatomic may trigger page fault, and will call schedule()
>>>>> subtly. And if a task is migrated to other cpu, the following warning will
>>>> Wait! What?
>>>>
>>>> __copy_from_user_inatomic() is allowed to be called from in atomic context.
>>>> Hence the name it has. How the hell can it sleep? If it does, it's totally
>>>> broken!
>>>>
>>>> Now, I'm not against using nofault() as it is better named, but I want to
>>>> know why you are suggesting this change. Did you actually trigger a bug here?
>>> yes, I trigger this bug in arm64.
>> And I still think this is an arm64 bug.
> I think it could be.
>
>>>>   
>>>>> be trigger:
>>>>>           if (RB_WARN_ON(cpu_buffer,
>>>>>                          !local_read(&cpu_buffer->committing)))
>>>>>
>>>>> An example can illustrate this issue:
> You've missed an important part.
>
>>>>> process flow						CPU
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> tracing_mark_raw_write():				cpu:0
>>>>>      ...
>>>>>      ring_buffer_lock_reserve():				cpu:0
>>>>>         ...
> 	preempt_disable_notrace(); --> this is unlocked by ring_buffer_unlock_commit()
>
>>>>>         cpu = raw_smp_processor_id()			cpu:0
>>>>>         cpu_buffer = buffer->buffers[cpu]			cpu:0
>>>>>         ...
>>>>>      ...
>>>>>      __copy_from_user_inatomic():				cpu:0
> So this is called under preempt-disabled.
>
>>>>>         ...
>>>>>         # page fault
>>>>>         do_mem_abort():					cpu:0
>>>> Sounds to me that arm64 __copy_from_user_inatomic() may be broken.
>>>>   
>>>>>            ...
>>>>>            # Call schedule
>>>>>            schedule()					cpu:0
> If this does not check the preempt flag, it is a problem.
> Maybe arm64 needs to do fixup and abort instead of do_mem_abort()?

My kernel was built without CONFIG_PREEMPT_COUNT, so the preempt_disable()
does nothing more than act as a barrier. In this case, it can pass the
check by schedule(). Perhaps this is another issue?

>
>>>>> 	 ...
>>>>>      # the task schedule to cpu1
>>>>>      __buffer_unlock_commit():				cpu:1
>>>>>         ...
>>>>>         ring_buffer_unlock_commit():			cpu:1
>>>>> 	 ...
>>>>> 	 cpu = raw_smp_processor_id()			cpu:1
>>>>> 	 cpu_buffer = buffer->buffers[cpu]		cpu:1
> 	preempt_enable_notrace(); <-- here we enable preempt again.
>
>>>>> As shown above, the process will acquire cpuid twice and the return values
>>>>> are not the same.
>>>>>
>>>>> To fix this problem using copy_from_user_nofault instead of
>>>>> __copy_from_user_inatomic, as the former performs 'access_ok' before
>>>>> copying.
>>>>>
>>>>> Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in trace_marker writing")
>>>> The above commit was intorduced in 2016. copy_from_user_nofault() was
>>>> introduced in 2020. I don't think this would be the fix for that kernel.
>>>>
>>>> So no, I'm not taking this patch. If you see __copy_from_user_inatomic()
>>>> sleeping, it's users are not the issue. That function is.
>
> BTW, the biggest difference between __copy_from_user() and
>   __copy_from_user_inatomic() is `might_fault()` and `should_fail_usercopy()`.
> The latter is a fault injection, so we can ignore it. But since
> the `might_fail()` is NOT in __copy_from_user_inatomic(), it is designed
> not to cause fault as Steve said?
>
> Thank you,
>




More information about the linux-arm-kernel mailing list