[PATCH] riscv: Report error when repeatedly recording CPU hardware ID
Conor Dooley
conor at kernel.org
Tue Aug 27 01:27:32 PDT 2024
On Tue, Aug 27, 2024 at 09:39:30AM +0800, qiaozhe wrote:
>
> On 2024/8/26 16:15, Conor Dooley wrote:
>
> > On Sat, Aug 24, 2024 at 10:41:28AM +0800, qiaozhe wrote:
> >>
> >>
> >> On 2024/8/23 20:57, Conor Dooley wrote:
> >>
> >>> On Fri, Aug 23, 2024 at 05:11:00PM +0800, Zhe Qiao wrote:
> >>>> In of_parse_and_init_cpus(), when a CPU hardware ID (hart ID) is recorded
> >>>> in the __cpuid_to_hartid_map[] array, report an error if the same hart ID
> >>>> has already been recorded. This ensures that every hart ID stored in
> >>>> __cpuid_to_hartid_map[] is unique.
> >>> Why is this actually required? On what system did you encounter this?
> >> This patch does not come from a problem encountered on a real system.
> >> While studying the Linux kernel, I compared the RISC-V code with the ARM
> >> architecture code and found that ARM performs a similar check.
> > Okay, it's good that you didn't find such a bad dtb "in the wild" :)
> >
> >> In addition, if a CPU hardware ID appears more than once and is about to
> >> be recorded in __cpuid_to_hartid_map[] again, the kernel may need to detect
> >> this error.
> >>
> >>
> >>>> Signed-off-by: Zhe Qiao <qiaozhe at iscas.ac.cn>
> >>>> ---
> >>>> arch/riscv/kernel/smpboot.c | 16 ++++++++++++++++
> >>>> 1 file changed, 16 insertions(+)
> >>>>
> >>>> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> >>>> index 0f8f1c95ac38..698f9fe791f7 100644
> >>>> --- a/arch/riscv/kernel/smpboot.c
> >>>> +++ b/arch/riscv/kernel/smpboot.c
> >>>> @@ -118,6 +118,16 @@ static void __init acpi_parse_and_init_cpus(void)
> >>>> #define acpi_parse_and_init_cpus(...) do { } while (0)
> >>>> #endif
> >>>>
> >>>> +static bool __init is_mpidr_duplicate(unsigned int cpuid, u64 hart)
> >>>> +{
> >>>> +	unsigned int i;
> >>>> +
> >>>> +	for (i = 1; (i < cpuid) && (i < NR_CPUS); i++)
> >>>> +		if (cpuid_to_hartid_map(i) == hart)
> >>>> +			return true;
> >>>> +	return false;
> >>>> +}
> >>>> +
> >>>> static void __init of_parse_and_init_cpus(void)
> >>>> {
> >>>> struct device_node *dn;
> >>>> @@ -131,6 +141,12 @@ static void __init of_parse_and_init_cpus(void)
> >>>>  		if (rc < 0)
> >>>>  			continue;
> >>>>
> >>>> +		if (is_mpidr_duplicate(cpuid, hart)) {
> >>>> +			pr_err("%pOF: duplicate cpu reg properties in the DT\n",
> >>>> +			       dn);
> >>>> +			continue;
> >>> Why would we continue in this case? If the devicetree is this broken,
> >>> why shouldn't we just BUG() and abort immediately?
> >> When analysing the existing code I did not find any check for this case,
> >> so I did not take a more aggressive approach here and only reported an
> >> error.
> > What do you think though? Should we continue to boot in this case?
> > If you read the function a bit further, you'll see that we abort boot
> > if there are two instances of the boot CPU. Do you think the same should
> > be done for all CPUs?
>
> Yes, I saw that a BUG is triggered if there are two boot CPUs. For all CPUs,
> if two CPUs have the same hardware ID, I think a BUG should be triggered
> directly. This will attract more attention than merely printing an error,
> and it will also make related issues easier to troubleshoot.
>
> These are just my opinions, and I don't know whether they are reasonable.
> What do you think?
I think that BUG()ing here would be reasonable to do.
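
For illustration only, here is a minimal userspace sketch of the check being discussed, with a duplicate treated as fatal rather than skipped. Everything prefixed with sketch_ is a hypothetical stand-in and not kernel code: the array models __cpuid_to_hartid_map[], the input list models the hart IDs read from the DT cpu nodes, and a return value of -1 marks the spot where the kernel version would presumably BUG().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8	/* hypothetical stand-in for NR_CPUS */

/* Hypothetical stand-in for __cpuid_to_hartid_map[]; slot 0 is the boot CPU. */
static uint64_t sketch_hartid_map[SKETCH_NR_CPUS];

/* True if 'hart' is already recorded in a secondary slot below 'cpuid'. */
static bool sketch_hartid_is_duplicate(unsigned int cpuid, uint64_t hart)
{
	unsigned int i;

	for (i = 1; i < cpuid && i < SKETCH_NR_CPUS; i++)
		if (sketch_hartid_map[i] == hart)
			return true;
	return false;
}

/*
 * Record the hart IDs in 'harts' (standing in for the DT cpu nodes).
 * Returns the number of map slots filled, including slot 0, or -1 when
 * a hart ID is seen twice. The -1 paths are where BUG() would go.
 */
static int sketch_record_harts(const uint64_t *harts, size_t n,
			       uint64_t boot_hart)
{
	unsigned int cpuid = 1;
	bool found_boot_cpu = false;
	size_t k;

	sketch_hartid_map[0] = boot_hart;
	for (k = 0; k < n; k++) {
		if (harts[k] == boot_hart) {
			if (found_boot_cpu)
				return -1;	/* second node for the boot CPU */
			found_boot_cpu = true;
			continue;
		}
		if (sketch_hartid_is_duplicate(cpuid, harts[k]))
			return -1;	/* duplicate secondary hart ID */
		if (cpuid >= SKETCH_NR_CPUS)
			continue;	/* more nodes than slots: skip */
		sketch_hartid_map[cpuid++] = harts[k];
	}
	return (int)cpuid;
}
```

With this sketch, a list with distinct hart IDs fills one slot per CPU, while a list that repeats either the boot hart or a secondary hart fails immediately instead of silently dropping the second node.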