[PATCH] perf: RISC-V: fix IRQ detection on T-Head C908

Mon Mar 18 17:48:13 PDT 2024

On 3/18/24 16:48, Conor Dooley wrote:
> On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote:
>> On 3/15/24 01:11, Andrew Jones wrote:
>>> On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote:
>>> ...
>>>> IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or
>>>> "<vendor>,cpu-errata". It can achieve almost everything like using pseudo
>>>> isa. And the only cost I think is a small amount of code to parse this.
>>>>
>>>
>>> What's the ACPI equivalent for this new DT property? If there isn't one,
>>> then the cost is also to introduce something to the ACPI spec and add the
>>> ACPI parsing code.
>>>
>>> I'd much rather we call specified behaviors "extensions", whether they
>>> are vendor-specific or RVI standard, and then treat all extensions the
>>> same way in hardware descriptions and Linux. It'd also be best if errata
>>> in extension implementations were handled by replacing the extension in
>>> the hardware description with a new name which is specifically for the
>>> behavior Linux should expect. (Just because two extensions are almost the
>>> same doesn't mean we should say we have one and then have some second
>>> mechanism to say, "well, not really, instead of that, it's this". It's
>>> cleaner to just remove the extension it doesn't properly implement from
>>> its hardware description and create a name for the behavior it does have.)
>>>
>>> Errata in behaviors which don't have extension names (are hopefully few)
>>> and are where mvendorid and friends would need to be checked, but then why
>>> not create a pseudo extension name, as Conor suggests, so the rest of
>>> Linux code can manage errata the same way it manages every other behavior?
>>>
>>> The growth rate of the ISA bitmap is worth thinking about, though, since
>>> we have several copies of it (at least one "all harts" bitmap, one bitmap
>>> for each hart, another one for each vcpu, and then there's nested virt...)
>>> We don't have enough extensions to worry about it now, but we can
>>> eventually try partitioning, using common maps for common bits, not
>>> storing bits which can be inferred from other bits, etc.
>>
>> This is my biggest worry going forward. We already have an ever-growing
>> standard RVI extension list. On top of that we have genuine vendor
>> extensions. IMHO, errata are a bit different from extensions as there will be
>> few vendor extensions in the future but many hardware errata :)
> 
> I dunno, I think there's going to be plenty of both. We may not see (or
> use) a lot of vendor extensions in mainline Linux, but they will exist.
> 

I hope that will happen. But I fear we will have a lot of vendor
extensions in mainline Linux. That is the "sad reality" I was talking
about at the end of the thread.

>> If we start calling every hardware erratum a pseudo ISA extension, we
>> will have a much bigger problem maintaining it in the future.
> 
> I've explained to you at least once already that this is not my goal.
> Where there are genuine issues with the implementation of an extension
> creating a "pseudo" extension is not what I am suggesting we do.
> I have no problem with the approach taken for the SiFive errata,
> for example.
> 

Thanks for clarifying. But we have to define the rules for what gets in as
a pseudo extension very clearly to avoid any kind of abuse in the future.

>> We discussed this earlier during the Andes PMU extension series[1] as well.
>> We have three types of extensions in discussions now.
>>
>> 1. standard RVI extensions
>> 2. Vendor extensions
>> 	a. Genuine vendor extension
>> 	b. Vendor erratas which can be described as pseudo-extensions now
> 
>> Keeping all these within a single ISA bitmap space seems very odd to me.
>> I think the feasible approach would be to partition the standard and vendor
>> ISA extension space as you suggested.
> 
> Let's be clear - partitioning the space is unrelated to the detection
> method. We can go ahead and partition the space and use "pseudo"
> extensions or we can have a unified space but use archid/impid for
> detection. Having a unified space is the simpler thing to implement
> right now, but it totally does not stop us breaking them out in the
> future. We could even gate these custom implementations behind config
> options if bloat is a concern - but multiplatform kernels are likely to
> enable all the options anyway.
> 

Agreed.
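
To make the partitioning idea a bit more concrete, here is a rough
userspace sketch of separate bitmaps for standard and vendor/pseudo
extensions. All names and sizes (struct hart_isa, STD_EXT_MAX,
VENDOR_EXT_MAX, ext_set, ext_test) are made up for this mail and are not
existing kernel symbols. It only illustrates the layout and says nothing
about how detection should be done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define STD_EXT_MAX	128
#define VENDOR_EXT_MAX	64
#define BITS_PER_WORD	64
#define WORDS(n)	(((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One hart's view: standard and vendor spaces kept in separate maps. */
struct hart_isa {
	uint64_t std[WORDS(STD_EXT_MAX)];	/* standard RVI extensions */
	uint64_t vendor[WORDS(VENDOR_EXT_MAX)];	/* vendor and pseudo extensions */
};

/* Mark extension 'id' as present in a map that holds 'max' bits. */
static void ext_set(uint64_t *map, unsigned int max, unsigned int id)
{
	assert(id < max);
	map[id / BITS_PER_WORD] |= UINT64_C(1) << (id % BITS_PER_WORD);
}

/* Return true if extension 'id' is present; out-of-range ids are absent. */
static bool ext_test(const uint64_t *map, unsigned int max, unsigned int id)
{
	if (id >= max)
		return false;
	return ((map[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1) != 0;
}
```

With a layout like this, the vendor map could be sized or compiled out
independently of the standard one, and the same ID number in the two
spaces never collides.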

>> For 2.b, either we can start defining pseudo extensions or adding
>> vendor/arch/impid checks.
>>
>> @Conor: You seem to prefer the former approach instead of adding the
>> checks. Care to elaborate on why you think that's a better method compared
>> to a simple check?
> 
> Because I don't think that describing these as "errata" in the first
> place is even accurate. This is not a case of a vendor claiming they
> have Sscofpmf support but the implementation is flawed. As far as I
> understand, this is a vendor creating a useful feature prior to the
> creation of a standard extension.
> A bit of a test for this could be "If the standard extension never
> existed, would this be considered a new feature or an implementation
> issue". I think this is pretty clearly in the former camp.
> 

So we have 3 cases.

1. Pseudo extension: A vendor extension designed and/or implemented
before the standard RVI extension was ratified, but which does not
violate any standard encoding space.

2. Errata: A genuine bug/design issue in the expected behavior of a
standard RVI extension (including violating standard encoding space).

3. Vendor extension: A new extension, or a variant of a standard RVI
extension that is different enough from the standard extension.

IMO, the line between #2 and #1 may get blurry going forward
because of the sheer number of small extensions RVI is coming up with
(which is a problem as well).

Just to clarify: I am not too worried about this particular case, as we
know that T-Head's implementation predates the Sscofpmf extension.
But once we define a standard mechanism for this kind of situation,
vendors may start to abuse it.

> I do not think we should be using m*id detection for implementations of a
> feature prior to creation of a standard extension for the same purpose.
> To me the main difference between a case like this and VentanaCondOps/Zicond
> is that we are the ones calling this an extension (hence my use of pseudo)
> and not the vendor of the IP. If T-Head were to publish a document tomorrow
> on the T-Head github repo for official vendor extensions, that difference
> would not even exist any longer.
> 

Exactly! If a vendor publishes these as an extension or an erratum, that's
a binding agreement to call it in a specific way.

> I also do not believe that it is a "simple" check. The number of
> implementations that could end up using this PMU could just balloon
> if T-Head has no intention of switching to Sscofpmf. If they don't
> balloon in this case, there's nothing stopping them ballooning in a

Ideally, they shouldn't, as it is a simple case of CSR number & IRQ number.
If they care to implement AIA, then they must change it to the standard
Sscofpmf, as the current IRQ violates the AIA spec. But who knows if they
care to implement AIA or not.

> similar case in the future. We should let the platform firmware tell
> explicitly, be that via DT or ACPI, what features are supported rather
> than try to reverse engineer it ourselves via m*id.
>
Fair enough.
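
For reference, the two detection styles differ roughly as in the sketch
below. This is plain userspace C, not kernel code: struct hart_desc,
pmu_id_table, the 0x123 IDs and the "xvendorpmu" name are all
placeholders invented for this mail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one hart; not a kernel structure. */
struct hart_desc {
	uint64_t mvendorid;
	uint64_t marchid;
	uint64_t mimpid;
	const char *const *exts;	/* extension names from DT/ACPI */
	size_t n_exts;
};

/* Placeholder IDs for illustration only, not real vendor values. */
struct id_match {
	uint64_t mvendorid, marchid, mimpid;
};

static const struct id_match pmu_id_table[] = {
	{ 0x123, 0x0, 0x0 },
	{ 0x123, 0x1, 0x0 },	/* each new core needs another entry */
};

/* m*id approach: the kernel infers the feature from the ID registers. */
static bool has_vendor_pmu_by_id(const struct hart_desc *h)
{
	size_t n = sizeof(pmu_id_table) / sizeof(pmu_id_table[0]);

	for (size_t i = 0; i < n; i++) {
		const struct id_match *m = &pmu_id_table[i];

		if (h->mvendorid == m->mvendorid &&
		    h->marchid == m->marchid &&
		    h->mimpid == m->mimpid)
			return true;
	}
	return false;
}

/* Firmware approach: the feature exists only if firmware lists it. */
static bool has_ext(const struct hart_desc *h, const char *name)
{
	for (size_t i = 0; i < h->n_exts; i++) {
		if (strcmp(h->exts[i], name) == 0)
			return true;
	}
	return false;
}
```

In this sketch, a hart whose firmware omits "xvendorpmu" still matches
has_vendor_pmu_by_id(), while has_ext() reports the feature as absent.
The ID table also has to grow with every new implementation, whereas the
string lookup does not.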

> That leads into another general issue I have with using m*id detection,
> which I think I have mentioned several times on the list - it prevents the
> platform (hypervisor, emulator or firmware) from disabling that feature.
> 

If that is the only concern, the platform can just disable the actual
extension (i.e. Sscofpmf in this case) to disable that feature for that
particular vendor.

> If I had a time machine back to when the T-Head perf or cmo stuff was
> submitted, I would try to avoid any of it being merged with the m*id
> detection method.
> 
>> I agree that I don't have the crystal ball and may be proven wrong in the
>> future (I will be definitely happy about that!). But given the diversity of
>> RISC-V ecosystem, I feel that may be our sad reality.
> 
> I don't understand what this comment is referring to, it lacks context
> as to what the sad reality actually is.
> 
> I hope that all made sense and explained why I am against this method
> for detecting what I believe to be features rather than errata,
> Conor.
> 

Yes. Thanks again for the clarification. Again, I am not opposed to the
idea. I just wanted to understand if this is the best option we have
right now.
