[RFC RESEND 0/3] vp9 v4l2 stateless uapi

Hans Verkuil hverkuil at xs4all.nl
Thu Apr 29 10:23:02 BST 2021


On 27/04/2021 01:34, Ezequiel Garcia wrote:
> On Mon, 26 Apr 2021 at 14:38, Nicolas Dufresne <nicolas at ndufresne.ca> wrote:
>>
>> Le lundi 26 avril 2021 à 09:38 +0200, Hans Verkuil a écrit :
>>> Hi Andrzej,
>>>
>>> Thank you for working on this!
>>>
>>> On 21/04/2021 12:00, Andrzej Pietrasiewicz wrote:
>>>> Dear All,
>>>>
>>>> This is an RFC on stateless uapi for vp9 decoding with v4l2. This work is based on https://lkml.org/lkml/2020/11/2/1043, but has been substantially reworked. The important change is that the v4l2 control used to pass boolean decoder probabilities has been made unidirectional, and is now called V4L2_CID_STATELESS_VP9_COMPRESSED_HDR_PROBS.
>>>>
>>>> In the previous proposal, to queue a frame the userspace must fully dequeue the previous one, which effectively results in a forced lockstep behavior and defeats vb2's capability to enqueue multiple buffers. Such a design was a consequence of backward probability updates being performed by the kernel driver (which has direct access to appropriate counter values) but forward probability updates being coupled with compressed header parsing performed by the userspace.
>>>>
>>>> In vp9 the boolean decoder used to decode the bitstream needs certain parameters to work. Those are probabilities, which change with each frame. After each frame is decoded it is known how many times a given symbol occured in the frame, so the probabilities can be adapted. This process is known as backward probabilities update. A next frame header can also contain information which modifies probabilities resulting from backward update. The said modification is called forward probabilities update. The data for backward update is generated by the decoder hardware, while the data for forward update is prepared by reading the compressed frame header. The natural place to parse something is userspace, while the natural place to access hardware-provided counters is the kernel. Such responsibilties assignment was used in the original work.
>>>>
>>>> To overcome the lockstep, we moved forward probability updates to the kernel, while leaving parsing them in userspace. This way the v4l2 control which is used to pass the probs becomes unidirectional (user->kernel) and the userspace can keep parsing and enqueueing succeeding frames.
>>>>
>>>> If a particular driver parses the compressed header and does backward probability updates on its own then V4L2_CID_STATELESS_VP9_COMPRESSED_HDR_PROBS does not need to be used.
>>>>
>>>> This series adds vp9 uapi in proper locations, which means it is a proper, "official" uapi, as opposed to staging uapi which was proposed in the above mentioned lkml thread.
>>>
>>> Why? I rather liked the way that the other codec APIs started life in a private header
>>> (like include/media/vp8-ctrls.h) and were given time to mature before moving them to
>>> the uAPI. Is there a reason why you think that VP9 doesn't need that?
>>
>> I'll be honest, I accepted early code into GStreamer for H264, and it ended up
>> in a nightmare for the users. We now have a released GStreamer that supports
>> kernel API up to 5.9, a blackwhole at 5.10 and finally master catched up and can
>> support 5.11+. It is so complicated for packagers to understand what is going
>> on, that they endup wasting a lot of their time for a single feature in their
>> OS. Same breakage is happening for VP8 in 5.13, even though VP8 has been working
>> great all this time. I will for sure for now on ignore any contribution that
>> depends on staged uAPI.
>>
>> As for FFMPEG, even though now H264 API is table, the maintainers just simply
>> ignore the patches as they have been bitten by the reviewing stuff based on
>> unstable APIs and downstream work.
>>
>> I believe the staged uAPI has been used wrongly in the past. Stuff has been
>> staged quicky right before associated project budget for it was exhausted, so it
>> was in the end a way to look good, and someone else had to pick it up and finish
>> it. Going straight for final API put more pressure on making good research from
>> the start, doing more in-depth reviews and avoiding delaying for multiple years
>> the support. I believe the staging API are confusing even for the Linux
>> projects. Going straight to stable here is a commitment to finish this work and
>> doing it correctly.
>>
>> This specially make sense for VP9, which is a very Open CODEC and were all HW
>> implementation are Google/Hantro derivatives. Also, unlike when this work all
>> started, we do have multiple HW we can look at to validate the API, with more
>> then enough in-depth information to make the right decisions.
>>
> 
> +1
> 
> Although I can understand how, from the kernel point of view, it's
> tempting to merge
> the uAPI as staging first and then de-stage it, I have to say that I
> agree fully with
> Nicolas, the experience wasn't really good for the userspace.

It was a completely new API and it took quite a long time to really understand
what was needed and how to get it right. Not to mention implement it for
different platforms. Both H264 and MPEG-2 saw major changes. VP8 was the exception,
so it might well be that VP9 is equally good at the first attempt.

> 
> I really hope we can do better than this for at least VP9. So, let's make sure
> the hardware decoders that are currently available (Rockchip,
> Verisilicon, Mediatek)
> are covered, as well as any future features (dynamic frame resize).

Sure, if we can have this supported on several platforms and it is well reviewed,
then I am not opposed to merging it as a public API without going through
staging. We have build up a lot of experience by now.

> 
> A well-thought, honest effort for a sane uAPI is IMO the right way,
> and if we find out
> something is missing (which may happen, as we are all humans), we can still
> introduce another API control (V4L2_CID_STATELESS_VP9_V2) and use it
> to supersede the current API. If I understand correctly, this should work,
> and allow backward compatibility without issues.

Yes, but it is something we'd like to avoid. You need to have sufficient
confidence that the uAPI has been well tested and is in good shape. If you
have that, then great, we can merge it without going through staging.

In particular, if 1) we support at least two HW platforms, and 2) testing
with test suites (I assume those are available for VP9) passed on those
platforms, then I'd be happy to merge.

Regarding the 'userspace mess': why did support for staging APIs end up
in released gstreamer/ffmpeg implementations at all? What did you expect
would happen? Released versions of those applications should only support
public APIs, not staging APIs. Unless perhaps if you explicitly enable it
with some test config option that is by default off. And once a codec API
becomes public, rip out the old code.

I don't use gstreamer/ffmpeg myself, so I hadn't noticed. I thought people
were testing using test branches of those apps. It's staging for a reason,
you are completely on your own if you use it.

I wonder if I should make a patch that issues some really serious messages
in the kernel log if someone attempts to use these staging APIs.

Of course, ideally you would have a perfect API from the beginning, but that
wasn't an option here, and keeping everything out of the kernel entirely until
we're happy with the uAPI would be almost impossible to maintain.

Regards,

	Hans



More information about the Linux-rockchip mailing list