[RFC v2 6/8] media: uapi: Remove bit_size field from v4l2_ctrl_hevc_slice_params

John Cox jc at kynesim.co.uk
Wed Feb 16 02:38:02 PST 2022


On Tue, 15 Feb 2022 21:27:30 +0100, you wrote:

>Dne torek, 15. februar 2022 ob 17:31:28 CET je John Cox napisal(a):
>> On Tue, 15 Feb 2022 17:11:12 +0100, you wrote:
>> 
>> >Dne torek, 15. februar 2022 ob 17:00:33 CET je John Cox napisal(a):
>> >> On Tue, 15 Feb 2022 10:28:55 -0500, you wrote:
>> >> 
>> >> >Le mardi 15 février 2022 à 14:50 +0000, John Cox a écrit :
>> >> >> On Tue, 15 Feb 2022 15:35:12 +0100, you wrote:
>> >> >> 
>> >> >> > 
>> >> >> > Le 15/02/2022 à 15:17, John Cox a écrit :
>> >> >> > > Hi
>> >> >> > > 
>> >> >> > > > The bit size of the slice could be deduced from the buffer 
>payload
>> >> >> > > > so remove bit_size field to avoid duplicated the information.
>> >> >> > > I think this is a bad idea. In the future we are (I hope) going to 
>> >want
>> >> >> > > to have an array (variable) of slice headers all referring to the 
>> >same
>> >> >> > > bit buffer.  When we do that we will need this field.
>> >> >> > 
>> >> >> > I wonder if that could be considering like another decode mode and 
>so
>> >> >> > use an other control ?
>> >> >> 
>> >> >> I, personally, would be in favour of making the slice header control a
>> >> >> variable array just as it is.  If userland can't cope with multiple
>> >> >> entries then just send them one at a time and the code looks exactly
>> >> >> like it does at the moment and if the driver can't then set max array
>> >> >> entries to 1.
>> >> >> 
>> >> >> Having implemented this in rpi port of ffmpeg and the RPi V4L2 driver I
>> >> >> can say with experience that the code and effort overhead is very low.
>> >> >> 
>> >> >> Either way having a multiple slice header control in the UAPI is
>> >> >> important for efficiency.
>> >> >
>> >> >Just to clarify the idea, we would have a single slice controls, always 
>> >dynamic:
>> >> >
>> >> >1. For sliced based decoder
>> >> >
>> >> >The dynamic array slice control is implemented by the driver and its 
>size 
>> >must
>> >> >be 1.
>> >> 
>> >> Yes
>> >> 
>> >> >2. For frame based decoder that don't care for slices
>> >> >
>> >> >The dynamic array slice controls is not implement. Userland detects that 
>at
>> >> >runtime, similar to the VP9 compressed headers.
>> >> 
>> >> If the driver parses all the slice header then that seems plausible
>> >> 
>> >> >3. For frame based decoders that needs slices (or driver that supports 
>> >offset
>> >> >and can gain performance with such mode)
>> >> >
>> >> >The dynamic array slice controls is implemented, and should contain all 
>the
>> >> >slices found in the OUTPUT buffer.
>> >> >
>> >> >So the reason for this bit_size (not sure why its bits though, perhaps 
>> >someone
>> >> >can educate me ?)
>> >> 
>> >> RPi doesn't need bits and would be happy with bytes however
>> >> slice_segment data isn't byte aligned at the end so its possible that
>> >> there might be decoders out there that want an accurate length for that.
>> >
>> >There are two fields, please don't mix them up:
>> >
>> >__u32	bit_size;
>> >__u32	data_bit_offset; (changed to data_byte_offset in this series)
>> >
>> >data_bit_offset/data_byte_offset is useful, while bit_size is IMO not. If you 
>> >have multiple slices in array, you only need to know start of the slice 
>data 
>> >and that offset is always offset from start of the buffer (absolute, it's not 
>> >relative to previous slice data).
>> 
>> No... or at least I think not. RPi needs the start and end of the
>> slice_segment_data elements of each slice. 
>
>It would be good to know if size needs to be exact or can overshoot, like 
>using end of buffer for that.
>
>Cedrus also wants to know slice data size, but it turns out that bigger than 
>necessary size doesn't pose any problems. If that's not the case, then 
>bit_size needs stay in for sure.

RPi needs the exact size (bytes will do - but I don't see the harm in
specifying it in bits in case some future h/w wants the extra precision
as long as we nail down exactly which bit we mean)

Regards

JC

>Best regards,
>Jernej
>
>> If slices are arranged in the
>> buffer with slice_segment_headers attached then I don't see how I get to
>> know that.  Also if the OUTPUT buffer is just a bit of bitstream, which
>> might well be very convienient for some userspace, then it is legitimate
>> to have SEIs between slice headers so you can't even guarantee that your
>> coded slice segments are contiguous.
>> 
>> Regards
>> 
>> JC
>> 
>> >Best regards,
>> >Jernej
>> >
>> >> 
>> >> > Would be to let the driver offset inside the the single
>> >> >OUTPUT/bitstream buffer in case this is not automatically found by the 
>> >driver
>> >> >(or that no start-code is needed). Is that last bit correct ? If so, 
>should 
>> >we
>> >> >change it to an offset rather then a size ? Shall we allow using offesets 
>> >inside
>> >> >larger buffer (e.g. to avoid some memory copies) for the Sliced Base 
>cases ?
>> >> 
>> >> I use (in the current structure) data_bit_offset to find the start of
>> >> each slice's slice_segment_data within the OUTPUT buffer and bit_size to
>> >> find the end.  RPi doesn't / can't parse the slice_header and so wants
>> >> all of that.  Decoders that do parse the header might plausably want
>> >> header offsets too and it would facilitate zero copy of the bit buffer.
>> >> 
>> >>  
>> >> >> Regards
>> >> >> 
>> >> >> John Cox
>> >> >> 
>> >> >> > > > Signed-off-by: Benjamin Gaignard 
><benjamin.gaignard at collabora.com>
>> >> >> > > > ---
>> >> >> > > > .../userspace-api/media/v4l/ext-ctrls-codec.rst       |  3 ---
>> >> >> > > > drivers/staging/media/sunxi/cedrus/cedrus_h265.c      | 11 +++
>> >+-------
>> >> >> > > > include/uapi/linux/v4l2-controls.h                    |  3 +--
>> >> >> > > > 3 files changed, 5 insertions(+), 12 deletions(-)
>> >> >> > > > 
>> >> >> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-
>> >codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> >> > > > index 3296ac3b9fca..c3ae97657fa7 100644
>> >> >> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> >> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> >> > > > @@ -2965,9 +2965,6 @@ enum 
>v4l2_mpeg_video_hevc_size_of_length_field 
>> >-
>> >> >> > > >      :stub-columns: 0
>> >> >> > > >      :widths:       1 1 2
>> >> >> > > > 
>> >> >> > > > -    * - __u32
>> >> >> > > > -      - ``bit_size``
>> >> >> > > > -      - Size (in bits) of the current slice data.
>> >> >> > > >      * - __u32
>> >> >> > > >        - ``data_bit_offset``
>> >> >> > > >        - Offset (in bits) to the video data in the current slice 
>> >data.
>> >> >> > > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/
>> >drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> >> > > > index 8ab2d9c6f048..db8c7475eeb8 100644
>> >> >> > > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> >> > > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> >> > > > @@ -312,8 +312,8 @@ static void cedrus_h265_setup(struct 
>cedrus_ctx 
>> >*ctx,
>> >> >> > > > 	const struct v4l2_hevc_pred_weight_table 
>*pred_weight_table;
>> >> >> > > > 	unsigned int width_in_ctb_luma, ctb_size_luma;
>> >> >> > > > 	unsigned int log2_max_luma_coding_block_size;
>> >> >> > > > +	size_t slice_bytes;
>> >> >> > > > 	dma_addr_t src_buf_addr;
>> >> >> > > > -	dma_addr_t src_buf_end_addr;
>> >> >> > > > 	u32 chroma_log2_weight_denom;
>> >> >> > > > 	u32 output_pic_list_index;
>> >> >> > > > 	u32 pic_order_cnt[2];
>> >> >> > > > @@ -370,8 +370,8 @@ static void cedrus_h265_setup(struct 
>cedrus_ctx 
>> >*ctx,
>> >> >> > > > 
>> >> >> > > > 	cedrus_write(dev, VE_DEC_H265_BITS_OFFSET, 0);
>> >> >> > > > 
>> >> >> > > > -	reg = slice_params->bit_size;
>> >> >> > > > -	cedrus_write(dev, VE_DEC_H265_BITS_LEN, reg);
>> >> >> > > > +	slice_bytes = vb2_get_plane_payload(&run->src-
>>vb2_buf, 0);
>> >> >> > > > +	cedrus_write(dev, VE_DEC_H265_BITS_LEN, slice_bytes);
>> >> >> > > I think one of these must be wrong. bit_size is in bits,
>> >> >> > > vb2_get_plane_payload is in bytes?
>> >> >> > 
>> >> >> > You are right it should be vb2_get_plane_payload() * 8 to get the 
>size 
>> >in bits.
>> >> >> > 
>> >> >> > I will change that in v3.
>> >> >> > 
>> >> >> > > 
>> >> >> > > Regards
>> >> >> > > 
>> >> >> > > John Cox
>> >> >> > >   
>> >> >> > > > 	/* Source beginning and end addresses. */
>> >> >> > > > 
>> >> >> > > > @@ -384,10 +384,7 @@ static void cedrus_h265_setup(struct 
>> >cedrus_ctx *ctx,
>> >> >> > > > 
>> >> >> > > > 	cedrus_write(dev, VE_DEC_H265_BITS_ADDR, reg);
>> >> >> > > > 
>> >> >> > > > -	src_buf_end_addr = src_buf_addr +
>> >> >> > > > -			   DIV_ROUND_UP(slice_params-
>>bit_size, 
>> >8);
>> >> >> > > > -
>> >> >> > > > -	reg = 
>VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_end_addr);
>> >> >> > > > +	reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_addr + 
>slice_bytes);
>> >> >> > > > 	cedrus_write(dev, VE_DEC_H265_BITS_END_ADDR, reg);
>> >> >> > > > 
>> >> >> > > > 	/* Coding tree block address */
>> >> >> > > > diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/
>> >linux/v4l2-controls.h
>> >> >> > > > index b1a3dc05f02f..27f5d272dc43 100644
>> >> >> > > > --- a/include/uapi/linux/v4l2-controls.h
>> >> >> > > > +++ b/include/uapi/linux/v4l2-controls.h
>> >> >> > > > @@ -2457,7 +2457,6 @@ struct v4l2_hevc_pred_weight_table {
>> >> >> > > > #define V4L2_HEVC_SLICE_PARAMS_FLAG_DEPENDENT_SLICE_SEGMENT	
>> >(1ULL << 9)
>> >> >> > > > 
>> >> >> > > > struct v4l2_ctrl_hevc_slice_params {
>> >> >> > > > -	__u32	bit_size;
>> >> >> > > > 	__u32	data_bit_offset;
>> >> >> > > > 
>> >> >> > > > 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header 
>*/
>> >> >> > > > @@ -2484,7 +2483,7 @@ struct v4l2_ctrl_hevc_slice_params {
>> >> >> > > > 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing 
>SEI message 
>> >*/
>> >> >> > > > 	__u8	pic_struct;
>> >> >> > > > 
>> >> >> > > > -	__u8	reserved;
>> >> >> > > > +	__u8	reserved[5];
>> >> >> > > > 
>> >> >> > > > 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice 
>segment 
>> >header */
>> >> >> > > > 	__u32	slice_segment_addr;
>> >> 
>> >
>> 
>



More information about the linux-arm-kernel mailing list