[PATCH v4 5/6] media: Add controls for JPEG quantization tables

Ian Arkver ian.arkver.dev at gmail.com
Mon Sep 3 14:09:26 PDT 2018


Hi Ezequiel,

On 03/09/2018 18:15, Ezequiel Garcia wrote:
> On Mon, 2018-09-03 at 16:51 +0100, Ian Arkver wrote:
>> Hi Hans, Ezequiel,
>>
>> On 03/09/2018 16:33, Hans Verkuil wrote:
>>> On 09/03/2018 05:27 PM, Ezequiel Garcia wrote:
>>>> Hi Ian, Hans:
>>>>
>>>> On Mon, 2018-09-03 at 14:29 +0100, Ian Arkver wrote:
>>>>> Hi,
>>>>>
>>>>> On 03/09/2018 10:50, Hans Verkuil wrote:
>>>>>> On 08/31/2018 05:52 PM, Ezequiel Garcia wrote:
>>>>>>> From: Shunqian Zheng <zhengsq at rock-chips.com>
>>>>>>>
>>>>>>> Add V4L2_CID_JPEG_QUANTIZATION compound control to allow userspace
>>>>>>> configure the JPEG quantization tables.
>>>>>>>
>>>>>>> Signed-off-by: Shunqian Zheng <zhengsq at rock-chips.com>
>>>>>>> Signed-off-by: Ezequiel Garcia <ezequiel at collabora.com>
>>>>>>> ---
>>>>>>>     .../media/uapi/v4l/extended-controls.rst      | 23 +++++++++++++++++++
>>>>>>>     .../media/videodev2.h.rst.exceptions          |  1 +
>>>>>>>     drivers/media/v4l2-core/v4l2-ctrls.c          | 10 ++++++++
>>>>>>>     include/uapi/linux/v4l2-controls.h            |  5 ++++
>>>>>>>     include/uapi/linux/videodev2.h                |  1 +
>>>>>>>     5 files changed, 40 insertions(+)
>>>>>>>
>>>>>>> diff --git a/Documentation/media/uapi/v4l/extended-controls.rst b/Documentation/media/uapi/v4l/extended-controls.rst
>>>>>>> index 9f7312bf3365..e0dd03e452de 100644
>>>>>>> --- a/Documentation/media/uapi/v4l/extended-controls.rst
>>>>>>> +++ b/Documentation/media/uapi/v4l/extended-controls.rst
>>>>>>> @@ -3354,7 +3354,30 @@ JPEG Control IDs
>>>>>>>         Specify which JPEG markers are included in compressed stream. This
>>>>>>>         control is valid only for encoders.
>>>>>>>     
>>>>>>> +.. _jpeg-quant-tables-control:
>>>>>>>     
>>>>>>> +``V4L2_CID_JPEG_QUANTIZATION (struct)``
>>>>>>> +    Specifies the luma and chroma quantization matrices for encoding
>>>>>>> +    or decoding a V4L2_PIX_FMT_JPEG_RAW format buffer. The two matrices
>>>>>>> +    must be set in JPEG zigzag order, as per the JPEG specification.
>>>>>>
>>>>>> Can you change "JPEG specification" to a reference to the JPEG spec entry
>>>>>> in bibio.rst?
>>>>>>
>>>>>>> +
>>>>>>> +
>>>>>>> +.. c:type:: struct v4l2_ctrl_jpeg_quantization
>>>>>>> +
>>>>>>> +.. cssclass:: longtable
>>>>>>> +
>>>>>>> +.. flat-table:: struct v4l2_ctrl_jpeg_quantization
>>>>>>> +    :header-rows:  0
>>>>>>> +    :stub-columns: 0
>>>>>>> +    :widths:       1 1 2
>>>>>>> +
>>>>>>> +    * - __u8
>>>>>>> +      - ``luma_quantization_matrix[64]``
>>>>>>> +      - Sets the luma quantization table.
>>>>>>> +
>>>>>>> +    * - __u8
>>>>>>> +      - ``chroma_quantization_matrix[64]``
>>>>>>> +      - Sets the chroma quantization table.
>>>>>>
>>>>>> Just checking: the JPEG standard specifies this as unsigned 8-bit values as well?
>>>>
>>>> I thought this was already discussed, but I think the only thing I've added
>>>> is this comment in one of the driver's headers:
>>>>
>>>>    JPEG encoder
>>>>    ------------
>>>>    The VPU JPEG encoder produces JPEG baseline sequential format.
>>>>    The quantization coefficients are 8-bit values, complying with
>>>>    the baseline specification. Therefore, it requires application-defined
>>>>    luma and chroma quantization tables. The hardware does entrophy
>>>>    encoding using internal Huffman tables, as specified in the JPEG
>>>>    specification.
>>>>
>>>> Certainly controls should be specified better.
>>>>
>>>>> As far as I can see ISO/IEC 10918-1 does not specify the precision or
>>>>> signedness of the quantisation value Qvu. The default tables for 8-bit
>>>>> baseline JPEG all fit into __u8 though.
>>>>>
>>>>
>>>> Paragraph 4.7 of that spec, indicates the "sample" precision:
>>>> 8-bit for baseline; 8-bit or 12-bit for extended.
>>>>
>>>> For the quantization coefficients, the DQT segment contains a bit
>>>> that indicates if the quantization coefficients are 8-bit or 16-bit.
>>>> See B.2.4.1 for details.
>>
>> See below (and Tq which follows the Pq field)
>>
>>>>
>>>>> However there can be four sets of tables in non-baseline JPEG and it's
>>>>
>>>> You lost me here, which four sets of tables are you refering to?
>>>>
>>>>> not clear (to me) whether 12-bit JPEG would need more precision (I'd
>>>>> guess it would).
>>>>
>>>> It seems it would. From B.2.4.1:
>>>>
>>>> "An 8-bit DCT-based process shall not use a 16-bit precision quantization table."
>>>>
>>>>> Since this patch is defining UAPI I think it might be
>>>>> good to build in some additional information, eg. number of tables,
>>>>> element size. Maybe this can all be inferred from the selected pixel
>>>>> format? If so then it would need documented that the above structure
>>>>> only applies to baseline.
>>>>>
>>>>
>>>> For quantization coefficients, I can only see two tables: one for luma
>>>> one for chroma. Huffman coefficients are a different story and we are
>>>> not really adding them here.
>>
>> I was looking at the definition of Tqi in the frame header in B.2.2
>> which seems to allow up to four (sets of?) quantization tables.
>> Rereading it, it seems these are per component. Table B.2 implies that
>> this applies to Baseline Sequential too. In the DQT marker description
>> there's a Tq field to specify the destination for the new table. I think
>> this means that an encoder can use up to four (sets of) tables and a
>> decoder should be able to store four from the stream.
>>
>> This may well not be relevant to the encoder under discussion, but if
>> it's not allowed for in UAPI it's almost a given that it'll need to be
>> added later.
>>
> 
> Hm, I think it's not really relevant for us, either on the encoding
> or decoding side. Let me explain how I read the spec.
> 
> First of all, keep in mind it seems to be written with streams
> in mind, which explains different start-of-image and start-of-frames
> segments (SOI and SOF).
> 
> The way I read the four tables thing, the decoder expects to be first
> transmitted a set of quantization tables, via DQT segments. Each DQT
> segment contains a 4-bit index (0-3), allowing up to four quantization
> tables to be defined.
> 
> Then, each frame is transmitted in a SOF segment. The SOF header
> contains an 8-bit field (Tq_i) indicating which quantization table has
> to be used for this frame.
> 
> In Video4Linux, we are frame-based, and the JPEG segments have no meaning,
> because these controls are meant to be used with the JPEG RAW format,
> as specified.
> 
> For the decoder side, the application is expected to parse all the segments,
> and set quantization tables and compressed bitstream, via legacy or
> request APIs (probably the latter).
> 
> Same goes for encoding, the user will set a quantization table for the
> encoding process and then take care of the JPEG segments creation.

I hadn't got my head round only one table being in use at a time for 
this frame based API, nor had I appreciated that all the DQT 
parsing/creation etc was handled in userspace.

So, yeah, I agree that this JPEG control only needs a single luma and 
chroma pair of tables for this usage scenario.

Thanks for the explanation.
Ian.

> 
>> BTW, my copy of the spec is dated 1993(E). Maybe it's out of date?
>>
> 
> My copy is also 1992, so even more dated :-)
> 
>>>
>>> Since (if I understand this correctly) we would need u16 for extended precision
>>> JPEG, shouldn't we use u16 instead of u8? That makes the control more generic.
>>
>> This seems like a safer option to me.
>>
> 
> Yes, I agree too.
> 
>>
>>>
>>> BTW, are the coefficients always unsigned? I think so, but I never read the
>>> JPEG spec.
>>>
> 
> I don't think signed quantization coefficients make sense, so perhaps
> this is not explicitly specified?
> 
> Regards,
> Eze
> 



More information about the Linux-rockchip mailing list