[RFC PATCH v6] media: mediatek: vcodec: support stateless AV1 decoder

Xiaoyong Lu (卢小勇) Xiaoyong.Lu at mediatek.com
Mon Jan 2 23:57:21 PST 2023


Hi Andrzej,
> Hi,
> 
> W dniu 29.11.2022 o 12:50, Xiaoyong Lu (卢小勇) pisze:
> > Hi Andrzej,
> > 
> > Thanks for your comments, I have fixed all except for some items
> > which
> > need more discussion with you.
> > 
> > 
> > 1.
> > > +    for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > +        if (instance->cdf[i].va)
> > > +            mtk_vcodec_mem_free(ctx, &instance->cdf[i]);
> > > +        instance->cdf[i].size = size;
> > > +        ret = mtk_vcodec_mem_alloc(ctx, &instance->cdf[i]);
> > > +        if (ret)
> > > +            goto err;
> > > +        work_buffer[i].cdf_addr.buf = instance->cdf[i].dma_addr;
> > > +        work_buffer[i].cdf_addr.size = size;
> > > +    }
> > 
> > The 3 for loops are supposed to iterate from 0 to
> > AV1_MAX_FRAME_BUF_COUNT - 1,
> > inclusive. Is it possible to merge them?
> > 
> > -->can you help to explain more clearly, and what  does  "3" stand
> > for?
> > 
> 
> In vdec_av1_slice_alloc_working_buffer() there are 3 similar "for"
> loops
> which (are supposed to) iterate the same number of times.
> I was wondering if was possible to collapse them into one loop.
OK,it can collapse into one loop.

> > 2.
> > > +
> > > +    ret = mtk_vcodec_mem_alloc(ctx, &instance->tile);
> > > +    if (ret)
> > > +        goto err;
> > > +
> > > +    vsi->tile.buf = instance->tile.dma_addr;
> > > +    vsi->tile.size = size;
> > 
> > vsi->tile.size = instance->tile.size;
> > 
> > and now it is clear the size in vsi is the same as the one in
> > instance.
> > BTW, is vsi->tile.size supposed to always be equal to the one in
> > instance?
> > If yes:
> >    - Is instance available whenever we need to access vsi-
> > >tile.size?
> >    - What's the point of duplicating this value? Can it be stored
> > in one
> > place?
> >    
> > --> vsi->tile.size is always equal to instance->tile.size, but they
> > should not be stored in one place. because vsi will be used for
> > micro-
> > processor,and instance is only used in kernel side.
> 
> Makes sense.
> 
> > Also I can remove following two statements.
> > > +    vsi->tile.buf = instance->tile.dma_addr;
> > > +    vsi->tile.size = size;
> > 
> > because  it will re-assign values for each frame before entering to
> > micro-processor in vdec_av1_slice_setup_lat_buffer.
> >    
> 
> the fewer lines to review and compile, the better :)
> 
> > 3.
> > > +static void vdec_av1_slice_free_working_buffer(struct
> > 
> > vdec_av1_slice_instance *instance)
> > > +{
> > > +    struct mtk_vcodec_ctx *ctx = instance->ctx;
> > > +    int i;
> > > +
> > > +    for (i = 0; i < ARRAY_SIZE(instance->mv); i++)
> > > +        if (instance->mv[i].va)
> > > +            mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> > 
> > Perhaps mtk_vcodec_mem_free() can properly handle the case
> > 
> > (!instance->mv[i].va) ? This would eliminate 7 of 20 lines of code
> > in this function.
> > 
> > --> I think it will be better to add this , because
> > 1.if mtk_vcodec_mem_free is changed by someone later and donnot
> > do  va
> > null check,  it will ensure safely in caller.
> 
> There's no stable internal api in the kernel. Whoever changes
> internal api,
> they are responsible to make sure eiter its users are unaffected,
> or to change all users accordingly. And you can't think of all
> possible
> crazy ways people might want to change the code 5 or 10 or 20 years
> from now. In other words, trying to prevent a situation that does
> not exist in exchange for more lines of code seems not a good deal.
> 
> > 2.If I donnot do  va null check,  there will print an error log in
> > mtk_vcodec_mem_free  when va is null. But it may be normal behivor.
> >       if (!mem->va) {
> >          mtk_v4l2_err("%s dma_free size=%ld failed!",
> > dev_name(dev),
> >                   size);
> >          return;
> > }
> > what about your opinion?
> > 
> 
> What I meant by "properly handling the NULL case" was to be prepared
> to handle the NULL case in mtk_vcodec_mem_free(), that is to assume
> that NULL can as well be passed to it. Given NULL is allowed, there's
> no point in announcing that "we've been passed a NULL".
> That's how kfree() behaves, for example.
OK, I will remove null check here.

> > 4.
> > 
> >   +static void vdec_av1_slice_setup_gm(struct vdec_av1_slice_gm
> > *gm,
> > > +                    struct v4l2_av1_global_motion *ctrl_gm)
> > > +{
> > > +    u32 i, j;
> > > +
> > > +    for (i = 0; i < V4L2_AV1_TOTAL_REFS_PER_FRAME; i++) {
> > > +        gm[i].wmtype = ctrl_gm->type[i];
> > > +        for (j = 0; j < 6; j++)
> > 
> > Maybe #define this magic 6?
> > 
> > --> I  see uapi  driver use magic number 6 too, which is in
> > struct v4l2_av1_global_motion {
> >      __u8 flags[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> >      enum v4l2_av1_warp_model type[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> >      __u32 params[V4L2_AV1_TOTAL_REFS_PER_FRAME][6];
> >      __u8 invalid;
> > };
> >  From av1 official document in 5.9.24 Global motion params syntax
> > section, it also use magic number 6.
> > 
> > so i think it will not need to change this.
> > 
> 
> fair enough
> 
> > 5.
> > static void vdec_av1_slice_setup_quant(struct
> > vdec_av1_slice_quantization *quant,
> > > +                       struct v4l2_av1_quantization *ctrl_quant)
> > > +{
> > > +    quant->base_q_idx = ctrl_quant->base_q_idx;
> > > +    quant->delta_qydc = ctrl_quant->delta_q_y_dc;
> > > +    quant->delta_qudc = ctrl_quant->delta_q_u_dc;
> > > +    quant->delta_quac = ctrl_quant->delta_q_u_ac;
> > > +    quant->delta_qvdc = ctrl_quant->delta_q_v_dc;
> > > +    quant->delta_qvac = ctrl_quant->delta_q_v_ac;
> > > +    quant->qm_y = ctrl_quant->qm_y;
> > > +    quant->qm_u = ctrl_quant->qm_u;
> > > +    quant->qm_v = ctrl_quant->qm_v;
> > 
> > Can a common struct be introduced to hold these parameters?
> > And then copied in one go?
> > 
> > Maybe there's a good reason the code is the way it is now. However,
> > a series of "dumb" assignments (no value modifications) makes me
> > wonder.
> > 
> >   +        cdef->cdef_y_strength[i] =
> > > +            ctrl_cdef->y_pri_strength[i] <<
> > 
> > SECONDARY_FILTER_STRENGTH_NUM_BITS |
> > > +            ctrl_cdef->y_sec_strength[i];
> > > +        cdef->cdef_uv_strength[i] =
> > > +            ctrl_cdef->uv_pri_strength[i] <<
> > 
> > SECONDARY_FILTER_STRENGTH_NUM_BITS |
> > > +            ctrl_cdef->uv_sec_strength[i];
> > > +    }
> > > +}
> > 
> > Both vdec_av1_slice_setup_lf() and vdec_av1_slice_setup_cdef():
> > 
> > I'm wondering if the user of struct vdec_av1_slice_loop_filter and
> > struct
> > vdec_av1_slice_cdef could work with the uAPI variants of these
> > structs?
> > Is there
> > a need for driver-specific mutations? (Maybe there is, the driver's
> > author
> > should know).
> > 
> > -->1.There is a mediatek map structure in kernel part which also
> > has
> > the same structure in micro-processor.
> > 2.Uapi v4l2 structure will not access in micro-processor, and
> > mediatek
> > driver structure may not be all the same with v4l2.
> > 
> > so these will not need to change.
> 
> You know your hardware. If you say so, you must be right.
> 
> > 
> > 6.
> > > +    /* bs NULL means flush decoder */
> > > +    if (!bs)
> > > +        return vdec_av1_slice_flush(h_vdec, bs, fb, res_chg);
> > > +
> > > +    lat_buf = vdec_msg_queue_dqbuf(&ctx->msg_queue.lat_ctx);
> > > +    if (!lat_buf) {
> > > +        mtk_vcodec_err(instance, "failed to get AV1 lat buf\n");
> > 
> > there exists vdec_msg_queue_deinit(). Should it be called in this
> > (and
> > subsequent) error recovery path(s)?
> > 
> > > +        return -EBUSY;
> > > +    }
> > 
> > -->vdec_msg_queue_deinit is called when do deinit operation before
> > stop
> > streaming.
> > But here is just error handle for current frame,  so it shouldnot
> > call
> > vdec_msg_queue_deinit here.
> 
> This is confusing. It would be better if vdec_msg_queue_deinit() were
> called from another place, not from the function which is called
> per-frame. The "mirror location" for "before stop streaming" seems
> "after start streaming", but I don't know.
Now, vdec_msg_queue_deinit() is called only once druing one playing at
properly location which is not from the function for per-frame. 

> > 7.
> > > +    src = v4l2_m2m_next_src_buf(instance->ctx->m2m_ctx);
> > > +    if (!src)
> > > +        return -EINVAL;
> > > +
> > > +    lat_buf->src_buf_req = src->vb2_buf.req_obj.req;
> > > +    dst = &lat_buf->ts_info;
> > 
> > the "ts_info" actually contains a struct vb2_v4l2_buffer. Why such
> > a
> > name?
> > 
> > -->"ts_info" is not named by me, and i just use it here, so i will
> > not
> > rename it.
> > 
> 
> fair enough
> 
> > I ask the owner why it is named with ts_info and the answer is:
> > "This parame is used to store timestamp releated information from
> > output queue to capture queue."
> 
> So you reached out to the author of the "ts_info" name.
> I haven't seen you asking here (maybe I haven't noticed?).
> It is better to ask such a question in the mailing list rather than
> in a private message.
The author is my colleague, so I can ask him directly.
> > 
> > 8.
> > > +    /* frame header */
> > > +    ctrl_fh = (struct v4l2_ctrl_av1_frame *)
> > > +          vdec_av1_get_ctrl_ptr(instance->ctx,
> > > +                    V4L2_CID_STATELESS_AV1_FRAME);
> > > +    if (IS_ERR(ctrl_fh))
> > > +        return PTR_ERR(ctrl_fh);
> > > +
> > > +    ctrl_seq = (struct v4l2_ctrl_av1_sequence *)
> > > +           vdec_av1_get_ctrl_ptr(instance->ctx,
> > > +                     V4L2_CID_STATELESS_AV1_SEQUENCE);
> > > +    if (IS_ERR(ctrl_seq))
> > > +        return PTR_ERR(ctrl_seq);
> > 
> > Just to make sure: I assume request api is used? If so, does vdec's
> > framework
> > ensure that v4l2_ctrl_request_setup() has been called? It
> > influences
> > what's
> > actually in ctrl->p_cur.p (current or previous value), and the
> > vdec_av1_get_ctrl_ptr() wrapper returns ctrl->p_cur.p.
> > 
> > -->v4l2_ctrl_request_setup() is called in  mtk_vdec_worker which is
> > before this function.
> > 
> 
> great
> 
> > 9.enum vdec_av1_slice_resolution_level level;
> > > > +u32 max_sb_w, max_sb_h, max_w, max_h, w, h;
> > > > +size_t size;
> > > > +int i, ret;
> > > > +
> > > > +w = vsi->frame.uh.frame_width;
> > > > +h = vsi->frame.uh.frame_height;
> > > > +
> > > > +if (w > VCODEC_DEC_4K_CODED_WIDTH || h >
> > > > VCODEC_DEC_4K_CODED_HEIGHT)
> > > > +/* 8K */
> > > > +return -EINVAL;
> > > > +
> > > > +if (w > MTK_VDEC_MAX_W || h > MTK_VDEC_MAX_H) {
> > > > +/* 4K */
> > > > +level = AV1_RES_4K;
> > > > +max_w = VCODEC_DEC_4K_CODED_WIDTH;
> > > > +max_h = VCODEC_DEC_4K_CODED_HEIGHT;
> > > > +} else {
> > > > +/* FHD */
> > > > +level = AV1_RES_FHD;
> > > > +max_w = MTK_VDEC_MAX_W;
> > > > +max_h = MTK_VDEC_MAX_H;
> > > > +}
> > > > +
> > > > +if (level == instance->level)
> > > > +return 0;
> > > > +
> > > > +mtk_vcodec_debug(instance, "resolution level changed from %u
> > > > to
> > > > %u, %ux%u",
> > > > + instance->level, level, w, h);
> > > > +
> > > > +max_sb_w = DIV_ROUND_UP(max_w, 128);
> > > > +max_sb_h = DIV_ROUND_UP(max_h, 128);
> > > > +size = max_sb_w * max_sb_h * SZ_1K;
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +if (instance->mv[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> > > > +instance->mv[i].size = size;
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->mv[i]);
> > > > +if (ret)
> > > > +goto err;
> > > 
> > > Please ignore this comment if this has been discussed and
> > > settled.
> > > Maybe it's just me, but I feel it is idiomatic in the kernel to
> > > undo all previous allocations if at some iteration we fail. Here
> > > a
> > > different
> > > approach is taken: we stop iterating and return an error, and
> > > free
> > > next time
> > > we are called. Why?
The buffer size has something to do with picture width and height, so
when resolution is changed, it will re-allocate the buffer.
> > Thanks
> > Best Regards
> > xiaoyong Lu
> > 
> 
> Please avoid top-posting. It is easier for your reviewer to read your
> answers
> inline. And, actually, it is easier for you not having to find the
> relevant
> fragments, copy and then paste on top. Instead, try answering just
> below
> reviewer's comments.
> 
Thanks, i will pay attention to this.
> Also, when communicating on the mailing list eliminate the footer
> which makes the readers wonder if they are allowed to read your
> messages.
> 
> Regards,
> 
> Andrzej
> 
> > 
> > On Thu, 2022-11-17 at 13:42 +0100, Andrzej Pietrasiewicz wrote:
> > > Hi Xiaoyong Lu,
> > > 
> > > Sorry about chiming in only at v6. Please see inline below.
> > > 
> > > Andrzej
> > > 
> > > W dniu 17.11.2022 o 07:17, Xiaoyong Lu pisze:
> > > > Add mediatek av1 decoder linux driver which use the stateless
> > > > API
> > > > in
> > > > MT8195.
> > > > 
> > > > Signed-off-by: Xiaoyong Lu<xiaoyong.lu at mediatek.com>
> > > > ---
> > > > Changes from v5:
> > > > 
> > > > - change av1 PROFILE and LEVEL cfg
> > > > - test by av1 fluster, result is 173/239
> > > > 
> > > > Changes from v4:
> > > > 
> > > > - convert vb2_find_timestamp to vb2_find_buffer
> > > > - test by av1 fluster, result is 173/239
> > > > 
> > > > Changes from v3:
> > > > 
> > > > - modify comment for struct vdec_av1_slice_slot
> > > > - add define SEG_LVL_ALT_Q
> > > > - change use_lr/use_chroma_lr parse from av1 spec
> > > > - use ARRAY_SIZE to replace size for loop_filter_level and
> > > > loop_filter_mode_deltas
> > > > - change array size of loop_filter_mode_deltas from 4 to 2
> > > > - add define SECONDARY_FILTER_STRENGTH_NUM_BITS
> > > > - change some hex values from upper case to lower case
> > > > - change *dpb_sz equal to V4L2_AV1_TOTAL_REFS_PER_FRAME + 1
> > > > - test by av1 fluster, result is 173/239
> > > > 
> > > > Changes from v2:
> > > > 
> > > > - Match with av1 uapi v3 modify
> > > > - test by av1 fluster, result is 173/239
> > > > 
> > > > ---
> > > > Reference series:
> > > > [1]: v3 of this series is presend by Daniel Almeida.
> > > >       message-id: 
> > > > 20220825225312.564619-1-daniel.almeida at collabora.com
> > > > 
> > > >   .../media/platform/mediatek/vcodec/Makefile   |    1 +
> > > >   .../vcodec/mtk_vcodec_dec_stateless.c         |   47 +-
> > > >   .../platform/mediatek/vcodec/mtk_vcodec_drv.h |    1 +
> > > >   .../vcodec/vdec/vdec_av1_req_lat_if.c         | 2234
> > > > +++++++++++++++++
> > > >   .../platform/mediatek/vcodec/vdec_drv_if.c    |    4 +
> > > >   .../platform/mediatek/vcodec/vdec_drv_if.h    |    1 +
> > > >   .../platform/mediatek/vcodec/vdec_msg_queue.c |   27 +
> > > >   .../platform/mediatek/vcodec/vdec_msg_queue.h |    4 +
> > > >   8 files changed, 2318 insertions(+), 1 deletion(-)
> > > >   create mode 100644
> > > > drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if
> > > > .c
> > > > 
> > > > diff --git a/drivers/media/platform/mediatek/vcodec/Makefile
> > > > b/drivers/media/platform/mediatek/vcodec/Makefile
> > > > index 93e7a343b5b0e..7537259130072 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/Makefile
> > > > +++ b/drivers/media/platform/mediatek/vcodec/Makefile
> > > > @@ -10,6 +10,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > > >   vdec/vdec_vp8_req_if.o \
> > > >   vdec/vdec_vp9_if.o \
> > > >   vdec/vdec_vp9_req_lat_if.o \
> > > > +vdec/vdec_av1_req_lat_if.o \
> > > >   vdec/vdec_h264_req_if.o \
> > > >   vdec/vdec_h264_req_common.o \
> > > >   vdec/vdec_h264_req_multi_if.o \
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_statele
> > > > ss.c
> > > > b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_statele
> > > > ss.c
> > > > index c45bd2599bb2d..ceb6fabc67749 100644
> > > > ---
> > > > a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_statele
> > > > ss.c
> > > > +++
> > > > b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_statele
> > > > ss.c
> > > > @@ -107,11 +107,51 @@ static const struct mtk_stateless_control
> > > > mtk_stateless_controls[] = {
> > > >   },
> > > >   .codec_type = V4L2_PIX_FMT_VP9_FRAME,
> > > >   },
> > > > +{
> > > > +.cfg = {
> > > > +.id = V4L2_CID_STATELESS_AV1_SEQUENCE,
> > > > +
> > > > +},
> > > > +.codec_type = V4L2_PIX_FMT_AV1_FRAME,
> > > > +},
> > > > +{
> > > > +.cfg = {
> > > > +.id = V4L2_CID_STATELESS_AV1_FRAME,
> > > > +
> > > > +},
> > > > +.codec_type = V4L2_PIX_FMT_AV1_FRAME,
> > > > +},
> > > > +{
> > > > +.cfg = {
> > > > +.id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
> > > > +.dims = { V4L2_AV1_MAX_TILE_COUNT },
> > > > +
> > > > +},
> > > > +.codec_type = V4L2_PIX_FMT_AV1_FRAME,
> > > > +},
> > > > +{
> > > > +.cfg = {
> > > > +.id = V4L2_CID_STATELESS_AV1_PROFILE,
> > > > +.min = V4L2_STATELESS_AV1_PROFILE_MAIN,
> > > > +.def = V4L2_STATELESS_AV1_PROFILE_MAIN,
> > > > +.max = V4L2_STATELESS_AV1_PROFILE_MAIN,
> > > > +},
> > > > +.codec_type = V4L2_PIX_FMT_AV1_FRAME,
> > > > +},
> > > > +{
> > > > +.cfg = {
> > > > +.id = V4L2_CID_STATELESS_AV1_LEVEL,
> > > > +.min = V4L2_STATELESS_AV1_LEVEL_2_0,
> > > > +.def = V4L2_STATELESS_AV1_LEVEL_4_0,
> > > > +.max = V4L2_STATELESS_AV1_LEVEL_5_1,
> > > > +},
> > > > +.codec_type = V4L2_PIX_FMT_AV1_FRAME,
> > > > +},
> > > >   };
> > > >   
> > > >   #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > >   
> > > > -static struct mtk_video_fmt mtk_video_formats[5];
> > > > +static struct mtk_video_fmt mtk_video_formats[6];
> > > >   
> > > >   static struct mtk_video_fmt default_out_format;
> > > >   static struct mtk_video_fmt default_cap_format;
> > > > @@ -351,6 +391,7 @@ static void mtk_vcodec_add_formats(unsigned
> > > > int
> > > > fourcc,
> > > >   case V4L2_PIX_FMT_H264_SLICE:
> > > >   case V4L2_PIX_FMT_VP8_FRAME:
> > > >   case V4L2_PIX_FMT_VP9_FRAME:
> > > > +case V4L2_PIX_FMT_AV1_FRAME:
> > > >   mtk_video_formats[count_formats].fourcc = fourcc;
> > > >   mtk_video_formats[count_formats].type = MTK_FMT_DEC;
> > > >   mtk_video_formats[count_formats].num_planes = 1;
> > > > @@ -407,6 +448,10 @@ static void
> > > > mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
> > > >   mtk_vcodec_add_formats(V4L2_PIX_FMT_VP9_FRAME, ctx);
> > > >   out_format_count++;
> > > >   }
> > > > +if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_AV1_FRAME) {
> > > > +mtk_vcodec_add_formats(V4L2_PIX_FMT_AV1_FRAME, ctx);
> > > > +out_format_count++;
> > > > +}
> > > >   
> > > >   if (cap_format_count)
> > > >   default_cap_format = mtk_video_formats[cap_format_count
> > > > - 1];
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_drv.h
> > > > b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_drv.h
> > > > index 6a47a11ff654a..a6db972b1ff72 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_drv.h
> > > > +++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_drv.h
> > > > @@ -344,6 +344,7 @@ enum mtk_vdec_format_types {
> > > >   MTK_VDEC_FORMAT_H264_SLICE = 0x100,
> > > >   MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
> > > >   MTK_VDEC_FORMAT_VP9_FRAME = 0x400,
> > > > +MTK_VDEC_FORMAT_AV1_FRAME = 0x800,
> > > >   MTK_VCODEC_INNER_RACING = 0x20000,
> > > >   };
> > > >   
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_
> > > > if.c
> > > > b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_
> > > > if.c
> > > > new file mode 100644
> > > > index 0000000000000..2ac77175dad7c
> > > > --- /dev/null
> > > > +++
> > > > b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_
> > > > if.c
> > > > @@ -0,0 +1,2234 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Copyright (c) 2022 MediaTek Inc.
> > > > + * Author: Xiaoyong Lu <xiaoyong.lu at mediatek.com>
> > > > + */
> > > > +
> > > > +#include <linux/module.h>
> > > > +#include <linux/slab.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > +
> > > > +#include "../mtk_vcodec_util.h"
> > > > +#include "../mtk_vcodec_dec.h"
> > > > +#include "../mtk_vcodec_intr.h"
> > > > +#include "../vdec_drv_base.h"
> > > > +#include "../vdec_drv_if.h"
> > > > +#include "../vdec_vpu_if.h"
> > > > +
> > > > +#define AV1_MAX_FRAME_BUF_COUNT(V4L2_AV1_TOTAL_REFS_PE
> > > > R_FRAME + 1)
> > > > +#define AV1_TILE_BUF_SIZE64
> > > > +#define AV1_SCALE_SUBPEL_BITS10
> > > > +#define AV1_REF_SCALE_SHIFT14
> > > > +#define AV1_REF_NO_SCALEBIT(AV1_REF_SCALE_SHIFT)
> > > > +#define AV1_REF_INVALID_SCALE-1
> > > > +
> > > > +#define AV1_INVALID_IDX-1
> > > > +
> > > > +#define AV1_DIV_ROUND_UP_POW2(value, n)\
> > > > +({\
> > > > +typeof(n) _n  = n;\
> > > > +typeof(value) _value = value;\
> > > > +(_value + (BIT(_n) >> 1)) >> _n;\
> > > > +})
> > > > +
> > > > +#define AV1_DIV_ROUND_UP_POW2_SIGNED(value, n)
> > > > \
> > > > +({
> > > > \
> > > > +typeof(n) _n_  = n;\
> > > > +typeof(value) _value_ = value;
> > > > \
> > > > +(((_value_) < 0) ? -AV1_DIV_ROUND_UP_POW2(-(_value_), (_n_))
> > > > \
> > > > +: AV1_DIV_ROUND_UP_POW2((_value_), (_n_)));\
> > > > +})
> > > > +
> > > > +#define BIT_FLAG(x, bit)(!!((x)->flags & (bit)))
> > > > +#define SEGMENTATION_FLAG(x, name)(!!((x)->flags &
> > > > V4L2_AV1_SEGMENTATION_FLAG_##name))
> > > > +#define QUANT_FLAG(x, name)(!!((x)->flags &
> > > > V4L2_AV1_QUANTIZATION_FLAG_##name))
> > > > +#define SEQUENCE_FLAG(x, name)(!!((x)->flags &
> > > > V4L2_AV1_SEQUENCE_FLAG_##name))
> > > > +#define FH_FLAG(x, name)(!!((x)->flags &
> > > > V4L2_AV1_FRAME_FLAG_##name))
> > > > +
> > > > +#define MINQ 0
> > > > +#define MAXQ 255
> > > > +
> > > > +#define DIV_LUT_PREC_BITS 14
> > > > +#define DIV_LUT_BITS 8
> > > > +#define DIV_LUT_NUM BIT(DIV_LUT_BITS)
> > > > +#define WARP_PARAM_REDUCE_BITS 6
> > > > +#define WARPEDMODEL_PREC_BITS 16
> > > > +
> > > > +#define SEG_LVL_ALT_Q 0
> > > > +#define SECONDARY_FILTER_STRENGTH_NUM_BITS 2
> > > > +
> > > > +static const short div_lut[DIV_LUT_NUM + 1] = {
> > > > +16384, 16320, 16257, 16194, 16132, 16070, 16009, 15948, 15888,
> > > > 15828, 15768,
> > > > +15709, 15650, 15592, 15534, 15477, 15420, 15364, 15308, 15252,
> > > > 15197, 15142,
> > > > +15087, 15033, 14980, 14926, 14873, 14821, 14769, 14717, 14665,
> > > > 14614, 14564,
> > > > +14513, 14463, 14413, 14364, 14315, 14266, 14218, 14170, 14122,
> > > > 14075, 14028,
> > > > +13981, 13935, 13888, 13843, 13797, 13752, 13707, 13662, 13618,
> > > > 13574, 13530,
> > > > +13487, 13443, 13400, 13358, 13315, 13273, 13231, 13190, 13148,
> > > > 13107, 13066,
> > > > +13026, 12985, 12945, 12906, 12866, 12827, 12788, 12749, 12710,
> > > > 12672, 12633,
> > > > +12596, 12558, 12520, 12483, 12446, 12409, 12373, 12336, 12300,
> > > > 12264, 12228,
> > > > +12193, 12157, 12122, 12087, 12053, 12018, 11984, 11950, 11916,
> > > > 11882, 11848,
> > > > +11815, 11782, 11749, 11716, 11683, 11651, 11619, 11586, 11555,
> > > > 11523, 11491,
> > > > +11460, 11429, 11398, 11367, 11336, 11305, 11275, 11245, 11215,
> > > > 11185, 11155,
> > > > +11125, 11096, 11067, 11038, 11009, 10980, 10951, 10923, 10894,
> > > > 10866, 10838,
> > > > +10810, 10782, 10755, 10727, 10700, 10673, 10645, 10618, 10592,
> > > > 10565, 10538,
> > > > +10512, 10486, 10460, 10434, 10408, 10382, 10356, 10331, 10305,
> > > > 10280, 10255,
> > > > +10230, 10205, 10180, 10156, 10131, 10107, 10082, 10058, 10034,
> > > > 10010, 9986,
> > > > +9963,  9939,  9916,  9892,  9869,  9846,  9823,  9800,  9777, 
> > > >  
> > > > 9754,  9732,
> > > > +9709,  9687,  9664,  9642,  9620,  9598,  9576,  9554,  9533, 
> > > >  
> > > > 9511,  9489,
> > > > +9468,  9447,  9425,  9404,  9383,  9362,  9341,  9321,  9300, 
> > > >  
> > > > 9279,  9259,
> > > > +9239,  9218,  9198,  9178,  9158,  9138,  9118,  9098,  9079, 
> > > >  
> > > > 9059,  9039,
> > > > +9020,  9001,  8981,  8962,  8943,  8924,  8905,  8886,  8867, 
> > > >  
> > > > 8849,  8830,
> > > > +8812,  8793,  8775,  8756,  8738,  8720,  8702,  8684,  8666, 
> > > >  
> > > > 8648,  8630,
> > > > +8613,  8595,  8577,  8560,  8542,  8525,  8508,  8490,  8473, 
> > > >  
> > > > 8456,  8439,
> > > > +8422,  8405,  8389,  8372,  8355,  8339,  8322,  8306,  8289, 
> > > >  
> > > > 8273,  8257,
> > > > +8240,  8224,  8208,  8192,
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_init_vsi - VSI used to initialize
> > > > instance
> > > > + * @architecture:architecture type
> > > > + * @reserved:reserved
> > > > + * @core_vsi:for core vsi
> > > > + * @cdf_table_addr:cdf table addr
> > > > + * @cdf_table_size:cdf table size
> > > > + * @iq_table_addr:iq table addr
> > > > + * @iq_table_size:iq table size
> > > > + * @vsi_size:share vsi structure size
> > > > + */
> > > > +struct vdec_av1_slice_init_vsi {
> > > > +u32 architecture;
> > > > +u32 reserved;
> > > > +u64 core_vsi;
> > > > +u64 cdf_table_addr;
> > > > +u32 cdf_table_size;
> > > > +u64 iq_table_addr;
> > > > +u32 iq_table_size;
> > > > +u32 vsi_size;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_mem - memory address and size
> > > > + * @buf:dma_addr padding
> > > > + * @dma_addr:buffer address
> > > > + * @size:buffer size
> > > > + * @dma_addr_end:buffer end address
> > > > + * @padding:for padding
> > > > + */
> > > > +struct vdec_av1_slice_mem {
> > > > +union {
> > > > +u64 buf;
> > > > +dma_addr_t dma_addr;
> > > > +};
> > > > +union {
> > > > +size_t size;
> > > > +dma_addr_t dma_addr_end;
> > > > +u64 padding;
> > > > +};
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_state - decoding state
> > > > + * @err                   : err type for decode
> > > > + * @full                  : transcoded buffer is full or not
> > > > + * @timeout               : decode timeout or not
> > > > + * @perf                  : performance enable
> > > > + * @crc                   : hw checksum
> > > > + * @out_size              : hw output size
> > > > + */
> > > > +struct vdec_av1_slice_state {
> > > > +int err;
> > > > +u32 full;
> > > > +u32 timeout;
> > > > +u32 perf;
> > > > +u32 crc[16];
> > > > +u32 out_size;
> > > > +};
> > > > +
> > > > +/*
> > > > + * enum vdec_av1_slice_resolution_level - resolution level
> > > > + */
> > > > +enum vdec_av1_slice_resolution_level {
> > > > +AV1_RES_NONE,
> > > > +AV1_RES_FHD,
> > > > +AV1_RES_4K,
> > > > +AV1_RES_8K,
> > > > +};
> > > > +
> > > > +/*
> > > > + * enum vdec_av1_slice_frame_type - av1 frame type
> > > > + */
> > > > +enum vdec_av1_slice_frame_type {
> > > > +AV1_KEY_FRAME = 0,
> > > > +AV1_INTER_FRAME,
> > > > +AV1_INTRA_ONLY_FRAME,
> > > > +AV1_SWITCH_FRAME,
> > > > +AV1_FRAME_TYPES,
> > > > +};
> > > > +
> > > > +/*
> > > > + * enum vdec_av1_slice_reference_mode - reference mode type
> > > > + */
> > > > +enum vdec_av1_slice_reference_mode {
> > > > +AV1_SINGLE_REFERENCE = 0,
> > > > +AV1_COMPOUND_REFERENCE,
> > > > +AV1_REFERENCE_MODE_SELECT,
> > > > +AV1_REFERENCE_MODES,
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_tile_group - info for each tile
> > > > + * @num_tiles:tile number
> > > > + * @tile_size:input size for each tile
> > > > + * @tile_start_offset:tile offset to input buffer
> > > > + */
> > > > +struct vdec_av1_slice_tile_group {
> > > > +u32 num_tiles;
> > > > +u32 tile_size[V4L2_AV1_MAX_TILE_COUNT];
> > > > +u32 tile_start_offset[V4L2_AV1_MAX_TILE_COUNT];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_scale_factors - scale info for each
> > > > ref
> > > > frame
> > > > + * @is_scaled:  frame is scaled or not
> > > > + * @x_scale:    frame width scale coefficient
> > > > + * @y_scale:    frame height scale coefficient
> > > > + * @x_step:     width step for x_scale
> > > > + * @y_step:     height step for y_scale
> > > > + */
> > > > +struct vdec_av1_slice_scale_factors {
> > > > +u8 is_scaled;
> > > > +int x_scale;
> > > > +int y_scale;
> > > > +int x_step;
> > > > +int y_step;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_frame_refs - ref frame info
> > > > + * @ref_fb_idx:         ref slot index
> > > > + * @ref_map_idx:        ref frame index
> > > > + * @scale_factors:      scale factors for each ref frame
> > > > + */
> > > > +struct vdec_av1_slice_frame_refs {
> > > > +int ref_fb_idx;
> > > > +int ref_map_idx;
> > > > +struct vdec_av1_slice_scale_factors scale_factors;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_gm - AV1 Global Motion parameters
> > > > + * @wmtype:     The type of global motion transform used
> > > > + * @wmmat:      gm_params
> > > > + * @alpha:      alpha info
> > > > + * @beta:       beta info
> > > > + * @gamma:      gamma info
> > > > + * @delta:      delta info
> > > > + * @invalid:    is invalid or not
> > > > + */
> > > > +struct vdec_av1_slice_gm {
> > > > +int wmtype;
> > > > +int wmmat[8];
> > > > +short alpha;
> > > > +short beta;
> > > > +short gamma;
> > > > +short delta;
> > > > +char invalid;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_sm - AV1 Skip Mode parameters
> > > > + * @skip_mode_allowed:  Skip Mode is allowed or not
> > > > + * @skip_mode_present:  specified that the skip_mode will be
> > > > present or not
> > > > + * @skip_mode_frame:    specifies the frames to use for
> > > > compound
> > > > prediction
> > > > + */
> > > > +struct vdec_av1_slice_sm {
> > > > +u8 skip_mode_allowed;
> > > > +u8 skip_mode_present;
> > > > +int skip_mode_frame[2];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_seg - AV1 Segmentation params
> > > > + * @segmentation_enabled:        this frame makes use of the
> > > > segmentation tool or not
> > > > + * @segmentation_update_map:     segmentation map are updated
> > > > during the decoding frame
> > > > + * @segmentation_temporal_update:segmentation map are coded
> > > > relative the existing segmentaion map
> > > > + * @segmentation_update_data:    new parameters are about to
> > > > be
> > > > specified for each segment
> > > > + * @feature_data:                specifies the feature data
> > > > for a
> > > > segment feature
> > > > + * @feature_enabled_mask:        the corresponding feature
> > > > value
> > > > is coded or not.
> > > > + * @segid_preskip:               segment id will be read
> > > > before
> > > > the skip syntax element.
> > > > + * @last_active_segid:           the highest numbered segment
> > > > id
> > > > that has some enabled feature
> > > > + */
> > > > +struct vdec_av1_slice_seg {
> > > > +u8 segmentation_enabled;
> > > > +u8 segmentation_update_map;
> > > > +u8 segmentation_temporal_update;
> > > > +u8 segmentation_update_data;
> > > > +int feature_data[V4L2_AV1_MAX_SEGMENTS][V4L2_AV1_SEG_LVL_MAX];
> > > > +u16 feature_enabled_mask[V4L2_AV1_MAX_SEGMENTS];
> > > > +int segid_preskip;
> > > > +int last_active_segid;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_delta_q_lf - AV1 Loop Filter delta
> > > > parameters
> > > > + * @delta_q_present:    specified whether quantizer index
> > > > delta
> > > > values are present
> > > > + * @delta_q_res:        specifies the left shift which should
> > > > be
> > > > applied to decoded quantizer index
> > > > + * @delta_lf_present:   specifies whether loop filter delta
> > > > values
> > > > are present
> > > > + * @delta_lf_res:       specifies the left shift which should
> > > > be
> > > > applied to decoded
> > > > + *                      loop filter delta values
> > > > + * @delta_lf_multi:     specifies that separate loop filter
> > > > deltas
> > > > are sent for horizontal
> > > > + *                      luma edges,vertical luma edges,the u
> > > > edges, and the v edges.
> > > > + */
> > > > +struct vdec_av1_slice_delta_q_lf {
> > > > +u8 delta_q_present;
> > > > +u8 delta_q_res;
> > > > +u8 delta_lf_present;
> > > > +u8 delta_lf_res;
> > > > +u8 delta_lf_multi;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_quantization - AV1 Quantization
> > > > params
> > > > + * @base_q_idx:         indicates the base frame qindex. This
> > > > is
> > > > used for Y AC
> > > > + *                      coefficients and as the base value for
> > > > the
> > > > other quantizers.
> > > > + * @qindex:             qindex
> > > > + * @delta_qydc:         indicates the Y DC quantizer relative
> > > > to
> > > > base_q_idx
> > > > + * @delta_qudc:         indicates the U DC quantizer relative
> > > > to
> > > > base_q_idx.
> > > > + * @delta_quac:         indicates the U AC quantizer relative
> > > > to
> > > > base_q_idx
> > > > + * @delta_qvdc:         indicates the V DC quantizer relative
> > > > to
> > > > base_q_idx
> > > > + * @delta_qvac:         indicates the V AC quantizer relative
> > > > to
> > > > base_q_idx
> > > > + * @using_qmatrix:      specifies that the quantizer matrix
> > > > will
> > > > be used to
> > > > + *                      compute quantizers
> > > > + * @qm_y:               specifies the level in the quantizer
> > > > matrix that should
> > > > + *                      be used for luma plane decoding
> > > > + * @qm_u:               specifies the level in the quantizer
> > > > matrix that should
> > > > + *                      be used for chroma U plane decoding.
> > > > + * @qm_v:               specifies the level in the quantizer
> > > > matrix that should be
> > > > + *                      used for chroma V plane decoding
> > > > + */
> > > > +struct vdec_av1_slice_quantization {
> > > > +int base_q_idx;
> > > > +int qindex[V4L2_AV1_MAX_SEGMENTS];
> > > > +int delta_qydc;
> > > > +int delta_qudc;
> > > > +int delta_quac;
> > > > +int delta_qvdc;
> > > > +int delta_qvac;
> > > > +u8 using_qmatrix;
> > > > +u8 qm_y;
> > > > +u8 qm_u;
> > > > +u8 qm_v;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_lr - AV1 Loop Restauration parameters
> > > > + * @use_lr:                     whether to use loop
> > > > restoration
> > > > + * @use_chroma_lr:              whether to use chroma loop
> > > > restoration
> > > > + * @frame_restoration_type:     specifies the type of
> > > > restoration
> > > > used for each plane
> > > > + * @loop_restoration_size:      pecifies the size of loop
> > > > restoration units in units
> > > > + *                              of samples in the current
> > > > plane
> > > > + */
> > > > +struct vdec_av1_slice_lr {
> > > > +u8 use_lr;
> > > > +u8 use_chroma_lr;
> > > > +u8 frame_restoration_type[V4L2_AV1_NUM_PLANES_MAX];
> > > > +u32 loop_restoration_size[V4L2_AV1_NUM_PLANES_MAX];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_loop_filter - AV1 Loop filter
> > > > parameters
> > > > + * @loop_filter_level:          an array containing loop
> > > > filter
> > > > strength values.
> > > > + * @loop_filter_ref_deltas:     contains the adjustment needed
> > > > for
> > > > the filter
> > > > + *                              level based on the chosen
> > > > reference frame
> > > > + * @loop_filter_mode_deltas:    contains the adjustment needed
> > > > for
> > > > the filter
> > > > + *                              level based on the chosen mode
> > > > + * @loop_filter_sharpness:      indicates the sharpness level.
> > > > The
> > > > loop_filter_level
> > > > + *                              and loop_filter_sharpness
> > > > together
> > > > determine when
> > > > + *                              a block edge is filtered, and
> > > > by
> > > > how much the
> > > > + *                              filtering can change the
> > > > sample
> > > > values
> > > > + * @loop_filter_delta_enabled:  filetr level depends on the
> > > > mode
> > > > and reference
> > > > + *                              frame used to predict a block
> > > > + */
> > > > +struct vdec_av1_slice_loop_filter {
> > > > +u8 loop_filter_level[4];
> > > > +int loop_filter_ref_deltas[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> > > > +int loop_filter_mode_deltas[2];
> > > > +u8 loop_filter_sharpness;
> > > > +u8 loop_filter_delta_enabled;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_cdef - AV1 CDEF parameters
> > > > + * @cdef_damping:       controls the amount of damping in the
> > > > deringing filter
> > > > + * @cdef_y_strength:    specifies the strength of the primary
> > > > filter and secondary filter
> > > > + * @cdef_uv_strength:   specifies the strength of the primary
> > > > filter and secondary filter
> > > > + * @cdef_bits:          specifies the number of bits needed to
> > > > specify which
> > > > + *                      CDEF filter to apply
> > > > + */
> > > > +struct vdec_av1_slice_cdef {
> > > > +u8 cdef_damping;
> > > > +u8 cdef_y_strength[8];
> > > > +u8 cdef_uv_strength[8];
> > > > +u8 cdef_bits;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_mfmv - AV1 mfmv parameters
> > > > + * @mfmv_valid_ref:     mfmv_valid_ref
> > > > + * @mfmv_dir:           mfmv_dir
> > > > + * @mfmv_ref_to_cur:    mfmv_ref_to_cur
> > > > + * @mfmv_ref_frame_idx: mfmv_ref_frame_idx
> > > > + * @mfmv_count:         mfmv_count
> > > > + */
> > > > +struct vdec_av1_slice_mfmv {
> > > > +u32 mfmv_valid_ref[3];
> > > > +u32 mfmv_dir[3];
> > > > +int mfmv_ref_to_cur[3];
> > > > +int mfmv_ref_frame_idx[3];
> > > > +int mfmv_count;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_tile - AV1 Tile info
> > > > + * @tile_cols:                  specifies the number of tiles
> > > > across the frame
> > > > + * @tile_rows:                  pecifies the number of tiles
> > > > down
> > > > the frame
> > > > + * @mi_col_starts:              an array specifying the start
> > > > column
> > > > + * @mi_row_starts:              an array specifying the start
> > > > row
> > > > + * @context_update_tile_id:     specifies which tile to use
> > > > for
> > > > the CDF update
> > > > + * @uniform_tile_spacing_flag:  tiles are uniformly spaced
> > > > across
> > > > the frame
> > > > + *                              or the tile sizes are coded
> > > > + */
> > > > +struct vdec_av1_slice_tile {
> > > > +u8 tile_cols;
> > > > +u8 tile_rows;
> > > > +int mi_col_starts[V4L2_AV1_MAX_TILE_COLS + 1];
> > > > +int mi_row_starts[V4L2_AV1_MAX_TILE_ROWS + 1];
> > > > +u8 context_update_tile_id;
> > > > +u8 uniform_tile_spacing_flag;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_uncompressed_header - Represents an
> > > > AV1
> > > > Frame Header OBU
> > > > + * @use_ref_frame_mvs:          use_ref_frame_mvs flag
> > > > + * @order_hint:                 specifies OrderHintBits least
> > > > significant bits of the expected
> > > > + * @gm:                         global motion param
> > > > + * @upscaled_width:             the upscaled width
> > > > + * @frame_width:                frame's width
> > > > + * @frame_height:               frame's height
> > > > + * @reduced_tx_set:             frame is restricted to a
> > > > reduced
> > > > subset of the full
> > > > + *                              set of transform types
> > > > + * @tx_mode:                    specifies how the transform
> > > > size
> > > > is determined
> > > > + * @uniform_tile_spacing_flag:  tiles are uniformly spaced
> > > > across
> > > > the frame
> > > > + *                              or the tile sizes are coded
> > > > + * @interpolation_filter:       specifies the filter selection
> > > > used for performing inter prediction
> > > > + * @allow_warped_motion:        motion_mode may be present or
> > > > not
> > > > + * @is_motion_mode_switchable : euqlt to 0 specifies that only
> > > > the
> > > > SIMPLE motion mode will be used
> > > > + * @reference_mode :            frame reference mode selected
> > > > + * @allow_high_precision_mv:    specifies that motion vectors
> > > > are
> > > > specified to
> > > > + *                              quarter pel precision or to
> > > > eighth
> > > > pel precision
> > > > + * @allow_intra_bc:             ubducates that intra block
> > > > copy
> > > > may be used in this frame
> > > > + * @force_integer_mv:           specifies motion vectors will
> > > > always be integers or
> > > > + *                              can contain fractional bits
> > > > + * @allow_screen_content_tools: intra blocks may use palette
> > > > encoding
> > > > + * @error_resilient_mode:       error resislent mode is
> > > > enable/disable
> > > > + * @frame_type:                 specifies the AV1 frame type
> > > > + * @primary_ref_frame:          specifies which reference
> > > > frame
> > > > contains the CDF values
> > > > + *                              and other state that should be
> > > > loaded at the start of the frame
> > > > + *                              slots will be updated with the
> > > > current frame after it is decoded
> > > > + * @disable_frame_end_update_cdf:indicates the end of frame
> > > > CDF
> > > > update is disable or enable
> > > > + * @disable_cdf_update:         specified whether the CDF
> > > > update
> > > > in the symbol
> > > > + *                              decoding process should be
> > > > disables
> > > > + * @skip_mode:                  av1 skip mode parameters
> > > > + * @seg:                        av1 segmentaon parameters
> > > > + * @delta_q_lf:                 av1 delta loop fileter
> > > > + * @quant:                      av1 Quantization params
> > > > + * @lr:                         av1 Loop Restauration
> > > > parameters
> > > > + * @superres_denom:             the denominator for the
> > > > upscaling
> > > > ratio
> > > > + * @loop_filter:                av1 Loop filter parameters
> > > > + * @cdef:                       av1 CDEF parameters
> > > > + * @mfmv:                       av1 mfmv parameters
> > > > + * @tile:                       av1 Tile info
> > > > + * @frame_is_intra:             intra frame
> > > > + * @loss_less_array:            loss less array
> > > > + * @coded_loss_less:            coded lsss less
> > > > + * @mi_rows:                    size of mi unit in rows
> > > > + * @mi_cols:                    size of mi unit in cols
> > > > + */
> > > > +struct vdec_av1_slice_uncompressed_header {
> > > > +u8 use_ref_frame_mvs;
> > > > +int order_hint;
> > > > +struct vdec_av1_slice_gm gm[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> > > > +u32 upscaled_width;
> > > > +u32 frame_width;
> > > > +u32 frame_height;
> > > > +u8 reduced_tx_set;
> > > > +u8 tx_mode;
> > > > +u8 uniform_tile_spacing_flag;
> > > > +u8 interpolation_filter;
> > > > +u8 allow_warped_motion;
> > > > +u8 is_motion_mode_switchable;
> > > > +u8 reference_mode;
> > > > +u8 allow_high_precision_mv;
> > > > +u8 allow_intra_bc;
> > > > +u8 force_integer_mv;
> > > > +u8 allow_screen_content_tools;
> > > > +u8 error_resilient_mode;
> > > > +u8 frame_type;
> > > > +u8 primary_ref_frame;
> > > > +u8 disable_frame_end_update_cdf;
> > > > +u32 disable_cdf_update;
> > > > +struct vdec_av1_slice_sm skip_mode;
> > > > +struct vdec_av1_slice_seg seg;
> > > > +struct vdec_av1_slice_delta_q_lf delta_q_lf;
> > > > +struct vdec_av1_slice_quantization quant;
> > > > +struct vdec_av1_slice_lr lr;
> > > > +u32 superres_denom;
> > > > +struct vdec_av1_slice_loop_filter loop_filter;
> > > > +struct vdec_av1_slice_cdef cdef;
> > > > +struct vdec_av1_slice_mfmv mfmv;
> > > > +struct vdec_av1_slice_tile tile;
> > > > +u8 frame_is_intra;
> > > > +u8 loss_less_array[V4L2_AV1_MAX_SEGMENTS];
> > > > +u8 coded_loss_less;
> > > > +u32 mi_rows;
> > > > +u32 mi_cols;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_seq_header - Represents an AV1
> > > > Sequence
> > > > OBU
> > > > + * @bitdepth:                   the bitdepth to use for the
> > > > sequence
> > > > + * @enable_superres:            specifies whether the
> > > > use_superres
> > > > syntax element may be present
> > > > + * @enable_filter_intra:        specifies the use_filter_intra
> > > > syntax element may be present
> > > > + * @enable_intra_edge_filter:   whether the intra edge
> > > > filtering
> > > > process should be enabled
> > > > + * @enable_interintra_compound: specifies the mode info fo
> > > > rinter
> > > > blocks may
> > > > + *                              contain the syntax element
> > > > interintra
> > > > + * @enable_masked_compound:     specifies the mode info fo
> > > > rinter
> > > > blocks may
> > > > + *                              contain the syntax element
> > > > compound_type
> > > > + * @enable_dual_filter:         the inter prediction filter
> > > > type
> > > > may be specified independently
> > > > + * @enable_jnt_comp:            distance weights process may
> > > > be
> > > > used for inter prediction
> > > > + * @mono_chrome:                indicates the video does not
> > > > contain U and V color planes
> > > > + * @enable_order_hint:          tools based on the values of
> > > > order
> > > > hints may be used
> > > > + * @order_hint_bits:            the number of bits used for
> > > > the
> > > > order_hint field at each frame
> > > > + * @use_128x128_superblock:     indicates superblocks contain
> > > > 128*128 luma samples
> > > > + * @subsampling_x:              the chroma subsamling format
> > > > + * @subsampling_y:              the chroma subsamling format
> > > > + * @max_frame_width:            the maximum frame width for
> > > > the
> > > > frames represented by sequence
> > > > + * @max_frame_height:           the maximum frame height for
> > > > the
> > > > frames represented by sequence
> > > > + */
> > > > +struct vdec_av1_slice_seq_header {
> > > > +u8 bitdepth;
> > > > +u8 enable_superres;
> > > > +u8 enable_filter_intra;
> > > > +u8 enable_intra_edge_filter;
> > > > +u8 enable_interintra_compound;
> > > > +u8 enable_masked_compound;
> > > > +u8 enable_dual_filter;
> > > > +u8 enable_jnt_comp;
> > > > +u8 mono_chrome;
> > > > +u8 enable_order_hint;
> > > > +u8 order_hint_bits;
> > > > +u8 use_128x128_superblock;
> > > > +u8 subsampling_x;
> > > > +u8 subsampling_y;
> > > > +u32 max_frame_width;
> > > > +u32 max_frame_height;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_frame - Represents current Frame info
> > > > + * @uh:                         uncompressed header info
> > > > + * @seq:                        sequence header info
> > > > + * @large_scale_tile:           is large scale mode
> > > > + * @cur_ts:                     current frame timestamp
> > > > + * @prev_fb_idx:                prev slot id
> > > > + * @ref_frame_sign_bias:        arrays for ref_frame sign bias
> > > > + * @order_hints:                arrays for ref_frame order
> > > > hint
> > > > + * @ref_frame_valid:            arrays for valid ref_frame
> > > > + * @ref_frame_map:              map to slot frame info
> > > > + * @frame_refs:                 ref_frame info
> > > > + */
> > > > +struct vdec_av1_slice_frame {
> > > > +struct vdec_av1_slice_uncompressed_header uh;
> > > > +struct vdec_av1_slice_seq_header seq;
> > > > +u8 large_scale_tile;
> > > > +u64 cur_ts;
> > > > +int prev_fb_idx;
> > > > +u8 ref_frame_sign_bias[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> > > > +u32 order_hints[V4L2_AV1_REFS_PER_FRAME];
> > > > +u32 ref_frame_valid[V4L2_AV1_REFS_PER_FRAME];
> > > > +int ref_frame_map[V4L2_AV1_TOTAL_REFS_PER_FRAME];
> > > > +struct vdec_av1_slice_frame_refs
> > > > frame_refs[V4L2_AV1_REFS_PER_FRAME];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_work_buffer - work buffer for lat
> > > > + * @mv_addr:    mv buffer memory info
> > > > + * @cdf_addr:   cdf buffer memory info
> > > > + * @segid_addr: segid buffer memory info
> > > > + */
> > > > +struct vdec_av1_slice_work_buffer {
> > > > +struct vdec_av1_slice_mem mv_addr;
> > > > +struct vdec_av1_slice_mem cdf_addr;
> > > > +struct vdec_av1_slice_mem segid_addr;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_frame_info - frame info for each slot
> > > > + * @frame_type:         frame type
> > > > + * @frame_is_intra:     is intra frame
> > > > + * @order_hint:         order hint
> > > > + * @order_hints:        referece frame order hint
> > > > + * @upscaled_width:     upscale width
> > > > + * @pic_pitch:          buffer pitch
> > > > + * @frame_width:        frane width
> > > > + * @frame_height:       frame height
> > > > + * @mi_rows:            rows in mode info
> > > > + * @mi_cols:            cols in mode info
> > > > + * @ref_count:          mark to reference frame counts
> > > > + */
> > > > +struct vdec_av1_slice_frame_info {
> > > > +u8 frame_type;
> > > > +u8 frame_is_intra;
> > > > +int order_hint;
> > > > +u32 order_hints[V4L2_AV1_REFS_PER_FRAME];
> > > > +u32 upscaled_width;
> > > > +u32 pic_pitch;
> > > > +u32 frame_width;
> > > > +u32 frame_height;
> > > > +u32 mi_rows;
> > > > +u32 mi_cols;
> > > > +int ref_count;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_slot - slot info that needs to be
> > > > saved
> > > > in the global instance
> > > > + * @frame_info: frame info for each slot
> > > > + * @timestamp:  time stamp info
> > > > + */
> > > > +struct vdec_av1_slice_slot {
> > > > +struct vdec_av1_slice_frame_info
> > > > frame_info[AV1_MAX_FRAME_BUF_COUNT];
> > > > +u64 timestamp[AV1_MAX_FRAME_BUF_COUNT];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_fb - frame buffer for decoding
> > > > + * @y:  current y buffer address info
> > > > + * @c:  current c buffer address info
> > > > + */
> > > > +struct vdec_av1_slice_fb {
> > > > +struct vdec_av1_slice_mem y;
> > > > +struct vdec_av1_slice_mem c;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_vsi - exchange frame information
> > > > between
> > > > Main CPU and MicroP
> > > > + * @bs:input buffer info
> > > > + * @work_buffer:working buffe for hw
> > > > + * @cdf_table:cdf_table buffer
> > > > + * @cdf_tmp:cdf temp buffer
> > > > + * @rd_mv:mv buffer for lat output , core input
> > > > + * @ube:ube buffer
> > > > + * @trans:transcoded buffer
> > > > + * @err_map:err map buffer
> > > > + * @row_info:row info buffer
> > > > + * @fb:current y/c buffer
> > > > + * @ref:ref y/c buffer
> > > > + * @iq_table:iq table buffer
> > > > + * @tile:tile buffer
> > > > + * @slots:slots info for each frame
> > > > + * @slot_id:current frame slot id
> > > > + * @frame:current frame info
> > > > + * @state:status after decode done
> > > > + * @cur_lst_tile_id:tile id for large scale
> > > > + */
> > > > +struct vdec_av1_slice_vsi {
> > > > +/* lat */
> > > > +struct vdec_av1_slice_mem bs;
> > > > +struct vdec_av1_slice_work_buffer
> > > > work_buffer[AV1_MAX_FRAME_BUF_COUNT];
> > > > +struct vdec_av1_slice_mem cdf_table;
> > > > +struct vdec_av1_slice_mem cdf_tmp;
> > > > +/* LAT stage's output, Core stage's input */
> > > > +struct vdec_av1_slice_mem rd_mv;
> > > > +struct vdec_av1_slice_mem ube;
> > > > +struct vdec_av1_slice_mem trans;
> > > > +struct vdec_av1_slice_mem err_map;
> > > > +struct vdec_av1_slice_mem row_info;
> > > > +/* core */
> > > > +struct vdec_av1_slice_fb fb;
> > > > +struct vdec_av1_slice_fb ref[V4L2_AV1_REFS_PER_FRAME];
> > > > +struct vdec_av1_slice_mem iq_table;
> > > > +/* lat and core share*/
> > > > +struct vdec_av1_slice_mem tile;
> > > > +struct vdec_av1_slice_slot slots;
> > > > +u8 slot_id;
> > > > +struct vdec_av1_slice_frame frame;
> > > > +struct vdec_av1_slice_state state;
> > > > +u32 cur_lst_tile_id;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_pfc - per-frame context that contains
> > > > a
> > > > local vsi.
> > > > + *                             pass it from lat to core
> > > > + * @vsi:        local vsi. copy to/from remote vsi
> > > > before/after
> > > > decoding
> > > > + * @ref_idx:    reference buffer timestamp
> > > > + * @seq:        picture sequence
> > > > + */
> > > > +struct vdec_av1_slice_pfc {
> > > > +struct vdec_av1_slice_vsi vsi;
> > > > +u64 ref_idx[V4L2_AV1_REFS_PER_FRAME];
> > > > +int seq;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_av1_slice_instance - represent one av1 instance
> > > > + * @ctx:                pointer to codec's context
> > > > + * @vpu:                VPU instance
> > > > + * @iq_table:           iq table buffer
> > > > + * @cdf_table:          cdf table buffer
> > > > + * @mv:                 mv working buffer
> > > > + * @cdf:                cdf working buffer
> > > > + * @seg:                segmentation working buffer
> > > > + * @cdf_temp:           cdf temp buffer
> > > > + * @tile:               tile buffer
> > > > + * @slots:              slots info
> > > > + * @tile_group:         tile_group entry
> > > > + * @level:              level of current resolution
> > > > + * @width:              width of last picture
> > > > + * @height:             height of last picture
> > > > + * @frame_type:         frame_type of last picture
> > > > + * @irq:                irq to Main CPU or MicroP
> > > > + * @inneracing_mode:    is inneracing mode
> > > > + * @init_vsi:           vsi used for initialized AV1 instance
> > > > + * @vsi:                vsi used for decoding/flush ...
> > > > + * @core_vsi:           vsi used for Core stage
> > > > + * @seq:                global picture sequence
> > > > + */
> > > > +struct vdec_av1_slice_instance {
> > > > +struct mtk_vcodec_ctx *ctx;
> > > > +struct vdec_vpu_inst vpu;
> > > > +
> > > > +struct mtk_vcodec_mem iq_table;
> > > > +struct mtk_vcodec_mem cdf_table;
> > > > +
> > > > +struct mtk_vcodec_mem mv[AV1_MAX_FRAME_BUF_COUNT];
> > > > +struct mtk_vcodec_mem cdf[AV1_MAX_FRAME_BUF_COUNT];
> > > > +struct mtk_vcodec_mem seg[AV1_MAX_FRAME_BUF_COUNT];
> > > > +struct mtk_vcodec_mem cdf_temp;
> > > > +struct mtk_vcodec_mem tile;
> > > > +struct vdec_av1_slice_slot slots;
> > > > +struct vdec_av1_slice_tile_group tile_group;
> > > > +
> > > > +/* for resolution change and get_pic_info */
> > > > +enum vdec_av1_slice_resolution_level level;
> > > > +u32 width;
> > > > +u32 height;
> > > > +
> > > > +u32 frame_type;
> > > > +u32 irq;
> > > > +u32 inneracing_mode;
> > > > +
> > > > +/* MicroP vsi */
> > > > +union {
> > > > +struct vdec_av1_slice_init_vsi *init_vsi;
> > > > +struct vdec_av1_slice_vsi *vsi;
> > > > +};
> > > > +struct vdec_av1_slice_vsi *core_vsi;
> > > > +int seq;
> > > > +};
> > > > +
> > > > +static int vdec_av1_slice_core_decode(struct vdec_lat_buf
> > > > *lat_buf);
> > > > +
> > > > +static inline int vdec_av1_slice_get_msb(u32 n)
> > > > +{
> > > > +if (n == 0)
> > > > +return 0;
> > > > +return 31 ^ __builtin_clz(n);
> > > > +}
> > > > +
> > > > +static inline bool vdec_av1_slice_need_scale(u32 ref_width,
> > > > u32
> > > > ref_height,
> > > > +     u32 this_width, u32
> > > > this_height)
> > > > +{
> > > > +return ((this_width << 1) >= ref_width) &&
> > > > +((this_height << 1) >= ref_height) &&
> > > > +(this_width <= (ref_width << 4)) &&
> > > > +(this_height <= (ref_height << 4));
> > > > +}
> > > > +
> > > > +static void *vdec_av1_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > > int
> > > > id)
> > > > +{
> > > > +struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > > +
> > > > +if (!ctrl)
> > > > +return ERR_PTR(-EINVAL);
> > > > +
> > > > +return ctrl->p_cur.p;
> > > > +}
> > > 
> > > I see we keep repeating this kind of a v4l2_ctrl_find() wrapper
> > > in
> > > drivers.
> > > The only reason this code cannot be factored out is the "context"
> > > struct pointer
> > > pointing at structs of different types. Maybe we could
> > > 
> > > #define v4l2_get_ctrl_ptr(ctx, member, id) \
> > > __v4l2_get_ctrl_ptr((ctx), offsetof(typeof(*ctx), (member)),
> > > (id))
> > > 
> > > void *__v4l2_get_ctrl_ptr(void *ctx, size_t offset, u32 id)
> > > {
> > > struct v4l2_ctrl_handler *hdl = (struct v4l2_ctrl_handler
> > > *)(ctx + offset);
> > > struct v4l2_ctrl *ctrl = v4l2_ctrl_find(hdl, id);
> > > 
> > > if (!ctrl)
> > > return ERR_PTR(-EINVAL);
> > > 
> > > return ctrl->p_cur.p;
> > > }
> > > 
> > > and reuse v4l2_get_ctrl_ptr() in drivers?
> > > 
> > > A similar kind of void* arithmetic happens in container_of, only
> > > with
> > > '-'.
> > > 
> > > > +
> > > > +static int vdec_av1_slice_init_cdf_table(struct
> > > > vdec_av1_slice_instance *instance)
> > > > +{
> > > > +u8 *remote_cdf_table;
> > > > +struct mtk_vcodec_ctx *ctx;
> > > > +struct vdec_av1_slice_init_vsi *vsi;
> > > > +int ret;
> > > > +
> > > > +ctx = instance->ctx;
> > > > +vsi = instance->vpu.vsi;
> > > > +if (!ctx || !vsi) {
> > > > +mtk_vcodec_err(instance, "invalid ctx or vsi 0x%p
> > > > 0x%p\n",
> > > > +       ctx, vsi);
> > > > +return -EINVAL;
> > > > +}
> > > 
> > > The above if block is redundant, because - given the current
> > > shape of
> > > ths driver
> > > code - the condition is never true.
> > > 
> > > This function is only called from vdec_av1_slice_init(), where
> > > both
> > > instance->ctx and instance->vpu.vsi are set to some values. The
> > > latter is even
> > > checked for being null before this function is called.
> > > 
> > > In the caller, instance->ctx is set to whatever the caller
> > > receives
> > > from its
> > > caller. This perhaps might be checked, but vdec_av1_slice_init()
> > > dereferences
> > > ctx without checking anyway (instance->vpu.codec_type = ctx-
> > > > current_codec;).
> > > 
> > > So maybe add a check in vdec_av1_slice_init(), or ensure that
> > > vdec_av1_slice_init() is never passed a NULL ctx.
> > > 
> > > > +
> > > > +remote_cdf_table = mtk_vcodec_fw_map_dm_addr(ctx->dev-
> > > > > fw_handler,
> > > > 
> > > > +     (u32)vsi-
> > > > > cdf_table_addr);
> > > > 
> > > > +if (IS_ERR(remote_cdf_table)) {
> > > > +mtk_vcodec_err(instance, "failed to map cdf table\n");
> > > > +return PTR_ERR(remote_cdf_table);
> > > > +}
> > > > +
> > > > +mtk_vcodec_debug(instance, "map cdf table to 0x%p\n",
> > > > + remote_cdf_table);
> > > > +
> > > > +if (instance->cdf_table.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->cdf_table);
> > > > +instance->cdf_table.size = vsi->cdf_table_size;
> > > > +
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->cdf_table);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +memcpy(instance->cdf_table.va, remote_cdf_table, vsi-
> > > > > cdf_table_size);
> > > > 
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_init_iq_table(struct
> > > > vdec_av1_slice_instance *instance)
> > > > +{
> > > > +u8 *remote_iq_table;
> > > > +struct mtk_vcodec_ctx *ctx;
> > > > +struct vdec_av1_slice_init_vsi *vsi;
> > > > +int ret;
> > > > +
> > > > +ctx = instance->ctx;
> > > > +vsi = instance->vpu.vsi;
> > > > +if (!ctx || !vsi) {
> > > > +mtk_vcodec_err(instance, "invalid ctx or vsi 0x%p
> > > > 0x%p\n",
> > > > +       ctx, vsi);
> > > > +return -EINVAL;
> > > > +}
> > > 
> > > ditto
> > > 
> > > > +
> > > > +remote_iq_table = mtk_vcodec_fw_map_dm_addr(ctx->dev-
> > > > > fw_handler,
> > > > 
> > > > +    (u32)vsi-
> > > > > iq_table_addr);
> > > > 
> > > > +if (IS_ERR(remote_iq_table)) {
> > > > +mtk_vcodec_err(instance, "failed to map iq table\n");
> > > > +return PTR_ERR(remote_iq_table);
> > > > +}
> > > > +
> > > > +mtk_vcodec_debug(instance, "map iq table to 0x%p\n",
> > > > remote_iq_table);
> > > > +
> > > > +if (instance->iq_table.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->iq_table);
> > > > +instance->iq_table.size = vsi->iq_table_size;
> > > > +
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->iq_table);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +memcpy(instance->iq_table.va, remote_iq_table, vsi-
> > > > > iq_table_size);
> > > > 
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_get_new_slot(struct
> > > > vdec_av1_slice_vsi
> > > > *vsi)
> > > > +{
> > > > +struct vdec_av1_slice_slot *slots = &vsi->slots;
> > > > +int new_slot_idx = AV1_INVALID_IDX;
> > > > +int i;
> > > > +
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +if (slots->frame_info[i].ref_count == 0) {
> > > > +new_slot_idx = i;
> > > > +break;
> > > > +}
> > > > +}
> > > > +
> > > > +if (new_slot_idx != AV1_INVALID_IDX) {
> > > > +slots->frame_info[new_slot_idx].ref_count++;
> > > > +slots->timestamp[new_slot_idx] = vsi->frame.cur_ts;
> > > > +}
> > > > +
> > > > +return new_slot_idx;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_clear_fb(struct
> > > > vdec_av1_slice_frame_info *frame_info)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +memset((void *)frame_info, 0, sizeof(struct
> > > > vdec_av1_slice_frame_info));
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_decrease_ref_count(struct
> > > > vdec_av1_slice_slot *slots, int fb_idx)
> > > > +{
> > > > +struct vdec_av1_slice_frame_info *frame_info = slots-
> > > > > frame_info;
> > > > 
> > > > +
> > > > +if (fb_idx < 0 || fb_idx >= AV1_MAX_FRAME_BUF_COUNT) {
> > > > +mtk_v4l2_err("av1_error: %s() invalid fb_idx %d\n",
> > > > __func__, fb_idx);
> > > > +return;
> > > > +}
> > > 
> > > The above if block is redundant, because - given the current
> > > shape of
> > > this
> > > driver code - the condition is never true.
> > > 
> > > This function is only called from the below
> > > vdec_av1_slice_cleanup_slots().
> > > The fb_idx formal param comes from the caller's slot_id local
> > > variable, whose
> > > value is only assigned in the for loop, iterating from 0 to
> > > AV1_MAX_FRAME_BUF_COUNT - 1, inclusive. Hence slot_id is never <
> > > 0
> > > nor >= AV1_MAX_FRAME_BUF_COUNT.
> > > 
> > > > +
> > > > +frame_info[fb_idx].ref_count--;
> > > > +if (frame_info[fb_idx].ref_count < 0) {
> > > > +frame_info[fb_idx].ref_count = 0;
> > > > +mtk_v4l2_err("av1_error: %s() fb_idx %d decrease
> > > > ref_count error\n",
> > > > +     __func__, fb_idx);
> > > > +}
> > > > +vdec_av1_slice_clear_fb(&frame_info[fb_idx]);
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_cleanup_slots(struct
> > > > vdec_av1_slice_slot *slots,
> > > > + struct vdec_av1_slice_frame
> > > > *frame,
> > > > + struct v4l2_ctrl_av1_frame
> > > > *ctrl_fh)
> > > > +{
> > > > +int slot_id, ref_id;
> > > > +
> > > > +for (ref_id = 0; ref_id < V4L2_AV1_TOTAL_REFS_PER_FRAME;
> > > > ref_id++)
> > > > +frame->ref_frame_map[ref_id] = AV1_INVALID_IDX;
> > > > +
> > > > +for (slot_id = 0; slot_id < AV1_MAX_FRAME_BUF_COUNT;
> > > > slot_id++)
> > > > {
> > > > +u64 timestamp = slots->timestamp[slot_id];
> > > > +bool ref_used = false;
> > > > +
> > > > +/* ignored unused slots */
> > > > +if (slots->frame_info[slot_id].ref_count == 0)
> > > > +continue;
> > > > +
> > > > +for (ref_id = 0; ref_id <
> > > > V4L2_AV1_TOTAL_REFS_PER_FRAME; ref_id++) {
> > > > +if (ctrl_fh->reference_frame_ts[ref_id] ==
> > > > timestamp) {
> > > > +frame->ref_frame_map[ref_id] = slot_id;
> > > > +ref_used = true;
> > > > +}
> > > > +}
> > > > +
> > > > +if (!ref_used)
> > > > +vdec_av1_slice_decrease_ref_count(slots,
> > > > slot_id);
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_slot(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +      struct vdec_av1_slice_vsi *vsi,
> > > > +      struct v4l2_ctrl_av1_frame
> > > > *ctrl_fh)
> > > > +{
> > > > +struct vdec_av1_slice_frame_info *cur_frame_info;
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &vsi-
> > > > >frame.uh;
> > > > +int ref_id;
> > > > +
> > > > +memcpy(&vsi->slots, &instance->slots, sizeof(instance-
> > > > >slots));
> > > > +vdec_av1_slice_cleanup_slots(&vsi->slots, &vsi->frame,
> > > > ctrl_fh);
> > > > +vsi->slot_id = vdec_av1_slice_get_new_slot(vsi);
> > > > +
> > > > +if (vsi->slot_id == AV1_INVALID_IDX) {
> > > > +mtk_v4l2_err("warning:av1 get invalid index slot\n");
> > > > +vsi->slot_id = 0;
> > > > +}
> > > > +cur_frame_info = &vsi->slots.frame_info[vsi->slot_id];
> > > > +cur_frame_info->frame_type = uh->frame_type;
> > > > +cur_frame_info->frame_is_intra = ((uh->frame_type ==
> > > > AV1_INTRA_ONLY_FRAME) ||
> > > > +  (uh->frame_type ==
> > > > AV1_KEY_FRAME));
> > > > +cur_frame_info->order_hint = uh->order_hint;
> > > > +cur_frame_info->upscaled_width = uh->upscaled_width;
> > > > +cur_frame_info->pic_pitch = 0;
> > > > +cur_frame_info->frame_width = uh->frame_width;
> > > > +cur_frame_info->frame_height = uh->frame_height;
> > > > +cur_frame_info->mi_cols = ((uh->frame_width + 7) >> 3) << 1;
> > > > +cur_frame_info->mi_rows = ((uh->frame_height + 7) >> 3) << 1;
> > > > +
> > > > +/* ensure current frame is properly mapped if referenced */
> > > > +for (ref_id = 0; ref_id < V4L2_AV1_TOTAL_REFS_PER_FRAME;
> > > > ref_id++) {
> > > > +u64 timestamp = vsi->slots.timestamp[vsi->slot_id];
> > > > +
> > > > +if (ctrl_fh->reference_frame_ts[ref_id] == timestamp)
> > > > +vsi->frame.ref_frame_map[ref_id] = vsi-
> > > > > slot_id;
> > > > 
> > > > +}
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_alloc_working_buffer(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +       struct
> > > > vdec_av1_slice_vsi *vsi)
> > > > +{
> > > > +struct mtk_vcodec_ctx *ctx = instance->ctx;
> > > > +struct vdec_av1_slice_work_buffer *work_buffer = vsi-
> > > > > work_buffer;
> > > > 
> > > > +enum vdec_av1_slice_resolution_level level;
> > > > +u32 max_sb_w, max_sb_h, max_w, max_h, w, h;
> > > > +size_t size;
> > > > +int i, ret;
> > > > +
> > > > +w = vsi->frame.uh.frame_width;
> > > > +h = vsi->frame.uh.frame_height;
> > > > +
> > > > +if (w > VCODEC_DEC_4K_CODED_WIDTH || h >
> > > > VCODEC_DEC_4K_CODED_HEIGHT)
> > > > +/* 8K */
> > > > +return -EINVAL;
> > > > +
> > > > +if (w > MTK_VDEC_MAX_W || h > MTK_VDEC_MAX_H) {
> > > > +/* 4K */
> > > > +level = AV1_RES_4K;
> > > > +max_w = VCODEC_DEC_4K_CODED_WIDTH;
> > > > +max_h = VCODEC_DEC_4K_CODED_HEIGHT;
> > > > +} else {
> > > > +/* FHD */
> > > > +level = AV1_RES_FHD;
> > > > +max_w = MTK_VDEC_MAX_W;
> > > > +max_h = MTK_VDEC_MAX_H;
> > > > +}
> > > > +
> > > > +if (level == instance->level)
> > > > +return 0;
> > > > +
> > > > +mtk_vcodec_debug(instance, "resolution level changed from %u
> > > > to
> > > > %u, %ux%u",
> > > > + instance->level, level, w, h);
> > > > +
> > > > +max_sb_w = DIV_ROUND_UP(max_w, 128);
> > > > +max_sb_h = DIV_ROUND_UP(max_h, 128);
> > > > +size = max_sb_w * max_sb_h * SZ_1K;
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +if (instance->mv[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> > > > +instance->mv[i].size = size;
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->mv[i]);
> > > > +if (ret)
> > > > +goto err;
> > > 
> > > Please ignore this comment if this has been discussed and
> > > settled.
> > > Maybe it's just me, but I feel it is idiomatic in the kernel to
> > > undo all previous allocations if at some iteration we fail. Here
> > > a
> > > different
> > > approach is taken: we stop iterating and return an error, and
> > > free
> > > next time
> > > we are called. Why?
> > > 
> > > > +work_buffer[i].mv_addr.buf = instance->mv[i].dma_addr;
> > > > +work_buffer[i].mv_addr.size = size; > +}
> > > > +
> > > > +size = max_sb_w * max_sb_h * 512;
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +if (instance->seg[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->seg[i]);
> > > > +instance->seg[i].size = size;
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->seg[i]);
> > > > +if (ret)
> > > > +goto err;
> > > > +work_buffer[i].segid_addr.buf = instance-
> > > > > seg[i].dma_addr;
> > > > 
> > > > +work_buffer[i].segid_addr.size = size;
> > > > +}
> > > > +
> > > > +size = 16384;
> > > 
> > > #define a named constant for this magic number?
> > > 
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +if (instance->cdf[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->cdf[i]);
> > > > +instance->cdf[i].size = size;
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->cdf[i]);
> > > > +if (ret)
> > > > +goto err;
> > > > +work_buffer[i].cdf_addr.buf = instance-
> > > > > cdf[i].dma_addr;
> > > > 
> > > > +work_buffer[i].cdf_addr.size = size;
> > > > +}
> > > 
> > > The 3 for loops are supposed to iterate from 0 to
> > > AV1_MAX_FRAME_BUF_COUNT - 1,
> > > inclusive. Is it possible to merge them?
> > > 
> > > > +if (!instance->cdf_temp.va) {
> > > > +instance->cdf_temp.size = (SZ_1K * 16 * 100);
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->cdf_temp);
> > > > +if (ret)
> > > > +goto err;
> > > > +vsi->cdf_tmp.buf = instance->cdf_temp.dma_addr;
> > > > +vsi->cdf_tmp.size = instance->cdf_temp.size;
> > > > +}
> > > > +size = AV1_TILE_BUF_SIZE * V4L2_AV1_MAX_TILE_COUNT;
> > > 
> > > This "size" is never changed until the end of this function.
> > > It is a compile-time constant, so there's no need to assign its
> > > value to an intermediate variable.
> > > 
> > > > +
> > > > +if (instance->tile.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->tile);
> > > > +instance->tile.size = size;
> > > 
> > > instance->tile.size = AV1_TILE_BUF_SIZE *
> > > V4L2_AV1_MAX_TILE_COUNT;
> > > 
> > > > +
> > > > +ret = mtk_vcodec_mem_alloc(ctx, &instance->tile);
> > > > +if (ret)
> > > > +goto err;
> > > > +
> > > > +vsi->tile.buf = instance->tile.dma_addr;
> > > > +vsi->tile.size = size;
> > > 
> > > vsi->tile.size = instance->tile.size;
> > > 
> > > and now it is clear the size in vsi is the same as the one in
> > > instance.
> > > BTW, is vsi->tile.size supposed to always be equal to the one in
> > > instance?
> > > If yes:
> > >   - Is instance available whenever we need to access vsi-
> > > >tile.size?
> > >   - What's the point of duplicating this value? Can it be stored
> > > in
> > > one place?
> > > 
> > > > +
> > > > +instance->level = level;
> > > > +return 0;
> > > > +
> > > > +err:
> > > > +instance->level = AV1_RES_NONE;
> > > > +return ret;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_free_working_buffer(struct
> > > > vdec_av1_slice_instance *instance)
> > > > +{
> > > > +struct mtk_vcodec_ctx *ctx = instance->ctx;
> > > > +int i;
> > > > +
> > > > +for (i = 0; i < ARRAY_SIZE(instance->mv); i++)
> > > > +if (instance->mv[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> > > 
> > > Perhaps mtk_vcodec_mem_free() can properly handle the case
> > > 
> > > (!instance->mv[i].va) ? This would eliminate 7 of 20 lines of
> > > code
> > > in this function.
> > > 
> > > > +
> > > > +for (i = 0; i < ARRAY_SIZE(instance->seg); i++)
> > > > +if (instance->seg[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->seg[i]);
> > > > +
> > > > +for (i = 0; i < ARRAY_SIZE(instance->cdf); i++)
> > > > +if (instance->cdf[i].va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->cdf[i]);
> > > > +
> > > > +if (instance->tile.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->tile);
> > > > +if (instance->cdf_temp.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->cdf_temp);
> > > > +if (instance->cdf_table.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->cdf_table);
> > > > +if (instance->iq_table.va)
> > > > +mtk_vcodec_mem_free(ctx, &instance->iq_table);
> > > > +
> > > > +instance->level = AV1_RES_NONE;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_vsi_from_remote(struct
> > > > vdec_av1_slice_vsi *vsi,
> > > > +   struct vdec_av1_slice_vsi
> > > > *remote_vsi)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +memcpy(&vsi->trans, &remote_vsi->trans, sizeof(vsi->trans));
> > > > +memcpy(&vsi->state, &remote_vsi->state, sizeof(vsi->state));
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_vsi_to_remote(struct
> > > > vdec_av1_slice_vsi
> > > > *vsi,
> > > > + struct vdec_av1_slice_vsi
> > > > *remote_vsi)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +memcpy(remote_vsi, vsi, sizeof(*vsi));
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_lat_from_src_buf(struct
> > > > vdec_av1_slice_instance *instance,
> > > > + struct
> > > > vdec_av1_slice_vsi *vsi,
> > > > + struct vdec_lat_buf
> > > > *lat_buf)
> > > > +{
> > > > +struct vb2_v4l2_buffer *src;
> > > > +struct vb2_v4l2_buffer *dst;
> > > > +
> > > > +src = v4l2_m2m_next_src_buf(instance->ctx->m2m_ctx);
> > > > +if (!src)
> > > > +return -EINVAL;
> > > > +
> > > > +lat_buf->src_buf_req = src->vb2_buf.req_obj.req;
> > > > +dst = &lat_buf->ts_info;
> > > 
> > > the "ts_info" actually contains a struct vb2_v4l2_buffer. Why
> > > such a
> > > name?
> > > 
> > > > +v4l2_m2m_buf_copy_metadata(src, dst, true);
> > > > +vsi->frame.cur_ts = dst->vb2_buf.timestamp;
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static short vdec_av1_slice_resolve_divisor_32(u32 D, short
> > > > *shift)
> > > > +{
> > > > +int f;
> > > > +int e;
> > > > +
> > > > +*shift = vdec_av1_slice_get_msb(D);
> > > > +/* e is obtained from D after resetting the most significant 1
> > > > bit. */
> > > > +e = D - ((u32)1 << *shift);
> > > > +/* Get the most significant DIV_LUT_BITS (8) bits of e into f
> > > > */
> > > > +if (*shift > DIV_LUT_BITS)
> > > > +f = AV1_DIV_ROUND_UP_POW2(e, *shift - DIV_LUT_BITS);
> > > > +else
> > > > +f = e << (DIV_LUT_BITS - *shift);
> > > > +if (f > DIV_LUT_NUM)
> > > > +return -1;
> > > > +*shift += DIV_LUT_PREC_BITS;
> > > > +/* Use f as lookup into the precomputed table of multipliers
> > > > */
> > > > +return div_lut[f];
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_get_shear_params(struct
> > > > vdec_av1_slice_gm *gm_params)
> > > > +{
> > > > +const int *mat = gm_params->wmmat;
> > > > +short shift;
> > > > +short y;
> > > > +long long gv, dv;
> > > > +
> > > > +if (gm_params->wmmat[2] <= 0)
> > > > +return;
> > > > +
> > > > +gm_params->alpha = clamp_val(mat[2] - (1 <<
> > > > WARPEDMODEL_PREC_BITS), S16_MIN, S16_MAX);
> > > > +gm_params->beta = clamp_val(mat[3], S16_MIN, S16_MAX);
> > > > +
> > > > +y = vdec_av1_slice_resolve_divisor_32(abs(mat[2]), &shift) *
> > > > (mat[2] < 0 ? -1 : 1);
> > > > +
> > > > +gv = ((long long)mat[4] * (1 << WARPEDMODEL_PREC_BITS)) * y;
> > > > +gm_params->gamma =
> > > > clamp_val((int)AV1_DIV_ROUND_UP_POW2_SIGNED(gv, shift),
> > > > +     S16_MIN, S16_MAX);
> > > > +
> > > > +dv = ((long long)mat[3] * mat[4]) * y;
> > > > +gm_params->delta = clamp_val(mat[5] -
> > > > (int)AV1_DIV_ROUND_UP_POW2_SIGNED(dv, shift) -
> > > > +     (1 << WARPEDMODEL_PREC_BITS),
> > > > S16_MIN, S16_MAX);
> > > > +
> > > > +gm_params->alpha = AV1_DIV_ROUND_UP_POW2_SIGNED(gm_params-
> > > > > alpha, WARP_PARAM_REDUCE_BITS) *
> > > > 
> > > > +(1 <<
> > > > WARP_PARAM_REDUCE_BITS);
> > > > +gm_params->beta = AV1_DIV_ROUND_UP_POW2_SIGNED(gm_params-
> > > > >beta, 
> > > > WARP_PARAM_REDUCE_BITS) *
> > > > +       (1 <<
> > > > WARP_PARAM_REDUCE_BITS);
> > > > +gm_params->gamma = AV1_DIV_ROUND_UP_POW2_SIGNED(gm_params-
> > > > > gamma, WARP_PARAM_REDUCE_BITS) *
> > > > 
> > > > +(1 <<
> > > > WARP_PARAM_REDUCE_BITS);
> > > > +gm_params->delta = AV1_DIV_ROUND_UP_POW2_SIGNED(gm_params-
> > > > > delta, WARP_PARAM_REDUCE_BITS) *
> > > > 
> > > > +(1 <<
> > > > WARP_PARAM_REDUCE_BITS);
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_gm(struct vdec_av1_slice_gm
> > > > *gm,
> > > > +    struct v4l2_av1_global_motion
> > > > *ctrl_gm)
> > > > +{
> > > > +u32 i, j;
> > > > +
> > > > +for (i = 0; i < V4L2_AV1_TOTAL_REFS_PER_FRAME; i++) {
> > > > +gm[i].wmtype = ctrl_gm->type[i];
> > > > +for (j = 0; j < 6; j++)
> > > 
> > > Maybe #define this magic 6?
> > > 
> > > > +gm[i].wmmat[j] = ctrl_gm->params[i][j];
> > > > +
> > > > +gm[i].invalid = !!(ctrl_gm->invalid & BIT(i));
> > > > +gm[i].alpha = 0;
> > > > +gm[i].beta = 0;
> > > > +gm[i].gamma = 0;
> > > > +gm[i].delta = 0;
> > > > +if (gm[i].wmtype <= 3)
> > > 
> > > And this 3?
> > > 
> > > > +vdec_av1_slice_get_shear_params(&gm[i]);
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_seg(struct vdec_av1_slice_seg
> > > > *seg,
> > > > +     struct v4l2_av1_segmentation
> > > > *ctrl_seg)
> > > > +{
> > > > +u32 i, j;
> > > > +
> > > > +seg->segmentation_enabled = SEGMENTATION_FLAG(ctrl_seg,
> > > > ENABLED);
> > > > +seg->segmentation_update_map = SEGMENTATION_FLAG(ctrl_seg,
> > > > UPDATE_MAP);
> > > > +seg->segmentation_temporal_update =
> > > > SEGMENTATION_FLAG(ctrl_seg,
> > > > TEMPORAL_UPDATE);
> > > > +seg->segmentation_update_data = SEGMENTATION_FLAG(ctrl_seg,
> > > > UPDATE_DATA);
> > > > +seg->segid_preskip = SEGMENTATION_FLAG(ctrl_seg,
> > > > SEG_ID_PRE_SKIP);
> > > > +seg->last_active_segid = ctrl_seg->last_active_seg_id;
> > > > +
> > > > +for (i = 0; i < V4L2_AV1_MAX_SEGMENTS; i++) {
> > > > +seg->feature_enabled_mask[i] = ctrl_seg-
> > > > > feature_enabled[i];
> > > > 
> > > > +for (j = 0; j < V4L2_AV1_SEG_LVL_MAX; j++)
> > > > +seg->feature_data[i][j] = ctrl_seg-
> > > > > feature_data[i][j];
> > > > 
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_quant(struct
> > > > vdec_av1_slice_quantization *quant,
> > > > +       struct v4l2_av1_quantization
> > > > *ctrl_quant)
> > > > +{
> > > > +quant->base_q_idx = ctrl_quant->base_q_idx;
> > > > +quant->delta_qydc = ctrl_quant->delta_q_y_dc;
> > > > +quant->delta_qudc = ctrl_quant->delta_q_u_dc;
> > > > +quant->delta_quac = ctrl_quant->delta_q_u_ac;
> > > > +quant->delta_qvdc = ctrl_quant->delta_q_v_dc;
> > > > +quant->delta_qvac = ctrl_quant->delta_q_v_ac;
> > > > +quant->qm_y = ctrl_quant->qm_y;
> > > > +quant->qm_u = ctrl_quant->qm_u;
> > > > +quant->qm_v = ctrl_quant->qm_v;
> > > 
> > > Can a common struct be introduced to hold these parameters?
> > > And then copied in one go?
> > > 
> > > Maybe there's a good reason the code is the way it is now.
> > > However,
> > > a series of "dumb" assignments (no value modifications) makes me
> > > wonder.
> > > 
> > > > +quant->using_qmatrix = QUANT_FLAG(ctrl_quant, USING_QMATRIX);
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_get_qindex(struct
> > > > vdec_av1_slice_uncompressed_header *uh,
> > > > +     int segmentation_id)
> > > > +{
> > > > +struct vdec_av1_slice_seg *seg = &uh->seg;
> > > > +struct vdec_av1_slice_quantization *quant = &uh->quant;
> > > > +int data = 0, qindex = 0;
> > > > +
> > > > +if (seg->segmentation_enabled &&
> > > > +    (seg->feature_enabled_mask[segmentation_id] &
> > > > BIT(SEG_LVL_ALT_Q))) {
> > > > +data = seg-
> > > > > feature_data[segmentation_id][SEG_LVL_ALT_Q];
> > > > 
> > > > +qindex = quant->base_q_idx + data;
> > > > +return clamp_val(qindex, 0, MAXQ);
> > > > +}
> > > > +
> > > > +return quant->base_q_idx;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_lr(struct vdec_av1_slice_lr
> > > > *lr,
> > > > +    struct
> > > > v4l2_av1_loop_restoration  *ctrl_lr)
> > > > +{
> > > > +int i;
> > > > +
> > > > +lr->use_lr = 0;
> > > > +lr->use_chroma_lr = 0;
> > > > +for (i = 0; i < V4L2_AV1_NUM_PLANES_MAX; i++) {
> > > > +lr->frame_restoration_type[i] = ctrl_lr-
> > > > > frame_restoration_type[i];
> > > > 
> > > > +lr->loop_restoration_size[i] = ctrl_lr-
> > > > > loop_restoration_size[i];
> > > > 
> > > > +if (lr->frame_restoration_type[i]) {
> > > > +lr->use_lr = 1;
> > > > +if (i > 0)
> > > > +lr->use_chroma_lr = 1;
> > > > +}
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_lf(struct
> > > > vdec_av1_slice_loop_filter *lf,
> > > > +    struct v4l2_av1_loop_filter
> > > > *ctrl_lf)
> > > > +{
> > > > +int i;
> > > > +
> > > > +for (i = 0; i < ARRAY_SIZE(lf->loop_filter_level); i++)
> > > > +lf->loop_filter_level[i] = ctrl_lf->level[i];
> > > > +
> > > > +for (i = 0; i < V4L2_AV1_TOTAL_REFS_PER_FRAME; i++)
> > > > +lf->loop_filter_ref_deltas[i] = ctrl_lf->ref_deltas[i];
> > > > +
> > > > +for (i = 0; i < ARRAY_SIZE(lf->loop_filter_mode_deltas); i++)
> > > > +lf->loop_filter_mode_deltas[i] = ctrl_lf-
> > > > > mode_deltas[i];
> > > > 
> > > > +
> > > > +lf->loop_filter_sharpness = ctrl_lf->sharpness;
> > > > +lf->loop_filter_delta_enabled =
> > > > +   BIT_FLAG(ctrl_lf,
> > > > V4L2_AV1_LOOP_FILTER_FLAG_DELTA_ENABLED);
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_cdef(struct
> > > > vdec_av1_slice_cdef
> > > > *cdef,
> > > > +      struct v4l2_av1_cdef *ctrl_cdef)
> > > > +{
> > > > +int i;
> > > > +
> > > > +cdef->cdef_damping = ctrl_cdef->damping_minus_3 + 3;
> > > > +cdef->cdef_bits = ctrl_cdef->bits;
> > > > +
> > > > +for (i = 0; i < V4L2_AV1_CDEF_MAX; i++) {
> > > > +if (ctrl_cdef->y_sec_strength[i] == 4)
> > > > +ctrl_cdef->y_sec_strength[i] -= 1;
> > > > +
> > > > +if (ctrl_cdef->uv_sec_strength[i] == 4)
> > > > +ctrl_cdef->uv_sec_strength[i] -= 1;
> > > > +
> > > > +cdef->cdef_y_strength[i] =
> > > > +ctrl_cdef->y_pri_strength[i] <<
> > > > SECONDARY_FILTER_STRENGTH_NUM_BITS |
> > > > +ctrl_cdef->y_sec_strength[i];
> > > > +cdef->cdef_uv_strength[i] =
> > > > +ctrl_cdef->uv_pri_strength[i] <<
> > > > SECONDARY_FILTER_STRENGTH_NUM_BITS |
> > > > +ctrl_cdef->uv_sec_strength[i];
> > > > +}
> > > > +}
> > > 
> > > Both vdec_av1_slice_setup_lf() and vdec_av1_slice_setup_cdef():
> > > 
> > > I'm wondering if the user of struct vdec_av1_slice_loop_filter
> > > and
> > > struct 
> > > vdec_av1_slice_cdef could work with the uAPI variants of these
> > > structs? Is there
> > > a need for driver-specific mutations? (Maybe there is, the
> > > driver's
> > > author
> > > should know).
> > > 
> > > > +
> > > > +static void vdec_av1_slice_setup_seq(struct
> > > > vdec_av1_slice_seq_header *seq,
> > > > +     struct v4l2_ctrl_av1_sequence
> > > > *ctrl_seq)
> > > > +{
> > > > +seq->bitdepth = ctrl_seq->bit_depth;
> > > > +seq->max_frame_width = ctrl_seq->max_frame_width_minus_1 + 1;
> > > > +seq->max_frame_height = ctrl_seq->max_frame_height_minus_1 +
> > > > 1;
> > > > +seq->enable_superres = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_SUPERRES);
> > > > +seq->enable_filter_intra = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_FILTER_INTRA);
> > > > +seq->enable_intra_edge_filter = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_INTRA_EDGE_FILTER);
> > > > +seq->enable_interintra_compound = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_INTERINTRA_COMPOUND);
> > > > +seq->enable_masked_compound = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_MASKED_COMPOUND);
> > > > +seq->enable_dual_filter = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_DUAL_FILTER);
> > > > +seq->enable_jnt_comp = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_JNT_COMP);
> > > > +seq->mono_chrome = SEQUENCE_FLAG(ctrl_seq, MONO_CHROME);
> > > > +seq->enable_order_hint = SEQUENCE_FLAG(ctrl_seq,
> > > > ENABLE_ORDER_HINT);
> > > > +seq->order_hint_bits = ctrl_seq->order_hint_bits;
> > > > +seq->use_128x128_superblock = SEQUENCE_FLAG(ctrl_seq,
> > > > USE_128X128_SUPERBLOCK);
> > > > +seq->subsampling_x = SEQUENCE_FLAG(ctrl_seq, SUBSAMPLING_X);
> > > > +seq->subsampling_y = SEQUENCE_FLAG(ctrl_seq, SUBSAMPLING_Y);
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_tile(struct
> > > > vdec_av1_slice_frame
> > > > *frame,
> > > > +      struct v4l2_av1_tile_info
> > > > *ctrl_tile)
> > > > +{
> > > > +struct vdec_av1_slice_seq_header *seq = &frame->seq;
> > > > +struct vdec_av1_slice_tile *tile = &frame->uh.tile;
> > > > +u32 mib_size_log2 = seq->use_128x128_superblock ? 5 : 4;
> > > > +int i;
> > > > +
> > > > +tile->tile_cols = ctrl_tile->tile_cols;
> > > > +tile->tile_rows = ctrl_tile->tile_rows;
> > > > +tile->context_update_tile_id = ctrl_tile-
> > > > > context_update_tile_id;
> > > > 
> > > > +tile->uniform_tile_spacing_flag =
> > > > +BIT_FLAG(ctrl_tile,
> > > > V4L2_AV1_TILE_INFO_FLAG_UNIFORM_TILE_SPACING);
> > > > +
> > > > +for (i = 0; i < tile->tile_cols + 1; i++)
> > > > +tile->mi_col_starts[i] =
> > > > +ALIGN(ctrl_tile->mi_col_starts[i],
> > > > BIT(mib_size_log2)) >> mib_size_log2;
> > > > +
> > > > +for (i = 0; i < tile->tile_rows + 1; i++)
> > > > +tile->mi_row_starts[i] =
> > > > +ALIGN(ctrl_tile->mi_row_starts[i],
> > > > BIT(mib_size_log2)) >> mib_size_log2;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_uh(struct
> > > > vdec_av1_slice_instance
> > > > *instance,
> > > > +    struct vdec_av1_slice_frame *frame,
> > > > +    struct v4l2_ctrl_av1_frame
> > > > *ctrl_fh)
> > > > +{
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &frame->uh;
> > > > +int i;
> > > > +
> > > > +uh->use_ref_frame_mvs = FH_FLAG(ctrl_fh, USE_REF_FRAME_MVS);
> > > > +uh->order_hint = ctrl_fh->order_hint;
> > > > +vdec_av1_slice_setup_gm(uh->gm, &ctrl_fh->global_motion);
> > > > +uh->upscaled_width = ctrl_fh->upscaled_width;
> > > > +uh->frame_width = ctrl_fh->frame_width_minus_1 + 1;
> > > > +uh->frame_height = ctrl_fh->frame_height_minus_1 + 1;
> > > > +uh->mi_cols = ((uh->frame_width + 7) >> 3) << 1;
> > > > +uh->mi_rows = ((uh->frame_height + 7) >> 3) << 1;
> > > > +uh->reduced_tx_set = FH_FLAG(ctrl_fh, REDUCED_TX_SET);
> > > > +uh->tx_mode = ctrl_fh->tx_mode;
> > > > +uh->uniform_tile_spacing_flag = FH_FLAG(ctrl_fh,
> > > > UNIFORM_TILE_SPACING);
> > > > +uh->interpolation_filter = ctrl_fh->interpolation_filter;
> > > > +uh->allow_warped_motion = FH_FLAG(ctrl_fh,
> > > > ALLOW_WARPED_MOTION);
> > > > +uh->is_motion_mode_switchable = FH_FLAG(ctrl_fh,
> > > > IS_MOTION_MODE_SWITCHABLE);
> > > > +uh->frame_type = ctrl_fh->frame_type;
> > > > +uh->frame_is_intra = (uh->frame_type ==
> > > > V4L2_AV1_INTRA_ONLY_FRAME ||
> > > > +      uh->frame_type == V4L2_AV1_KEY_FRAME);
> > > > +
> > > > +if (!uh->frame_is_intra && FH_FLAG(ctrl_fh, REFERENCE_SELECT))
> > > > +uh->reference_mode = AV1_REFERENCE_MODE_SELECT;
> > > > +else
> > > > +uh->reference_mode = AV1_SINGLE_REFERENCE;
> > > > +
> > > > +uh->allow_high_precision_mv = FH_FLAG(ctrl_fh,
> > > > ALLOW_HIGH_PRECISION_MV);
> > > > +uh->allow_intra_bc = FH_FLAG(ctrl_fh, ALLOW_INTRABC);
> > > > +uh->force_integer_mv = FH_FLAG(ctrl_fh, FORCE_INTEGER_MV);
> > > > +uh->allow_screen_content_tools = FH_FLAG(ctrl_fh,
> > > > ALLOW_SCREEN_CONTENT_TOOLS);
> > > > +uh->error_resilient_mode = FH_FLAG(ctrl_fh,
> > > > ERROR_RESILIENT_MODE);
> > > > +uh->primary_ref_frame = ctrl_fh->primary_ref_frame;
> > > > +uh->disable_frame_end_update_cdf =
> > > > +FH_FLAG(ctrl_fh, DISABLE_FRAME_END_UPDATE_CDF);
> > > > +uh->disable_cdf_update = FH_FLAG(ctrl_fh, DISABLE_CDF_UPDATE);
> > > > +uh->skip_mode.skip_mode_present = FH_FLAG(ctrl_fh,
> > > > SKIP_MODE_PRESENT);
> > > > +uh->skip_mode.skip_mode_frame[0] =
> > > > +ctrl_fh->skip_mode_frame[0] - V4L2_AV1_REF_LAST_FRAME;
> > > > +uh->skip_mode.skip_mode_frame[1] =
> > > > +ctrl_fh->skip_mode_frame[1] - V4L2_AV1_REF_LAST_FRAME;
> > > > +uh->skip_mode.skip_mode_allowed = ctrl_fh->skip_mode_frame[0]
> > > > ?
> > > > 1 : 0;
> > > > +
> > > > +vdec_av1_slice_setup_seg(&uh->seg, &ctrl_fh->segmentation);
> > > > +uh->delta_q_lf.delta_q_present = QUANT_FLAG(&ctrl_fh-
> > > > > quantization, DELTA_Q_PRESENT);
> > > > 
> > > > +uh->delta_q_lf.delta_q_res = 1 << ctrl_fh-
> > > > > quantization.delta_q_res;
> > > > 
> > > > +uh->delta_q_lf.delta_lf_present =
> > > > +BIT_FLAG(&ctrl_fh->loop_filter,
> > > > V4L2_AV1_LOOP_FILTER_FLAG_DELTA_LF_PRESENT);
> > > > +uh->delta_q_lf.delta_lf_res = ctrl_fh-
> > > > > loop_filter.delta_lf_res;
> > > > 
> > > > +uh->delta_q_lf.delta_lf_multi =
> > > > +BIT_FLAG(&ctrl_fh->loop_filter,
> > > > V4L2_AV1_LOOP_FILTER_FLAG_DELTA_LF_MULTI);
> > > > +vdec_av1_slice_setup_quant(&uh->quant, &ctrl_fh-
> > > > >quantization);
> > > > +
> > > > +uh->coded_loss_less = 1;
> > > > +for (i = 0; i < V4L2_AV1_MAX_SEGMENTS; i++) {
> > > > +uh->quant.qindex[i] = vdec_av1_slice_get_qindex(uh, i);
> > > > +uh->loss_less_array[i] =
> > > > +(uh->quant.qindex[i] == 0 && uh-
> > > > > quant.delta_qydc == 0 &&
> > > > 
> > > > +uh->quant.delta_quac == 0 && uh-
> > > > > quant.delta_qudc == 0 &&
> > > > 
> > > > +uh->quant.delta_qvac == 0 && uh-
> > > > > quant.delta_qvdc == 0);
> > > > 
> > > > +
> > > > +if (!uh->loss_less_array[i])
> > > > +uh->coded_loss_less = 0;
> > > > +}
> > > > +
> > > > +vdec_av1_slice_setup_lr(&uh->lr, &ctrl_fh->loop_restoration);
> > > > +uh->superres_denom = ctrl_fh->superres_denom;
> > > > +vdec_av1_slice_setup_lf(&uh->loop_filter, &ctrl_fh-
> > > > > loop_filter);
> > > > 
> > > > +vdec_av1_slice_setup_cdef(&uh->cdef, &ctrl_fh->cdef);
> > > > +vdec_av1_slice_setup_tile(frame, &ctrl_fh->tile_info);
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_tile_group(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +   struct vdec_av1_slice_vsi
> > > > *vsi)
> > > > +{
> > > > +struct v4l2_ctrl_av1_tile_group_entry *ctrl_tge;
> > > > +struct vdec_av1_slice_tile_group *tile_group = &instance-
> > > > > tile_group;
> > > > 
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &vsi-
> > > > >frame.uh;
> > > > +struct vdec_av1_slice_tile *tile = &uh->tile;
> > > > +struct v4l2_ctrl *ctrl;
> > > > +u32 tge_size;
> > > > +int i;
> > > > +
> > > > +ctrl = v4l2_ctrl_find(&instance->ctx->ctrl_hdl,
> > > > V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY);
> > > > +if (!ctrl)
> > > > +return -EINVAL;
> > > > +
> > > > +tge_size = ctrl->elems;
> > > > +ctrl_tge = (struct v4l2_ctrl_av1_tile_group_entry *)ctrl-
> > > > > p_cur.p;
> > > > 
> > > > +
> > > > +tile_group->num_tiles = tile->tile_cols * tile->tile_rows;
> > > > +
> > > > +if (tile_group->num_tiles != tge_size ||
> > > > +    tile_group->num_tiles > V4L2_AV1_MAX_TILE_COUNT) {
> > > > +mtk_vcodec_err(instance, "invalid tge_size %d,
> > > > tile_num:%d\n",
> > > > +       tge_size, tile_group->num_tiles);
> > > > +return -EINVAL;
> > > > +}
> > > > +
> > > > +for (i = 0; i < tge_size; i++) {
> > > > +if (i != ctrl_tge[i].tile_row * vsi-
> > > > > frame.uh.tile.tile_cols +
> > > > 
> > > > +    ctrl_tge[i].tile_col) {
> > > > +mtk_vcodec_err(instance, "invalid tge info %d,
> > > > %d %d %d\n",
> > > > +       i, ctrl_tge[i].tile_row,
> > > > ctrl_tge[i].tile_col,
> > > > +       vsi->frame.uh.tile.tile_rows);
> > > > +return -EINVAL;
> > > > +}
> > > > +tile_group->tile_size[i] = ctrl_tge[i].tile_size;
> > > > +tile_group->tile_start_offset[i] =
> > > > ctrl_tge[i].tile_offset;
> > > > +}
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_state(struct
> > > > vdec_av1_slice_vsi
> > > > *vsi)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +memset(&vsi->state, 0, sizeof(vsi->state));
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_scale_factors(struct
> > > > vdec_av1_slice_frame_refs *frame_ref,
> > > > +       struct
> > > > vdec_av1_slice_frame_info *ref_frame_info,
> > > > +       struct
> > > > vdec_av1_slice_uncompressed_header *uh)
> > > > +{
> > > > +struct vdec_av1_slice_scale_factors *scale_factors =
> > > > &frame_ref->scale_factors;
> > > > +u32 ref_upscaled_width = ref_frame_info->upscaled_width;
> > > > +u32 ref_frame_height = ref_frame_info->frame_height;
> > > > +u32 frame_width = uh->frame_width;
> > > > +u32 frame_height = uh->frame_height;
> > > > +
> > > > +if (!vdec_av1_slice_need_scale(ref_upscaled_width,
> > > > ref_frame_height,
> > > > +       frame_width, frame_height)) {
> > > > +scale_factors->x_scale = -1;
> > > > +scale_factors->y_scale = -1;
> > > > +scale_factors->is_scaled = 0;
> > > > +return;
> > > > +}
> > > > +
> > > > +scale_factors->x_scale =
> > > > +((ref_upscaled_width << AV1_REF_SCALE_SHIFT) +
> > > > (frame_width >> 1)) / frame_width;
> > > > +scale_factors->y_scale =
> > > > +((ref_frame_height << AV1_REF_SCALE_SHIFT) +
> > > > (frame_height >> 1)) / frame_height;
> > > > +scale_factors->is_scaled =
> > > > +(scale_factors->x_scale != AV1_REF_INVALID_SCALE) &&
> > > > +(scale_factors->y_scale != AV1_REF_INVALID_SCALE) &&
> > > > +(scale_factors->x_scale != AV1_REF_NO_SCALE ||
> > > > + scale_factors->y_scale != AV1_REF_NO_SCALE);
> > > > +scale_factors->x_step =
> > > > +AV1_DIV_ROUND_UP_POW2(scale_factors->x_scale,
> > > > +      AV1_REF_SCALE_SHIFT -
> > > > AV1_SCALE_SUBPEL_BITS);
> > > > +scale_factors->y_step =
> > > > +AV1_DIV_ROUND_UP_POW2(scale_factors->y_scale,
> > > > +      AV1_REF_SCALE_SHIFT -
> > > > AV1_SCALE_SUBPEL_BITS);
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_get_relative_dist(int a, int b, u8
> > > > enable_order_hint, u8 order_hint_bits)
> > > > +{
> > > > +int diff = 0;
> > > > +int m = 0;
> > > > +
> > > > +if (!enable_order_hint)
> > > > +return 0;
> > > > +
> > > > +diff = a - b;
> > > > +m = 1 << (order_hint_bits - 1);
> > > > +diff = (diff & (m - 1)) - (diff & m);
> > > > +
> > > > +return diff;
> > > > +}
> > > 
> > > This function is called in one place only, and its result needs
> > > to be
> > > interpreted at call site. Can it return the result in a form
> > > expected
> > > at call site...
> > > 
> > > > +
> > > > +static void vdec_av1_slice_setup_ref(struct vdec_av1_slice_pfc
> > > > *pfc,
> > > > +     struct v4l2_ctrl_av1_frame
> > > > *ctrl_fh)
> > > > +{
> > > > +struct vdec_av1_slice_vsi *vsi = &pfc->vsi;
> > > > +struct vdec_av1_slice_frame *frame = &vsi->frame;
> > > > +struct vdec_av1_slice_slot *slots = &vsi->slots;
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &frame->uh;
> > > > +struct vdec_av1_slice_seq_header *seq = &frame->seq;
> > > > +struct vdec_av1_slice_frame_info *cur_frame_info =
> > > > +&slots->frame_info[vsi->slot_id];
> > > > +struct vdec_av1_slice_frame_info *frame_info;
> > > > +int i, slot_id;
> > > > +
> > > > +if (uh->frame_is_intra)
> > > > +return;
> > > > +
> > > > +for (i = 0; i < V4L2_AV1_REFS_PER_FRAME; i++) {
> > > > +int ref_idx = ctrl_fh->ref_frame_idx[i];
> > > > +
> > > > +pfc->ref_idx[i] = ctrl_fh->reference_frame_ts[ref_idx];
> > > > +slot_id = frame->ref_frame_map[ref_idx];
> > > > +frame_info = &slots->frame_info[slot_id];
> > > > +if (slot_id == AV1_INVALID_IDX) {
> > > > +mtk_v4l2_err("cannot match reference[%d]
> > > > 0x%llx\n", i,
> > > > +     ctrl_fh-
> > > > > reference_frame_ts[ref_idx]);
> > > > 
> > > > +frame->order_hints[i] = 0;
> > > > +frame->ref_frame_valid[i] = 0;
> > > > +continue;
> > > > +}
> > > > +
> > > > +frame->frame_refs[i].ref_fb_idx = slot_id;
> > > > +vdec_av1_slice_setup_scale_factors(&frame-
> > > > > frame_refs[i],
> > > > 
> > > > +   frame_info, uh);
> > > > +if (!seq->enable_order_hint)
> > > > +frame->ref_frame_sign_bias[i + 1] = 0;
> > > > +else
> > > > +frame->ref_frame_sign_bias[i + 1] =
> > > > +vdec_av1_slice_get_relative_dist(frame_
> > > > info->order_hint,
> > > > + uh-
> > > > > order_hint,
> > > > 
> > > > + seq-
> > > > > enable_order_hint,
> > > > 
> > > > + seq-
> > > > > order_hint_bits)
> > > > 
> > > > +<= 0 ? 0 : 1;
> > > 
> > > ... to get rid of this tri-argument operator altogether?
> > > 
> > > > +
> > > > +frame->order_hints[i] = ctrl_fh->order_hints[i + 1];
> > > > +cur_frame_info->order_hints[i] = frame->order_hints[i];
> > > > +frame->ref_frame_valid[i] = 1;
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_get_previous(struct
> > > > vdec_av1_slice_vsi
> > > > *vsi)
> > > > +{
> > > > +struct vdec_av1_slice_frame *frame = &vsi->frame;
> > > > +
> > > > +if (frame->uh.primary_ref_frame == 7)
> > > 
> > > #define magic number 7?
> > > 
> > > > +frame->prev_fb_idx = AV1_INVALID_IDX;
> > > > +else
> > > > +frame->prev_fb_idx = frame->frame_refs[frame-
> > > > > uh.primary_ref_frame].ref_fb_idx;
> > > > 
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_operating_mode(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +struct
> > > > vdec_av1_slice_frame *frame)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +frame->large_scale_tile = 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_pfc(struct
> > > > vdec_av1_slice_instance
> > > > *instance,
> > > > +    struct vdec_av1_slice_pfc *pfc)
> > > > +{
> > > > +struct v4l2_ctrl_av1_frame *ctrl_fh;
> > > > +struct v4l2_ctrl_av1_sequence *ctrl_seq;
> > > > +struct vdec_av1_slice_vsi *vsi = &pfc->vsi;
> > > > +int ret = 0;
> > > > +
> > > > +/* frame header */
> > > > +ctrl_fh = (struct v4l2_ctrl_av1_frame *)
> > > > +  vdec_av1_get_ctrl_ptr(instance->ctx,
> > > > +V4L2_CID_STATELESS_AV1_FRAME);
> > > > +if (IS_ERR(ctrl_fh))
> > > > +return PTR_ERR(ctrl_fh);
> > > > +
> > > > +ctrl_seq = (struct v4l2_ctrl_av1_sequence *)
> > > > +   vdec_av1_get_ctrl_ptr(instance->ctx,
> > > > + V4L2_CID_STATELESS_AV1_SEQUENC
> > > > E);
> > > > +if (IS_ERR(ctrl_seq))
> > > > +return PTR_ERR(ctrl_seq);
> > > 
> > > Just to make sure: I assume request api is used? If so, does
> > > vdec's
> > > framework
> > > ensure that v4l2_ctrl_request_setup() has been called? It
> > > influences
> > > what's
> > > actually in ctrl->p_cur.p (current or previous value), and the
> > > vdec_av1_get_ctrl_ptr() wrapper returns ctrl->p_cur.p.
> > > 
> > > > +
> > > > +/* setup vsi information */
> > > > +vdec_av1_slice_setup_seq(&vsi->frame.seq, ctrl_seq);
> > > > +vdec_av1_slice_setup_uh(instance, &vsi->frame, ctrl_fh);
> > > > +vdec_av1_slice_setup_operating_mode(instance, &vsi->frame);
> > > > +
> > > > +vdec_av1_slice_setup_state(vsi);
> > > > +vdec_av1_slice_setup_slot(instance, vsi, ctrl_fh);
> > > > +vdec_av1_slice_setup_ref(pfc, ctrl_fh);
> > > > +vdec_av1_slice_get_previous(vsi);
> > > > +
> > > > +pfc->seq = instance->seq;
> > > > +instance->seq++;
> > > > +
> > > > +return ret;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_lat_buffer(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +    struct vdec_av1_slice_vsi
> > > > *vsi,
> > > > +    struct mtk_vcodec_mem *bs,
> > > > +    struct vdec_lat_buf
> > > > *lat_buf)
> > > > +{
> > > > +struct vdec_av1_slice_work_buffer *work_buffer;
> > > > +int i;
> > > > +
> > > > +vsi->bs.dma_addr = bs->dma_addr;
> > > > +vsi->bs.size = bs->size;
> > > > +
> > > > +vsi->ube.dma_addr = lat_buf->ctx-
> > > > >msg_queue.wdma_addr.dma_addr;
> > > > +vsi->ube.size = lat_buf->ctx->msg_queue.wdma_addr.size;
> > > > +vsi->trans.dma_addr = lat_buf->ctx->msg_queue.wdma_wptr_addr;
> > > > +/* used to store trans end */
> > > > +vsi->trans.dma_addr_end = lat_buf->ctx-
> > > > > msg_queue.wdma_rptr_addr;
> > > > 
> > > > +vsi->err_map.dma_addr = lat_buf->wdma_err_addr.dma_addr;
> > > > +vsi->err_map.size = lat_buf->wdma_err_addr.size;
> > > > +vsi->rd_mv.dma_addr = lat_buf->rd_mv_addr.dma_addr;
> > > > +vsi->rd_mv.size = lat_buf->rd_mv_addr.size;
> > > > +
> > > > +vsi->row_info.buf = 0;
> > > > +vsi->row_info.size = 0;
> > > > +
> > > > +work_buffer = vsi->work_buffer;
> > > > +
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++) {
> > > > +work_buffer[i].mv_addr.buf = instance->mv[i].dma_addr;
> > > > +work_buffer[i].mv_addr.size = instance->mv[i].size;
> > > > +work_buffer[i].segid_addr.buf = instance-
> > > > > seg[i].dma_addr;
> > > > 
> > > > +work_buffer[i].segid_addr.size = instance->seg[i].size;
> > > > +work_buffer[i].cdf_addr.buf = instance-
> > > > > cdf[i].dma_addr;
> > > > 
> > > > +work_buffer[i].cdf_addr.size = instance->cdf[i].size;
> > > > +}
> > > > +
> > > > +vsi->cdf_tmp.buf = instance->cdf_temp.dma_addr;
> > > > +vsi->cdf_tmp.size = instance->cdf_temp.size;
> > > > +
> > > > +vsi->tile.buf = instance->tile.dma_addr;
> > > > +vsi->tile.size = instance->tile.size;
> > > > +memcpy(lat_buf->tile_addr.va, instance->tile.va, 64 *
> > > > instance-
> > > > > tile_group.num_tiles);
> > > > 
> > > > +
> > > > +vsi->cdf_table.buf = instance->cdf_table.dma_addr;
> > > > +vsi->cdf_table.size = instance->cdf_table.size;
> > > > +vsi->iq_table.buf = instance->iq_table.dma_addr;
> > > > +vsi->iq_table.size = instance->iq_table.size;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_seg_buffer(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +    struct vdec_av1_slice_vsi
> > > > *vsi)
> > > > +{
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &vsi-
> > > > >frame.uh;
> > > > +struct mtk_vcodec_mem *buf;
> > > > +
> > > > +/* reset segment buffer */
> > > > +if (uh->primary_ref_frame == 7 || !uh-
> > > > > seg.segmentation_enabled) {
> > > 
> > > #define magic 7?
> > > 
> > > > +mtk_vcodec_debug(instance, "reset seg %d\n", vsi-
> > > > > slot_id);
> > > > 
> > > > +if (vsi->slot_id != AV1_INVALID_IDX) {
> > > > +buf = &instance->seg[vsi->slot_id];
> > > > +memset(buf->va, 0, buf->size);
> > > > +}
> > > > +}
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_setup_tile_buffer(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +     struct vdec_av1_slice_vsi
> > > > *vsi,
> > > > +     struct mtk_vcodec_mem *bs)
> > > > +{
> > > > +struct vdec_av1_slice_tile_group *tile_group = &instance-
> > > > > tile_group;
> > > > 
> > > > +struct vdec_av1_slice_uncompressed_header *uh = &vsi-
> > > > >frame.uh;
> > > > +struct vdec_av1_slice_tile *tile = &uh->tile;
> > > > +u32 tile_num, tile_row, tile_col;
> > > > +u32 allow_update_cdf = 0;
> > > > +u32 sb_boundary_x_m1 = 0, sb_boundary_y_m1 = 0;
> > > > +int tile_info_base;
> > > > +u32 tile_buf_pa;
> > > > +u32 *tile_info_buf = instance->tile.va;
> > > > +u32 pa = (u32)bs->dma_addr;
> > > > +
> > > > +if (uh->disable_cdf_update == 0)
> > > > +allow_update_cdf = 1;
> > > > +
> > > > +for (tile_num = 0; tile_num < tile_group->num_tiles;
> > > > tile_num++) {
> > > > +/* each uint32 takes place of 4 bytes */
> > > > +tile_info_base = (AV1_TILE_BUF_SIZE * tile_num) >> 2;
> > > > +tile_row = tile_num / tile->tile_cols;
> > > > +tile_col = tile_num % tile->tile_cols;
> > > > +tile_info_buf[tile_info_base + 0] = (tile_group-
> > > > > tile_size[tile_num] << 3);
> > > > 
> > > > +tile_buf_pa = pa + tile_group-
> > > > > tile_start_offset[tile_num];
> > > > 
> > > > +
> > > > +tile_info_buf[tile_info_base + 1] = (tile_buf_pa >> 4)
> > > > << 4;
> > > > +tile_info_buf[tile_info_base + 2] = (tile_buf_pa % 16)
> > > > << 3;
> > > > +
> > > > +sb_boundary_x_m1 =
> > > > +(tile->mi_col_starts[tile_col + 1] - tile-
> > > > > mi_col_starts[tile_col] - 1) &
> > > > 
> > > > +0x3f;
> > > > +sb_boundary_y_m1 =
> > > > +(tile->mi_row_starts[tile_row + 1] - tile-
> > > > > mi_row_starts[tile_row] - 1) &
> > > > 
> > > > +0x1ff;
> > > > +
> > > > +tile_info_buf[tile_info_base + 3] = (sb_boundary_y_m1
> > > > << 7) | sb_boundary_x_m1;
> > > > +tile_info_buf[tile_info_base + 4] = ((allow_update_cdf
> > > > << 18) | (1 << 16));
> > > > +
> > > > +if (tile_num == tile->context_update_tile_id &&
> > > > +    uh->disable_frame_end_update_cdf == 0)
> > > > +tile_info_buf[tile_info_base + 4] |= (1 << 17);
> > > > +
> > > > +mtk_vcodec_debug(instance, "// tile buf %d pos(%dx%d)
> > > > offset 0x%x\n",
> > > > + tile_num, tile_row, tile_col,
> > > > tile_info_base);
> > > > +mtk_vcodec_debug(instance, "// %08x %08x %08x %08x\n",
> > > > + tile_info_buf[tile_info_base + 0],
> > > > + tile_info_buf[tile_info_base + 1],
> > > > + tile_info_buf[tile_info_base + 2],
> > > > + tile_info_buf[tile_info_base + 3]);
> > > > +mtk_vcodec_debug(instance, "// %08x %08x %08x %08x\n",
> > > > + tile_info_buf[tile_info_base + 4],
> > > > + tile_info_buf[tile_info_base + 5],
> > > > + tile_info_buf[tile_info_base + 6],
> > > > + tile_info_buf[tile_info_base + 7]);
> > > > +}
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_lat(struct
> > > > vdec_av1_slice_instance
> > > > *instance,
> > > > +    struct mtk_vcodec_mem *bs,
> > > > +    struct vdec_lat_buf *lat_buf,
> > > > +    struct vdec_av1_slice_pfc *pfc)
> > > > +{
> > > > +struct vdec_av1_slice_vsi *vsi = &pfc->vsi;
> > > > +int ret;
> > > > +
> > > > +ret = vdec_av1_slice_setup_lat_from_src_buf(instance, vsi,
> > > > lat_buf);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +ret = vdec_av1_slice_setup_pfc(instance, pfc);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +ret = vdec_av1_slice_setup_tile_group(instance, vsi);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +ret = vdec_av1_slice_alloc_working_buffer(instance, vsi);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +vdec_av1_slice_setup_seg_buffer(instance, vsi);
> > > > +vdec_av1_slice_setup_tile_buffer(instance, vsi, bs);
> > > > +vdec_av1_slice_setup_lat_buffer(instance, vsi, bs, lat_buf);
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_update_lat(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +     struct vdec_lat_buf *lat_buf,
> > > > +     struct vdec_av1_slice_pfc *pfc)
> > > > +{
> > > > +struct vdec_av1_slice_vsi *vsi;
> > > > +
> > > > +vsi = &pfc->vsi;
> > > > +mtk_vcodec_debug(instance, "frame %u LAT CRC 0x%08x, output
> > > > size is %d\n",
> > > > + pfc->seq, vsi->state.crc[0], vsi-
> > > > > state.out_size);
> > > > 
> > > > +
> > > > +/* buffer full, need to re-decode */
> > > > +if (vsi->state.full) {
> > > > +/* buffer not enough */
> > > > +if (vsi->trans.dma_addr_end - vsi->trans.dma_addr ==
> > > > vsi->ube.size)
> > > > +return -ENOMEM;
> > > > +return -EAGAIN;
> > > > +}
> > > > +
> > > > +instance->width = vsi->frame.uh.upscaled_width;
> > > > +instance->height = vsi->frame.uh.frame_height;
> > > > +instance->frame_type = vsi->frame.uh.frame_type;
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_core_to_dst_buf(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +struct vdec_lat_buf
> > > > *lat_buf)
> > > > +{
> > > > +struct vb2_v4l2_buffer *dst;
> > > > +
> > > > +dst = v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx);
> > > > +if (!dst)
> > > > +return -EINVAL;
> > > > +
> > > > +v4l2_m2m_buf_copy_metadata(&lat_buf->ts_info, dst, true);
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_core_buffer(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +    struct vdec_av1_slice_pfc
> > > > *pfc,
> > > > +    struct vdec_av1_slice_vsi
> > > > *vsi,
> > > > +    struct vdec_fb *fb,
> > > > +    struct vdec_lat_buf
> > > > *lat_buf)
> > > > +{
> > > > +struct vb2_buffer *vb;
> > > > +struct vb2_queue *vq;
> > > > +int w, h, plane, size;
> > > > +int i;
> > > > +
> > > > +plane = instance->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > > +w = vsi->frame.uh.upscaled_width;
> > > > +h = vsi->frame.uh.frame_height;
> > > > +size = ALIGN(w, VCODEC_DEC_ALIGNED_64) * ALIGN(h,
> > > > VCODEC_DEC_ALIGNED_64);
> > > > +
> > > > +/* frame buffer */
> > > > +vsi->fb.y.dma_addr = fb->base_y.dma_addr;
> > > > +if (plane == 1)
> > > > +vsi->fb.c.dma_addr = fb->base_y.dma_addr + size;
> > > > +else
> > > > +vsi->fb.c.dma_addr = fb->base_c.dma_addr;
> > > > +
> > > > +/* reference buffers */
> > > > +vq = v4l2_m2m_get_vq(instance->ctx->m2m_ctx,
> > > > V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > > +if (!vq)
> > > > +return -EINVAL;
> > > > +
> > > > +/* get current output buffer */
> > > > +vb = &v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx)->vb2_buf;
> > > > +if (!vb)
> > > > +return -EINVAL;
> > > > +
> > > > +/* get buffer address from vb2buf */
> > > > +for (i = 0; i < V4L2_AV1_REFS_PER_FRAME; i++) {
> > > > +struct vdec_av1_slice_fb *vref = &vsi->ref[i];
> > > > +
> > > > +vb = vb2_find_buffer(vq, pfc->ref_idx[i]);
> > > > +if (!vb) {
> > > > +memset(vref, 0, sizeof(*vref));
> > > > +continue;
> > > > +}
> > > > +
> > > > +vref->y.dma_addr = vb2_dma_contig_plane_dma_addr(vb,
> > > > 0);
> > > > +if (plane == 1)
> > > > +vref->c.dma_addr = vref->y.dma_addr + size;
> > > > +else
> > > > +vref->c.dma_addr =
> > > > vb2_dma_contig_plane_dma_addr(vb, 1);
> > > > +}
> > > > +vsi->tile.dma_addr = lat_buf->tile_addr.dma_addr;
> > > > +vsi->tile.size = lat_buf->tile_addr.size;
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_setup_core(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +     struct vdec_fb *fb,
> > > > +     struct vdec_lat_buf *lat_buf,
> > > > +     struct vdec_av1_slice_pfc *pfc)
> > > > +{
> > > > +struct vdec_av1_slice_vsi *vsi = &pfc->vsi;
> > > > +int ret;
> > > > +
> > > > +ret = vdec_av1_slice_setup_core_to_dst_buf(instance, lat_buf);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +ret = vdec_av1_slice_setup_core_buffer(instance, pfc, vsi, fb,
> > > > lat_buf);
> > > > +if (ret)
> > > > +return ret;
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_update_core(struct
> > > > vdec_av1_slice_instance *instance,
> > > > +      struct vdec_lat_buf *lat_buf,
> > > > +      struct vdec_av1_slice_pfc *pfc)
> > > > +{
> > > > +struct vdec_av1_slice_vsi *vsi = instance->core_vsi;
> > > > +
> > > > +/* TODO: Do something here, or remove this function entirely
> > > > */
> > > 
> > > And?
> > > 
> > > > +
> > > > +mtk_vcodec_debug(instance, "frame %u Y_CRC %08x %08x %08x
> > > > %08x\n",
> > > > + pfc->seq, vsi->state.crc[0], vsi-
> > > > > state.crc[1],
> > > > 
> > > > + vsi->state.crc[2], vsi->state.crc[3]);
> > > > +mtk_vcodec_debug(instance, "frame %u C_CRC %08x %08x %08x
> > > > %08x\n",
> > > > + pfc->seq, vsi->state.crc[8], vsi-
> > > > > state.crc[9],
> > > > 
> > > > + vsi->state.crc[10], vsi->state.crc[11]);
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_init(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance;
> > > > +struct vdec_av1_slice_init_vsi *vsi;
> > > > +int ret;
> > > > +
> > > > +instance = kzalloc(sizeof(*instance), GFP_KERNEL);
> > > > +if (!instance)
> > > > +return -ENOMEM;
> > > > +
> > > > +instance->ctx = ctx;
> > > > +instance->vpu.id = SCP_IPI_VDEC_LAT;
> > > > +instance->vpu.core_id = SCP_IPI_VDEC_CORE;
> > > > +instance->vpu.ctx = ctx;
> > > > +instance->vpu.codec_type = ctx->current_codec;
> > > > +
> > > > +ret = vpu_dec_init(&instance->vpu);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "failed to init vpu dec, ret
> > > > %d\n", ret);
> > > > +goto error_vpu_init;
> > > > +}
> > > > +
> > > > +/* init vsi and global flags */
> > > > +vsi = instance->vpu.vsi;
> > > > +if (!vsi) {
> > > > +mtk_vcodec_err(instance, "failed to get AV1 vsi\n");
> > > > +ret = -EINVAL;
> > > > +goto error_vsi;
> > > > +}
> > > > +instance->init_vsi = vsi;
> > > > +instance->core_vsi = mtk_vcodec_fw_map_dm_addr(ctx->dev-
> > > > > fw_handler, (u32)vsi->core_vsi);
> > > > 
> > > > +
> > > > +if (!instance->core_vsi) {
> > > > +mtk_vcodec_err(instance, "failed to get AV1 core
> > > > vsi\n");
> > > > +ret = -EINVAL;
> > > > +goto error_vsi;
> > > > +}
> > > > +
> > > > +if (vsi->vsi_size != sizeof(struct vdec_av1_slice_vsi))
> > > > +mtk_vcodec_err(instance, "remote vsi size 0x%x
> > > > mismatch! expected: 0x%lx\n",
> > > > +       vsi->vsi_size, sizeof(struct
> > > > vdec_av1_slice_vsi));
> > > > +
> > > > +instance->irq = 1;
> > > 
> > > Does this mean "irq_enabled"? If so, rename?
> > > 
> > > > +instance->inneracing_mode = IS_VDEC_INNER_RACING(instance-
> > > > >ctx-
> > > > > dev->dec_capability);
> > > > 
> > > > +
> > > > +mtk_vcodec_debug(instance, "vsi 0x%p core_vsi 0x%llx 0x%p,
> > > > inneracing_mode %d\n",
> > > > + vsi, vsi->core_vsi, instance->core_vsi,
> > > > instance->inneracing_mode);
> > > > +
> > > > +ret = vdec_av1_slice_init_cdf_table(instance);
> > > > +if (ret)
> > > > +goto error_vsi;
> > > > +
> > > > +ret = vdec_av1_slice_init_iq_table(instance);
> > > > +if (ret)
> > > > +goto error_vsi;
> > > > +
> > > > +ctx->drv_handle = instance;
> > > > +
> > > > +return 0;
> > > > +error_vsi:
> > > > +vpu_dec_deinit(&instance->vpu);
> > > > +error_vpu_init:
> > > > +kfree(instance);
> > > 
> > > newline?
> > > 
> > > > +return ret;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_deinit(void *h_vdec)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance = h_vdec;
> > > > +
> > > > +if (!instance)
> > > > +return;
> > > > +mtk_vcodec_debug(instance, "h_vdec 0x%p\n", h_vdec);
> > > > +vpu_dec_deinit(&instance->vpu);
> > > > +vdec_av1_slice_free_working_buffer(instance);
> > > > +vdec_msg_queue_deinit(&instance->ctx->msg_queue, instance-
> > > > > ctx);
> > > > 
> > > > +kfree(instance);
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_flush(void *h_vdec, struct
> > > > mtk_vcodec_mem *bs,
> > > > +struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance = h_vdec;
> > > > +int i;
> > > > +
> > > > +mtk_vcodec_debug(instance, "flush ...\n");
> > > > +
> > > > +for (i = 0; i < AV1_MAX_FRAME_BUF_COUNT; i++)
> > > > +vdec_av1_slice_clear_fb(&instance-
> > > > > slots.frame_info[i]);
> > > > 
> > > > +
> > > > +vdec_msg_queue_wait_lat_buf_full(&instance->ctx->msg_queue);
> > > 
> > > newline?
> > > 
> > > > +return vpu_dec_reset(&instance->vpu);
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_get_pic_info(struct
> > > > vdec_av1_slice_instance *instance)
> > > > +{
> > > > +struct mtk_vcodec_ctx *ctx = instance->ctx;
> > > > +u32 data[3];
> > > > +
> > > > +mtk_vcodec_debug(instance, "w %u h %u\n", ctx->picinfo.pic_w,
> > > > ctx->picinfo.pic_h);
> > > > +
> > > > +data[0] = ctx->picinfo.pic_w;
> > > > +data[1] = ctx->picinfo.pic_h;
> > > > +data[2] = ctx->capture_fourcc;
> > > > +vpu_dec_get_param(&instance->vpu, data, 3,
> > > > GET_PARAM_PIC_INFO);
> > > > +
> > > > +ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w,
> > > > VCODEC_DEC_ALIGNED_64);
> > > > +ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h,
> > > > VCODEC_DEC_ALIGNED_64);
> > > > +ctx->picinfo.fb_sz[0] = instance->vpu.fb_sz[0];
> > > > +ctx->picinfo.fb_sz[1] = instance->vpu.fb_sz[1];
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_get_dpb_size(struct
> > > > vdec_av1_slice_instance *instance, u32 *dpb_sz)
> > > 
> > > static inline void?
> > > 
> > > > +{
> > > > +/* refer av1 specification */
> > > > +*dpb_sz = V4L2_AV1_TOTAL_REFS_PER_FRAME + 1;
> > > > +}
> > > > +
> > > > +static void vdec_av1_slice_get_crop_info(struct
> > > > vdec_av1_slice_instance *instance,
> > > > + struct v4l2_rect *cr)
> > > > +{
> > > > +struct mtk_vcodec_ctx *ctx = instance->ctx;
> > > > +
> > > > +cr->left = 0;
> > > > +cr->top = 0;
> > > > +cr->width = ctx->picinfo.pic_w;
> > > > +cr->height = ctx->picinfo.pic_h;
> > > > +
> > > > +mtk_vcodec_debug(instance, "l=%d, t=%d, w=%d, h=%d\n",
> > > > + cr->left, cr->top, cr->width, cr->height);
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_get_param(void *h_vdec, enum
> > > > vdec_get_param_type type, void *out)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance = h_vdec;
> > > > +
> > > > +switch (type) {
> > > > +case GET_PARAM_PIC_INFO:
> > > > +vdec_av1_slice_get_pic_info(instance);
> > > > +break;
> > > > +case GET_PARAM_DPB_SIZE:
> > > > +vdec_av1_slice_get_dpb_size(instance, out);
> > > > +break;
> > > > +case GET_PARAM_CROP_INFO:
> > > > +vdec_av1_slice_get_crop_info(instance, out);
> > > > +break;
> > > > +default:
> > > > +mtk_vcodec_err(instance, "invalid get parameter
> > > > type=%d\n", type);
> > > > +return -EINVAL;
> > > > +}
> > > > +
> > > > +return 0;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_lat_decode(void *h_vdec, struct
> > > > mtk_vcodec_mem *bs,
> > > > +     struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance = h_vdec;
> > > > +struct vdec_lat_buf *lat_buf;
> > > > +struct vdec_av1_slice_pfc *pfc;
> > > > +struct vdec_av1_slice_vsi *vsi;
> > > > +struct mtk_vcodec_ctx *ctx;
> > > > +int ret;
> > > > +
> > > > +if (!instance || !instance->ctx)
> > > > +return -EINVAL;
> > > > +
> > > > +ctx = instance->ctx;
> > > > +/* init msgQ for the first time */
> > > > +if (vdec_msg_queue_init(&ctx->msg_queue, ctx,
> > > > +vdec_av1_slice_core_decode,
> > > > sizeof(*pfc))) {
> > > > +mtk_vcodec_err(instance, "failed to init AV1 msg
> > > > queue\n");
> > > > +return -ENOMEM;
> > > > +}
> > > > +
> > > > +/* bs NULL means flush decoder */
> > > > +if (!bs)
> > > > +return vdec_av1_slice_flush(h_vdec, bs, fb, res_chg);
> > > > +
> > > > +lat_buf = vdec_msg_queue_dqbuf(&ctx->msg_queue.lat_ctx);
> > > > +if (!lat_buf) {
> > > > +mtk_vcodec_err(instance, "failed to get AV1 lat
> > > > buf\n");
> > > 
> > > there exists vdec_msg_queue_deinit(). Should it be called in this
> > > (and 
> > > subsequent) error recovery path(s)?
> > > 
> > > > +return -EBUSY;
> > > > +}
> > > > +pfc = (struct vdec_av1_slice_pfc *)lat_buf->private_data;
> > > > +if (!pfc) {
> > > > +ret = -EINVAL;
> > > > +goto err_free_fb_out;
> > > > +}
> > > > +vsi = &pfc->vsi;
> > > > +
> > > > +ret = vdec_av1_slice_setup_lat(instance, bs, lat_buf, pfc);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "failed to setup AV1 lat ret
> > > > %d\n", ret);
> > > > +goto err_free_fb_out;
> > > > +}
> > > > +
> > > > +vdec_av1_slice_vsi_to_remote(vsi, instance->vsi);
> > > > +ret = vpu_dec_start(&instance->vpu, NULL, 0);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "failed to dec AV1 ret %d\n",
> > > > ret);
> > > > +goto err_free_fb_out;
> > > > +}
> > > > +if (instance->inneracing_mode)
> > > > +vdec_msg_queue_qbuf(&ctx->dev->msg_queue_core_ctx,
> > > > lat_buf);
> > > > +
> > > > +if (instance->irq) {
> > > > +ret = mtk_vcodec_wait_for_done_ctx(ctx,
> > > > MTK_INST_IRQ_RECEIVED,
> > > > +   WAIT_INTR_TIMEOUT_MS
> > > > ,
> > > > +   MTK_VDEC_LAT0);
> > > > +/* update remote vsi if decode timeout */
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "AV1 Frame %d decode
> > > > timeout %d\n", pfc->seq, ret);
> > > > +WRITE_ONCE(instance->vsi->state.timeout, 1);
> > > > +}
> > > > +vpu_dec_end(&instance->vpu);
> > > > +}
> > > > +
> > > > +vdec_av1_slice_vsi_from_remote(vsi, instance->vsi);
> > > > +ret = vdec_av1_slice_update_lat(instance, lat_buf, pfc);
> > > > +
> > > > +/* LAT trans full, re-decode */
> > > > +if (ret == -EAGAIN) {
> > > > +mtk_vcodec_err(instance, "AV1 Frame %d trans full\n",
> > > > pfc->seq);
> > > > +if (!instance->inneracing_mode)
> > > > +vdec_msg_queue_qbuf(&ctx->msg_queue.lat_ctx,
> > > > lat_buf);
> > > > +return 0;
> > > > +}
> > > > +
> > > > +/* LAT trans full, no more UBE or decode timeout */
> > > > +if (ret == -ENOMEM || vsi->state.timeout) {
> > > > +mtk_vcodec_err(instance, "AV1 Frame %d insufficient
> > > > buffer or timeout\n", pfc->seq);
> > > > +if (!instance->inneracing_mode)
> > > > +vdec_msg_queue_qbuf(&ctx->msg_queue.lat_ctx,
> > > > lat_buf);
> > > > +return -EBUSY;
> > > > +}
> > > > +vsi->trans.dma_addr_end += ctx->msg_queue.wdma_addr.dma_addr;
> > > > +mtk_vcodec_debug(instance, "lat dma 1 0x%llx 0x%llx\n",
> > > > + pfc->vsi.trans.dma_addr, pfc-
> > > > > vsi.trans.dma_addr_end);
> > > > 
> > > > +
> > > > +vdec_msg_queue_update_ube_wptr(&ctx->msg_queue, vsi-
> > > > > trans.dma_addr_end);
> > > > 
> > > > +
> > > > +if (!instance->inneracing_mode)
> > > > +vdec_msg_queue_qbuf(&ctx->dev->msg_queue_core_ctx,
> > > > lat_buf);
> > > > +memcpy(&instance->slots, &vsi->slots, sizeof(instance-
> > > > >slots));
> > > > +
> > > > +return 0;
> > > > +
> > > > +err_free_fb_out:
> > > > +vdec_msg_queue_qbuf(&ctx->msg_queue.lat_ctx, lat_buf);
> > > > +mtk_vcodec_err(instance, "slice dec number: %d err: %d", pfc-
> > > > > seq, ret);
> > > > 
> > > > +return ret;
> > > > +}
> > > > +
> > > > +static int vdec_av1_slice_core_decode(struct vdec_lat_buf
> > > > *lat_buf)
> > > > +{
> > > > +struct vdec_av1_slice_instance *instance;
> > > > +struct vdec_av1_slice_pfc *pfc;
> > > > +struct mtk_vcodec_ctx *ctx = NULL;
> > > > +struct vdec_fb *fb = NULL;
> > > > +int ret = -EINVAL;
> > > > +
> > > > +if (!lat_buf)
> > > > +return -EINVAL;
> > > > +
> > > > +pfc = lat_buf->private_data;
> > > > +ctx = lat_buf->ctx;
> > > > +if (!pfc || !ctx)
> > > > +return -EINVAL;
> > > > +
> > > > +instance = ctx->drv_handle;
> > > > +if (!instance)
> > > > +goto err;
> > > > +
> > > > +fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
> > > > +if (!fb) {
> > > > +ret = -EBUSY;
> > > > +goto err;
> > > > +}
> > > > +
> > > > +ret = vdec_av1_slice_setup_core(instance, fb, lat_buf, pfc);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance,
> > > > "vdec_av1_slice_setup_core\n");
> > > > +goto err;
> > > > +}
> > > > +vdec_av1_slice_vsi_to_remote(&pfc->vsi, instance->core_vsi);
> > > > +ret = vpu_dec_core(&instance->vpu);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "vpu_dec_core\n");
> > > > +goto err;
> > > > +}
> > > > +
> > > > +if (instance->irq) {
> > > > +ret = mtk_vcodec_wait_for_done_ctx(ctx,
> > > > MTK_INST_IRQ_RECEIVED,
> > > > +   WAIT_INTR_TIMEOUT_MS
> > > > ,
> > > > +   MTK_VDEC_CORE);
> > > > +/* update remote vsi if decode timeout */
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance, "AV1 frame %d core
> > > > timeout\n", pfc->seq);
> > > > +WRITE_ONCE(instance->vsi->state.timeout, 1);
> > > > +}
> > > > +vpu_dec_core_end(&instance->vpu);
> > > > +}
> > > > +
> > > > +ret = vdec_av1_slice_update_core(instance, lat_buf, pfc);
> > > > +if (ret) {
> > > > +mtk_vcodec_err(instance,
> > > > "vdec_av1_slice_update_core\n");
> > > > +goto err;
> > > > +}
> > > > +
> > > > +mtk_vcodec_debug(instance, "core dma_addr_end 0x%llx\n",
> > > > + instance->core_vsi->trans.dma_addr_end);
> > > > +vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, instance-
> > > > > core_vsi->trans.dma_addr_end);
> > > > 
> > > > +
> > > > +ctx->dev->vdec_pdata->cap_to_disp(ctx, 0, lat_buf-
> > > > > src_buf_req);
> > > > 
> > > > +
> > > > +return 0;
> > > > +
> > > > +err:
> > > > +/* always update read pointer */
> > > > +vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, pfc-
> > > > > vsi.trans.dma_addr_end);
> > > > 
> > > > +
> > > > +if (fb)
> > > > +ctx->dev->vdec_pdata->cap_to_disp(ctx, 1, lat_buf-
> > > > > src_buf_req);
> > > > 
> > > > +
> > > > +return ret;
> > > > +}
> > > > +
> > > > +const struct vdec_common_if vdec_av1_slice_lat_if = {
> > > > +.init= vdec_av1_slice_init,
> > > > +.decode= vdec_av1_slice_lat_decode,
> > > > +.get_param= vdec_av1_slice_get_param,
> > > > +.deinit= vdec_av1_slice_deinit,
> > > > +};
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec_drv_if.c
> > > > b/drivers/media/platform/mediatek/vcodec/vdec_drv_if.c
> > > > index f3807f03d8806..4dda59a6c8141 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/vdec_drv_if.c
> > > > +++ b/drivers/media/platform/mediatek/vcodec/vdec_drv_if.c
> > > > @@ -49,6 +49,10 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx,
> > > > unsigned int fourcc)
> > > >   ctx->dec_if = &vdec_vp9_slice_lat_if;
> > > >   ctx->hw_id = IS_VDEC_LAT_ARCH(hw_arch) ? MTK_VDEC_LAT0
> > > > : MTK_VDEC_CORE;
> > > >   break;
> > > > +case V4L2_PIX_FMT_AV1_FRAME:
> > > > +ctx->dec_if = &vdec_av1_slice_lat_if;
> > > > +ctx->hw_id = MTK_VDEC_LAT0;
> > > > +break;
> > > >   default:
> > > >   return -EINVAL;
> > > >   }
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec_drv_if.h
> > > > b/drivers/media/platform/mediatek/vcodec/vdec_drv_if.h
> > > > index 076306ff2dd49..dc6c8ecd9843a 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/vdec_drv_if.h
> > > > +++ b/drivers/media/platform/mediatek/vcodec/vdec_drv_if.h
> > > > @@ -61,6 +61,7 @@ extern const struct vdec_common_if
> > > > vdec_vp8_if;
> > > >   extern const struct vdec_common_if vdec_vp8_slice_if;
> > > >   extern const struct vdec_common_if vdec_vp9_if;
> > > >   extern const struct vdec_common_if vdec_vp9_slice_lat_if;
> > > > +extern const struct vdec_common_if vdec_av1_slice_lat_if;
> > > >   
> > > >   /**
> > > >    * vdec_if_init() - initialize decode driver
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
> > > > b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
> > > > index ae500980ad45c..05b54b0e3f2d2 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
> > > > +++ b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
> > > > @@ -20,6 +20,9 @@
> > > >   /* the size used to store avc error information */
> > > >   #define VDEC_ERR_MAP_SZ_AVC         (17 * SZ_1K)
> > > >   
> > > > +#define VDEC_RD_MV_BUFFER_SZ        (((SZ_4K * 2304 >> 4) +
> > > > SZ_1K)
> > > > << 1)
> > > > +#define VDEC_LAT_TILE_SZ            (64 * SZ_4K)
> > > > +
> > > >   /* core will read the trans buffer which decoded by lat to
> > > > decode
> > > > again.
> > > >    * The trans buffer size of FHD and 4K bitstreams are
> > > > different.
> > > >    */
> > > > @@ -194,6 +197,14 @@ void vdec_msg_queue_deinit(struct
> > > > vdec_msg_queue *msg_queue,
> > > >   if (mem->va)
> > > >   mtk_vcodec_mem_free(ctx, mem);
> > > >   
> > > > +mem = &lat_buf->rd_mv_addr;
> > > > +if (mem->va)
> > > > +mtk_vcodec_mem_free(ctx, mem);
> > > > +
> > > > +mem = &lat_buf->tile_addr;
> > > > +if (mem->va)
> > > > +mtk_vcodec_mem_free(ctx, mem);
> > > > +
> > > >   kfree(lat_buf->private_data);
> > > >   }
> > > >   }
> > > > @@ -270,6 +281,22 @@ int vdec_msg_queue_init(struct
> > > > vdec_msg_queue
> > > > *msg_queue,
> > > >   goto mem_alloc_err;
> > > >   }
> > > >   
> > > > +if (ctx->current_codec == V4L2_PIX_FMT_AV1_FRAME) {
> > > > +lat_buf->rd_mv_addr.size =
> > > > VDEC_RD_MV_BUFFER_SZ;
> > > > +err = mtk_vcodec_mem_alloc(ctx, &lat_buf-
> > > > > rd_mv_addr);
> > > > 
> > > > +if (err) {
> > > > +mtk_v4l2_err("failed to allocate
> > > > rd_mv_addr buf[%d]", i);
> > > > +return -ENOMEM;
> > > > +}
> > > > +
> > > > +lat_buf->tile_addr.size = VDEC_LAT_TILE_SZ;
> > > > +err = mtk_vcodec_mem_alloc(ctx, &lat_buf-
> > > > > tile_addr);
> > > > 
> > > > +if (err) {
> > > > +mtk_v4l2_err("failed to allocate
> > > > tile_addr buf[%d]", i);
> > > > +return -ENOMEM;
> > > > +}
> > > > +}
> > > > +
> > > >   lat_buf->private_data = kzalloc(private_size,
> > > > GFP_KERNEL);
> > > >   if (!lat_buf->private_data) {
> > > >   err = -ENOMEM;
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
> > > > b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
> > > > index c43d427f5f544..525170e411ee0 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
> > > > +++ b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
> > > > @@ -42,6 +42,8 @@ struct vdec_msg_queue_ctx {
> > > >    * struct vdec_lat_buf - lat buffer message used to store lat
> > > > info for core decode
> > > >    * @wdma_err_addr: wdma error address used for lat hardware
> > > >    * @slice_bc_addr: slice bc address used for lat hardware
> > > > + * @rd_mv_addr:mv addr for av1 lat hardware output, core
> > > > hardware input
> > > > + * @tile_addr:tile buffer for av1 core input
> > > >    * @ts_info: need to set timestamp from output to capture
> > > >    * @src_buf_req: output buffer media request object
> > > >    *
> > > > @@ -54,6 +56,8 @@ struct vdec_msg_queue_ctx {
> > > >   struct vdec_lat_buf {
> > > >   struct mtk_vcodec_mem wdma_err_addr;
> > > >   struct mtk_vcodec_mem slice_bc_addr;
> > > > +struct mtk_vcodec_mem rd_mv_addr;
> > > > +struct mtk_vcodec_mem tile_addr;
> > > >   struct vb2_v4l2_buffer ts_info;
> > > >   struct media_request *src_buf_req;
> > > >   


More information about the Linux-mediatek mailing list