[PATCH] printk: ringbuffer: Improve prb_next_seq() performance
John Ogness
john.ogness at linutronix.de
Mon Oct 25 15:06:43 PDT 2021
Hi Petr,
I am OK with this new best effort cache. But I do have some minor
comments about the implementation...
On 2021-10-25, Petr Mladek <pmladek at suse.com> wrote:
> prb_next_seq() always iterates from the first known sequence number.
> In the worst case, it might loop 8k times for 256kB buffer,
> 15k times for 512kB buffer, and 64k times for 2MB buffer.
>
> It was reported that pooling and reading using syslog interface
> might occupy 50% of CPU.
>
> Speedup the search by storing @id of the last finalized descriptor.
>
> The loop is still needed because the @id is stored and read in the best
> effort way. An atomic variable is used to keep the @id consistent.
> But the stores and reads are not serialized against each other.
> The descriptor could get reused in the meantime. The related sequence
> number will be used only when it is still valid.
>
> An invalid value should be read _only_ when there is a flood of messages
> and the ringbuffer is rapidly reused. The performance is the least
> problem in this case.
>
> Reported-by: Chunlei Wang <chunlei.wang at mediatek.com>
> Signed-off-by: Petr Mladek <pmladek at suse.com>
> ---
> kernel/printk/printk_ringbuffer.c | 46 ++++++++++++++++++++++++++++---
> kernel/printk/printk_ringbuffer.h | 2 ++
> 2 files changed, 44 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 8a7b7362c0dd..7e0c11be07c8 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -474,8 +474,10 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
> * state has been re-checked. A memcpy() for all of @desc
> * cannot be used because of the atomic_t @state_var field.
> */
> - memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> - sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> + if (desc_out) {
> + memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> + sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> + }
> if (seq_out)
> *seq_out = info->seq; /* also part of desc_read:C */
> if (caller_id_out)
> @@ -1449,6 +1451,9 @@ static void desc_make_final(struct prb_desc_ring *desc_ring, unsigned long id)
>
> atomic_long_cmpxchg_relaxed(&d->state_var, prev_state_val,
> DESC_SV(id, desc_finalized)); /* LMM(desc_make_final:A) */
> +
> + /* Best effort to remember the last finalized @id. */
> + atomic_long_set(&desc_ring->last_finalized_id, id);
> }
>
> /**
> @@ -1657,7 +1662,12 @@ void prb_commit(struct prb_reserved_entry *e)
> */
> void prb_final_commit(struct prb_reserved_entry *e)
> {
> + struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
> +
> _prb_commit(e, desc_finalized);
> +
> + /* Best effort to remember the last finalized @id. */
> + atomic_long_set(&desc_ring->last_finalized_id, e->id);
> }
>
> /*
> @@ -1988,6 +1998,30 @@ u64 prb_first_valid_seq(struct printk_ringbuffer *rb)
> return seq;
> }
>
> +/* Try to read seq from the last known location. */
> +static u64 prb_last_finalized_seq(struct printk_ringbuffer *rb)
> +{
> + struct prb_desc_ring *desc_ring = &rb->desc_ring;
> + enum desc_state d_state;
> + unsigned long id;
> + u64 seq = 0;
It is not necessary to initialize @seq.
> +
> + /* Check if the cached @id still points to a valid @seq. */
> + id = atomic_long_read(&desc_ring->last_finalized_id);
> + d_state = desc_read(desc_ring, id, NULL, &seq, NULL);
> +
> + if (d_state == desc_finalized || d_state == desc_reusable)
> + return seq;
> +
> + /*
> + * The information about the last finalized sequence number
> + * has gone. It should happen only when there is a flood of
> + * new messages and the ringbuffer is rapidly recycled.
> + * Give up and start from the beginning.
> + */
> + return 0;
Returning 0 is a bit odd here. For the problem this patch is solving it
would work, but if this new helper is ever used anywhere else, it would
be a bizarre exception since 0 can also mean success. See my suggestion
below.
> +}
> +
> /**
> * prb_next_seq() - Get the sequence number after the last available record.
> *
> @@ -2005,9 +2039,12 @@ u64 prb_first_valid_seq(struct printk_ringbuffer *rb)
> */
> u64 prb_next_seq(struct printk_ringbuffer *rb)
> {
> - u64 seq = 0;
> + u64 seq = prb_last_finalized_seq(rb);
If prb_last_finalized_seq() was successful (which can also be 0 on
success) then there is no point reading that record. We already know
that it must be higher.
I suggest implementing everything within prb_next_seq() instead of
adding the helper function prb_last_finalized_seq(). IMHO this cleanly
handles the case of a failed read for the last finalized id and avoids
an unnecessary descriptor read.
u64 prb_next_seq(struct printk_ringbuffer *rb)
{
struct prb_desc_ring *desc_ring = &rb->desc_ring;
enum desc_state d_state;
unsigned long id;
u64 seq;
/* Check if the cached @id still points to a valid @seq. */
id = atomic_long_read(&desc_ring->last_finalized_id);
d_state = desc_read(desc_ring, id, NULL, &seq, NULL);
if (d_state == desc_finalized || d_state == desc_reusable) {
/* Begin searching after the last finalized record. */
seq++;
} else {
/*
* The information about the last finalized sequence number
* has gone. It should happen only when there is a flood of
* new messages and the ringbuffer is rapidly recycled.
* Give up and start from the beginning.
*/
seq = 0;
}
/*
* The information about the last finalized @seq might be inaccurate.
* Search forward to find the current one.
*/
while (_prb_read_valid(rb, &seq, NULL, NULL))
seq++;
return seq;
}
>
> - /* Search forward from the oldest descriptor. */
> + /*
> + * The information about the last finalized @seq might be inaccurate.
> + * Search forward to find the current one.
> + */
> while (_prb_read_valid(rb, &seq, NULL, NULL))
> seq++;
>
John Ogness
More information about the linux-arm-kernel
mailing list