[PATCH] ring-buffer: Do not read at &event->array[0] if it across the page

Tue Sep 5 07:12:45 PDT 2023

While reading from the tracing/trace, the ftrace reader rarely encounters
a KASAN invalid access issue.
It is likely that the writer has disrupted the ring_buffer that the reader
is currently parsing. the kasan report is as below:

[name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
[name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
[name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
[name:report&]
CPU: 2 PID: 380 Comm: xxxx
Call trace:
dump_backtrace+0x168/0x1b0
show_stack+0x2c/0x3c
dump_stack_lvl+0xa4/0xd4
print_report+0x268/0x9b0
kasan_report+0xdc/0x148
kasan_tag_mismatch+0x28/0x3c
__hwasan_tag_mismatch+0x2c/0x58
rb_event_length() [inline]
rb_iter_head_event+0x27c/0x3d0
ring_buffer_iter_peek+0x23c/0x6e0
__find_next_entry+0x1ac/0x3d8
s_next+0x1f0/0x310
seq_read_iter+0x4e8/0x77c
seq_read+0xf8/0x150
vfs_read+0x1a8/0x4cc

In some edge cases, ftrace reader could access to an invalid address,
specifically when reading 4 bytes beyond the end of the currently page.
While issue happened, the dump of rb_iter_head_event is shown as below:

    in function rb_iter_head_event:
          - iter->head = 0xFEC
          - iter->next_event = 0xFEC
          - commit = 0xFF0
          - read_stamp = 0x2955AC46DB0
          - page_stamp = 0x2955AC2439A
          - iter->head_page->page = 0x71FFFF8111A17000
          - iter->head_page->time_stamp = 0x2956A142267
          - iter->head_page->page->commit = 0xFF0
          - the content in iter->head_page->page
                0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0

In rb_iter_head_event, reader will call rb_event_length with argument
(struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
Since the content data start at address 0x71FFFF8111A17FFC are 0xFFFFFFC0.
event->type will be interpret as 0x0, than the reader will try to get the
length by accessing event->array[0], which is an invalid address:
    &event->array[0] = 0x71FFFF8111A18000

Signed-off-by: Tze-nan Wu <Tze-nan.Wu at mediatek.com>
---
Following patch may not become a solution, it merely checks if the address
to be accessed is valid or not within the rb_event_length before access.
And not sure if there is any side-effect it can lead to.

I am curious about what a better solution for this issue would look like.
Should we address the problem from the writer or the reader?

Also I wonder if the cause of the issue is exactly as I suspected.
Any Suggestion will be appreciated.

Test below can reproduce the issue in 2 hours on kernel-6.1.24:
    $cd /sys/kernel/tracing/
    # make the reader and writer race more through resize the buffer to 8kb
    $echo 8 > buffer_size_kn
    # enable all events
    $echo 1 > event/enable
    # enable trace
    $echo 1 > tracing_on
 
    # write and run a script that keep reading trace
    $./read_trace.sh

    ``` read_trace.sh
       while :
       do
           cat /sys/kernel/tracing/trace > /dev/null
       done

    ```
---
 kernel/trace/ring_buffer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 78502d4c7214..ed5ddc3a134b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -200,6 +200,8 @@ rb_event_length(struct ring_buffer_event *event)
 		if (rb_null_event(event))
 			/* undefined */
 			return -1;
+		if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
+			return -1;
 		return  event->array[0] + RB_EVNT_HDR_SIZE;
 
 	case RINGBUF_TYPE_TIME_EXTEND:
@@ -209,6 +211,8 @@ rb_event_length(struct ring_buffer_event *event)
 		return RB_LEN_TIME_STAMP;
 
 	case RINGBUF_TYPE_DATA:
+		if ((((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4) && !event->type_len)
+			return -1;
 		return rb_event_data_length(event);
 	default:
 		WARN_ON_ONCE(1);
-- 
2.18.0