[PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit
Waqar Hameed
waqar.hameed at axis.com
Thu Nov 7 14:38:26 PST 2024
On Thu, Nov 07, 2024 at 16:39 +0800 Zhihao Cheng <chengzhihao1 at huawei.com> wrote:
> 在 2024/10/9 22:46, Waqar Hameed 写道:
>> Running
>> rm -f /etc/test-file.bin
>> dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync
>> in a loop, with `CONFIG_UBIFS_FS_AUTHENTICATION`, KASAN reports:
>> BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
>> Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153
>> Call trace:
>> dump_backtrace+0x0/0x340
>> show_stack+0x18/0x24
>> dump_stack_lvl+0x9c/0xbc
>> print_address_description.constprop.0+0x74/0x2b0
>> kasan_report+0x1d8/0x1f0
>> kasan_check_range+0xf8/0x1a0
>> memcpy+0x84/0xf4
>> ubifs_tnc_end_commit+0xa5c/0x1950
>> do_commit+0x4e0/0x1340
>> ubifs_bg_thread+0x234/0x2e0
>> kthread+0x36c/0x410
>> ret_from_fork+0x10/0x20
>> Allocated by task 401:
>> kasan_save_stack+0x38/0x70
>> __kasan_kmalloc+0x8c/0xd0
>> __kmalloc+0x34c/0x5bc
>> tnc_insert+0x140/0x16a4
>> ubifs_tnc_add+0x370/0x52c
>> ubifs_jnl_write_data+0x5d8/0x870
>> do_writepage+0x36c/0x510
>> ubifs_writepage+0x190/0x4dc
>> __writepage+0x58/0x154
>> write_cache_pages+0x394/0x830
>> do_writepages+0x1f0/0x5b0
>> filemap_fdatawrite_wbc+0x170/0x25c
>> file_write_and_wait_range+0x140/0x190
>> ubifs_fsync+0xe8/0x290
>> vfs_fsync_range+0xc0/0x1e4
>> do_fsync+0x40/0x90
>> __arm64_sys_fsync+0x34/0x50
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>> Freed by task 403:
>> kasan_save_stack+0x38/0x70
>> kasan_set_track+0x28/0x40
>> kasan_set_free_info+0x28/0x4c
>> __kasan_slab_free+0xd4/0x13c
>
> Hi Waqar, is that line 2639 ?
> 2540 sstatic int tnc_delete()
> 2541 {
> 2608 if (!znode->parent) {
> 2609 while (znode->child_cnt == 1 && znode->level != 0) {
> 2634 if (zp->cnext) {
> ...
> 2638 } else
> 2639 kfree(zp);
> 2644 }
>
`faddr2line` doesn't work for that func+offset. It just complains with
bad symbol size: sym_addr: 0xc0830f81 cur_sym_addr: 0xc0831464
The offset and size hints that it should be somewhere at the end of
`tnc_delete()`. I tried with `addr2line` and the addresses around that
offsets points to the `kfree()` on line 2639. So yes, it would say
that's the offending one.
>> kfree+0xc4/0x3a0
>> tnc_delete+0x3f4/0xe40
>> ubifs_tnc_remove_range+0x368/0x73c
>> ubifs_tnc_remove_ino+0x29c/0x2e0
>> ubifs_jnl_delete_inode+0x150/0x260
>> ubifs_evict_inode+0x1d4/0x2e4
>> evict+0x1c8/0x450
>> iput+0x2a0/0x3c4
>> do_unlinkat+0x2cc/0x490
>> __arm64_sys_unlinkat+0x90/0x100
>> invoke_syscall.constprop.0+0xa8/0x260
>> do_el0_svc+0xc8/0x1f0
>> el0_svc+0x34/0x70
>> el0t_64_sync_handler+0x108/0x114
>> el0t_64_sync+0x1a4/0x1a8
>>
>
> Looks like there is one possibility:
> 1. tnc_insert() triggers a TNC tree split:
> zroot
> /
> z_p1
> /
> zn
> z_p1 is full, after inserting zn_new(key order is smaller that zn) under z_p1,
> zn->parent is switched to z_p2, but zn->cparent is still z_p1:
> zroot
> / \
> z_p1 z_p2
> / \
> zn_new zn
> 2. tnc_delete() removes all znodes except the 'zn':
> zroot
> \
> z_p2
> \
> zn
> TNC tree is collapsed, zroot and z_p2 are freed:
> zroot'(zn)
> 3. get_znodes_to_commit() finds only one znode(zn, which is also zroot),
> zn->cparent is not updated and still points to z_p1(which was freed).
> 4. write_index() accesses the zn->cparent->zbranch, which triggers an UAF!
>
> Try following modification to verify whether the problem is fixed:
> diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
> index a55e04822d16..7c43e0ccf6d4 100644
> --- a/fs/ubifs/tnc_commit.c
> +++ b/fs/ubifs/tnc_commit.c
> @@ -657,6 +657,8 @@ static int get_znodes_to_commit(struct ubifs_info *c)
> znode->alt = 0;
> cnext = find_next_dirty(znode);
> if (!cnext) {
> + ubifs_assert(c, !znode->parent);
> + znode->cparent = NULL;
> znode->cnext = c->cnext;
> break;
> }
Yup, I think this is it! Nice work!
I've started a test run with the following patch:
diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index a55e04822d16..4c8150d5ed65 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -657,6 +657,11 @@ static int get_znodes_to_commit(struct ubifs_info *c)
znode->alt = 0;
cnext = find_next_dirty(znode);
if (!cnext) {
+ ubifs_assert(c, !znode->parent);
+ if (znode->cparent) {
+ printk("%s:%d\n", __func__, __LINE__);
+ }
+ znode->cparent = NULL;
znode->cnext = c->cnext;
break;
}
and can already see that the print hit a couple of times in a few hours
(250 iterations). I'll let it spin for a little longer (recall that the
other patch "masked" the problem for almost 800 iterations).
This was quite intricate and I really enjoyed your little breakdown.
Thank you very much for the discussions/collaboration! I'll let you now
as soon as I have an updated test result (and thus send a new version).
P.S
I wonder how many systems that might have experienced this
use-after-free and got random memory corruptions (or other security
issues). This bug has been there since v4.20.
D.S
More information about the linux-mtd
mailing list