riscv32 EXT4 splat, 6.8 regression?

Sun Apr 14 07:08:11 PDT 2024

Andreas Dilger <adilger at dilger.ca> writes:

> On Apr 13, 2024, at 8:15 PM, Al Viro <viro at zeniv.linux.org.uk> wrote:
>> 
>> On Sat, Apr 13, 2024 at 07:46:03PM -0600, Andreas Dilger wrote:
>> 
>>> As to whether the 0xfffff000 address itself is valid for riscv32 is
>>> outside my realm, but given that RAM is cheap it doesn't seem unlikely
>>> to have 4GB+ of RAM and want to use it all.  The riscv32 might consider
>>> reserving this page address from allocation to avoid similar issues in
>>> other parts of the code, as is done with the NULL/0 page address.
>> 
>> Not a chance.  *Any* page mapped there is a serious bug on any 32bit
>> box.  Recall what ERR_PTR() is...
>> 
>> On any architecture the virtual addresses in range (unsigned long)-512..
>> (unsigned long)-1 must never resolve to valid kernel objects.
>> In other words, any kind of wraparound here is asking for an oops on
>> attempts to access the elements of buffer - kernel dereference of
>> (char *)0xfffff000 on a 32bit box is already a bug.
>> 
>> It might be getting an invalid pointer, but arithmetical overflows
>> are irrelevant.
>
> The original bug report stated that search_buf = 0xfffff000 on entry,
> and I'd quoted that at the start of my email:
>
> On Apr 12, 2024, at 8:57 AM, Björn Töpel <bjorn at kernel.org> wrote:
>> What I see in ext4_search_dir() is that search_buf is 0xfffff000, and at
>> some point the address wraps to zero, and boom. I doubt that 0xfffff000
>> is a sane address.
>
> Now that you mention ERR_PTR() it definitely makes sense that this last
> page HAS to be excluded.
>
> So some other bug is passing the bad pointer to this code before this
> error, or the arch is not correctly excluding this page from allocation.

Yeah, something is off for sure.

(FWIW, I managed to hit this on Linus' master as well.)

I added a print (close to trace_mm_filemap_add_to_page_cache()), and for
this backtrace:

  [<c01e8b34>] __filemap_add_folio+0x322/0x508
  [<c01e8d6e>] filemap_add_folio+0x54/0xce
  [<c01ea076>] __filemap_get_folio+0x156/0x2aa
  [<c02df346>] __getblk_slow+0xcc/0x302
  [<c02df5f2>] bdev_getblk+0x76/0x7a
  [<c03519da>] ext4_getblk+0xbc/0x2c4
  [<c0351cc2>] ext4_bread_batch+0x56/0x186
  [<c036bcaa>] __ext4_find_entry+0x156/0x578
  [<c036c152>] ext4_lookup+0x86/0x1f4
  [<c02a3252>] __lookup_slow+0x8e/0x142
  [<c02a6d70>] walk_component+0x104/0x174
  [<c02a793c>] path_lookupat+0x78/0x182
  [<c02a8c7c>] filename_lookup+0x96/0x158
  [<c02a8d76>] kern_path+0x38/0x56
  [<c0c1cb7a>] init_mount+0x5c/0xac
  [<c0c2ba4c>] devtmpfs_mount+0x44/0x7a
  [<c0c01cce>] prepare_namespace+0x226/0x27c
  [<c0c011c6>] kernel_init_freeable+0x286/0x2a8
  [<c0b97ab8>] kernel_init+0x2a/0x156
  [<c0ba22ca>] ret_from_fork+0xe/0x20

I get a folio where folio_address(folio) == 0xfffff000, which is
broken: that is the last page of the 32-bit address space, and it
overlaps the ERR_PTR() range.
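
To spell out why that address cannot be valid, here is a minimal
userspace sketch (not kernel code) that models a 32-bit address space
with uint32_t. It assumes 4 KiB pages and MAX_ERRNO == 4095 as defined
in include/linux/err.h; the helper names are made up for illustration.
It shows that the page at 0xfffff000 overlaps the ERR_PTR() range, and
that the end of a one-page buffer starting there wraps to zero:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of a 32-bit address space; helper names are hypothetical. */
#define MAX_ERRNO    4095u    /* value used by include/linux/err.h */
#define PAGE_SIZE_32 0x1000u  /* assumption: 4 KiB pages */

/* Mirrors the idea of IS_ERR_VALUE(): the top MAX_ERRNO addresses
 * (0xfffff001..0xffffffff) encode error codes, not objects. */
static bool is_err_value32(uint32_t addr)
{
	return addr >= (uint32_t)-MAX_ERRNO;
}

/* True if any byte of the page starting at page_addr lies in the
 * error-pointer range (or the page runs off the end of the address space). */
static bool page_overlaps_err_range32(uint32_t page_addr)
{
	uint32_t last = page_addr + (PAGE_SIZE_32 - 1u);

	return last < page_addr || is_err_value32(last);
}

/* End-of-buffer address as 32-bit pointer arithmetic would compute it;
 * unsigned addition wraps modulo 2^32. */
static uint32_t buf_end32(uint32_t buf, uint32_t len)
{
	return buf + len;
}
```

With these helpers, page_overlaps_err_range32(0xfffff000) is true while
the page below it (0xffffe000) is fine, and
buf_end32(0xfffff000, 0x1000) is 0, which matches the wrap to zero seen
in ext4_search_dir().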

Need to go into the weeds here...

Björn