ubifs: absurdly large directory inode size (possibly race condition / underflow)
Roland Ruckerbauer
roland.ruckerbauer at robart.cc
Wed Sep 20 08:09:04 PDT 2023
Greetings,
I have observed some strange behavior in a UBIFS filesystem and wanted to ask whether this is a known issue or something unexpected.
For reference, I am using the latest upstream 4.19 kernel on an embedded system, with the filesystem in question being encrypted with fscrypt.
When I stat the directory in question it shows the following:
File: ./datastorage
Size: 18446744073709550408 Blocks: 0 IO Block: 4096 directory
Device: 27h/39d Inode: 1168 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-09-19 13:30:21.000000000
Modify: 2023-09-20 14:44:09.000000000
Change: 2023-09-20 14:44:09.000000000
As you can see, the size of the directory inode is absurdly large. It's actually quite close to 2^64, which makes me think there is some kind of
race / underflow happening with regard to the stored inode size metadata.
Apart from this observation, the filesystem in question seems to be healthy: no errors, and no unexpected errors from the applications using it.
As far as I am aware, the problem only manifests as corrupted metadata when calling e.g. stat().
When I move the folder around on the same filesystem, the problem persists. When files are added to or deleted from the directory, the directory
size changes, but it stays corrupted and close to the same value. A reboot of the system shows that this corruption is indeed persistent on the
filesystem.
This started happening daily (it did happen occasionally before) after I made a small code change to an application running on the system.
Here is some pseudocode for the change that I suspect might be related to the problem. Obviously I removed a lot of the error handling etc. to keep it clear.
In essence, I refactored the code to make its changes to files more atomic.
--------------------------------------------------------------------------------------------------------
// Open an unnamed file in the target directory instead of creating it
// under its final name, to prevent having incomplete files in the
// filesystem when e.g. crashing. Note: O_TMPFILE takes the directory,
// not the final file path.
int fd = open(dir_path, O_WRONLY | O_CLOEXEC | O_TMPFILE, 0644);
...
write(fd, buffer, buffer_len);
...
// Give the anonymous inode its final name via /proc, as described in
// the open(2) and linkat(2) man pages.
char fd_path[64];
snprintf(fd_path, sizeof(fd_path), "/proc/self/fd/%d", fd);

// Unfortunately we need to unlink the destination first, because
// linkat() cannot replace an existing file (AT_LINK_REPLACE is not available).
// This is not atomic, so in theory someone can re-create the file after
// it is deleted here. I think it is OK to let the operation fail in this case.
unlink(path);
linkat(AT_FDCWD, fd_path, AT_FDCWD, path, AT_SYMLINK_FOLLOW);
close(fd);
--------------------------------------------------------------------------------------------------------
Is someone aware of a problem like this? I did not find anything similar to this particular problem despite searching for some time, even for
other filesystems, not just UBIFS. I expected this to work without issues; it's not some rare pattern to use O_TMPFILE like this. After
all, the linkat() and open() man pages both mention this approach.
Could it be that the unlink() followed by the linkat() is somehow resulting in a race condition in the kernel?
Best regards,
Roland Ruckerbauer
More information about the linux-mtd mailing list