[PATCH] ubifs: Add new mount option to force fdatasync before rename

Nikhilesh Reddy reddyn at codeaurora.org
Tue Oct 13 11:41:57 PDT 2015


On Tue 06 Oct 2015 01:09:22 AM PDT, Richard Weinberger wrote:
> Am 05.10.2015 um 20:05 schrieb Nikhilesh Reddy:
>> On 10/02/2015 02:38 PM, Richard Weinberger wrote:
>>> Hi!
>>>
>>> Am 28.09.2015 um 20:19 schrieb Nikhilesh Reddy:
>>>> The rename operation in UBIFS is synchronous (or nearly synchronous)
>>>> while the write operation is not. This can result in zero-length files when
>>>> a rename is followed by an abrupt power down or a crash.
>>>>
>>>> For example:
>>>> 1) Say a file a.txt exists with size 1KB.
>>>> 2) Create a file b.tmp (open)
>>>> 3) Update the data in b.tmp with new values (write and close)
>>>> 4) rename b.tmp to a.txt
>>>> 5) Abrupt power down or crash
>>>>
>>>> The above scenario can result in a.txt becoming a zero-length file,
>>>> giving the impression that a.txt was truncated.
>>>> This scenario can of course be prevented by calling fsync or fdatasync
>>>> before the rename operation.
>>>
>>> I gave this a try and hacked up something to emulate a powercut *exactly* after
>>> rename() in UBIFS.
>>>
>>> fd = open("b.tmp", ...)
>>> write(fd, "foo", ...)
>>> close(fd)
>>> rename("b.tmp", "a.txt")
>>> ^---- powercut
>>>
>>> After remounting UBIFS both a.txt and b.tmp are present
>>> but b.tmp is truncated. Not a.txt as you said.
>>>
>>> Can you please double check?
>>> I want to make sure that we're talking about the same things.
>>
>> Since you mentioned a.txt and b.tmp are both present, I assume the file a.txt was present even before b.tmp was created?
>
> Yes.
>
>> I will try to explain what I understand the situation to be.
>>
>> If both files are present, then the rename didn't actually get written to the device and was probably still in the internal UBIFS write buffer.
>
> A rename operation does not trigger a commit, therefore a powercut directly after rename() would make the rename() void.
> In this context, does "both files present" mean that a.txt and b.tmp exist and are both synced to disk?
>
>> I believe there is a small delay between the rename call and the inodes
>> being updated on the device from the internal UBIFS write buffer.
>>
>> The scenario I described above seems to occur when the inode update is committed to the device, i.e. b.tmp no longer exists because the rename was successfully written, but
>> the file data awaiting writeback (still in the page cache) has not yet been committed to the device.
>> Since the writeback buffer is much smaller than the page cache, the inode update is likely to reach the device first.
>>
>>
>> Hopefully I did not mess up my understanding or explanation.
>
> Can you please share a reproducer?
> A simple sequence of syscalls would also do.
>
> Thanks,
> //richard

Sorry for the delay in my reply; I got tied up.

As for the reproducer, it is exactly as I described in the commit
message, though we performed the power reset after a bit of a delay. It
does take a few tries on our end to reproduce, so we run it in a
loop until it is reproduced.

I will definitely send you more concrete steps once I have a bit of
time.
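
For reference, the userspace workaround mentioned in the commit message
(calling fdatasync before the rename) can be sketched roughly as below.
This is only an illustration, not the patch and not our actual
reproducer: the helper name replace_file_synced() and its parameters are
made up for this example. Dropping the fdatasync() call gives the
open/write/close/rename sequence from the commit message.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write `len` bytes to
 * tmp_path, force the data to storage, then rename over final_path.
 * Returns 0 on success, -1 on failure with errno set.
 */
int replace_file_synced(const char *tmp_path, const char *final_path,
                        const void *data, size_t len)
{
    assert(tmp_path != NULL && final_path != NULL);
    assert(data != NULL || len == 0);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until done. */
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Flush the file data before the rename can become visible. */
    if (fdatasync(fd) < 0)
        goto fail;

    if (close(fd) < 0) {
        int saved = errno;
        unlink(tmp_path);
        errno = saved;
        return -1;
    }

    /* The new name now refers to data that has already been synced. */
    if (rename(tmp_path, final_path) < 0) {
        int saved = errno;
        unlink(tmp_path);
        errno = saved;
        return -1;
    }
    return 0;

fail:
    {
        int saved = errno;  /* keep the original error for the caller */
        close(fd);
        unlink(tmp_path);
        errno = saved;
    }
    return -1;
}
```

With the file names from the example this would be called as
replace_file_synced("b.tmp", "a.txt", buf, len). The sketch only syncs
the file data; it does not sync the containing directory.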

--
Thanks
Nikhilesh Reddy

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.