UBIFS: file data corruption during the power cut-off test

Sergei Poselenov sposelenov at emcraft.com
Sun Jun 9 01:32:24 PDT 2019

Hello Steve,

Please see my comment below.

On Fri, 2019-06-07 at 09:01 -0700, Steve deRosier wrote:
> On Fri, Jun 7, 2019 at 7:24 AM Sergei Poselenov <
> sposelenov at emcraft.com> wrote:
> > Hello Richard,
> > 
> > On Thu, 6 Jun 2019 20:13:07 +0200 Richard Weinberger <
> > richard.weinberger at gmail.com> wrote:
> > 
> > > On Thu, Jun 6, 2019 at 8:08 PM Sergei Poselenov <
> > > sposelenov at emcraft.com> wrote:
> > > > This is understood. However, on the file length that is written
> > > > to the partition, I'd expect that the file content will be the
> > > > same as in the original file. This is not so.
> > > > Is it expected, or is it a deficiency of UBI?
> > > 
> > > Please show in detail what you are doing, on syscall level, and
> > > what
> > > the expected output is.
> > > 
> > 
> > Here is my test:
> ...
> > However, upon retry of the very same test from the beginning (with
> > the power cut-off in the middle) it's easy to get content in
> > test2 (exactly the last 512 bytes in my case) which doesn't match
> > test0, so "dd if=test2 of=test0 conv=notrunc" will result in test0
> > with a different checksum.
> > 
> IMHO, your test is invalid and it's your expectations that are wrong.
> The file didn't finish writing because you did a power-cut. If I had
> to guess, those exact "last 512 bytes" are the size of your page or
> subpage on the NAND flash, and I'd bet they're filled with 0xFF.
Actually, in that last subpage I've seen 512 bytes of zeroes, or some
other data, but never 0xff.
> Unlike other filesystem media, writing flash media is done in pages,
> where they're erased and then written, and erasing and writing is a
> slow and complex process.
> If I had to continue my guessing - the valid portion of the file
> test2 that was successfully written is not a multiple of your NAND's
> page size.  Likely you've got 2 KB pages with four 512-byte subpages.
> The last page of that flash that was written for that file wrote
> three of the four subpages.  When you `dd` the file to overwrite the
> existing file,
It looks like you are right: what I'm seeing is that only 3 of the 4
512-byte subpages are written correctly.

So, you are saying that the NAND controller (or the kernel device
driver?) returned "success" for the "4K page write" operation, while
that wasn't actually true?
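
To make the observation above easier to reproduce, here is a minimal
sketch of how the mismatching subpage can be located and classified.
It assumes 512-byte subpages (as discussed in this thread); the
function names are hypothetical and the sample data is synthetic, not
taken from my actual test files.

```python
SUBPAGE = 512  # assumed subpage size, taken from the discussion in this thread


def classify(chunk: bytes) -> str:
    """Describe the content of one chunk (hypothetical helper)."""
    if not chunk:
        return "missing"
    if chunk == b"\xff" * len(chunk):
        return "erased (0xFF)"
    if chunk == b"\x00" * len(chunk):
        return "zeroes"
    return "other data"


def mismatched_subpages(reference: bytes, candidate: bytes, size: int = SUBPAGE):
    """Return (offset, classification) for every chunk of `candidate`
    that differs from the same range of `reference`."""
    result = []
    for offset in range(0, len(candidate), size):
        got = candidate[offset:offset + size]
        want = reference[offset:offset + size]
        if got != want:
            result.append((offset, classify(got)))
    return result


if __name__ == "__main__":
    # Synthetic stand-ins for test0 (reference) and test2 (partial copy):
    # three good subpages followed by one subpage of zeroes.
    reference = bytes(range(256)) * 8
    candidate = reference[:3 * SUBPAGE] + b"\x00" * SUBPAGE
    for offset, kind in mismatched_subpages(reference, candidate):
        print(f"offset {offset}: {kind}")
```

With real files, the two byte strings would be read from test0 and
test2 in binary mode instead of being generated.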

> you corrupt it yourself by using the notrunc option - for each page
> from the start of test0, it erases, writes the page from test2, until
> it gets to the last page of test2 where it erases the page, writes
> three subpages, and leaves the last subpage as erased, but now you've
> got invalid data in the middle of your file because you don't trim
> the size to the write and so the erased data is now part of your file.
> You're seeing a hardware effect and expecting a software result.
> Simple fact is - a power cut when writing a large file, even with
> sync on, will result in an invalid (short) file.  Neither UBIFS nor
> ANY other filesystem can protect against that. UBIFS is doing its job
> by making sure your filesystem is still usable after the power cut
> despite it being in the middle of a write.  Since your system is
> booting and you're not posting any kernel logs showing a corrupted
> filesystem, it seems to me that UBIFS is doing what it is supposed
> to.
> If you want to understand more, the mtd website is a good start, and
> you should absolutely read all the datasheets and app notes for the
> flash device and the NAND interface you're using.
> - Steve
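
For reference, the check my test relies on can be modelled at the file
level as below. This is only a sketch under stated assumptions: the
function `dd_notrunc` is a hypothetical in-memory model of
"dd if=test2 of=test0 conv=notrunc" (overwrite the start of the
destination, keep whatever follows), and the data is synthetic. It
says nothing about what happens at the NAND level.

```python
import hashlib


def dd_notrunc(dest: bytes, src: bytes) -> bytes:
    """In-memory model of `dd if=src of=dest conv=notrunc`: the first
    len(src) bytes of dest are overwritten, the remainder is kept."""
    return src + dest[len(src):]


def checksum(data: bytes) -> str:
    """SHA-256 hex digest, standing in for whatever checksum tool is used."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    test0 = bytes(range(256)) * 8  # synthetic 2048-byte original

    # Case 1: test2 is a clean, shorter prefix of test0.
    good = test0[:1536]
    print(checksum(dd_notrunc(test0, good)) == checksum(test0))

    # Case 2: the last 512 bytes of test2 do not match test0.
    bad = test0[:1024] + b"\x00" * 512
    print(checksum(dd_notrunc(test0, bad)) == checksum(test0))
```

In this model a short but otherwise correct test2 leaves the checksum
of test0 unchanged, and only wrong bytes inside the written length
change it.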

More information about the linux-mtd mailing list