UBIFS: file data corruption during the power cut-off test

Sergei Poselenov sposelenov at emcraft.com
Sun Jun 9 01:32:24 PDT 2019


Hello Steve,

Please see my comment below.

On Fri, 2019-06-07 at 09:01 -0700, Steve deRosier wrote:
> On Fri, Jun 7, 2019 at 7:24 AM Sergei Poselenov <
> sposelenov at emcraft.com> wrote:
> > Hello Richard,
> > 
> > On Thu, 6 Jun 2019 20:13:07 +0200 Richard Weinberger <
> > richard.weinberger at gmail.com> wrote:
> > 
> > > On Thu, Jun 6, 2019 at 8:08 PM Sergei Poselenov <
> > > sposelenov at emcraft.com> wrote:
> > > > This is understood. However, on the file length that is written
> > > > to the partition, I'd expect that the file content will be the
> > > > same as in the original file. This is not so.
> > > > Is it expected, or is it a deficiency of UBI?
> > > 
> > > > Please show in detail what you are doing, at the syscall level,
> > > > and what the expected output is.
> > > 
> > 
> > Here is my test:
> ...
> > However, upon retry of the very same test from the beginning (with
> > the power cut-off in the middle) it's easy to end up with content in
> > test2 (exactly the last 512 bytes in my case) which doesn't match
> > test0, so "dd if=test2 of=test0 conv=notrunc" will result in test0
> > with a different checksum.
> > 
> 
> IMHO, your test is invalid and it's your expectations that are wrong.
> The file didn't finish writing because you did a power-cut. If I had
> to guess, those exactly "last 512 bytes" are the size of your page or
> subpage on the NAND flash, and I'd bet they're filled with 0xFF.
Actually, in that last subpage I've seen 512 bytes of zeroes, or some
other data, but never 0xff.
> Unlike other filesystem media, writing flash media is done in pages,
> where they're erased and then written, and erasing and writing are a
> slow and complex process.
> 
> If I had to continue my guessing - the valid portion of the file test2
> that was successfully written is not a multiple of your NAND's page
> size.  Likely you've got 2KB pages with four 512-byte subpages.  The
> last page of that flash that was written for that file wrote three of
> the four subpages.  When you `dd` the file to overwrite the existing file,
Looks like you are right: what I'm seeing is that only 3 of the 4
512-byte subpages were written correctly.

So, you are saying that the NAND controller (or the kernel device
driver?) returned "success" for the "4K page write" operation, while
that wasn't actually true?

Thanks!

Regards,
Sergei


> you corrupt it yourself by using the conv=notrunc option - for each page
> from the start of test0, it erases, writes the page from test2, until
> it gets to the last page of test2 where it erases the page, writes
> three subpages, and leaves the last subpage as erased, but now you've
> got invalid data in the middle of your file because you don't trim the
> size to the write and so the erased data is now part of your file.
> 
> You're seeing a hardware effect and expecting a software result.
> 
> Simple fact is - a power cut when writing a large file, even with sync
> on, will result in an invalid (short) file.  Neither UBIFS nor ANY
> filesystem can protect against that. UBIFS is doing its job by
> making sure your filesystem is still usable after the power cut
> despite it being in the middle of a write.  Since your system
> is booting and you're not posting any kernel logs showing a corrupted
> filesystem, it seems to me that UBIFS is doing what it is supposed to.
> 
> If you want to understand more, the mtd website is a good start, and
> you should absolutely read all the datasheets and app notes for the
> flash device and the NAND interface you're using.
> 
> - Steve
> 
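
To make the effect discussed above concrete, here is a minimal sketch
that models it purely in memory. It is not the actual test and it does
not touch UBIFS or the flash. The 512-byte sub-page size and the
4-sub-page page layout are assumptions taken from the guesses in this
thread, and the file contents, lengths and helper names
(overlay_notrunc, differing_subpages) are hypothetical.

```python
import hashlib

SUBPAGE = 512          # assumed NAND sub-page size, as guessed in the thread
PAGE = 4 * SUBPAGE     # assumed page made of four sub-pages


def overlay_notrunc(dst: bytes, src: bytes) -> bytes:
    """Model `dd if=src of=dst conv=notrunc`: overwrite the start of dst,
    keep whatever dst holds past the end of src."""
    return src + dst[len(src):]


def differing_subpages(a: bytes, b: bytes, size: int = SUBPAGE) -> list[int]:
    """Indices of size-byte chunks that differ within the common length."""
    common = min(len(a), len(b))
    return [
        off // size
        for off in range(0, common, size)
        if a[off:off + size] != b[off:off + size]
    ]


# Hypothetical data: test0 is the original file, test2 is the copy that
# was being written when the power was cut.
test0 = bytes(i % 251 for i in range(4 * PAGE))
good_len = PAGE + 3 * SUBPAGE               # three of four sub-pages intact
test2 = test0[:good_len] + bytes(SUBPAGE)   # last sub-page reads as zeroes

restored = overlay_notrunc(test0, test2)
bad = differing_subpages(test0, test2)

print("bad sub-pages:", bad)
print("checksum matches:",
      hashlib.sha256(restored).digest() == hashlib.sha256(test0).digest())
```

In this model only the last sub-page of test2 differs, and overlaying
test2 onto test0 carries that bad sub-page into the middle of test0.
Overlaying only the prefix of test2 that is known to be good
(test2[:good_len] here) leaves test0 unchanged, which matches the point
about trimming the size to what was really written.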




More information about the linux-mtd mailing list