UBIFS: file data corruption during the power cut-off test

Sergei Poselenov sposelenov at emcraft.com
Fri Jun 7 23:42:04 PDT 2019


Hello Richard,

On Fri, 7 Jun 2019 18:02:16 +0200 (CEST) Richard Weinberger <
richard at nod.at> wrote:

> ----- Ursprüngliche Mail -----
> > Now, I' overwriting the content of the "test0" with "test2",
> > without "test0"
> > truncation:
> > mnt/data1 # dd if=test2 of=test0 conv=notrunc
> > 222+0 records in
> > 222+0 records out
> > 113664 bytes (111.0KB) copied, 0.859668 seconds, 129.1KB/s
> > /mnt/data1 # md5sum *
> > 23376319de62934c3859615e6244aae0  test0
> > 23376319de62934c3859615e6244aae0  test1
> > 87cd1d1a28896e63eb039d75efef43a3  test2
> > /mnt/data1 #
> > 
> > For this particular test pass, the content of test2 matches the
> > content of the
> > original file test0, so overwriting test0 with test2 doesn't change
> > the data.
> > 
> > However, upon retry of the very same test from the beginning (with
> > the power
> > cut-off in the middle) it's easily to have the content of test2
> > (exactly the
> > last 512 bytes in my case) which doesn't match test0, so "dd
> > if=test2 of=test0
> > conv=notrunc" will result in test0 with a different checksum.
> > 
> > To remind, the partition is mounted with the "sync" option.  
> 
> Ahhh, now understand what you do. Sorry for being dense.
> 
> I fear your assumption is not correct because UBIFS is strictly copy-
> on-write.
> So if you overwrite a file, it will always create new data blocks. A
> data block
> has 4k. If you face a power-cut from this 4k block the last written
> page can be
> lost. This is likely what you see.
> 
> Does my answer help?
> 

Not actually, I'm afraid. I don't see how overwriting a file is
relevant here. Definitely, I don't do power cut during "dd if=test2
of=test0 conv=notrunc". At the moment of power cut-off, test2 has been
created from scratch and has some data written to it, with more writes in progress.

Overwriting with "dd if=test2 of=test0 conv=notrunc" is just an easy
way to demonstrate that the last written file has invalid data. I still
don't understand why is that? It's not the dd operation itself that
corrupts test0. It's test2 which already has invalid data.

It looks like the directory entry was updated before the data was
actually written to NAND, regardless of the "sync" mount option?

Just in case, here is my sequence:
repeat:
0. cd /mnt/data1; rm -rf *
1. /mnt/data1 # for i in `seq 0 99`; do
> dd if=/bin/busybox of=test${i}; done  
3. Power is cut during the above "for" loop
4. After boot-up, check the last written file:
/mnt/data1 # ls -l
-rw-r--r--    1 root     root        396795 Jan  8 23:58 test0
-rw-r--r--    1 root     root        396795 Jan  8 23:58 test1
-rw-r--r--    1 root     root        113664 Jan  8 23:58 test2

/mnt/data1# dd if=test2 of=test0 conv=notrunc
/mnt/data1# md5sum *
Result:Sometimes, "test0" has invalid data, which means that "test2"
has corrupted data on its written length.
5. goto repeat;


Update: I just re-done the test. Here is what I have after the system
recovery after the power cut-off:
~ # ls -l /mnt/data1
-rw-r--r--    1 root     root        430519 Jan  1 00:00 test0
-rw-r--r--    1 root     root        430519 Jan  1 00:04 test1
-rw-r--r--    1 root     root        430519 Jan  1 00:04 test2
-rw-r--r--    1 root     root        430519 Jan  1 00:04 test3
-rw-r--r--    1 root     root        286720 Jan  1 00:04 test4

Now, my test:
/mnt/data1 # md5sum *
0dd9133f519bbee46ac593490b9d8805  test0
0dd9133f519bbee46ac593490b9d8805  test1
0dd9133f519bbee46ac593490b9d8805  test2
0dd9133f519bbee46ac593490b9d8805  test3
76d6ffbaad4220ac474484bf09b1bbb8  test4
/mnt/data1 # dd if=test4 of=test0 conv=notrunc
/mnt/data1 # md5sum *
81c3dcc4f2dfd65592ad245a6976091d  test0
0dd9133f519bbee46ac593490b9d8805  test1
0dd9133f519bbee46ac593490b9d8805  test2
0dd9133f519bbee46ac593490b9d8805  test3
76d6ffbaad4220ac474484bf09b1bbb8  test4

OK, test4 is corrupted:
# hexdump test0 > test0.hex
# hexdump test4 > test4.hex
# hexdump /bin/busybox > busybox.hex
# diff -pu busybox.hex test0.hex | more
--- busybox.hex	2019-06-07 21:24:16.000000000 +0300
+++ test0.hex	2019-06-07 21:24:28.000000000 +0300
@@ -17852,38 +17852,8 @@
 0045dd0 0004 e1a0 81f0 e8bd 9ba0 0007 3014 e59f
 0045de0 4010 e92d 4000 e593 0008 e594 ffa9 ebff
 0045df0 0008 e584 8010 e8bd 9ba0 0007 0000 e151
-0045e00 40f8 e92d 3000 31a0 4000 e1a0 4001 31a0
-0045e10 1003 31a0 0001 e1a0 ff8d ebff 3044 e59f
-0045e20 5000 e3a0 6000 e593 7000 e1a0 0007 e154
-0045e30 000a 8a00 3004 e596 3001 e243 0003 e154
-0045e40 0006 8a00 0004 e1a0 ff81 ebff 3000 e5d0
-0045e50 4001 e280 000a e353 5001 0285 fff2 eaff
-0045e60 0005 e1a0 80f8 e8bd 9ba0 0007 3000 e3e0
-0045e70 40f7 e92d 5001 e1a0 3000 e581 1000 e5d0
-0045e80 4000 e1a0 002e e351 3130 059f 4001 0280
-0045e90 6000 0593 0008 0596 0038 0a00 0027 e351
-0045ea0 0013 1a00 3001 e5d0 4002 e280 2041 e243
-0045eb0 0019 e352 3020 9283 3073 96ef 3061 e243
-0045ec0 3073 e6ef 0019 e353 0039 8a00 20ec e59f
-0045ed0 3044 e283 2000 e592 1103 e792 0000 e351
-0045ee0 0033 0a00 0000 e592 ffc3 ebff 0000 e585
-0045ef0 002f ea00 002f e351 0019 1a00 6001 e280
-0045f00 0006 e1a0 f7c7 ebfe 4000 e1a0 1004 e066
-0045f10 0006 e1a0 ff36 ebfe 3000 e5d4 2001 e3a0
-0045f20 002f e353 3094 e59f 6000 e1a0 7000 e593
-0045f30 1006 e1a0 0008 e597 3002 e1a0 4001 0284
-0045f40 ff6d ebff 1000 e250 0002 0a00 0000 e597
-0045f50 ffa9 ebff 0000 e585 0006 e1a0 f7b4 ebfe
-0045f60 0013 ea00 0024 e351 0008 1a00 304c e59f
-0045f70 4001 e280 6000 e593 0004 e596 0001 e240
-0045f80 fee5 ebff 1000 e1a0 0000 e596 ffd5 eaff
-0045f90 3030 e241 0009 e353 0005 8a00 3004 e28d
-0045fa0 101c e59f 2005 e1a0 fa0e ebfe 3004 e59d
-0045fb0 4003 e084 0004 e1a0 d00c e28d 80f0 e8bd
-0045fc0 9ba0 0007 dd97 0006 4038 e92d 302c e59f
-0045fd0 5000 e3a0 3000 e593 4034 e593 0058 e593
-0045fe0 4002 e244 0004 e155 0002 2a00 ff29 ebff
-0045ff0 5001 e285 fffa eaff 4038 e8bd ff14 eaff
+0045e00 0000 0000 0000 0000 0000 0000 0000 0000
+*
 0046000 9ba0 0007 40f8 e92d 306c e59f 6000 e1a0
 0046010 7001 e1a0 5000 e593 4003 e1a0 0000 e356
 0046020 0008 da00 0000 e357 0058 e595 0001 aa00

You see, exactly 512 bytes at 0x45e00 are zeroes. That's the test4
file:
# hexdump test4.hex
...
0045dd0 0004 e1a0 81f0 e8bd 9ba0 0007 3014 e59f
0045de0 4010 e92d 4000 e593 0008 e594 ffa9 ebff
0045df0 0008 e584 8010 e8bd 9ba0 0007 0000 e151
0045e00 0000 0000 0000 0000 0000 0000 0000 0000
*
0046000
#

Thanks!

Regards,
Sergei


> Thanks,
> //richard
> 




More information about the linux-mtd mailing list