UBIFS errors when file-system is full

Stefan Agner stefan at agner.ch
Wed Aug 12 00:01:43 PDT 2015


Hi Richard,

[also added Brian to the discussion, since he had a look into that
driver before]

On 2015-08-07 14:37, Richard Weinberger wrote:
> Hi!
> 
> Am 06.08.2015 um 12:31 schrieb Bhuvanchandra DV:
>>>> The tests ran on the ubi partition after isolating it from U-Boot completely.
>>>> Formatted the ubi partition and then boot with SD card (4.1.2 kernel fastmap enabled/disabled, fm_debug enabled).
>>>> Please find the below log of ubi-tests:
>>>>
>>>> [io_paral] write_thread():222: written and read data are different
>>> *blink*
>>
>> Tried to run the io_paral test multiple times separately with a few debug prints added to see the exact
>> differences between the read and write buffers. So far we can see that one complete page is read twice even though
>> it is written once. I'm now unsure whether the issue happens while reading or while writing. Can you give us
>> some pointers so that we can narrow down the cause of this failure?
> 
> The test verifies that the data has been written correctly to the block.
> (Maybe a buffer problem in your MTD driver?)
> 
> You can also enable UBI's IO checks.
> i.e. echo 1 > /sys/kernel/debug/ubi/ubi0/chk_io
> 
> It will also verify its writes. Maybe it can give you a clue.

According to Bhuvan's test, it really seems that we have an issue on
the write path (this error is reproducible):
root@colibri-vf:~/ubi-tests-bin# ./io_paral /dev/ubi0 2>&1 | tee ~/io-parl4.log
[ 6451.223087] ubi0 error: self_check_write: self-check failed for PEB 843:4096, len 126976
[ 6451.231650] ubi0: data differ at position 61440
[ 6451.236325] ubi0: hex dump of the original buffer from 61440 to 126976
[ 6451.331045] ubi0: hex dump of the read buffer from 61440 to 126976
[ 6451.426703] CPU: 0 PID: 1182 Comm: io_paral Not tainted 4.1.4-00704-g2631972 #21
[ 6451.434506] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)

This is 4.1.4 with v10 of the driver applied:
http://thread.gmane.org/gmane.linux.drivers.devicetree/130300


I have been working on the driver for quite some time; v10 is currently
in review. With this issue in mind, I went through the driver again, but
I currently can't see a problem.

The error position is always page aligned, but at different pages. We
printed the reread buffers once: It seems that one page lands on flash
twice. My guess is that the second page doesn't get transmitted
properly, while the new column/row gets transmitted and
NAND_CMD_PAGEPROG executed... Hence the same buffer would be written to
the device again.
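
To check the "one page lands on flash twice" guess against a dump, the
original and reread buffers can be compared page by page. The following
is only a rough sketch: the function name find_bad_pages and the page
size of 2048 bytes are assumptions made for illustration (the page size
has to match the actual NAND chip), and the buffers would have to be
extracted from the hex dumps first.

```python
PAGE_SIZE = 2048  # assumption: NAND page size, adjust to the actual chip


def find_bad_pages(written: bytes, read_back: bytes, page_size: int = PAGE_SIZE):
    """Return (page_index, duplicate_of) for every page that differs.

    duplicate_of is the index of another written page whose content
    matches what was read back at page_index, or None if there is none.
    """
    if len(written) != len(read_back):
        raise ValueError("buffers must have the same length")

    # Split both buffers into page-sized chunks.
    pages_w = [written[i:i + page_size] for i in range(0, len(written), page_size)]
    pages_r = [read_back[i:i + page_size] for i in range(0, len(read_back), page_size)]

    bad = []
    for idx, (w, r) in enumerate(zip(pages_w, pages_r)):
        if w == r:
            continue
        # Look for a different written page that equals the data read back.
        duplicate_of = next(
            (j for j, other in enumerate(pages_w) if j != idx and other == r),
            None,
        )
        bad.append((idx, duplicate_of))
    return bad
```

If the guess is right, every reported page should come back with
duplicate_of pointing at the page written just before it.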

The NFC IP in Vybrid (vf610) has a higher-level programming model which
takes care of the command sequencing. Therefore some callbacks do not
actually send a command to the device (e.g. NAND_CMD_SEQIN), since
this is done one command later, in NAND_CMD_PAGEPROG. Now, of
course, the driver relies heavily on not being interrupted by other
requests in between (not even reads!), but I thought that this is taken
care of by the MTD subsystem? So for me it is a bit hard to spot the
error, since I'm always unsure whether the assumptions regarding
locking/exclusiveness between the calls are really guaranteed...
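
To make the suspected failure mode concrete, here is a toy model of a
controller that defers SEQIN until PAGEPROG. It is not the vf610_nfc
code; the class ToyNfc, its methods, and the transfer_ok flag are made
up purely for illustration. It only shows that if the page buffer is
not updated between two program operations, the second PAGEPROG writes
the previous page's data to the new address.

```python
class ToyNfc:
    """Toy model of a controller that defers SEQIN until PAGEPROG."""

    def __init__(self):
        self.flash = {}           # page address -> bytes actually programmed
        self.sram = b""           # controller-side page buffer
        self.pending_page = None  # address latched by SEQIN, not yet sent

    def seqin(self, page):
        # Nothing goes to the chip yet; only the address is remembered.
        self.pending_page = page

    def write_buf(self, data, transfer_ok=True):
        # transfer_ok=False models a buffer update that never reaches
        # the controller (hypothetical fault, for illustration only).
        if transfer_ok:
            self.sram = bytes(data)

    def pageprog(self):
        # The whole sequence is issued here, with whatever the buffer holds.
        self.flash[self.pending_page] = self.sram
        self.pending_page = None


nfc = ToyNfc()
nfc.seqin(0)
nfc.write_buf(b"page0")
nfc.pageprog()

nfc.seqin(1)
nfc.write_buf(b"page1", transfer_ok=False)  # second page never arrives
nfc.pageprog()
```

After this sequence, pages 0 and 1 of the toy flash both hold the data
of page 0, which matches what we see in the reread buffers.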

--
Stefan



