ubifs: master area fails to recover when master node 1 is corrupted

Fri Jan 26 23:21:20 PST 2024

Hi Zhihao,

Your explanation is very professional. Thanks for it.

But I still have a question about the code logic:
------------------------------------
  if (mst1) // false
  else {
   offs2 = (void *)mst2 - buf2;  // offs2 = 0
   if (offs2 + sz + sz <= c->leb_size) // true, mst2 is the first node in LEB 2
     goto out_err
  }
------------------------------------
1. My test shows that a CRC-corrupted master#1 also takes the "else" branch of the code above, exactly as if master#1 were unmapped.
2. For the CRC-corrupted master#1 case, the code logic looks inconsistent:
  2.1. If the master#2 LEB is nearly full (no room for another master node), master#2 is used to recover the master area.
  2.2. If the master#2 LEB is not nearly full, master recovery is aborted with an error.

I think whether the master#2 LEB is nearly full should have nothing to do with whether the master area is recovered in this case. What do you think?
________________________________________
From: Zhihao Cheng <chengzhihao1 at huawei.com>
Sent: Friday, January 26, 2024 10:20
To: Ryder Wang; linux-mtd at lists.infradead.org
Subject: Re: ubifs: master area fails to recover when master node 1 is corrupted

On 2024/1/25 19:48, Ryder Wang wrote:
> Hi,
>
> I just found that the master area always fails to recover at mount time when master node 1's CRC is corrupted but master node 2 is completely good.  It reproduces 100% of the time on kernel v5.4.233, but it seems to be a general issue.
>

According to the debug messages below, the mounting failure occurs as
follows:
                     LEB 1                       LEB 2
           |mst1 | 0xFF 0xFF ... |      |mst2 | 0xFF 0xFF ... |
offset    0                            0
* mst1 has bad crc.

ubifs_recover_master_node
  get_master_node(UBIFS_MST_LNUM, &buf1, &mst1)
   ubifs_scan_a_node(buf, lnum, offs=0) // SCANNED_A_CORRUPT_NODE
    ubifs_check_node  // -EUCLEAN, caused by bad crc
   if (offs < c->leb_size) // true
    if (!is_empty(buf, min_t(int, len, sz))) // true
     dbg_rcvry("found corruption at %d:%d")
  get_master_node(UBIFS_MST_LNUM + 1, &buf2, &mst2)
   ubifs_scan_a_node // SCANNED_A_NODE
   *mst = buf // buf = sbuf
   buf2 = sbuf
  if (mst1) // false
  else {
   offs2 = (void *)mst2 - buf2;  // offs2 = 0
   if (offs2 + sz + sz <= c->leb_size) // true, mst2 is the first node in LEB 2
     goto out_err
  }
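
The -EUCLEAN step in the trace comes down to a CRC mismatch. As a rough
user-space illustration (not the kernel code), the sketch below assumes
the UBIFS common header layout with the CRC stored little-endian at byte
offset 4 and computed over everything from byte offset 8 onward, seeded
with 0xFFFFFFFF and without a final XOR. All function names here are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise little-endian CRC32 (polynomial 0xEDB88320), no final XOR. */
uint32_t crc32_le_sketch(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0u);
	}
	return crc;
}

/* Read a little-endian 32-bit value from a byte buffer. */
uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 32-bit value into a byte buffer in little-endian order. */
void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/* Assumed layout: bytes 0-3 magic, bytes 4-7 crc, CRC covers bytes 8..len. */
void node_set_crc(uint8_t *node, size_t node_len)
{
	assert(node_len >= 8);
	put_le32(node + 4, crc32_le_sketch(0xFFFFFFFFu, node + 8, node_len - 8));
}

/* Returns 1 when the stored CRC matches the recomputed one, else 0. */
int node_crc_ok(const uint8_t *node, size_t node_len)
{
	if (node_len < 8)
		return 0;
	return get_le32(node + 4) ==
	       crc32_le_sketch(0xFFFFFFFFu, node + 8, node_len - 8);
}
```

Flipping any bit of the stored CRC (or of the covered bytes) makes
node_crc_ok() return 0, which corresponds to the "bad crc" case for mst1
in the trace.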

The process above handles one specific situation when recovering master
nodes after a power cut: LEB 1 has been unmapped and is about to receive
the newest master node when the power cut happens:
ubifs_write_master
  lnum = UBIFS_MST_LNUM; // LEB 1
  if (offs + UBIFS_MST_NODE_SZ > c->leb_size) // true
   err = ubifs_leb_unmap(c, lnum);
  >> powercut <<
  err = ubifs_write_node_hmac(c->mst_node, lnum)
So the master node from LEB 2 can be recovered only if there is no room
left for new master nodes in LEB 2.
The problem here is that mst1 was deliberately corrupted to construct
this situation. UBIFS detects that the on-flash state does not match the
expected power-cut scenario, so it refuses to recover the master nodes.
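
The check discussed above can be modelled in a few lines of user-space C.
This is only a sketch of the quoted "else" branch, not the kernel code:
the function name is hypothetical, and sz stands for the aligned master
node size, whose real value depends on the flash geometry. The value 512
used below is an assumption taken from the len field in the dump, and
253952 is the LEB size reported in the log.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the "mst1 not found" branch.
 * Returns true when mst2 alone would be accepted as the master node.
 */
bool mst2_only_recoverable(int offs2, int sz, int leb_size)
{
	assert(offs2 >= 0 && sz > 0 && leb_size >= sz);

	/*
	 * If another master node still fits after mst2, LEB 1 could not
	 * have been unmapped by ubifs_write_master, so this is not the
	 * expected power-cut state: take the error path.
	 */
	if (offs2 + sz + sz <= leb_size)
		return false;

	/* LEB 2 is full: consistent with a power cut after unmapping LEB 1. */
	return true;
}
```

Under these assumptions, mst2_only_recoverable(0, 512, 253952) is false
(the case in this report), while
mst2_only_recoverable(253952 - 512, 512, 253952) is true (LEB 2 is full,
the power-cut case).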

> How to reproduce it:
> 1. Corrupt the CRC value of master node 1 (keeping master node 2 good) on ubifs.
> 2. Mount this ubifs.
>
> The mount at step#2 always fails. From the log, it looks like master recovery fails, but master recovery is expected to succeed in this case.

Master recovery is not expected to succeed in this situation. The two
master nodes are not meant to allow recovery in every situation; they are
used to find a valid version of the master node. See the following
section in [1]:

"The master node stores the position of all on-flash structures ... The
first is that there could be a loss of power at the same instant that
the master node is being written. The second is that there could be
degradation or corruption of the flash media itself. ... In the second
case, recovery is not possible because it cannot be determined reliably
what is a valid master node version."

[1] http://linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf

>
> Below is the kernel log of this failure:
>
> ubifs_mount:2253: UBIFS DBG gen (pid 10770): name ubi0:test_volume, flags 0x0
> ubifs_mount:2274: UBIFS DBG gen (pid 10770): opened ubi0_0
> ubifs_read_node:1094: UBIFS DBG io (pid 10770): LEB 0:0, superblock node, length 4096
> UBIFS (ubi0:0): Mounting in unauthenticated mode
> ubifs_read_superblock:765: UBIFS DBG mnt (pid 10770): Auto resizing from 13 LEBs to 100 LEBs
> ubifs_start_scan:131: UBIFS DBG scan (pid 10770): scan LEB 1:0
> ubifs_scan:270: UBIFS DBG scan (pid 10770): look at LEB 1:0 (253952 bytes left)
> ubifs_scan_a_node:77: UBIFS DBG scan (pid 10770): scanning master node at LEB 1:0
> UBIFS error (ubi0:0 pid 10770): ubifs_scan [ubifs]: bad node
> ubifs_recover_master_node:234: UBIFS DBG rcvry (pid 10770): recovery
> ubifs_scan_a_node:77: UBIFS DBG scan (pid 10770): scanning master node at LEB 1:0
> get_master_node:163: UBIFS DBG rcvry (pid 10770): found corruption at 1:0
> ubifs_scan_a_node:77: UBIFS DBG scan (pid 10770): scanning master node at LEB 2:0
> get_master_node:152: UBIFS DBG rcvry (pid 10770): found a master node at 2:0
> UBIFS error (ubi0:0 pid 10770): ubifs_recover_master_node [ubifs]: failed to recover master node
> UBIFS error (ubi0:0 pid 10770): ubifs_recover_master_node [ubifs]: dumping second master node
> UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 10772
>          magic          0x6101831
>          crc            0x3a5c03b2
>          node_type      7 (master node)
>          group_type     0 (no node group)
>          sqnum          9
>          len            512
>          highest_inum   65
>          commit number  0
>          flags          0x2
>          log_lnum       3
>          root_lnum      12
>          root_offs      0
>          root_len       108
>          gc_lnum        11
>          ihead_lnum     12
>          ihead_offs     4096
>          index_size     112
>          lpt_lnum       7
>          lpt_offs       44
>          nhead_lnum     7
>          nhead_offs     4096
>          ltab_lnum      7
>          ltab_offs      57
>          lsave_lnum     0
>          lsave_offs     0
>          lscan_lnum     10
>          leb_cnt        13
>          empty_lebs     1
>          idx_lebs       1
>          total_free     753664
>          total_dirty    7640
>          total_used     440
>          total_dead     0
>          total_dark     16384
> UBIFS (ubi0:0): background thread "ubifs_bgt0_0" stops