ubifs: master area fails to recover when master node 1 is corrupted

Sat Jan 27 01:39:13 PST 2024

On 2024/1/27 15:21, Ryder Wang wrote:
> Hi Zhihao,
> 
> Your explanation is very professional. Thanks for it.
> 
> But I still have a doubt about the code logic:
> ------------------------------------
>    if (mst1) // false
>    else {
>     offs2 = (void *)mst2 - buf2;  // offs2 = 0
>     if (offs2 + sz + sz <= c->leb_size) // true, mst2 is the first node in LEB 2
>       goto out_err
>    }
> ------------------------------------
> 1. My testing result just proved that CRC-corrupted master#1 also runs to "else" clause of the code above, just like master#1 is unmapped.
> 2. For CRC corrupted master#1 case, the code logic looks inconsistent:
>    2.1. If master#2 LEB is just to be full, master#2 will be used to recover master area.
>    2.2. If master#2 LEB is not to be full, master recovering will be aborted with error.
> 
> I think whether master#2 LEB is to be full has nothing to do with whether to recover master area in such case. How do you think about it?

Actually, UBIFS can still work if master#2 is used for recovery in this
case (master#1 is corrupted), because master#2 is then the newest version.
The offset check that requires the master#2 LEB to be full is a way to
make sure that UBIFS finds the newest master node. If we simply remove
the check, UBIFS could go wrong in some situations, for example:

A power cut happens before mst2_v2 is written to LEB2, so the UBIFS
image looks like:
              LEB1                                LEB2
|mst1_v1 | mst1_v2 |0xFF 0xFF ... |      |mst2_v1 | 0xFF 0xFF ... |

mst1_v2 is expected to be recovered after executing
ubifs_recover_master_node(). If both mst1_v1 and mst1_v2 are corrupted,
UBIFS will enter this branch:

    if (mst1) {
      /* not taken: no valid master node was found in LEB 1 */
    } else {
      offs2 = (void *)mst2 - buf2;  // offs2 = 0
      if (offs2 + sz + sz <= c->leb_size) // offset checking
        goto out_err;
      mst = mst2;
    }

If the offset check were removed, mst2_v1 would be recovered. UBIFS
would then pick the wrong (older) master node, which is not right.
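
To make the offset check concrete, here is a minimal standalone sketch.
It is not the kernel code: the helper name mst2_usable_without_mst1() and
the sizes in the note below are made up for illustration, and it models
only the condition from the "else" branch (no valid mst1, mst2 found).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function. It models only the offset
 * check taken when no valid master node was found in LEB 1.
 *
 * offs2    - offset of the master node found in LEB 2
 * sz       - size of one master node (assumed already aligned)
 * leb_size - size of a LEB
 *
 * Returns true when mst2 may be used, i.e. when LEB 2 has no room left
 * for another master node after mst2.
 */
static bool mst2_usable_without_mst1(size_t offs2, size_t sz, size_t leb_size)
{
	/* The quoted code jumps to out_err when one more node would fit. */
	return !(offs2 + sz + sz <= leb_size);
}
```

With an assumed sz of 512 and leb_size of 2048, the layout above
(mst2_v1 at offset 0) is rejected, while a node at offset 1536, the
last slot in the LEB, is accepted.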

So, according to the implementation of ubifs_recover_master_node(),
UBIFS chooses the newest master node through various offset checks.
They act like a whitelist of situations that UBIFS can fully trust; all
other situations are failure paths, although on some of those failure
paths UBIFS could still get the right master node in some (not all)
cases.