ubifs: master area fails to recover when master node 1 is corrupted

Sat Jan 27 02:21:17 PST 2024

On 2024/1/27 17:39, Zhihao Cheng wrote:
> On 2024/1/27 15:21, Ryder Wang wrote:
>> Hi Zhihao,
>>
>> Your explanation is very professional. Thanks for it.
>>
>> But I still have a doubt about the code logic:
>> ------------------------------------
>>    if (mst1) { ... } // false
>>    else {
>>     offs2 = (void *)mst2 - buf2;  // offs2 = 0
>>     // true, mst2 is the first node in LEB 2
>>     if (offs2 + sz + sz <= c->leb_size)
>>       goto out_err;
>>    }
>> ------------------------------------
>> 1. My test results show that a CRC-corrupted master#1 also takes the
>> "else" branch of the code above, just as if master#1 were unmapped.
>> 2. For the CRC-corrupted master#1 case, the code logic looks
>> inconsistent:
>>    2.1. If the master#2 LEB is nearly full, master#2 is used to
>> recover the master area.
>>    2.2. If the master#2 LEB is not nearly full, master recovery is
>> aborted with an error.
>>
>> I think whether the master#2 LEB is nearly full should have nothing to
>> do with whether the master area is recovered in this case. What do you
>> think?
> 
> 
> Actually, UBIFS can still work if master#2 is recovered in this case
> (master#1 is corrupted), because master#2 is the newest version.
> The offset check that requires the master#2 LEB to be full is a way to
> make sure that UBIFS finds the newest master node. If we simply remove
> the check, UBIFS could go wrong in some situations, for example:
> 
> A power cut happens before mst2_v2 is written to LEB2, so the UBIFS
> image looks like:
>               LEB1                                LEB2
> |mst1_v1 | mst1_v2 |0xFF 0xFF ... |      |mst2_v1 | 0xFF 0xFF ... |
> 
> mst1_v2 is expected to be recovered after executing
> ubifs_recover_master_node(). If both mst1_v1 and mst1_v2 are corrupted,
> UBIFS enters this branch:
> 
>     if (mst1) { ... } // false
>     else {
>       offs2 = (void *)mst2 - buf2;  // offs2 = 0
>       if (offs2 + sz + sz <= c->leb_size) // offset checking
>         goto out_err;
>       mst = mst2;
>     }
> If the offset check is removed, mst2_v1 is recovered. UBIFS then picks
> the wrong (older) master node, which is not right.
> 
> So, according to the implementation of ubifs_recover_master_node(),
> UBIFS chooses the newest master node through various offset checks.
> They act like a whitelist of situations that UBIFS can fully trust; all
> other situations are failure paths, although some of those failure
> paths would still let UBIFS get the right master node in some (not all)
> cases.
> 

Besides, if there are corruptions in the UBIFS image, UBIFS should report
an error; there is nothing UBIFS can do to fix them.
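
To make the offset check discussed above concrete, here is a minimal
standalone sketch. It is not the kernel code: the helper name
lone_mst2_is_trusted() and its parameters are hypothetical, and it only
models the "else" branch (no valid master node found in LEB 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not kernel code: models the offset check applied
 * when no valid master node was found in LEB 1 and mst2 is the only
 * candidate.
 *   offs2    - offset of mst2 within LEB 2
 *   sz       - aligned size of one master node
 *   leb_size - size of a LEB
 */
static bool lone_mst2_is_trusted(size_t offs2, size_t sz, size_t leb_size)
{
	/* mst2 itself must fit inside the LEB. */
	assert(sz > 0 && offs2 + sz <= leb_size);

	/*
	 * If another node would still fit after mst2, LEB 2 is not full, so
	 * a newer (now unreadable) copy may have existed in LEB 1. Refuse,
	 * which corresponds to "goto out_err" in the snippet quoted above.
	 */
	if (offs2 + sz + sz <= leb_size)
		return false;

	/* mst2 sits in the last slot of LEB 2: the case this branch accepts. */
	return true;
}
```

With illustrative (assumed) values sz = 2048 and leb_size = 126976, the
power-cut example above has offs2 = 0, so the helper returns false and
recovery is refused rather than picking mst2_v1. Only offs2 = 124928
(the last slot in the LEB) returns true.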