[PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir

Mon May 5 07:42:10 PDT 2025

On 5/5/2025 10:02 AM, Etienne Champetier wrote:
> Hello,
>
> Removing lists, feel free to add them back
>
> On Mon, May 5, 2025 at 09:14, Christian Brauner <brauner at kernel.org> wrote:
>> Why is it removed? That's a very strange comment:
>>
>> #if 0   /* reportedly no longer needed */
>>
>> So then just don't remove it. I don't see a reason for us to workaround
>> userspace creating a bug for itself and forcing us to add two new inode
>> operations to work around it.
> This bash workaround, introduced ages ago for AFS, bypasses fs.protected_regular

Chet, I don't think this history is correct.  The bash workaround was
introduced in 1992 to work around a behavior when appending to restricted
access directories stored in IBM AFS 3.1[1], and the Linux kernel commit
30aba6656f61ed44cba445a3c0d38b296fa9e8f5 wasn't added until 2018.

IBM AFS 3.2 addressed the narrow use case described by the bug report by 
implementing a potentially racy change to the AFS cache manager and 
failing to address the server side.  However, that is out of scope for 
this discussion.  To the extent that there is a bug in one or more of 
the AFS server implementations it should be fixed there.

The bash fallback logic to retry the open without O_CREAT introduces a 
bypass for the kernel mode protection provided by 30aba6656f61 and 
should be removed.
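
For readers unfamiliar with the pattern, the fallback described above can
be sketched roughly as follows.  This is an illustration only, not bash's
source; the helper name open_with_fallback is hypothetical.  It shows why
the retry matters: the second open() no longer carries O_CREAT, so a
protection that is applied to opens with O_CREAT is not exercised on the
retry.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not bash's code): try the open as requested and,
 * if it is refused with EACCES while O_CREAT was set, retry the same
 * open with O_CREAT cleared.
 */
static int open_with_fallback(const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);

    if (fd < 0 && errno == EACCES && (flags & O_CREAT)) {
        /* The retry can only succeed if the file already exists. */
        fd = open(path, flags & ~O_CREAT);
    }
    return fd;
}
```

A caller appending to an existing file it does not own in a sticky
directory would see the first open() fail and the second succeed, which
is the bypass in question.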

Christian,

It just so happens that the workaround added to bash in 1992 masks an 
incompatibility introduced by 30aba6656f61 when the backing filesystem 
is "afs" because the ownership checks required by may_create_in_sticky() 
cannot be reliably performed based upon the kernel's local knowledge of 
the uids.  Ownership checks in "afs" are performed by the fileserver's 
evaluation of the caller's rxgk or rxkad security tokens and not by use 
of uids.  This incompatibility was only noticed after Red Hat began 
enabling fs.protected_regular by default and bash removed the fallback 
logic in the proposed 5.3 release candidates.

The proposed inode operations permit filesystems such as AFS, which
cannot rely upon the kernel's local uid knowledge to perform the
required ownership checks, to perform those checks via another
mechanism.  In the case of AFS, the fileserver already conveys the
answer to the "is inode owned by me?" question as part of its delivery 
of caller access rights (AFSFetchStatus.CallerAccess).   The answer to 
the "do these two inodes have the same owner?" question can be 
determined via comparison of the AFSFetchStatus.Owner fields for each 
inode, which belong to a uid namespace that is specific to the AFS cell
in which the inodes are stored.  When performing this ownership check 
for network filesystems I do not believe it is safe to assume that the 
uid namespace of the network filesystem is identical to the local 
machine's uid namespace.  I think it would be safer for all network 
filesystems to answer the ownership questions using network uid values 
instead of local uid values when available.
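
To make the two ownership questions concrete, here is a minimal sketch of
how a network filesystem might answer them from server-supplied status
rather than local uids.  Everything in it is an assumption made for
illustration: struct net_status, its fields, and the function names are
hypothetical stand-ins, the caller_is_owner flag stands in for whatever
the fileserver reports through CallerAccess, and the final check is a
simplification of the sticky-directory ownership test, not the kernel's
implementation.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of per-inode status from a fileserver. */
struct net_status {
    unsigned int  cell;            /* which cell (uid namespace) the inode lives in */
    unsigned long owner;           /* owner id in that cell's namespace */
    bool          caller_is_owner; /* server's answer to "is inode owned by me?" */
};

/* "Is inode owned by me?": trust the answer the server already delivered. */
static bool is_owned_by_me(const struct net_status *st)
{
    return st->caller_is_owner;
}

/* "Do these two inodes have the same owner?": compare network ids,
 * and only within the same cell, since ids are cell-specific. */
static bool have_same_owner(const struct net_status *a,
                            const struct net_status *b)
{
    return a->cell == b->cell && a->owner == b->owner;
}

/* Simplified ownership test for opening an existing file in a sticky
 * directory: allowed if the file and directory share an owner, or if
 * the caller owns the file. */
static bool sticky_open_allowed(const struct net_status *dir,
                                const struct net_status *file)
{
    return have_same_owner(dir, file) || is_owned_by_me(file);
}
```

Note that no local uid appears anywhere in the sketch; that is the point
of answering the questions with network-side values.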

I'm also concerned about using id-mapped values for this comparison 
because there is no restriction preventing two distinct id values from 
being mapped to the same id.
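
The concern can be illustrated with a toy mapping.  The table, the ids,
and the function names below are all hypothetical; the sketch only shows
that when a mapping is many-to-one, an equality test on mapped values
can report "same owner" for two owners that are distinct on the server.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping from network-side ids to local ids. */
struct id_map_entry {
    unsigned long net_id;
    unsigned long local_id;
};

static const struct id_map_entry example_map[] = {
    { 1001, 5000 },
    { 2002, 5000 },   /* a distinct network id mapped to the same local id */
    { 3003, 5001 },
};

/* Illustrative fallback for ids that have no entry in the table. */
#define UNMAPPED_ID 65534UL

static unsigned long map_id(unsigned long net_id)
{
    for (size_t i = 0; i < sizeof(example_map) / sizeof(example_map[0]); i++)
        if (example_map[i].net_id == net_id)
            return example_map[i].local_id;
    return UNMAPPED_ID;
}

/* Comparison on mapped values: distinct owners can collide. */
static bool same_owner_mapped(unsigned long a, unsigned long b)
{
    return map_id(a) == map_id(b);
}

/* Comparison on the network-side values themselves. */
static bool same_owner_raw(unsigned long a, unsigned long b)
{
    return a == b;
}
```

With this table, ids 1001 and 2002 compare equal after mapping even
though they differ before it.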

Sincerely,

Jeffrey Altman

[1] https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ
