[PATCH 07/14] Add auto-generated "." and ".." entries to directories
Zach Brown
zab at zabbo.net
Fri Feb 28 15:01:10 PST 2025
On Thu, Feb 27, 2025 at 03:16:16PM +0100, Valerie Aurora wrote:
> There are pros and cons to real or fake "." and ".." dentries, but we
> decided to go with fake because we don't have to do a potentially slow
> lookup of ".." in large directories. Userspace likes "." and ".." to
> come first in readdir(), and we return entries in hash value order, so
> reserve 0 and 1 in the hash output space for "." and "..". Then
> autogenerate them in readdir() and lookup() using the parent inode
> back reference.
> +static u64 name_hash(void *name, size_t name_len) {
> + char *s = name;
> + u64 hash;
(function opening brace on new line)
> + if ((name_len <= 2) && s[0] == '.') {
> + if (name_len == 1)
> + return NGNFS_DIRENT_DOT_HASH;
> +
> + if (s[1] == '.')
> + return NGNFS_DIRENT_DOT_DOT_HASH;
> + }
It makes me a little nervous that this dereferences name if name_len is
0. It probably never happens, but we can be more defensive and only
test for {".",".."} for name lengths of {1,2}.
> +static int fill_readdir_rd(struct ngnfs_btree_key *key, void *val, size_t val_size, void *args)
> +{
> + int ret;
> + ret = fill_readdir(key, val, args);
> +
> + if (ret == 0)
> + return NGNFS_BTREE_ITER_CONTINUE;
> +
> + return 0;
> +}
The "ret;\nret = " makes this look unfinished. I think it's correct
that it's translating ENOBUFS back into 0, but it's masking all errors
and that looks off. Would it be better to make the translation of
0->continue and NOBUFS->0 explicit, then pass through other errors?
> +static int fill_from_inode(u64 hash, struct ngnfs_inode_txn_ref *inode, char *name, size_t name_len,
> + struct ngnfs_dirent *dent, struct readdir_args *ra)
> +{
> + struct ngnfs_btree_key key = { {cpu_to_le64(hash), 0, 0} };
> +
> + fill_dent(inode->ninode->ino, inode->ninode->version,
> + mode_to_type(le32_to_cpu(inode->ninode->mode)), name, name_len, dent);
> +
> + return fill_readdir(&key, dent, ra);
The is only correct if the caller's provided hash has the collision bit
for the dirent's key. It's true for the use with dots, but it being a
little helper makes it look like it's generally true. Given that one of
the caller's shouldn't be using the inode, this should be folded back
into the caller or reworked to make it clear that it's a specific helper
for the dots caller.
> +/*
> + * Fill in "." and ".." entries for readdir if it is the first read.
> + * Assumes that the buf in ra is large enough to hold both.
> + */
> +static int fill_dots(struct ngnfs_fs_info *nfi, struct ngnfs_transaction *txn,
> + struct ngnfs_btree_key *key, struct readdir_args *ra,
> + struct ngnfs_inode_txn_ref *dir)
> +{
> + struct ngnfs_inode_txn_ref parent_dir;
> + struct ngnfs_dirent dent;
> + u64 parent_ino;
> + int ret;
> +
> + if ((le64_to_cpu(key->k[0]) != 0))
> + return 0;
This will fail to emit dotdot for readdir starting at pos 1.
dir_emit_dots() in the kernel provides the recipe that we should follow
here: if (0) { emit 0; pos = 1} if (1) { emit 1; pos = 2}.
> + parent_ino = le64_to_cpu(dir->ninode->parent_ino);
> +
> + ret = fill_from_inode(NGNFS_DIRENT_DOT_HASH, dir, ".", 1, &dent, ra) ?:
> + ngnfs_inode_get(nfi, txn, NBF_READ, parent_ino, &parent_dir) ?:
> + fill_from_inode(NGNFS_DIRENT_DOT_DOT_HASH, &parent_dir, "..", 2, &dent, ra);
As discussed around the addition of parent_ino, we don't need to read
the parent inode block here. We can know it's gen if we store it in the
inode, and we know that it darn well had better be a dir and don't need
to see its inode mode to get dtype.
So fill_from_inode() could become something like:
fill_dots(.., struct ngnfs_ino_gen *ig, u8 type, ...)
- z
More information about the ngnfs-devel
mailing list