Compressed files & the page cache
Eric Biggers
ebiggers at kernel.org
Wed Jul 16 19:49:03 PDT 2025
On Wed, Jul 16, 2025 at 11:37:28PM +0100, Phillip Lougher wrote:
> > There also seems to be some discrepancy between filesystems whether the
> > decompression involves vmap() of all the memory allocated or whether the
> > decompression routines can handle doing kmap_local() on individual pages.
> >
>
> Squashfs does both, and this depends on whether the decompression
> algorithm implementation in the kernel is multi-shot or single-shot.
>
> The zlib/xz/zstd decompressors are multi-shot, in that you can call them
> repeatedly, giving them a new input or output buffer when one runs out.
> This means you can get them to output into a 4K page at a time, without
> requiring the pages to be contiguous. kmap_local() can be called on each
> page before passing it to the decompressor.

While those compression libraries do provide streaming APIs, it's sort
of an illusion. They still need the uncompressed data in a virtually
contiguous buffer for the LZ77 match finding and copying to work. So,
internally they copy the uncompressed data into a virtually contiguous
buffer. I suspect that vmap() (or vm_map_ram() which is what f2fs uses)
is actually more efficient than these streaming APIs, since it avoids
the internal copy. But it would need to be measured.
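To make the "internal copy" point concrete, here is a minimal userspace
sketch. It is not taken from zlib, xz, or zstd: the token format and the
names toy_token, toy_decode_contig, toy_decode_paged, TOY_PAGE_SIZE and
TOY_WINDOW are all hypothetical. It assumes a toy LZ77 stream of literals
and (distance, length) matches, and it simplifies by decoding everything
into the window before flushing, where a real streaming decoder would
interleave the two. The match copy reads from earlier output at
out[pos - dist], which is why the history has to be virtually contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8   /* tiny "page" so the example is easy to trace */
#define TOY_WINDOW    64  /* capacity of the internal contiguous history */

struct toy_token {
    int     is_match;  /* 0: literal byte, 1: back-reference */
    uint8_t literal;   /* used when is_match == 0 */
    size_t  dist;      /* how far back the match starts */
    size_t  len;       /* how many bytes the match copies */
};

/* Decode straight into one contiguous buffer.
 * Returns the number of bytes produced, or 0 on malformed input/overflow. */
size_t toy_decode_contig(const struct toy_token *t, size_t n,
                         uint8_t *out, size_t cap)
{
    size_t pos = 0;

    for (size_t i = 0; i < n; i++) {
        if (!t[i].is_match) {
            if (pos >= cap)
                return 0;
            out[pos++] = t[i].literal;
            continue;
        }
        if (t[i].dist == 0 || t[i].dist > pos || t[i].len > cap - pos)
            return 0;
        /* Byte-by-byte so that overlapping matches (len > dist) work. */
        for (size_t k = 0; k < t[i].len; k++, pos++)
            out[pos] = out[pos - t[i].dist];
    }
    return pos;
}

/* Decode into separate, non-contiguous "pages". The match copy still needs
 * contiguous history, so this decodes into an internal window first and then
 * memcpy()s the result out page by page. *extra_copied reports how many
 * bytes that second copy moved. */
size_t toy_decode_paged(const struct toy_token *t, size_t n,
                        uint8_t **pages, size_t npages,
                        size_t *extra_copied)
{
    uint8_t window[TOY_WINDOW];
    size_t total, done = 0;

    assert(extra_copied != NULL);
    *extra_copied = 0;

    total = toy_decode_contig(t, n, window, sizeof(window));
    if (total > npages * TOY_PAGE_SIZE)
        return 0;

    for (size_t p = 0; done < total; p++) {
        size_t chunk = total - done;

        if (chunk > TOY_PAGE_SIZE)
            chunk = TOY_PAGE_SIZE;
        memcpy(pages[p], window + done, chunk);
        done += chunk;
    }
    *extra_copied = done;
    return total;
}
```

For the tokens 'a', 'b', 'c' followed by a match with dist 3 and len 9, both
functions produce "abcabcabcabc". The paged variant also reports 12 bytes
moved by memcpy(), which is the copy a vmap()'d output buffer would avoid.
Whether that saving outweighs the mapping cost in the kernel is exactly the
part that would need to be measured.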
> > So, my proposal is that filesystems tell the page cache that their minimum
> > folio size is the compression block size. That seems to be around 64k,
> > so not an unreasonable minimum allocation size. That removes all the
> > extra code in filesystems to allocate extra memory in the page cache.
> > It means we don't attempt to track dirtiness at a sub-folio granularity
> > (there's no point, we have to write back the entire compressed block
> > at once). We also get a single virtually contiguous block ... if you're
> > willing to ditch HIGHMEM support. Or there's a proposal to introduce a
> > vmap_file() which would give us a virtually contiguous chunk of memory
> > (and could be trivially turned into a noop for the case of trying to
> > vmap a single large folio).

... but of course, if we could get a virtually contiguous buffer
"for free" (at least in the !HIGHMEM case) as in the above proposal,
that would clearly be the best option.
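As a rough illustration of the arithmetic behind the minimum-folio-size
idea, the sketch below works out which folio order covers one compression
block. The helper name toy_min_folio_order is hypothetical and this is
plain userspace C, not kernel code. As far as I know, the in-kernel knob
for this is mapping_set_folio_min_order() in include/linux/pagemap.h, but
treat that as an assumption to check against your tree.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= block_size.
 * page_size must be a non-zero power of two; block sizes that are not a
 * power-of-two multiple of the page size are rounded up to the next order. */
unsigned int toy_min_folio_order(size_t block_size, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    while (span < block_size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

With 4K pages, a 64K compression block gives order 4 (16 pages per folio),
and a 128K block gives order 5.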
- Eric