trying to gain understanding on BLKRRPART and "failed to re-read partition table" error
Keith Busch
kbusch at kernel.org
Fri Jan 6 11:46:42 PST 2023
On Mon, Jan 02, 2023 at 04:52:28PM -0600, Nick Neumann wrote:
> I've been using nvme-cli 1.9 (default on ubuntu 18.04 LTS) for several
> months formatting nvme drives repeatedly without issue before running
> device-level benchmarks with fio. I always format passing in the block
> device, e.g., /dev/nvme0n1, and haven't had any issues.
>
> I recently started playing with creating file systems on the drives to
> do file-based benchmarking. After doing this, I've noticed that
> intermittently the format on a drive that has had partitions,
> filesystem, and files created will fail with code ECOMM on the
> re-reading of the partition table. However, I've struggled to make the
> failure happen reliably. This is the error I'm talking about:
>
> https://github.com/linux-nvme/nvme-cli/blob/c3db2bfda5346f68344a9e6d795319a7bf35d19e/nvme.c#L5212
>
> The current nvme code has multiple guards on running the problematic
> BLKRRPART, and one of them, "cfg.lbaf != prev_lbaf", is enough that
> current nvme-cli would not run the BLKRRPART for my nvme formats.
>
> But I'm hoping to understand whether these guards actually fix the
> issue I'm hitting in 1.9, or would just mask it. That is, upgrading
> to current nvme-cli would make the error go away, but it seems
> possible that a change in how I format that causes BLKRRPART to run
> would bring it back. Are the added protections for the block device
> format case meant to avoid issues like the one I am hitting? Or is it
> unusual that I'm seeing such an error?
>
> In case it helps, the system does not have the drive or namespace
> mounted in any way, and the format is being issued immediately after
> the system is resuming from a suspend. Happy to provide any more
> details that might help.
There really is no need for user space to do the BLKRRPART ioctl
anymore. The kernel takes care of the rescan automatically, so unless
you've a really old kernel, we might be better off just removing the
ioctl.
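
For anyone who wants to reproduce the failure outside of nvme-cli, the
ioctl is easy to issue by hand. The following is only a minimal sketch,
not what nvme-cli itself does: the helper name reread_partition_table
is made up for illustration, and the request number is computed from
the _IO(0x12, 95) definition of BLKRRPART in <linux/fs.h>.

```python
import fcntl
import os

# BLKRRPART is _IO(0x12, 95) in <linux/fs.h>, i.e. (0x12 << 8) | 95.
BLKRRPART = (0x12 << 8) | 95


def reread_partition_table(dev_path):
    """Ask the kernel to re-read the partition table of dev_path.

    Hypothetical helper for illustration only.
    Returns 0 on success, or the errno value on failure.
    """
    try:
        fd = os.open(dev_path, os.O_RDONLY)
    except OSError as exc:
        # Could not open the device node at all.
        return exc.errno
    try:
        # No argument is needed; the ioctl acts on the whole device.
        fcntl.ioctl(fd, BLKRRPART)
        return 0
    except OSError as exc:
        # The kernel refused the rescan, e.g. EBUSY if it considers
        # the device to be in use.
        return exc.errno
    finally:
        os.close(fd)
```

Running reread_partition_table("/dev/nvme0n1") as root right after a
format, and passing the return value to os.strerror(), should show
whether the kernel itself is refusing the rescan, independent of the
nvme-cli version.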