Buffer I/O Errors from Zoned NVME devices

Damien Le Moal Damien.LeMoal at wdc.com
Mon Feb 1 16:04:44 EST 2021


On 2021/02/02 3:03, hch at lst.de wrote:
> On Mon, Feb 01, 2021 at 09:53:06AM -0800, Keith Busch wrote:
>> On Mon, Feb 01, 2021 at 02:36:12PM +0000, Jeffrey Lien wrote:
>>> Christoph, Keith
>>> We're seeing a lot of these Buffer I/O errors with our zoned NVMe devices. One of the FW developers looked into it and gave the following explanation:
>>> All of these reads are issued by the kernel during enumeration, and they target LBAs in the last zone's hole, so they are expected to return a boundary error, which the kernel then logs.
>>>
>>> [65281.936988] Buffer I/O error on dev nvme1n2, logical block 3800039296, async page read
>>> [65281.937165] blk_update_request: I/O error, dev nvme1n2, sector 3800039297 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>>> [65281.937166] Buffer I/O error on dev nvme1n2, logical block 3800039297, async page read
>>> [65281.937335] blk_update_request: I/O error, dev nvme1n2, sector 3800039298 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>>> [65281.937336] Buffer I/O error on dev nvme1n2, logical block 3800039298, async page read
>>> [65281.937498] blk_update_request: I/O error, dev nvme1n2, sector 3800039299 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>>>
>>> Are you aware of this issue and, if so, do you have any recommendations on how to avoid or resolve it?
>>
>> Is this from the partition scanning? We don't partition zoned devices,
>> so I think we can skip it. Does the following resolve the issue?
> 
> We already have special zoned device handling in the partitioning code.

Partitions are ignored and a warning is printed, but the partition table is
still being read...
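
For reference, a minimal sketch of the arithmetic behind the "hole" mentioned
in the FW developer's explanation, assuming each zone spans zone_size sectors
of which only the first zone_capacity are usable. The function name and the
numbers are hypothetical and only illustrate the idea; they are not taken from
the device in the log. A read near the end of the device (for example, a
lookup of a backup GPT header in the last LBA) would land in the last zone,
and past its capacity if the zone has a hole.

```python
def in_zone_hole(sector, zone_size, zone_capacity):
    """Return True if `sector` lies past the usable capacity of its zone.

    All values are in the same unit (e.g. 512-byte sectors). The hole is the
    range [zone_capacity, zone_size) within each zone.
    """
    if not 0 < zone_capacity <= zone_size:
        raise ValueError("need 0 < zone_capacity <= zone_size")
    # Offset of the sector from the start of the zone that contains it.
    offset_in_zone = sector % zone_size
    return offset_in_zone >= zone_capacity


if __name__ == "__main__":
    # Hypothetical geometry: 1024-sector zones with 768 usable sectors each.
    for s in (0, 767, 768, 1023, 1024):
        print(s, in_zone_hole(s, zone_size=1024, zone_capacity=768))
```

With that hypothetical geometry, sectors 768 through 1023 of every zone fall
in the hole, so any read of the tail of the last zone would hit it.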

> 
> But NVMe should make sure to never span a zone boundary as we set the
> chunk size to avoid that.
> 
> What kernel version is this?  Is CONFIG_BLK_DEV_ZONED enabled?

I had a very similar problem doing zonefs tests on Matias' machine with a ZNS
drive last week. The problem was the firmware... Upgrading to the latest
version fixed the issue. I'm not sure what FW rev you are running here, but
upgrading might solve this.
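
A rough sketch of how to collect the details asked about in this thread
(firmware revision and zoned model) from sysfs. It assumes the controller is
nvme1 and the namespace is nvme1n2, as in the log above, and that the kernel
exposes the firmware_rev, queue/zoned and queue/chunk_sectors attributes; the
helper names are made up for illustration, and missing attributes are reported
as None rather than treated as errors.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped content of a sysfs attribute, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def device_summary(ctrl="nvme1", ns="nvme1n2", sysfs="/sys"):
    """Collect firmware revision and zoned attributes for one namespace.

    `ctrl` and `ns` are assumptions taken from the log; adjust as needed.
    """
    root = Path(sysfs)
    queue = root / "block" / ns / "queue"
    return {
        # Firmware revision reported by the controller.
        "firmware_rev": read_attr(root / "class" / "nvme" / ctrl / "firmware_rev"),
        # "none", "host-aware" or "host-managed".
        "zoned": read_attr(queue / "zoned"),
        # For zoned devices, the zone size in 512-byte sectors.
        "chunk_sectors": read_attr(queue / "chunk_sectors"),
    }


if __name__ == "__main__":
    for key, value in device_summary().items():
        print(f"{key}: {value}")
```

If "zoned" comes back as "none" or None for a ZNS namespace, that would
suggest the kernel was built without CONFIG_BLK_DEV_ZONED or does not see the
namespace as zoned.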

-- 
Damien Le Moal
Western Digital Research


