Data corruption when using multiple devices with NVMEoF TCP

Sagi Grimberg sagi at grimberg.me
Mon Jan 11 20:29:37 EST 2021


> Hey Hao,
> 
>> Here is the entire log (and it's a new one, i.e. above snippet not 
>> included):
>> https://drive.google.com/file/d/16ArIs5-Jw4P2f17A_ftKLm1A4LQUFpmg/view?usp=sharing 
>>
>>
>> What I found is that the data corruption does not always happen, especially 
>> when I copy a small directory. So I guess a lot of the log entries should 
>> just look fine.
> 
> So this seems to be a breakage that has existed for some time now with
> multipage bvecs and that you have been the first one to report. It
> seems to be related to bio merges, though it is strange to me that
> this is only coming up now; perhaps it is the combination with
> raid0 that triggers it, I'm not sure.

OK, I think I understand what is going on. With multipage bvecs,
bios can be split in the middle of a bvec entry and then merged
back with another bio.

The issue is that in that case we are not capping the send length
calculation for the last bvec entry.
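
To make that concrete, here is a minimal userspace sketch (plain C with
made-up byte counts, not the kernel code itself) of the length arithmetic:
when the iterator has fewer bytes left (iter.count) than the current bvec
entry still spans, the old calculation picks the bvec length and sends too
much, while the extra cap on iter.count gives the correct length.
--
#include <stddef.h>
#include <stdio.h>

static size_t min_sz(size_t a, size_t b)
{
	return a < b ? a : b;
}

int main(void)
{
	/* hypothetical numbers for a bio split mid-bvec and then merged */
	size_t iter_count = 512;  /* bytes actually left in req->iter */
	size_t bvec_left = 4096;  /* bv_len - iov_offset of current bvec */
	size_t pdu_left = 8192;   /* pdu_len - pdu_sent */

	/* old calculation: ignores iter.count, returns 4096 */
	size_t old_len = min_sz(bvec_left, pdu_left);

	/* capped calculation: also bounded by iter.count, returns 512 */
	size_t new_len = min_sz(iter_count, min_sz(bvec_left, pdu_left));

	printf("old=%zu new=%zu\n", old_len, new_len);
	return 0;
}
--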

I think that just this change can also resolve the issue:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 973d5d683180..c6b0a189a494 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -201,8 +201,9 @@ static inline size_t nvme_tcp_req_cur_offset(struct nvme_tcp_request *req)
 
 static inline size_t nvme_tcp_req_cur_length(struct nvme_tcp_request *req)
 {
-	return min_t(size_t, req->iter.bvec->bv_len - req->iter.iov_offset,
-			req->pdu_len - req->pdu_sent);
+	return min_t(size_t, req->iter.count,
+			min_t(size_t, req->iter.bvec->bv_len - req->iter.iov_offset,
+				req->pdu_len - req->pdu_sent));
 }
 
 static inline size_t nvme_tcp_pdu_data_left(struct nvme_tcp_request *req)
--


