[PATCH] nvme-multipath: implement active-active round-robin path selector

Eric H. Chang echang at sk.com
Thu Apr 5 03:11:59 PDT 2018


> While your experiment is setup to benefit from round-robin, my only concern is it has odd performance in a real world scenario with IO threads executing in different nodes. Christoph's proposal will naturally utilize both paths optimally there, where round-robin will saturate node interlinks.
>
>Not that I'm against having the choice; your setup probably does represent real use. But if we're going to have multiple choice, user documentation on nvme path selectors will be useful here.


Apologies for the confusion.
My understanding is that your concern is about using CPU nodes efficiently on NUMA systems.
Our concern sits one level lower, at the multi-port PCIe retimers (or multi-port RDMA NICs).


<Host structure with dual retimers>
CPU node A ------------------- CPU node B 
  |                                    |
PCIe retimer A                     PCIe retimer B
  |                                    |
Port A | Port B                    Port A | Port B


<Host structure with a single retimer>
CPU node A ------------------- CPU node B 
  |                                
PCIe retimer A                  
  |                                
Port A | Port B                 


In either structure above, we would like to use all available ports efficiently to maximize performance. The same applies to hosts using NVMe-oF storage, where RDMA NICs take the place of the PCIe retimers. So we would also like to discuss an acceptable multipath implementation across the multiple ports of PCIe retimers or RNICs.

Your concern about unbalanced execution of IO threads across nodes is valid too, so I believe that combining your proposal on NUMA with our proposal (or anyone else's) on multi-port could be the most effective solution.

In our setup the system architecture behind every port is symmetrical, so the round-robin approach still makes sense. If there is a better alternative to consider, it would be good to discuss it.
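
To make the round-robin idea concrete, here is a minimal user-space sketch of rotating IO across the live ports of a symmetrical setup. It is not the code from the patch; `struct path`, `struct path_set`, `rr_select()` and `MAX_PATHS` are hypothetical names made up for illustration, and locking and NUMA affinity are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATHS 8	/* hypothetical upper bound, for illustration only */

/* One port of a retimer or RNIC, as seen by the host (hypothetical type). */
struct path {
	int port_id;
	bool live;	/* false once the path has failed or been removed */
};

/* All paths to one namespace, plus the index of the path used last. */
struct path_set {
	struct path paths[MAX_PATHS];
	size_t count;
	size_t last;
};

/*
 * Return the next live path after the one used last, wrapping around.
 * Returns NULL when the set is empty or no path is live.
 */
static struct path *rr_select(struct path_set *set)
{
	assert(set != NULL && set->count <= MAX_PATHS);

	for (size_t i = 1; i <= set->count; i++) {
		size_t idx = (set->last + i) % set->count;

		if (set->paths[idx].live) {
			set->last = idx;
			return &set->paths[idx];
		}
	}
	return NULL;
}
```

With two live ports, successive calls alternate between them; if one port goes down, every call falls back to the remaining port. A combined policy could presumably apply the same rotation only among the paths local to the submitting CPU's node, but that is an assumption rather than something either proposal specifies.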

