ath11k and vfio-pci support

Baochen Qiang quic_bqiang at quicinc.com
Sun Jan 14 04:37:03 PST 2024



On 1/12/2024 8:47 PM, James Prestwood wrote:
> Hi,
> 
> On 1/11/24 6:04 PM, Baochen Qiang wrote:
>>
>>
>> On 1/11/2024 9:38 PM, James Prestwood wrote:
>>>
>>> On 1/11/24 5:11 AM, Kalle Valo wrote:
>>>> James Prestwood <prestwoj at gmail.com> writes:
>>>>
>>>>> Hi Kalle, Baochen,
>>>>>
>>>>> On 1/11/24 12:16 AM, Kalle Valo wrote:
>>>>>> Baochen Qiang <quic_bqiang at quicinc.com> writes:
>>>>>>
>>>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote:
>>>>>>>> Hi Kalle,
>>>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote:
>>>>>>>>> James Prestwood <prestwoj at gmail.com> writes:
>>>>>>>>>
>>>>>>>>>>> But I have also no idea what is causing this, I guess we are 
>>>>>>>>>>> doing
>>>>>>>>>>> something wrong with the PCI communication? That reminds me, 
>>>>>>>>>>> you could
>>>>>>>>>>> try this in case that helps:
>>>>>>>>>>>
>>>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/
>>>>>>>>>> Heh, I saw this pop up a day after I sent this and was 
>>>>>>>>>> wondering. Is
>>>>>>>>>> this something I'd need on the host kernel, guest, or both?
>>>>>>>>> On the guest where ath11k is running. I'm not optimistic that 
>>>>>>>>> this would
>>>>>>>>> solve your issue, I suspect there can be also other bugs, but 
>>>>>>>>> good to
>>>>>>>>> know if the patch changes anything.
>>>>>>>> Looks the same here, didn't seem to change anything based on the
>>>>>>>> kernel logs.
>>>>>>>>
>>>>>>> Could you try this?
>>>>>>>
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c
>>>>>> This reminds me, I assumed James was testing with ath.git master 
>>>>>> branch
>>>>>> (which has that commit) but I never checked that. So for testing 
>>>>>> please
>>>>>> always use the master branch to get the latest and greatest ath11k:
>>>>>>
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/
>>>>>>
>>>>>> There's a quite long delay from ath.git to official releases.
>>>>> Good to know, and I was not in fact using that branch. Rebuilt from
>>>>> ath.git/master but still roughly the same behavior. There does appear
>>>>> to be more output now though, specifically a firmware crash:
>>>>>
>>>>> [    2.281721] ath11k_pci 0000:00:06.0: failed to receive control
>>>>> response completion, polling..
>>>>> [    2.282101] ip (65) used greatest stack depth: 12464 bytes left
>>>>> [    3.306039] ath11k_pci 0000:00:06.0: Service connect timeout
>>>>> [    3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: -110
>>>>> [    3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110
>>>>> [    3.519637] ath11k_pci 0000:00:06.0: firmware crashed: 
>>>>> MHI_CB_EE_RDDM
>>>>> [    3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000
>>>>> [    3.627087] ath11k_pci 0000:00:06.0: firmware crashed: 
>>>>> MHI_CB_EE_RDDM
>>>>> [    3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 0x4000
>>>>> [   13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode
>>>>> request (mode 4): -110
>>>>> [   13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode
>>>>> off: -110
>>>> Ok, that's progress now. Can you try next try the iommu patch[1] we
>>>> talked about earlier? It's already in master-pending branch (along with
>>>> other pending patches) so you can use that branch if you want.
>>>>
>>>> [1] 
>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/
>>>
>>> Same result unfortunately, tried both with just [1] applied to 
>>> ath.git and at HEAD of master-pending.
>>>
>>> Thanks,
>>>
>>> James
>> Strange that still fails. Are you now seeing this error in your host 
>> or your Qemu? or both?
>> Could you share your test steps? And if you can share please be as 
>> detailed as possible since I'm not familiar with passing WLAN hardware 
>> to a VM using vfio-pci.
> 
> Just in Qemu, the hardware works fine on my host machine.
> 
> I basically follow this guide to set it up, its written in the context 
> of GPUs/libvirt but the host setup is exactly the same. By no means do 
> you need to read it all, once you set the vfio-pci.ids and see your 
> unclaimed adapter you can stop:
> 
> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
> 
> In short you should be able to set the following host kernel options and 
> reboot (assuming your motherboard/hardware is compatible):
> 
> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103
> 
> Obviously change the device/vendor IDs to whatever ath11k hw you have. 
> Once the host is rebooted you should see your wlan adapter as UNCLAIMED, 
> showing the driver in use as vfio-pci. If not, its likely your 
> motherboard just isn't compatible, the device has to be in its own IOMMU 
> group (you could try switching PCI ports if this is the case).
> 
> I then build a "kvm_guest.config" kernel with the driver/firmware for 
> ath11k and boot into that with the following Qemu options:
> 
> -enable-kvm -device -vfio-pci,host=<PCI address>
> 
> If it seems easier you could also utilize IWD's test-runner which 
> handles launching the Qemu kernel automatically, detecting any 
> vfio-devices and passes them through and mounts some useful host folders 
> into the VM. Its actually a very good general purpose tool for kernel 
> testing, not just for IWD:
> 
> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt
> 
> Once set up you can just run test-runner with a few flags and you'll 
> boot into a shell:
> 
> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash
> 
> Please reach out if you have questions, thanks for looking into this.
> 
Thanks for these details. I reproduced this issue by following your guide.

Seems the root cause is that the MSI vector assigned to WCN6855 in qemu 
is different with that in host. In my case the MSI vector in qemu is 
[Address: fee00000  Data: 0020] while in host it is [Address: fee00578 
Data: 0000]. So in qemu ath11k configures MSI vector [Address: fee00000 
Data: 0020] to WCN6855 hardware/firmware, and firmware uses that vector 
to fire interrupts to host/qemu. However host IOMMU doesn't know that 
vector because the real vector is [Address: fee00578  Data: 0000], as a 
result host blocks that interrupt and reports an error, see below log:

[ 1414.206069] DMAR: DRHD: handling fault status reg 2
[ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.210334] DMAR: DRHD: handling fault status reg 2
[ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.212496] DMAR: DRHD: handling fault status reg 2
[ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
[ 1414.214600] DMAR: DRHD: handling fault status reg 2

While I don't think there is a way for qemu/ath11k to get the real MSI 
vector from host, I will try to read the vfio code to check further. 
Before that, to unblock you, a possible hack is to hard code the MSI 
vector in qemu to the same as in host, on condition that the MSI vector 
doesn't change. In my case, the change looks like

diff --git a/drivers/net/wireless/ath/ath11k/pci.c 
b/drivers/net/wireless/ath/ath11k/pci.c
index 09e65c5e55c4..89a9bbe9e4d2 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci 
*ab_pci)
                 ab->pci.msi.addr_hi = 0;
         }

-       ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", 
ab->pci.msi.ep_base_data);
+       ab->pci.msi.addr_hi = 0;
+       ab->pci.msi.addr_lo = 0xfee00578;
+       ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base 
data is %d\n",
+                  ab->pci.msi.addr_hi,
+                  ab->pci.msi.addr_lo,
+                  ab->pci.msi.ep_base_data);

         return 0;

@@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct 
ath11k_pci *ab_pci)
         }

         ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data;
+       ab_pci->ab->pci.msi.ep_base_data = 0;

         ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq 
msi_ep_base_data %d\n",
                    ab_pci->ab->pci.msi.ep_base_data);


This hack works on my setup.




More information about the ath11k mailing list