ath11k and vfio-pci support

Baochen Qiang quic_bqiang at quicinc.com
Tue Jan 16 21:26:32 PST 2024



On 1/16/2024 9:05 PM, James Prestwood wrote:
> Hi Baochen,
> 
> On 1/14/24 4:37 AM, Baochen Qiang wrote:
>>
>>
>> On 1/12/2024 8:47 PM, James Prestwood wrote:
>>> Hi,
>>>
>>> On 1/11/24 6:04 PM, Baochen Qiang wrote:
>>>>
>>>>
>>>> On 1/11/2024 9:38 PM, James Prestwood wrote:
>>>>>
>>>>> On 1/11/24 5:11 AM, Kalle Valo wrote:
>>>>>> James Prestwood <prestwoj at gmail.com> writes:
>>>>>>
>>>>>>> Hi Kalle, Baochen,
>>>>>>>
>>>>>>> On 1/11/24 12:16 AM, Kalle Valo wrote:
>>>>>>>> Baochen Qiang <quic_bqiang at quicinc.com> writes:
>>>>>>>>
>>>>>>>>> On 1/10/2024 10:55 PM, James Prestwood wrote:
>>>>>>>>>> Hi Kalle,
>>>>>>>>>> On 1/10/24 5:49 AM, Kalle Valo wrote:
>>>>>>>>>>> James Prestwood <prestwoj at gmail.com> writes:
>>>>>>>>>>>
>>>>>>>>>>>>> But I have also no idea what is causing this, I guess we 
>>>>>>>>>>>>> are doing
>>>>>>>>>>>>> something wrong with the PCI communication? That reminds 
>>>>>>>>>>>>> me, you could
>>>>>>>>>>>>> try this in case that helps:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/
>>>>>>>>>>>> Heh, I saw this pop up a day after I sent this and was 
>>>>>>>>>>>> wondering. Is
>>>>>>>>>>>> this something I'd need on the host kernel, guest, or both?
>>>>>>>>>>> On the guest where ath11k is running. I'm not optimistic that 
>>>>>>>>>>> this would
>>>>>>>>>>> solve your issue, I suspect there can be also other bugs, but 
>>>>>>>>>>> good to
>>>>>>>>>>> know if the patch changes anything.
>>>>>>>>>> Looks the same here, didn't seem to change anything based on the
>>>>>>>>>> kernel logs.
>>>>>>>>>>
>>>>>>>>> Could you try this?
>>>>>>>>>
>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/ath/ath11k/pci.c?id=39564b475ac5a589e6c22c43a08cbd283c295d2c
>>>>>>>> This reminds me, I assumed James was testing with ath.git master 
>>>>>>>> branch
>>>>>>>> (which has that commit) but I never checked that. So for testing 
>>>>>>>> please
>>>>>>>> always use the master branch to get the latest and greatest ath11k:
>>>>>>>>
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/
>>>>>>>>
>>>>>>>> There's a quite long delay from ath.git to official releases.
>>>>>>> Good to know, and I was not in fact using that branch. Rebuilt from
>>>>>>> ath.git/master but still roughly the same behavior. There does 
>>>>>>> appear
>>>>>>> to be more output now though, specifically a firmware crash:
>>>>>>>
>>>>>>> [    2.281721] ath11k_pci 0000:00:06.0: failed to receive control
>>>>>>> response completion, polling..
>>>>>>> [    2.282101] ip (65) used greatest stack depth: 12464 bytes left
>>>>>>> [    3.306039] ath11k_pci 0000:00:06.0: Service connect timeout
>>>>>>> [    3.307588] ath11k_pci 0000:00:06.0: failed to connect to HTT: 
>>>>>>> -110
>>>>>>> [    3.309286] ath11k_pci 0000:00:06.0: failed to start core: -110
>>>>>>> [    3.519637] ath11k_pci 0000:00:06.0: firmware crashed: 
>>>>>>> MHI_CB_EE_RDDM
>>>>>>> [    3.519678] ath11k_pci 0000:00:06.0: ignore reset dev flags 
>>>>>>> 0x4000
>>>>>>> [    3.627087] ath11k_pci 0000:00:06.0: firmware crashed: 
>>>>>>> MHI_CB_EE_RDDM
>>>>>>> [    3.627129] ath11k_pci 0000:00:06.0: ignore reset dev flags 
>>>>>>> 0x4000
>>>>>>> [   13.802105] ath11k_pci 0000:00:06.0: failed to wait wlan mode
>>>>>>> request (mode 4): -110
>>>>>>> [   13.802175] ath11k_pci 0000:00:06.0: qmi failed to send wlan mode
>>>>>>> off: -110
>>>>>> Ok, that's progress now. Can you try next try the iommu patch[1] we
>>>>>> talked about earlier? It's already in master-pending branch (along 
>>>>>> with
>>>>>> other pending patches) so you can use that branch if you want.
>>>>>>
>>>>>> [1] 
>>>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20231212031914.47339-1-imguzh@gmail.com/
>>>>>
>>>>> Same result unfortunately, tried both with just [1] applied to 
>>>>> ath.git and at HEAD of master-pending.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> James
>>>> Strange that still fails. Are you now seeing this error in your host 
>>>> or your Qemu? or both?
>>>> Could you share your test steps? And if you can share please be as 
>>>> detailed as possible since I'm not familiar with passing WLAN 
>>>> hardware to a VM using vfio-pci.
>>>
>>> Just in Qemu, the hardware works fine on my host machine.
>>>
>>> I basically follow this guide to set it up, its written in the 
>>> context of GPUs/libvirt but the host setup is exactly the same. By no 
>>> means do you need to read it all, once you set the vfio-pci.ids and 
>>> see your unclaimed adapter you can stop:
>>>
>>> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
>>>
>>> In short you should be able to set the following host kernel options 
>>> and reboot (assuming your motherboard/hardware is compatible):
>>>
>>> intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103
>>>
>>> Obviously change the device/vendor IDs to whatever ath11k hw you 
>>> have. Once the host is rebooted you should see your wlan adapter as 
>>> UNCLAIMED, showing the driver in use as vfio-pci. If not, its likely 
>>> your motherboard just isn't compatible, the device has to be in its 
>>> own IOMMU group (you could try switching PCI ports if this is the case).
>>>
>>> I then build a "kvm_guest.config" kernel with the driver/firmware for 
>>> ath11k and boot into that with the following Qemu options:
>>>
>>> -enable-kvm -device -vfio-pci,host=<PCI address>
>>>
>>> If it seems easier you could also utilize IWD's test-runner which 
>>> handles launching the Qemu kernel automatically, detecting any 
>>> vfio-devices and passes them through and mounts some useful host 
>>> folders into the VM. Its actually a very good general purpose tool 
>>> for kernel testing, not just for IWD:
>>>
>>> https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt
>>>
>>> Once set up you can just run test-runner with a few flags and you'll 
>>> boot into a shell:
>>>
>>> ./tools/test-runner -k <kernel-image> --hw --start /bin/bash
>>>
>>> Please reach out if you have questions, thanks for looking into this.
>>>
>> Thanks for these details. I reproduced this issue by following your 
>> guide.
>>
>> Seems the root cause is that the MSI vector assigned to WCN6855 in 
>> qemu is different with that in host. In my case the MSI vector in qemu 
>> is [Address: fee00000  Data: 0020] while in host it is [Address: 
>> fee00578 Data: 0000]. So in qemu ath11k configures MSI vector 
>> [Address: fee00000 Data: 0020] to WCN6855 hardware/firmware, and 
>> firmware uses that vector to fire interrupts to host/qemu. However 
>> host IOMMU doesn't know that vector because the real vector is 
>> [Address: fee00578  Data: 0000], as a result host blocks that 
>> interrupt and reports an error, see below log:
>>
>> [ 1414.206069] DMAR: DRHD: handling fault status reg 2
>> [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
>> [ 1414.210334] DMAR: DRHD: handling fault status reg 2
>> [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
>> [ 1414.212496] DMAR: DRHD: handling fault status reg 2
>> [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 
>> 0x0 [fault reason 0x25] Blocked a compatibility format interrupt request
>> [ 1414.214600] DMAR: DRHD: handling fault status reg 2
>>
>> While I don't think there is a way for qemu/ath11k to get the real MSI 
>> vector from host, I will try to read the vfio code to check further. 
>> Before that, to unblock you, a possible hack is to hard code the MSI 
>> vector in qemu to the same as in host, on condition that the MSI 
>> vector doesn't change. In my case, the change looks like
>>
>> diff --git a/drivers/net/wireless/ath/ath11k/pci.c 
>> b/drivers/net/wireless/ath/ath11k/pci.c
>> index 09e65c5e55c4..89a9bbe9e4d2 100644
>> --- a/drivers/net/wireless/ath/ath11k/pci.c
>> +++ b/drivers/net/wireless/ath/ath11k/pci.c
>> @@ -459,7 +459,12 @@ static int ath11k_pci_alloc_msi(struct ath11k_pci 
>> *ab_pci)
>>                 ab->pci.msi.addr_hi = 0;
>>         }
>>
>> -       ath11k_dbg(ab, ATH11K_DBG_PCI, "msi base data is %d\n", 
>> ab->pci.msi.ep_base_data);
>> +       ab->pci.msi.addr_hi = 0;
>> +       ab->pci.msi.addr_lo = 0xfee00578;
>> +       ath11k_dbg(ab, ATH11K_DBG_PCI, "msi addr hi 0x%x lo 0x%x base 
>> data is %d\n",
>> +                  ab->pci.msi.addr_hi,
>> +                  ab->pci.msi.addr_lo,
>> +                  ab->pci.msi.ep_base_data);
>>
>>         return 0;
>>
>> @@ -487,6 +492,7 @@ static int ath11k_pci_config_msi_data(struct 
>> ath11k_pci *ab_pci)
>>         }
>>
>>         ab_pci->ab->pci.msi.ep_base_data = msi_desc->msg.data;
>> +       ab_pci->ab->pci.msi.ep_base_data = 0;
>>
>>         ath11k_dbg(ab_pci->ab, ATH11K_DBG_PCI, "after request_irq 
>> msi_ep_base_data %d\n",
>>                    ab_pci->ab->pci.msi.ep_base_data);
>>
>>
>> This hack works on my setup.
> 
> Progress! Thank you. This didn't work for me but its likely because my 
> host MSI vector is not fee00578. Where did you come up with this value? 
It could, and most likely, be different from machine to machine.

> I don't see anything in the dmesg logs, or in lspci etc.
> 
fee00578 is the physical MSI vector so I got it using lspci in host, see
...
         Capabilities: [50] MSI: Enable+ Count=1/32 Maskable+ 64bit-
                 Address: fee00578  Data: 0000
                 Masking: fffffffe  Pending: 00000000
...

> Thanks,
> 
> James
> 



More information about the ath11k mailing list