[RFC] ARM/ARM64: KVM: Implement KVM_FLUSH_DCACHE_GPA ioctl
Jérémy Fanguède
j.fanguede at virtualopensystems.com
Thu May 7 09:56:24 PDT 2015
On Thu, May 7, 2015 at 5:34 PM, Christoffer Dall
<christoffer.dall at linaro.org> wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jérémy Fanguède
> <j.fanguede at virtualopensystems.com> wrote:
>> On Thu, May 7, 2015 at 1:20 PM, Christoffer Dall
>> <christoffer.dall at linaro.org> wrote:
>>> On Thu, May 07, 2015 at 12:50:50PM +0200, Jérémy Fanguède wrote:
>>>> On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall
>>>> <christoffer.dall at linaro.org> wrote:
>>>> > Hi Jérémy,
>>>> >
>>>> > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote:
>>>> >> To maintain cache coherency on ARM, we may need a mechanism to flush
>>>> >> the data cache.
>>>> >
>>>> > In addition to generally just making this functionality available (see
>>>> > below), do you have an actual use case in mind for this? To solve the
>>>> > VGA issue, for example, we already have a patch series from Drew trying
>>>> > to address this. Does that not work for you?
>>>> >
>>>> > There was a long discussion about this here:
>>>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html
>>>> >
>>>> > Drew then created a patch set, here:
>>>> > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html
>>>> >
>>>> > and replied to himself, here:
>>>> > https://www.marc.info/?l=android-virt&m=142670523929132&w=3
>>>> >
>>>> > Which basically says that he doesn't like having to do flushes all over
>>>> > QEMU (IIUC), so he sent this version instead:
>>>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html
>>>> >
>>>> > Which he now said he'd respin.
>>>>
>>>> In fact, I used this ioctl together with this QEMU patch series:
>>>> https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html
>>>> My current work doesn't do anything about VGA RAM, so the VGA issue
>>>> probably still persists, but it solves other issues with some
>>>> emulated devices (mainly PCI) which were failing before and now work
>>>> fine with this patch.
>>>
>>> Why does Drew's approach not work and your approach works here? What is
>>> the case that we haven't thought about yet?
>>
>> The first patch from Andrew (which is for arm64 only) doesn't let me
>> get some emulated PCI devices working with virt, probably because
>> some flushes/cleans are missing.
>> As for the second patch, it focuses, for now, only on VGA RAM. I
>> quickly tried to extend the KVM_MEM_UNCACHED flag to the whole guest
>> memory, but then the guest fails to boot. Even if it worked, would
>> it make sense to map all of the guest's RAM as uncached, given that
>> we cannot guess which region of guest memory will be accessed?
>>
>> Simple PCI devices like e1000 or usb-ehci (with usb-kbd for instance)
>> are not usable, with or without these patches. But if I flush a
>> precise memory range on each read and write performed by emulated
>> devices on guest RAM (which is exactly what the QEMU patch series I
>> sent does, using this ioctl), emulated PCI devices now work.
>
> I understand all this. What I'd like for us to find out is why we are
> having coherency issues. We knew that for the VGA adapter, the guest
> maps the memory as uncached (because that's how the real hardware
> works), and QEMU maps the memory as cached (because it's just normal
> memory), and unsurprisingly the two views of that memory are not
> coherent.
>
> What are the cases you are seeing with e1000 or usb-ehci?

USB devices fail with a timeout error, as if the communication between
the kernel and the devices fails at a certain point:

  usb 1-1: device not accepting address 5, error -110
  usb usb1-port1: unable to enumerate USB device

e1000 fails when userspace tries to use it, with this type of
kernel message:

  e1000 0000:00:02.0 eth0: Detected Tx Unit Hang
    Tx Queue             <0>
    TDH                  <d>
    TDT                  <d>
    next_to_use          <d>
    next_to_clean        <9>
  buffer_info[next_to_clean]
    time_stamp           <ffff9311>
    next_to_watch        <a>
    jiffies              <ffff956a>
    next_to_watch.status <0>

>
> Hint: We can make a lot of things work by just sticking cache flushes
> all over, but it's not a good engineering approach.

Yes, it's probably not the best approach here, but currently it's the
only solution that I have to make some PCI devices usable on ARM.
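
To make the "flush a precise memory range" idea concrete, here is a minimal
sketch of one step it needs: rounding a guest-physical range out to
cache-line boundaries, so that a range-based clean/invalidate covers every
line the emulated device touched. The struct, the function name and the
64-byte line size are assumptions made for this illustration only; they are
not taken from the RFC patch or from the KVM API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a guest-physical range (not the RFC's ABI). */
struct gpa_range {
    uint64_t start; /* guest-physical address of the first byte */
    uint64_t len;   /* length in bytes */
};

/* Assumed line size for the example; real code would query the hardware. */
#define ASSUMED_LINE_SIZE 64u

/*
 * Expand a range so that it starts and ends on cache-line boundaries.
 * 'line' must be a non-zero power of two.
 */
static struct gpa_range align_to_cache_lines(struct gpa_range r, uint64_t line)
{
    struct gpa_range out;
    uint64_t end;

    assert(line != 0 && (line & (line - 1)) == 0);

    end = r.start + r.len;                        /* one past the last byte */
    out.start = r.start & ~(line - 1);            /* round start down */
    out.len = ((end + line - 1) & ~(line - 1))    /* round end up */
              - out.start;
    return out;
}
```

With these assumptions, a 0x30-byte access at 0x40001010 expands to the
single line at 0x40001000, while an 8-byte access at 0x4000103c straddles
two lines and expands to 0x80 bytes.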