[PATCH v3 5/7] ARM: KVM: rework HYP page table freeing
Marc Zyngier
marc.zyngier at arm.com
Thu Apr 18 03:13:10 EDT 2013
On Wed, 17 Apr 2013 22:23:21 -0700, Christoffer Dall
<cdall at cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 1:00 AM, Marc Zyngier <marc.zyngier at arm.com> wrote:
>
>> On Sun, 14 Apr 2013 23:51:55 -0700, Christoffer Dall
>> <cdall at cs.columbia.edu> wrote:
>> > On Fri, Apr 12, 2013 at 8:18 AM, Marc Zyngier <marc.zyngier at arm.com> wrote:
>> >> There is no point in freeing HYP page tables differently from Stage-2.
>> >> They now have the same requirements, and should be dealt with the same
>> >> way.
>> >>
>> >> Promote unmap_stage2_range to be The One True Way, and get rid of a number
>> >> of nasty bugs in the process (goo thing we never actually called
>> >> free_hyp_pmds
>> >
>> > could you remind me, did you already point out these nasty bugs
>> > somewhere or did we discuss them in an older thread?
>>
>> No, I decided it wasn't worth the hassle when I spotted them (especially as
>> we're moving away from section mapped idmap)... But for the record:
>>
>> <quote>
>> static void free_hyp_pgd_entry(pgd_t *pgdp, unsigned long addr)
>> {
>>         pgd_t *pgd;
>>         pud_t *pud;
>>         pmd_t *pmd;
>>
>>         pgd = pgdp + pgd_index(addr);
>>         pud = pud_offset(pgd, addr);
>>
>>         if (pud_none(*pud))
>>                 return;
>>         BUG_ON(pud_bad(*pud));
>>
>>         pmd = pmd_offset(pud, addr);
>>         free_ptes(pmd, addr);
>>         pmd_free(NULL, pmd);
>>         ^^^^^^^^^^^^^^^^^^^^<-- BUG_ON(pmd not page aligned)
>>
>
> This would never be non-page-aligned for Hyp page tables where PMDs are
> always 4K in size, that was the assumption.
Well, look at the pmd variable. As the result of pmd_offset(), it gives
you a pointer to a PMD *entry*, not to a PMD *page*. That entry can be
anywhere in the page, depending on which 2MB section of the 1GB range
covered by the table the address falls in.
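To make the entry-vs-page distinction concrete, the call would have needed
something along these lines first (just a sketch, not code from the series;
the local names are made up, pud/addr are reused from the function above,
and all it does is mask the entry pointer back to the page-aligned base of
the table):

pmd_t *entry = pmd_offset(pud, addr);  /* points at &table[pmd_index(addr)] */
pmd_t *table = (pmd_t *)((unsigned long)entry & PAGE_MASK);

pmd_free(NULL, table);                 /* wants the page-aligned table */

(pmd_free() BUG()s on anything that isn't page aligned, hence the annotation
above.)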
>
>>         pud_clear(pud);
>> }
>> </quote>
>>
>> Now, if you decide to fix the above by forcing the page alignment:
>>
>> <quote>
>> static void free_ptes(pmd_t *pmd, unsigned long addr)
>> {
>>         pte_t *pte;
>>         unsigned int i;
>>
>>         for (i = 0; i < PTRS_PER_PMD; i++, addr += PMD_SIZE) {
>>                         ^^^^^^^^^^^^^<-- start freeing memory outside of the
>>                                          (unaligned) pmd...
>>
>
> freeing what outside the pmd? I don't see this.
Remember the above (PMD entry vs PMD page)? This code assumes it gets a
PMD page. To work properly, it would read:
for (i = pmd_index(addr); i < PTRS_PER_PMD; ....
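Spelled out in full, the fix would have looked something like this (just a
sketch of what we never bothered writing, and it still wouldn't address the
refcounting problem below):

static void free_ptes(pmd_t *pmd, unsigned long addr)
{
        pte_t *pte;
        unsigned int i;

        /* start at the entry matching addr, not at slot 0 of the PMD page */
        for (i = pmd_index(addr); i < PTRS_PER_PMD; i++, addr += PMD_SIZE) {
                if (!pmd_none(*pmd) && pmd_table(*pmd)) {
                        /* free the PTE table backing this 2MB slot */
                        pte = pte_offset_kernel(pmd, addr);
                        pte_free_kernel(NULL, pte);
                }
                pmd++;
        }
}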
>
>>                 if (!pmd_none(*pmd) && pmd_table(*pmd)) {
>>                         pte = pte_offset_kernel(pmd, addr);
>>                         pte_free_kernel(NULL, pte);
>>                 }
>>                 pmd++;
>>         }
>> }
>> </quote>
>>
>> Once you've fixed that as well, you end up noticing that if you have PTEs
>> pointed to by the same PMD, you need to introduce some refcounting if you
>> need to free one PTE and not the others.
>>
>
> huh? the code always freed all the hyp tables and there would be no sharing
> of ptes across multiple pmds. I'm confused.
Not anymore. In the non-HOTPLUG_CPU case, we free the trampoline page from
the runtime HYP page tables (it is worth it if we had to use a bounce page).
If you free an entire PMD, odds are you'll also unmap some of the HYP code
(been there...).
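Just to illustrate the kind of bookkeeping this forces on you (not code from
this series, and the helper name is made up): before tearing down a PTE table
that may still back other HYP mappings, you'd have to check that every entry
in it is gone first, something like:

static bool hyp_pte_table_empty(pte_t *table)
{
        int i;

        /* only safe to free the PTE table once every entry is clear */
        for (i = 0; i < PTRS_PER_PTE; i++)
                if (!pte_none(table[i]))
                        return false;

        return true;
}

Reusing the existing Stage-2 unmap path means we don't have to grow a second,
HYP-only version of that kind of logic.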
> For the record, I like your patch and we should definitely only have one
> path of allocating and freeing the tables, but if my code was buggy I want
> to learn from my mistakes and know exactly what went bad.
Well, I hope I made clear what the problem was. The main reason we didn't
notice it is that this code was only called on the error path, which is
mostly untested.
>>
>> At this point, I had enough, and decided to reuse what we already had
>> instead of reinventing the wheel.
>
>
>> > nit: s/goo/good/
>> >
>> >> before...).
>>
>> Will fix.
>>
>
> I made the adjustment in my queue so you don't need to send out a new
> series, unless you want to do that for other reasons.
No particular reason, no. Unless people have additional comments that
would require a new version of the series.
Thanks,
M.
--
Fast, cheap, reliable. Pick two.