[RFC PATCH 3/3] mm/vmalloc.c: change _vm_unmap_aliases() to do purge firstly

Baoquan He bhe at redhat.com
Tue May 23 02:35:30 PDT 2023


On 05/22/23 at 10:21pm, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 22:34, Baoquan He wrote:
> > On 05/22/23 at 02:02pm, Thomas Gleixner wrote:
> >> > @@ -1736,6 +1737,14 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> >> >  	list_replace_init(&purge_vmap_area_list, &local_purge_list);
> >> >  	spin_unlock(&purge_vmap_area_lock);
> >> >  
> >> > +	vb = container_of(va, struct vmap_block, va);
> >> 
> >> This cannot work vmap_area is not embedded in vmap_block. vmap_block::va
> >> is a pointer. vmap_area does not link back to vmap_block, so there is no
> >> way to find it based on a vmap_area.
> >
> > Oh, the code is buggy. va->flags can tell if it's vmap_block, then we
> > can deduce the vb pointer.
> 
> No. It _CANNOT_ work whether you check the flags or not.
> 
> struct foo {
>        .....
>        struct bar bar;
> };
> 
> container_of(ptr_to_bar, struct foo, bar) returns the pointer to the
> struct foo which has struct bar embedded.
> 
> But
> 
> struct foo {
>        .....
>        struct bar *bar;
> };
> 
> cannot do that because ptr_to_bar points to some object which is
> completely disconnected from struct foo.
> 
> Care to look at the implementation of container_of()?
> 
> Here is what it boils down to:
> 
>   void *member_pointer = bar;
>   
>   p = (struct foo *)(member_pointer - offsetof(struct foo, bar);
> 
> So it uses the pointer to bar and subtracts the offset of bar in struct
> foo. This obviously can only work when struct bar is embedded in struct
> foo.
> 
> Lets assume that *bar is the first member of foo, i.e. offset of *bar in
> struct foo is 0
> 
>   p = (struct foo *)(member_pointer - 0);
> 
> So you end up with
> 
>   p == member_pointer == bar
> 
> But you won't get there because the static_assert() in container_of()
> will catch that and the compiler will tell you in colourful ways.

Thanks a lot, learn it now. I never noticed container_of() is not
suitable for pointer member of struct case.

> 
> Once the vmap area is handed over for cleaning up the vmap block is gone
> and even if you let it stay around then the vmap area does not have any
> information where to find the block.
> 
> You'd need to have a pointer to the vmap block in vmap area or embed
> vmap area into vmap block.

Got it now. Embedding vmap_area into vmap_block seems not feasible
because va need be reused when inserting into free_vmap_area_root/list.
Adding a pointer to vmap_block looks do-able.

Since vm_map_ram area doesn't have vm_struct associated with it, we can
reuse the space of '->vm' to add vb pointer like below. Since in the
existing code there are places where we use 'if(!va->vm)' to check if
it's a normal va, we need be careful to find all of them out and replace
with new and tighter checking. Will give a draft code change after all
is done and testing is passed.

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c720be70c8dd..e2ba6d59d679 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -15,6 +15,7 @@
 struct vm_area_struct;         /* vma defining user mapping in mm_types.h */
 struct notifier_block;         /* in notifier.h */
 struct iov_iter;               /* in uio.h */
+struct vmap_block;             /* in mm/vmalloc.c */
 
 /* bits in flags of vmalloc's vm_struct below */
 #define VM_IOREMAP             0x00000001      /* ioremap() and friends */
@@ -76,6 +77,7 @@ struct vmap_area {
        union {
                unsigned long subtree_max_size; /* in "free" tree */
                struct vm_struct *vm;           /* in "busy" tree */
+               struct vmap_block *vb;           /* in "busy and purge" tree */
        };
        unsigned long flags; /* mark type of vm_map_ram area */
 };
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c0f80982eb06..d97343271e27 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2061,6 +2061,10 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
                return ERR_PTR(err);
        }
 
+       spin_lock(&vmap_area_lock);
+       va->vb = vb;
+       spin_unlock(&vmap_area_lock);
+
        vbq = raw_cpu_ptr(&vmap_block_queue);
        spin_lock(&vbq->lock);
        list_add_tail_rcu(&vb->free_list, &vbq->free);




More information about the linux-arm-kernel mailing list