Why does unmap_area_sections() depend on !CONFIG_SMP?

김준수 iamjoonsoo.kim at lge.com
Wed Oct 2 01:05:03 EDT 2013



> -----Original Message-----
> From: Nicolas Pitre [mailto:nicolas.pitre at linaro.org]
> Sent: Wednesday, October 02, 2013 3:24 AM
> To: Joonsoo Kim
> Cc: Russell King; linux-arm-kernel at lists.infradead.org
> Subject: Re: Why does unmap_area_sections() depend on !CONFIG_SMP?
> 
> On Tue, 1 Oct 2013, Joonsoo Kim wrote:
> 
> > Hello, Russell.
> >
> > I looked at the ioremap code in the arm tree and found that
> > unmap_area_sections() is enabled only if !CONFIG_SMP. I can't
> > understand the comment above this function, which comes from you.
> > Could you elaborate more on this?
> >
> > I guess that flush_cache_vunmap() before clearing the page table and
> > flush_tlb_kernel_range() after clearing it are enough to ensure
> > cache consistency regardless of the CONFIG_SMP setting. I think that 4K
> > vunmap() also depends on this flushing logic.
> >
> > Please let me know what I am missing here.
> 
> This is all related to the page table level involved.
> 
> Each entry in the first level page table may refer to a second level page
> table covering 1MB worth of virtual space, or it may be a direct mapping
> corresponding to 1MB of contiguous physical memory.
> 
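As a rough illustration of the two kinds of first level entries
described above, the sketch below assumes the ARM short-descriptor
format: 4096 entries of 1MB each, with the low two bits of an entry
selecting its type. The names l1_index(), l1_entry_kind() and the enum
are made up for this example and are not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum l1_kind { L1_FAULT, L1_PAGE_TABLE, L1_SECTION };

/* Each first level entry covers 1MB, so the index is vaddr / 1MB. */
static inline uint32_t l1_index(uint32_t vaddr)
{
	return vaddr >> 20;
}

/*
 * Assumed encoding (ARM short-descriptor format):
 *   bits[1:0] == 00 -> no mapping (fault)
 *   bits[1:0] == 01 -> pointer to a second level page table
 *   bit 1 set       -> direct 1MB section mapping
 */
static inline enum l1_kind l1_entry_kind(uint32_t entry)
{
	if ((entry & 3u) == 0u)
		return L1_FAULT;
	if ((entry & 3u) == 1u)
		return L1_PAGE_TABLE;
	return L1_SECTION;
}
```

For example, under these assumptions the virtual address 0xC0000000
falls in first level entry 0xC00.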
> In Linux, all tasks, including kernel threads, have their own first level
> page table.  On process creation, the top entries covering TASK_SIZE and
> above in the first level page table are copied from init_mm into the new
> page table as the kernel address space is meant to be identical across all
> tasks.
> 
> This is not always the case, however.  Consider one call to
> ioremap() which does create a new entry in the kernel virtual space.
> In order to ensure that the kernel virtual space is indeed the same across
> all tasks, the ioremap code would have to walk the entire task list just
> to update their own copy of the kernel virtual mapping.  So what we do
> instead is to create the new page table entry in init_mm only, and lazily
> update the other tasks' page tables when they fault on access due to their
> own page table being incomplete.
> 
> What about iounmap() then?  When a mapping is removed, we don't want it to
> be accessible through some random task's page table.  Well, in the normal
> ioremap() case, the actual mapping is created in a second level page
> table which happens to be common to all tasks.  Hence the first level page
> table entry being created is actually a pointer to that second level page
> table, and when a mapping is removed it is only removed from that second
> level page table.  The second level table itself remains in memory
> forever,
> ready to be reused for any other call to ioremap().  Therefore there is no
> need to update each task's first level table again.
> 
> So far so good.
> 
> Now comes the section mapping for ioremap().  Since this is handled in
> the first level page table only with no common second level table, we
> needed a mechanism to ensure that any mapping removal gets propagated to
> all first level tables in the system.  This is accomplished with a
> sequence counter, namely vmalloc_seq, which is incremented whenever such a
> change occurs.  Upon every task switch, this counter is checked against
> the master copy to detect when the next task to be scheduled has its first
> level page table out of date, and if so it is updated before the new
> memory context is instated.
> 
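To check my understanding of the sequence counter scheme, here is a
minimal user-space model of it. This is only a sketch: struct mm_model,
init_mm_model, KERNEL_ENTRIES and the model_*() functions are
hypothetical names of mine, not the kernel's actual code, and the table
is shrunk to a handful of entries.

```c
#include <string.h>

/* Tiny stand-in for the kernel part of a first level page table. */
#define KERNEL_ENTRIES 4

/* Hypothetical model of a per-task memory context. */
struct mm_model {
	unsigned long pgd[KERNEL_ENTRIES]; /* this task's copy of the kernel entries */
	unsigned int vmalloc_seq;          /* generation this copy corresponds to */
};

/* Master copy, playing the role of init_mm. */
static struct mm_model init_mm_model;

/* Removing a section mapping: clear the master entry and bump the counter. */
static void model_unmap_section(unsigned int idx)
{
	init_mm_model.pgd[idx] = 0;
	init_mm_model.vmalloc_seq++;
}

/*
 * Run on task switch, before the next task's memory context is used:
 * if its copy is out of date, resync it from the master copy.
 */
static void model_check_vmalloc_seq(struct mm_model *next)
{
	if (next->vmalloc_seq != init_mm_model.vmalloc_seq) {
		memcpy(next->pgd, init_mm_model.pgd, sizeof(next->pgd));
		next->vmalloc_seq = init_mm_model.vmalloc_seq;
	}
}
```

In this model a task's copy keeps the stale section entry after
model_unmap_section() until model_check_vmalloc_seq() runs for that
task, that is, until the task is next switched in.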
> But... this works only if not SMP.  On SMP, different tasks might be
> running on the other CPUs and incrementing vmalloc_seq won't have any
> effect on them.  This is why section mappings for ioremap() are not
> available on SMP.
> 
> This could probably be fixed by sending an IPI to the other processors,
> forcing them to resync their page table right after clearing the mapping
> from the master table.  But no one implemented it so far.

Hello, Nicolas.

Thank you very much for the kind explanation.
Now I fully understand why it is not available on SMP.
I will look into this further and try to make section mappings
for ioremap() work on SMP.

Thanks.



