Kernel related (?) user space crash at ARM11 MPCore
Russell King - ARM Linux
linux at arm.linux.org.uk
Sun Sep 20 15:02:27 EDT 2009
On Sun, Sep 20, 2009 at 10:31:39AM +0100, Russell King - ARM Linux wrote:
> On Sun, Sep 20, 2009 at 09:39:00AM +0100, Catalin Marinas wrote:
> > I don't think it's recommended to clean the D-cache (and invalidate the
> > I-cache) every time in copy_user_highpage, therefore cache maintenance
> > via mprotect -> change_protection -> flush_cache_range may be a better
> > option.
> I really don't believe so - try it yourself - run some benchmarks on your
> ARMv6 or v7 system, comparing the results both with and without the patch.
> Especially pay attention to the process creation/shell script performance.
> I think you'll find that with your patch, it'll be worse than ARM systems
> running at similar clock rates with VIVT caches.
With your patch, the figures reveal a roughly 10% reduction in the
performance of execve - that's
quite a nasty hit, basically meaning shell scripts will run about 10%
slower (shell scripts typically exec lots of programs.)
My proposal measures more favourably - there is no measurable impact
on execve itself (maybe a 0.5% reduction, which I consider to be in the
measurement noise), but a 5.5% reduction in the performance of fork()+exit()
- this is using __cpuc_coherent_kern_range() in
v6_copy_user_highpage_nonaliasing() to ensure the new page is fully

Here are my measurements, using the programs from an old version of the
byte benchmarks (not running it as per the benchmark due to the limited
environment on the Realview platform). Basically, each program was
invoked four times, requesting 30 second runs, and the average taken.
The first two are really for establishing an indication that things are
still as they should be (iow, the hardware isn't playing silly buggers).
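
The "invoke four times and average" procedure described above can be sketched as follows. This is only an illustration, not the script used for these measurements: the helper name average_score, the assumption that a benchmark prints its score as the last line of stdout, and the example command ["./spawn", "30"] are all hypothetical.

```python
import statistics
import subprocess
import sys


def average_score(cmd, repeats=4):
    """Run cmd `repeats` times and return the mean of the reported scores.

    Assumption (not from the original post): the command prints a single
    number, its score, as the last line of stdout.
    """
    scores = []
    for _ in range(repeats):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        last_line = result.stdout.strip().splitlines()[-1]
        scores.append(float(last_line))
    return statistics.mean(scores)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere.  A real run would pass
    # something like ["./spawn", "30"] (hypothetical name and argument).
    stand_in = [sys.executable, "-c", "print(10485)"]
    print(average_score(stand_in))
```

Power cycling the board between runs, as was done here, is outside what a script like this can do and would have to be handled separately.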

  old      b42c634  bf45699  d374bf1  df297bf  catalin  vma1     vma2     cowcc

Dhrystone 2 RSS=332K:
  7640923  7641386  7642194  7639866  7641322  7641308  7640809  7641096  7640737
  99.995%  100.00%  100.01%  99.981%  100.00%  100.00%  99.993%  99.997%  99.993%

Pipe based context switching RSS=296K:
  246648   244868   244402   242516   248877   248970   237815   246172   247082
  99.104%  98.389%  98.202%  97.444%  100.00%  100.04%  95.555%  98.913%  99.279%

fork()+exit() (aka spawn) RSS=292K:
  10492    10457    10460    10455    10485    10478    10524    10491    9968
  100.07%  99.733%  99.762%  99.714%  100.00%  99.933%  100.37%  100.06%  95.069%

execve() (aka execl) RSS=188K:
  2236     2219     2268     2247     2239     2033     2260     2263     2222
  99.866%  99.107%  101.30%  100.36%  100.00%  90.799%  100.94%  101.05%  99.241%
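
For anyone wanting to re-derive the percentage rows, here is a minimal sketch. It assumes the percentages are each score divided by the df297bf score for the same benchmark (that column reads 100.00% in every row). The raw numbers are copied from the table above; the names RUNS, SCORES and relative_percent are made up for this illustration.

```python
# Column order as in the table above.
RUNS = ["old", "b42c634", "bf45699", "d374bf1", "df297bf",
        "catalin", "vma1", "vma2", "cowcc"]

# Averaged scores copied from the table, one list per benchmark.
SCORES = {
    "dhrystone2": [7640923, 7641386, 7642194, 7639866, 7641322,
                   7641308, 7640809, 7641096, 7640737],
    "pipe_ctxsw": [246648, 244868, 244402, 242516, 248877,
                   248970, 237815, 246172, 247082],
    "spawn":      [10492, 10457, 10460, 10455, 10485,
                   10478, 10524, 10491, 9968],
    "execl":      [2236, 2219, 2268, 2247, 2239,
                   2033, 2260, 2263, 2222],
}

# Assumed baseline: the column that shows 100.00% for every benchmark.
BASELINE = "df297bf"


def relative_percent(scores, baseline=BASELINE):
    """Return {run: score as a percentage of the baseline run's score}."""
    base = scores[RUNS.index(baseline)]
    return {run: 100.0 * score / base for run, score in zip(RUNS, scores)}


if __name__ == "__main__":
    for name, scores in SCORES.items():
        rel = relative_percent(scores)
        print(f"{name:12s}", "  ".join(f"{rel[run]:7.3f}%" for run in RUNS))
```

The sketch prints three decimal places, so its output will not match the table digit for digit.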

The runs are (the board was power cycled before each run):

old     - my published tip of tree on Saturday
df297bf - my (slightly modified) prefetch abort fixes (the runs between
          these represent various stages through that work.)
catalin - your patch
vma1    - passing the vma to the copy_user_highpage functions, but not
          using it.  This appears to be an anomalous run.
vma2    - repeated run
cowcc   - adding cache handling to v6_copy_user_highpage_nonaliasing()

One thing I have noticed: it takes the Realview SMP board _two_ attempts
to boot a kernel. The first attempt tends to cause a spontaneous reboot
when the CLCD controller is enabled, or possibly a hang. The second
attempt seems to always run fine.