Problem with atomic accesses in pstore on some ARM CPUs

Tue Aug 16 13:50:22 PDT 2016

On Tue, Aug 16, 2016 at 1:26 PM, Guenter Roeck <groeck at google.com> wrote:
> On Tue, Aug 16, 2016 at 10:35 AM, Colin Cross <ccross at android.com> wrote:
>> On Mon, Aug 15, 2016 at 3:15 PM, Mark Rutland <mark.rutland at arm.com> wrote:
>>> On Tue, Aug 16, 2016 at 08:02:53AM -0700, Guenter Roeck wrote:
>>>> On Tue, Aug 16, 2016 at 6:21 AM, Will Deacon <will.deacon at arm.com> wrote:
>>>> > On Tue, Aug 16, 2016 at 06:14:53AM -0700, Guenter Roeck wrote:
>>>> >> On Tue, Aug 16, 2016 at 3:32 AM, Robin Murphy <robin.murphy at arm.com> wrote:
>>>> >> > On 16/08/16 00:19, Guenter Roeck wrote:
>>>> >> >> we are having a problem with atomic accesses in pstore on some ARM
>>>> >> >> CPUs (specifically rk3288 and rk3399). With those chips, atomic
>>>> >> >> accesses fail with both pgprot_noncached and pgprot_writecombine
>>>> >> >> memory. Atomic accesses do work when selecting PAGE_KERNEL protection.
>>>> >> >
>>>> >> > What's the pstore backed by? I'm guessing it's not normal DRAM.
>>>> >> >
>>>> >>
>>>> >> it is normal DRAM.
>>>> >
>>>> > In which case, why does it need to be mapped with weird attributes?
>>>> > Is there an alias in the linear map you can use?
>>>> >
>>>>
>>>> I don't really _want_ to do anything besides using pstore as-is, or,
>>>> in other words, to have the upstream kernel work with the affected
>>>> systems.
>>>>
>>>> The current pstore code runs the following code for memory with
>>>> pfn_valid() = true.
>>>>
>>>>         if (memtype)
>>>>                 prot = pgprot_noncached(PAGE_KERNEL);
>>>>         else
>>>>                 prot = pgprot_writecombine(PAGE_KERNEL);
>>>>         ...
>>>>         vaddr = vmap(pages, page_count, VM_MAP, prot);
>>>>
>>>> It then uses the memory pointed to by vaddr for atomic operations.
>>>
>>> This means that the generic ramoops / pstore code is making non-portable
>>> assumptions about memory types.
>>>
>>> So _something_ has to happen to that code.
>>>
>>>> In my case, both protection options don't work. Everything works fine
>>>> (or at least doesn't create an exception) if I use
>>>>         vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL);
>>>> instead.
>>>
>>> Architecturally, that will give you a memory type to which we can safely use
>>> atomics on.
>>>
>>> It would be nice to know why the ramoops/pstore code must use atomics, and
>>> exactly what it's trying to achieve (i.e. is this just for serialisation, or an
>>> attempt to ensure persistence).
>>>
>>> Depending on that, it may be possible to fix things more generically by using
>>> memremap by default, for example, and only allowing uncached mappings on those
>>> architectures which support them.
>>
>> persistent_ram uses atomic ops in uncached memory to store the start
>> and end positions in the ringbuffer so that the state of the
>> ringbuffer will be valid if the kernel crashes at any time.  This was
>> inherited from Android's ram_console implementation, and worked
>> through armv7.  It has been causing more and more problems recently,
>> see for example 027bc8b08242c59e19356b4b2c189f2d849ab660 (pstore-ram:
>> Allow optional mapping with pgprot_noncached) and
>> 7ae9cb81933515dc7db1aa3c47ef7653717e3090 (pstore-ram: Fix hangs by
>> using write-combine mappings).
>>
>> Maybe it should be replaced with a spinlock in normal ram protecting
>> writes to the uncached region.
>
> The necessary functions already exist, and are used for memory mapped
> with ioremap() / ioremap_wc(). They were introduced with commit
> 0405a5cec3 ("pstore/ram: avoid atomic accesses for ioremapped
> regions"), and the description in that patch sounds quite similar to
> the current problem. Given that, would it be acceptable to remove
> buffer_start_add_atomic() and buffer_size_add_atomic(), and always use
> buffer_start_add_locked() and buffer_size_add_locked() instead ? Those
> functions still use atomic_set() and atomic_read(), which works fine
> in my tests. The only difference is that a spinlock in main memory is
> used instead of atomic_cmpxchg().

I don't see much of a down side to this. ramoops isn't expected to be
high-bandwidth so trading for a single global lock doesn't really
bother me.

(I've added other folks to CC that have touched this more recently...)

-Kees

-- 
Kees Cook
Nexus Security