build_skb() and data corruption
Ben Hutchings
bhutchings at solarflare.com
Mon Jan 13 08:42:19 EST 2014
On Mon, 2014-01-13 at 12:47 +0100, Jonas Jensen wrote:
> Please help,
>
> I think I see memory corruption with a driver recently added to linux-next.
>
> The following error occur downloading a large file with wget (or
> ncftp): "read error: Bad address"
> wget exits and leaves the file unfinished.
>
> The error goes away when build_skb() is patched out, in this case by
> allocating pages directly in RX loop.
The problem is you're assuming build_skb() does rather more than it
does.
> I currently have two branches with code placed under ifdef USE_BUILD_SKB:
> https://bitbucket.org/Kasreyn/linux-next/src/faa7c408a655fdfd7c383f79259031da5a05d61e/drivers/net/ethernet/moxa/moxart_ether.c#cl-472
Quoting from there:
> #define USE_BUILD_SKB
> #ifdef USE_BUILD_SKB
>
>
> skb = build_skb(page_address(priv->rx_pages[priv->rx_head]), priv->rx_buf_size);
>
>
> if (unlikely(!skb)) {
> net_info_ratelimited("RX build_skb failed\n");
Don't log; increment the rx_dropped counter.
> moxart_rx_drop_packet(ndev, desc0);
> return true;
> }
>
>
> if (likely(skb))
Redundant condition.
> get_page(priv->rx_pages[priv->rx_head]);
>
>
> skb_put(skb, len);
>
>
> #else
>
>
> dma_unmap_page(&ndev->dev, priv->rx_buf_phys[priv->rx_head], RX_BUF_SIZE, DMA_FROM_DEVICE);
dma_unmap_page() is always needed, so move this above the #ifdef
USE_BUILD_SKB.
>
> skb = netdev_alloc_skb_ip_align(ndev, 128);
Missing check for NULL.
> skb_fill_page_desc(skb, 0, priv->rx_pages[priv->rx_head], 0, len);
> skb->len += len;
> skb->data_len += len;
>
>
> if (len > 128) {
> skb->truesize += PAGE_SIZE;
> __pskb_pull_tail(skb, ETH_HLEN);
> } else {
> __pskb_pull_tail(skb, len);
> }
>
>
> p = priv->rx_pages[priv->rx_head];
Missing check for NULL.
> moxart_alloc_rx_page(ndev, priv->rx_head);
New buffer is always required, so move this under the #endif... well,
except that if moxart_alloc_rx_page() fails then things go horribly
wrong as you have.
Best practice is to try allocating the next buffer before passing up the
current one; if that fails then you recycle the current buffer and
increment rx_dropped.
>
>
> #endif
[end quoted code]
> If build_skb() is the cause, is there something the driver can do about it?
>
> A quick search on "build_skb memory corruption" reveals the following
> commit, "igb: Revert support for build_skb in igb"
> f9d40f6a9921cc7d9385f64362314054e22152bd:
[...]
The problem I identified in igb occurs only when splitting a page into
multiple buffers (or when recycling DMA-mapped pages). build_skb() is
fine to use whenever each RX buffer is a single page (except that that's
memory-inefficient) that's unmapped on completion.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
More information about the linux-arm-kernel
mailing list