[PATCH RFC] dwc2: Don't assume URB transfer_buffer are dword-aligned

Mauro Carvalho Chehab mchehab at s-opensource.com
Fri Mar 17 03:52:54 PDT 2017


Hi Greg,

(c/c media ML)

Em Fri, 17 Mar 2017 10:24:15 +0900
Greg Kroah-Hartman <gregkh at linuxfoundation.org> escreveu:

> On Thu, Mar 16, 2017 at 09:08:40PM -0300, Mauro Carvalho Chehab wrote:
> > The dwc2 hardware doesn't like to do DMA transfers without
> > aligning data in DWORD. The driver also assumes that, even
> > when there's no DMA, at dwc2_read_packet().
> > 
> > That cause buffer overflows, preventing some drivers to work.  
> 
> Why aren't the drivers being fixed?  This is a well-known (hopefully)
> restriction on USB data buffers, can't the uvc_driver be fixed?

I never knew that adding padding data is required on the
transfer buffer. Yet, almost all drivers at media just do
something like:

	data = kmalloc(size, GFP_KERNEL);
...
	usb_control_msg(dev->udev, pipe, req, reqtype, val, idx,
			data, size, timeout);

So, if kmalloc add pad bytes to the data, it should work, even
if size is not dword-aligned.

The UVC driver does something different, though. For video controls,
it gets value from 7 different USB registers and store on a cache,
with this code:

#define UVC_CTRL_DATA_CURRENT	0
#define UVC_CTRL_DATA_BACKUP	1
#define UVC_CTRL_DATA_MIN	2
#define UVC_CTRL_DATA_MAX	3
#define UVC_CTRL_DATA_RES	4
#define UVC_CTRL_DATA_DEF	5
#define UVC_CTRL_DATA_LAST	6

The size of each control is stored on a struct, at
ctrl->info.size, and vary from control to control. On the webcam I'm
currently working, there are controls with 1 byte, 2 bytes, 8 bytes
and 26 bytes.

Instead of allocating 7 buffers, to store each value, it uses
just one kmalloc for the entire cache:

	ctrl->uvc_data = kzalloc(ctrl->info.size * UVC_CTRL_DATA_LAST + 1,
				 GFP_KERNEL);

And it uses a function that returns a position inside the 
register cache buffer:

	static inline __u8 *uvc_ctrl_data(struct uvc_control *ctrl, int id)
	{
		return ctrl->uvc_data + id * ctrl->info.size;
	}

This is not packed nor dword-aligned.

It shouldn't be hard to change its logic to ensure that the
cached data will be aligned with the architecture's default
alignment, by changing the code to:

	for (i = 0; i < UVC_CTRL_DATA_LAST; i++)
		ctrl->uvc_data[i] = kzalloc(ctrl->info.size, GFP_KERNEL);

	...

	static inline __u8 *uvc_ctrl_data(struct uvc_control *ctrl, int id)
	{
		return ctrl->uvc_data[id];
	}

With should solve this issue and avoid double-buffering. I think
we should do it anyway.


Yet, there are already *a lot* of URB traffic done using non-aligned
transfer buffers on RPi3. I was able to add dump_stack() only after
adding a rate-limit logic there. I applied the enclosed patch without
having any external USB devices connected to RPi3 and without this 
RFC patch applied. I'm using network via RPi3 Ethernet port.

That's what I got:

[   43.083288] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 145:
[   43.083301] CPU: 1 PID: 791 Comm: sshd Tainted: G         C      4.11.0-rc1+ #52
[   43.083305] Hardware name: Generic DT based system
[   43.083329] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.083345] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.083362] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.083378] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.083396] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b3f08>] (usbnet_start_xmit+0x228/0x6e0)
[   43.083416] [<c09b3f08>] (usbnet_start_xmit) from [<c0b6e7b0>] (dev_hard_start_xmit+0x88/0x110)
[   43.083434] [<c0b6e7b0>] (dev_hard_start_xmit) from [<c0b90300>] (sch_direct_xmit+0xe0/0x1b4)
[   43.083450] [<c0b90300>] (sch_direct_xmit) from [<c0b6eb18>] (__dev_queue_xmit+0x1ec/0x568)
[   43.083466] [<c0b6eb18>] (__dev_queue_xmit) from [<c0bfcfd4>] (ip6_finish_output2+0x1e0/0x5a0)
[   43.083480] [<c0bfcfd4>] (ip6_finish_output2) from [<c0bfd63c>] (ip6_xmit+0x2a8/0x5dc)
[   43.083494] [<c0bfd63c>] (ip6_xmit) from [<c0c307cc>] (inet6_csk_xmit+0x78/0xac)
[   43.083513] [<c0c307cc>] (inet6_csk_xmit) from [<c0bb9e10>] (tcp_transmit_skb+0x44c/0x8fc)
[   43.083529] [<c0bb9e10>] (tcp_transmit_skb) from [<c0bba42c>] (tcp_write_xmit+0x16c/0xffc)
[   43.083545] [<c0bba42c>] (tcp_write_xmit) from [<c0bbb2f0>] (__tcp_push_pending_frames+0x34/0xe8)
[   43.083560] [<c0bbb2f0>] (__tcp_push_pending_frames) from [<c0bad6dc>] (tcp_sendmsg+0x890/0xbd4)
[   43.083577] [<c0bad6dc>] (tcp_sendmsg) from [<c0b52990>] (sock_sendmsg+0x14/0x24)
[   43.083592] [<c0b52990>] (sock_sendmsg) from [<c0b52a20>] (sock_write_iter+0x80/0xb0)
[   43.083608] [<c0b52a20>] (sock_write_iter) from [<c0423620>] (__vfs_write+0xbc/0x114)
[   43.083624] [<c0423620>] (__vfs_write) from [<c0424b54>] (vfs_write+0xa4/0x168)
[   43.083639] [<c0424b54>] (vfs_write) from [<c0425950>] (SyS_write+0x3c/0x90)
[   43.083655] [<c0425950>] (SyS_write) from [<c0308940>] (ret_fast_syscall+0x0/0x3c)
[   43.084178] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 18955:
[   43.084193] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.11.0-rc1+ #52
[   43.084198] Hardware name: Generic DT based system
[   43.084237] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.084271] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.084306] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.084342] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.084370] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b453c>] (rx_submit+0x17c/0x2f4)
[   43.084399] [<c09b453c>] (rx_submit) from [<c09b4ca0>] (rx_complete+0x2a8/0x304)
[   43.084433] [<c09b4ca0>] (rx_complete) from [<c09c8080>] (__usb_hcd_giveback_urb+0x90/0x118)
[   43.084466] [<c09c8080>] (__usb_hcd_giveback_urb) from [<c09c81a0>] (usb_giveback_urb_bh+0x98/0xe8)
[   43.084516] [<c09c81a0>] (usb_giveback_urb_bh) from [<c034890c>] (tasklet_action+0x74/0x110)
[   43.084546] [<c034890c>] (tasklet_action) from [<c0348acc>] (__do_softirq+0xd0/0x21c)
[   43.084567] [<c0348acc>] (__do_softirq) from [<c0348f10>] (irq_exit+0xd8/0x140)
[   43.084599] [<c0348f10>] (irq_exit) from [<c038cf08>] (__handle_domain_irq+0x60/0xb4)
[   43.084631] [<c038cf08>] (__handle_domain_irq) from [<c030d2cc>] (__irq_svc+0x6c/0x90)
[   43.084654] [<c030d2cc>] (__irq_svc) from [<c0309450>] (arch_cpu_idle+0x38/0x3c)
[   43.084685] [<c0309450>] (arch_cpu_idle) from [<c037e138>] (do_idle+0x164/0x1f8)
[   43.084708] [<c037e138>] (do_idle) from [<c037e444>] (cpu_startup_entry+0x18/0x1c)
[   43.084735] [<c037e444>] (cpu_startup_entry) from [<c1200c78>] (start_kernel+0x380/0x38c)
[   43.096418] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 209:
[   43.096435] CPU: 1 PID: 791 Comm: sshd Tainted: G         C      4.11.0-rc1+ #52
[   43.096439] Hardware name: Generic DT based system
[   43.096473] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.096490] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.096508] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.096526] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.096549] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b3f08>] (usbnet_start_xmit+0x228/0x6e0)
[   43.096570] [<c09b3f08>] (usbnet_start_xmit) from [<c0b6e7b0>] (dev_hard_start_xmit+0x88/0x110)
[   43.096586] [<c0b6e7b0>] (dev_hard_start_xmit) from [<c0b90300>] (sch_direct_xmit+0xe0/0x1b4)
[   43.096602] [<c0b90300>] (sch_direct_xmit) from [<c0b6eb18>] (__dev_queue_xmit+0x1ec/0x568)
[   43.096619] [<c0b6eb18>] (__dev_queue_xmit) from [<c0bfcfd4>] (ip6_finish_output2+0x1e0/0x5a0)
[   43.096634] [<c0bfcfd4>] (ip6_finish_output2) from [<c0bfd63c>] (ip6_xmit+0x2a8/0x5dc)
[   43.096653] [<c0bfd63c>] (ip6_xmit) from [<c0c307cc>] (inet6_csk_xmit+0x78/0xac)
[   43.096671] [<c0c307cc>] (inet6_csk_xmit) from [<c0bb9e10>] (tcp_transmit_skb+0x44c/0x8fc)
[   43.096688] [<c0bb9e10>] (tcp_transmit_skb) from [<c0bba42c>] (tcp_write_xmit+0x16c/0xffc)
[   43.096704] [<c0bba42c>] (tcp_write_xmit) from [<c0bbb2f0>] (__tcp_push_pending_frames+0x34/0xe8)
[   43.096719] [<c0bbb2f0>] (__tcp_push_pending_frames) from [<c0bad6dc>] (tcp_sendmsg+0x890/0xbd4)
[   43.096738] [<c0bad6dc>] (tcp_sendmsg) from [<c0b52990>] (sock_sendmsg+0x14/0x24)
[   43.096753] [<c0b52990>] (sock_sendmsg) from [<c0b52a20>] (sock_write_iter+0x80/0xb0)
[   43.096769] [<c0b52a20>] (sock_write_iter) from [<c0423620>] (__vfs_write+0xbc/0x114)
[   43.096784] [<c0423620>] (__vfs_write) from [<c0424b54>] (vfs_write+0xa4/0x168)
[   43.096801] [<c0424b54>] (vfs_write) from [<c0425950>] (SyS_write+0x3c/0x90)
[   43.096818] [<c0425950>] (SyS_write) from [<c0308940>] (ret_fast_syscall+0x0/0x3c)
[   43.097026] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 825:
[   43.097040] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.11.0-rc1+ #52
[   43.097043] Hardware name: Generic DT based system
[   43.097060] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.097075] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.097091] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.097105] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.097123] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b3f08>] (usbnet_start_xmit+0x228/0x6e0)
[   43.097143] [<c09b3f08>] (usbnet_start_xmit) from [<c0b6e7b0>] (dev_hard_start_xmit+0x88/0x110)
[   43.097158] [<c0b6e7b0>] (dev_hard_start_xmit) from [<c0b90300>] (sch_direct_xmit+0xe0/0x1b4)
[   43.097172] [<c0b90300>] (sch_direct_xmit) from [<c0b6eb18>] (__dev_queue_xmit+0x1ec/0x568)
[   43.097186] [<c0b6eb18>] (__dev_queue_xmit) from [<c0bfcfd4>] (ip6_finish_output2+0x1e0/0x5a0)
[   43.097199] [<c0bfcfd4>] (ip6_finish_output2) from [<c0bfd63c>] (ip6_xmit+0x2a8/0x5dc)
[   43.097216] [<c0bfd63c>] (ip6_xmit) from [<c0c307cc>] (inet6_csk_xmit+0x78/0xac)
[   43.097232] [<c0c307cc>] (inet6_csk_xmit) from [<c0bb9e10>] (tcp_transmit_skb+0x44c/0x8fc)
[   43.097248] [<c0bb9e10>] (tcp_transmit_skb) from [<c0bba42c>] (tcp_write_xmit+0x16c/0xffc)
[   43.097263] [<c0bba42c>] (tcp_write_xmit) from [<c0bbcaa8>] (tcp_tsq_handler.part.7+0x6c/0x7c)
[   43.097283] [<c0bbcaa8>] (tcp_tsq_handler.part.7) from [<c0bbcd0c>] (tcp_tasklet_func+0x138/0x13c)
[   43.097301] [<c0bbcd0c>] (tcp_tasklet_func) from [<c034890c>] (tasklet_action+0x74/0x110)
[   43.097317] [<c034890c>] (tasklet_action) from [<c0348acc>] (__do_softirq+0xd0/0x21c)
[   43.097332] [<c0348acc>] (__do_softirq) from [<c0348f10>] (irq_exit+0xd8/0x140)
[   43.097351] [<c0348f10>] (irq_exit) from [<c038cf08>] (__handle_domain_irq+0x60/0xb4)
[   43.097365] [<c038cf08>] (__handle_domain_irq) from [<c030d2cc>] (__irq_svc+0x6c/0x90)
[   43.097380] [<c030d2cc>] (__irq_svc) from [<c0309450>] (arch_cpu_idle+0x38/0x3c)
[   43.097395] [<c0309450>] (arch_cpu_idle) from [<c037e138>] (do_idle+0x164/0x1f8)
[   43.097408] [<c037e138>] (do_idle) from [<c037e444>] (cpu_startup_entry+0x18/0x1c)
[   43.097427] [<c037e444>] (cpu_startup_entry) from [<c1200c78>] (start_kernel+0x380/0x38c)
[   43.097532] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 18955:
[   43.097541] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.11.0-rc1+ #52
[   43.097545] Hardware name: Generic DT based system
[   43.097561] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.097574] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.097588] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.097606] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.097621] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b453c>] (rx_submit+0x17c/0x2f4)
[   43.097636] [<c09b453c>] (rx_submit) from [<c09b4ca0>] (rx_complete+0x2a8/0x304)
[   43.097652] [<c09b4ca0>] (rx_complete) from [<c09c8080>] (__usb_hcd_giveback_urb+0x90/0x118)
[   43.097667] [<c09c8080>] (__usb_hcd_giveback_urb) from [<c09c81a0>] (usb_giveback_urb_bh+0x98/0xe8)
[   43.097699] [<c09c81a0>] (usb_giveback_urb_bh) from [<c034890c>] (tasklet_action+0x74/0x110)
[   43.097726] [<c034890c>] (tasklet_action) from [<c0348acc>] (__do_softirq+0xd0/0x21c)
[   43.097752] [<c0348acc>] (__do_softirq) from [<c0348f10>] (irq_exit+0xd8/0x140)
[   43.097778] [<c0348f10>] (irq_exit) from [<c038cf08>] (__handle_domain_irq+0x60/0xb4)
[   43.097812] [<c038cf08>] (__handle_domain_irq) from [<c030d2cc>] (__irq_svc+0x6c/0x90)
[   43.097836] [<c030d2cc>] (__irq_svc) from [<c0309450>] (arch_cpu_idle+0x38/0x3c)
[   43.097860] [<c0309450>] (arch_cpu_idle) from [<c037e138>] (do_idle+0x164/0x1f8)
[   43.097883] [<c037e138>] (do_idle) from [<c037e444>] (cpu_startup_entry+0x18/0x1c)
[   43.097908] [<c037e444>] (cpu_startup_entry) from [<c1200c78>] (start_kernel+0x380/0x38c)
[   43.097991] dwc2_alloc_dma_aligned_buffer: creating a DMA-aligned buffer with size 18955:
[   43.098010] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.11.0-rc1+ #52
[   43.098025] Hardware name: Generic DT based system
[   43.098049] [<c0311484>] (unwind_backtrace) from [<c030c754>] (show_stack+0x10/0x14)
[   43.098072] [<c030c754>] (show_stack) from [<c05bf5e0>] (dump_stack+0x88/0x9c)
[   43.098097] [<c05bf5e0>] (dump_stack) from [<c09ee4f8>] (dwc2_map_urb_for_dma+0x14c/0x154)
[   43.098121] [<c09ee4f8>] (dwc2_map_urb_for_dma) from [<c09c9c68>] (usb_hcd_submit_urb+0x94/0xa08)
[   43.098147] [<c09c9c68>] (usb_hcd_submit_urb) from [<c09b453c>] (rx_submit+0x17c/0x2f4)
[   43.098177] [<c09b453c>] (rx_submit) from [<c09b4ca0>] (rx_complete+0x2a8/0x304)
[   43.098203] [<c09b4ca0>] (rx_complete) from [<c09c8080>] (__usb_hcd_giveback_urb+0x90/0x118)
[   43.098228] [<c09c8080>] (__usb_hcd_giveback_urb) from [<c09c81a0>] (usb_giveback_urb_bh+0x98/0xe8)
[   43.098255] [<c09c81a0>] (usb_giveback_urb_bh) from [<c034890c>] (tasklet_action+0x74/0x110)
[   43.098281] [<c034890c>] (tasklet_action) from [<c0348acc>] (__do_softirq+0xd0/0x21c)
[   43.098306] [<c0348acc>] (__do_softirq) from [<c0348f10>] (irq_exit+0xd8/0x140)
[   43.098332] [<c0348f10>] (irq_exit) from [<c038cf08>] (__handle_domain_irq+0x60/0xb4)
[   43.098356] [<c038cf08>] (__handle_domain_irq) from [<c030d2cc>] (__irq_svc+0x6c/0x90)
[   43.098369] [<c030d2cc>] (__irq_svc) from [<c0309450>] (arch_cpu_idle+0x38/0x3c)
[   43.098382] [<c0309450>] (arch_cpu_idle) from [<c037e138>] (do_idle+0x164/0x1f8)
[   43.098395] [<c037e138>] (do_idle) from [<c037e444>] (cpu_startup_entry+0x18/0x1c)
[   43.098408] [<c037e444>] (cpu_startup_entry) from [<c1200c78>] (start_kernel+0x380/0x38c)

So, at least the network drivers seem to be doing something similar to 
what the UVC driver is doing.

Eventually, some of those requests may actually be dword aligned.
Whan that happens, it will use the original transfer_buffer, instead
of the DWORD aligned one. If size is not word-aligned, the DMA
will override data at the end of the buffer, causing troubles
at the network driver.

If a block driver would have similar troubles, this issue could
cause mass corruption there.

So, IMHO we should apply this fix or some other logic that would
prevent such troubles.

> 
> > In the specific case of uvc_driver, it uses an array where
> > it caches the content of video controls, passing it to the
> > USB stack via usb_control_msg(). Typical controls use 1 or 2
> > bytes. The net result is that other values of the buffer
> > gets overriden when this function is called.  
> 
> Not good, it should be fixed, otherwise you are going to have to try to
> fix up all host controllers :(
> 
> thanks,
> 
> greg k-h

Thanks,
Mauro

[PATCH] dwc2: add a debug function to warn when a buffer is not DMA-aligned

Temporary hack to check why there are so much usage of non-DMA
aligned buffers.

Signed-off-by: Mauro Carvalho Chehab <mchehab at s-opensource.com>

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index a73722e27d07..3cc04df6b95f 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2602,6 +2602,17 @@ static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
 	kmalloc_size = urb->transfer_buffer_length +
 		sizeof(struct dma_aligned_buffer) + DWC2_USB_DMA_ALIGN - 1;
 
+	{
+		static DEFINE_RATELIMIT_STATE(rs,
+					      DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+		if (__ratelimit(&rs)) {
+			pr_info("%s: creating a DMA-aligned buffer with size %d:\n",
+				 __func__, kmalloc_size);
+			dump_stack();
+		}
+	}
+
 	kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);
 	if (!kmalloc_ptr)
 		return -ENOMEM;




More information about the linux-rpi-kernel mailing list