In the Linux kernel, the following vulnerability has been resolved:
USB: core: Fix hang in usb_kill_urb by adding memory barriers
The syzbot fuzzer has identified a bug in which processes hang waiting for usb_kill_urb() to return. It turns out the issue is not unlinking the URB; that works just fine. Rather, the problem arises when the wakeup notification that the URB has completed is not received.
The reason is memory-access ordering on SMP systems. In outline form, usb_kill_urb() and __usb_hcd_giveback_urb() operating concurrently on different CPUs perform the following actions:
CPU 0                                   CPU 1
----------------------------            ---------------------------------
usb_kill_urb():                         __usb_hcd_giveback_urb():
  ...                                     ...
  atomic_inc(&urb->reject);               atomic_dec(&urb->use_count);
  ...                                     ...
  wait_event(usb_kill_urb_queue,
        atomic_read(&urb->use_count) == 0);
                                          if (atomic_read(&urb->reject))
                                                wake_up(&usb_kill_urb_queue);
Confining your attention to urb->reject and urb->use_count, you can see that the overall pattern of accesses on CPU 0 is:
write urb->reject, then read urb->use_count;
whereas the overall pattern of accesses on CPU 1 is:
write urb->use_count, then read urb->reject.
This pattern is referred to in memory-model circles as SB (for "Store Buffering"), and it is well known that without suitable enforcement of the desired order of accesses -- in the form of memory barriers -- it is entirely possible for one or both CPUs to execute their reads ahead of their writes. The end result will be that sometimes CPU 0 sees the old un-decremented value of urb->use_count while CPU 1 sees the old un-incremented value of urb->reject. Consequently CPU 0 ends up on the wait queue and never gets woken up, leading to the observed hang in usb_kill_urb().
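The SB outcome can be checked with a small deterministic toy model. The sketch below is not kernel code: it assumes each CPU's write first sits in a private store buffer and only becomes visible to the other CPU at a later "flush" event, and that a full barrier forces the flush to happen before the CPU's subsequent read. The names sb_both_stale_possible(), EV_FLUSH and EV_READ are hypothetical. The function enumerates every interleaving of the two CPUs' events and reports whether both CPUs can end up reading the old value.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code. Each CPU has two globally visible events:
 * its buffered write reaching memory (EV_FLUSH) and its read of the
 * other CPU's variable (EV_READ). */
enum sb_event { EV_FLUSH, EV_READ };

static int count_bits(unsigned v)
{
    int n = 0;
    for (; v; v >>= 1)
        n += (int)(v & 1u);
    return n;
}

/* Returns true if some schedule lets BOTH CPUs read the old value.
 * barrier_cpuN == true means CPU N flushes its write before reading. */
bool sb_both_stale_possible(bool barrier_cpu0, bool barrier_cpu1)
{
    /* Order 0: flush then read (always legal).
     * Order 1: read then flush (write still buffered; legal only
     *          when that CPU has no barrier). */
    static const enum sb_event orders[2][2] = {
        { EV_FLUSH, EV_READ },
        { EV_READ, EV_FLUSH },
    };
    int n_orders[2] = { barrier_cpu0 ? 1 : 2, barrier_cpu1 ? 1 : 2 };

    for (int o0 = 0; o0 < n_orders[0]; o0++) {
        for (int o1 = 0; o1 < n_orders[1]; o1++) {
            /* Bit i of sched says which CPU executes step i; each CPU
             * must run exactly two of the four steps. */
            for (unsigned sched = 0; sched < 16; sched++) {
                if (count_bits(sched) != 2)
                    continue;
                int choice[2] = { o0, o1 };
                int visible[2] = { 0, 0 }; /* has CPU n's write landed? */
                int seen[2] = { -1, -1 };  /* what CPU n's read observed */
                int pos[2] = { 0, 0 };
                for (int step = 0; step < 4; step++) {
                    int cpu = (int)((sched >> step) & 1u);
                    enum sb_event ev = orders[choice[cpu]][pos[cpu]++];
                    if (ev == EV_FLUSH)
                        visible[cpu] = 1;
                    else
                        seen[cpu] = visible[1 - cpu];
                }
                assert(pos[0] == 2 && pos[1] == 2);
                if (seen[0] == 0 && seen[1] == 0)
                    return true; /* both reads saw the old value */
            }
        }
    }
    return false;
}
```

In this model the stale outcome is reachable when neither CPU has a barrier, and also when only one of them does. It becomes unreachable only when both arguments are true, which matches the requirement that the ordering be enforced on both CPUs.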
The same pattern of accesses occurs in usb_poison_urb() and the failure pathway of usb_hcd_submit_urb().
The problem is fixed by adding suitable memory barriers. To provide proper memory-access ordering in the SB pattern, a full barrier is required on both CPUs. The atomic_inc() and atomic_dec() accesses themselves don't provide any memory ordering, but since they are present, we can use the optimized smp_mb__after_atomic() memory barrier in the various routines to obtain the desired effect.
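The placement of the barriers can be sketched in userspace C11. This is an illustration under stated assumptions, not the kernel patch itself: struct urb_sketch, full_barrier(), kill_side_must_wait() and giveback_side_must_wake() are hypothetical names, relaxed C11 read-modify-write operations stand in for atomic_inc() and atomic_dec(), and atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb__after_atomic(). The sleeping and wakeup themselves are omitted; each function only reports what its side would decide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct urb fields discussed above. */
struct urb_sketch {
    atomic_int reject;
    atomic_int use_count;
};

/* Userspace stand-in for smp_mb__after_atomic(): a full fence. */
static inline void full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Kill side (CPU 0 in the outline). Returns true if the caller would
 * have to sleep until use_count drops to zero. */
bool kill_side_must_wait(struct urb_sketch *urb)
{
    /* Relaxed increment: provides no ordering by itself. */
    atomic_fetch_add_explicit(&urb->reject, 1, memory_order_relaxed);

    /* Order the write of reject before the read of use_count. */
    full_barrier();

    return atomic_load_explicit(&urb->use_count,
                                memory_order_relaxed) != 0;
}

/* Giveback side (CPU 1 in the outline). Returns true if it would
 * issue the wakeup. */
bool giveback_side_must_wake(struct urb_sketch *urb)
{
    /* Relaxed decrement: provides no ordering by itself. */
    int old = atomic_fetch_sub_explicit(&urb->use_count, 1,
                                        memory_order_relaxed);
    assert(old > 0); /* sanity check: the URB was in use */

    /* Order the write of use_count before the read of reject. */
    full_barrier();

    return atomic_load_explicit(&urb->reject,
                                memory_order_relaxed) != 0;
}
```

With a full fence on both sides, at least one side is guaranteed to observe the other's write: either the kill side sees use_count already at zero and does not sleep, or the giveback side sees a nonzero reject and issues the wakeup.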
This patch adds the necessary memory barriers.