In the Linux kernel, the following vulnerability has been resolved:
nvme: fix a possible use-after-free in controller reset during load
Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl readiness for AER submission. This may lead to a use-after-free condition that was observed with nvme-tcp.
The race condition may happen in the following scenario:
1. driver executes its reset_ctrl_work
2. -> nvme_stop_ctrl - flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn schedules AEN handling
4. teardown admin queue (which releases the queue socket)
5. AEN processed, submits another AER, calling the driver to submit
6. driver attempts to send the cmd
==> use-after-free
In order to fix that, add a ctrl state check to validate that the ctrl is actually able to accept the AER submission.
This addresses the above race in controller resets because the driver during teardown should:
1. change ctrl state to RESETTING
2. flush async_event_work (as well as other async work elements)
So after steps 1 and 2, any other AER command will find the ctrl state to be RESETTING and bail out without submitting the AER.
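The ordering argument can be illustrated with a small userspace model. This is only a sketch and not the kernel patch itself: every name prefixed with model_ is hypothetical, the two states are a simplification of the real controller state machine, and the "use-after-free" is represented by a counter instead of a real freed socket. The check_state flag toggles a state check of the kind described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified controller states (the real driver has more). */
enum model_ctrl_state { MODEL_CTRL_LIVE, MODEL_CTRL_RESETTING };

struct model_ctrl {
    enum model_ctrl_state state;
    bool admin_queue_alive;  /* false once teardown released the queue socket */
    int aers_submitted;      /* AERs handed to the transport */
    int stale_queue_uses;    /* submissions that touched a released queue */
};

/* Stand-in for a transport's .submit_async_event callback. */
static void model_submit_async_event(struct model_ctrl *ctrl)
{
    assert(ctrl != NULL);
    if (!ctrl->admin_queue_alive)
        ctrl->stale_queue_uses++;  /* this would be the use-after-free */
    ctrl->aers_submitted++;
}

/* Stand-in for AEN handling; check_state enables the state check. */
static void model_async_event_work(struct model_ctrl *ctrl, bool check_state)
{
    if (check_state && ctrl->state != MODEL_CTRL_LIVE)
        return;  /* ctrl is RESETTING: bail out without submitting the AER */
    model_submit_async_event(ctrl);
}

/* Reset path up to and including admin queue teardown. */
static void model_reset_teardown(struct model_ctrl *ctrl)
{
    ctrl->state = MODEL_CTRL_RESETTING;  /* 1. change ctrl state */
    /* 2. async_event_work would be flushed here; nothing is pending in this model */
    ctrl->admin_queue_alive = false;     /* teardown releases the queue socket */
}

/* Runs the scenario with a late AEN; returns the number of stale queue uses. */
static int model_late_aen(bool check_state)
{
    struct model_ctrl ctrl = { MODEL_CTRL_LIVE, true, 0, 0 };

    model_reset_teardown(&ctrl);
    model_async_event_work(&ctrl, check_state);  /* AEN handled after teardown */
    return ctrl.stale_queue_uses;
}
```

In this model, model_late_aen(false) reports one access to the released queue, while model_late_aen(true) reports none, because the handler sees the RESETTING state and returns before calling into the transport.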