Over the weekend I kept tinkering with this NVMe driver, because sometimes I am not good at work/life balance.
Tidying up and an optimization
Removed the no-longer-used nvme_io_txn() and the dependencies on the hexdump utility
library, left over from early tests. Adjusted completion queue processing to ring the doorbell –
the one that tells the hardware that the queue head has advanced – only once, after draining
the queue (thank you, Doug Gale, for pointing out this inefficiency in a comment on the CL).
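For the curious, the shape of that change is roughly this – a minimal sketch with simplified
types and names (nvme_cq_t, NVME_CQE_PHASE, and friends are illustrative here, not the
driver’s actual identifiers):

    #include <stdbool.h>
    #include <stdint.h>

    #define NVME_CQE_PHASE (1u << 0)       // phase tag bit (simplified position)

    typedef struct {
        uint16_t status;                   // phase tag lives in here
        // ... command id, result dwords, etc.
    } nvme_cqe_t;

    typedef struct {
        volatile nvme_cqe_t* entries;
        volatile uint32_t* head_doorbell;  // the CQyHDBL register
        uint16_t head;
        uint16_t entry_count;
        uint16_t phase;                    // phase we expect for new entries
    } nvme_cq_t;

    static void nvme_process_completions(nvme_cq_t* cq) {
        bool advanced = false;
        // A completion entry is new when its phase tag matches ours.
        while ((cq->entries[cq->head].status & NVME_CQE_PHASE) == cq->phase) {
            // ... complete the transaction for this entry (omitted) ...
            if (++cq->head == cq->entry_count) {
                cq->head = 0;
                cq->phase ^= 1;            // expected phase flips on wrap
            }
            advanced = true;
        }
        // One doorbell write after the whole batch, not one per entry.
        if (advanced) {
            *cq->head_doorbell = cq->head;
        }
    }

The point is the last few lines: the head doorbell write is an uncached MMIO access, so doing
it once per drain pass rather than once per entry saves a pile of writes when completions
arrive in bursts.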
Display more information, tidy the code that does it
NVMe controllers are very flexible and I’m dumping a bunch of information about how whatever
controller the driver is talking to is configured. Now we dump even more, tidy up the code
that does it a bit, adjust the formatting to be a bit more consistent, and prepare to stash
some of that information in the driver’s
nvme_device_t structure so we can reference it
from code that may want to take advantage of optional features in the future.
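Roughly the idea (nvme_device_t is the driver’s real structure name, but the fields shown
here are illustrative guesses about what’s worth caching):

    #include <stdbool.h>
    #include <stdint.h>

    // Sketch: controller properties worth caching after IDENTIFY, so
    // feature-dependent code can consult them later without re-issuing
    // admin commands. Field names are illustrative.
    typedef struct nvme_device {
        // ... MMIO registers, queues, locks, etc.
        uint32_t max_data_transfer;    // MDTS, converted to bytes
        uint16_t max_queue_entries;    // CAP.MQES (zero-based in hardware)
        uint16_t atomic_write_units;   // AWUN, in blocks
        bool volatile_write_cache;     // VWC: flush matters if set
        bool sgl_support;              // SGLs available vs. PRP lists only
    } nvme_device_t;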
Here’s some typical output from the driver, for a Corsair M.2 NVMe SSD:
    nvme: version 1.2.0
    nvme: page size: (MPSMIN): 4096 (MPSMAX): 4096
    nvme: doorbell stride: 4
    nvme: timeout: 65536 ms
    nvme: boot partition support (BPS): N
    nvme: supports NVM command set (CSS:NVM): Y
    nvme: subsystem reset supported (NSSRS): N
    nvme: weighted-round-robin (AMS:WRR): Y
    nvme: vendor-specific arbitration (AMS:VS): N
    nvme: contiguous queues required (CQR): Y
    nvme: maximum queue entries supported (MQES): 65536
    nvme: model: 'Force MP500'
    nvme: serial number: '17457994000122410152'
    nvme: firmware: 'E7FM04.5'
    nvme: max outstanding commands: 0
    nvme: max namespaces: 1
    nvme: scatter gather lists (SGL): N 00000000
    nvme: max data transfer: 2097152 bytes
    nvme: sanitize caps: 0
    nvme: abort command limit (ACL): 4
    nvme: asynch event req limit (AERL): 4
    nvme: firmware: slots: 1 reset: Y slot1ro: N
    nvme: host buffer: min/preferred: 0/0 pages
    nvme: capacity: total/unalloc: 0/0
    nvme: volatile write cache (VWC): Y
    nvme: atomic write unit (AWUN)/(AWUPF): 256/1 blks
    nvme: feature: FIRMWARE_DOWNLOAD_COMMIT
    nvme: feature: FORMAT_NVM
    nvme: feature: SECURITY_SEND_RECV
    nvme: feature: SAVE_SELECT_NONZERO
    nvme: feature: WRITE_UNCORRECTABLE
    nvme: ns: atomic write unit (AWUN)/(AWUPF): 256/1 blks
    nvme: ns: NABSN/NABO/NABSPF/NOIOB: 255/0/0/0
    nvme: ns: LBA FMT 00: RP=1 LBADS=2^9b MS=0b
    nvme: ns: LBA FMT 01: RP=0 LBADS=2^12b MS=0b
    nvme: ns: LBA FMT #0 active
    nvme: ns: data protection: caps/set: 0x00/0
    nvme: ns: size/cap/util: 234441648/234441648/234441648 blks
Later this will be much reduced and only displayed if verbose debug chatter is requested.
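As an aside, the LBA FMT lines in that dump come from the namespace’s LBA format table, where
the block size is expressed as a power of two: format #0 above is 2^9 = 512-byte blocks and
format #1 is 2^12 = 4096-byte blocks. Here’s a sketch of the decoding, using the descriptor
layout from the Identify Namespace data structure (the function itself is illustrative):

    #include <stdint.h>
    #include <stdio.h>

    // Each LBA format descriptor is a 32-bit word: metadata size (MS) in
    // bits 15:0, LBA data size as a power of two (LBADS) in bits 23:16,
    // and relative performance (RP) in bits 25:24.
    static void print_lba_format(int idx, uint32_t lbaf) {
        uint32_t ms = lbaf & 0xffff;
        uint32_t lbads = (lbaf >> 16) & 0xff;
        uint32_t rp = (lbaf >> 24) & 0x3;
        printf("nvme: ns: LBA FMT %02d: RP=%u LBADS=2^%ub MS=%ub\n",
               idx, rp, lbads, ms);
        // The actual block size is 1 << lbads: 2^9 = 512, 2^12 = 4096.
    }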
Some cleanup, make Plextor devices work
Added code to cancel in-flight transactions when the driver shuts down.
Added some (disabled) code to do a SHUTDOWN operation before RESET – a Plextor SSD was erroring
out when configuring the IO completion queue, and my initial theory was that maybe it didn’t
like the abrupt reset. It turned out I was erroneously setting the namespace ID in the CREATE
QUEUE command – neither Qemu nor six other NVMe controllers cared about this, but the Plextor
controller is more particular about spec adherence here!
So this change also tidies up the various setup commands and leaves the namespace ID as 0 for the
several commands where it should be zero.
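For illustration, here’s roughly what a correctly built CREATE IO COMPLETION QUEUE command
looks like, with a simplified submission queue entry – the opcode and dword layout come from
the NVMe spec, but the type and helper names are made up for this sketch:

    #include <stdint.h>
    #include <string.h>

    // Simplified 64-byte NVMe submission queue entry.
    typedef struct {
        uint8_t  opcode;      // CDW0 bits 7:0
        uint8_t  flags;       // CDW0: fused op / PRP vs SGL select
        uint16_t cid;         // command identifier
        uint32_t nsid;        // namespace ID: must stay 0 here
        uint32_t rsvd[4];     // reserved + metadata pointer
        uint64_t prp1;        // physical address of the queue memory
        uint64_t prp2;
        uint32_t cdw10;       // queue id | (queue size - 1) << 16
        uint32_t cdw11;       // PC / interrupt enable / interrupt vector
        uint32_t cdw12_15[4];
    } nvme_sqe_t;

    #define NVME_ADMIN_OP_CREATE_IOCQ 0x05

    // The namespace ID means nothing for queue management commands and
    // must be left at 0 – some controllers (hello, Plextor) reject the
    // command otherwise.
    static void build_create_iocq(nvme_sqe_t* sqe, uint16_t qid,
                                  uint16_t qsize, uint64_t queue_pa) {
        memset(sqe, 0, sizeof(*sqe));   // nsid (and all reserved fields) = 0
        sqe->opcode = NVME_ADMIN_OP_CREATE_IOCQ;
        sqe->prp1 = queue_pa;
        sqe->cdw10 = (uint32_t)qid | (((uint32_t)qsize - 1) << 16);
        sqe->cdw11 = 1;                 // PC=1: physically contiguous queue
    }

memset()ing the whole entry and then filling in only the fields a command actually uses is the
easy way to keep reserved fields – including the namespace ID – zeroed.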
I still haven’t sorted out why the legacy PCI IRQs are not working properly with Qemu. I put
together a patch for Qemu to support PCI MSI interrupts, which works around that problem from
the other side, removed the polling hack I had in the driver for use on Qemu, and filed a bug
against the owner of our PCI subsystem so he can investigate the legacy IRQ interaction.
The patch to Qemu (which we should tidy up and send upstream as we’ve done with other Qemu patches)
is currently applied to our local Qemu tree over here: