vfio-pci passthrough
Fei Li
fli@suse.com
Summary
➢ What is VFIO/IOMMU? Why do we want them?
➢ VFIO – QEMU part
➢ VFIO – kernel part
➢ VFIO usage: how to pass through a PCI device
- What is VFIO/IOMMU? Why do we want them?
What is VFIO/IOMMU?
➢ The VFIO (Virtual Function I/O) driver is an IOMMU/device agnostic
framework for exposing direct device access to userspace, in a
secure, IOMMU protected environment.
➢ On x86, it requires hardware IOMMU support (Intel VT-d or AMD-Vi).
➢ VFIO consists of:
- kernel device drivers: vfio_pci_driver, vfio_iommu_driver, vfio_dma
- QEMU device classes: VFIODevice, VFIOPCIDevice
➢ The passed-through PCI device is operated by:
- accessing the mapped PCI config space and memory space
- calling ioctl() on an fd of the VFIO kernel device for control operations
What is VFIO/IOMMU? (continued)
➢ QEMU uses VFIO to configure the IOMMU,
e.g. ioctl(VFIO_SET_IOMMU) followed by ioctl(VFIO_IOMMU_MAP_DMA)
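The sequence that leads up to VFIO_SET_IOMMU can be sketched roughly as below. This is a minimal sketch modelled on the example in the kernel's VFIO documentation, not QEMU's actual code: the helper names vfio_group_path() and vfio_setup_container() are hypothetical, and the Type1 IOMMU backend (the usual one on x86) is assumed. Once VFIO_SET_IOMMU succeeds, the returned container fd accepts VFIO_IOMMU_MAP_DMA.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical helper: build the group node path, e.g. 26 -> "/dev/vfio/26". */
int vfio_group_path(char *buf, size_t len, int group_id)
{
    int n = snprintf(buf, len, "/dev/vfio/%d", group_id);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Hypothetical helper: open a container, attach one group and select the
 * Type1 IOMMU backend. Returns the container fd (and the group fd through
 * group_out), or -1 on any failure. The group fd must stay open for as
 * long as the container is in use.
 */
int vfio_setup_container(int group_id, int *group_out)
{
    struct vfio_group_status status = { .argsz = sizeof(status) };
    char path[64];
    int container, group;

    if (vfio_group_path(path, sizeof(path), group_id) < 0)
        return -1;

    container = open("/dev/vfio/vfio", O_RDWR);
    if (container < 0)
        return -1;

    /* The only things a bare container offers: version and extension checks. */
    if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION ||
        ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) <= 0)
        goto fail_container;

    group = open(path, O_RDWR);
    if (group < 0)
        goto fail_container;

    /* The group must be viable (all its devices bound to VFIO) before it
     * can join the container; only then can the IOMMU model be set. */
    if (ioctl(group, VFIO_GROUP_GET_STATUS, &status) < 0 ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE) ||
        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container) < 0 ||
        ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
        goto fail_group;

    *group_out = group;
    return container;

fail_group:
    close(group);
fail_container:
    close(container);
    return -1;
}
```

A caller would pass the IOMMU group number of the device, which can be read from the iommu_group symlink under the device's sysfs directory.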
VFIO: device, group, container
➢ A group is a set of devices that the IOMMU can isolate from all other
devices in the system. It is the minimum granularity of isolation.
➢ Within one container, different groups can share one set of IOMMU page
tables, which avoids duplication. On its own, the container provides
little functionality: a version check and extension queries.
➢ The user must add a group to the container.
➢ The VFIO device API includes ioctls for describing the device, the I/O
regions and their read/write/mmap offsets on the device descriptor,
and mechanisms for describing and registering interrupt
notifications.
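The "describing" half of the device API can be illustrated with a short sketch. It assumes a group fd that has already been added to a container with an IOMMU model set; the function names vfio_get_device() and vfio_describe_device() are hypothetical, and the PCI address in the comment is only an example.

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical helper: get a device fd from its group.
 * name is a PCI address string such as "0000:06:0d.0" (example only). */
int vfio_get_device(int group, const char *name)
{
    return ioctl(group, VFIO_GROUP_GET_DEVICE_FD, name);
}

/* Hypothetical helper: print the regions and IRQ indexes of a device.
 * Returns 0 on success, -1 if the device info cannot be read. */
int vfio_describe_device(int device)
{
    struct vfio_device_info info = { .argsz = sizeof(info) };
    unsigned int i;

    if (ioctl(device, VFIO_DEVICE_GET_INFO, &info) < 0)
        return -1;

    for (i = 0; i < info.num_regions; i++) {
        struct vfio_region_info reg = { .argsz = sizeof(reg), .index = i };

        if (ioctl(device, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
            continue;   /* some indexes may be unimplemented */
        /* offset is where this region lives on the device fd for
         * pread()/pwrite()/mmap(). */
        printf("region %u: size=0x%llx offset=0x%llx mmap=%s\n", i,
               (unsigned long long)reg.size,
               (unsigned long long)reg.offset,
               (reg.flags & VFIO_REGION_INFO_FLAG_MMAP) ? "yes" : "no");
    }

    for (i = 0; i < info.num_irqs; i++) {
        struct vfio_irq_info irq = { .argsz = sizeof(irq), .index = i };

        if (ioctl(device, VFIO_DEVICE_GET_IRQ_INFO, &irq) < 0)
            continue;
        printf("irq %u: count=%u eventfd=%s\n", i, irq.count,
               (irq.flags & VFIO_IRQ_INFO_EVENTFD) ? "yes" : "no");
    }
    return 0;
}
```

Registering an interrupt notification would then go through VFIO_DEVICE_SET_IRQS with an eventfd, which is left out here to keep the sketch short.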
Why do we want them?
➢ In short: higher I/O performance, by reducing the number of
VM-exit/VM-entry transitions when accessing PCI BARs and doing DMA.
Why do we want them? (in detail)
➢ When accessing a PCI BAR: the emulated guest BIOS assigns the BAR
addresses seen by the guest. (If the host's real BARs were exposed to
the guest instead, they could conflict with the BARs of the other emulated
PCI devices in the guest.) The first time a BAR is accessed and the
address is not yet mapped, the VM exits, the mapping between GPA and HPA
is set up in the EPT, and the mapping is recorded.
Later accesses need no VM-exit.
➢ When the PCI device accesses guest memory (GPA) via DMA: while
initializing the VFIO device in vfio_realize() in QEMU, a memory mapping
is established via vfio_region_mmap(). Then
ioctl(s->container, VFIO_IOMMU_MAP_DMA, &dma_map) in qemu_vfio_do_mapping()
creates the actual mapping between the VFIO IOVA and the HPA.
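The BAR case can be sketched as follows: userspace asks VFIO where a BAR lives on the device fd and mmap()s it, so that later accesses can go straight to the hardware. This is a minimal sketch, not QEMU's vfio_region_mmap(); the function name vfio_map_bar() is hypothetical, and it assumes a vfio-pci device fd whose BAR supports mmap.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: map BAR number bar (0..5) of a vfio-pci device
 * into this process. Returns the mapping (size in *size_out), or NULL
 * if the BAR is absent, not mmap-capable, or the mapping fails.
 */
void *vfio_map_bar(int device, unsigned int bar, size_t *size_out)
{
    struct vfio_region_info reg = { .argsz = sizeof(reg) };
    void *p;

    if (bar > 5)
        return NULL;
    reg.index = VFIO_PCI_BAR0_REGION_INDEX + bar;

    if (ioctl(device, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
        return NULL;
    if (reg.size == 0 || !(reg.flags & VFIO_REGION_INFO_FLAG_MMAP))
        return NULL;

    /* reg.offset is the position of this BAR on the device fd. */
    p = mmap(NULL, reg.size, PROT_READ | PROT_WRITE, MAP_SHARED,
             device, reg.offset);
    if (p == MAP_FAILED)
        return NULL;

    *size_out = reg.size;
    return p;
}
```

In a VMM the returned host virtual address would be handed to the hypervisor as the backing of the guest-visible BAR, so that the first guest access can be resolved into an EPT entry and later ones proceed without an exit.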
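The DMA case comes down to filling a struct vfio_iommu_type1_dma_map and passing it to VFIO_IOMMU_MAP_DMA. The sketch below assumes the common setup without a virtual IOMMU, where the guest physical address is used directly as the IOVA; the function names and the 4 KiB page-size constant are assumptions for illustration, not taken from QEMU.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

#define SKETCH_PAGE_MASK 0xfffULL   /* assumption: 4 KiB IOMMU pages */

/*
 * Hypothetical helper: describe one block of guest RAM for the IOMMU.
 * gpa becomes the IOVA the device will use; hva is where that RAM lives
 * in the VMM process. Returns -1 if the range is empty or not page aligned.
 */
int vfio_fill_dma_map(struct vfio_iommu_type1_dma_map *map,
                      uint64_t gpa, void *hva, uint64_t size, int readonly)
{
    if (size == 0 || ((gpa | (uintptr_t)hva | size) & SKETCH_PAGE_MASK))
        return -1;

    memset(map, 0, sizeof(*map));
    map->argsz = sizeof(*map);
    map->flags = VFIO_DMA_MAP_FLAG_READ |
                 (readonly ? 0 : VFIO_DMA_MAP_FLAG_WRITE);
    map->vaddr = (uintptr_t)hva;    /* host virtual address */
    map->iova  = gpa;               /* address seen by the device */
    map->size  = size;
    return 0;
}

/* Hypothetical helper: fill the request and issue the ioctl on a container
 * fd that already has an IOMMU model set. Returns 0 or -1. */
int vfio_map_guest_ram(int container, uint64_t gpa, void *hva, uint64_t size)
{
    struct vfio_iommu_type1_dma_map map;

    if (vfio_fill_dma_map(&map, gpa, hva, size, 0) < 0)
        return -1;
    return ioctl(container, VFIO_IOMMU_MAP_DMA, &map) < 0 ? -1 : 0;
}
```

The kernel pins the pages behind hva and programs the IOMMU so that device accesses to the IOVA land on the corresponding host physical pages, which is the IOVA-to-HPA mapping described above.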