<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.1">Jekyll</generator><link href="https://people.redhat.com/~cohuck/feed.xml" rel="self" type="application/atom+xml" /><link href="https://people.redhat.com/~cohuck/" rel="alternate" type="text/html" /><updated>2022-07-18T10:03:41+02:00</updated><id>https://people.redhat.com/~cohuck/feed.xml</id><title type="html">KVM, QEMU, and more</title><subtitle>A blog about free and open source virtualization</subtitle><entry><title type="html">VIRTIO 1.2 is out!</title><link href="https://people.redhat.com/~cohuck/2022/07/18/virtio-12-is-out.html" rel="alternate" type="text/html" title="VIRTIO 1.2 is out!" /><published>2022-07-18T09:45:00+02:00</published><updated>2022-07-18T09:45:00+02:00</updated><id>https://people.redhat.com/~cohuck/2022/07/18/virtio-12-is-out</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2022/07/18/virtio-12-is-out.html">&lt;p&gt;&lt;a href=&quot;https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.pdf&quot;&gt;A new version of the virtio specification&lt;/a&gt;
has been released! As it has been three years since the 1.1 release, quite a
lot of changes have accumulated. I have attempted to list some of them below;
for details, you are invited to check out the spec :)&lt;/p&gt;

&lt;p&gt;There are already some changes queued for 1.3; let’s hope it won’t take us
three years again before the next release ;)&lt;/p&gt;

&lt;h1 id=&quot;new-device-types-in-12&quot;&gt;New device types in 1.2&lt;/h1&gt;

&lt;p&gt;Several new device types have been added.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;virtio-pmem: persistent memory device; useful to avoid a separate page cache
in the guest&lt;/li&gt;
  &lt;li&gt;virtio-fs: access a file system; kind of the spiritual successor to the never
officially standardized virtio-9p&lt;/li&gt;
  &lt;li&gt;virtio-rpmb: a tamper-resistant and anti-replay storage device&lt;/li&gt;
  &lt;li&gt;virtio-iommu: can either be a proxy for a physical IOMMU or act as a virtual
IOMMU&lt;/li&gt;
  &lt;li&gt;virtio-snd: a sound card supporting input and output PCM streams&lt;/li&gt;
  &lt;li&gt;virtio-mem: provides a memory region in guest physical address space; useful
to implement memory hot(un)plugging&lt;/li&gt;
  &lt;li&gt;virtio-i2c: a virtual I2C adapter&lt;/li&gt;
  &lt;li&gt;virtio-scmi: implements the Arm System Control and Management Interface, for
things like sensors etc.&lt;/li&gt;
  &lt;li&gt;virtio-gpio: a virtual GPIO device to manage named I/O lines&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;new-features-for-existing-device-types&quot;&gt;New features for existing device types&lt;/h1&gt;

&lt;p&gt;Enhancements have been added to some already existing device types.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;virtio-blk
    &lt;ul&gt;
      &lt;li&gt;multiqueue support&lt;/li&gt;
      &lt;li&gt;lifetime metrics&lt;/li&gt;
      &lt;li&gt;secure erase&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;virtio-net
    &lt;ul&gt;
      &lt;li&gt;support for the guest providing the exact header length&lt;/li&gt;
      &lt;li&gt;receive-side scaling&lt;/li&gt;
      &lt;li&gt;per-packet hash reporting&lt;/li&gt;
      &lt;li&gt;per-virtqueue driver notifications&lt;/li&gt;
      &lt;li&gt;UDP segmentation offload&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;virtio-gpu
    &lt;ul&gt;
      &lt;li&gt;support for 3D commands&lt;/li&gt;
      &lt;li&gt;resource sharing&lt;/li&gt;
      &lt;li&gt;blob resources&lt;/li&gt;
      &lt;li&gt;context initialization&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;virtio-balloon
    &lt;ul&gt;
      &lt;li&gt;free page hints&lt;/li&gt;
      &lt;li&gt;page poisoning&lt;/li&gt;
      &lt;li&gt;free page reporting&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;virtio-vsock
    &lt;ul&gt;
      &lt;li&gt;seqpacket sockets&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;features-not-specific-to-a-device-type&quot;&gt;Features not specific to a device type&lt;/h1&gt;

&lt;p&gt;Some general enhancements include:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;support for vendor-specific PCI capabilities&lt;/li&gt;
  &lt;li&gt;support for sharing resources between devices&lt;/li&gt;
  &lt;li&gt;support for resetting individual virtqueues&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">A new version of the virtio specification has been released! As it has been three years after the 1.1 release, quite a lot of changes have accumulated. I have attempted to list some of them below; for details, you are invited to check out the spec :)</summary></entry><entry><title type="html">QEMU machine types and compatibility (part 2)</title><link href="https://people.redhat.com/~cohuck/2022/01/21/qemu-machine-types-part2.html" rel="alternate" type="text/html" title="QEMU machine types and compatibility (part 2)" /><published>2022-01-21T13:45:00+01:00</published><updated>2022-01-21T13:45:00+01:00</updated><id>https://people.redhat.com/~cohuck/2022/01/21/qemu-machine-types-part2</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2022/01/21/qemu-machine-types-part2.html">&lt;p&gt;In the &lt;a href=&quot;/~cohuck/2022/01/05/qemu-machine-types.html&quot;&gt;first part&lt;/a&gt; of this
article, I talked about how you can use versioned machine types to ensure
compatibility. But the more interesting part is how this actually works
under the covers.&lt;/p&gt;

&lt;h1 id=&quot;device-properties-and-making-them-compatible&quot;&gt;Device properties, and making them compatible&lt;/h1&gt;

&lt;p&gt;QEMU devices often come with a list of properties that influence how the
device is created and how it operates. Typically, authors try to come up
with reasonable default values, which may be overridden if desired. However,
the idea of what is considered reasonable may change over time, and a newer
QEMU may provide a different default value for a property.&lt;/p&gt;

&lt;p&gt;If you want to migrate a guest from an older QEMU machine to a more recent
QEMU, you obviously need to use the default values from that older QEMU machine
as well. For that, QEMU uses arrays of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GlobalProperty&lt;/code&gt; structures.&lt;/p&gt;

&lt;p&gt;If you take a look at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw/core/machine.c&lt;/code&gt;, you will notice several arrays named
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw_compat_&amp;lt;major&amp;gt;_&amp;lt;minor&amp;gt;&lt;/code&gt;. These contain triplets specifying (from right to
left) the default value for a certain property for a certain device. The arrays
are designed to be included by the compat machine for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;major&amp;gt;.&amp;lt;minor&amp;gt;&lt;/code&gt;, thus
specifying a default value for that machine version and older. (More on this
later in this article.)&lt;/p&gt;

&lt;p&gt;For example, QEMU 5.2 changed the default number of virtio queues defined for
virtio-blk and virtio-scsi devices: up to and including 5.1, one queue would be present
if no other value had been specified; with 5.2, the default number of queues
would align with the number of vcpus for virtio-pci. Therefore, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw_compat_5_1&lt;/code&gt;
contains the following lines:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;{ &quot;virtio-blk-device&quot;, &quot;num-queues&quot;, &quot;1&quot;},
{ &quot;virtio-scsi-device&quot;, &quot;num_queues&quot;, &quot;1&quot;},
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(and some corresponding lines for vhost.) This makes sure that any virtio-blk
or virtio-scsi device on a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-5.1&lt;/code&gt; or older machine type will have one virtio
queue by default. Note that this holds true for &lt;em&gt;all&lt;/em&gt; virtio-blk and virtio-scsi
devices, regardless of which transport they are using; for transports like ccw
where nothing changed with 5.2, this simply does not make any difference.&lt;/p&gt;
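
&lt;p&gt;To make this a bit more tangible, here is a rough command line sketch. I am
assuming an x86 binary here, and the drive id &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;disk0&lt;/code&gt; and the image
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;disk.qcow2&lt;/code&gt; are made up for this example. With a 5.1 machine type, the
virtio-blk device should end up with a single queue; a value given explicitly
for the device should still take precedence over the compat default.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# 5.1 machine type: the compat property gives the device one queue
qemu-system-x86_64 -machine pc-q35-5.1 -smp 4 \
    -drive if=none,id=disk0,file=disk.qcow2,format=qcow2 \
    -device virtio-blk-pci,drive=disk0

# same machine type, but explicitly asking for four queues
qemu-system-x86_64 -machine pc-q35-5.1 -smp 4 \
    -drive if=none,id=disk0,file=disk.qcow2,format=qcow2 \
    -device virtio-blk-pci,drive=disk0,num-queues=4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;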

&lt;p&gt;Generally, statements for all devices can go into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw_compat_&lt;/code&gt; arrays; if a
device is not present, or not even available, for the machine that is
started, the statement will simply have no effect.&lt;/p&gt;

&lt;h2 id=&quot;x86-considerations&quot;&gt;x86 considerations&lt;/h2&gt;

&lt;p&gt;For the x86 machine types (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-i440fx&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-q35&lt;/code&gt;), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc_compat_&amp;lt;major&amp;gt;_&amp;lt;minor&amp;gt;&lt;/code&gt;
arrays are defined in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw/i386/pc.c&lt;/code&gt;, mostly covering properties for x86 cpus,
but also some other x86-specific devices.&lt;/p&gt;

&lt;h1 id=&quot;per-machine-changes&quot;&gt;Per-machine changes&lt;/h1&gt;

&lt;p&gt;Some incompatible changes are not happening at the device property level, so the
compat properties approach cannot be used. Instead, the individual machines need
to take care of those changes.&lt;/p&gt;

&lt;p&gt;For example, in QEMU 6.2 the smp parsing code started to prefer cores over sockets
instead of preferring sockets. Therefore, all 6.1 compat machines have code like&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;m-&amp;gt;smp_props.prefer_sockets = true;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;to set prefer_sockets to true in the MachineClass. (Note that the m68k virt machine
does not support smp, and therefore does not need that statement.)&lt;/p&gt;
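
&lt;p&gt;As an illustration, this is what I would expect for a plain
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-smp 8&lt;/code&gt; on the x86 q35 machine types. The topologies in the comments follow
from the description above and are not captured output; the rest of the command
line is left out.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# 6.1 compat machine: prefers sockets, so this should give 8 sockets with 1 core each
qemu-system-x86_64 -machine pc-q35-6.1 -smp 8 (...)

# 6.2 machine: prefers cores, so this should give 1 socket with 8 cores
qemu-system-x86_64 -machine pc-q35-6.2 -smp 8 (...)

# spelling out the topology avoids relying on either default
qemu-system-x86_64 -machine pc-q35-6.2 -smp 8,sockets=2,cores=4,threads=1 (...)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;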

&lt;p&gt;Machines also sometimes need to configure associated capabilities in a compatible
way. For example, the s390x cpu models may gain new feature flags in newer QEMU
releases; when using a compat machine, those new flags need to be off in the cpu
models that are used by default.&lt;/p&gt;

&lt;h1 id=&quot;inheritance&quot;&gt;Inheritance&lt;/h1&gt;

&lt;p&gt;Compat machines for older machine types need the compatibility changes for newer
machine types as well as some changes on top. Typically, this is done by having the
MachineState or MachineClass initialization functions for version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n-1&lt;/code&gt;
call the corresponding initialization functions for version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n&lt;/code&gt;. As all new
compatibility changes are added for the latest versioned machine type, changes
are propagated down the whole stack of versions.&lt;/p&gt;

&lt;p&gt;All machine types for version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n&lt;/code&gt; include the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw_compat_&amp;lt;n&amp;gt;&lt;/code&gt; array (and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc_compat_&amp;lt;n&amp;gt;&lt;/code&gt; array for x86), unless they are the latest version (which does
not need any compat handling yet). The older compat property arrays are included
via the inheritance mechanism.&lt;/p&gt;

&lt;h1 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h1&gt;

&lt;p&gt;QEMU currently supports versioned machine types for x86 (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-i440fx&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-q35&lt;/code&gt;),
arm (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt&lt;/code&gt;), aarch64 (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt&lt;/code&gt;), s390x (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s390-ccw-virtio&lt;/code&gt;), ppc64 (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries&lt;/code&gt;),
and m68k (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt&lt;/code&gt;). At the beginning of each development cycle, new (empty) arrays
of compat properties for the last version are added and wired up in the machine
types for that last version, new versions of each of these machines are added to
the code, and the defaults switched to them (well, that’s the goal.) After that,
the framework for adding incompatible changes is in place.&lt;/p&gt;

&lt;p&gt;If you find that these changes have not yet been made when you plan to make an
incompatible change, it is important that you add the new machine types first.&lt;/p&gt;

&lt;h2 id=&quot;new-and-incompatible-device-properties&quot;&gt;New and incompatible device properties&lt;/h2&gt;

&lt;p&gt;If you plan to change the default value of a device property, or add a new property
with a default value that will cause guest-observable changes, you need to add
an entry that preserves the old value (or sets a value that does not change the
behaviour) to the compat property array for the last version. For a change that is
not x86-specific, that means adding it to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hw_compat_&lt;/code&gt; array, and all machine
types will use it automatically.&lt;/p&gt;

&lt;p&gt;Take care to use the right device for specifying the property; for example, there
is often some confusion when dealing with virtio devices. If, for example, you modify a
virtio-blk property (as in the example above), you need to add a statement for
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virtio-blk-device&lt;/code&gt; and not for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virtio-blk-pci&lt;/code&gt;, or virtio-blk instances using
the ccw or mmio transports would be left out. If, on the other hand, you modify
a property only for virtio-blk devices using the pci transport, you need to add
a statement for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virtio-blk-pci&lt;/code&gt;. Similar considerations apply to other devices
inheriting from base types.&lt;/p&gt;

&lt;h2 id=&quot;per-machine-changes-1&quot;&gt;Per-machine changes&lt;/h2&gt;

&lt;p&gt;If you change a non-device default characteristic, you need to add a compatibility
statement for the machine types for the last version in their instance (or class)
init functions. The hardest part here is making sure that all relevant machine
types get the update.&lt;/p&gt;

&lt;p&gt;For example, if you add a change in the s390x cpu models, it is easy to see that
you only need to modify the code for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s390-ccw-virtio&lt;/code&gt; machine. For other
changes, every versioned machine needs the change. And there are cases like
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prefer_sockets&lt;/code&gt; change mentioned above, that apply to any machine type
that supports smp.&lt;/p&gt;

&lt;p&gt;I hope that these explanations help a bit with understanding how machine type
compatibility works, and where to add your own changes.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">In the first part of this article, I talked about how you can use versioned machine types to ensure compatibility. But the more interesting part is how this actually works under the covers.</summary></entry><entry><title type="html">QEMU machine types and compatibility</title><link href="https://people.redhat.com/~cohuck/2022/01/05/qemu-machine-types.html" rel="alternate" type="text/html" title="QEMU machine types and compatibility" /><published>2022-01-05T12:30:00+01:00</published><updated>2022-01-05T12:30:00+01:00</updated><id>https://people.redhat.com/~cohuck/2022/01/05/qemu-machine-types</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2022/01/05/qemu-machine-types.html">&lt;p&gt;If you want to migrate a guest initially started on an older QEMU version to
a newer version of QEMU, you need to make sure that the two machines are
actually compatible with each other. Once you exclude things like devices
that cannot be migrated at all and make sure both QEMU invocations actually
create the same virtual hardware, this basically boils down to using compatible
machines.&lt;/p&gt;

&lt;h1 id=&quot;versioned-machine-types&quot;&gt;Versioned machine types&lt;/h1&gt;

&lt;p&gt;If you simply want to create a machine without any consideration regarding
migration compatibility, you will usually do something like&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-system-ppc64 -machine pseries (...)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will create a machine of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries&lt;/code&gt; type. But in this case, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries&lt;/code&gt;
is actually an alias to the latest version of this machine type; for 6.2,
this would be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries-6.2&lt;/code&gt;. You can find out which machine types are versioned
(and which machine types actually exist for a given binary) via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-machine ?&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ qemu-system-ppc64 -machine ?
Supported machines are:
40p                  IBM RS/6000 7020 (40p)
bamboo               bamboo
g3beige              Heathrow based PowerMAC
mac99                Mac99 based PowerMAC
mpc8544ds            mpc8544ds
none                 empty machine
pegasos2             Genesi/bPlan Pegasos II
powernv10            IBM PowerNV (Non-Virtualized) POWER10
powernv8             IBM PowerNV (Non-Virtualized) POWER8
powernv              IBM PowerNV (Non-Virtualized) POWER9 (alias of powernv9)
powernv9             IBM PowerNV (Non-Virtualized) POWER9
ppce500              generic paravirt e500 platform
pseries-2.1          pSeries Logical Partition (PAPR compliant)
pseries-2.10         pSeries Logical Partition (PAPR compliant)
pseries-2.11         pSeries Logical Partition (PAPR compliant)
pseries-2.12         pSeries Logical Partition (PAPR compliant)
pseries-2.12-sxxm    pSeries Logical Partition (PAPR compliant)
pseries-2.2          pSeries Logical Partition (PAPR compliant)
pseries-2.3          pSeries Logical Partition (PAPR compliant)
pseries-2.4          pSeries Logical Partition (PAPR compliant)
pseries-2.5          pSeries Logical Partition (PAPR compliant)
pseries-2.6          pSeries Logical Partition (PAPR compliant)
pseries-2.7          pSeries Logical Partition (PAPR compliant)
pseries-2.8          pSeries Logical Partition (PAPR compliant)
pseries-2.9          pSeries Logical Partition (PAPR compliant)
pseries-3.0          pSeries Logical Partition (PAPR compliant)
pseries-3.1          pSeries Logical Partition (PAPR compliant)
pseries-4.0          pSeries Logical Partition (PAPR compliant)
pseries-4.1          pSeries Logical Partition (PAPR compliant)
pseries-4.2          pSeries Logical Partition (PAPR compliant)
pseries-5.0          pSeries Logical Partition (PAPR compliant)
pseries-5.1          pSeries Logical Partition (PAPR compliant)
pseries-5.2          pSeries Logical Partition (PAPR compliant)
pseries-6.0          pSeries Logical Partition (PAPR compliant)
pseries-6.1          pSeries Logical Partition (PAPR compliant)
pseries              pSeries Logical Partition (PAPR compliant) (alias of pseries-6.2)
pseries-6.2          pSeries Logical Partition (PAPR compliant) (default)
ref405ep             ref405ep
sam460ex             aCube Sam460ex
taihu                taihu
virtex-ml507         Xilinx Virtex ML507 reference design
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As you can see, there are various &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries-x.y&lt;/code&gt; machine types for older versions;
these are designed to present a configuration that is compatible with a default
machine that was created with an older QEMU version. For example, if you wanted
to migrate a guest running on a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries&lt;/code&gt; machine that was created using QEMU
5.1, the receiving QEMU would need to be started with&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;qemu-system-ppc64 -machine pseries-5.1 (...)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
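
&lt;p&gt;A minimal sketch of how the two sides could fit together, assuming a TCP
migration: the host name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;desthost&lt;/code&gt; and the port &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;4444&lt;/code&gt; are placeholders, and
the remainder of the command line (which has to describe the same virtual
hardware as on the source) is left out.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# destination host: same machine type as the source, waiting for the migration
qemu-system-ppc64 -machine pseries-5.1 (...) -incoming tcp:0:4444

# source host: in the QEMU monitor of the running guest
(qemu) migrate tcp:desthost:4444
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;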

&lt;h2 id=&quot;supported-machine-types&quot;&gt;Supported machine types&lt;/h2&gt;

&lt;p&gt;Note: the following applies to upstream QEMU. Distributions may support different
versioned machine types in their builds.&lt;/p&gt;

&lt;p&gt;This list is as of QEMU 6.2; new versioned machine types may be added in the
future, and sometimes old ones deprecated and removed. The machine types for the
next QEMU release are usually introduced early in the release cycle (at least,
that is the goal…)&lt;/p&gt;

&lt;h3 id=&quot;arm-aarch64&quot;&gt;arm, aarch64&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt&lt;/code&gt; machine type supports versions since 2.6.&lt;/p&gt;

&lt;h3 id=&quot;m68k&quot;&gt;m68k&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt&lt;/code&gt; machine type supports versions since 6.0.&lt;/p&gt;

&lt;h3 id=&quot;ppc64&quot;&gt;ppc64&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pseries&lt;/code&gt; machine type supports versions since 2.1.&lt;/p&gt;

&lt;h3 id=&quot;s390x&quot;&gt;s390x&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s390-ccw-virtio&lt;/code&gt; machine type supports versions since 2.4.&lt;/p&gt;

&lt;h3 id=&quot;i386-x86_64&quot;&gt;i386, x86_64&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-i440fx&lt;/code&gt; machine type supports versions since 1.4 (there used to be even
older ones, but they have been removed), while the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-q35&lt;/code&gt; machine type supports
versions since 2.4.&lt;/p&gt;

&lt;p&gt;There’s an additional thing to consider here: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc&lt;/code&gt; machine type alias points
(as of QEMU 6.2) to the latest &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-i440fx&lt;/code&gt; machine type; if you want the latest
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pc-q35&lt;/code&gt; machine type instead, you have to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;q35&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&quot;how-to-use-this&quot;&gt;How to use this&lt;/h1&gt;

&lt;p&gt;If you want to simply fire up a QEMU instance and shut it down again without
wanting to migrate it anywhere, you can stick to the default machine type. However,
if you might want to migrate the machine later, it is probably a good idea to
specify a versioned machine type explicitly, so that you don’t have to remember
which QEMU version you started it with.&lt;/p&gt;
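
&lt;p&gt;For example (a sketch for an x86 binary; the version shown is simply what I
would expect with QEMU 6.2), you can first check what the aliases resolve to and
then spell out the versioned machine type yourself:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# show what the unversioned aliases of this binary currently resolve to
qemu-system-x86_64 -machine help | grep alias

# start the guest with the versioned machine type spelled out
qemu-system-x86_64 -machine pc-q35-6.2 (...)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;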

&lt;p&gt;Or just use management software like libvirt, which will do the machine type
expansion to the latest version for you automatically, so you don’t have to
worry about it later.&lt;/p&gt;

&lt;p&gt;This concludes the usage part of compatible machine types; a follow-up post
will look at how this is actually implemented.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">If you want to migrate a guest initially started on an older QEMU version to a newer version of QEMU, you need to make sure that the two machines are actually compatible with each other. Once you exclude things like devices that cannot be migrated at all and make sure both QEMU invocations actually create the same virtual hardware, this basically boils down to using compatible machines.</summary></entry><entry><title type="html">Blog update</title><link href="https://people.redhat.com/~cohuck/2021/11/02/blog-update.html" rel="alternate" type="text/html" title="Blog update" /><published>2021-11-02T00:00:00+01:00</published><updated>2021-11-02T00:00:00+01:00</updated><id>https://people.redhat.com/~cohuck/2021/11/02/blog-update</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2021/11/02/blog-update.html">&lt;p&gt;I have moved my blog to a new location and done some other changes at the same
time.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This blog is now generated via &lt;a href=&quot;https://jekyllrb.com/&quot;&gt;Jekyll&lt;/a&gt; (a huge thank
you to the authors here!) This makes posts easier to write for me (especially
when formatting command output and similar), and gets rid of intrusive scripts
as on the Blogger platform.&lt;/li&gt;
  &lt;li&gt;This blog’s title is now “KVM, QEMU, and more.” Observant readers may have
noticed that I dropped the “Big Iron” part. I may still post s390x-specific
content, but in general, I plan to write more about topics that are not
architecture-specific.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yes, I plan to actually post something new this year ;)&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">I have moved my blog to a new location and done some other changes at the same time.</summary></entry><entry><title type="html">s390x changes in QEMU 5.2</title><link href="https://people.redhat.com/~cohuck/2020/11/24/s390x-changes-in-qemu-52.html" rel="alternate" type="text/html" title="s390x changes in QEMU 5.2" /><published>2020-11-24T16:34:00+01:00</published><updated>2020-11-24T16:34:00+01:00</updated><id>https://people.redhat.com/~cohuck/2020/11/24/s390x-changes-in-qemu-52</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/11/24/s390x-changes-in-qemu-52.html">&lt;p&gt;As, once again, a new QEMU release is around the corner, the time has come to
list some s390x changes in there.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;TCG has gained emulation support for some additional instructions that had
been introduced with the z14. More enhancements needed to be able to run
distributions built for the z14 will likely come in the future.&lt;/li&gt;
  &lt;li&gt;When running under KVM, QEMU now supports the diagnose 0x318 instruction.
This can be used to set some diagnostic information (such as the operating
system), which may be helpful when servicing the hardware. With this comes
support for extended SCCBs; this is needed as the facility indication for
diag318 encroaches into the control block used for reporting CPU information.
A guest needs support for extended SCCBs to be able to see information for
all CPUs if diag318 support is provided.&lt;/li&gt;
  &lt;li&gt;You can now use virtiofs on s390x, thanks to some endianness fixes, and a
vhost-user-fs-ccw device has been added.&lt;/li&gt;
  &lt;li&gt;Up to now, both fully emulated PCI functions and PCI functions passed via
vfio-pci reported the same values when the guest issued CLP instructions.
However, the passed through functions may use different values for things
such as the supported DMA range. If the host kernel supplies the respective
capabilities for the vfio-pci device, QEMU can now provide the real values in
the CLP queries.&lt;/li&gt;
  &lt;li&gt;zPCI is now also able to honour vfio DMA limits, if passed via the vfio-pci
device, and can trigger the guest to flush its DMA mappings when needed.&lt;/li&gt;
  &lt;li&gt;The s390-ccw bios now tries harder to find a bootable device, if the first
device is not suitable. This brings s390x booting a bit closer to what other
architectures do.&lt;/li&gt;
  &lt;li&gt;And the usual fixes and cleanups.&lt;/li&gt;
&lt;/ul&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">As, once again, a new QEMU release is around the corner, the time has come to list some s390x changes in there.</summary></entry><entry><title type="html">Configuring mediated devices (Part 2)</title><link href="https://people.redhat.com/~cohuck/2020/07/29/configuring-mediated-devices-part-2.html" rel="alternate" type="text/html" title="Configuring mediated devices (Part 2)" /><published>2020-07-29T13:12:00+02:00</published><updated>2020-07-29T13:12:00+02:00</updated><id>https://people.redhat.com/~cohuck/2020/07/29/configuring-mediated-devices-part-2</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/07/29/configuring-mediated-devices-part-2.html">&lt;p&gt;In the &lt;a href=&quot;/~cohuck/2020/07/27/configuring-mediated-devices-part-1.html&quot;&gt;last part&lt;/a&gt;
of this article, I talked about configuring a mediated device directly via
sysfs. This is a bit cumbersome, and you may want to make your configuration
more permanent. Fortunately, there is tooling available for this.&lt;/p&gt;

&lt;h1 id=&quot;driverctl-bind-to-the-correct-driver&quot;&gt;driverctl: bind to the correct driver&lt;/h1&gt;

&lt;p&gt;&lt;a href=&quot;https://gitlab.com/driverctl/driverctl&quot;&gt;driverctl&lt;/a&gt; is a tool to manage the driver that a device may bind to.
As a device that is supposed to be used via vfio will need to be bound to a
vfio driver instead of its ‘normal’ driver, it makes sense to add some
configuration that makes sure that this binding is actually done automatically.
While driverctl had originally been implemented to work with PCI devices, the
css bus (for subchannel devices) supports management with driverctl as of Linux
5.3 as well. (The ap bus for crypto devices does not support setting driver
overrides, as it implements a different mechanism.)&lt;/p&gt;

&lt;h2 id=&quot;example-vfio-ccw&quot;&gt;Example (vfio-ccw)&lt;/h2&gt;

&lt;p&gt;Let’s reuse the example from the last post, where we wanted to assign the
device behind subchannel &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.0313&lt;/code&gt; to the guest. In order to set a driver
override, use&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# driverctl -b css set-override 0.0.0313 vfio_ccw
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If the subchannel is not already bound to the vfio-ccw driver,
it will be unbound from its driver and bound to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vfio_ccw&lt;/code&gt;. Moreover, a udev
rule to bind the subchannel to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vfio_ccw&lt;/code&gt; automatically in the future will be
added.&lt;/p&gt;

&lt;p&gt;Unfortunately, a word of caution regarding the udev rule is in order: As
uevents on the css bus for I/O subchannels are delayed until after device
recognition has been performed, automatic binding may not work out as desired.
We plan to address that in the future by reworking the way the css bus handles
uevents; until then, you may have to trigger a rebind manually. Also, keep in
mind that the subchannel id for a device may not be stable (as mentioned
previously); automation should be used cautiously in that case.&lt;/p&gt;
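
&lt;p&gt;Such a manual rebind could look like the sysfs steps from the last post. The
sketch below assumes that subchannel &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.0313&lt;/code&gt; is currently bound to some other
driver and that the device behind it is not in use; skip the first command if
the subchannel is not bound to any driver at all:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# detach the subchannel from whatever driver currently owns it
[root@host ~]# echo 0.0.0313 &amp;gt; /sys/bus/css/devices/0.0.0313/driver/unbind
# attach it to the vfio-ccw driver
[root@host ~]# echo 0.0.0313 &amp;gt; /sys/bus/css/drivers/vfio_ccw/bind
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;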

&lt;h1 id=&quot;mdevctl-manage-mediated-devices&quot;&gt;mdevctl: manage mediated devices&lt;/h1&gt;

&lt;p&gt;The more tedious part of configuring a passthrough setup is configuring and
managing mediated devices. To help with that, &lt;a href=&quot;https://github.com/mdevctl/mdevctl&quot;&gt;mdevctl&lt;/a&gt; has been
written. It can create, modify, and remove mediated devices (and optionally
make those changes persistent), work with configurations and devices created
via other means, and list mediated devices and the different types that are
supported.&lt;/p&gt;
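
&lt;p&gt;As a first orientation, the listing commands are a good place to start. This
is a minimal sketch; what you get back depends entirely on the parent devices
and mediated devices present on your host, so no output is shown:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# list the mediated device types offered by the parent devices on this host
[root@host ~]# mdevctl types
# list the mediated devices that are currently active
[root@host ~]# mdevctl list
# list the mediated devices that have a persistent definition
[root@host ~]# mdevctl list -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;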

&lt;h2 id=&quot;creating-a-mediated-device&quot;&gt;Creating a mediated device&lt;/h2&gt;

&lt;p&gt;In order to create a mediated device, you need a uuid. You can either provide
your own (as in the manual case), or let mdevctl pick one for you. In order to
get the same configuration as in the manual configuration examples, let’s
create a vfio-ccw device with the same uuid as before.&lt;/p&gt;

&lt;p&gt;The following command defines the same mediated device as in the manual example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# mdevctl define -u 7e270a25-e163-4922-af60-757fc8ed48c6 -p 0.0.0313 -t vfio_ccw-io -a
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note the ‘&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-a&lt;/code&gt;’, which instructs mdevctl to start the device automatically from
now on.&lt;/p&gt;

&lt;p&gt;After you’ve created the device, you can check which devices mdevctl is now
aware of:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -d
7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that the ‘&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-d&lt;/code&gt;’ instructs mdevctl to show defined devices, even if they have not been started.&lt;/p&gt;

&lt;p&gt;Let’s start the device:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl start -u 7e270a25-e163-4922-af60-757fc8ed48c6
[root@host ~] # mdevctl list -d
7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io auto (active)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The mediated device is now ready to be used and can be passed to a guest.&lt;/p&gt;

&lt;h2 id=&quot;making-your-configuration-persistent&quot;&gt;Making your configuration persistent&lt;/h2&gt;

&lt;p&gt;If you already created a mediated device manually, you may want to reuse the
existing configuration and make it persistent, instead of starting from scratch.&lt;/p&gt;

&lt;p&gt;So, let’s create another vfio-ccw device the manual way:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # uuidgen
b29e4ca9-5cdb-4ee1-a01b-79085b9ab237
[root@host ~] # echo &quot;b29e4ca9-5cdb-4ee1-a01b-79085b9ab237&quot; &amp;gt; /sys/bus/css/drivers/vfio_ccw/0.0.0314/mdev_supported_types/vfio_ccw-io/create
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;mdevctl now actually knows about the active device (in addition to the device
we configured before):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list
b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 0.0.0314 vfio_ccw-io
7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io (defined)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But it obviously does not have a definition for the manually created device:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -d
7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io auto (active)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On a restart, the new device would be gone again; but we can make it persistent:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl define -u b29e4ca9-5cdb-4ee1-a01b-79085b9ab237
[root@host ~] # mdevctl list
b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 0.0.0314 vfio_ccw-io (defined)
7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io (defined)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you check under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/mdevctl.d/&lt;/code&gt;, you will find that an appropriate JSON
file has been created:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # cat /etc/mdevctl.d/0.0.0314/b29e4ca9-5cdb-4ee1-a01b-79085b9ab237
{
	&quot;mdev_type&quot;: &quot;vfio_ccw-io&quot;,
	&quot;start&quot;: &quot;manual&quot;,
	&quot;attrs&quot;: []
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(Note that this device is not automatically started by default.)&lt;/p&gt;
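
&lt;p&gt;If you would rather have this device started automatically as well, the start
policy of the existing definition can be changed. A minimal sketch, assuming
your version of mdevctl supports the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--auto&lt;/code&gt; option for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;modify&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# switch the persistent definition from manual to automatic start
[root@host ~] # mdevctl modify -u b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 --auto
# check the definitions again to confirm the change
[root@host ~] # mdevctl list -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;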

&lt;h2 id=&quot;modifying-an-existing-device&quot;&gt;Modifying an existing device&lt;/h2&gt;

&lt;p&gt;There are good reasons to modify an existing device: you may want to modify
your setup, or, in the case of vfio-ap, you need to modify some attributes
before being able to use the device in the first place.&lt;/p&gt;

&lt;p&gt;Let’s first create the device. This command creates the same device as created
manually in the last post:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl define -u &quot;669d9b23-fe1b-4ecb-be08-a2fabca99b71&quot; --parent matrix --type vfio_ap-passthrough
[root@host ~] # mdevctl list -d
669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This device is not yet very useful, as you still need to assign some queues to
it. It now looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -d -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --dumpjson
{
	&quot;mdev_type&quot;: &quot;vfio_ap-passthrough&quot;,
	&quot;start&quot;: &quot;manual&quot;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s modify the device and add some queues:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_adapter --value=5
[root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=4
[root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=0xab
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The device’s JSON now looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -d -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --dumpjson
{
	&quot;mdev_type&quot;: &quot;vfio_ap-passthrough&quot;,
	&quot;start&quot;: &quot;manual&quot;,
	&quot;attrs&quot;: [
		{
			&quot;assign_adapter&quot;: &quot;5&quot;
		},
		{
			&quot;assign_domain&quot;: &quot;4&quot;
		},
		{
			&quot;assign_domain&quot;: &quot;0xab&quot;
		}
	]
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is now exactly what we had defined manually in the last post.&lt;/p&gt;

&lt;p&gt;But what if you notice that you want domain 0x42 instead of domain 4? Just
modify the definition. To make it easier to figure out how to specify the
attribute to manipulate, use this output:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -dv -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71
669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual
Attrs:
	@{0}: {&quot;assign_adapter&quot;:&quot;5&quot;}
	@{1}: {&quot;assign_domain&quot;:&quot;4&quot;}
	@{2}: {&quot;assign_domain&quot;:&quot;0xab&quot;}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You want to remove attribute 1, and add a new value:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --delattr --index=1
[root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=0x42
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s check that it now looks as desired:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # mdevctl list -dv -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71
669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual
Attrs:
	@{0}: {&quot;assign_adapter&quot;:&quot;5&quot;}
	@{1}: {&quot;assign_domain&quot;:&quot;0xab&quot;}
	@{2}: {&quot;assign_domain&quot;:&quot;0x42&quot;}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
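
&lt;p&gt;So far, only the definition has changed; the attributes are applied when the
device is started. The following sketch assumes that the adapter and all
domains used here (including the newly added 0x42) have already been made
available to vfio-ap via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apmask&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aqmask&lt;/code&gt;, as described in the last post. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;matrix&lt;/code&gt; attribute then shows the APQNs the started device has collected:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# start the defined device; its attributes are applied at this point
[root@host ~] # mdevctl start -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71
# show the APQNs that ended up assigned to the mediated device
[root@host ~] # cat /sys/bus/mdev/devices/669d9b23-fe1b-4ecb-be08-a2fabca99b71/matrix
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;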

&lt;h1 id=&quot;future-development&quot;&gt;Future development&lt;/h1&gt;

&lt;p&gt;While mdevctl works perfectly fine for managing individual mediated devices, it
does not maintain a view of the complete system. This means you notice conflicts
between two devices only when you try to activate the second one. In the case of
vfio-ap, the rules to be considered are complex, and there is quite some
potential for conflict. In order to be able to catch that kind of problem early,
we plan to add callouts to mdevctl, which would, for example, allow a tool to be invoked for
validation when a new device is added, but before it is activated. This is
potentially useful for other device types as well.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><category term="vfio" /><summary type="html">In the last part of this article, I talked about configuring a mediated device directly via sysfs. This is a bit cumbersome, and you may want to make your configuration more permanent. Fortunately, there is tooling available for this.</summary></entry><entry><title type="html">Configuring mediated devices (Part 1)</title><link href="https://people.redhat.com/~cohuck/2020/07/27/configuring-mediated-devices-part-1.html" rel="alternate" type="text/html" title="Configuring mediated devices (Part 1)" /><published>2020-07-27T18:55:00+02:00</published><updated>2020-07-27T18:55:00+02:00</updated><id>https://people.redhat.com/~cohuck/2020/07/27/configuring-mediated-devices-part-1</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/07/27/configuring-mediated-devices-part-1.html">&lt;p&gt;vfio-mdev has become popular over the last few years for assigning certain
classes of devices to guests. On the s390x side, vfio-ccw and vfio-ap are using
the vfio-mdev framework for making channel devices and crypto adapters
accessible to guests.&lt;/p&gt;

&lt;p&gt;This and a follow-up article aim to give an overview of the infrastructure, how
to set up and manage devices, and how to use tooling for this.&lt;/p&gt;

&lt;h1 id=&quot;what-is-a-mediated-device&quot;&gt;What is a mediated device?&lt;/h1&gt;

&lt;h2 id=&quot;a-general-overview&quot;&gt;A general overview&lt;/h2&gt;

&lt;p&gt;Mediated devices grew out of the need to build upon the existing vfio
infrastructure in order to support more fine-grained management of resources.
Some of the initial use cases included GPUs and (maybe somewhat surprisingly)
s390 channel devices.&lt;/p&gt;

&lt;p&gt;When using the mediated device (mdev) API, common tasks are performed in the
mdev core driver (like device management), while device-specific tasks are done
in a vendor driver. Current in-kernel examples of vendor drivers are the Intel
vGPU driver, vfio-ccw, and vfio-ap.&lt;/p&gt;

&lt;h2 id=&quot;examples-on-s390&quot;&gt;Examples on s390&lt;/h2&gt;

&lt;h3 id=&quot;vfio-ccw&quot;&gt;vfio-ccw&lt;/h3&gt;

&lt;p&gt;vfio-ccw can be used to assign channel devices. It is pretty straightforward:
vfio-ccw is an alternative driver for I/O subchannels, and a single mediated
device per subchannel is supported.&lt;/p&gt;

&lt;h3 id=&quot;vfio-ap&quot;&gt;vfio-ap&lt;/h3&gt;

&lt;p&gt;vfio-ap can be used to assign crypto cards/queues (APQNs). It is a bit more
involved, requiring prior setup on the ap bus level and configuration of a
‘matrix’ device. Complex relationships between the resources that can be
assigned to different guests exist. Configuration-wise, this is probably the
most complex mediated device available today.&lt;/p&gt;

&lt;h1 id=&quot;configuring-a-mediated-device-the-manual-way&quot;&gt;Configuring a mediated device: the manual way&lt;/h1&gt;

&lt;p&gt;Mediated devices can be configured manually via sysfs operations. This is a good
way to see what actually happens, but probably not what you want to do as a
general administration task. Tools to help here will be introduced in part 2 of
this article.&lt;/p&gt;

&lt;p&gt;I will show the steps for both vfio-ccw and vfio-ap, just to show two different
approaches. (Both examples are also used in the QEMU documentation, in case this
looks familiar.)&lt;/p&gt;

&lt;h2 id=&quot;binding-to-the-correct-driver&quot;&gt;Binding to the correct driver&lt;/h2&gt;

&lt;h3 id=&quot;vfio-ccw-1&quot;&gt;vfio-ccw&lt;/h3&gt;

&lt;p&gt;Assume you want to use a DASD with the device bus ID &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.2b09&lt;/code&gt;. As vfio-ccw
operates on the subchannel level, you first need to locate the subchannel for
this device:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# lscss | grep 0.0.2b09 | awk &apos;{print $2}&apos;
0.0.0313
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
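
&lt;p&gt;Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt; treats the dots as wildcards and matches anywhere in the line.
If you want an exact match on the device bus ID, the following variants may be
useful. Both are sketches: the first assumes the device bus ID is in the first
column of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lscss&lt;/code&gt; output (as the command above implies), the second relies on
the ccw device being a child of its subchannel in sysfs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# print the subchannel only for the line whose first column is exactly 0.0.2b09
[root@host ~]# lscss | awk &apos;$1 == &quot;0.0.2b09&quot; {print $2}&apos;
# resolve the ccw device in sysfs and print the name of its parent directory
[root@host ~]# basename &quot;$(dirname &quot;$(readlink -f /sys/bus/ccw/devices/0.0.2b09)&quot;)&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;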

&lt;p&gt;(A word of caution: a device is not guaranteed to use the same subchannel at all
times; on LPARs, the subchannel number will usually be stable, but z/VM – and
QEMU – assign subchannel numbers in a consecutive order. If you don’t get any
hotplug events for a device, the subchannel number will stay stable for at least
as long as the guest is running, though.)&lt;/p&gt;

&lt;p&gt;Now you need to unbind the subchannel device from the default I/O subchannel
driver and bind it to the vfio-ccw driver (make sure the device is not in use!):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# echo 0.0.0313 &amp;gt; /sys/bus/css/devices/0.0.0313/driver/unbind
[root@host ~]# echo 0.0.0313 &amp;gt; /sys/bus/css/drivers/vfio_ccw/bind
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;vfio-ap-1&quot;&gt;vfio-ap&lt;/h3&gt;

&lt;p&gt;You need to perform some preliminary configuration of your crypto adapters
before you can use any of them with vfio-ap. If nothing different has been set
up, a crypto adapter will only bind to the default device drivers, and you
cannot use it via vfio-ap. In order to be able to bind an adapter to vfio-ap,
you first need to modify the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/sys/bus/ap/apmask&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/sys/bus/ap/aqmask&lt;/code&gt;
entries. Both are basically bitmasks indicating that the matching adapter IDs or
queue indices, respectively, can only be bound to the default drivers. If you want
to use a certain APQN via vfio-ap, you need to unset the respective bits.&lt;/p&gt;

&lt;p&gt;Let’s assume you want to assign the APQNs (5, 4) and (5, ab). First, you need to
make the adapter and the domains available to non-default drivers:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# echo -5 &amp;gt; /sys/bus/ap/apmask
[root@host ~]# echo -4,-0xab &amp;gt; /sys/bus/ap/aqmask
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This should result in the devices being bound to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vfio_ap&lt;/code&gt; driver (you can
verify this by looking for them under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/sys/bus/ap/drivers/vfio_ap/&lt;/code&gt;).&lt;/p&gt;
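
&lt;p&gt;A quick way to check both the masks and the binding is sketched below. The
output is omitted, as it depends on the crypto adapters in your system; the
queue devices should show up with names in the adapter.domain format, e.g.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;05.0004&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# show the current masks; the bits for adapter 5 and domains 4 and 0xab should be unset
[root@host ~]# cat /sys/bus/ap/apmask
[root@host ~]# cat /sys/bus/ap/aqmask
# list what is bound to the vfio_ap driver
[root@host ~]# ls /sys/bus/ap/drivers/vfio_ap/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;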

&lt;h2 id=&quot;create-a-mediated-device&quot;&gt;Create a mediated device&lt;/h2&gt;

&lt;p&gt;The basic workflow is “pick a uuid, create a mediated device identified by it”.&lt;/p&gt;

&lt;h3 id=&quot;vfio-ccw-2&quot;&gt;vfio-ccw&lt;/h3&gt;

&lt;p&gt;For vfio-ccw, the two steps of the basic workflow are enough:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# uuidgen
7e270a25-e163-4922-af60-757fc8ed48c6
[root@host ~]# echo &quot;7e270a25-e163-4922-af60-757fc8ed48c6&quot; &amp;gt; /sys/bus/css/devices/0.0.0313/mdev_supported_types/vfio_ccw-io/create
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;vfio-ap-2&quot;&gt;vfio-ap&lt;/h3&gt;

&lt;p&gt;For vfio-ap, you need a more involved approach. The uuid is used to create a
mediated device under the ‘&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;matrix&lt;/code&gt;’ device:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~] # uuidgen
669d9b23-fe1b-4ecb-be08-a2fabca99b71
[root@host ~]# echo &quot;669d9b23-fe1b-4ecb-be08-a2fabca99b71&quot; &amp;gt; /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/create
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This mediated device will need to collect all APQNs that you want to pass to a
specific guest. For that, you need to use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assign_adapter&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assign_domain&lt;/code&gt;,
and possibly &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assign_control_domain&lt;/code&gt; attributes (we’ll ignore control domains
for simplicity’s sake). All attributes have a companion &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unassign_&lt;/code&gt; attribute
to remove adapters/domains from the mediated device again. You can only assign
adapters/domains that you removed from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apmask&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aqmask&lt;/code&gt; in the previous step.
To follow up on our example again:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# echo 5 &amp;gt; /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/669d9b23-fe1b-4ecb-be08-a2fabca99b71/assign_adapter
[root@host ~]# echo 4 &amp;gt; /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/669d9b23-fe1b-4ecb-be08-a2fabca99b71/assign_domain
[root@host ~]# echo 0xab &amp;gt; /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/669d9b23-fe1b-4ecb-be08-a2fabca99b71/assign_domain
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you want to make sure that the mediated device is set up correctly, check via&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[root@host ~]# cat /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/669d9b23-fe1b-4ecb-be08-a2fabca99b71/matrix
05.0004
05.00ab
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
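
&lt;p&gt;Should you need to take a domain away from the mediated device again, the
companion &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unassign_&lt;/code&gt; attributes mentioned above work the same way. The sketch
below is illustrative only (do not run it if you want to keep following the
example); it reaches the device via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/sys/devices/vfio_ap/matrix/&lt;/code&gt;, the same
location that is passed to QEMU further down:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# remove domain 4 from the mediated device
[root@host ~]# echo 4 &amp;gt; /sys/devices/vfio_ap/matrix/669d9b23-fe1b-4ecb-be08-a2fabca99b71/unassign_domain
# check which APQNs remain assigned
[root@host ~]# cat /sys/devices/vfio_ap/matrix/669d9b23-fe1b-4ecb-be08-a2fabca99b71/matrix
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;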

&lt;h2 id=&quot;configuring-qemulibvirt&quot;&gt;Configuring QEMU/libvirt&lt;/h2&gt;

&lt;p&gt;Your mediated device is now ready to be passed to a guest.&lt;/p&gt;

&lt;h3 id=&quot;vfio-ccw-3&quot;&gt;vfio-ccw&lt;/h3&gt;

&lt;p&gt;Let’s assume you want the device to show up as device &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.1234&lt;/code&gt; in the guest.&lt;/p&gt;

&lt;p&gt;For the QEMU command line, use&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-device vfio-ccw,devno=fe.0.1234,sysfsdev=/sys/bus/mdev/devices/7e270a25-e163-4922-af60-757fc8ed48c6
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For libvirt, use the following XML snippet in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;devices&amp;gt;&lt;/code&gt; section:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;lt;hostdev mode=&apos;subsystem&apos; type=&apos;mdev&apos; managed=&apos;no&apos; model=&apos;vfio-ccw&apos;&amp;gt;
	&amp;lt;source&amp;gt;
		&amp;lt;address uuid=&apos;7e270a25-e163-4922-af60-757fc8ed48c6&apos;/&amp;gt;
	&amp;lt;/source&amp;gt;
	&amp;lt;address type=&apos;ccw&apos; cssid=&apos;0xfe&apos; ssid=&apos;0x0&apos; devno=&apos;0x1234&apos;/&amp;gt;
&amp;lt;/hostdev&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;vfio-ap-3&quot;&gt;vfio-ap&lt;/h3&gt;

&lt;p&gt;Any APQNs will show up in the guest exactly as they show up in the host (i.e.,
no remapping is possible.)&lt;/p&gt;

&lt;p&gt;For the QEMU command line, use&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/669d9b23-fe1b-4ecb-be08-a2fabca99b71
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For libvirt, use the following XML snippet in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;devices&amp;gt;&lt;/code&gt; section:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;lt;hostdev mode=&apos;subsystem&apos; type=&apos;mdev&apos; managed=&apos;no&apos; model=&apos;vfio-ap&apos;&amp;gt;
	&amp;lt;source&amp;gt;
		&amp;lt;address uuid=&apos;669d9b23-fe1b-4ecb-be08-a2fabca99b71&apos;/&amp;gt;
	&amp;lt;/source&amp;gt;
&amp;lt;/hostdev&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h1 id=&quot;tooling&quot;&gt;Tooling&lt;/h1&gt;

&lt;p&gt;All this manual setup is a bit tedious; the next article in this series will
look at some of the tooling that is available for mediated devices.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><category term="vfio" /><summary type="html">vfio-mdev has become popular over the last few years for assigning certain classes of devices to guests. On the s390x side, vfio-ccw and vfio-ap are using the vfio-mdev framework for making channel devices and crypto adapters accessible to guests.</summary></entry><entry><title type="html">s390x changes in QEMU 5.1</title><link href="https://people.redhat.com/~cohuck/2020/07/10/s390x-changes-in-qemu-51.html" rel="alternate" type="text/html" title="s390x changes in QEMU 5.1" /><published>2020-07-10T16:27:00+02:00</published><updated>2020-07-10T16:27:00+02:00</updated><id>https://people.redhat.com/~cohuck/2020/07/10/s390x-changes-in-qemu-51</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/07/10/s390x-changes-in-qemu-51.html">&lt;p&gt;QEMU has entered softfreeze for 5.1, so it is time to summarize the s390x
changes in that version.&lt;/p&gt;

&lt;h1 id=&quot;protected-virtualization&quot;&gt;Protected virtualization&lt;/h1&gt;

&lt;p&gt;One of the biggest features on the s390/KVM side in Linux 5.7 had been protected
virtualization aka secure execution, which basically restricts the (untrusted)
hypervisor from accessing all of the guest’s memory and delegates many tasks to
the (trusted) ultravisor. QEMU 5.1 introduces the QEMU part of the feature.&lt;/p&gt;

&lt;p&gt;In order to be able to run protected guests, you need to run on a z15 or a
LinuxONE III, with at least a 5.7 kernel. You also need an up-to-date s390-tools
installation. Some details are available &lt;a href=&quot;https://www.qemu.org/docs/master/system/s390x/protvirt.html&quot;&gt;in the QEMU documentation&lt;/a&gt;.
For more information about what protected virtualization is, watch
&lt;a href=&quot;https://www.youtube.com/watch?v=J2YibrLfB4s&quot;&gt;this talk from KVM Forum 2019&lt;/a&gt;
and &lt;a href=&quot;https://media.ccc.de/v/36c3-107-the-challenges-of-protected-virtualization&quot;&gt;this talk from 36C3&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;vfio-ccw&quot;&gt;vfio-ccw&lt;/h1&gt;

&lt;p&gt;vfio-ccw has also seen some improvements over the last release cycle.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Requests that do not explicitly allow prefetching in the ORB are no longer
rejected out of hand (although the kernel may still do so, if you run a
pre-5.7 version.) The rationale behind this is that most device drivers never
modify their channel programs dynamically, and the one common code path that
does (IPL from DASD) is already accommodated by the s390-ccw bios. While you
can instruct QEMU to ignore the prefetch requirement for selected devices,
this is an additional administrative complication for little benefit; it is
therefore no longer required.&lt;/li&gt;
  &lt;li&gt;In order to be able to relay changes in channel path status to the guest, two
new regions have been added: a schib region to relay real data to stsch, and a
crw region to relay channel reports. If, for example, a channel path is varied
off on the host, all guests using a vfio-ccw device that uses this channel
path now get a proper channel report for it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;other-changes&quot;&gt;Other changes&lt;/h1&gt;

&lt;p&gt;Other than the bigger features mentioned above, there have been the usual fixes,
improvements, and cleanups, both in the main s390x QEMU code and in the s390-ccw
bios.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">QEMU has entered softfreeze for 5.1, so it is time to summarize the s390x changes in that version.</summary></entry><entry><title type="html">s390x changes in QEMU 5.0</title><link href="https://people.redhat.com/~cohuck/2020/04/08/s390x-changes-in-qemu-50.html" rel="alternate" type="text/html" title="s390x changes in QEMU 5.0" /><published>2020-04-08T13:58:00+02:00</published><updated>2020-04-08T13:58:00+02:00</updated><id>https://people.redhat.com/~cohuck/2020/04/08/s390x-changes-in-qemu-50</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/04/08/s390x-changes-in-qemu-50.html">&lt;p&gt;QEMU is currently in hardfreeze, with the 5.0 release expected at the end of the
month. Here’s a quick list of some notable s390x changes.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;You can finally enable Adapter Interrupt Suppression in the cpu model (ais=on)
when running under KVM. This had been working under TCG for some time now, but
KVM was missing an interface that was provided later – and we finally
actually check for that interface in QEMU. This is mostly interesting for PCI.&lt;/li&gt;
  &lt;li&gt;QEMU had been silently fixing odd memory sizes to something that can be
reported via SCLP for some time. Silently changing user input is probably not
such a good idea; compat machines will continue to do so to enable migration
from old QEMUs for machines with odd sizes, but will print a warning now. If
you have such an old machine (and you can modify it), it might be a good idea
to either specify the memory size it gets rounded to or to switch to the 5.0
machine type, where memory sizes can be more fine-grained due to the removal of
support for memory hotplug. We may want to get rid of the code doing the fixup
at some time in the future.&lt;/li&gt;
  &lt;li&gt;QEMU now properly performs the whole set of initial, clear, and normal cpu
reset.&lt;/li&gt;
  &lt;li&gt;And the usual fixes, cleanups, and improvements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For 5.1, expect more changes; support for protected virtualization will be a big
item.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><summary type="html">QEMU is currently in hardfreeze, with the 5.0 release expected at the end of the month. Here’s a quick list of some notable s390x changes.</summary></entry><entry><title type="html">Channel Measurements: A Quick Overview</title><link href="https://people.redhat.com/~cohuck/2020/01/22/channel-measurements-quick-overview.html" rel="alternate" type="text/html" title="Channel Measurements: A Quick Overview" /><published>2020-01-22T14:42:00+01:00</published><updated>2020-01-22T14:42:00+01:00</updated><id>https://people.redhat.com/~cohuck/2020/01/22/channel-measurements-quick-overview</id><content type="html" xml:base="https://people.redhat.com/~cohuck/2020/01/22/channel-measurements-quick-overview.html">&lt;p&gt;The s390 channel subsystem can gather some statistics on I/O performance for
you, which might be useful if you try to figure out why something is not
performing as well as you’d expect. From a QEMU/KVM perspective, this
is currently mainly useful on the host.&lt;/p&gt;

&lt;h1 id=&quot;channel-monitoring-for-ccw-devices&quot;&gt;Channel monitoring for ccw devices&lt;/h1&gt;

&lt;p&gt;The first kind of channel measurements is collected per subchannel. For a
detailed overview of what actually happens there, turn to the Principles of
Operation, Chapter 17 (“I/O Support Functions”), “Channel Monitoring”. I’ll
cover here what will most likely be of interest to people running a Linux (host)
system.&lt;/p&gt;

&lt;h2 id=&quot;enabling-channel-measurements&quot;&gt;Enabling channel measurements&lt;/h2&gt;

&lt;p&gt;If you are running a non-vintage machine (i.e. a z990 or later), you will not need
a system-wide setup. Older machines should be fine as well, if you do not want
to measure more than 1024 devices.&lt;/p&gt;

&lt;p&gt;To enable measurements for a specific ccw device (say, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.1234&lt;/code&gt;), simply
issue:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;chccwdev -a cmb_enable=1 0.0.1234
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
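
&lt;p&gt;If you prefer to work with sysfs directly, the same attribute can be written
by hand. A minimal sketch, assuming the device &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.1234&lt;/code&gt; from above; writing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;
instead turns measurements off again:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# enable channel measurements for the device
echo 1 &amp;gt; /sys/bus/ccw/devices/0.0.1234/cmb_enable
# read the attribute back to confirm the setting
cat /sys/bus/ccw/devices/0.0.1234/cmb_enable
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;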

&lt;h2 id=&quot;measurements-collected&quot;&gt;Measurements collected&lt;/h2&gt;

&lt;p&gt;Under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/sys/bus/ccw/devices/0.0.1234/&lt;/code&gt;, you should now have a new subdirectory
called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cmf&lt;/code&gt;, which contains some files. For a system that has been running for
some time, the contents may look something like the following:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;head cmf/*
==&amp;gt; cmf/avg_control_unit_queuing_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_active_only_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_busy_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_connect_time &amp;lt;==
829031
==&amp;gt; cmf/avg_device_disconnect_time &amp;lt;==
398526
==&amp;gt; cmf/avg_function_pending_time &amp;lt;==
142810
==&amp;gt; cmf/avg_initial_command_response_time &amp;lt;==
19170
==&amp;gt; cmf/avg_sample_interval &amp;lt;==
8401681344
==&amp;gt; cmf/avg_utilization &amp;lt;==
00.0%
==&amp;gt; cmf/sample_count &amp;lt;==
10803
==&amp;gt; cmf/ssch_rsch_count &amp;lt;==
10803
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that all values but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sample_count&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ssch_rsch_count&lt;/code&gt; are averaged over
time. We also see that samples seem to have been taken whenever the driver
issued a ssch.&lt;/p&gt;
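&lt;p&gt;To compare the two counters without eyeballing the output, here is a minimal Python sketch. The function names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read_cmf&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;samples_match_ssch&lt;/code&gt; are made up for illustration; the sketch only assumes that each file in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cmf&lt;/code&gt; directory holds a single value, as in the listing above.&lt;/p&gt;

```python
from pathlib import Path


def read_cmf(cmf_dir):
    '''Return the attributes in a cmf directory as a dict of strings.'''
    values = {}
    for entry in sorted(Path(cmf_dir).iterdir()):
        if entry.is_file():
            # Each attribute holds a single line, e.g. '10803' or '00.0%'.
            values[entry.name] = entry.read_text().strip()
    return values


def samples_match_ssch(values):
    '''True if one sample was recorded per counted ssch/rsch.'''
    return int(values['sample_count']) == int(values['ssch_rsch_count'])
```

&lt;p&gt;You would call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read_cmf&lt;/code&gt; with the path of the device’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cmf&lt;/code&gt; directory and pass the result on to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;samples_match_ssch&lt;/code&gt;.&lt;/p&gt;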

&lt;p&gt;The device in our example shows an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;avg_utilization&lt;/code&gt; of 0%, which is consistent
with a device that mostly sits idle. But what about a device where something is
actually happening?&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;head cmf/*
==&amp;gt; cmf/avg_control_unit_queuing_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_active_only_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_busy_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_connect_time &amp;lt;==
58454
==&amp;gt; cmf/avg_device_disconnect_time &amp;lt;==
16743818
==&amp;gt; cmf/avg_function_pending_time &amp;lt;==
99322
==&amp;gt; cmf/avg_initial_command_response_time &amp;lt;==
20284
==&amp;gt; cmf/avg_sample_interval &amp;lt;==
153014636
==&amp;gt; cmf/avg_utilization &amp;lt;==
11.0%
==&amp;gt; cmf/sample_count &amp;lt;==
1281
==&amp;gt; cmf/ssch_rsch_count &amp;lt;==
1281
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here, we see a higher &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;avg_utilization&lt;/code&gt;, but actually not that many ssch
invocations. What stands out is the relatively high value of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;avg_device_disconnect_time&lt;/code&gt;: It indicates that there are quite long intervals
where the device and the channel subsystem do not talk to each other. That
might, for example, happen if other LPARs on the same system drive a lot of I/O
via the same channel paths as the device.&lt;/p&gt;
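&lt;p&gt;As a rough cross-check, the reported utilization can be reproduced from the other values. The Python sketch below assumes that utilization is the sum of the average function pending, connect and disconnect times relative to the average sample interval, truncated to tenths of a percent. That relationship is an assumption that happens to match the numbers shown above, not something taken from documentation, and the function names are made up for illustration.&lt;/p&gt;

```python
def utilization_tenths(pending, connect, disconnect, interval):
    '''Utilization in units of 0.1 percent, truncated by integer division.'''
    busy = pending + connect + disconnect
    return busy * 1000 // interval


def format_utilization(tenths):
    '''Render tenths of a percent in the two-digit style seen in sysfs.'''
    return f'{tenths // 10:02d}.{tenths % 10}%'


# Values from the busier device shown above.
busy = utilization_tenths(
    pending=99322,
    connect=58454,
    disconnect=16743818,
    interval=153014636,
)
print(format_utilization(busy))  # prints 11.0%
```

&lt;p&gt;Almost all of the sum comes from the disconnect time, which is why that value is the one to look at for this device.&lt;/p&gt;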

&lt;h2 id=&quot;help-i-cannot-enable-channel-measurements-on-my-device&quot;&gt;Help, I cannot enable channel measurements on my device!&lt;/h2&gt;

&lt;p&gt;There’s one drawback when trying to enable channel measurements on a live
device: It needs to execute a msch, which can only be done on an idle
subchannel. For devices that execute separate ssch invocations to go about their
business (e.g. dasd), the common I/O layer can squeeze in the msch between ssch
invocations and all is well. However, some devices use a long-running channel
program, which will not conclude during the time the device is enabled; the most
prominent examples are devices using QDIO, like zFCP adapters or OSA cards. In
that case, the common I/O layer cannot squeeze in a msch; you might try
disabling the device, but that’s usually not something you want to do in a live
system.&lt;/p&gt;

&lt;h1 id=&quot;extended-channel-measurements&quot;&gt;Extended channel measurements&lt;/h1&gt;

&lt;p&gt;What if you want to find out something not about an individual device, but about a
channel path? There’s a feature for that; you can issue&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;echo 1 &amp;gt; /sys/devices/css0/cm_enable
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and will find new entries (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;measurement&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;measurement_chars&lt;/code&gt;) under the various
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;chp0.xx&lt;/code&gt; objects.&lt;/p&gt;

&lt;p&gt;Unfortunately, these attributes only provide some binary data, which does not
seem to be publicly documented, and I’m not aware of any tool that can parse
them.&lt;/p&gt;

&lt;h1 id=&quot;channel-measurements-in-qemu-guests&quot;&gt;Channel measurements in QEMU guests&lt;/h1&gt;

&lt;p&gt;So far, all measurements have been collected on the host; but what about
measurements in the guest?&lt;/p&gt;

&lt;p&gt;The good news: You can turn on channel measurements for ccw devices in the
guest. The bad news: They are not very useful.&lt;/p&gt;

&lt;p&gt;Consider, for example, this virtio-ccw device:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;head cmf/*
==&amp;gt; cmf/avg_control_unit_queuing_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_active_only_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_busy_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_connect_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_disconnect_time &amp;lt;==
0
==&amp;gt; cmf/avg_function_pending_time &amp;lt;==
0
==&amp;gt; cmf/avg_initial_command_response_time &amp;lt;==
0
==&amp;gt; cmf/avg_sample_interval &amp;lt;==
-1
==&amp;gt; cmf/avg_utilization &amp;lt;==
00.0%
==&amp;gt; cmf/sample_count &amp;lt;==
0
==&amp;gt; cmf/ssch_rsch_count &amp;lt;==
134
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No samples, just a ssch count. Why? QEMU does not fully emulate the sampling
infrastructure; only counting of ssch is done (which is very easy to implement).
Moreover, virtio-ccw devices use channel programs mainly to set up queues,
negotiate features, etc., so measurements here do not reflect what is going on
on the virtqueues, which would be the interesting part for performance issues.&lt;/p&gt;
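&lt;p&gt;If you want a script to recognize this situation, a minimal Python sketch could look like the following. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;only_ssch_counted&lt;/code&gt; is made up for illustration, and the input is assumed to be a dict mapping attribute names to the strings read from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cmf&lt;/code&gt; directory.&lt;/p&gt;

```python
def only_ssch_counted(values):
    '''True if ssch/rsch were counted but no sample was recorded.'''
    samples = int(values['sample_count'])
    starts = int(values['ssch_rsch_count'])
    return samples == 0 and starts != 0


# Values from the virtio-ccw guest device shown above.
guest = {'sample_count': '0', 'ssch_rsch_count': '134'}
print(only_ssch_counted(guest))  # prints True
```

&lt;p&gt;A result of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;True&lt;/code&gt; means the averaged values carry no information for that device.&lt;/p&gt;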

&lt;p&gt;But what about a dasd passed through via vfio-ccw? That one should have more
statistics, right?&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;head cmf/*
==&amp;gt; cmf/avg_control_unit_queuing_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_active_only_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_busy_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_connect_time &amp;lt;==
0
==&amp;gt; cmf/avg_device_disconnect_time &amp;lt;==
0
==&amp;gt; cmf/avg_function_pending_time &amp;lt;==
0
==&amp;gt; cmf/avg_initial_command_response_time &amp;lt;==
0
==&amp;gt; cmf/avg_sample_interval &amp;lt;==
-1
==&amp;gt; cmf/avg_utilization &amp;lt;==
00.0%
==&amp;gt; cmf/sample_count &amp;lt;==
0
==&amp;gt; cmf/ssch_rsch_count &amp;lt;==
144
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No samples, just a ssch count, again. Why? Currently, vfio-ccw uses the same
emulation infrastructure as the other emulated devices. In the future, we may
implement some kind of passthrough for channel measurements, but that requires
some work.&lt;/p&gt;</content><author><name>Cornelia Huck</name></author><category term="Channel I/O" /><summary type="html">The s390 channel subsystem can gather some statistics on I/O performance for you, which might be useful if you try to figure out why something is not performing as well as you’d expect it to be. From a QEMU/KVM perspective, this is currently mainly useful on the host.</summary></entry></feed>