Monday, December 15, 2014

Setting up a minimal rbd/ceph server for libvirt testing

In my last post I talked about setting up a minimal gluster server. Similarly this will describe how I set up a minimal single node rbd/ceph server in a VM for libvirt network storage testing.

I pulled info from a few different places and a lot of other reading, but things just weren't working on F21; trying systemctl start ceph just wasn't producing any output, and all the ceph cli commands just hung. I had better success with F20.

The main difficulty was figuring out a working ceph.conf. My VM's IP address is 1902.168.124.101, and its hostname is 'localceph', so here's what I ended up with:

auth cluster required = none
auth service required = none
auth client required = none
osd crush chooseleaf type = 0
osd pool default size = 1

mon data = /data/$name
mon addr =
host = localceph

keyring = /data/keyring.$name
host = localceph

osd data = /data/$name
osd journal = /data/$name/journal
osd journal size = 1000
host = localceph

Ceph setup steps:
  • Cloned an existing F20 VM I had kicking around, using virt-manager's clone wizard. I called it f20-ceph.
  • In the VM, disable firewalld and set selinux to permissive. Not strictly required but I wanted to make this as simple as possible.
  • Setup the ceph server:
    • yum install ceph
    • I needed to set a hostname for my VM, ceph won't accept 'localhost': hostnamectl set-hostname localceph
    • mkdir -p /data/mon.0 /data/osd.0
    • Overwrite /etc/ceph/ceph.conf with the content listed above.
    • mkcephfs -a -c /etc/ceph/ceph.conf
    • service ceph start
    • Prove it works from my host with: sudo mount -t ceph $VM_IPADDRESS:/ /mnt
  • Add some storage for testing:
    • Libvirt only connects to Ceph's block device interface, RBD. The above mount example is _not_ what libvirt will see, it just proves we can talk to the server.
    • Import files within the VM like: rbd import $filename
    • List files with: rbd list
Notable here is that no ceph auth is used. Libvirt supports ceph auth but at this stage I didn't want to deal with it for testing. This setup doesn't match what a real deployment would ever look like.

Here's the pool definition I passed to virsh pool-define on my host:

<pool type='rbd'>
    <host name='$VM_IPADDRESS'/>

Thursday, December 11, 2014

Setting up a minimal gluster server for libvirt testing

Recently I've been working on virt-install/virt-manager support for libvirt network storage pools like gluster and rbd/ceph. For testing I set up a single node minimal gluster server in an F21 VM. I mostly followed the gluster quickstart and hit only a few minor hiccups.

Steps for the gluster setup:
  • Cloned an existing F21 VM I had kicking around, using virt-manager's clone wizard. I called it f21-gluster.
  • In the VM, disable firewalld and set selinux to permissive. Not strictly required but I wanted to make this as simple as possible.
  • Setup the gluster server
    • yum install glusterfs-server
    • Edit /etc/glusterfs/glusterd.vol, add: option rpc-auth-allow-insecure on
    • systemctl start glusterd; systemctl enable glusterd
  • Create the volume:
    • mkdir -p /data/brick1/gv0
    • gluster volume create gv0 $VM_IPADDRESS:/data/brick1/gv0
    • gluster volume start gv0
    • gluster volume set gv0 allow-insecure on
  • From my host machine, I verified things were working: sudo mount -t glusterfs $VM_IPADDRESS:/gv0 /mnt
  • I added a couple example files to the directory: a stub qcow2 file, and a boot.iso.
  • Verified that qemu can access the ISO: qemu-system-x86_64 -cdrom gluster://$VM_IPADDRESS/gv0/boot.iso
  • Once I had a working setup, I used virt-manager to create a snapshot of the running VM config. So anytime I want to test gluster, I just start the VM snapshot and I know things are all nicely setup.
The bits about 'allow-insecure' is so that an unprivileged client can access the gluster share, see this bug for more info. The gluster docs also have a section about it but the steps don't appear to be complete.

The final bit is setting up a storage pool with libvirt. The XML I passed to virsh pool-define on my host looks like:

<pool type='gluster'>
    <host name='$VM_IPADDRESS'/>
    <dir path='/'/>

Wednesday, December 10, 2014

qemu-2.2.0 in rawhide, virt-preview disabled for F20

qemu-2.2.0 was released yesterday, check the release announcement and fine grained changelog. Packages are available in rawhide and fedora-virt-preview for Fedora 21.

But now that Fedora 21 is out, there won't be any new builds for F20 virt-preview. If you want to play with the latest and greatest virt bits, you'll need to update to F21.

Friday, December 5, 2014

virt-manager 1.0 creates qcow2 images that don't work on RHEL6

One of the big features we added in virt-manager 1.0 was snapshot support. As part of this change, we switched to using the QCOW2 disk image format for new VMs. We also enable the QCOW2 lazy_refcounts feature that improves performance of some snapshot operations.

However, not all versions of QEMU in the wild can handle lazy_refcounts, and will refuse to use the disk image, particularly RHEL6 QEMU. So by default, a disk image from a VM created with Fedora 20 virt-manager will not run on RHEL6 QEMU, throwing an error like:

... uses a qcow2 feature which is not supported by this qemu version: QCOW version 3 

This has generated some user confusion.

The 'QCOW version 3' is a bit misleading here: while indeed using lazy_refcounts sets a bit in the QCOW2 file header calling it QCOW3, qemu-img and libvirt still call it QCOW2, but with a different 'compat' setting.

Kevin Wolf, one of the QEMU block maintainers, explains it in this mail:
qemu versions starting with 1.1 can use [lazy_refcounts] which require an incompatible on-disk format. Between version 1.1 and 1.6, they needed to be specified explicitly during image creation, like this:

qemu-img create -f qcow2 -o compat=1.1 test.qcow2 8G

Starting with qemu 1.7, compat=1.1 became the default, so that newly created images can't be read by older qemu versions by default. If you need to read them in older version, you now need to be explicit about using the old format:

qemu-img create -f qcow2 -o compat=0.10 test.qcow2 8G

With the same release, qemu 1.7, a new qemu-img subcommand was introduced that allows converting between both versions, so you can downgrade your existing v3 image to the format known by RHEL 6 like this:

qemu-img amend -f qcow2 -o compat=0.10 test.qcow2
As explained, qemu-img with QEMU 1.7+ also defaults to lazy_refcounts/compat=1.1, but also provides a 'qemu-img amend' tool to easily convert between the two formats.

Unfortunately that command is not available on Fedora 20 and older, however you can use the pre-existing 'qemu-img convert' command:

qemu-img convert -f qcow2 -O qcow2 -o compat=0.10 $ORIGPATH $NEWPATH

Beware though, converting between two qcow2 images will drop all internal snapshots in the new image, so only use that option if you don't need to preserve any snapshot data. 'qemu-img amend' will preserve snapshot data.

(Unfortunately at this time virt-manager doesn't provide any way in the UI to _not_ use lazy_refcounts, but you could always use qemu-img/virsh to create a disk image outside of virt-manager, and select it when creating a new VM.)