Dockerfile to Bootable GCP-Optimized VM
2019-08-05
This post is an experiment-turned-solution -- as per usual, my code is available in my experiments GitHub repository.
This experiment was motivated in part to help support work my team is doing at Dialpad. We're hiring! More specifically, I'm hiring for my team in Vancouver and Kitchener!
tl;dr: turn an arbitrary Dockerfile into a bootable and fully GCP-compatible custom VM image by going here and running make.
Why Would I Want to do That?
I recently came across this fantastic post from @iximiuz (Ivan Velichko) which was a wonderful read -- it turns out there's not a whole bunch of work required to convert a Docker image into a bootable Linux disk; it pretty much comes down to installing a kernel, a bootloader, an init manager, and then loading all that into a disk image. From there, the image is bootable in QEMU or trivially convertible to other useful formats through, say, VirtualBox.
Ivan's article has been floating around my head for a couple of months now -- his post began with a challenge I just couldn't ignore:
Well, I don't see any practical applications of the approach I'm going to describe...
Well, I couldn't just let that stand! For a while, I couldn't think of any useful application either, but recently the stars aligned and the perfect problem was presented to me:
Bragging About My Team Problem Statement
At Dialpad, my team has built up a lot of very cool tooling to help out the datascience team -- there are plenty of automated pipelines for transforming an idea and a dataset into an optimized model running in our massively scalable realtime system. But despite having automated and optimized a good chunk of that, sometimes there's just no substitute for creating a VM with exactly the right environment for your model and playing around with your code.
It turns out our datascience team spends some of their time spinning up GCP
instances for exactly this; problem is, even though there's a
huge list of VM images to choose from, the team still needs to
install all the tools they need on top of those images every single time. GCP
lets you create new VM images by cloning instance boot disks,
but that doesn't help us avoid the initial build (which can sometimes take
hours! Libraries for Doing Science™ are no joke). Not to mention, we're
constantly building new models with different architectures, different
dependencies, different everything! We want our datascientists to spend their
time pushing the edges of AI research, not waiting for a make install to complete.
Luckily, the folks at Google were kind enough to allow us to import disk images in addition to cloning them from pre-built instances. That gives us a nice method for building these images automatically without having to try to build and destroy a huge number of VMs just for cloning purposes.
My first thought at this point was to check out Packer -- the tools that come out of the folks at Hashicorp are pretty darn fantastic and I've found many of them to be invaluable in the past. Unfortunately, this isn't quite the right tool for the job here: their GCP image builder only supports cloning instances and their QEMU image builder doesn't actually solve any of the problems we need it to -- namely, Ivan has already taught us how to build bootable images, but making those images work on GCP is a different problem entirely.
I guess that means it's time to roll up our sleeves and start hacking!
The Actually Interesting Part of this Post
I'm going to mostly gloss over our starting point here; I highly recommend reading the original article where all this gets explained. The core of it is that we have some Dockerfile with a kernel and an init system:
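As a rough sketch of this starting point (not the exact listing from the experiments repository), the snippet below writes out a Dockerfile that installs a kernel and an init system. The base image, the package names, and the root password are assumptions borrowed from the approach in Ivan's article.

```shell
# Sketch only: write a hypothetical Dockerfile with a kernel and an init system.
# Base image, packages, and the root password are assumptions.
cat > Dockerfile <<'EOF'
FROM debian:stretch
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        linux-image-amd64 \
        systemd-sysv && \
    rm -rf /var/lib/apt/lists/*
# Give root a password so a console login is possible (assumed value).
RUN echo "root:root" | chpasswd
EOF
```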
We build that dockerfile into a tarball:
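One way to do this, sketched under the assumption that the Dockerfile sits in the current directory, is to build the image, create (but not start) a container from it, and export that container's filesystem. The function name, image tag, and output filename are hypothetical.

```shell
# Sketch only: flatten a Docker image into a root filesystem tarball.
build_rootfs_tarball() {
    image_tag="$1"      # hypothetical, e.g. my-vm-image
    output_tar="$2"     # hypothetical, e.g. rootfs.tar
    docker build -t "$image_tag" . || return 1
    # `docker export` operates on containers, so create one without starting it.
    container_id="$(docker create "$image_tag")" || return 1
    docker export -o "$output_tar" "$container_id"
    status=$?
    # Always clean up the temporary container, then report the export status.
    docker rm "$container_id" >/dev/null
    return "$status"
}

# Usage: build_rootfs_tarball my-vm-image rootfs.tar
```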
And then we use that tarball to create a VM image with a filesystem and a bootloader:
Aside: if you're following along from OSX, you'll need to do all of the building of this image within Docker -- if you're on Linux, you should be able to run everything directly, but no one's stopping you from using Docker anyway! Using a Docker image to build itself into a VM image may be a bit of a brainteaser, but it gets the job done. If that's how you want to do things, I recommend using your target image as the builder image (since it is, by definition, guaranteed to have all the dependencies you'll need) and running within the following context:
At that point, the resulting disk.img file is immediately bootable in QEMU:
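A minimal sketch of such a QEMU invocation follows. The memory size is an arbitrary choice, and the QEMU variable is only a hypothetical hook for pointing at a different binary or doing a dry run.

```shell
# Sketch only: boot a raw disk image in QEMU.
boot_in_qemu() {
    disk_image="${1:-disk.img}"
    # format=raw is stated explicitly so QEMU does not have to probe the image format.
    "${QEMU:-qemu-system-x86_64}" \
        -m 1024 \
        -drive "file=${disk_image},index=0,media=disk,format=raw"
}

# Usage: boot_in_qemu disk.img
```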
But that's not quite enough on its own to make the instance work the same as any other GCP image. Where do we go from here?
Installing GRUB so We Can Actually Boot Up
The first thing you're going to notice about trying to get this image working on GCP is that it won't even boot. What gives? It has an MBR (Master Boot Record) and a bootloader, so why does GCP boot up their instances any differently than QEMU?
Turns out the answer to that is... well, I have no idea, but nestled away in the GCP importable images requirements is that your image must use GRUB (in addition to some other requirements that we'd have to go out of our way to break, so I'll ignore 'em here). So step one in making our image GCP-usable: swap out Syslinux for GRUB.
First off, we're going to need to update our Dockerfile: it's going to need the grub-pc-bin and grub2-common packages. Note that you could also install grub-pc directly, but that'll include a bunch of stuff we won't need here.
In our builder image, we're not going to need Syslinux anymore (duh!), but we will need multipath-tools to give us access to kpartx, which will let us deal better with partitions -- in the Syslinux code above, we only mount the first partition and blindly overwrite the start of the disk ahead of that partition with the Syslinux MBR. For GRUB, though, we're going to need to actually mount both the disk and the partition and handle them separately; no blind dd usage for us.
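The device setup can be sketched roughly as follows, assuming root privileges and that /dev/loop0 and /dev/loop1 are free. The function name is hypothetical, and formatting and mounting the partition are left out.

```shell
# Sketch only: expose a disk image and its first partition as separate block devices.
attach_disk_image() {
    disk_image="${1:-disk.img}"
    # Whole disk: this is the device GRUB's boot code gets written to.
    losetup /dev/loop0 "$disk_image" || return 1
    # kpartx (from multipath-tools) maps each partition to /dev/mapper/loop0pN.
    kpartx -a -v /dev/loop0 || return 1
    # Point a second loop device at the first partition's mapping.
    losetup /dev/loop1 /dev/mapper/loop0p1
}

# Usage (as root): attach_disk_image disk.img
```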
At this point, the /dev/loop0 device will be set up as a loopback to our disk and /dev/loop1 will be pointing to our first (and only) partition. From here, we can switch out our Syslinux install for a GRUB one.
First off, we'll need to tell GRUB what the current state of our disks looks like. We'll configure the boot disk as being /dev/loop0 so the GRUB installation works (but we'll update that later to the correct value for how our disk will look post-install!):
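A minimal sketch of that temporary device map, assuming the partition is mounted at ./mnt (MOUNT_POINT is a hypothetical variable, not something from the original build scripts):

```shell
# Sketch only: describe the build-time disk layout to GRUB.
MOUNT_POINT="${MOUNT_POINT:-mnt}"
mkdir -p "$MOUNT_POINT/boot/grub"
# Map GRUB's first BIOS disk to the loopback device that backs disk.img.
cat > "$MOUNT_POINT/boot/grub/device.map" <<'EOF'
(hd0) /dev/loop0
EOF
```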
We'll temporarily bind the devices seen by our builder image into the virtual disk, so that our disk image can properly be aware of its own disks without needing to boot into it:
Next, we'll chroot into our disk and have GRUB write out its configuration -- but we'll store that configuration in the builder image rather than the disk so that we can run the GRUB installer without needing to be booted into that disk:
Just one more cleanup before we're ready to install GRUB; here we're just making sure that GRUB expects its disk to be located at /dev/sda1 like any reasonable system rather than being stuck thinking it needs to boot from a loopback device (i.e. our current configuration):
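That cleanup can be sketched as a simple substitution over the generated configuration. The function name, the default file path, and the assumption that the partition shows up as /dev/loop1 in that file are all illustrative.

```shell
# Sketch only: rewrite loop-device references in a generated grub.cfg.
retarget_grub_cfg() {
    grub_cfg="${1:-grub.cfg}"
    # The partition is /dev/loop1 at build time but /dev/sda1 on the booted VM.
    # A temp file is used instead of `sed -i` so this works with non-GNU sed too.
    sed 's|/dev/loop1|/dev/sda1|g' "$grub_cfg" > "$grub_cfg.tmp" &&
        mv "$grub_cfg.tmp" "$grub_cfg"
}

# Usage: retarget_grub_cfg grub.cfg
```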
At this point, we're all set up for running the GRUB installer:
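A hedged sketch of the installer call, assuming a BIOS (i386-pc) target and the partition mounted at ./mnt. The exact flags in the experiments repository may differ, and GRUB_INSTALL is only a hypothetical override for dry runs.

```shell
# Sketch only: install GRUB's boot code onto the loopback disk.
install_grub() {
    mount_point="${1:-mnt}"
    # --boot-directory tells grub-install where the image's /boot lives;
    # --modules preloads filesystem and partition-table support.
    "${GRUB_INSTALL:-grub-install}" \
        --target=i386-pc \
        --boot-directory="$mount_point/boot" \
        --modules="ext2 part_msdos" \
        /dev/loop0
}

# Usage (as root): install_grub mnt
```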
Note the modules we've enabled here: the unfortunately named ext2 module enables support not just for ext2 but also the ext3 filesystem we're using (and also ext4, if you're feeling frisky). part_msdos enables support for DOS-style disk partitioning, which is what we configured way up above with sfdisk.
Once GRUB is installed, we can have the virtual disk configure its own device.map properly rather than pointing to the loopback device and gracefully unmount our resources:
Our disk image is bootable again, this time through GRUB rather than Syslinux, but there are just a few things we'll want to configure to help us out down the line: configuring GRUB's output to be in the place/format GCP will expect it and disabling the pointless 5s timer before GRUB boots:
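As an illustration, settings along these lines could be appended to the image's /etc/default/grub. The serial console parameters are an assumption based on Google's guidance for imported images rather than the values used in the experiments repository, and they only take effect once grub.cfg is regenerated.

```shell
# Sketch only: GRUB defaults for a VM that is watched over a serial console.
MOUNT_POINT="${MOUNT_POINT:-mnt}"
mkdir -p "$MOUNT_POINT/etc/default"
cat >> "$MOUNT_POINT/etc/default/grub" <<'EOF'
# Boot immediately instead of waiting at the menu.
GRUB_TIMEOUT=0
# Send kernel output to the first serial port (assumed console settings).
GRUB_CMDLINE_LINUX="console=ttyS0,38400n8d"
EOF
```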
Adding All the Useful Features
At this point, our image would technically work in GCP, but it wouldn't be all that useful -- it'd be a mostly-filled 2GB disk with a readonly filesystem accessible only over a serial port and with no configured internet access (and its system time would be wrong, too). I suppose there might exist some case where this is what we need, but I don't buy it. Let's go through and fix all those things!
First off, let's update our base image with all the things we'll need later:
Note that I'm referencing a few files there: dhclient.service is just a simple systemd unit file for starting our DHCP client and expand-root is copied from the most recent in the chain of open-source projects Google used for creating their public VM images (aside: this project is archived, does anyone know if there's a successor?).
We're going to want to make sure our root file system gets mounted with write permissions (but keep in mind that from here on if you run your image in QEMU for testing, any changes you make will be reflected in the disk.img file!):
We'll also want our instance to boot on its own as the root user rather than hanging until we connect to the serial port and log in from there:
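One common way to get this behaviour with systemd is a drop-in override for the serial getty, sketched below. It assumes the console is ttyS0 and that the image is mounted at ./mnt; the drop-in filename is arbitrary.

```shell
# Sketch only: log root in automatically on the serial console.
MOUNT_POINT="${MOUNT_POINT:-mnt}"
override_dir="$MOUNT_POINT/etc/systemd/system/serial-getty@ttyS0.service.d"
mkdir -p "$override_dir"
# The quoted heredoc delimiter keeps $TERM literal so systemd expands it, not this shell.
# The empty ExecStart= line clears the unit's default command before replacing it.
cat > "$override_dir/autologin.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=-/sbin/agetty --autologin root --keep-baud 115200,38400,9600 %I $TERM
EOF
```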
Finally, Google provides some configuration recommendations: all of these are optional, but they're pretty much strictly a good idea in our case:
Last Mile: Using the Image in GCP
At this point, our image will run in GCP with all the options and features of any other GCP VM image. If we want to update our VM with any extra dependencies, all we need to do is update our Dockerfile and re-run the build.
To get our image loaded into GCP, we need to convert it to the expected format and push it to Google Storage:
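A sketch of the packaging and upload step: GCP expects a raw disk named disk.raw inside a gzipped tar in the oldgnu format. The function name, bucket name, and archive name are hypothetical, and GNU tar is assumed.

```shell
# Sketch only: package a raw disk image the way GCP expects and upload it.
package_and_upload() {
    disk_image="$1"                 # e.g. disk.img
    bucket="$2"                     # hypothetical bucket name
    archive="${3:-image.tar.gz}"
    # The file inside the archive must be called disk.raw.
    [ "$disk_image" = "disk.raw" ] || cp "$disk_image" disk.raw || return 1
    # -S keeps sparse regions compact; --format=oldgnu is required by GCP.
    tar --format=oldgnu -Sczf "$archive" disk.raw || return 1
    gsutil cp "$archive" "gs://${bucket}/${archive}"
}

# Usage: package_and_upload disk.img my-bucket image.tar.gz
```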
You may have noticed I referred to that image as "unoptimized" -- that's because we've not yet attached all the final stuff that makes an image truly useful on GCE, things like IAM integration and such. Fortunately, GCP actually has a built-in import tool which can do this for us:
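Roughly, and with the caveat that the flags here are my best reading of the gcloud CLI rather than the exact commands from the experiments repository, this could be a two-step process: register the uploaded archive as an intermediate image, then feed that image to the import tool. The function name, the "-unoptimized" suffix, and the debian-9 OS identifier are assumptions.

```shell
# Sketch only: register an uploaded archive, then run it through the GCP import tool.
optimize_image() {
    image_name="$1"                 # hypothetical final image name
    source_uri="$2"                 # e.g. gs://my-bucket/image.tar.gz
    os_id="${3:-debian-9}"          # assumed OS identifier
    # Step 1: a plain image straight from the uploaded archive.
    gcloud compute images create "${image_name}-unoptimized" \
        --source-uri "$source_uri" || return 1
    # Step 2: let the import tool produce the final image from it.
    gcloud compute images import "$image_name" \
        --source-image "${image_name}-unoptimized" \
        --os "$os_id"
}

# Usage: optimize_image my-image gs://my-bucket/image.tar.gz
```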
And that's it! We can now create instances from the custom versioned images which work the same way any of the provided-by-Google base images do:
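For example, assuming the image lives in the current project, an instance could be created along these lines. The function name, instance name, image name, and zone are placeholders.

```shell
# Sketch only: start a VM from the custom image.
create_instance() {
    instance_name="$1"              # hypothetical, e.g. my-model-sandbox
    image_name="$2"                 # the custom image created earlier
    zone="${3:-us-central1-a}"      # assumed zone
    gcloud compute instances create "$instance_name" \
        --image "$image_name" \
        --zone "$zone"
}

# Usage: create_instance my-model-sandbox my-image us-central1-a
```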