I am looking for an easy way to pre-pull the Docker images required to run a specific Rancher version. When I run the “Registration Command” on a new node, it first downloads the rancher-agent image, and then during installation it downloads all the other images, e.g. rke-tools, rancher/hyperkube and more. I want to prepare an EC2 AMI with the Rancher images already pulled. I found the rancher/pre-pull-images repository on GitHub ("Pre-pull Rancher infrastructure images based on Rancher version"), but it seems outdated and has no README. Is there any tool that I could use to automate this process?
There are a few options for this. You could bake an AMI with HashiCorp Packer and, as part of provisioning, pull all the images that Rancher needs (the images and versions are published for each release). Of course that brings up the debate about build-time provisioning (bake) vs. launch-time provisioning (fry), and the potential maintenance overhead for version upgrades and so on. OTOH you do benefit from faster node launch and scaling, and you remove one potential failure condition at the moment you least want it to happen (launching a node in Production). It depends which matters more to you.
Have you looked at the Rancher air-gap installation instructions?
IME this is more critical when using Windows nodes, since those images are typically much larger, and/or when you are trying to minimise costs by scaling your node pool up and down more dynamically. Be careful with that, as it can easily turn into a bit of a nightmare. Personally I would prefer over-provisioning using cheaper spot instances, but YMMV, or your company policy may not allow that in Prod.
Another consideration is where you are pulling your images from. Pulling from your own private registry on your internal network (or within your cloud VPC) will typically be significantly faster, which may reduce the time required to create and scale your cluster(s). It also keeps your base AMI simpler: in all likelihood you will only need one generic version, with upgrades driven mostly by OS security patching.
Yes, I am using Packer for this. I have my own scaling solution, and I want to have faster worker boot up.
I already found out that a rancher-images.txt file is published with every new Rancher release, so I will just add the appropriate docker pull commands, based on this file, to my Ansible provisioner. However, I am not sure whether downloading all 400 images is the best option. Maybe, like you said, creating a local private registry might be simpler and almost as fast as baking all the images into a new AMI. I will check it and compare.
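A minimal sketch of that idea in Python, in case it is useful alongside (or instead of) an Ansible task. It assumes rancher-images.txt has one image reference per line; the function names and the dry_run flag are made up for illustration, and skipping lines that start with "#" is only a precaution.

```python
import subprocess
from typing import Iterable, List


def parse_image_list(text: str) -> List[str]:
    """Return unique image references in file order, skipping blank lines."""
    seen = set()
    images = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: the file is a plain list; '#' lines are skipped defensively.
        if not line or line.startswith("#"):
            continue
        if line not in seen:
            seen.add(line)
            images.append(line)
    return images


def pull_images(images: Iterable[str], dry_run: bool = True) -> List[List[str]]:
    """Build one 'docker pull' command per image and run them unless dry_run is set."""
    commands = [["docker", "pull", image] for image in images]
    if not dry_run:
        for cmd in commands:
            # check=True stops the bake on the first failed pull.
            subprocess.run(cmd, check=True)
    return commands
```

During the Packer build you would read the file with open("rancher-images.txt").read(), pass the result through parse_image_list, and call pull_images(..., dry_run=False).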
The 400+ images (excluding Windows) obviously cover all versions of each component (for example, 9 versions of k8s), so unless you want to support every version you won’t necessarily need all of them. That said, it’s non-trivial to go through and match each version of every image back to its k8s release, so TBH it’s easier from a maintenance POV to just add them all. Depending on where you are putting them (e.g. a private registry) you may benefit from cross-repository caching (not all registries support that, but many do), so the loading overhead isn’t always as bad as it first appears. The same is true when you upgrade. Also, the image pull/push is a build-time activity, so it is not time-sensitive in the way that launching a k8s node is.
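To get a feel for how many versions of each component the list carries before deciding what to keep, something like the following could help. It is only a sketch: the function name is hypothetical, and it simply splits each reference into repository and tag.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def tags_by_repository(images: Iterable[str]) -> Dict[str, List[str]]:
    """Group image references by repository, collecting the tags seen for each."""
    grouped = defaultdict(list)
    for image in images:
        name, sep, tag = image.rpartition(":")
        # A ':' followed by a '/' belongs to a registry port, not a tag.
        if not sep or "/" in tag:
            name, tag = image, "latest"  # Docker's default tag when none is given
        grouped[name].append(tag)
    return dict(grouped)
```

Sorting the result by len(tags) shows which repositories account for most of the list.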
These docs (which I expect you have read) provide all the necessary steps and resources needed for managing the save and load process:
At my last place of work we used a variant of this, where we pulled, scanned and pushed images into our private registry on AWS (ECR) via a simple CI pipeline, and used rancher-images.txt to drive that.
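Roughly, the pull and push part of such a pipeline could be sketched like this in Python. This is not the pipeline we actually ran: the function names are invented for illustration, the registry host is whatever your ECR (or other) endpoint is, and the scan step is deliberately left out because it depends on your scanner.

```python
from typing import List


def private_ref(image: str, registry: str) -> str:
    """Map a public image reference to the same repository path under a private registry."""
    first, sep, rest = image.partition("/")
    # A first path component containing '.' or ':' (or 'localhost') is a registry host.
    if sep and ("." in first or ":" in first or first == "localhost"):
        image = rest
    return f"{registry.rstrip('/')}/{image}"


def mirror_commands(image: str, registry: str) -> List[List[str]]:
    """Return the docker commands that copy one image into the private registry."""
    target = private_ref(image, registry)
    return [
        ["docker", "pull", image],
        # A scan step would go here; only images that pass should be pushed.
        ["docker", "tag", image, target],
        ["docker", "push", target],
    ]
```

Each command list can be passed to subprocess.run(cmd, check=True) from the CI job, once per line of rancher-images.txt.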
It may be different for you, but we only allowed images from our private registry to be deployed into any of our environments, and only images that passed our image assurance policies were permitted to be pushed into that registry (we used aqua-sec image assurance policies for that purpose). Interestingly, quite a few of the Rancher-provided images did NOT pass, and a few contained a significant number of high-severity vulnerabilities. We worked with our Rancher technical account manager and our own CISO to mitigate those, either by getting them fixed in the image or by obtaining a CISO exception to the policy (never an easy conversation!). Our CISO would NEVER allow us to deploy images directly from DockerHub or any other external registry, which, while adding to our workload, did protect us from a number of potential threats.
Thanks for the advice. I did some tests and decided that the best option for me is to pre-pull some images into the Packer-built image. However, I decided not to download all 400, but only the ones related to worker nodes. I still download all versions of each image, but that comes to around 60, so not so bad.
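For anyone wanting to do the same, the filtering can be sketched as below. The repository set is only an assumption based on the images named earlier in this thread; check which images your own worker nodes actually pull and adjust it.

```python
from typing import Iterable, List, Set

# Assumption: illustrative set only, taken from images mentioned in this thread.
WORKER_REPOS: Set[str] = {
    "rancher/rancher-agent",
    "rancher/rke-tools",
    "rancher/hyperkube",
}


def repo_of(image: str) -> str:
    """Strip the tag from an image reference, leaving the repository name."""
    name, sep, tag = image.rpartition(":")
    # Only treat the suffix as a tag if it does not contain a path separator.
    if sep and "/" not in tag:
        return name
    return image


def select_worker_images(images: Iterable[str], repos: Set[str] = WORKER_REPOS) -> List[str]:
    """Keep every version of the images whose repository is in the chosen set."""
    return [image for image in images if repo_of(image) in repos]
```

The filtered list keeps all tags of each selected repository, which matches pulling every version of the worker-related images.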