Some IT problems are easy to solve if one has the ability to control parameters and environments. When there are constraints, however, things can get really tough.
I found myself in such a constrained scenario. I needed to develop a GitLab CI/CD pipeline to install and run the Ruby application kitchen-terraform (hosted on GitHub) on a GitLab CI/CD executor where I could install only a very limited set of OS packages (GNU/Linux, CentOS-based, yum). The Ruby version available was very old. In addition, the standard way of installing Ruby gems (language packages) from https://rubygems.org/gems was unavailable. The repository hosting the Ruby source was not accessible from the executor either.
Add to this that the work environment available to me was a highly constrained Microsoft Windows system. I could not install Ruby or any other software on that machine. I was able to send myself brief emails from outside systems, however. I could also download a Ruby source bundle via the browser, and Ruby gems were reachable from there. This Windows system had an install of Python 3.9 with the standard library along with a limited set of additional PyPI packages. Separately, I had an isolated Linux system over which I had full control.
So, let's break this problem down into parts. First, let's build Ruby from source. Via the Windows system I downloaded the Ruby source package ruby-2.7.6.tar.gz, committed it to my GitLab git repo, and started populating the vendor directory with it. Vendor is a typical directory that has meaning to Ruby and its gem ecosystem. Building Ruby from source is straightforward. We need to install some OS packages to be able to build Ruby and do a few other things. Once the commands are known to work well, squelch their output so the transcript of the GitLab pipeline is not noisy. This will do it:
yum groupinstall 'Development Tools' -y &> /dev/null
yum install openssl-devel openssl -y &> /dev/null
tar xf vendor/ruby-2.7.6.tar.gz
cd ruby-2.7.6
./configure --without-rdoc > /dev/null
make > /dev/null
# GitLab CI/CD pipelines run with root as the user
# so no sudo required on this next command
make install > /dev/null
Building Ruby from source takes a bit of time. My pipeline is going to run regularly, so it would be nice not to have to do that on every run. I cannot permanently change the GitLab executor, though that would be ideal. But GitLab CI/CD has a cache feature, which allows one to save a portion of the directory structure from the prior run of the pipeline and restore it before the next run. So, what if we captured the full Ruby build and set that to be cached? Then we would just run 'make install'. Seems to make sense.
However, the Ruby build process, which uses a lot of makefiles, is a horrendous mess. It actually compiles C code during 'make install'. Yeah, that is badly broken. Besides not building Ruby every time, it would be ideal to not even have to install the Development Tools via yum, because that takes unnecessary time. So, I began picking apart the makefiles to figure out what 'make install' is actually doing and how those steps can be subset. About 8 hours later, I had the minimal set of commands. However, a helper makefile needs to be added to the suite of other makefiles. Let's call this makefile 'only-install-ext.mak' and place it at the root directory of the Ruby build (ruby-2.7.6), next to GNUmakefile. This is the contents to put in that file:
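# target name matches the one invoked by the install script; recipe lines must start with a tab
install_ext_special:
	${INSTRUBY} --make="$(MAKE)" $(INSTRUBY_ARGS) --install=ext-comm
	${INSTRUBY} --make="$(MAKE)" $(INSTRUBY_ARGS) --install=ext-arch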
Here is the minimal script to install a previously built Ruby:
yum install make -y > /dev/null
make do-install-bin
make do-install-lib
make -f GNUmakefile -f only-install-ext.mak install_ext_special
make do-install-gem
Ok, one more optimization is to reduce the size of the cache GitLab maintains. This means less time restoring the cache. With some iterations, one can determine which directories of the Ruby build are really not needed: delete a directory and see whether the above install process still works. These directories are not needed: basictest benchmark bootstraptest cygwin doc sample spec test win32. So, in the full build process, we remove these directories so that, at the end of the pipeline execution, they are not present to be added to the cache. In the pipeline configuration, we can add this to capture the cache:
build_job:
  ...
  cache:
    key: ruby
    paths:
      - ruby-2.7.6
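The trial-and-error pruning described above (remove a directory, check whether the install still works) can be sketched in Python. This is only an illustration under assumptions: the function name find_unneeded_dirs and the install_works callback are hypothetical, and the callback stands in for whatever runs the install steps and reports success.

```python
import shutil
import tempfile
from pathlib import Path


def find_unneeded_dirs(build_root, candidates, install_works):
    """Remove candidate directories one by one, keeping only removals that are safe.

    install_works is a caller-supplied (hypothetical) callable that takes the
    build root and returns True when the install steps still succeed.
    """
    build_root = Path(build_root)
    unneeded = []
    for name in candidates:
        target = build_root / name
        if not target.is_dir():
            continue  # nothing to test for this candidate
        # Park the directory in a temporary location so it can be restored.
        with tempfile.TemporaryDirectory() as holding:
            parked = Path(holding) / name
            shutil.move(str(target), str(parked))
            if install_works(build_root):
                # Install still works: leave it removed, parked copy is discarded.
                unneeded.append(name)
            else:
                # Install broke: put the directory back where it was.
                shutil.move(str(parked), str(target))
    return unneeded
```

The removals are cumulative, so the returned list describes a set of directories that can all be absent at the same time, which is what the cache needs.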
So, here is the full shell script (bash) for the building/installing Ruby. We start by checking if the cache has been restored. If it has, we just do the "make install" process.
if [ -d ruby-2.7.6 ]; then
    cd ruby-2.7.6
    yum install make bind-utils openssl -y > /dev/null
    make do-install-bin
    make do-install-lib
    make -f GNUmakefile -f only-install-ext.mak install_ext_special
    make do-install-gem
    cd ..
else
    yum groupinstall 'Development Tools' -y &> /dev/null
    yum install openssl-devel bind-utils openssl -y &> /dev/null
    tar xf vendor/ruby-2.7.6.tar.gz
    cd ruby-2.7.6
    ./configure --without-rdoc > /dev/null
    make > /dev/null
    rm -rf basictest benchmark bootstraptest cygwin doc \
        sample spec test win32
    cd ..
    # some additional work done in this else block ...
    ...
fi
Ok, Ruby building/installing is taken care of in an optimized fashion. Next, to be able to run kitchen-terraform, we need some gems installed. In fact, it is quite a large number, in a very twisted hierarchy of dependencies. It turns out there is an easy way to get these gems (found after I started down the hard path). The easy way is to use the Linux system over which I have full control to build an exhaustive list of gems with versions. Do this by first installing the very same ruby-2.7.6 version from source (it seems all Linux distributions are in the dark ages with the Ruby versions their OS packages support). Once that is set up, change to a work directory. Create a file called "Gemfile" with this content:
source "https://rubygems.org"
gem "kitchen-terraform", "~> 6.1"
This Gemfile can be used for the Ruby "gem" command but also the super-powered "bundle" command. So, run the bundle command and capture the output.
bundle install > bundle.log
That output file will contain the names and versions of the full set of gems required for kitchen-terraform. Edit this down to a file listing one gem name and version per line. The contents of this file can safely be emailed to the restricted Windows system. Now these gems must be downloaded from rubygems.org. But there are so many that you really don't want to do this via your browser. So, Python to the rescue. This will read a file listing gems and versions and download each one.
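import sys
import requests

def getgem(name, version):
    """Download one gem file from rubygems.org into the current directory."""
    filename = f'{name}-{version}.gem'
    url = f'https://rubygems.org/downloads/{filename}'
    r = requests.get(url, timeout=60)
    r.raise_for_status()  # fail loudly rather than save an error page
    with open(filename, 'wb') as fp:
        fp.write(r.content)

# usage: python <this script> <listing file>
# assumes each line of the listing is "name version", separated by whitespace
with open(sys.argv[1]) as listing:
    for line in listing:
        if line.strip():
            getgem(*line.split()[:2])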
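The earlier step of editing bundle.log down to a name-and-version listing can also be scripted. The sketch below assumes the log contains lines of the form "Installing NAME VERSION" and "Using NAME VERSION", which is what bundler typically prints; the function name gems_from_bundle_log is hypothetical, and the output should be checked by eye against the real log.

```python
import re

# Matches lines such as "Installing ffi 1.15.5 with native extensions"
# or "Using bundler 2.1.4". The wording is assumed from typical bundler output.
_GEM_LINE = re.compile(r'^(?:Installing|Using)\s+(\S+)\s+(\d\S*)')


def gems_from_bundle_log(text):
    """Return a sorted list of (name, version) pairs found in bundler output."""
    found = set()
    for line in text.splitlines():
        match = _GEM_LINE.match(line.strip())
        # bundler itself is skipped because it ships with the Ruby build
        if match and match.group(1) != 'bundler':
            found.add((match.group(1), match.group(2)))
    return sorted(found)
```

Writing each pair as "name version" on its own line gives a listing in the shape the download script expects.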
Now, each of these gems is actually a tar archive (a plain .tar file wrapping gzip-compressed data and metadata), not a special binary format. They can be put in the vendor/cache directory in our repo where "bundle" can find them.
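Because a .gem file is a tar archive, Python's standard tarfile module can confirm that a download really is a gem and not, say, an error page saved under a .gem name. A minimal sketch follows; the function name gem_members is hypothetical.

```python
import tarfile


def gem_members(path):
    """Return the member names of a .gem file, or None if it is not a tar archive."""
    if not tarfile.is_tarfile(path):
        return None
    with tarfile.open(path) as archive:
        return archive.getnames()
```

A well-formed gem typically lists members such as metadata.gz and data.tar.gz, so a None result or an unexpected member list is a hint to download that gem again.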
Next, create the Gemfile in our repo with this content:
gem "kitchen-terraform", "6.1.0"
Now, about these gems. Most of them are pure Ruby code. Nothing special is required to install these gems besides the downloaded .gem files. However, a small number of these gems require compiling C code. That means we need the yum group 'Development Tools' installed. We don't want to have that installed routinely because it slows down the pipeline. So, how about we build those particular gems at the same time we are building Ruby from source, while we have the 'Development Tools' installed? Then, we can zip them up and save the result as a GitLab pipeline artifact, which can subsequently be downloaded and inserted into our GitLab repo.
To build these compiled gems, we need to install them in a special location, because lots of files are created and we want to keep those separate from the other gems installed when Ruby was built from source. The following shell script code installs the gems in the local directory, in these subdirectories: build_info cache doc extensions gems specifications. The zip will then be unpacked on subsequent runs in the system gem directory.
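One way to find out which vendored gems need a compiler is to look at each gem's metadata. The sketch below assumes the usual gem layout, where a metadata.gz member holds the gemspec as YAML and pure-Ruby gems carry the line "extensions: []"; the function name has_native_extensions is hypothetical, and the result is a guess to be confirmed by an actual install.

```python
import gzip
import re
import tarfile


def has_native_extensions(gem_path):
    """Guess whether a .gem declares native extensions in its gemspec metadata."""
    with tarfile.open(gem_path) as archive:
        member = archive.extractfile('metadata.gz')
        metadata = gzip.decompress(member.read()).decode('utf-8')
    # Assumed format: "extensions: []" for pure Ruby, otherwise "extensions:"
    # followed by a YAML list of extconf.rb (or similar) files.
    match = re.search(r'^extensions:[ \t]*(.*)$', metadata, re.MULTILINE)
    return bool(match) and match.group(1).strip() != '[]'
```

Running this over every file in vendor/cache would give a candidate list of gems to build while the 'Development Tools' are installed.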
# install these compiled gems from vendor directory and
# capture the zip
for gem in bcrypt_pbkdf-1.1.0 bson-4.15.0 ed25519-1.3.0 \
    ffi-1.15.5 unf_ext-0.0.8.2 json-2.6.2; do
    gem install -i . -N -V --local "vendor/cache/${gem}.gem" \
        > /dev/null
done
# now save results as zip file
zip -r compiled_gems.zip build_info cache doc extensions \
    gems specifications > /dev/null
# build it once, set it as a job artifact, download it,
# and store it in vendor/
# the flag below marks that the gems were built in this run
compiled_gems="yes"
You will see that compiled_gems is a flag, which if not set, will trigger installing these gems from the zip file later in this code. We change directory to the system gem install directory for unpacking the zip:
# install pre-built gems so we don't need to install
# developer tools every run
if [ -z "$compiled_gems" ]; then
    pushd /usr/local/lib/ruby/gems/2.7.0 > /dev/null
    unzip "$CI_PROJECT_DIR/vendor/compiled_gems.zip" > /dev/null
    popd > /dev/null
fi
These installations and the others can be tested by querying with the "gem" command:
# check the compiled gems are available
gem list '^(json|unf_ext|bcrypt_pbkdf|bson|ed25519|ffi)' -d
# and a few of the others
gem list '^(aws-eventstream|azure_graph_rbac)' -d
By the way, you can put this in the GitLab pipeline to capture compiled_gems.zip.
build_job:
  ...
  artifacts:
    paths:
      - compiled_gems.zip
So, if everything is put together, this is the full setup shell script:
if [ -d ruby-2.7.6 ]; then
    cd ruby-2.7.6
    yum install make bind-utils openssl -y > /dev/null
    make do-install-bin
    make do-install-lib
    make -f GNUmakefile -f only-install-ext.mak install_ext_special
    make do-install-gem
    cd ..
else
    yum groupinstall 'Development Tools' -y &> /dev/null
    yum install openssl-devel bind-utils openssl -y &> /dev/null
    tar xf vendor/ruby-2.7.6.tar.gz
    cd ruby-2.7.6
    ./configure --without-rdoc > /dev/null
    make > /dev/null
    rm -rf basictest benchmark bootstraptest cygwin doc sample \
        spec test win32
    cd ..
    # install these compiled gems from vendor directory and
    # capture the zip
    for gem in bcrypt_pbkdf-1.1.0 bson-4.15.0 ed25519-1.3.0 \
        ffi-1.15.5 unf_ext-0.0.8.2 json-2.6.2; do
        gem install -i . -N -V --local "vendor/cache/${gem}.gem" \
            > /dev/null
    done
    # now save results as zip file
    zip -r compiled_gems.zip build_info cache doc extensions gems \
        specifications > /dev/null
    # build it once, set it as a job artifact, download it, and
    # store it in vendor/; the flag marks that the gems were built here
    compiled_gems="yes"
fi
# ruby location, /usr/local/bin, is already on path
echo -n 'Ruby version: '
ruby --version
# install pre-built gems so we don't need to install
# developer tools every run
if [ -z "$compiled_gems" ]; then
    pushd /usr/local/lib/ruby/gems/2.7.0 > /dev/null
    unzip "$CI_PROJECT_DIR/vendor/compiled_gems.zip" > /dev/null
    popd > /dev/null
fi
# install remaining vendored gems; get gems here: https://rubygems.org/gems
bundle install --local &> bundle.log
# show the log only when its length differs from a known-good run
log_lines=$(wc -l < bundle.log)
normal_log_lines=260
if [ "$log_lines" -ne "$normal_log_lines" ]; then
    echo "--- bundle.log lines=$log_lines ---"
    cat bundle.log
fi
# check the compiled gems are available
# gem list '^(json|unf_ext|bcrypt_pbkdf|bson|ed25519|ffi)' -d
# and a few of the others
# gem list '^(aws-eventstream|azure_graph_rbac)' -d
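The line-count comparison on bundle.log works, but it has to be updated whenever the gem set changes. As an alternative sketch (not what the script above does), the log contents can be inspected instead. This assumes bundler prints a line starting with "Bundle complete!" on success and lines starting with "Could not find" for missing gems; the function name check_bundle_log is hypothetical.

```python
def check_bundle_log(log_text):
    """Return (ok, problems) for the text of a bundle install log.

    ok is True only when a "Bundle complete!" line is present and no
    "Could not find" lines were reported (both markers are assumptions
    about bundler's usual wording).
    """
    lines = [line.strip() for line in log_text.splitlines()]
    completed = any(line.startswith('Bundle complete!') for line in lines)
    problems = [line for line in lines if line.startswith('Could not find')]
    return completed and not problems, problems
```

In a pipeline, a small wrapper could read bundle.log, call this function, and print the log only when ok is False.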