Saturday, February 10, 2024

Recent Python for an Amazon Linux 2 Based Container

A coworker had a process to publish a static website via Vercel. Vercel runs its publishing jobs in a container running Amazon Linux 2 (AL2). This coworker's publishing job had to invoke a Python program that required at least Python 3.10. If you know CentOS, from which AL2 was derived, you know that the Python versions it supports are ancient. The nominal solution is to build Python from a source distribution to get a later version installed. There is an option to use the EPEL package source, but that source works very differently than it used to, and it used to be easier to work with. So, building from source is the way to go. However, the coworker was frustrated by how long this process took on the Vercel platform.

So, enter me. After some effort, I was able to convince this coworker that, yes, Python must be built from source. However, could we build Python in a pre-deployment step that does not run on Vercel, and then simply perform a binary install on Vercel? A binary package of all the files installed by the Python build could be created and saved into the Git repository. Including such a large binary in the repo is maybe not ideal, but it should be workable.

A few Python packages are also required by the Python program that needs to run as part of the deployment. We might as well include these in the binary we assemble, since that takes less time at deployment (no network access or other overhead).

Now, this binary would only need to be updated if Vercel were to use a different AL2 container image or if the set of required Python packages were to change.
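
The rebuild condition above can be checked mechanically. What follows is only a minimal sketch, not part of the original setup: the function names and the idea of a stamp file are hypothetical. It records the build inputs (container image tag, Python version, package list) next to the archive and reports when they no longer match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether python_install.7zip needs rebuilding.
# The function names and the stamp file are illustrative assumptions.

# build_inputs IMAGE PYTHON_VERSION [PACKAGE...]
# Print the build inputs one per line, with packages in a stable (sorted) order.
build_inputs() {
    local image="$1" python_version="$2"
    shift 2
    printf 'image=%s\n' "$image"
    printf 'python=%s\n' "$python_version"
    printf 'package=%s\n' "$@" | sort
}

# needs_rebuild STAMP_FILE IMAGE PYTHON_VERSION [PACKAGE...]
# Return 0 (true) when the stamp file is missing or differs from the inputs.
needs_rebuild() {
    local stamp_file="$1"
    shift
    local current
    current="$(build_inputs "$@")"
    [[ ! -f "$stamp_file" ]] || [[ "$(cat "$stamp_file")" != "$current" ]]
}

# write_stamp STAMP_FILE IMAGE PYTHON_VERSION [PACKAGE...]
# Record the inputs after a successful build.
write_stamp() {
    local stamp_file="$1"
    shift
    build_inputs "$@" > "$stamp_file"
}
```

A wrapper could call needs_rebuild with the image tag, the Python version, and the package names before starting the build, then call write_stamp with the same arguments once the archive has been created.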

So, the results looked like this. This script is used to perform the binary image build: 

#!/usr/bin/env bash

PYTHON_VERSION='3.11.7'
PYTHON_MINOR_VERSION='3.11'
SEVEN_Z_VERSION='2301'

cd /build-python || exit 1

# Python 3.10+ needs OpenSSL 1.1.1+, so replace openssl-devel with the openssl11 packages.
yum erase openssl-devel -y
yum install tar gzip make gcc openssl11 openssl11-devel libffi-devel bzip2-devel \
   zlib-devel xz xz-libs -y

# Download, build, and install Python under /usr/local.
curl -O "https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz"
tar xzf "Python-${PYTHON_VERSION}.tgz"
cd "Python-${PYTHON_VERSION}" || exit 1
./configure
make altinstall
cd ..

# Install the Python packages the deployment needs.
"/usr/local/bin/pip${PYTHON_MINOR_VERSION}" install nbconvert jupyter-console pandoc

# 7z is better at compressing, yet on Feb 2, 2024 their cert was expired.
# Temporarily adding "-k" to this next command. Try without at later time.
curl -k -O "https://7-zip.org/a/7z${SEVEN_Z_VERSION}-linux-x64.tar.xz"
tar xf "7z${SEVEN_Z_VERSION}-linux-x64.tar.xz"
./7zz a python_install.7zip /usr/local/bin /usr/local/lib

# Clean up everything except the 7zz executable and the archive.
rm -rf "Python-${PYTHON_VERSION}" "Python-${PYTHON_VERSION}.tgz" "7z${SEVEN_Z_VERSION}-linux-x64.tar.xz" \
        7zzs History.txt License.txt MANUAL readme.txt

Using 7zip for the binary assembly saves almost half the space compared with a compressed tar file, though getting 7z took a few extra steps. We use the altinstall make target of the Python makefile, which installs the versioned executables (python3.11, pip3.11) without creating an unversioned python3, and all files are installed under /usr/local.
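
To check the space saving on your own build, you could archive the same directories both ways and compare the file sizes. The helper below is a rough sketch only; the function names are hypothetical, and the compressed tar file is something you would have to create yourself (it is not produced by the build script).

```shell
#!/usr/bin/env bash
# Hypothetical helper for comparing two archives of the same installation.

# file_size FILE: print the size of FILE in bytes.
file_size() {
    wc -c < "$1" | tr -d '[:space:]'
}

# percent_saved BASELINE CANDIDATE: print how much smaller CANDIDATE is than
# BASELINE as a whole-number percentage (integer division truncates).
percent_saved() {
    local baseline candidate
    baseline="$(file_size "$1")"
    candidate="$(file_size "$2")"
    if (( baseline == 0 )); then
        echo "baseline archive is missing or empty: $1" >&2
        return 1
    fi
    echo $(( (baseline - candidate) * 100 / baseline ))
}
```

For example, after running tar czf python_install.tar.gz /usr/local/bin /usr/local/lib inside the build container (an assumed file name), percent_saved python_install.tar.gz python_install.7zip prints the percentage saved by the 7zip archive.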

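The following script kicks off the build of the binary assembly on, for example, my coworker's laptop. It assumes the above script is found at the path build-python/build-python.sh and, because the current directory is mounted at /build-python in the container, that it is run from inside that build-python directory.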

#!/usr/bin/env bash

VERCEL_BASE_CONTAINER='amazonlinux:2.0.20191217.0'

# Build the data needed on Vercel deployment: a 7z executable and an archive of a Python installation

docker run --rm -it -v "$(pwd)":/build-python "${VERCEL_BASE_CONTAINER}" /bin/bash /build-python/build-python.sh

Finally, the Vercel deployment can be run with a script something like this:

#!/bin/bash

# Install python setup with installed packages
./7zz x -spe -o/usr/local/ python_install.7zip
PATH="/usr/local/bin:$PATH"
npm run fixDocs && docusaurus build
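
After the extraction step in the deployment script, a quick sanity check can catch a stale or broken archive before the site build starts. This is a sketch under assumptions, not part of the original deployment: the function name is hypothetical, and the module names you pass in are your own guess at what the Python program imports.

```shell
#!/usr/bin/env bash
# Hypothetical post-extraction check; the function name and arguments are illustrative.

# verify_python_install PREFIX MINOR_VERSION [MODULE...]
# Return non-zero if the interpreter is missing or any module fails to import.
verify_python_install() {
    local prefix="$1" minor="$2"
    shift 2
    local python="${prefix}/bin/python${minor}"

    if [[ ! -x "$python" ]]; then
        echo "missing interpreter: $python" >&2
        return 1
    fi

    local module
    for module in "$@"; do
        if ! "$python" -c "import ${module}" >/dev/null 2>&1; then
            echo "cannot import: $module" >&2
            return 1
        fi
    done
    return 0
}
```

In the deployment script this might be called as verify_python_install /usr/local 3.11 nbconvert jupyter_console || exit 1, placed right after the 7zz extraction line.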


It takes maybe 10 minutes to run the process that builds the binary assembly. The resulting archive is about 62 MB. The Vercel deployment is now significantly faster and less complicated.