I needed a cheap aarch64 Linux GitHub Actions runner for building R and R packages. I ended up with a VM on Oracle Cloud that makes use of their “always free resources” offer. It runs each container job in isolation, in a new Docker container. This post documents how I set it up.
Create the VM
I started with a free OCI subscription and ran into “out of capacity” errors when creating a VM with always free resources. The solution was to upgrade to a “pay as you go” subscription.
It is important to read the conditions for always free resources carefully and create the VM accordingly.
In particular, I

- chose my home region,
- used the `VM.Standard.A1.Flex` shape, and
- selected the Ubuntu 22.04 Minimal aarch64 OS.
The current limits of the always free resources are 4 OCPUs, 24 GB memory and 200 GB disk. One can use all of them for a single machine, or create up to four machines. As long as the totals add up to these limits, the resources are free. No need to worry too much about the floating cost estimate box that says the VM will incur costs; it is simply wrong.
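I created the VM in the OCI web console. For reference, a roughly equivalent launch with the OCI CLI might look like this (a sketch I have not used myself; all the OCIDs and the availability domain name are placeholders):

```sh
# Launch an always-free-sized aarch64 VM (values match the console setup above)
oci compute instance launch \
  --availability-domain <ad-name> \
  --compartment-id <compartment-ocid> \
  --shape VM.Standard.A1.Flex \
  --shape-config '{"ocpus": 4, "memoryInGBs": 24}' \
  --image-id <ubuntu-22.04-minimal-aarch64-image-ocid> \
  --subnet-id <subnet-ocid> \
  --ssh-authorized-keys-file ~/.ssh/id_rsa.pub
```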
Configure the VM
When the VM is up and running, I set up a couple of things to make everyday maintenance easier. None of these are strictly necessary, except for installing Docker. Your mileage may vary.
It seems like a good idea to restrict access to the VM. By default SSH is available from any IP address on OCI. To restrict this, I went to ‘Networking’ -> ‘Virtual cloud networks’, then chose the network of the new VM, then ‘Security List Details’, and edited the rule for port 22 to allow only a single IP address.
I like to add remote hosts to the `.ssh/config` file. E.g. for my new aarch64 VM I have:

```
host arm
    hostname <the VM's public IP>
    user ubuntu
```
Btw. on macOS, I need to edit the `/etc/ssh/ssh_config` file to use `ssh` after every macOS update.
I like to create a swap file on the VM. OCI is quite generous with the memory, but Linux starts killing processes when it runs out of memory, so a swap file can literally save a congested VM. There are plenty of guides on creating swap files, I like this one. For my VM with 12 GB memory I created a swap file of 12 GB. For a smaller machine, e.g. one with 4 GB memory, I would create a swap file of at least 8 GB, if the available disk space allows it.
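For reference, a minimal sketch of the usual steps (the file name and the 12 GB size are my choices, not a requirement):

```sh
# Create and enable a 12 GB swap file
sudo fallocate -l 12G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Keep the swap file across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```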
Then I installed Docker, i.e. the `docker.io` Ubuntu package. I like to add the user to the `docker` group, so I don’t need to use `sudo` to run Docker commands:

```sh
sudo apt-get install docker.io
sudo usermod -aG docker ubuntu
```
The default user is called `ubuntu` on OCI Ubuntu machines; other VMs might have a different name. I had to log out and log in again for this change to take effect.
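To check that the group change took effect (a quick sanity check, not part of the original setup):

```sh
# Should run without sudo after re-login
docker run --rm hello-world
```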
Configure the runner container hooks
By default a self-hosted runner runs each job as the user that runs the runner’s software. This is not great, because jobs are not isolated from each other, or from the rest of the VM. Ideally, each container job should run in a separate container.
GitHub lets us customize self-hosted runners by adding four hooks that correspond to the events that happen when running a job. The main focus of this interface is extensibility and not ease of use, so writing the hooks is a substantial amount of work. Luckily, I was able to use an example implementation without any modifications.
I first cloned the `runner-container-hooks` repo into my user’s home directory:

```sh
sudo apt-get install git
git clone https://github.com/actions/runner-container-hooks.git
```
The example hooks are written in TypeScript, so I needed to install Node.js 20, and compile the hooks from TypeScript to JavaScript. First I installed Node 20.x:
```sh
curl -LO https://nodejs.org/dist/latest-v20.x/node-v20.17.0-linux-arm64.tar.gz
# Unpack into /usr/local (one possible install location)
sudo tar -xzf node-v20.17.0-linux-arm64.tar.gz -C /usr/local --strip-components=1
```
Check if it works:

```sh
node --version
```

```
v20.17.0
```
Then I compiled the hooks:

```sh
cd runner-container-hooks
# Install dependencies and compile the TypeScript sources
# (the exact commands may vary between versions of the repo)
npm install
npm run build
```
Add the self-hosted runner
I followed the GitHub documentation to add a self-hosted runner, but did not start it yet. (From the ‘Settings’ tab of my organization or user I selected ‘Actions’, then ‘Runners’, then ‘New runner’.) I selected an ARM64 Linux runner. I named it `arm`, but the name is not important, and used the default labels: `self-hosted`, `Linux` and `ARM64`. If the runner is for specific jobs, then it makes sense to create more specific labels.
Before running `run.sh`, I needed to set up the runner to use the custom hooks from runner-container-hooks. I added a `.env` file to the directory that contains the runner software, i.e. next to the `run.sh` file. The `.env` file contains this:

```
LANG=C.UTF-8
ACTIONS_RUNNER_CONTAINER_HOOKS=/home/ubuntu/runner-container-hooks/packages/docker/dist/index.js
```
The `ACTIONS_RUNNER_CONTAINER_HOOKS` path points to my runner-container-hooks directory, so it might be different if the VM has a different username. This is all that is needed for the hooks, and the runner is ready to run now!
I find it convenient to run the runner within a screen or tmux session, so I can easily detach its virtual terminal and log out of the VM cleanly:
```sh
sudo apt-get install screen
screen
# inside the screen session, start the runner:
cd ~/actions-runner
./run.sh
```
Pressing `CTRL+a d` (i.e. `CTRL` and `a` together, then let go and `d`) detaches the terminal. Running `screen -list` lists the screen sessions. The output looks like this:
```
ubuntu@arm:~/actions-runner$ screen -list
There is a screen on:
        12867.pts-0.arm (Detached)
1 Socket in /run/screen/S-ubuntu.
```
`screen -r` re-attaches the session:

```sh
screen -r 12867.pts-0.arm
```

and then `CTRL+a d` detaches it again.
Instead of starting the runner from an interactive session, I could also configure it as a service. I am not sure whether configuring multiple runners as services works, though.
The self-hosted runner is running now. In the list of runners at `https://github.com/organizations/<my-org>/settings/actions/runners` it shows up as “idle”.
Run jobs
To run a job on the new runner, I can use one or more of its configured labels. Here is an example:
```yaml
on:
  workflow_dispatch:

jobs:
  test:
    # Select the new runner via its labels
    runs-on: [self-hosted, Linux, ARM64]
    # Run the job in a Docker container, in isolation
    container:
      image: ubuntu:22.04
    steps:
      - run: uname -a
```
This workflow has a `workflow_dispatch` trigger, so I can start it from the GitHub web UI.
Tips and limitations
Using `container:` means that the job will run in a container. If the job needs to run node.js actions, the container needs to support Node.js, version 20.x nowadays. E.g. `actions/checkout@v4`, `actions/upload-artifact@v4`, etc. need Node.js 20.x.
The runner can run Docker in Docker by default! I.e. in the above example I could install the `docker.io` Ubuntu package inside the container, and then call `docker run`, etc. Here is a real example. Note, however, that all these containers are using the same Docker daemon, so they’re going to be running as sibling containers.
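A sketch of what a job step could run inside the container (the image name is an arbitrary choice; this assumes the job container can reach the VM’s Docker daemon, as described above):

```sh
# Inside the job container: install the Docker CLI ...
apt-get update && apt-get install -y docker.io
# ... and start a sibling container via the VM's Docker daemon
docker run --rm ubuntu:22.04 uname -m
```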
Only run trusted code in these containers! Every container has access to the single Docker daemon of the VM, so essentially they have admin access to the VM. Do not run untrusted third-party code! Also, do not kill all Docker containers from within the job; that’ll kill the job as well.
I like to mount `${{ github.workspace }}` into the container, at exactly the same place as on the runner (the VM), and then run commands inside that directory. This makes it easier to copy files between containers when running Docker in Docker.
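For example, with the workspace mounted at the same path, a sibling container can bind-mount the very same directory, because the path is also valid on the VM where the Docker daemon runs (a sketch, with an arbitrary image):

```sh
# $PWD is the workspace path, identical inside the job container and on the VM,
# so the daemon can bind-mount it into a sibling container
docker run --rm -v "$PWD:$PWD" -w "$PWD" ubuntu:22.04 ls -la
```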
It seems that the runner software never cleans up its temporary directory in `_work/_temp`, so it’ll eventually fill up and will cause jobs to fail. I don’t really have a good solution for this currently, apart from a manual cleanup after the failure. I suppose a cron job could work, but it’d have to be careful to only clean up if no jobs are running.
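The manual cleanup is just something like this (the runner directory name is an assumption; only do this when no jobs are running):

```sh
# Remove leftover temporary files from the runner's work directory
rm -rf ~/actions-runner/_work/_temp/*
```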
I can start a second (third, fourth, etc.) runner on the same machine! Use a different directory than the default `actions-runner` and a new `screen` session. This especially makes sense if the VM has more than one OCPU. Keep in mind that both runners will use the same Docker daemon, so for Docker in Docker, the jobs have to be careful to avoid name clashes, i.e. the names of containers, networks, etc. must be different for each job (and each sub-job of a matrix job!), in case they are running concurrently on the VM.
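A sketch of setting up a second runner (the directory and session names are arbitrary; the registration token comes from the same ‘New runner’ page as before):

```sh
# Unpack a second copy of the runner software into its own directory
mkdir ~/actions-runner-2 && cd ~/actions-runner-2
# ... download and extract the runner tarball here, as for the first runner ...
# Register it with a fresh token from the GitHub UI
./config.sh --url https://github.com/<my-org> --token <token>
# Run it in its own screen session
screen -S runner2
./run.sh
```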
Creating more runners is also a way (the only way?) to use the same VM in multiple GitHub organizations.
Improvements
Ideally, the VM would dynamically create runners as they are needed for jobs, up to a limit (probably 4 for the always free OCI VM) and then remove them once the jobs are done. To get four aarch64 Linux runners I don’t really want to create four VMs, or run four independent runners manually on the same VM.
This is certainly possible, but needs a bit more configuration, and some Kubernetes knowledge does not hurt, either. In particular, later I would like to try ARC, the Actions Runner Controller. It should work pretty well on an OCI VM with all the always free resources, i.e. 4 OCPUs, 24 GB memory and 200 GB disk.
Other options
I also considered other options.
The simplest way to use aarch64 runners on GitHub Actions is to pay for their hosted runners.
Another way to run aarch64 Linux is to use multi-architecture Docker on the GitHub hosted runners. This is how I run Linux on s390x on GHA. It works well, but it is quite slow. Some Linux distros are slower than others, e.g. Fedora is very slow, probably because it is compiled for a more modern processor that is harder to emulate. I do use this method for some tasks, but it is not feasible for others: the six-hour time limit is not enough to compile R on Fedora this way. (This is a task that takes less than 10 minutes on a native aarch64 machine!)
All major cloud providers support aarch64 Linux servers: Azure, AWS, Google Cloud, and there are many smaller ones like netcup.
These options cost money, so they could not compete with the Oracle Cloud server.
Updates
2024-09-21: added paragraph about multi-architecture Docker as an alternative. Also moved the ‘Other options’ section to the end.
2024-09-23: added advice about restricting ssh to the VM.
2024-09-24: added note about configuring the runner as a service.