R on s390x Linux on GitHub Actions

Gábor Csárdi

2024-09-09

GitHub Actions, R, R package, big endian, s390x

It is sometimes useful to be able to test your R package on a big-endian architecture, like s390x.

TL;DR

Copy this workflow and adjust it to your needs.

The rest of the post shows how this workflow works.

Create the container

First, we need to create an s390x Docker container. This is mostly straightforward, and not the part that this post focuses on. Nevertheless the latest version of the Docker image is at ghcr.io/r-hub/containers/s390x.

It is built from a Dockerfile in the same repository, using GitHub Actions, specifically this workflow.

The container image uses R 4.1 currently, to avoid having to compile R on s390x.

Only x86_64 containers are welcome on GHA

GitHub Actions has tools to run multi-architecture Docker, so this is relatively easy, with the one catch that it is not possible to run non-x86_64 containers natively. This means that you cannot run actions inside the s390x container, at least not out of the box. Which is a pity, because I would really like to run setup-r-dependencies to install package dependencies, and check-r-package to run R CMD check, on the s390x container.

The rest of this post shows how to create a remote Rscript shell that runs all R script, including the ones from actions, inside the s390x container.

Get the container and start it

We’ll put all the code together into a composite action.

First we need to install qemu:

Install qemuSee on GitHub

- name: Install Qemu
  uses: docker/setup-qemu-action@v3
  with:
    image: tonistiigi/binfmt:qemu-v8.1.5

Specifying the image is important. By default the action uses an older qemu image, which crashes frequently, at least for s390x.

Then we pull the image from a registry and start a container.

Pull the image, start containerSee on GitHub

- name: Pull and start image
  run: |
    R_LIBS_USER=${RUNNER_TEMP}/Library
    echo "R_LIBS_USER=${R_LIBS_USER}" >> ${GITHUB_ENV}
    if [ "${{ inputs.platform }}" == "" ]; then
      PLT=""
    else
      PLT="--platform ${{ inputs.platform }}"
    fi
    docker pull $PLT ${{ inputs.image }}
    docker run -d $PLT --name ${{ inputs.ctr-name }} \
      -v`pwd`:/root \
      -v"$R_LIBS_USER:/usr/local/lib/R/site-library" \
      ${{ inputs.image }} \
      bash -c 'while true; do sleep 10000; done'
  shell: bash

The first couple of lines deal with setting R_LIBS_USER to the runner’s temporary library.

platform will be an input parameter that we pass to the --platform option of docker pull and docker run, unless it is empty.

We’ll set the name of the container, also an input, so we can refer to it later to run code on it.

In this example we use two bind mounts, one to the current working directory, that’s where the git tree of the current R project is checked out, typically. The second is for the R package library, and we do that so we can cache the library. Hopefully the bind mounts are fast enough with qemu, I haven’t actually checked this. Emulating s390x on x86_64 is pretty slow, anyway, so hopefully this does not matter too much.

We run an empty loop on the container. We’ll run the actual R commands with docker exec later.

Use a custom shell to run `Rscript` in the container

The next thing we need is a custom Rscript shell that runs an R script inside the container, instead of running it on the runner machine. This is slightly tricky, because we want the custom shell to be a shell script, and this is not supported by GitHub Actions out of the box. See the previous blog post on how to do this. I’ll repeat the code needed here:

Set up a remote shellSee on GitHub

- name: Setup up remote shell
  run: |
    cp ${{ github.action_path }}/Rscript \
      /usr/local/bin/Rscript
    chmod 775 /usr/local/bin/Rscript
    echo "CTR_NAME=${{ inputs.ctr-name }}" >> "$GITHUB_ENV"
  shell: bash

We also set the CTR_NAME env var to the name of the container, so we can refer to it later.

We can start writing the actual Rscript wrapper:

Rscript wrapperSee on GitHub

#! /bin/bash
env_file=$(basename `mktemp`)
out_file=$(basename `mktemp`)
cch_dir=$(basename `mktemp`)

CTR=${CTR_NAME}

if [ "$CTR" == "" ]; then
  echo "Could not find container :("
  exit 2;
fi

docker exec ${CTR} touch "/tmp/${env_file}" "/tmp/${out_file}"

We’ll need some temporary files to copy data back from the container. We create the files for ${GITHUB_ENV} and ${GITHUB_OUTPUT} on the container, the so actions that will be “forwarded” to the container by our remote shell can set environment variables and output parameters.

Now we are ready to run the R script, passed in as the first argument $1:

Rscript wrapperSee on GitHub

docker exec -i -w /root -e"R_LIB_FOR_PAK=/usr/lib/R/library" \
  -e"GITHUB_ENV=/tmp/${env_file}" \
  -e"GITHUB_OUTPUT=/tmp/${out_file}" \
  -e"R_PKG_CACHE_DIR=/tmp/${cch_dir}" \
  -e"XDG_CACHE_HOME=/tmp/${cch_dir}" \
  -e"NOT_CRAN=${NOT_CRAN}" \
  -e"CI=${CI}" \
  -e"GITHUB_PAT=${GITHUB_PAT}" \
  -e"GITHUB_TOKEN=${GITHUB_TOKEN}" \
  ${CTR} < $1 R --no-save -q
status=$?

-w sets the working directory to /root where we bind mounted the current working directory.

-e passes in a bunch of environment variables. GITHUB_ENV and GITHUB_OUTPUT point to the temporary files we created above. The rest of them are typically needed for R packages. You might need to pass more GITHUB_* env vars or env vars for R CMD check.

❗ Env vars that you set with `env:` will not be available on the container, unless you explicitly pass them with `-e` here!

Finally, we save the exit status of docker exec, so the Rsctipt wrapper will be able to return the same status.

Now all we left to do is copying back the GITHUB_ENV and GITHUB_OUTPUT files.

Rscript wrapperSee on GitHub

docker cp ${CTR}:/tmp/${env_file} /tmp/${env_file} || true
docker cp ${CTR}:/tmp/${out_file} /tmp/${out_file} || true
touch /tmp/${env_file} /tmp/${out_file}
cat /tmp/${env_file} >> $GITHUB_ENV
cat /tmp/${out_file} >> $GITHUB_OUTPUT

if [ "$status" != "0" ]; then
  echo "::error ::Command failed in container."
fi

exit $status

To test the remote shell, you could use a step like this:

Test remote RscriptSee on GitHub

- name: Test R in container
  run: |
    getRversion()
    R.version[["platform"]]
  shell: Rscript {0}

R package binaries for s390x

Emulating s390x on x86_64 is difficult, and thus quite slow, so it is important that we pre-build some binary R packages, so they don’t have to be compiled and built for every workflow run. Building duckdb for s390x takes more than 24 hours on my laptop!

I won’t go into the details here, but the repository with s390x binarires is updated daily on GitHub Actions. The binary packages themselves are stored at GitHub releases at the https://github.com/cran CRAN mirror. E.g. the binaries for cli 3.6.3 are here.

Important to note that this kind of “distributed” repository only works with pak, so you won’t be able to use install.packages() to install these binaries. We pre-install pak on our container image and to also set up the repos option to point to the binary packages.

Install pak, set binary repoSee on GitHub

# Use R-hub repo
RUN echo "options(repos = c(RHUB = 'https://raw.githubusercontent.com/r-hub/repos/main/ubuntu-22.04-s390x/4.1', getOption('repos')))" \
      >> /usr/lib/R/library/base/R/Rprofile

# Install pak into system lib
RUN R -q -e 'install.packages("https://github.com/cran/pak/releases/download/0.7.2/pak_0.7.2_b1_R4.1_s390x-ibm-linux-gnu-ubuntu-22.04.tar.gz", repos = NULL, lib = .Library)'

Putting it all together

It is convenient to create an action that starts the container and sets up the remote shell. Here is an example.

An example workflow that uses this action could look like this:

R package check on s390xSee on GitHub

on:
  workflow_dispatch:

name: s390x.yaml

jobs:
  s390x:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-hub/actions/ctr-start@main
        with:
          image: ghcr.io/r-hub/containers/s390x
          platform: linux/s390x
          ctr-name: s390x

      - name: Test R in container
        run: |
          getRversion()
          R.version[["platform"]]
        shell: Rscript {0}

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          pak-version: none
          cache-version: s390x-1
          extra-packages: any::rcmdcheck
          needs: check

      - uses: r-lib/actions/check-r-package@v2
        with:
          build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")'
          upload-results: never
          upload-snapshots: false
        env:
          NOT_CRAN: true

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: ${{ format('{0}-{1}-results', runner.os, runner.arch) }}
          path: check

The first interesting part calls the action mentioned above, with the right parameters:

R package check on s390xSee on GitHub

- uses: r-hub/actions/ctr-start@main
  with:
    image: ghcr.io/r-hub/containers/s390x
    platform: linux/s390x
    ctr-name: s390x

Then we use the usual r-lib/actions actions, but we do need to set some extra parameters:

R package check on s390xSee on GitHub

- uses: r-lib/actions/setup-r-dependencies@v2
  with:
    pak-version: none
    cache-version: s390x-1
    extra-packages: any::rcmdcheck
    needs: check

We need to set pak-version: none here, so setup-r-dependencies does not try to install the pak package, which is already pre-installed on the container. We also set cache-version to have an s390x specific name for the cache.

R package check on s390xSee on GitHub

- uses: r-lib/actions/check-r-package@v2
  with:
    build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")'
    upload-results: never
    upload-snapshots: false
  env:
    NOT_CRAN: true

For check-r-package we need to set upload-results and upload-snapshots because uploading the artifacts doesn’t currently work with the remote shell. Instead, we upload artifacts manually:

R package check on s390xSee on GitHub

- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: ${{ format('{0}-{1}-results', runner.os, runner.arch) }}
    path: check

Is this a hack?

The dummy Rscript shell, and the “remote shell” are hacks. The rest is good, though.

Updates

2024-10-07: In the first version of this blog post I stated that the custom shell must be a binary program, but that is not true, I just didn’t manage to put the #! hash-bang at the first line of the script. I updated the post accordingly.

TL;DR

Create the container

Only x86_64 containers are welcome on GHA

Get the container and start it

Use a custom shell to run Rscript in the container

R package binaries for s390x

Putting it all together

Is this a hack?

Updates

Use a custom shell to run `Rscript` in the container