Posted 24-Feb-2020 in Software Development tagged DevOps, Version Control, Shell Script, TypeScript

Complete CI/CD pipeline with GitLab Runners

Automation is one of the best ways to improve productivity. Even as a development team of one, spending a bit of time on DevOps and improving your developer quality of life can pay off immensely.

Automated tasks strip away cognitive load. No more forgetting to deploy code because the process was manual and easy to forget. Take it a step further with automated linting and testing.

With platforms like GitLab aiming to make it dead simple to build out automated pipelines, it's never been easier. You don't even need any specialized software: shell scripts and a bit of Linux knowledge are more than enough to implement a complete continuous integration and deployment pipeline.

When I approach building out a new SaaS product, the stack of services tends to look like this:

Static [marketing] website (www) running on Jekyll.
Front-end application (app) with React (TypeScript).
Back-end API (api) using Express (also TypeScript).

Each of these "services" lives in a monorepo under the directories api, app and www. No dependencies are shared between them and each can be deployed independently of the rest if I feel so inclined.

At the very earliest stage, I host all of these "services" on a single machine, each with a slightly different build process. If there are database concerns, I will throw in a separate server for that, which makes it easier to split out the services down the road if/when that day comes.

So with these three different parts to the project, there are a few separate steps that need to take place to go from code in the repository to deployed application:

Dependencies need to be installed.
The code needs to be linted (app and api only).
The test suite needs to be run (same deal, app and api only).
The project needs to be built.
The built code needs to be copied to the server.
Services need to be bounced (pm2 in this case).
Caches need to be reset (usually just Cloudflare CDN).
Old builds should be removed.

Quite a few steps, especially if you're doing them manually, as some of those steps have to be done 2-3 times depending on the services they apply to.

The aforementioned steps can easily be grouped into 3 or so stages and, with GitLab running the jobs within a stage in parallel, can be broken out further to isolate each service that we're working with:

Build

Install dependencies and build the www.
Install dependencies and build the app.
Install dependencies and build the api.

Test

Lint and run tests for the app.
Lint and run tests for the api.

Deploy

Copy code to the server:
  * Copy the build for the www.
  * Copy the build for the app.
  * Copy the build for the api.
Link to the new build:
  * Link to the new build for the www.
  * Link to the new build for the app.
  * Link to the new build for the api.
Reload pm2 to start serving the new build of the api.
Purge the Cloudflare CDN cache.
Clean up builds older than 30 days.
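
To make that last cleanup step a bit more concrete, here's a minimal sketch of pruning old builds with find, run against a throwaway directory instead of a real server. The prune_releases function and the demo directory names are made up for illustration, and touch -d assumes GNU coreutils.

```shell
#!/usr/bin/env bash

# Delete release directories directly under $1 that were last modified
# more than $2 days ago. A `current` symlink is left alone because
# -type d does not match symlinks.
prune_releases() {
  local service_dir="$1" days="$2"
  find "$service_dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
}

# Demo: one stale release, one fresh release and a `current` link.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/old-release" "$demo_dir/new-release"
ln -s "$demo_dir/new-release" "$demo_dir/current"

# Backdate the stale release by 40 days (GNU touch syntax).
touch -d '40 days ago' "$demo_dir/old-release"

prune_releases "$demo_dir" 30
ls "$demo_dir"
```

Since the check is based on the modification time of each release directory, anything that touches an old release directory would reset the clock for it.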

You very well could run this all in a single stage, but I find that it's a ton easier to track down issues when you have things abstracted out, even if it's going to take a bit longer due to bringing up new containers for each step.

You may be wondering why I'm not bouncing a web server as part of this flow. In my experience, the web server (in my case, nginx) rarely needs to be updated, especially after the initial paths are configured and such. I also try to avoid doing any superuser actions in my automated deployments (in this case, using systemctl). Call it paranoia if you want, but I'd much rather that my automated deployments don't have elevated privileges, just in case of a breach.

Before getting into the meat and potatoes of my .gitlab-ci.yml file, I want to discuss a few of the assumptions that I have baked in here.

First, I use a directory structure like this for releases:

/home/username/releases
├── api
│   ├── 20200221194212-5ab009659b110c534bbc8a30abb9f418f4b150df
│   ├── 20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef
│   └── current -> /home/username/releases/api/20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef
├── app
│   ├── 20200221194212-5ab009659b110c534bbc8a30abb9f418f4b150df
│   ├── 20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef
│   └── current -> /home/username/releases/app/20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef
└── www
    ├── 20200221194212-5ab009659b110c534bbc8a30abb9f418f4b150df
    ├── 20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef
    └── current -> /home/username/releases/www/20200221201443-258e011dd4524e4ba25268593ab1051c6a7c95ef

This gives me a pretty solid structure for the inevitable day when one of the services needs to be moved to a new box / cluster. Nothing is shared between the services and that's a great thing.
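
As a rough sketch of how a layout like this comes together, the snippet below builds a release name (timestamp plus commit SHA) and repoints the current symlink for each service. It runs against a temporary directory, the function names release_name and activate_release are hypothetical, and mkdir stands in for the copy that would normally create the release directory.

```shell
#!/usr/bin/env bash

# Build a release name shaped like the ones in the tree above:
# a sortable timestamp followed by the commit SHA passed as $1.
release_name() {
  printf '%s-%s\n' "$(date +%Y%m%d%H%M%S)" "$1"
}

# Create a release directory for one service and repoint `current` at it.
# $1 = releases root, $2 = service name (api, app or www), $3 = release name
activate_release() {
  local root="$1" service="$2" release="$3"
  mkdir -p "$root/$service/$release"
  # -s makes a symbolic link, -f replaces an existing link, and -n stops
  # ln from following an existing `current` link into the old release.
  ln -nsf "$root/$service/$release" "$root/$service/current"
}

# Demo against a throwaway directory rather than a real server.
demo_root="$(mktemp -d)"
demo_release="$(release_name 5ab009659b110c534bbc8a30abb9f418f4b150df)"
for service in api app www; do
  activate_release "$demo_root" "$service" "$demo_release"
done

readlink "$demo_root/api/current"
```

Rolling back under this layout would just be a matter of pointing current at an older release directory that hasn't been cleaned up yet.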

The other set of assumptions has a lot to do with my development stack choices as a whole. I build things with Node.js and React leveraging TypeScript. I prefer static websites to WordPress installs. That's all just my thing. You can apply this same pipeline to your current choices, just tweak things where appropriate.

In terms of server software, I use pm2 for the api service and have nginx proxying back to it, while also serving the www and app builds directly. I don't want to get weighed down discussing every single configuration file in this post and would rather focus on the pipeline itself, so YMMV if you don't have your server configured yet.

In terms of configuration, staying true to The Twelve-Factor App, my .gitlab-ci.yml file doesn't contain any secrets or anything of that nature. For that stuff, I am using GitLab's CI/CD variables (masked when possible).

The variables needed to use this particular pipeline are as follows:

CLOUDFLARE_AUTH_EMAIL - Authentication email for Cloudflare.
CLOUDFLARE_AUTH_KEY - Authentication key / token for Cloudflare.
CLOUDFLARE_ZONE_ID - Zone ID for the site in Cloudflare.
SSH_HOSTNAME - Hostname / IP Address for the server.
SSH_PORT - Port number for the server.
SSH_PRIVATE_KEY - Private key for the deploy user on the server.
SSH_USERNAME - Username of the deploy user on the server.
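
If you want the pipeline to fail early when one of these is missing, a small guard along these lines could run at the top of a job. This is only a sketch: the missing_vars function is hypothetical and not part of my pipeline, and it simply treats unset and empty values the same way.

```shell
#!/usr/bin/env bash

# Variable names taken from the list above.
REQUIRED_VARS="CLOUDFLARE_AUTH_EMAIL CLOUDFLARE_AUTH_KEY CLOUDFLARE_ZONE_ID
SSH_HOSTNAME SSH_PORT SSH_PRIVATE_KEY SSH_USERNAME"

# Print the name of every required variable that is unset or empty.
# Returns 0 when all of them are present, 1 otherwise.
missing_vars() {
  local name value status=0
  for name in $REQUIRED_VARS; do
    # The names come from the fixed list above, so eval is safe here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

if report="$(missing_vars)"; then
  echo "all pipeline variables are set"
else
  echo "missing pipeline variables:" >&2
  echo "$report" >&2
fi
```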

In the past I've also carried around a variable with the known hosts to drop into the SSH configuration, but have since dropped that in favor of some additional configuration to skip the host key check. In my own experience, I would bork the known hosts data the first time or so, so it's always been a pain point for me to manage.
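
For comparison, here's a small sketch of both approaches, writing into a temporary directory instead of a real ~/.ssh. The function names are hypothetical, as is the SSH_KNOWN_HOSTS variable (I never named the one I used to carry around), and the host key shown is a placeholder, not a real key.

```shell
#!/usr/bin/env bash

# Current approach: skip strict host key checking for every host.
# $1 = SSH directory to write into (normally ~/.ssh)
write_permissive_ssh_config() {
  local ssh_dir="$1"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  printf 'Host *\n\tStrictHostKeyChecking no\n' > "$ssh_dir/config"
}

# Older approach: drop known hosts data from a variable into place.
# $1 = SSH directory, $2 = contents for the known_hosts file
write_known_hosts() {
  local ssh_dir="$1" known_hosts="$2"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  printf '%s\n' "$known_hosts" > "$ssh_dir/known_hosts"
  chmod 644 "$ssh_dir/known_hosts"
}

# Demo against a throwaway directory with placeholder data.
demo_dir="$(mktemp -d)"
SSH_KNOWN_HOSTS="example.invalid ssh-ed25519 PLACEHOLDER_KEY"

write_permissive_ssh_config "$demo_dir"
write_known_hosts "$demo_dir" "$SSH_KNOWN_HOSTS"

cat "$demo_dir/config"
```

Pinning the known hosts keeps the host key verification in place, while skipping the check trades that away for less to manage.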

For the sake of speed, and to keep each stage of the process isolated in its intent, I am leveraging artifacts to carry the build and node_modules between each step in the process.

To help manage configuration files, I keep them in the repository, but I have to be mindful to copy those files into the build directory to ensure they are deployed with the rest of the build.

Outside of the dependencies, the only other things that this pipeline needs to run are curl, openssh-client and rsync. Of course, if you don't need to purge anything from Cloudflare, you could even drop curl, since the purge is the only thing using it.

In the past I would use curl to talk to Slack to let me know about the status of the deployment, but I have since opted to use GitLab's Slack integration instead of maintaining any of that myself.

Also worth noting: the build and test stages are both configured to run on every branch (so their status shows up on merge requests), but the deploy stage only runs on the master branch.

So after quite a few tweaks, and many failed builds and deploys, here's the final product (at the time of this writing, at least ;):

image: node:13

stages:
  - build
  - test
  - deploy

variables:
  CI: "true"
  GIT_STRATEGY: clone

build_api:
  stage: build
  artifacts:
    paths:
      - ./api/build
      - ./api/node_modules
  script:
    - cd api
    - npm install
    - npm run build
    - cp {ecosystem.config.js,nginx.conf} build
    - cp -R node_modules build/

build_app:
  stage: build
  artifacts:
    paths:
      - ./app/build
      - ./app/node_modules
  script:
    - cd app
    - npm install
    - npm run build
    - cp nginx.conf build

build_www:
  image: jekyll/jekyll:4.0
  stage: build
  variables:
    JEKYLL_ENV: production
  artifacts:
    paths:
      - ./www/_site
  script:
    - cd www
    - bundle config set path '.bundle'
    - bundle install
    - bundle exec jekyll build
    - cp nginx.conf _site

test_api:
  stage: test
  services:
    - mongo
  script:
    - cd api
    - npm run lint
    - npm run test

test_app:
  stage: test
  script:
    - cd app
    - npm run lint
    - npm run test

deploy:
  stage: deploy
  only:
    - master
  before_script:
    - RELEASE="$(date +%Y%m%d%H%M%S)-$CI_COMMIT_SHA"
    - RELEASES_DIR="/home/$SSH_USERNAME/releases"
    - API_RELEASE_DIR="$RELEASES_DIR/api/$RELEASE"
    - API_CURRENT_DIR="$RELEASES_DIR/api/current"
    - APP_RELEASE_DIR="$RELEASES_DIR/app/$RELEASE"
    - APP_CURRENT_DIR="$RELEASES_DIR/app/current"
    - WWW_RELEASE_DIR="$RELEASES_DIR/www/$RELEASE"
    - WWW_CURRENT_DIR="$RELEASES_DIR/www/current"
    - apt-get update
    - apt-get install -y curl openssh-client rsync
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
  script:
    - >
      rsync -avz -e "ssh -p $SSH_PORT" ./api/build/
      "$SSH_USERNAME@$SSH_HOSTNAME:$API_RELEASE_DIR"
    - >
      rsync -avz -e "ssh -p $SSH_PORT" ./app/build/
      "$SSH_USERNAME@$SSH_HOSTNAME:$APP_RELEASE_DIR"
    - >
      rsync -avz -e "ssh -p $SSH_PORT" ./www/_site/
      "$SSH_USERNAME@$SSH_HOSTNAME:$WWW_RELEASE_DIR"
    - >
      ssh -A "$SSH_USERNAME"@"$SSH_HOSTNAME" -p "$SSH_PORT"
      ln -nsf "$API_RELEASE_DIR" "$API_CURRENT_DIR"
    - ssh -A "$SSH_USERNAME"@"$SSH_HOSTNAME" -p "$SSH_PORT" pm2 reload api
    - >
      ssh -A "$SSH_USERNAME"@"$SSH_HOSTNAME" -p "$SSH_PORT"
      ln -nsf "$APP_RELEASE_DIR" "$APP_CURRENT_DIR"
    - >
      ssh -A "$SSH_USERNAME"@"$SSH_HOSTNAME" -p "$SSH_PORT"
      ln -nsf "$WWW_RELEASE_DIR" "$WWW_CURRENT_DIR"
  after_script:
    - >
      curl -s -X DELETE
      "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/purge_cache"
      -H "X-Auth-Email: $CLOUDFLARE_AUTH_EMAIL"
      -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY"
      -H "Content-Type: application/json"
      --data '{"purge_everything":true}'
    - >
      ssh -A "$SSH_USERNAME"@"$SSH_HOSTNAME" -p "$SSH_PORT"
      find $RELEASES_DIR/{api,app,www} -mindepth 1 -maxdepth 1 -type d
      -mtime +30 -exec rm -rf {} +

Honestly, not much to it. If you wanted to deploy to multiple servers, you'd just need to tweak things a bit, either running through a list of IP addresses or breaking the deploy stage apart. Still want the after_script clean up to run after all of your deploys have finished? Simply move that to the .post stage!
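
Running through a list of addresses could look something like the sketch below. It's a dry run that only prints what it would do: DEPLOY_HOSTS is a hypothetical variable that isn't part of the pipeline above, the function names are made up, and the addresses come from the range reserved for documentation.

```shell
#!/usr/bin/env bash

# Run one deploy step against every host in a space-separated list.
# $1 = name of a function or command that takes a hostname
# $2 = space-separated list of hostnames or IP addresses
for_each_host() {
  local step="$1" hosts="$2" host
  # $hosts is left unquoted on purpose so it splits on whitespace.
  for host in $hosts; do
    "$step" "$host" || return 1 # stop at the first host that fails
  done
}

# Stand-in for the real rsync / ssh commands: it only prints the target.
dry_run_deploy() {
  echo "would deploy to $1"
}

DEPLOY_HOSTS="192.0.2.10 192.0.2.11"
for_each_host dry_run_deploy "$DEPLOY_HOSTS"
```

Stopping at the first failure is a design choice in this sketch. Depending on how your servers sit behind a load balancer, you may prefer to keep going and report the failures at the end.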

I'm sure at least somebody out there is saying to themselves "this is great, but how do you manually deploy stuff, Josh?". Simply put, I don't. I know there are edge cases that come up that require getting your hands dirty, but I don't ever want to make that a habit, so I simply avoid it entirely by not giving myself a way to deploy outside of the CI/CD pipeline.

Maybe that's not for everybody, and it definitely eliminates doing manual deploys to a staging environment, but that's just how I work. In most cases, when I do need to work in some sort of manual flow, I try to automate that as well, because I'd rather spend a bit more time up front than spend recurring time recalling stuff from my mental archives because the information doesn't get used too often.

Of course, this pipeline is pretty opinionated in terms of how code is structured and how it's shipped to production. That doesn't mean that I'm averse to comments. If anything seems off, or you have a better way to go about something I'm doing here, by all means, drop me a comment!