Demo site

Welcome

This website exists to showcase a set of applications built, tested and deployed entirely with open-source and free tools.

Most of the backend services are Python Flask applications using Jinja2 templates and markdown while the frontend uses the Material Design Lite library with jQuery.

All backend services are built into Docker images on Travis CI from source code on GitHub. The images are then pulled from Docker Hub onto Pine64 and Rock64 servers using webhooks where the containers are running behind an Nginx proxy server exposing secure HTTP/2 endpoints with SSL certificates from Let's Encrypt.

Monitoring is done via Google Analytics on the frontend and with Prometheus and Grafana on the backend. Logs are collected with Fluentd into Elasticsearch and visualized by Kibana. External monitoring is powered by Uptime Robot and the results are available on status.viktoradam.net.

The whole process including the deployments is automated, the only manual step being the git push command.

To see how it works go to the Specs tab or click the button below.
If you have any questions or would like to know more about it, get in touch on Twitter @rycus86 or send me an email.

If you are interested in these topics, check out my blog at blog.viktoradam.net which is powered by Ghost.

Acknowledgements

Backend

Frontend

Infrastructure

Overview

We can see on the high-level overview that there are three main types of interaction between the frontend and the backend:

  • HTTP GET requests from the browser to load the HTML generated by the demo-site app
  • HTTP POST requests with JSON payload from jQuery to render HTML fragments using another demo-site endpoint (sketched below)
  • HTTP GET to one of the three proxy apps to get the JSON required for the fragment above
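
As a rough illustration of the second interaction (not the actual demo-site code - the route and template names are made up), a Flask endpoint rendering such a fragment could look something like this:

from flask import Flask, render_template, request

app = Flask(__name__)

# hypothetical fragment endpoint: jQuery POSTs the JSON it fetched from one of
# the proxy apps and gets back an HTML fragment rendered with a Jinja2 template
@app.route('/render/github-repos', methods=['POST'])
def render_github_repos():
    payload = request.get_json()
    return render_template('fragments/github_repos.html', repos=payload)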

Each request will end up on the Nginx proxy server, which is the only application in the stack listening for requests from the external network. Its configuration is managed by nginx-pygen for the backend services and by certbot-pygen for the SSL certificates.

Build

To keep my dev laptop nicely organised and my tools portable in case I want to switch to another OS or machine, I tend to avoid installing anything on it apart from Docker and Git (though I think Git comes pre-installed on Debian).

To write Python code and related configurations (like Dockerfiles, Bash scripts, etc.) I use the excellent PyCharm CE from JetBrains - in a Docker container of course.
Check out the rycus86/docker-pycharm repository to see how to use it on (probably) anything with X11 on it.
For more details on how to run GUI applications with Docker, check out this link.

If you don't have or don't want PyCharm installed on your host machine, have a look at vim, which is a great editor with syntax coloring and tons of other useful features.

CI / CD

Testing the application, measuring its quality and building the production binary is what continuous integration means to me for this project.
Luckily there are free services available to do all of these for open-source projects.

By continuous deployment I mean that after a git push you don't have to do anything manual to get a successful build onto the target server - it should happen automatically.
Apart from hosting an actual server, this can be done for free as well - let's see how.

Prepare

Testing starts even before the code is submitted to remote services with a git commit. I use a simple git pre-commit hook to check that I don't commit something I know won't build successfully or would contain debug settings only meaningful for local development.

Setting it up is quite easy. Have an executable file (like a Shell script) in your project root (.pre-commit.sh for example) then run:
ln -s ../../.pre-commit.sh .git/hooks/pre-commit
You'll only have to do this once - though on each machine you clone the git project to. You can (and should) check your git hook scripts into version control, but unfortunately you cannot automatically register them with git locally.
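
The hook itself can be any executable. As a rough sketch (in Python rather than Shell, and with made-up checks), it could run the unit tests and look for leftover debug settings:

#!/usr/bin/env python
# illustrative pre-commit hook - the test layout matches the Testing section
# below, the DEBUG marker is a made-up example of a local-only setting
import subprocess
import sys

failed = subprocess.call('PYTHONPATH=src python -m unittest discover -s tests', shell=True)

# git grep exits with 0 if it finds the pattern in the tracked sources
has_debug = subprocess.call('git grep -q "DEBUG = True" -- src', shell=True) == 0

if failed or has_debug:
    print('Pre-commit checks failed, aborting the commit')
    sys.exit(1)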

This gives you a nice fail-fast way of doing some sanity-checks on your codebase but you'll probably want to repeat most of the steps in a CI service too - in case a change is submitted without having the git hooks in place or with git commit --no-verify.

Testing

Testing is important to verify that new features added to your application work as expected, but also to make sure that those new features don't break any existing ones.

For Python projects one of the easiest ways to have unit tests is with the official, pre-installed unittest module.

Running all your tests is easy:
PYTHONPATH=src python -m unittest discover -s tests -v
This assumes you're running the command from your project's root directory where the actual sources are in the src folder and the tests in the tests folder.
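
A minimal test case could look something like this - the module and function names (src/app.py with a render_index() helper) are only for illustration:

# tests/test_app.py
import unittest

from app import render_index  # importable thanks to PYTHONPATH=src


class RenderIndexTest(unittest.TestCase):

    def test_index_contains_title(self):
        html = render_index()
        self.assertIn('<title>', html)


if __name__ == '__main__':
    unittest.main()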

Measuring quality

It sounds like a good idea to know if your code is well covered by tests, right?

In Python this is very easy with the coverage.py module. To install, simply run:
pip install coverage

To measure test coverage while running the same tests above run:

PYTHONPATH=src python                     \
    -m coverage run --branch --source=src \
    -m unittest discover -s tests -v

This will execute the same tests as above but recording code coverage metrics too this time. By default, these will be saved in a .coverage file - make sure you have that in your .gitignore.

To print the results after collecting the metrics you can run:
python -m coverage report -m

The output looks something like this:

$ python -m coverage report -m
Name         Stmts   Miss Branch BrPart  Cover   Missing
--------------------------------------------------------
src/app.py      47      2     10      1    95%   82, 86, 85->86

Not very high-tech or 2017-flashy in terms of displaying the results but you have your percentage of covered lines there plus the line numbers of the missing ones.
We can do better though.

Coveralls

For code coverage I use Coveralls.

Coveralls

This platform provides a nice timeline of overall code coverage as well as line-by-line markup of each source file clearly displaying which lines aren't covered by tests.
As an added bonus, you can also get neat coverage badges on your README files like this: Coverage Status

As I mentioned, this service is free to use for open-source projects and you can sign up with your GitHub account. Assuming you use Travis (more on Travis later) sending your code coverage metrics is as easy as:

$ pip install coveralls
... run coverage ...
$ coveralls

If you want, you can also set up automatic email, Slack, etc. notifications when your code coverage drops (or increases) so you would know about it automatically without actively checking.

Code Climate

Code Climate does code coverage too but it also provides static code analysis to reveal duplicated code or code style issues and much more.

Code Climate

Your source files get graded (from A to F) depending on how many issues they contain and how severe those are. The issues can be viewed inside the platform or you can install a Chrome extension and see the code analysis and coverage metrics directly in GitHub, nice.
This service also provides all sorts of notification hooks and other integrations.

Set-up is a bit more fiddly:

$ export CC_TEST_REPORTER_ID=<your token>
$ curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
$ chmod +x ./cc-test-reporter
... run coverage ...
$ ./cc-test-reporter after-build --exit-code $?

The first three lines set the Code Climate test reporter ID, download the reporting tool and make it executable. The last line takes care of sending the actual metrics along with the build's exit code.

Note for Python: you'll need to produce XML coverage reports with the coverage.py module (python -m coverage xml, as in the Travis example below).

If you got all of the above right then here are your well-deserved badges:
Code Climate Test Coverage Issue Count

Travis CI

All right, this all sounds good so far but how do I do this automatically in a reliable and repeatable way every time I push changes to my project?
This is where Travis comes in.

Travis CI

Travis CI is a continuous integration service that allows you to run complex build (and deployment) plans for your project.
It is also free for open-source projects and you can sign up with your GitHub account.

If you're familiar with Jenkins or Bamboo then Travis is not dissimilar to them but it is hosted in the cloud - and did I mention it's free?

It can automatically start new builds whenever there is a new commit in your GitHub project and run them with default settings based on the auto-detected language, but for most use-cases you'll want to have your own build process described in a .travis.yml file.
You can see my .travis.yml file here as an example.

The main elements in the YAML file are:

  • language: the main programming language of your project
  • install: installation instructions to run in the build environment
  • before_script: build preparation
  • script: the actual build instructions
  • after_success: any final steps after a successful build
  • env: environment variables to pass to the build

Taking a Python application, let's see an example that puts together all the steps above:

language: python
python:
  - '2.7'
install:
  - pip install -r requirements.txt
  - pip install coveralls
before_script:
  - curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
  - chmod +x ./cc-test-reporter
script:
  # python tests
  - PYTHONPATH=src python -m coverage run --branch --source=src -m unittest discover -s tests -v
after_success:
  # coverage reports
  - coveralls
  - python -m coverage report -m
  - python -m coverage xml
  - ./cc-test-reporter after-build --exit-code $TRAVIS_TEST_RESULT
env:
  - CC_TEST_REPORTER_ID=1234abcd

This build plan will:

  • Initialize the build environment for Python version 2.7 and set an environment variable for Code Climate reporting.
  • The install step will fetch the libraries required by the application plus the Coveralls reporting tool.
  • In the before_script section we download and prepare the Code Climate reporter.
  • The script step is the actual command (can be more than one) that makes our build pass or fail.
  • Given the build was successful, the after_success section will prepare and print the coverage report and send it to Coveralls and Code Climate.

Easy.

Travis supports quite a few languages and can integrate with lots of other services.
You can get it to publish your built binaries or documentation somewhere like PyPI, deploy your application to AWS, Heroku, Google App Engine or others and you can also build Docker images and upload them to Docker Hub.

You can also get a command line client application for Travis that lets you do simple tasks without having to open up the web UI. This includes getting build statuses, restarting builds or encrypting files and variables, for example. If you're like me and don't want to install this tool and its dependencies but still want to use it, have a look at my docker-travis-cli project: it lets you do exactly that by installing only a Bash script that executes the Travis commands using a Docker image.

Another great feature is matrix builds, which allow you to run a build plan multiple times for the same git commit but with different settings. For example, you might want to build your Python project on versions 2.7 and 3.5, or you might want to build multiple Docker images with different tags and/or different Dockerfiles.
Travis allows you to do this in an easy and intuitive way. Certain build sections (like python: above) treat multiple values as input for a matrix build, for example:

language: python
python:
  - '2.7'
  - '3.5'

This will cause your build plan to spin up 2 workers on changes, one for each Python version. A more generic approach is to define matrix environment variables:

...
env:
  matrix:
  - DOCKER_TAG=latest  DOCKERFILE=Dockerfile
  - DOCKER_TAG=alpine  DOCKERFILE=Dockerfile.alpine

This for example could be input for a Docker build to create a standard and a small version of the image but otherwise using the same build plan.

You can use both approaches in the same build plan and the combination of all matrix variables will be used to start builds - this might end up being a large number for lots of matrix variables.

Travis will send you emails about broken and fixed builds by default but you can get notifications on a wide range of other channels too.

And of course, we can't forget about our beloved badge either:
Build Status

Docker

Docker is (broadly speaking) an automation tool that was created to build reusable images and start lightweight containers from them. Since its start, Docker has become able to do much more than that, including but not limited to virtual network management and multi-host orchestration with load-balancing and automatic fail-over.
Make sure you're familiar with Docker for this part by reading their documentation.

The idea behind it is that if you can build a Docker image and run it on one machine, that same image should work on any machine that has Docker.
That sounds nice, right?

To make builds repeatable Docker uses Dockerfiles - simple text files with a set of pre-defined instructions to create an image.

Let's look at an example:

FROM alpine

RUN apk add --no-cache python py2-pip ca-certificates

ADD requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

RUN adduser -S webapp
USER webapp

ADD src /app
WORKDIR /app

CMD [ "python", "app.py" ]

The container running this site starts FROM a base image called alpine, the super-small Alpine Linux distribution that is ideal for building (and transferring) images based on it.
The RUN instruction executes Shell commands, just like you would if you were setting things up on an actual server the old-fashioned way.
USER changes the Unix user the container will run the main process with. This is a good idea in general to reduce the attack surface your web application (and its container) has.
ADD (or COPY) copies files or directories from your host machine into the result image to make it available when running it as a container.
Finally, CMD defines what command should be executed when starting a container from this image unless an override is specified on the docker run command.

To see a complete reference of Dockerfile instructions check this link. While you're there, make sure to have a look at the best practices for writing Dockerfiles.

To create an image from a Dockerfile you can execute docker build -t <owner>/<image-name>:<tag> -f Dockerfile .
The -f bit can be left out if you're using the default filename. The -t parameter basically assigns a name to the resulting image that you can refer to when interacting with it - to run a container or update it to a newer version for example.
Each instruction in the Dockerfile will add a new layer on top of the ones already in there and the docker build command can reuse previously processed layers where the instruction didn't change in the Dockerfile. This basically caches steps that were successful so you won't have to wait to install 79 packages with apt-get for example every time you build it locally.

Once your image is ready you'll probably want to distribute it (or at least copy it to another host for actually running containers of it).
Docker uses registries for this: they efficiently store images and let you push new or pull existing content. Registries also handle layers and their caching as expected, so on updates there's a good chance some layers can be reused without having to download them again.

Docker Hub

Docker Hub is the official hosted Docker registry. It is free to sign up and start pushing public images to it, so Docker is basically giving you space to store them for free - awesome!

The Docker client is able to interface with Docker Hub quite easily:

$ docker login
$ docker push my-user/my-awesome-image:latest
...
$ docker pull my-user/my-awesome-image

The docker login command will authenticate you against Docker Hub so you can push your images to the cloud later. Once there, the pull command fetches the latest version of the image which works without authentication too for public images.

If that is too 1998 for you then you can also use a nice feature of Docker Hub called automated builds. It can connect to your GitHub or BitBucket account and build your Docker project automatically on Docker's infrastructure whenever there are new changes. You can easily do the same with Travis, but the nice thing here is that images built this way are marked as automated builds on Docker Hub, which also displays the Dockerfile they were built from, so others can be sure that when they pull the image it actually contains and does what you say it does.

Multiple versions (e.g. tags) are also supported. If you have different branches in GitHub, then by default Docker Hub will take the branch name and use it as the Docker tag when building the Dockerfile in that branch. See the rycus86/pycharm image for example and the related repository.

If you're not sick of badges by this point then good news, here are two more from Shields.io:
Build Status Pulls

Multiarch builds

This is all well and good, and it works as long as you only want to build images for the most common x86 (or x64) architecture - Docker Hub has no problems with that.
But what if you want the same app available for armv7 or aarch64 as well? - I'll explain a bit later why you would want to do that.

This is where we turn back to our awesome Travis matrix builds.

Previously we used the Python unit tests as a build success indicator, but arguably a successful build should also mean that you can produce a deployable binary version of the project. Even having automated builds on Docker Hub shouldn't stop us from running Docker builds on Travis too if we want to.

To enable Docker builds on Travis the .travis.yml file has to say:

sudo: required
services:
  - docker

After this we can use any Docker commands we want in the build.
For testing the build we'll want to run a docker build command.

Going back to multiarch: if we can get Travis to build a Docker image for us then we can get it to build N images for N different Dockerfiles in the same project.
The project running this site for example has three of them:

  • Dockerfile: for x86 hosts
  • Dockerfile.armhf: for 32-bit ARM hosts like the Raspberry Pi
  • Dockerfile.aarch64: for 64-bit ARM hosts like the Pine64

They are all exactly the same except for the initial FROM instruction, which selects an Alpine base image appropriate for the target architecture.

We could use a single build to create the images one-by-one sequentially but we can do better. Take this build configuration for example:

language: python
python:
  - '2.7'
sudo:
  - required
services:
  - docker
script:
  # python tests
  - PYTHONPATH=src python -m coverage run --branch --source=src -m unittest discover -s tests -v
  # docker build
  - docker run --rm --privileged multiarch/qemu-user-static:register --reset
  - docker build -t demo-site:$DOCKER_TAG -f $DOCKERFILE .
after_success:
  # push docker image
  - >
    if [ "$DOCKER_PUSH" == "yes" ] && [ "$TRAVIS_BRANCH" == "master" ]; then
      docker login -u="rycus86" -p="$DOCKER_PASSWORD"
      docker tag demo-site:$DOCKER_TAG rycus86/demo-site:$DOCKER_TAG
      docker push rycus86/demo-site:$DOCKER_TAG
    else
      echo 'Not pushing to Docker Hub'
    fi
env:
  matrix:
  - DOCKER_TAG=latest  DOCKERFILE=Dockerfile
  - DOCKER_TAG=armhf   DOCKERFILE=Dockerfile.armhf   DOCKER_PUSH=yes
  - DOCKER_TAG=aarch64 DOCKERFILE=Dockerfile.aarch64 DOCKER_PUSH=yes

What happens here is:

  • The script section now contains a docker build instruction (if it fails the build will fail too)
  • In after_success we log in with our Docker Hub credentials (which are nicely hidden by Travis), then we tag the image to be pushed under our Docker Hub user and finally push it - if we want to and we just built from the master branch
  • The env / matrix section injects the actual values for the environment variables for each sub-build.

I have to add a couple of footnotes at this point.

Pushing the built image can be disabled by not setting the DOCKER_PUSH environment variable to yes. You probably want to do this for the version that gets built by Docker Hub automatically to avoid overwriting that version.

Having said that, even for automated builds it seems that Docker Hub will allow you to push images with different tags under the same space without any problems - in fact this is how I have my ARM builds together with the automated one.

Finally, I have not explained this line so far:
docker run --rm --privileged multiarch/qemu-user-static:register --reset
This is what makes the multiarch builds possible. The multiarch/qemu-user-static project makes it possible to register a static QEMU binary with the host kernel so when it receives instructions for another processor architecture it will use that to execute the commands. For example on the x86 Travis hosts we can run executables compiled for ARM architecture after this line.

For the ARM Docker builds to pass we also need to make sure that the instructions (i.e. commands) run by the build have access to the static QEMU binary. To do so we need to use base images that have it at /usr/bin/qemu-arm-static as an executable file.
It turns out this is very simple to do and we can even do it with a Docker Hub automated build, see these projects for example:

To give credit, this is based on the great work the Hypriot team documented, even though it looks like people had come up with this as far back as late 2015.

OK, great, this is working nicely again but the question still stands:
"why would we even want to do this to ourselves?"

Hosting

Once you have some applications in any presentable shape or form, you might want to make them available on the Internet so people can find and use them. A cost-effective way of doing so is by hosting a server (or a few) yourself.

I had a couple of Raspberry Pi devices lying around at home, and I initially ran this stack on one of them. Once I had my multiarch images ready I could easily pull them from Docker Hub onto it and start them with the Docker daemon.
If you're interested in Raspberry Pi and/or Docker, have a look at this great blog from Alex Ellis, where I got some of the ideas for this site.

The Raspberry Pi 3 I already had in use for other things has only an 8 GB SD card in it, and with all the Docker pulls I started to run out of space quite quickly. I didn't want to redo its whole setup, so I started looking for another device.

I finally found my Pine64 board hiding in a drawer, not doing anything, so I decided to use it for hosting services such as this site. It has a 64-bit ARM processor similar to the Raspberry Pi 3's, but unlike the Pi you can find Linux distributions for it that support the 64-bit aarch64 architecture - as of this writing, Raspbian only supports 32-bit execution.

Porting my existing Docker images was easy once I had set up the arm64v8 base images. I enabled a new configuration in the Travis matrix builds and the rest kept working as before for x86 and armv7.

Getting Docker on Pine64 wasn't as easy as it is on other, more common architectures but Alex Ellis has a great post on how to get it up and running.

After running and updating a few containers manually I realised that this was a lot of work and thought I could probably do better.

Docker-Compose

Instead of relying on your Bash history and worrying that you'll completely forget how you usually run container #97, how nice would it be to define how to run a number of them with individual settings and have them join the virtual networks they need to be in?

docker-compose does that for you and more. All it needs is a simple YAML file to describe your application stack.

To see what docker-compose can do check the Composefile reference - it is quite powerful. It even allows you to scale your containers and have them running multiple times if you want to.

Let's have a look at this site's Composefile for another example:

version: '2'
services:

  demo-site:
    image: rycus86/demo-site:aarch64
    read_only: true
    expose:
      - "5000"
    restart: always
    environment:
      - HTTP_HOST=0.0.0.0

  github-proxy:
    image: rycus86/github-proxy:aarch64
    read_only: true
    expose:
      - "5000"
    restart: always
    environment:
      - HTTP_HOST=0.0.0.0
    env_file:
      - github-secrets.env

  dockerhub-proxy:
    image: rycus86/dockerhub-proxy:aarch64
    read_only: true
    expose:
      - "5000"
    restart: always
    environment:
      - HTTP_HOST=0.0.0.0
    env_file:
      - dockerhub-secrets.env

The Composefile defines three read-only applications that work together. Each application exposes port 5000 to listen for incoming requests, and they are configured so that Docker restarts them automatically should they fail for whatever reason. The HTTP_HOST environment variable is set to 0.0.0.0 and is read by the Flask apps to make them listen on all network interfaces - not only on localhost like they would by default. Unlike Docker Swarm, docker-compose cannot manage secrets, so those are passed to the containers as environment variables from key-value text files. Note that the containers are all based on aarch64 images built automatically by Travis and uploaded to Docker Hub.
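
For reference, honouring the HTTP_HOST variable in a Flask app only takes a couple of lines - a minimal sketch, not the exact code of these services:

import os

from flask import Flask

app = Flask(__name__)

if __name__ == '__main__':
    # HTTP_HOST=0.0.0.0 makes the app listen on all interfaces inside the container
    app.run(host=os.environ.get('HTTP_HOST', '127.0.0.1'), port=5000)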

Given that each of the apps listens on port 5000, we need a way for requests coming from the external network to find the correct target.

Proxy server

When I was looking for a solution to route requests to my apps I came across the brilliant jwilder/nginx-proxy project on GitHub. It is a self-contained Docker image that will run an Nginx proxy server plus a lightweight process (called docker-gen) that listens for container start and stop events from the Docker daemon and automatically reconfigures Nginx - plain awesome!

It requires attaching the Docker socket file as a volume to the container so it can connect to the daemon. It also requires the target containers to expose their target ports and to have the VIRTUAL_HOST environment variable set to the domain name we want Nginx to proxy requests for. As containers start or stop (or scale), Nginx is automatically reloaded with the updated auto-generated configuration, load-balancing between multiple instances of the same application if it runs as more than one container.

This is great because the only thing you have to worry about is starting properly configured containers and docker-gen will do the rest for you. If you want to add a new application to your stack you just start it with the VIRTUAL_HOST environment variable set to the new domain name and you're done.

Having a publicly available endpoint that also has root access to your Docker daemon is not the most secure thing ever though, but the GitHub project suggests a nice alternative. You can run Nginx as the only container with an externally reachable port, alongside a separate docker-gen container that has read-only access to the daemon and shares a volume with Nginx so it can update the configuration file. Whenever it does that, it sends a UNIX signal to the Nginx container that causes the proxy server to reload its configuration.

Let's look at the Composefile for this site again:

version: '2'
services:

  # Proxy server
  nginx:
    image: rycus86/arm-nginx:aarch64
    container_name: nginx
    ports:
      - "80:80"
    volumes:
      - nginx-config:/etc/nginx/conf.d

  nginx-gen:
    image: rycus86/arm-docker-gen:aarch64
    container_name: nginx-gen
    command: -notify-sighup nginx -watch -only-exposed /etc/docker-gen/templates/nginx.tmpl /etc/nginx/conf.d/default.conf
    volumes:
      - nginx-config:/etc/nginx/conf.d
      - /tmp/nginx-proxy.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro
      - /var/run/docker.sock:/tmp/docker.sock:ro

  # Demo site
  demo-site:
    image: rycus86/demo-site:aarch64
    expose:
      - "5000"
    environment:
      - VIRTUAL_HOST=demo.viktoradam.net

  # REST services
  github-proxy:
    image: rycus86/github-proxy:aarch64
    expose:
      - "5000"
    environment:
      - VIRTUAL_HOST=github.api.viktoradam.net

  dockerhub-proxy:
    image: rycus86/dockerhub-proxy:aarch64
    expose:
      - "5000"
    environment:
      - VIRTUAL_HOST=docker.api.viktoradam.net

volumes:
  nginx-config:

To piece it together:

  • nginx listens on port 80 from the outside world and passes connections to port 80 of the running Nginx process.
  • nginx-gen is configured to share the configuration volume with it and knows the name of the container to send the reload signal to. It also has read-only access to the Docker daemon - no other container does.
  • The Flask applications don't have externally bound ports, they only expose their port 5000 so Nginx can proxy requests to them on the Docker virtual network.
    They also each have their corresponding VIRTUAL_HOST variable set.
  • The volumes section declares the volume shared by both nginx and nginx-gen.

Again, adding a new application to this stack is a matter of adding its configuration to the Composefile and executing a docker-compose up -d command.

What if you want more flexibility on how the configuration file is templated and/or you're not a huge fan of Go templates? Enter Docker-PyGen.

Docker-PyGen

I found docker-gen amazing and it mostly did what I wanted but I'm not familiar with Go or its templates so it would have been difficult to get it to do exactly what I wanted.

For this reason I started working on Docker-PyGen. It is the same concept but implemented in Python and it uses Jinja2 templates to generate content for configuration files - the same language that Flask uses for rendering content and is behind this page and site too.

Having this tool in Python allows me to write more expressive templates using models wrapping the runtime information of Docker containers. For example:

http://{{ container.networks.first_value.ip_address }}:{{ container.ports.tcp.first_value }}/
          ^^        ^^       ^^          ^^
          model     list     property    str
          (dict)             (dict)

- or -

https://{{ containers.labels.virtual_host }}:{{ container.env.http_port }}/{{ container.labels.context_path }}
                      ^^     ^^                           ^^       ^^
                      (dict) key                          (dict)   key
                             labels['virtual_host']                env['HTTP_PORT']

Because the docker-py models from the official Python client are wrapped, I could add these methods for convenience when using them in templates. You can also use matching on lists of containers or networks to filter them - to loop through the containers attached to the same network in a Compose project, for example.
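
For comparison, the raw docker SDK for Python exposes this information roughly as below - a sketch of the underlying client, not of Docker-PyGen's own API:

import docker

# connects using the mounted /var/run/docker.sock (or DOCKER_HOST)
client = docker.from_env()

for container in client.containers.list():
    labels = container.labels                                   # plain dict
    env = container.attrs['Config']['Env']                      # list of 'KEY=value' strings
    networks = container.attrs['NetworkSettings']['Networks']   # dict keyed by network name
    print(container.name, labels.get('nginx-virtual-host'))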

Like docker-gen, this tool can also signal or restart other containers when the target configuration file is updated. I wanted to remove the need to run the target container as a single instance with a pre-defined name, so Docker-PyGen also supports targeting containers by Compose service name or by specific labels / environment variables.

Dynamic DNS

If you're looking to set up your own server (like I did), you might run into some hosting issues if your ISP doesn't give you a static IP address. There are still good options though, like Namecheap, which supports Dynamic DNS - meaning you can send them your IP address periodically and they will point the domain name at the address you sent.

To help you do that, you can use ddclient, which supports quite a few DNS providers.
It has a simple text-based configuration that looks something like this:

use=web, web=dynamicdns.park-your-domain.com/getip
protocol=namecheap 
server=dynamicdns.park-your-domain.com 
login=sample.com
password=your dynamic dns password
test, demo, www

This configuration could update the IP addresses for:

  • test.sample.com
  • demo.sample.com
  • www.sample.com

To make it easier you can also have ddclient in a container like this and configure it in a Composefile perhaps alongside your applications.
It would look like this:

version: '2'
services:
  # Proxy server
  ...
  # Applications
  ...
  # DynDNS client
  ddclient:
    image: rycus86/ddclient:aarch64
    command: --daemon=300 --ssl --debug
    restart: always
    volumes: 
      - /etc/ddclient.conf:/etc/ddclient.conf:ro

This would run ddclient and update your configured domains every 5 minutes.

SSL / HTTPS

To make the website and the API endpoints more secure, we should really expose them on HTTPS rather than HTTP. Thankfully this can now be done for free too!

The LetsEncrypt organization is dedicated to making the web safer and more secure, so they provide services that allow registering SSL certificates for free, in an automated fashion. To get started, you only need an ACME protocol compliant client (like the official Certbot client) and control over the domain you want to register. If all works out, you'll get an SSL certificate that is trusted by most modern browsers and HTTPS client libraries.

To register (or renew) a certificate for a domain, LetsEncrypt needs to verify that you actually own it. At the moment, this is done by getting their services to issue requests to the target domain where some application needs to respond to their challenge.

If you don't have anything running on the target yet, the easiest way is to let certbot start a simple Python webserver to respond to these challenges on the standard HTTP or HTTPS ports.
This can be done easily like this:

certbot certonly -n -d my.domain.com --keep --standalone \
        --email admin@my.domain.com \
        --agree-tos

You might already have a webserver in place though, in which case this method of verification would not work. On this demo site I have Nginx listening on both of those ports and I can easily get it to accept the challenges. All it takes is to define a root folder for static content on the target domain and save the challenge response in a file there before the request arrives. You can achieve this manually with a couple of Shell commands, but why not automate it?

Using my Docker-PyGen tool and a container that has certbot plus a couple of scripts, I can generate a list of Shell commands instructing certbot to request or renew certificates for domains defined as labels on the running containers.
The template for this is really quite simple:

{% for ssl_domain, matching in containers|groupby('labels.nginx-virtual-host') if ssl_domain %}
    certbot certonly -n -d {{ ssl_domain }} --keep --manual \
        --manual-auth-hook /usr/bin/certbot-authenticator \
        --manual-public-ip-logging-ok \
        --email {{ containers.matching('certbot-helper').labels.letsencrypt_email }} \
        --agree-tos
{% endfor %}

Once the list of commands is generated, I can send a signal to the container that has certbot to take them one-by-one and execute them. In the template, the --manual-auth-hook parameter refers to a Shell script that will save the challenge file with the appropriate content on a shared volume that maps onto the static file folder on Nginx. This is run before the actual challenge starts and can be cleaned up using the --manual-cleanup-hook parameter once it's finished.
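
As an illustration, the authenticator hook itself can be tiny. A sketch in Python could look like this - certbot passes the challenge details in environment variables, and the challenge directory is the volume shared with Nginx:

#!/usr/bin/env python
# rough sketch of a --manual-auth-hook script
import os

CHALLENGE_DIR = '/var/www/challenge/.well-known/acme-challenge'

token = os.environ['CERTBOT_TOKEN']            # the file name Let's Encrypt will request
validation = os.environ['CERTBOT_VALIDATION']  # the content it expects to find in it

with open(os.path.join(CHALLENGE_DIR, token), 'w') as challenge_file:
    challenge_file.write(validation)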

certbot saves the certificates in the /etc/letsencrypt/live/<domain> folder by default so we just need to make sure this also resides on a shared volume to make it accessible for Nginx.

Putting it all together

The relevant parts of the docker-compose.yml look like this:

version: '2'
services:

  # Proxy server
  nginx:
    image: nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - nginx-config:/etc/nginx/conf.d
      - ssl-certs:/etc/letsencrypt:ro
      - letsencrypt-challenge:/var/www/challenge/.well-known/acme-challenge:ro

  nginx-pygen:
    image: rycus86/docker-pygen
    command: --template /etc/docker-pygen/templates/nginx.tmpl --target /etc/nginx/conf.d/default.conf --signal nginx HUP
    volumes:
      - nginx-config:/etc/nginx/conf.d
      - ./nginx-pygen.tmpl:/etc/docker-pygen/templates/nginx.tmpl:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro

  # SSL certificates

  certbot-helper:
    image: ...
    labels:
      - letsencrypt_email=<your email address>
    volumes:
      - letsencrypt-config:/etc/certbot-helper:ro
      - letsencrypt-challenge:/var/www/challenge
      - ssl-certs:/etc/letsencrypt

  certbot-pygen:
    image: rycus86/docker-pygen
    command: --template /etc/docker-pygen/templates/certbot.tmpl --target /etc/certbot-helper/updates.list --signal certbot-helper HUP
    restart: always
    volumes:
      - letsencrypt-config:/etc/certbot-helper
      - ./certbot-pygen.tmpl:/etc/docker-pygen/templates/certbot.tmpl:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro

volumes:
  nginx-config:
  ssl-certs:
  letsencrypt-config:
  letsencrypt-challenge:

The changes in the Nginx Docker-PyGen template are:

server {
    ...
    listen 80;
    listen 443 ssl;
    server_name {{ virtual_host }};
    ssl_certificate /etc/letsencrypt/live/{{ virtual_host }}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/{{ virtual_host }}/privkey.pem;
    ...

    location '/.well-known/acme-challenge' {
        root /var/www/challenge;
    }
}

The template for certbot-helper is as shown earlier. Once the target file with the list of certbot commands is updated, the SIGHUP signal will execute those commands and we'll have our fresh SSL certificates in place - yay!

HTTP/2

Once we have SSL enabled on our endpoints, there's really nothing stopping us from enabling HTTP/2 as well. The newer version of the protocol uses a duplex connection to the server which allows multiple simultaneous requests and responses to be processed on the same domain.

This is great for serving many small static files like JavaScript and CSS, but also if the frontend needs to issue tons of AJAX calls to backend services that are all on the same domain - like this site.

As it turns out, enabling HTTP/2 on Nginx is as easy as changing:

server {
  listen 443 ssl;
  ...
}

to:

server {
  listen 443 ssl http2;
  ...
}

Can you spot the difference?

Having mentioned small static files (but also any kind of response - from REST endpoints for example), it's generally a good idea to enable gzip compression on all the responses where we can. In my case that means everything, and on Nginx you can do that in the main configuration file (at /etc/nginx/nginx.conf by default) like this:

http {
  ...
  gzip on;
  ...
}

Let's verify that everything is working as expected:

$ curl --compressed -s -v https://demo.viktoradam.net/page/specs > /dev/null 
*   Trying 94.192.68.109...
* TCP_NODELAY set
* Connected to demo.viktoradam.net (94.192.68.109) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [100 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [2480 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=demo.viktoradam.net
*  start date: Jul 16 23:15:00 2017 GMT
*  expire date: Oct 14 23:15:00 2017 GMT
*  subjectAltName: host "demo.viktoradam.net" matched cert's "demo.viktoradam.net"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0x55baba733da0)
} [5 bytes data]
> GET /page/specs HTTP/1.1
> Host: demo.viktoradam.net
> User-Agent: curl/7.52.1
> Accept: */*
> Accept-Encoding: deflate, gzip
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
< HTTP/2 200 
< server: nginx
< date: Tue, 18 Jul 2017 21:33:31 GMT
< content-type: text/html; charset=utf-8
< vary: Accept-Encoding
< content-encoding: gzip
< 
{ [15 bytes data]
* Curl_http_done: called premature == 0
* Connection #0 to host demo.viktoradam.net left intact

Let's have a look at the HTTP response and headers.

The server's certificate was verified successfully:

* Server certificate:
*  subject: CN=demo.viktoradam.net
*  start date: Jul 16 23:15:00 2017 GMT
*  expire date: Oct 14 23:15:00 2017 GMT
*  subjectAltName: host "demo.viktoradam.net" matched cert's "demo.viktoradam.net"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.

HTTP/2 is working as expected:

* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)

And finally, the response was gzip compressed:

< HTTP/2 200 
< server: nginx
< date: Tue, 18 Jul 2017 21:33:31 GMT
< content-type: text/html; charset=utf-8
< vary: Accept-Encoding
< content-encoding: gzip

Google Analytics

To know about the site's visitors and their browsing experience I use Google Analytics tracking.

The basic setup is really easy. After registering an account and your website (for free) you get an identifier like UA-123456789-1. Once you have that you can add a small JavaScript code to the end of your pages like this:

<!-- Google Analytics -->
<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-123456789-1');
  ga('send', 'pageview');
</script>

This will send a pageview tracking event with the page's title and URL every time someone browses it. This lets you know which landing pages perform best, or which other pages visitors go on to check, if any.

To learn more about your website visitors' browsing experience, you can add some more JavaScript code to get timings of certain events.

if (window.performance) {
    var timeSincePageLoad = Math.round(performance.now());
    ga('send', 'timing', 'JS Dependencies', 'javascript', timeSincePageLoad);

    var existingWindowOnLoad = window.onload;
    window.onload = function() {
        if (!!existingWindowOnLoad) {
            existingWindowOnLoad();
        }

        setTimeout(function() {
            var timing = window.performance.timing;
            var page_load_time = timing.loadEventEnd - timing.navigationStart;
            ga('send', 'timing', 'Page load', 'load', page_load_time);

            var connect_time = timing.responseEnd - timing.requestStart;
            ga('send', 'timing', 'Request time', 'request', connect_time);
        }, 0);
    };
}

If the browser supports the Performance API (most of them do), you can get timing metrics on how long it took to load the JavaScript files for the page (JS Dependencies), or how much time the page load (Page load) or the initial request (Request time) took. By default Google Analytics tracks this for 1% of the visitors, but you can tweak it by changing the initialization code like this:

ga('create', 'UA-123456789-1', {'siteSpeedSampleRate': 100});

This will track browser timings for all your visitors. You can track any sort of event, for example the time it takes to lazy-load certain parts of the website, with a simple instruction like this:

ga('send', 'timing', title, label, end_time - start_time);

Once you have gathered some data about your visitors, log in to the dashboard to analyze the results.

Google Analytics

Webmaster tools

Now that you have tracking on the website you might find out that people don't actually find it. Getting your pages to show up in Google search results could greatly improve your chances of being found, and it's easy to get started.

You can register your website on Google Webmaster Tools (for free) to get their crawler to index it. It will also display useful statistics about how often your pages show up in search results, for which queries, and how many times people click on those links to get to the site. This can also help you optimize the website to make it more appealing in the search results, if you find that people don't navigate to it even when it shows up on the list often.

To get started, Google needs to verify that you manage the website. There are multiple ways to do it, but if you already have Google Analytics set up, you can get the system to verify that the website carries tracking code associated with your account.

You can see useful metrics and information about how the crawler interprets your site, how often it checks it, and any issues it has identified.

Search crawler

Once your pages are getting indexed and start showing up in the search results, you can find statistics about them in the dashboard.

Search traffic

You can influence how your pages look in the search results. To control the basics, make sure you have short, meaningful titles in the <title> tags and a nice summary of the content in the description:

<meta name="description" content="...">

If you can refer to your pages with multiple URLs, you might want to also add canonical links to them to tell Google that they are the same and which URL is your preferred one.

<!-- Make sure to use absolute URLs here. -->
<link rel="canonical" href="https://my.site.com/preferred/url"/>

There are a number of factors influencing the ranking of the pages including having relevant and frequently updated content on them. Other metrics like time spent on the site by visitors and bounce rate might also have an effect on it.

Both visitors and Google take into consideration how quickly pages of the website load, so let's run some tests to find out!

Pagespeed Insights

Once you have some metrics about browser timings from your site's visitors, you might find that the experience is not that great when you're not connecting to the site on localhost on your laptop.

A great tool that can help you identify some of the problems is Google's PageSpeed Insights that processes a page on your website and analyzes its performance from both mobile and desktop perspective. It also gives you helpful hints on how to resolve those problems.

PageSpeed Insights

You only need to enter the URL of the page you want more information about and hit the Analyze button. For my demo site I got a few helpful tips on how to make it faster.

Browser caching

The static resources, like images, JavaScript and CSS files, were served without cache expiration headers or with ones that were not long enough. I easily changed this to 1 month in Flask using a simple configuration.

from flask import Flask

app = Flask(__name__)
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 30 * 24 * 60 * 60  # 1 month

Optimize images

Optimizing images was an easy win since most of my images were screenshots saved as-is, so there was definitely room for improvement. PageSpeed Insights lets you download the optimized version of your images, or you can use any tools you like.

Eliminate render-blocking resources

The browser will not be able to render the page if it has resources to download before the content, usually in the <head> section of the HTML. For JavaScript you can try the async attribute on the <script> tag, though be aware that the scripts will be downloaded and executed in no guaranteed order, so if you have dependencies between your scripts this might be tricky. I opted for the defer attribute, which delays executing the JavaScript files until the page has finished parsing.

CSS was a bit trickier since you would normally want the content on the page to appear looking nice so these usually live in the <head> section. Still, you might have styles that can be loaded later, for content not visible at first glance or elements that are lazy loaded. I have changed my <link> tags loading non-critical CSS to:

<meta name="custom-fetch-css" content="/asset/non-critical.css"/>

These are then turned into regular <link> tags with JavaScript once those are loaded:

lazyLoadCSS: function () {
    $('meta[name=custom-fetch-css]').each(function () {
        var placeholder = $(this);
        var href = placeholder.attr('content');

        placeholder.replaceWith(
            $('<link>').attr('rel', 'stylesheet')
                       .attr('href', href)
                       .attr('type', 'text/css'));
    });
}

Enable compression

I could easily enable this in Nginx as described already with a little bit of configuration.

http {
  ...
  gzip on;
  ...
}

server {
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript image/svg+xml;
    ...
}

Other hints

My site's results were getting better in PageSpeed Insights but there were still some more things to improve:

  • Reduce server response time
  • Minify HTML
  • Minify CSS
  • Minify JavaScript

One could use the PageSpeed Module suggested by the tool, which is an add-on for the Apache or Nginx webservers. I did not want to install additional modules on mine though, so I opted for a different approach.

Webpage Test

Another great tool to measure your website's performance is webpagetest.org.

This site allows you to run tests that fetch a page on your website and collect all sorts of metrics useful to know where to focus for optimization. They can run these tests using the most popular browsers and from many locations all around the world. This is especially useful to see how latency hurts the user experience.

Webpagetest options

The results are presented in useful tables and graphs, and screenshots and a video are also available showing how the page was loaded in the browser.

Webpagetest result

The site gives you a score too and offers helpful hints on what areas need improvement.

This site helped me identify some bottlenecks. Based on the information and the suggestions, I am now:

  • Using a Content Delivery Network (see in a following section)
  • Sending HTTP/2 preload headers for CSS and JavaScript
  • Inlining some of the CSS to avoid sending it as 2 additional files
  • Hosting the fonts on my server behind the CDN
  • Lazy-loading images
  • Sending better cache headers

This has helped me to reduce the page load time from certain locations from close to 10 seconds to about 3 seconds.
Not too bad.

CDN

My main problem with my self-hosted server is that its response time is limited by my internet connection speed - which is not enterprise-grade by any measure. Response times are mostly OK from within the UK but the further you go the worse it gets.

The solution to this problem is a Content Delivery Network. CDNs provide a geographically distributed service to proxy your web content, with the goal of reducing latency by serving the data from a server close to the end user.

Cloudflare offers a free plan that gives you a global CDN with SSL, caching, analytics and much more.

On the initial setup, I needed to change the DNS servers on Namecheap to the ones provided by Cloudflare instead of their own. This change could take 1 or 2 days according to them, but for me it actually happened within minutes.

Apart from serving the response closer to the user what does Cloudflare give me?

End-to-end encryption

With the Full (strict) SSL setting, visitors use the new Cloudflare certificates while Cloudflare fetches the content from my origin server using its LetsEncrypt certificates.

Auto Minify content

A great feature: just by ticking some checkboxes, my HTML, JavaScript and CSS content is now served minified without having to do it on the origin server.

Caching

You can get Cloudflare to cache content for a certain amount of time and add appropriate cache headers on the responses to get the browser to cache content for some time too. For example, you could get the edge servers to cache everything for 2 hours and tell the browser to cache content locally for 1 hour.

This setting actually required using one of the three free Page Rules. The configuration I am using is:

  • Cache Level: Cache Everything
  • Browser Cache TTL: 1 hour
  • Edge Cache TTL: 2 hours

You can also purge cached content on the dashboard or using their API if you need to make an update visible quickly.
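
If I read their v4 API docs right, a purge request from Python could look roughly like this - the zone ID and credentials are placeholders:

import requests

ZONE_ID = 'your-zone-id'

response = requests.post(
    'https://api.cloudflare.com/client/v4/zones/%s/purge_cache' % ZONE_ID,
    headers={'X-Auth-Email': 'you@example.com', 'X-Auth-Key': 'your-api-key'},
    json={'purge_everything': True})
response.raise_for_status()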

HTTP/2 Push

HTTP/2 Push is a great feature of the new version of the protocol. It allows the web server to send resources to the browser before it realizes it will need them, for example CSS or JavaScript files included in the response document.

Unfortunately, Nginx does not support this in the community version, only in the paid Plus version, so I could not use it easily. Cloudflare handles it beautifully though, and all you need to do is send HTTP headers like these in the response:

Link: </asset/some-style.css>; rel=preload; as=style
Link: </asset/some-script.js>; rel=preload; as=script

In my Flask application I have a dictionary of the static assets so the code to prepare these headers looks like this:

for name, link in assets.items():
    if name.endswith('.css'):
        response.headers.add('Link', '<%s>; rel=preload; as=style' % link)

    elif name.endswith('.js'):
        response.headers.add('Link', '<%s>; rel=preload; as=script' % link)

Updates

My initial update strategy used to be simple. I had one server running the site and its related services using docker-compose. Every 15 minutes a cron job would fetch the latest configuration from a private BitBucket repository, run docker-compose pull to update the images from Docker Hub and execute docker-compose up -d to restart the changed services.

I have since switched to 3 server nodes forming a Swarm cluster. Whenever a new image is pushed to Docker Hub, it sends a webhook request to a public-facing webhook-proxy instance. This validates it and, on success, sends the required details to another (internal) webhook processor instance. That will pull the latest image, get its hash and update the related service(s) to use the new image tag. This allows me to leverage Swarm's rolling update mechanism.
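
To give a feel for the idea (and only that - this is not the actual webhook-proxy code), the internal processor could be a small Flask app shelling out to the Docker CLI; the token and service name below are made up:

import subprocess

from flask import Flask, abort, request

app = Flask(__name__)
KNOWN_TOKEN = 'change-me'  # shared secret checked on incoming webhooks

@app.route('/webhook/dockerhub/<token>', methods=['POST'])
def on_image_pushed(token):
    if token != KNOWN_TOKEN:
        abort(403)

    payload = request.get_json()
    image = '%s:%s' % (payload['repository']['repo_name'], payload['push_data']['tag'])

    # pull the new image and let Swarm roll it out to the service
    subprocess.check_call(['docker', 'pull', image])
    subprocess.check_call(['docker', 'service', 'update', '--image', image, 'demo_demo-site'])
    return 'OK'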

For configuration updates happening in the private repository, the Git repository is pulled, services related to changed configuration files are restarted and finally the stack is updated using docker stack deploy <stack_name> to ensure any changes in the YAML file are applied.

The update process has changed considerably since I started working on the stack, but the ultimate goal hasn't: it allows me to forget about everything needed to get my changes live - the only thing I have to do is git push.

Now that we have our services up and running we should keep track of how they are doing. It is time to add some monitoring for the stack!

Prometheus

A great piece of open-source software to monitor your applications is Prometheus, originally built at SoundCloud. It has a multi-dimensional data model to keep track of time series with key/value pairs, and a flexible query language to make great use of them.

Prometheus graph

Prometheus uses an HTTP-based pull model for collecting metrics. That means that the applications you want to monitor should normally expose a /metrics HTTP endpoint that Prometheus can access. It then periodically scrapes the configured targets to retrieve the latest metrics from them. If this is not your thing, you can also use a push model via gateways.

The server is also available as a Docker image, prom/prometheus, so to try it out you can simply start it like this:

docker run -p 9090:9090 -v /tmp/prometheus.yml:/etc/prometheus/prometheus.yml \
       prom/prometheus

This expects the prometheus.yml configuration file at /tmp/prometheus.yml on the host machine. Configuration is done via this simple YAML file where you can define global settings and the scrape targets:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

scrape_configs:
  # Prometheus to monitor itself
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'remote-app'

    static_configs:
      - targets: ['remote.server:9122']

The server conveniently responds to SIGHUP signals by reloading its configuration, which makes it an excellent candidate for integrating with Docker-PyGen. It also supports many other configuration options for service discovery, so make sure to have a look at the configuration reference to see if any of them would work better for you.

To expose metrics, you can have a look at the growing list of existing exporters or you can roll your own using the client libraries. The response format is very simple but most of these libraries have convenience functions to start an HTTP endpoint doing this for you.
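
For Python, the official prometheus_client library makes this a few lines - a minimal sketch with a made-up metric:

import random
import time

from prometheus_client import Summary, start_http_server

# tracks the count and duration of the (made-up) processing function
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing requests')

@REQUEST_TIME.time()
def process_request(duration):
    time.sleep(duration)

if __name__ == '__main__':
    start_http_server(8000)  # serves the metrics at http://localhost:8000/metrics
    while True:
        process_request(random.random())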

Prometheus comes with a simple UI to check the configuration and the status of the scrape targets, plus a query and visualisation page.

Prometheus scrape targets

This gives you a quick and easy way to run queries ad-hoc but for visualization you will probably want something more sophisticated.

Grafana

Grafana is also an open platform for analytics and monitoring. It already supports over 25 data sources, so you can build beautiful dashboards from their data.

Grafana node metrics

Grafana is also available as a Docker image. You can get it by pulling the official grafana/grafana image then running it as:

docker run -d --name=grafana -p 3000:3000 grafana/grafana

This will start the server on port 3000 and you can log in using the admin/admin credentials.

The platform supports Prometheus out of the box so you can add it easily.

Grafana datasource

Once that's done, you can build a dashboard using its metrics, or import the default Prometheus dashboard that displays useful metrics about the Prometheus server itself.

Grafana - Prometheus

To wire these monitoring systems together in my docker-compose project, I use a setup like this:

version: '2'
services:

  ...services to monitor...

  # Metrics
  prometheus:
    image: rycus86/prometheus:aarch64
    restart: always
    ports:
      - "9090:9090"
    volumes:
      - prometheus-config:/etc/prometheus:ro

  prometheus-pygen:
    image: rycus86/docker-pygen:aarch64
    command: --template /etc/docker-pygen/templates/prometheus.tmpl --target /etc/prometheus/prometheus.yml --signal prometheus HUP
    restart: always
    volumes:
      - prometheus-config:/etc/prometheus
      - /tmp/prometheus-pygen.tmpl:/etc/docker-pygen/templates/prometheus.tmpl:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro

  prometheus-node-exporter:
    image: rycus86/prometheus-node-exporter:aarch64
    command: --collector.procfs /host/proc --collector.sysfs /host/sys
    restart: always
    expose:
      - "9100"
    labels:
      - prometheus-job=node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro

  grafana:
    image: rycus86/grafana:aarch64
    restart: always
    ports:
      - "3000:3000"
    labels:
      - nginx-virtual-host=metrics.viktoradam.net
    volumes:
      - /tmp/grafana.config.ini:/etc/grafana/grafana.ini

volumes:
  prometheus-config:

This way I can get the Docker-PyGen container to reload the Prometheus configuration when new services are available and have Grafana point to it as http://prometheus:9090. Simple, isn't it?

You can see live dashboards at metrics.viktoradam.net.

Portainer

To help you monitor your Docker containers better you can also run Portainer. The application is a lightweight management UI that taps into the Docker daemon to provide all sorts of useful information about it.

Getting started could not be easier:

docker run -d --name portainer -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock \
       portainer/portainer

This starts Portainer on port 9000, so just open http://localhost:9000/ in your browser. On first run it will ask you to set an admin password, then you can access the interface.

The dashboard gives you a quick glance over some metrics around the number of containers, images, volumes and Docker networks you have plus basic information about the node it is connected to and about Swarm if it is available.

Portainer dashboard

You can see details of all of them on dedicated pages and you can even control them from there - pretty nice.

Portainer containers

You can pull new images or start new containers and services from these pages, for example. It is also great for quickly checking what is unused and deleting it - in case you have not grown to love the new prune commands on the Docker CLI yet.

If you want to use this in Docker Compose you could have it defined like this:

version: '2'
services:

  ...other services...

  portainer:
    image: portainer/portainer:linux-arm64
    restart: always
    ports:
      - "9000:9000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

As you can see, the official images are available for multiple architectures, not only for x64 which is very nice.
