self-hosting webdev

SELF HOST WEB ANALYTICS WITH PLAUSIBLE


Plausible has made it easy to move away from Google Analytics

I am increasingly concerned about the growing big-tech surveillance state. As a worker in tech, I know all too well how much personally identifiable information is shared by simply visiting a website. Google is not the worst company in this regard, but they are certainly one of the most pervasive. Over the past few years, I have engaged in a never-ending game of whack-a-mole to de-Google my life.

One of the last places I still relied on Google was web analytics for this blog and my band's website.

I chose Plausible.io to replace Google Analytics for the following reasons:

The first step was figuring out where to host my analytics server. I could host this at home for free, but I avoid opening any ports on my home network to the internet. So I set up a small VPS at my favorite cloud platform. I don't need anything beefy here, so I selected the smallest possible VPS, with a total cost of $5/month.

Next, I needed a publicly available domain name for this server. I purchased one and moved the DNS to Cloudflare.

I created an Ansible playbook to install all the necessary services on my VPS. If you're not familiar with Ansible, it is a simple but powerful tool for managing remote computers with a series of YAML files.
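
For readers new to Ansible, here is a minimal sketch of what such a playbook can look like. It is not the playbook from this post: the inventory group name `analytics` is hypothetical, and the tasks assume a Debian or Ubuntu VPS where Docker is installed from the distribution's `docker.io` package.

```yaml
# playbook.yml: illustrative sketch only, not the author's actual playbook
- name: Prepare the analytics VPS
  hosts: analytics          # hypothetical inventory group name
  become: true              # run the tasks with sudo
  tasks:
    - name: Install Docker from the distribution packages
      ansible.builtin.apt:
        name: docker.io     # assumes a Debian/Ubuntu host
        state: present
        update_cache: true

    - name: Make sure the Docker daemon is running and starts at boot
      ansible.builtin.service:
        name: docker
        state: started
        enabled: true
```

A playbook like this would typically be run with `ansible-playbook -i inventory.ini playbook.yml`, where `inventory.ini` (also a placeholder name) lists the VPS under the `analytics` group.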

The full playbook is publicly available on GitHub.

This playbook performs the following tasks.

Configuring Plausible Analytics proved incredibly simple. I only needed to add a few items to the docker-compose.yml that they provide in their git repository.

Labels to allow Traefik to point to the Plausible container

services:
    plausible:
        labels:
            - "traefik.enable=true"
            - "traefik.http.routers.stats.rule=Host(`stats.MYDOMAIN.COM`)"
            - "traefik.http.routers.stats.entrypoints=web,websecure"
            - "traefik.http.routers.stats.tls.certresolver=letsencrypt"
            - "traefik.http.routers.stats.middlewares=chain-authelia@file"
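
The labels above refer to entry points named `web` and `websecure` and to a certificate resolver named `letsencrypt`. Those names have to exist in Traefik's static configuration. A minimal sketch of that file, assuming Traefik v2 with a YAML static config and the HTTP-01 challenge; the email address and storage path are placeholders, not values from my setup:

```yaml
# traefik.yml (static configuration): illustrative sketch under the assumptions above
entryPoints:
  web:
    address: ":80"          # plain HTTP
  websecure:
    address: ":443"         # HTTPS

certificatesResolvers:
  letsencrypt:              # must match the name used in the certresolver label
    acme:
      email: "you@example.com"            # placeholder
      storage: "/letsencrypt/acme.json"   # placeholder path inside the Traefik container
      httpChallenge:
        entryPoint: web     # Let's Encrypt validates over port 80
```

If the DNS is managed at a provider such as Cloudflare, a DNS challenge is another option, but it needs API credentials that are out of scope here.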

Add my docker network to the compose file

networks:
    traefik_proxy:
        name: traefik_proxy
        external: true

Add my docker network to each service

services:
    service-name:
        networks:
          - traefik_proxy
        ...

Create an access control policy to allow access to the Plausible container

access_control:
    default_policy: deny
    rules:
        - domain: "stats.{{ domain_name }}"
          resources:
              - '^/js/.*\.js$' # Access to the analytics javascript
              - "^/api([/?].*)?$" # Access to the API
              - '^/MYDOMAIN\.COM.*$' # Share the stats publicly
              - "^/share/.*$" # Allow embedded stats in a web page
          policy: bypass
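
The policy above only takes effect because the Plausible router sends requests through the `chain-authelia@file` middleware named in the Traefik labels. A minimal sketch of how that middleware could be defined in Traefik's dynamic file configuration. The Authelia container name, port, and login URL are assumptions, and the verification endpoint differs between Authelia versions, so check the documentation for the release you run:

```yaml
# dynamic configuration file loaded by Traefik's file provider: illustrative sketch
http:
  middlewares:
    authelia:
      forwardAuth:
        # assumed container name, port, and legacy /api/verify endpoint
        address: "http://authelia:9091/api/verify?rd=https://auth.MYDOMAIN.COM/"
        trustForwardHeader: true
        authResponseHeaders:
          - "Remote-User"
          - "Remote-Groups"

    chain-authelia:         # referenced from the labels as chain-authelia@file
      chain:
        middlewares:
          - authelia
```

With this in place, every request to the stats host is checked by Authelia first, and only the paths listed under `resources` skip the login prompt.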

The team at Plausible has done an amazing job. Their documentation is simple to follow and their self-hosted platform worked flawlessly the first time.

The last step was to edit my Jekyll blog templates to remove Google Analytics and replace it with Plausible. Here's the commit showing the changes.

Now, Google is no longer spying on visitors to this site.

There are a few remaining steps that I plan on tackling next.

  1. Update my band's website to use Plausible
  2. Import my old data from Google Analytics using Plausible's import tool