Reading recommendations (2017-08-13)

Posted on Sun 13 August 2017 in reading recommendations

~Onatcer tells devastating things about the new exams required to study Computer Science in Vienna in Der TU-Aufnahme-Test: Pleiten für alle! (German).

Troy Hunt provides a new service where one can check passwords against a gigantic collection of millions of leaked passwords with Introducing 306 Million Freely Downloadable Pwned Passwords.

Currently Final Fantasy XIV's Moonfire Faire seasonal festival is running and I already played through the seasonal quests in order not to miss anything. ~Luxpheras from the Community Team put up the post Shaved Ice Ice Baby to promote the event with some pictures of the spoils.




How I publish this blog

Posted on Mon 07 August 2017 in development

It was 2015 when I finally decided to act upon my dissatisfaction with the WordPress publishing process and move to a different solution. I exported my posts and pages from its MySQL database and moved on to Pelican - a static site generator written in Python. Usually, when you hear "static site generator" you think of Jekyll. Jekyll is the static site generator people know of - the major reason for that being that it is used behind the scenes for GitHub Pages.

Jekyll is written in Ruby, however, and I have not put enough time into Ruby to be more familiar with it than exchanging some lines in existing code here and there. Python is my tool of choice and when a friend mentioned Pelican I was immediately hooked - even though it took me many months to finally put my plans into motion.

Back in the days: WordPress

WordPress had always struck me as being built for ease of use. It is heavyweight, can be deployed almost everywhere and its features are plentiful. There was one major pain point for me though: For a reason I have never figured out, none of the available native clients (e.g. Blogo, MarsEdit) ever managed to show me more than my last few posts instead of a full view of all historical ones.

I frequently edit posts in the days after they are published. I fix typos, update the wording if I think it is bad after reading it again and sometimes add additional information. I consider publishing an article a bit like writing software or configuring a system. It often needs a little adjustment after it has been in use (or in testing) for some time. With WordPress that meant I had to go to the admin page every time to change something. The workflow was something akin to:

  • go to bookmarked login site
  • swear about login being insecure due to missing TLS deployment
  • log in
  • go to section "posts"
  • find the post in question
  • edit the post by copying the modified content from my local file to the website
  • preview the post on the site
  • save the post

I dislike the need to look for click targets, to scan for the relevant article in the list, the waiting between interactions on a slow connection. The setup screamed for some sort of automation but nothing seemed easy to set up at that point.

Uploading Pelican

Immediately after switching to Pelican for content generation, I found myself in the puzzling situation of having a blog but no easy way to publish it. A bit of investigation uncovered that Pelican ships with a Makefile that includes an ftp_upload target though. I configured this and added a ~/.netrc file so I didn't need to type my password every time an upload was performed. This worked fine for a while. I even wrote a little bash alias to run it.

source ~/.virtualenvironments/pelican/bin/activate \
  && cd ~/…/ghostlyrics-journal/Pelican \
  && make ftp_upload \
  && deactivate \
  && cd - \
  && terminal-notifier -message "GhostLyrics Journal published." -open "http://ghostlyrics.net"

It was in May 2016 that the lftp build for macOS broke. That means that after an upgrade of macOS I was left without a way of easily deploying changes to the blog. Pelican uses lftp because of some of its features like mirroring a local folder and updating only the differences instead of copying the whole folder recursively every time you kick it. I think I tried to publish with Transmit once or twice but it is simply not built for this task.

I was enormously frustrated and heartbroken. I didn't write anything for weeks, instead hoping a solution would surface that didn't require engineering effort on my part. However, the build remained broken and so did my FTP upload.

After being inspired I decided that the status quo wasn't acceptable and went on to build a way that allowed me to simply run publish in Terminal and have everything done for me - reproducibly and rock solid.

Up comes Vagrant

In October 2016 I came up with a Vagrantfile that allowed me to publish from an Ubuntu machine via Vagrant. This worked around the author of lftp seemingly having little interest in building for macOS.

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-16.04"
  config.vm.synced_folder "/…/ghostlyrics-journal", "/pelican"

  config.vm.provision "file", source: "~/.netrc", run: "always", destination: ".netrc"

  config.vm.provision "shell", env:{"DEBIAN_FRONTEND" => "noninteractive"}, inline: <<-SHELL
    apt-get -qq update
    apt-get -qq -o=Dpkg::Use-Pty=0 install -y --no-install-recommends \
      make \
      python-markdown \
      python-typogrify \
      python-bs4 \
      python-pygments \
      pelican \
      lftp
  SHELL

  config.vm.provision "shell", privileged: false, run: "always", inline: <<-SHELL
    make -C /pelican/Pelican ftp_upload
  SHELL
end

In short: I use a bento Ubuntu box because I've had bad experiences on multiple occasions with the boxes in the Ubuntu namespace. I sync the folder my blog resides in to /pelican in the VM. I copy the .netrc file with the credentials. The VM gets some packages I need to run Pelican and calls the ftp_upload make target. This also got a new bash alias.

cd ~/vagrant/xenial-pelican \
  && vagrant up \
  && vagrant destroy -f \
  && cd - \
  && tput bel

Now, if you only ever publish a few times, this works fine and is perfectly acceptable. If you intend to iterate, pushing out changes a few times within half an hour, you'll be stuck waiting more often than you'd like due to the VM booting and reconfiguring on every run. Destroying the VM after each upload was necessary to avoid conflicts when I work on different machines, since the Vagrantfile lives in my Dropbox.

Wrapping it up with Docker

Enter Docker. Now I know what you are thinking: "Docker is not the solution to all our problems" and I agree - it is not. It seems like the right kind of tool for this job though. Being built on xhyve and therefore Hypervisor.framework, it is decidedly more lightweight than VirtualBox. When it is already running, firing up a container that builds the blog, uploads it and shuts down again is very, very fast.

I built the following Dockerfile with the command docker build -t pelican . while in the directory containing the Dockerfile and .netrc.

FROM buildpack-deps:xenial
LABEL maintainer="Alexander Skiba"

VOLUME "/pelican"
WORKDIR /pelican
ENV DEBIAN_FRONTEND noninteractive

ADD ".netrc" "/root"

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      make \
      python3-pip \
      python3-setuptools \
      python3-wheel \
      lftp

RUN pip3 install \
      pelican \
      markdown \
      typogrify \
      bs4 \
      pygments

CMD ["make", "-C", "/pelican/Pelican", "ftp_upload"]

Again, I build on top of an Ubuntu Xenial image, work in /pelican, copy the .netrc file and install packages. This time, however, I install the packages via pip to get current versions. It is also of note that while building the image, one does not have access to files outside of the current directory and its subdirectories, which made a local copy of .netrc necessary. Furthermore, the host paths for Docker volumes cannot be defined in the Dockerfile by design. Because of that, the new bash alias is this:

docker run -v /…/ghostlyrics-journal/:/pelican pelican

This short command starts a container from the image called pelican with the given folder mounted as a volume at /pelican. Since I don't pass a command of my own, the CMD defined earlier is called and the blog is built and uploaded. Afterwards the container exits since the command itself exits. Quite an elegant solution, I think.
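If you would rather have a small Python wrapper than a bash alias, a minimal sketch could look like the following. The function names are made up for illustration and the blog directory is whatever path you mount. The --rm flag is an addition of mine that is not part of the alias above: without it every run leaves a stopped container behind.

```python
import subprocess


def build_publish_command(blog_dir, image="pelican"):
    """Assemble the docker invocation; --rm is an addition, not in the original alias."""
    return ["docker", "run", "--rm", "-v", f"{blog_dir}:/pelican", image]


def publish(blog_dir):
    """Run the container and raise CalledProcessError if make ftp_upload fails."""
    subprocess.run(build_publish_command(blog_dir), check=True)
```

Keeping the command assembly separate from the call makes it easy to print or check the exact command before anything is uploaded.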


Final Fantasy XIV: Stories about Fellowship

Posted on Sat 05 August 2017 in video games • Tagged with Stories

I started playing Final Fantasy XIV (FF) in February when my disappointment about the many quirks of Black Desert Online (BDO) reached an all-time high. After feeling that a lot of things were unpolished in BDO I wanted to try an MMO with monthly subscription - the assumption being that the extra money was used for a certain layer of polish and QA that I long for when playing a video game.

I was pleasantly surprised. All the GUIs were fine, not overloaded, no text outside of its intended boxes or similar stuff showing neglect on behalf of the developer. While the beginning of combat is rather boring and depressingly slow, it gets better when you get more skills. The world is built with attention to detail even though I felt that BDO's world was more alive, especially Altinova. I want to point out that the writing is superb. The jokes, pop culture references and times when the game doesn't take itself seriously are amazing.

When taking pictures in FF I almost always capture events, experiences and even characters, whereas in BDO my favorite motif was the environment.

Nadzeya looking at grilled food in Altinova

Another thing I realized early on is how the game is built to foster community and friendliness. There are systems in place to help new players (Novice Chat), to encourage players to play older content with others (dungeon bonuses, second chances for Khloe's Wondrous Tails) and to be generally helpful and cooperative while in an instance (player recommendations). All this is just so fundamentally different from the dog-eat-dog mentality in BDO where you can basically get stabbed outside safe zones with little to no repercussions for the murderer.

Let me tell you about Kakysha Saranictil, a rogue and ninja fighting for the good of the people of Eorzea. She is a hero to the common folk as much as to statesmen. Fighting for the right cause is reason enough for her to help everyone, be they a poor miner in an almost forsaken village or the ruler of a grand city-state.

An airship leaving Ul'dah

While she started her journey as a pugilist (read: martial artist) in Ul'dah, the prosperous desert nation, she soon discovered that her true calling is the shadows and so she became a member of The Dutiful Sisters of the Edelweiss in Limsa Lominsa where she studied under Captain Jacke. As her travels led her all over Eorzea she sadly realized that Jacke had little left to teach her. Luckily Oboro, a ninja hailing from Doma in the Far East, took her under his wing and taught her the ways of the ninja.

Kakysha sitting cloaked in Idyllshire

Now, while her comrades at the Scions of the Seventh Dawn certainly kept her busy defending this or that nation from both primal and Garlean threats, she certainly did spend her downtime well, building trust with a more conservative faction of Ul'dah's lizard people, the Amalj'aa. A proud folk of warriors, they came to respect her when she helped them uphold their traditions again and again against their fanatically religious kin revering Ifrit, as well as defending their clanswoman.

Admittedly even a hero needs a little rest from time to time and what better use of said downtime would there be than finally having dinner with her close friend, Ser Aymeric de Borel.

Aymeric laughing about a joke Kakysha made

Kakysha smiling at Aymeric

But even between all those events, she found a sense of belonging, of fellowship. Kakysha joined a Free Company soon after starting her journey, but ultimately felt unfulfilled by both the people and their way of treating each other. After a long period of solitude she came across the Seraphs, Jenji and Syn Seraph, who invited her to join their Free Company, The Black Crown, where people were pleasant and all was well. While Crown has a considerable number of adventurers who have failed to show up in recent times, there is a core group of heroes who are there to help others, to talk and to have fun with.

Recently one might get the impression that Kakysha had become complacent, ignoring the plight of her fellow people. Nothing could be further from the truth - it is only that she needed to focus on solving the biggest issues first (namely, the liberation of Doma and Ala Mhigo) before tackling the smaller issues (leftover sidequests) now that a manner of peace has been established.

Kakysha watching the people in Quarrymill

Look out for her on the Phoenix, EU server - you'll know her by her completely sand colored clothes - be they adorned by gems and jewels or more work oriented with belts and pouches, bright red hair and glasses.


Reading recommendations (2017-07-26)

Posted on Wed 26 July 2017 in reading recommendations

Pieter Hintjens has Ten Steps to Better Public Speaking for you. Amongst them is to avoid using slides since they send the audience into passive 'consumer only' mode. I'm definitely guilty of doing that as a listener.

Here is an interview with Craig Schaefer, author of one of my favorite book series, the Faust books: Cover Reveal and Mini-Q&A with Craig Schaefer (by Mihir Wanchoo). I'm very sad that he still hasn't resumed selling his books on Apple's iBooks, my preferred source.

Jason Schreier writes in Final Fantasy XII: The Zodiac Age: The Kotaku Review that one of the most interesting games in the Final Fantasy series is as good as I remember it and might be even better in its new version. I'm very pleased to hear that even though I don't currently own a PS4. They even removed that one quirk where you mustn't open a specific chest for the whole game in order to get one of the best weapons.

Matt Gemmell's Regulars is about people-watching. It's about the what-ifs. It's what happens when observation and imagination meet and have a great time in a coffee shop. (pun intended)

With every new framework release comes the fresh chance of masking your lack of fundamental JavaScript knowledge. @iamdevloper

I also happened to read the comic books I got with The Witcher 2 and Alan Wake, but sadly those didn't click with me.




Example of a Sensu Puppet class

Posted on Thu 13 July 2017 in work • Tagged with Sensu

On the sensu-users mailing list someone asked how they could deploy Sensu plugins with Puppet. After giving a short snippet, I was asked for further help on whether to implement the snippet as a class or what I would recommend. Therefore I present to you: A slightly redacted example of a Sensu Puppet class taken from production.

I will attempt to walk you through the sections I used and at the end there will be a big code block with the complete class, for easier copy & paste.

Please note that this class was written with Sensu 0.26 and sensu-puppet 2.2.0 in mind. It may not include all the latest features you expect and does not use features that are only available in newer versions of Sensu or sensu-puppet.

Detailed explanation

docs

# Class: services::sensu
# Manages configuration, checks, handlers and certs for
# the sensu monitoring system
#
# parameters:
# (bool) is_main_server: makes this server the main host on which sensu is run
# (bool) consistent_connection: if set to `false`, enables high-value timeouts
#        for sensu keepalive checks
# (array) subscriptions: the check groups a host should subscribe to

You will always want some form of documentation. Leaving a little bit in the code is considered good practice and puppet-lint will (rightfully) complain if you don't. I make sure to also leave hints about class parameters and their types since I don't use them a lot in this project.

default parameters

class services::sensu($is_main_server = false,
                      $consistent_connection = true,
                      $subscriptions = [])
{

As you might have seen, I use is_main_server to denote the sensu-server instance, so it defaults to false. consistent_connection is true by default and will be manually set to false for desktop or laptop machines that are turned off regularly. In a later version of Sensu and sensu-puppet this can be solved more easily with deregistration. The subscriptions array is empty by default and will be filled with strings that enable subscriptions and checks that are not automatically detected.

manual configuration

# configuration
$rabbitmq_password = 'REDACTED'
$gitlab_health_token = 'REDACTED'
$gitlab_issues_token = 'REDACTED'
$assignments_health_token = 'REDACTED'
$sensu_monitoring_password = 'REDACTED'

# installed sensu plugins
$plugins = ['sensu-plugins-cpu-checks',
            'sensu-plugins-disk-checks',
            'sensu-plugins-environmental-checks',
            'sensu-plugins-filesystem-checks',
            'sensu-plugins-http',
            'sensu-plugins-load-checks',
            'sensu-plugins-memory-checks',
            'sensu-plugins-network-checks',
            'sensu-plugins-nvidia',
            'sensu-plugins-ntp',
            'sensu-plugins-postfix',
            'sensu-plugins-process-checks',
            'sensu-plugins-puppet',
            'sensu-plugins-raid-checks',
            'sensu-plugins-uptime-checks']

# kibana URL - allows clicking to jump to filtered log results
$kibana_url = "https://REDACTED/#/discover?_g=()&_a=(columns:!(_source),interval:auto,query:(query_string:(analyze_wildcard:!t,query:'host:${::hostname}')),sort:!('@timestamp',desc),index:%5Blogstash-%5DYYYY.MM.DD)#"
# grafana URL - allows clicking to jump to filtered metrics
$grafana_url = "https://REDACTED/dashboard/db/single-host-overview?var-hostname=${::hostname}"
# runbook prefix - allows linking directly to a proposed solution
$runbook_prefix = 'https://REDACTED/administrators/documentation/blob/master/runbooks/sensu'

# how many times should keepalive fire before notifications
$keepalive_occurrences = '1'
# how much time needs to pass until keepalive notification is repeated (in seconds)
$keepalive_refresh = '3600'
# impact text for keepalive
$keepalive_impact = 'Host is not checking in with monitoring and may be completely unavailable.'
# suggestion text for keepalive
$keepalive_suggestion = 'Check if the host is frozen, stuck, down or offline.'

This is the section where details specific to our deployment reside. There is one block that holds tokens and passwords: some are used by Puppet during the deployment (rabbitmq_password) and others are used by Sensu during standard operation (e.g. gitlab_health_token when monitoring GitLab's health API).

  • plugins: lists Sensu plugins that should be installed on all machines.
  • kibana_url, grafana_url: We have systems in place to collect log files and metrics from the systems we monitor. These are easy links that will be displayed in Uchiwa and notifications (e-mail, Mattermost) that link directly to data for the host in question.
  • runbook_prefix: I wrote runbooks for most checks so that my colleagues can resolve issues while I'm on vacation. This is prepended in checks, so that one only needs to concatenate the prefix with the filename of the runbook in question to get a full URL.

The next block describes Sensu's keepalive events - you get these when Sensu has lost contact with a client (meaning your client hasn't checked in with the Sensu server for some time). The keepalive_occurrences and keepalive_refresh attributes are used for filtering of notifications.

keepalive_impact and keepalive_suggestion are part of a concept I use throughout our Sensu deployment - Every check that can trigger a notification needs to have information on what the real-world impact of a failure is and what the quickest and most common solution to the problem could be.

automatic subscriptions

# automatic subscriptions computed from machine properties
if (str2bool($::is_virtual) == true)
{
  $machine_type = ['virtual']
}
else
{
  $machine_type = ['physical']
}

if (str2bool($::has_nvidia_graphics_card) == true and str2bool($::using_nouveau_driver) == false)
{
  $gpu = ['nvidia']
}
else
{
  $gpu = []
}

if (($::operatingsystem == 'Ubuntu' and versioncmp($::operatingsystemrelease, '16.04') >= 0) or
    ($::operatingsystem == 'Debian' and versioncmp($::operatingsystemrelease, '8.0') >= 0))
{
  $systemd_enabled = ['systemd']
}
else
{
  $systemd_enabled = []
}

$automatic_subscriptions = concat($machine_type, $gpu, $systemd_enabled, ['client_specific'])

After a while, hardcoding checks gets annoying and that's why I try to automatically detect some things based on hardware or operating system.

  • ::is_virtual is a default Puppet fact. I'll add checks for S.M.A.R.T. as well as RAID checks and sensors metrics if run on a physical machine. (not included in this example)
  • ::has_nvidia_graphics_card is a fact taken from jaredjennings/puppet-nvidia_graphics. I'll add GPU specific metrics based on that. (not included in this example)
  • I'll also try to decide whether Systemd is managing the host or not. I'll add some specific service checks based on that. (not included in this example)

The automatic subscriptions are then combined with a pseudo-subscription called client_specific that I use to distribute only the configuration of various client specific checks to hosts.
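To make the decision logic easier to follow (or to sanity-check what a given host will end up with), here is the same set of rules as a small Python sketch. It is not part of the Puppet class. It assumes the facts arrive as a plain dict with real booleans (so no str2bool is needed) and that release strings are purely numeric like '16.04', which is a simplification compared to versioncmp.

```python
def version_tuple(release):
    """Turn '16.04' into (16, 4) so releases compare numerically."""
    return tuple(int(part) for part in release.split("."))


def automatic_subscriptions(facts):
    """Mirror the Puppet conditionals: derive subscriptions from host facts."""
    subs = ["virtual" if facts.get("is_virtual") else "physical"]

    # Only when an NVIDIA card is present and nouveau is not in use.
    if facts.get("has_nvidia_graphics_card") and not facts.get("using_nouveau_driver"):
        subs.append("nvidia")

    # Releases from which the class treats the host as systemd-managed.
    minimum_release = {"Ubuntu": "16.04", "Debian": "8.0"}
    os_name = facts.get("operatingsystem")
    if os_name in minimum_release:
        release = version_tuple(facts["operatingsystemrelease"])
        if release >= version_tuple(minimum_release[os_name]):
            subs.append("systemd")

    # The pseudo-subscription is always appended last.
    subs.append("client_specific")
    return subs
```

A virtual Ubuntu 16.04 host without a GPU would get virtual, systemd and client_specific, in that order.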

metrics templates

# template variables (must be in class scope)
$default_scheme = 'sensu.host.$(hostname)'
$metrics_handler = ['graphite_tcp']
$timestamp = '`date +%s`'

For easier use of metrics checks that are not written with sensu-plugin (the framework) I have some variables that are reused whenever hacking together a check on the quick.

  • default_scheme is prepended to a metric, resulting in something like sensu.host.myawesomehostname.cpu.usage
  • metrics_handler is an easier way of specifying the handler should we ever need to change it (or extend it).
  • timestamp is a simple way to get a UNIX timestamp.
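To show what these three variables add up to, here is a rough Python sketch of the line such a hand-rolled metrics check ends up printing. It assumes Graphite's plaintext format of path, value and UNIX timestamp separated by spaces. The function name format_metric is made up for this example.

```python
import socket
import time


def format_metric(name, value, hostname=None, timestamp=None):
    """Build one line in Graphite's plaintext format: '<path> <value> <timestamp>'."""
    hostname = hostname or socket.gethostname()
    timestamp = int(time.time()) if timestamp is None else int(timestamp)
    # Same layout as $default_scheme: sensu.host.<hostname>.<metric name>
    return f"sensu.host.{hostname}.{name} {value} {timestamp}"


print(format_metric("cpu.usage", 12.5, hostname="myawesomehostname", timestamp=1500000000))
# prints: sensu.host.myawesomehostname.cpu.usage 12.5 1500000000
```

In the Puppet version the hostname and timestamp are left as shell expressions, so they are only filled in when the check command runs on the client.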

sensu-server: packages and subscriptions

# SENSU SERVER
if ($is_main_server == true)
{
  $combined_subscriptions = unique(concat(['proxy'], $subscriptions, $automatic_subscriptions))

  $server_packages = ['redis-server', 'curl', 'jq']

  $server_plugins = [ 'sensu-plugins-imap',
                      'sensu-plugins-slack',
                      'sensu-plugins-ssl',
                      'sensu-extensions-occurrences']

  # install server-only packages
  package
  {
    $server_packages:
    ensure => present,
  }

  # install plugins for proxy group

  package
  {
    $server_plugins:
    ensure   => present,
    provider => 'sensu_gem',
    require  => Package[$server_packages],
  }

The Sensu server is the machine handling proxy requests for me. That means a check that tests, for example, whether a site on another machine is available via HTTP is a proxy check and will, in my deployment, be run on the Sensu server. To achieve this, a proxy subscription is added to the subscriptions of the server.

Next, the server_packages are installed via the default package management (e.g. apt in my case) and the server_plugins are Sensu specific ruby gems that are installed via the sensu_gem provider that comes with sensu-puppet.

sensu-server: workaround

# Workaround for sensu-api not subscribing to check updates.
Class['::sensu::client::service'] ~> Class['::sensu::api::service']

Sometimes I had the problem that the results for some queries in Uchiwa were not the most recent ones and this snippet seems to have solved it.

sensu-server: configuration

This is the part where the sensu-puppet module is configured by my class.

class
{
  '::sensu':
  rabbitmq_password           => $rabbitmq_password,
  server                      => true,
  client                      => true,
  api                         => true,
  api_bind                    => '127.0.0.1',
  use_embedded_ruby           => true,
  rabbitmq_reconnect_on_error => true,
  redis_reconnect_on_error    => true,
  redis_auto_reconnect        => true,
  subscriptions               => $combined_subscriptions,
  rabbitmq_host               => '127.0.0.1',
  redis_host                  => '127.0.0.1',
  redact                      => ['password', 'pass', 'api_key','token'],
  purge                       => true,
  safe_mode                   => true,

  require                     => Package[$server_packages],

  client_custom               =>
  {
    kibana_url       => $kibana_url,
    grafana_url      => $grafana_url,
    type             => $::virtual,
    operating_system => $::lsbdistdescription,
    kernel           => $::kernelrelease,
    puppet_version   => $::puppetversion,

    gitlab_health    =>
    {
      token => $gitlab_health_token,
    },
    ldap_sensu       =>
    {
      password => $sensu_monitoring_password,
    },
    gitlab_issues    =>
    {
      token => $gitlab_issues_token,
    },
    assignments_health =>
    {
      token => $assignments_health_token,
    }
  }
}

You can read about most parameters in the docs. Here are some general hints:

  • api_bind: I bind to the machine so everything needs to be proxied (e.g. with Apache or Nginx).
  • rabbitmq_reconnect_on_error, redis_reconnect_on_error, redis_auto_reconnect: I want my deployment to be potentially self-healing.
  • redact: I have some additional keywords here that will be redacted in the API output. Please check out the Sensu docs on redaction, it's a great feature.
  • purge: I enable this since I control all changes centrally. [cue Mass Effect 2 taking direct control soundbite]
  • safe_mode: Though it is more work, you probably do not want your hosts to run arbitrary commands.

The client_custom section is where additional attributes are defined. I've already talked about kibana_url and grafana_url. I find that the operating system, the kernel version, the puppet_version and whether the host is virtual or physical are helpful information to display on its dashboard page, so I include these.

The tokens and passwords are written to files on the host, and can then easily be referenced in Sensu commands using e.g. :::gitlab_health.token:::.
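If the token syntax is new to you, the following Python sketch illustrates the idea of the substitution: a dotted path between triple colons is looked up in the client's attributes, optionally with a default after a pipe character. This is only an illustration of the mechanism, not Sensu's actual implementation, and the command name check-health.rb is hypothetical.

```python
import re

# :::dotted.path::: or :::dotted.path|default:::
TOKEN = re.compile(r":::([^:|]+)(?:\|([^:]*))?:::")


def substitute_tokens(command, client):
    """Replace :::dotted.path::: tokens with values from the client attributes."""

    def lookup(match):
        path, default = match.group(1), match.group(2)
        value = client
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                if default is None:
                    raise KeyError(f"unmatched token: {path}")
                return default
            value = value[key]
        return str(value)

    return TOKEN.sub(lookup, command)


client = {"gitlab_health": {"token": "s3cret"}}
print(substitute_tokens("check-health.rb --token :::gitlab_health.token:::", client))
# prints: check-health.rb --token s3cret
```

Because the secret lives in the client attributes and not in the check definition, the check itself can be shared without leaking the token.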

sensu-server: uchiwa

class
{
  '::uchiwa':
  install_repo => false,
  host         => '127.0.0.1',
  require      => Class['::sensu'],
}

I run Uchiwa, the dashboard for Sensu on the same machine and have it proxied. Note that this requires the yelp/uchiwa Puppet module.

sensu-server: includes

  # sensu server specific checks
  include services::sensu::core

  # include all checks here, so that the master has all in order to run
  # with safe_mode => true

  # subscription: proxy
  include services::sensu::imap
  include services::sensu::certificates
  include services::sensu::client_specific
  include services::sensu::api_health
  include services::sensu::availability
  include services::sensu::remote_metrics

  # automatic subscriptions
  include services::sensu::nvidia
  include services::sensu::physical
  include services::sensu::systemd
  include services::sensu::virtual

  # last part is subscription name
  include services::sensu::elasticsearch
  include services::sensu::fail2ban
  include services::sensu::kibana
  include services::sensu::ldap
  include services::sensu::mailman
  include services::sensu::logstash
  include services::sensu::seafile
  include services::sensu::seahub

  # include handler definitions
  include services::sensu::handlers
}

Since I'm using safe_mode, the Sensu server needs to have every single check that should be run. I include them here, manually.

Structuring your checks into neatly partitioned and readable files is a daunting task. I've tried to do it the following way: There is one file that holds checks that are common (core). I've grouped all proxy subscription checks into one block, automatic subscriptions into the second block and files that are automatically included based on the content of the subscriptions array that the class receives in the third block. Handler definitions also get their own file (handlers) since they get unwieldy even with only a few handlers.

sensu-client: subscriptions

# SENSU CLIENT
else
{
  # default client configuration
  $combined_subscriptions = unique(concat($subscriptions, $automatic_subscriptions))

  # default include checks and metrics
  include services::sensu::core
  include services::sensu::client_specific

  # automatically include checks for subscriptions
  services::sensu::combined_subscriptions{$combined_subscriptions:}

Similar to the server, the client gets a combination of the manual subscriptions and the automatic_subscriptions. Then the core checks and metrics are included, as well as any client_specific ones. After that, I include Puppet classes automatically based on combined_subscriptions. For your convenience I'll include this Puppet hack.

## combined_subscriptions.pp
# Define: services::sensu::combined_subscriptions
# use a define to dynamically include classes with checks

define services::sensu::combined_subscriptions
{
  include "services::sensu::${name}"
}
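For anyone who finds the define hard to read, this Python sketch shows what it boils down to: merge the lists, drop duplicates and derive one class name per subscription. The function names are invented for illustration, and I'm assuming that duplicates are dropped while keeping the first occurrence in place.

```python
def combined_subscriptions(manual, automatic, is_main_server=False):
    """Concatenate the lists and drop duplicates, keeping first-seen order."""
    prefix = ["proxy"] if is_main_server else []
    return list(dict.fromkeys(prefix + manual + automatic))


def classes_for(subscriptions):
    """One Puppet class name per subscription, as the define would include."""
    return [f"services::sensu::{name}" for name in subscriptions]


subs = combined_subscriptions(["ldap", "virtual"], ["virtual", "client_specific"])
print(classes_for(subs))
# prints: ['services::sensu::ldap', 'services::sensu::virtual', 'services::sensu::client_specific']
```

The practical consequence is that every subscription name needs a class of the same name in the services::sensu namespace.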

sensu-client: keepalive configuration

# if the client is not consistently connected, warn after 2 weeks
# and throw a critical error after 4 weeks
# something will be wrong, outdated or the client can be removed
if ($consistent_connection == false)
{
  $client_keepalive =
  {
    thresholds  =>
    {
      warning  => 1209600,
      critical => 2419200,
    },
    handlers    => ['default', 'mail', 'mattermost'],
    runbook     => "${runbook_prefix}/keepalive.markdown",
    occurrences => $keepalive_occurrences,
    refresh     => $keepalive_refresh,
    impact      => $keepalive_impact,
    suggestion  => $keepalive_suggestion,
  }
}
else
{
  $client_keepalive =
  {
    handlers    => ['default', 'mail', 'mattermost'],
    runbook     => "${runbook_prefix}/keepalive.markdown",
    occurrences => $keepalive_occurrences,
    refresh     => $keepalive_refresh,
    impact      => $keepalive_impact,
    suggestion  => $keepalive_suggestion,
  }
}

The configuration for keepalive events is part of the client attributes, not a separate check. If I set consistent_connection to false, it will take some weeks until I am notified of a "missing" device. Filters are configured via occurrences and refresh. The Sensu developers wrote a helpful blog post on that. Again, if you have a new enough version of Sensu, you should not need this.

As you can see, the "check" also has a runbook, an impact description and an operator suggestion defined to make manual intervention very easy.
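The threshold values are plain seconds, which are not exactly readable at a glance. A quick Python check confirms that they match the "2 weeks" and "4 weeks" from the comment:

```python
from datetime import timedelta

# keepalive thresholds for hosts without a consistent connection
warning = int(timedelta(weeks=2).total_seconds())
critical = int(timedelta(weeks=4).total_seconds())

# $keepalive_refresh from the configuration section, as a duration
refresh = timedelta(seconds=3600)

print(warning, critical, refresh)
# prints: 1209600 2419200 1:00:00
```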

sensu-client: configuration

This is the part where the sensu-puppet module is configured by my class.

  class
  {
    '::sensu':
    rabbitmq_password           => $rabbitmq_password,
    rabbitmq_host               => 'REDACTED',
    rabbitmq_port               => '5671',
    server                      => false,
    api                         => false,
    client                      => true,
    client_keepalive            => $client_keepalive,
    subscriptions               => $combined_subscriptions,
    rabbitmq_ssl                => true,
    rabbitmq_ssl_private_key    => 'puppet:///modules/services/sensu/client-key.pem',
    rabbitmq_ssl_cert_chain     => 'puppet:///modules/services/sensu/client-cert.pem',
    use_embedded_ruby           => true,
    rabbitmq_reconnect_on_error => true,
    purge                       => true,
    safe_mode                   => true,

    require                     => Package['ruby-json'],

    client_custom               =>
    {
      kibana_url       => $kibana_url,
      grafana_url      => $grafana_url,
      type             => $::virtual,
      operating_system => $::lsbdistdescription,
      kernel           => $::kernelrelease,
      puppet_version   => $::puppetversion,
    },
  }
}

There is nothing especially fancy here except client_keepalive, which gets filled with the values from a previous section. Everything else either comes straight from the docs or was already explained earlier.

Of note: rabbitmq_ssl_private_key and rabbitmq_ssl_cert_chain are the same for every host. This is an (unfortunate) implementation detail which allows only one certificate for the whole Sensu transport deployment. I think I would have liked to piggyback onto Puppet's certificates if possible, but I am quite aware that this is neither good in terms of compartmentalization nor good design.

common

  package
  {
    $plugins:
    ensure   => installed,
    provider => 'sensu_gem',
  }

  file
  {
    '/etc/sudoers.d/sensu':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0440',
    source  => 'puppet:///modules/services/sensu/sudoers.d',
    require => Package['sudo'],
  }

  # all nodes need development dependencies for native extensions

  $client_packages = ['g++', 'make', 'ruby-json', 'sudo']

  Class['apt::update']
  -> Package[$client_packages]

  package
  {
    $client_packages:
    ensure => present,
  }
}

This section applies to both the server and the client. The sensu-plugins listed in $plugins are installed via the sensu_gem provider. Some checks I use with Sensu require sudo rights, so I distribute a customized sudoers file directly into /etc/sudoers.d/ which whitelists some commands for Sensu.

Since Ruby gems often try to build native extensions during installation, every host needs development tools.

As a last little detail, I make sure packages are only installed after an apt-get update run. I think I added this because I was often testing my setup in a Docker container via GitLab's CI feature. It is good practice to keep a container as small as possible, so image authors delete the cached apt package lists, and apt-get install PACKAGE then fails unless apt-get update has run first.
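
The same ordering can also be written with the require metaparameter instead of a chaining arrow. This is just an equivalent sketch, assuming (as the class above does) that Class['apt::update'] is part of the catalog:

```puppet
# Equivalent to: Class['apt::update'] -> Package[$client_packages]
package
{
  $client_packages:
  ensure  => present,
  # only install once the apt package lists have been refreshed
  require => Class['apt::update'],
}
```

The arrow keeps the ordering visible as a separate statement, while require keeps it next to the resource it affects. Both end up as the same dependency in the catalog.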

Minimal example

Alright, so now I've written quite a bit about this specific class, but how would one actually use all of this? Let's see a minimal working example.

# site.pp
node 'myhostname.mydomain.com'
{
  include services::sensu
}

If you wanted to add additional (previously implemented) subscriptions, you would use something like this:

# site.pp
node 'example.domain.com'
{
  class{ 'services::sensu': subscriptions => ['fail2ban', 'ldap']}
}
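
The consistent_connection parameter is set the same way. As a sketch, a host that is only online occasionally could be declared like this, which switches its keepalive to the two and four week thresholds. The node name is made up for illustration.

```puppet
# site.pp
# 'laptop.domain.com' is a hypothetical node name
node 'laptop.domain.com'
{
  class
  {
    'services::sensu':
    consistent_connection => false,
    subscriptions         => ['fail2ban'],
  }
}
```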

sensu.pp

# Class: services::sensu
# Manages configuration, checks, handlers and certs for
# the sensu monitoring system
#
# parameters:
# (bool) is_main_server: makes this server the main host on which sensu is run
# (bool) consistent_connection: if set to `false`, enables high-value timeouts
#        for sensu keepalive checks
# (array) subscriptions: the check groups a host should subscribe to

class services::sensu($is_main_server = false,
                      $consistent_connection = true,
                      $subscriptions = [])
{
  # configuration
  $rabbitmq_password = 'REDACTED'
  $gitlab_health_token = 'REDACTED'
  $gitlab_issues_token = 'REDACTED'
  $assignments_health_token = 'REDACTED'
  $sensu_monitoring_password = 'REDACTED'

  # installed sensu plugins
  $plugins = ['sensu-plugins-cpu-checks',
              'sensu-plugins-disk-checks',
              'sensu-plugins-environmental-checks',
              'sensu-plugins-filesystem-checks',
              'sensu-plugins-http',
              'sensu-plugins-load-checks',
              'sensu-plugins-memory-checks',
              'sensu-plugins-network-checks',
              'sensu-plugins-nvidia',
              'sensu-plugins-ntp',
              'sensu-plugins-postfix',
              'sensu-plugins-process-checks',
              'sensu-plugins-puppet',
              'sensu-plugins-raid-checks',
              'sensu-plugins-uptime-checks']

  # kibana URL - allows clicking to jump to filtered log results
  $kibana_url = "https://REDACTED/#/discover?_g=()&_a=(columns:!(_source),interval:auto,query:(query_string:(analyze_wildcard:!t,query:'host:${::hostname}')),sort:!('@timestamp',desc),index:%5Blogstash-%5DYYYY.MM.DD)#"
  # grafana URL - allows clicking to jump to filtered metrics
  $grafana_url = "https://REDACTED/dashboard/db/single-host-overview?var-hostname=${::hostname}"
  # runbook prefix - allows linking directly to a proposed solution
  $runbook_prefix = 'https://REDACTED/administrators/documentation/blob/master/runbooks/sensu'

  # how many times should keepalive fire before notifications
  $keepalive_occurrences = '1'
  # how much time needs to pass until keepalive notification is repeated (in seconds)
  $keepalive_refresh = '3600'
  # impact text for keepalive
  $keepalive_impact = 'Host is not checking in with monitoring and may be completely unavailable.'
  # suggestion text for keepalive
  $keepalive_suggestion = 'Check if the host is frozen, stuck, down or offline.'

  # automatic subscriptions computed from machine properties
  if (str2bool($::is_virtual) == true)
  {
    $machine_type = ['virtual']
  }
  else
  {
    $machine_type = ['physical']
  }

  if (str2bool($::has_nvidia_graphics_card) == true and str2bool($::using_nouveau_driver) == false)
  {
    $gpu = ['nvidia']
  }
  else
  {
    $gpu = []
  }

  if (($::operatingsystem == 'Ubuntu' and versioncmp($::operatingsystemrelease, '16.04') >= 0) or
      ($::operatingsystem == 'Debian' and versioncmp($::operatingsystemrelease, '8.0') >= 0))
  {
    $systemd_enabled = ['systemd']
  }
  else
  {
    $systemd_enabled = []
  }

  $automatic_subscriptions = concat($machine_type, $gpu, $systemd_enabled, ['client_specific'])

  # template variables (must be in class scope)
  $default_scheme = 'sensu.host.$(hostname)'
  $metrics_handler = ['graphite_tcp']
  $timestamp = '`date +%s`'


  # SENSU SERVER
  if ($is_main_server == true)
  {
    $combined_subscriptions = unique(concat(['proxy'], $subscriptions, $automatic_subscriptions))

    $server_packages = ['redis-server', 'curl', 'jq']

    $server_plugins = [ 'sensu-plugins-imap',
                        'sensu-plugins-slack',
                        'sensu-plugins-ssl',
                        'sensu-extensions-occurrences']

    # install server-only packages
    package
    {
      $server_packages:
      ensure => present,
    }

    # install plugins for proxy group

    package
    {
      $server_plugins:
      ensure   => present,
      provider => 'sensu_gem',
      require  => Package[$server_packages],
    }


    # Workaround for sensu-api not subscribing to check updates.
    Class['::sensu::client::service'] ~> Class['::sensu::api::service']

    class
    {
      '::sensu':
      rabbitmq_password           => $rabbitmq_password,
      server                      => true,
      client                      => true,
      api                         => true,
      api_bind                    => '127.0.0.1',
      use_embedded_ruby           => true,
      rabbitmq_reconnect_on_error => true,
      redis_reconnect_on_error    => true,
      redis_auto_reconnect        => true,
      subscriptions               => $combined_subscriptions,
      rabbitmq_host               => '127.0.0.1',
      redis_host                  => '127.0.0.1',
      redact                      => ['password', 'pass', 'api_key','token'],
      purge                       => true,
      safe_mode                   => true,

      require                     => Package[$server_packages],

      client_custom               =>
      {
        kibana_url       => $kibana_url,
        grafana_url      => $grafana_url,
        type             => $::virtual,
        operating_system => $::lsbdistdescription,
        kernel           => $::kernelrelease,
        puppet_version   => $::puppetversion,

        gitlab_health    =>
        {
          token => $gitlab_health_token,
        },
        ldap_sensu       =>
        {
          password => $sensu_monitoring_password,
        },
        gitlab_issues    =>
        {
          token => $gitlab_issues_token,
        },
        assignments_health =>
        {
          token => $assignments_health_token,
        }
      }
    }

    class
    {
      '::uchiwa':
      install_repo => false,
      host         => '127.0.0.1',
      require      => Class['::sensu'],
    }

    # sensu server specific checks
    include services::sensu::core

    # include all checks here so that the master has every definition
    # it needs in order to run with safe_mode => true

    # subscription: proxy
    include services::sensu::imap
    include services::sensu::certificates
    include services::sensu::client_specific
    include services::sensu::api_health
    include services::sensu::availability
    include services::sensu::remote_metrics

    # automatic subscriptions
    include services::sensu::nvidia
    include services::sensu::physical
    include services::sensu::systemd
    include services::sensu::virtual

    # last part is subscription name
    include services::sensu::elasticsearch
    include services::sensu::fail2ban
    include services::sensu::kibana
    include services::sensu::ldap
    include services::sensu::mailman
    include services::sensu::logstash
    include services::sensu::seafile
    include services::sensu::seahub

    # include handler definitions
    include services::sensu::handlers
  }

  # SENSU CLIENT
  else
  {
    # default client configuration
    $combined_subscriptions = unique(concat($subscriptions, $automatic_subscriptions))

    # default include checks and metrics
    include services::sensu::core
    include services::sensu::client_specific

    # automatically include checks for subscriptions
    services::sensu::combined_subscriptions{$combined_subscriptions:}

    # if the client is not consistently connected, warn after 2 weeks
    # and throw a critical error after 4 weeks; by then something is
    # wrong or outdated, or the client can be removed
    if ($consistent_connection == false)
    {
      $client_keepalive =
      {
        thresholds  =>
        {
          warning  => 1209600,
          critical => 2419200,
        },
        handlers    => ['default', 'mail', 'mattermost'],
        runbook     => "${runbook_prefix}/keepalive.markdown",
        occurrences => $keepalive_occurrences,
        refresh     => $keepalive_refresh,
        impact      => $keepalive_impact,
        suggestion  => $keepalive_suggestion,
      }
    }
    else
    {
      $client_keepalive =
      {
        handlers    => ['default', 'mail', 'mattermost'],
        runbook     => "${runbook_prefix}/keepalive.markdown",
        occurrences => $keepalive_occurrences,
        refresh     => $keepalive_refresh,
        impact      => $keepalive_impact,
        suggestion  => $keepalive_suggestion,
      }
    }

    class
    {
      '::sensu':
      rabbitmq_password           => $rabbitmq_password,
      rabbitmq_host               => 'REDACTED',
      rabbitmq_port               => '5671',
      server                      => false,
      api                         => false,
      client                      => true,
      client_keepalive            => $client_keepalive,
      subscriptions               => $combined_subscriptions,
      rabbitmq_ssl                => true,
      rabbitmq_ssl_private_key    => 'puppet:///modules/services/sensu/client-key.pem',
      rabbitmq_ssl_cert_chain     => 'puppet:///modules/services/sensu/client-cert.pem',
      use_embedded_ruby           => true,
      rabbitmq_reconnect_on_error => true,
      purge                       => true,
      safe_mode                   => true,

      require                     => Package['ruby-json'],

      client_custom               =>
      {
        kibana_url       => $kibana_url,
        grafana_url      => $grafana_url,
        type             => $::virtual,
        operating_system => $::lsbdistdescription,
        kernel           => $::kernelrelease,
        puppet_version   => $::puppetversion,
      },
    }
  }

  package
  {
    $plugins:
    ensure   => installed,
    provider => 'sensu_gem',
  }

  file
  {
    '/etc/sudoers.d/sensu':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0440',
    source  => 'puppet:///modules/services/sensu/sudoers.d',
    require => Package['sudo'],
  }

  # all nodes need development dependencies for native extensions

  $client_packages = ['g++', 'make', 'ruby-json', 'sudo']

  Class['apt::update']
  -> Package[$client_packages]

  package
  {
    $client_packages:
    ensure => present,
  }
}