Windows git pre-commit hook for puppet development

For a recent client I was asked to provide a puppet development setup on the Windows platform. The toolchain included Git for Windows, Git Bash, TortoiseGit, Atom and some others. It also includes the Windows adaptation of my git pre-commit hook.

To use this hook, you need to install the Puppet v4 agent and a couple of ruby gems within the provided ruby environment. Something like:

"c:\program files\puppetlabs\puppet\sys\ruby\bin\gem.bat" install r10k
"c:\program files\puppetlabs\puppet\sys\ruby\bin\gem.bat" install puppet-lint

(from an elevated command prompt)

The hook needs to be saved in every git repository that contains puppet code, as [repo]/.git/hooks/pre-commit

Warning: the Windows linefeed conversion is really recent and might need some additional testing.
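Before trusting the hook, it is easy to check by hand which files carry Windows line endings. A quick sketch from Git Bash; the demo file is just an example:

```shell
# Create a demo file with CRLF line endings
printf 'class demo {\r\n}\r\n' > /tmp/crlf-demo.pp

# $'\r' is bash ANSI-C quoting for a carriage return;
# grep -l lists the files that contain one
grep -l $'\r' /tmp/crlf-demo.pp
```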


#!/bin/bash
# Requires bash, as it uses the [[ ]] syntax.
#
# https://puppetlabs.com/blog/using-puppet-lint-to-save-yourself-from-style-faux-pas
# https://docs.puppetlabs.com/guides/templating.html#syntax-checking
#
# If it's puppet code, lint it up.
# 20150915 syncaddict
# - Added support for erb syntax checking
#
# 20151020 syncaddict
# - Added support for YAML syntax checking
# - more verbose operation
#
# 20170615 syncaddict 
# - version that works on windows
#
# 20170809 syncaddict
# - detect / convert windows linefeeds
#
# Variables go hither


PUPPETLINT="/c/Progra~1/Puppet~1/Puppet/sys/ruby/bin/puppet-lint.bat"
PUPPETAGENT="/c/Progra~1/Puppet~1/Puppet/bin/puppet.bat"
ERB="/c/Progra~1/Puppet~1/Puppet/sys/ruby/bin/erb.bat"
RUBY="/c/Progra~1/Puppet~1/Puppet/sys/ruby/bin/ruby.exe"

DOS2UNIX=/usr/bin/dos2unix
GREP=/usr/bin/grep
WC=/usr/bin/wc

declare -a FILES

IFS="
"
FILES=( $(git diff --cached --name-only --diff-filter=ACM) )

for file in "${FILES[@]}"
do

   ## replace windows linefeeds on all changed files - WIP
   LFCHECK=$($GREP -c $'\r$' "$file")
   if [[ $? -gt 1 ]]; then echo "LF check failed on $file"; fi

   if [[ $LFCHECK -gt 0 ]];
   then
     $DOS2UNIX "$file"
     echo "Converted linefeeds on $file, please re-add and retry your commit"
     exit 1
   fi

  case $file in
    *.pp)
      echo "Checking puppet file $file"
      $PUPPETLINT --no-puppet_url_without_modules-check --no-arrow_on_right_operand_line-check --no-140chars-check --fail-on-warnings --fix --with-filename "$file"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi

      $PUPPETAGENT parser validate "$file"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *.erb)
      echo "Checking erb template $file"
      $ERB -P -x -T '-' "$file" | $RUBY -c
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *.yaml)
      echo "Checking yaml file $file"
      $RUBY -e "require 'yaml'; YAML.load_file('$file')"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *)
      echo "Not checking file $file"
    ;;
  esac
done

exit 0

Running encrypted backups with duplicity

This is just a short note on my experiences running backups with Duplicity.

Duplicity is an open source package that allows you to do incremental backups, complete with proper indexing, to remote storage. This can be modern 'cloud' storage like S3, but I prefer to run it over a simple SSH link.

Next to properly working incremental backups, it also provides data security by using GPG to encrypt the data. And it has a lot of the features you would expect: configurable full dump cycles, purging of old backups. There is a Windows / C# implementation too (haven't tried it though).

The only thing lacking may be deduplication, which is kinda hard given that all data is encrypted.

It took me some time to get all the parameters right, but after some initial fiddling, I wrapped it all in some puppet code that gets deployed to all new machines / nodes.
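For reference, the shape of the duplicity invocations I ended up wrapping looks roughly like the sketch below. The host, paths and GPG key ID are made up, and it is a dry run by default (the commands are echoed rather than executed):

```shell
# Dry run by default: set RUN="" to actually execute the commands.
RUN=${RUN:-echo}

SRC=/etc                                              # what to back up (example)
TARGET="sftp://backup@backuphost.example.com//srv/backups/$(hostname -s)"
GPG_KEY=DEADBEEF                                      # hypothetical GPG key ID

# Incremental backup over SSH, starting a fresh full chain every 30 days
$RUN duplicity --encrypt-key "$GPG_KEY" --full-if-older-than 30D "$SRC" "$TARGET"

# Keep the last 3 full chains, purge everything older
$RUN duplicity remove-all-but-n-full 3 --force "$TARGET"

# Restore test: recover a single file from the latest backup
$RUN duplicity restore --file-to-restore hosts "$TARGET" /tmp/hosts-restored
```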

So every new machine is backed up automagically using duplicity, simply by applying my basic puppet profile to the host.

I also did an extensive restore test during the implementation phase, which went fine.

A highly recommended, little-known tool in some dark corner of the Internet: http://duplicity.nongnu.org/ . Don't let the HTML 1.0 web design turn you off; this tool is maintained and stable.

Puppet – Server Error: hiera configuration version 3 cannot be used in an environment

[Screenshot of failing puppet-agent run]

Today I found my homelab puppet setup failing. I mostly run the latest released puppet code on my setup, and this time it bit me in the ass:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: hiera configuration version 3 cannot be used in an environment on node buzz.maljaars-it.nl
 Warning: Not using cache on failed catalog
 Error: Could not retrieve catalog; skipping run

When looking into the issue, I found that my setup had automatically been upgraded to puppet version 4.9. This version supports a newer hiera config file version, apparently without maintaining backward compatibility.

Looking for documentation on the new format, it seems to be missing at this time. Searching the history on the puppet community slack seems to confirm this.

For now I downgraded the puppet-agent package on the puppetserver machine and pinned it to the previous release. After restarting the puppetserver service, everything was fine again.
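On Debian/Ubuntu style systems, pinning can be done with an apt preferences file. A sketch; the version pattern is hypothetical, match it to whatever release you downgraded to:

```
# /etc/apt/preferences.d/puppet-agent
Package: puppet-agent
Pin: version 1.8.*
Pin-Priority: 1001
```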

More info on this issue and other new features of puppet 4.9 : https://docs.puppet.com/puppet/4.9/release_notes.html

Validating puppet code changes using octocatalog-diff

During PuppetConf 2016, GitHub announced it was releasing one of its internal test tools, called octocatalog-diff, as open source. What the tool basically does is compile the catalog for a certain machine using both your old and your new puppet code, showing you the diff output. This allows you to see the impact of your code without actually deploying and running it on puppet clients.

As this could potentially save me a lot of time, I decided to delve into the tool to see what it could bring me. This post describes the process of setting the tool up and the surprises I came across.

Environment

For my initial testing I used a vagrant provisioned box running Ubuntu 16.04. Next to that I installed puppet agent 3.8.7 from the standard Ubuntu repositories.

Preparation

The following gems were needed during my trials. Just run 'sudo gem install [GEMNAME]' to get them on your system. When using Puppet v4, take note to use the gem binary in /opt/puppetlabs/puppet/bin/gem:

  • hiera-eyaml
  • r10k
  • bundler
  • rspec

The tool can query the puppetdb instance of your setup, or you can use facts in YAML format for the node(s) you are testing. As puppetdb did not work for me at this point, I copied the contents of /var/lib/puppet/yaml/facts from my puppetmaster to my local test environment.

The tool also needs some development utilities upon install. Get them using your system's package manager:

  • cmake
  • pkg-config
  • ruby-dev

Installing the tool

After trying the gem, I installed the source version following the instructions on https://github.com/github/octocatalog-diff/blob/master/doc/installation.md . The rake test gave me a couple of errors that I reported.

Setting up

There were three files I added to my puppet repo:

  • A hiera.yaml configuration valid for my setup
  • A bootstrap script to run r10k on my Puppetfile and to symlink local modules to the common modules dir (see below)
  • A .octocatalog-diff.cfg.rb configured as per the instructions on github. Please note that the current example file has a key settings[:hiera_yaml_file] that should read settings[:hiera_config]. (Github issue)

Running the tool

/vagrant/octocatalog-diff/bin/octocatalog-diff -f production -t [your current branch] -n [NODENAME] --fact-file /vagrant/puppetdb_facts/[NODENAME].yaml --to-fact-override vagrant_puppetrole=nagiosserver --from-fact-override vagrant_puppetrole=nagiosserver --bootstrap-script repo-bootstrap.sh

Installation paths may vary. Note: the fact override is my way to assign a role to a host without a real ENC / Foreman. This value is used by a small piece of code in my site.pp:

node default {
  ## This is a small hook to support local vagrant
  ## development. This special var get set as part
  ## of the vagrant provisioning process.
  if $vagrant_puppetrole != undef {
    class { "roles::${vagrant_puppetrole}": }
  }
}

When removing a single package from my ‘baseline’ it resulted in the expected output:


diff production/NODENAME fea-puppetv4/NODENAME
*******************************************
  Package[tcpdump] =>
   parameters =>
     ensure =>
      - present
      + absent
*******************************************

Contents of repo-bootstrap.sh:

r10k puppetfile install Puppetfile
for i in site/*; do
  BLA=$(basename "$i")
  ln -s "../site/$BLA" "modules/$BLA"
done

Final notes

When trying to use the tool, I started out using Puppet 4.8. I ran into some trouble with the puppetlabs firewall module 1.8.1 ([Puppet Error] Could not autoload puppet/provider/firewall/iptables: undefined method `value' for nil:NilClass). As soon as I downgraded to puppet 3.8.7, the firewall module stopped producing errors. Not sure if this is related to the puppet version or the combination with octocatalog-diff.

I will be using the tool over the coming weeks. The next step could be integrating it into the build pipeline(s).


Setting up puppet provisioned nagios monitoring

[Screenshot of nagios monitoring]

Puppet has had native nagios resource types for quite some time. As both a nagios and a puppet fan, I really liked the idea of not setting up any monitoring by hand, but having some base level of monitoring on every managed system automatically. Deploying new systems involves a lot of steps, and forgetting to set up proper monitoring is something a lot of clients run into one day or another.

Setting up nagios checks can involve using exported resources. Coupled with role/profile based puppet classes it allows for very specific tests for very specific applications.

What I did find is that using large amounts of exported resources can really slow down the puppet run on the nagios monitoring server. Even to the point of timing out and/or killing the puppetdb instance.

As a general rule, I've used a setup where very specific tests were node-bound, but more generic checks were hostgroup-bound. This lowers the number of exported resources that need to be collected and thus prevents overload / timeouts on the nagios monitoring node.

This post assumes working knowledge of both puppet and nagios.

Generic code samples

Basic setup of the puppet code for the nagios monitoring server

Collecting exported resources within the current puppet environment and purging any unmanaged nagios host and service resources:

  resources { 'nagios_host': purge => true, }
  resources { 'nagios_service': purge => true, }
    
  Nagios_host <<| tag == $environment |>>
  Nagios_service <<| tag == $environment |>> { notify => Exec['reload_nagios_config'], }

  exec { 'reload_nagios_config':
    command     => '/sbin/service nagios reload',
    refreshonly => true
  }

Basic plugin / check definition for all hosts

Just a single example. Please note that for this to work, you will also need to manage the nagios monitoring client nrpe, but this is properly documented in the puppet module docs. As the check is bound to a hostgroup, it does not need to be exported.

  nagios_service { "${::fqdn}-disk-root":
    hostgroup_name      => 'puppet-managed-hosts',
    check_command       => "check_nrpe!check_disk!-a '-w 10% -c 5% -p /'",
    check_interval      => '5',
    retry_interval      => '1',
    contact_groups      => 'my_contact_group',
    service_description => 'Disk /',
    use                 => 'generic-service',
    tag                 => $environment,
    notification_period => 'none',
    check_period        => 'timeperiod_24x7',
  }
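For the check_nrpe command above to work, the monitored host needs a matching command definition on the NRPE side. A fragment of what that could look like; the plugin path varies per distribution, and passing arguments from the server requires dont_blame_nrpe:

```
# /etc/nagios/nrpe.cfg on the monitored host
dont_blame_nrpe=1
command[check_disk]=/usr/lib64/nagios/plugins/check_disk $ARG1$
```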

For every managed host we have an exported resource in a profile we apply to all nodes (our 'base' profile):

    @@nagios_host { "${::fqdn}":
      host_name           => "${::fqdn}",
      alias               => "${::fqdn}",
      address             => $::ipaddress,
      hostgroups          => 'puppet-managed-hosts',
      check_interval      => '5',
      retry_interval      => '1',
      use                 => 'generic-host',
      contact_groups      => 'my_contact_group',
      tag                 => [$environment, 'node-specific'],
      max_check_attempts  => '10',
      notification_period => 'none',
      check_period        => 'timeperiod_24x7',
    }

Role specific checks

Specific check for a certain role. The check below would be part of my client's puppet profile managing their Nexus instance:

  @@nagios_service { "${::fqdn}-nexus-webservice":
    host_name           => "${::fqdn}",
    check_command       => "check_http!-a '-I ${::ipaddress} -H ${svcname} -S -u /index.html '",
    check_interval      => '5',
    retry_interval      => '1',
    contact_groups      => 'my_contact_group',
    service_description => 'Nexus runs on https port 443',
    use                 => 'generic-service',
    tag                 => [$environment, 'node-specific'],
    notification_period => 'none',
    check_period        => 'timeperiod_24x7',
  }


git pre-commit hook for puppet, erb and yaml files

A pre-commit hook is a great way to run custom actions before handing over your work to Git (and thus your CI tool chain). For puppet-related Git repositories I've assembled a pre-commit hook that checks the puppet code, changed ERB templates and any changed YAML files for formatting errors.

All not rocket science when put together, but a great time saver nonetheless. I never fly without it these days. Just paste the code block below into your [GITREPO]/.git/hooks/pre-commit and make sure the result is executable for your own account. The code is just bits that I pieced together.

There are a few dependencies that you might have to install depending on your platform.

#!/bin/bash
# Requires bash, as it uses the [[ ]] syntax.
#
# https://puppetlabs.com/blog/using-puppet-lint-to-save-yourself-from-style-faux-pas
# https://docs.puppetlabs.com/guides/templating.html#syntax-checking
#
# If it's puppet code, lint it up.

# If we don't have puppet-lint, just exit and leave them be.
which puppet-lint >/dev/null 2>&1 || exit

# 20150915 syncaddict
# - Added support for erb syntax checking
#
# 20151020 syncaddict
# - Added support for YAML syntax checking
# - more verbose operation
#
# Variables go hither

declare -a FILES
IFS="
"
FILES=( $(git diff --cached --name-only --diff-filter=ACM) )

for file in "${FILES[@]}"
do
  case $file in
    *.pp)
      echo "Checking puppet file $file"
      puppet-lint --no-puppet_url_without_modules-check --no-80chars-check --fix --with-filename "$file"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi

      puppet parser validate "$file"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *.erb)
      echo "Checking erb template $file"
      erb -P -x -T '-' "$file" | ruby -c
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *.yaml)
      echo "Checking yaml file $file"
      ruby -e "require 'yaml'; YAML.load_file('$file')"
      RC=$?
      if [ $RC -ne 0 ]; then exit $RC;fi
    ;;
    *)
      echo "Not checking file $file"
    ;;
  esac
done

exit 0


Speeding up puppet runs by using checksums when running execs

During my integration work for a client, I was running third party puppet code to integrate automatically deployed application containers with the third party deployment tool (the deployit / XL Deploy module for puppet).

This module was written to run an exec for three different actions per container, all resulting in running Jython code in a local Java JVM. It would result in a couple of API / REST calls on the central deployment server to register this particular container.

Running this code every 30 minutes on 40+ hosts was dead slow, and it was pounding the central server receiving the calls. All while the properties being transferred had not changed and did not need any resending. Ugh 🙁

Kinda hoping that the vendor would come up with something smart, I left this situation in place for some time. But about a week ago I found the time to create an alternative. Instead of using the third party module, I wrote a shell script that parses a puppet-managed properties file.

The script starts by reading all the properties and checksumming the concatenated result. If the sha256 sum has not changed since the last run, the script ends there. If there is no old checksum, or the checksum does not match, the original Jython-in-a-JVM-doing-a-REST-call routine is run.

Before I started writing this alternative, puppet runs would easily take minutes and minutes, waiting for the execs to finish. Now, with the central server offloaded, the typical puppet run is reduced to 80 seconds when running the execs and 29 seconds when the checksums match.

[root@example ~]# puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/pe_build.rb
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/xldeploy_facts.rb
[snip]
Info: Caching catalog for example.local
Info: Applying configuration version '1426664207'
Notice: /Stage[main]/app1::Tomcat/my_tomcat::Instance[app1-service]/Exec[xldeploy-register-tomcat-app1-service]/returns: executed successfully
Notice: /Stage[main]/app2::Tomcat/my_tomcat::Instance[app2]/Exec[xldeploy-register-tomcat-app2]/returns: executed successfully
Notice: /Stage[main]/app3::Tomcat/my_tomcat::Instance[app3-service]/Exec[xldeploy-register-tomcat-app3-service]/returns: executed successfully
Notice: /Stage[main]/app4::Tomcat/my_tomcat::Instance[app4]/Exec[xldeploy-register-tomcat-app4]/returns: executed successfully
Notice: Finished catalog run in 28.85 seconds
[root@example ~]#

Xldeploy register script

#!/bin/bash
#
#
# This script can be used as an alternative to the slow puppet resources for
# XL Deploy. Use puppet to fill a property file for your XL Deploy container
# instance and call this script with the property file as argument.
#
# When it needs to register stuff in XL Deploy it is still slow as hell
# (the XL Deploy cli is just slow), but what this script does is test if
# the properties have changed since the last run and skip everything when
# nothing seems to need any updating. Should save us tons of time during
# puppet runs AND should save us countless REST calls on the XL Deploy
# instance.
#
# 12-mar-2015
#

BASEDIR=/opt/dummy
VARDIR=/tmp

XLD_HOST=xldeploy.example.com
XLD_USER=admin
XLD_PASS=<%= @xldeploy_admin_password %>

export DEPLOYIT_CLI_HOME=/opt/deployit-cli
export DEPLOYIT_CLI_OPTS="-Xmx512m"

if [[ "X$1" == "X" ]];
then
  echo "Usage $0 [properties file]"
  exit 1;
fi

if [[ ! -r $1 ]];
then
  echo "Unable to read properties file $1"
  exit 2;
else
  EXT_PROPERTIES=$1
fi

function report()
{
  logger "$1"
  echo "$1"
}

function get_property()
{
  KEYWORD=$1
  VALUE=$(/bin/grep "^$KEYWORD" "$EXT_PROPERTIES" | /bin/cut -f1 -d '=' --complement)
  echo $VALUE
}

## run a command using the CLI. If it fails, remove any existing checksum to force
## a retry during the next run of this script.
function run_xld_command()
{
  COMMAND=$1
  /opt/deployit-cli/bin/cli.sh -q -host $XLD_HOST -port 443 -secure -context /deployit/deployit -username $XLD_USER -password $XLD_PASS $COMMAND
  if [[ $? -gt 0 && -f $CHECKSUM ]]; then rm "$CHECKSUM"; fi
}

function create_ci()
{
  CI_ID=$1
  CI_TYPE=$2
  CI_PARAMS=$3
  run_xld_command "-f /opt/deployit-puppet-module/create-ci.py -- $CI_ID $CI_TYPE $CI_PARAMS"
}

function set_tags()
{
  CI_ID=$1
  CI_TAG=$2
  run_xld_command "-f /opt/deployit-puppet-module/set-tags.py -- $CI_ID $CI_TAG"
}

function set_environment()
{
  CI_ID=$1
  CI_ENVIRONMENT=$2
  run_xld_command "-f /opt/deployit-puppet-module/set-envs.py -- $CI_ID $CI_ENVIRONMENT"
}

# The function below checks if anything in the XL Deploy settings for this
# CI has changed since the last run. If anything has changed the checksum on
# disk is updated (and the caller should rerun the command on the XL Deploy
# server to update this CI).
function xlconfig_changed()
{
  CHECK_STRING=$1
  CURRENT_SUM=`echo $CHECK_STRING | /usr/bin/sha256sum | /bin/awk '{ print $1; }'`
  if [[ -r $CHECKSUM ]];
  then
    PREVIOUS_SUM=`cat $CHECKSUM`
  else
    PREVIOUS_SUM="none"
  fi

  if [[ $PREVIOUS_SUM != $CURRENT_SUM ]];
  then
    CHANGED=true
    logger "Checksum $CURRENT_SUM does not match $PREVIOUS_SUM"
    echo $CURRENT_SUM > $CHECKSUM
  else
    CHANGED=false
    logger "Checksum $CURRENT_SUM same as $PREVIOUS_SUM"
  fi

  echo $CHANGED
}


HOST_CI_ID=$(get_property "host.ci.id")
HOST_CI_TYPE=$(get_property "host.ci.type")
HOST_CI_TAG=$(get_property "host.ci.tag")
HOST_CI_PARAMS=$(get_property "host.ci.params")

CONTAINER_CI_ID=$(get_property "container.ci.id")
CONTAINER_CI_TYPE=$(get_property "container.ci.type")
CONTAINER_CI_TAG=$(get_property "container.ci.tag")
CONTAINER_CI_PARAMS=$(get_property "container.ci.params")

CI_ENVIRONMENT=$(get_property "ci.environment")

CHECKSUM=$(get_property "checksum.path")

UPDATE_NEEDED=$(xlconfig_changed "$HOST_CI_ID $HOST_CI_TYPE $HOST_CI_TAG $HOST_CI_PARAMS $CONTAINER_CI_ID $CONTAINER_CI_TYPE $CONTAINER_CI_TAG $CONTAINER_CI_PARAMS $CI_ENVIRONMENT")

if [[ $UPDATE_NEEDED == "true" ]];
then
  report "Running the java/python code to update this CI in the XL Deploy server"
  create_ci "$HOST_CI_ID" "$HOST_CI_TYPE" "$HOST_CI_PARAMS"
  create_ci "$CONTAINER_CI_ID" "$CONTAINER_CI_TYPE" "$CONTAINER_CI_PARAMS"
  set_tags "$HOST_CI_ID" "$HOST_CI_TAG"
  set_tags "$CONTAINER_CI_ID" "$CONTAINER_CI_TAG"
  set_environment "$HOST_CI_ID" "$CI_ENVIRONMENT"
  set_environment "$CONTAINER_CI_ID" "$CI_ENVIRONMENT"
else
  report "Nothing to do here"
fi

exit 0

Example properties file

# managed by puppet
#
host.ci.id=<%= @host_ci_id %>
host.ci.type=<%= @host_ci_type %>
host.ci.tag=<%= @host_ci_tag %>
host.ci.params=<%= @host_ci_params %>

container.ci.id=<%= @container_ci_id %>
container.ci.type=<%= @container_ci_type %>
container.ci.tag=<%= @container_ci_tag %>
container.ci.params=<%= @container_ci_params %>

ci.environment=<%= @ci_environment %>
checksum.path=<%= @checksum_path %>

This code is probably way too specific to be of any use as is, but maybe the checksumming solution for 'expensive' exec calls during puppet runs is applicable to other scenarios as well.
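Stripped of the XL Deploy specifics, the core of the trick boils down to the sketch below; the state file location and the 'expensive' command are placeholders:

```shell
# Skip-if-unchanged: only do the expensive work when the input differs
# from what we saw during the previous run.
STATE=$(mktemp)                              # normally a fixed path like /tmp/app.sha256
INPUT="property-one=foo property-two=bar"    # normally the concatenated properties

CURRENT=$(echo "$INPUT" | sha256sum | awk '{ print $1 }')
PREVIOUS=$(cat "$STATE" 2>/dev/null)

if [ "$CURRENT" != "$PREVIOUS" ]; then
  echo "input changed, running the expensive command"  # placeholder for the real call
  echo "$CURRENT" > "$STATE"
else
  echo "nothing to do here"
fi
```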