In my home infrastructure I run as many services as possible on a single machine in order to save power. The most constrained resource is memory, so I try to save memory wherever I can. Running a Puppetmaster server to serve 10-15 nodes seems a waste of resources, which is why I decided to go with a stand-alone setup.

Running Puppet in a stand-alone setup is not as daunting as you might think. Documentation and examples are sparse across the Internet, so I would like to show you how I did my setup. The only prerequisite for this configuration is that you have a working Git server. Puppet will be executed automatically using cron.

Overview

Puppet will run in stand-alone mode on all machines. Puppet modules and related files will all be kept in a Git repository, so changes can be deployed quickly across the network. Cron will periodically do a git pull and run Puppet, to ensure compliance and quick rollout. This guide assumes you are familiar with the basics of Linux, Git and Puppet.

Basic Puppet setup

The puppet/ directory will contain all files Puppet needs to run on all systems. This includes the code, what to run where and the configuration of the modules.

Create a repository for use with Puppet
Create the following directories:
- manifests/
- modules/

Create a Puppet config file.

puppet.conf:

[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$confdir/facter

Create a ‘site’ manifest file. Clients will use this file to determine which modules to run.

manifests/site.pp:
```
node default {
  include ntp
}
```
This will include the NTP module. Feel free to add any more modules with additional include statements. The default node will be used by all clients which don’t have a specific entry in the site.pp file. If you want to change the list of modules for a specific client, you can add multiple node blocks:
```
node default {
  include ntp
}
node mail1 {
  include ntp
  include postfix
}
```
This is not the recommended way to add modules to a host. It is much more convenient to use an intermediate module or a pattern such as the role-profile pattern.
Add modules to the modules/ directory. In this example we need to put the NTP module in this directory. Download the module from the Puppet Forge and un-tar it in the modules/ directory.
Commit and check in to the Git repository.

This should give you enough to get a basic setup running. Install Puppet with your favourite package manager and, with the repository cloned to /etc/puppet, run it with:

# /usr/bin/puppet apply /etc/puppet/manifests/site.pp

Tip: You will have to debug any errors before proceeding to the next step.

Cron setup

Cron will be used to periodically pull the repo for any changes and run Puppet if anything changed. Puppet will also run periodically (at a lower rate) to ensure system compliance. Now that we have Puppet running, we might as well use it to make the cron config. Time for our first custom module!

We’ll perform the following functions with Puppet:

- Make sure Git is installed
- Disable the Puppet daemon (we’re running from cron instead)
- Add a Git hook to the repository to run Puppet when a change is pulled
- Add a cron task and schedule it with some randomness to spread the load on the Git/virtualization server

In this example we assume you clone the Git repo to /etc/puppet on the clients.

Create the directory structure for the new module:

mkdir modules/cron-puppet
mkdir modules/cron-puppet/manifests
mkdir modules/cron-puppet/templates
mkdir -p modules/cron-puppet/lib/puppet/parser/functions/

Create the template files:

modules/cron-puppet/templates/cron.erb:

# This file is managed by Puppet (cron-puppet)

# Do a Git pull every 10 minutes
<%= @rand_git_pull %> * * * * root /usr/bin/git -C /etc/puppet pull >/dev/null 2>&1

# Do a Puppet run anyways every hour
<%= @rand_minute %> * * * * root /usr/bin/puppet apply /etc/puppet/manifests/site.pp >/dev/null 2>&1

modules/cron-puppet/templates/post-merge.erb:

#!/bin/bash -e
# This file is managed by Puppet (cron-puppet)
/usr/bin/puppet apply /etc/puppet/manifests/site.pp

Tip: It is always good practice to include the name of the Puppet class that is responsible for a file in the template. That way, it’s easy to find the corresponding class when looking at a file on the system.
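
To preview what the cron template produces without touching a client, the two schedule lines can be rendered with Ruby's standard ERB library. This is only a sketch: the CronPreview class is a hypothetical stand-in for the Puppet class scope, and the sample values are made up.

```ruby
require 'erb'

# Simplified copy of the two schedule lines from cron.erb.
TEMPLATE = <<~'CRON'
  <%= @rand_git_pull %> * * * * root /usr/bin/git -C /etc/puppet pull >/dev/null 2>&1
  <%= @rand_minute %> * * * * root /usr/bin/puppet apply /etc/puppet/manifests/site.pp >/dev/null 2>&1
CRON

# Hypothetical helper: holds the instance variables the template reads,
# standing in for the Puppet class scope. The values are samples only.
class CronPreview
  def initialize(rand_git_pull, rand_minute)
    @rand_git_pull = rand_git_pull
    @rand_minute = rand_minute
  end

  # Render the template against this object's instance variables.
  def render(template)
    ERB.new(template).result(binding)
  end
end

puts CronPreview.new('3,13,23,33,43,53', 17).render(TEMPLATE)
```

The first rendered line starts with the comma-separated minute list for the git pull, the second with the single minute for the hourly Puppet run.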

Add the manifest files:

modules/cron-puppet/manifests/init.pp:

# Note: Puppet 4 and later reject hyphens in class names; use an
# underscore (cron_puppet) there.
class cron-puppet {

  include cron-puppet::last-run

  package { 'git':
    ensure => 'latest',
  }

  # Disable puppet daemon as it only works with a puppetmaster
  service { 'puppet':
    ensure => stopped,
    enable => false,
  }

  # Git hook to run puppet when the repo changes.
  # Only Package['git'] is required: no Package['puppet'] resource is
  # declared in this setup, so requiring it would fail compilation.
  file { 'post-hook':
    ensure  => file,
    path    => '/etc/puppet/.git/hooks/post-merge',
    content => template('cron-puppet/post-merge.erb'),
    mode    => '0755',
    owner   => 'root',
    group   => 'root',
    require => Package['git'],
  }

  # Per-host minute values used by the cron template
  $rand_minute = fqdn_rand(60)
  $rand_git_pull = generateCronTime()
  file { 'puppet-cron':
    ensure  => file,
    path    => '/etc/cron.d/puppet',
    content => template('cron-puppet/cron.erb'),
    mode    => '0644',
    owner   => 'root',
    group   => 'root',
  }
}

The manifest is reasonably self-explanatory. The fqdn_rand() function is built into Puppet; it returns a pseudo-random number from 0 to 59 that is derived from the FQDN of the host, so it stays the same between runs. The generateCronTime() function derives its own offset (0 to 9) from the FQDN in the same way and returns a list of minutes, one every 10 minutes, starting at that offset (so an offset of 2 gives 2,12,22,32,42,52). This is done so we can schedule a git pull every 10 minutes.
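
As a worked example of the schedule, here is a minimal Ruby sketch of the last step of generateCronTime() as listed further down, which steps through the hour in 10-minute increments. The cron_minutes name is hypothetical, and the offset is passed in directly instead of being derived from the FQDN.

```ruby
# Build the comma-separated minute field for a job that runs every
# 10 minutes, shifted by a per-host offset between 0 and 9.
def cron_minutes(offset)
  raise ArgumentError, 'offset must be between 0 and 9' unless (0..9).cover?(offset)

  # 0, 10, 20, 30, 40, 50 plus the offset
  (0..50).step(10).map { |base| base + offset }.join(',')
end

puts cron_minutes(2) # prints 2,12,22,32,42,52
```

Because every host gets a different offset, the pulls are spread over the 10-minute window instead of all hitting the Git server at the same minute.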

The cron-puppet::last-run class will write the time of the last Puppet run to a file, for monitoring purposes.

modules/cron-puppet/manifests/last-run.pp:

class cron-puppet::last-run {
  include stdlib
   
  $time = strftime("%Y-%m-%d %H:%M:%S")
  $timestring = "${time}\n"

  file { '/run/puppet/last_run.txt':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0444',
    content => $timestring,
  }
  }
   
}

Add the extra function we need to generate the cron scheduling times.

modules/cron-puppet/lib/puppet/parser/functions/generateCronTime.rb:

require 'digest/md5'
   
module Puppet::Parser::Functions
  newfunction(:generateCronTime, :type => :rvalue) do |args|
     
    minuteArray = Array.new
    # Generate random offset (from Puppet fqdn_rand.rb)
    seed = Digest::MD5.hexdigest([self['::fqdn'],args].join(':')).hex
    max=10
   
    if defined?(Random) == 'constant' && Random.class == Class
      offset = Random.new(seed).rand(max).to_s
    else
      srand(seed)
      offset = rand(max).to_s
      srand()
      offset
    end
   
    # Generate the 10 minute increments with the offset
    (0..50).step(10) do |n|
      minuteArray.push(n.to_i + offset.to_i)
    end
   
    minuteString=minuteArray.join(",")
    return minuteString.to_s
  end
end

Add the cron-puppet module to the site manifest:

manifests/site.pp:

node default {
  include ntp
  include cron-puppet
}

Run Puppet on the clients:

/usr/bin/puppet apply /etc/puppet/manifests/site.pp

This will apply the cron configuration and Puppet should run automatically now! The last run time can be checked by viewing the /run/puppet/last_run.txt file.
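
For monitoring, a small check can compare the timestamp in that file with the current time. The following is a sketch, not part of the module: the fresh? helper and the two-hour threshold are assumptions, and it assumes the check runs in the same time zone as Puppet.

```ruby
require 'time'

# Return true when the timestamp text is no older than max_age seconds.
# The format matches the strftime() call in the last-run class.
def fresh?(timestamp_text, now: Time.now, max_age: 2 * 60 * 60)
  last_run = Time.strptime(timestamp_text.strip, '%Y-%m-%d %H:%M:%S')
  (now - last_run) <= max_age
end

# Usage on a client:
# puts fresh?(File.read('/run/puppet/last_run.txt')) ? 'OK' : 'STALE'
```

With an hourly Puppet run, a two-hour threshold tolerates one missed run before the node is reported as stale.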


Caveats

For optimal security, clients should have read-only access to the Git repository. This can be done in several ways:
- If you use Git on a file system with ssh access, you could create a different user for access to the repository and use filesystem access controls to make the repo read-only for clients.
- If you serve Git through HTTP, the contents will be read-only automatically (unless you have WebDAV enabled, of course). This is described in the article "Resource friendly Git server".
- Use the access controls of your favorite Git repository tooling (GitLab/Bitbucket).
If you or Puppet somehow manage to break the cron config running the Puppet client, you will have to manually fix it on all nodes.
There is no reporting on the Puppet runs on the network, so you won’t be able to see whether all nodes have converged properly. To work around this, I modified the script so that it takes the Puppet output status and POSTs it to a web server, which stores it in a database.
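
The following Ruby sketch illustrates the idea; it is not the script mentioned above. It assumes Puppet is run with the --detailed-exitcodes option, and the status_for and report helpers, the form field names and the collector URL are all hypothetical.

```ruby
require 'net/http'
require 'uri'
require 'socket'

# Map the exit code of `puppet apply --detailed-exitcodes` to a label:
# 0 = no changes, 2 = changes applied, 4 or 6 = some resources failed.
def status_for(exit_code)
  case exit_code
  when 0 then 'unchanged'
  when 2 then 'changed'
  when 4, 6 then 'failed'
  else 'error'
  end
end

# POST the host name and status as form fields to the collector.
def report(url, host, status)
  Net::HTTP.post_form(URI(url), 'host' => host, 'status' => status)
end

# Usage (hypothetical collector URL):
# system('/usr/bin/puppet', 'apply', '--detailed-exitcodes',
#        '/etc/puppet/manifests/site.pp')
# report('http://puppet-report.example/runs', Socket.gethostname,
#        status_for($?.exitstatus))
```

Keeping the exit-code mapping separate from the HTTP call makes it easy to test the mapping without a web server.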