Save greenbacks on Google Container Engine using autoscaling and preemptible VMs

There is an awesome new feature on Google Container Engine (GKE) that lets you combine autoscaling, node pools and preemptible VMs to save big $!

The basic idea is to create a small cluster with an inexpensive VM type that will run 7×24. This primary node can be used for critical services that should not be rescheduled to another node. A good example would be a Jenkins master server. Here is an example of how to create the cluster:

gcloud alpha container clusters create $CLUSTER \
  --network "default" --num-nodes 1 \
  --machine-type ${small} --zone $ZONE \
  --disk-size 50

Now here is the money saver trick: a second node pool is added to the cluster. This node pool is configured to auto-scale from one node up to a maximum. This additional node pool uses preemptible VMs. These are VMs that can be taken away at any time if Google needs the capacity, but in exchange you get dirt cheap pricing. For example, running a 4 core VM with 15GB of RAM for a month comes in under $30.

This second pool is perfect for containers that can survive a restart or migration to a new node. Jenkins slaves would be a good candidate.

Here is an example of adding the node pool to the cluster you created above:

gcloud alpha container node-pools create $NODEPOOL --cluster $CLUSTER --zone $ZONE \
    --machine-type ${medium} --preemptible --disk-size 50 \
    --enable-autoscaling --min-nodes=1 --max-nodes=4

That node pool will scale down to a single VM if the cluster is not busy, and scale up to a maximum of 4 nodes.

If your VM gets preempted (and it will at least once every 24 hours),  the pods running on that node will be rescheduled onto a new node created by the auto-scaler.

Container Engine assigns a label to preemptible nodes which you can use for scheduling. For example, to ensure your Jenkins master does not get put on a preemptible node, you can add the following node selector to your Pod spec:

apiVersion: v1
kind: Pod
spec:
  nodeSelector:
    !cloud.google.com/gke-preemptible
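
Conversely, you can steer restart-tolerant workloads such as Jenkins slaves onto the preemptible pool by requiring the label. Here is a sketch (it assumes the "true" value Container Engine sets on preemptible nodes; the pod name and agent image are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: jenkins-agent
spec:
  nodeSelector:
    cloud.google.com/gke-preemptible: "true"
  containers:
  - name: agent
    image: jenkins/jnlp-slave   # hypothetical agent image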

See https://cloud.google.com/container-engine/docs/preemptible-vm for the details.

This blog post was first published @ warrenstrange.blogspot.ca, included here with permission.

Automating OpenDJ backups on Kubernetes

Kubernetes StatefulSets are designed to run “pet”-like services such as databases.  ForgeRock’s OpenDJ LDAP server is an excellent fit for StatefulSets as it requires stable network identity and persistent storage.

The ForgeOps project contains a Kubernetes Helm chart to deploy DJ to a Kubernetes cluster. Using a StatefulSet, the cluster will auto-provision persistent storage for our pod. We configure OpenDJ to place its backend database on this storage volume.

This gives us persistence that survives container restarts, or even restarts of the cluster. As long as we don’t delete the underlying persistent volume, our data is safe.

Persistent storage is quite reliable, but we typically want additional offline backups for our database.

The high level approach to accomplish this is as follows:

  • Configure the OpenDJ container to support scheduled backups to a volume.
  • Configure a Kubernetes volume to store the backups.
  • Create a sidecar container that archives the backups. For our example we will use Google Cloud Storage.
Here are the steps in more detail:

Scheduled Backups:

OpenDJ has a built-in task scheduler that can periodically run backups using a crontab(5) schedule. We update the Dockerfile for OpenDJ with environment variables that control when backups run:

 

 # The default backup directory. Only relevant if backups have been scheduled.  
 ENV BACKUP_DIRECTORY /opt/opendj/backup  
 # Optional full backup schedule in cron(5) format.  
 ENV BACKUP_SCHEDULE_FULL "0 2 * * *"  
 # Optional incremental backup schedule in cron(5) format.  
 ENV BACKUP_SCHEDULE_INCREMENTAL "15 * * * *"  
 # The hostname to run the backups on. If this hostname does not match the container hostname, the backups will *not* be scheduled.  
 # The default value below means backups will not be scheduled automatically. Set this environment variable if you want backups.  
 ENV BACKUP_HOST dont-run-backups  

To enable backup support, the OpenDJ container runs a script at first-time setup that configures the backup schedule.  A snippet from that script looks like this:

 if [ -n "$BACKUP_SCHEDULE_FULL" ];
 then
   echo "Scheduling full backup with cron schedule ${BACKUP_SCHEDULE_FULL}"
   bin/backup --backupDirectory ${BACKUP_DIRECTORY} -p 4444 -D "cn=Directory Manager" \
     -j ${DIR_MANAGER_PW_FILE} --trustAll --backupAll \
     --recurringTask "${BACKUP_SCHEDULE_FULL}"
 fi
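
The incremental schedule can be wired up the same way. The following is a sketch that mirrors the block above; the --incremental flag is an assumption here and is not shown in the original snippet:

 if [ -n "$BACKUP_SCHEDULE_INCREMENTAL" ];
 then
   echo "Scheduling incremental backup with cron schedule ${BACKUP_SCHEDULE_INCREMENTAL}"
   bin/backup --backupDirectory ${BACKUP_DIRECTORY} -p 4444 -D "cn=Directory Manager" \
     -j ${DIR_MANAGER_PW_FILE} --trustAll --backupAll --incremental \
     --recurringTask "${BACKUP_SCHEDULE_INCREMENTAL}"
 fi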

 

Update the Helm Chart to support backup

Next we update the OpenDJ Helm chart to mount a volume for backups and to support our new BACKUP_ variables introduced in the Dockerfile. We use a ConfigMap to pass the relevant environment variables to the OpenDJ container:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "fullname" . }}
data:
  BASE_DN: {{ .Values.baseDN }}
  BACKUP_HOST: {{ .Values.backupHost }}
  BACKUP_SCHEDULE_FULL: {{ .Values.backupScheduleFull }}
  BACKUP_SCHEDULE_INCREMENTAL: {{ .Values.backupScheduleIncremental }}

The funny-looking expressions in the curly braces are Helm templates. These are expanded when the chart is rendered and the resulting object is sent to Kubernetes. Using values allows us to parameterize the chart when we deploy it.
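
On the container side of the chart, these values can then be surfaced as environment variables. A minimal sketch, assuming the pod template references the same templated ConfigMap name and a Kubernetes version recent enough to support envFrom:

envFrom:
- configMapRef:
    name: {{ template "fullname" . }}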

Next we configure the container with a volume to hold the backups:

volumeMounts:
- name: data
  mountPath: /opt/opendj/data
- name: dj-backup
  mountPath: /opt/opendj/backup

This can be any volume type supported by your Kubernetes cluster. We will use an “emptyDir” for now – a scratch volume that Kubernetes creates and mounts on the container, and which lives only as long as the pod does (another reason to archive the backups off the node).
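
For reference, the corresponding volume declaration in the pod template would look something like this sketch:

volumes:
- name: dj-backup
  emptyDir: {}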

Configuring a sidecar backup container

Now for the pièce de résistance. We have our scheduled backups going to a Kubernetes volume. How do we send those files to offline storage?

One approach would be to modify our OpenDJ Dockerfile to support offline storage. We could, for example, include commands to write backups to Amazon S3 or Google Cloud storage.  This works, but it would specialize our container image to a unique environment. Where practical, we want our images to be flexible so they can be reused in different contexts.

This is where sidecar containers come into play.  The sidecar container holds the specialized logic for archiving files.  In general, it is a good idea to design containers that have a single responsibility. Using sidecars helps to enable this kind of design.

If you are running on Google Cloud, there is a ready-made container that bundles the “gcloud” SDK, including the “gsutil” utility for Cloud Storage. We update our Helm chart to include this container as a sidecar that shares the backup volume with the OpenDJ container:

{{- if .Values.enableGcloudBackups }}
# An example of enabling backup to google cloud storage.
# The bucket must exist, and the cluster needs --scopes storage-full when it is created.
# This runs the gsutil command periodically to rsync the contents of the /backup folder (shared with the DJ container) to cloud storage.
- name: backup
  image: gcr.io/cloud-builders/gcloud
  imagePullPolicy: IfNotPresent
  command: [ "/bin/sh", "-c", "while true; do gsutil -m rsync -r /backup {{ .Values.gsBucket }} ; sleep 600; done"]
  volumeMounts:
  - name: dj-backup
    mountPath: /backup
{{- end }}

The above container runs in a loop that periodically rsyncs the contents of the backup volume to cloud storage.  You could of course replace this sidecar with another that sends the backups to a different location (say an Amazon S3 bucket).

If you enable this feature and browse to your cloud storage bucket, you should see your backed-up data.
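
You can also check from the command line; a quick sketch, substituting the bucket you configured as gsBucket:

gsutil ls -r gs://my-dj-backup-bucket/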

To wrap it all up, here is the final helm command that will deploy a highly available, replicated two-node OpenDJ cluster and schedule backups on the second node:

 helm install -f custom-gke.yaml \
   --set djInstance=userstore \
   --set numberSampleUsers=1000,backupHost=userstore-1,replicaCount=2 helm/opendj

Now we just need to demonstrate that we can restore our data. Stay tuned!

This blog post was first published @ warrenstrange.blogspot.ca, included here with permission.

OpenDJ Pets on Kubernetes


Stateless “12-factor” applications are all the rage, but there are some kinds of services that are inherently stateful. Good examples are things like relational databases (Postgres, MySQL) and NoSQL databases (Cassandra, etc).

These services are difficult to containerize, because the default docker model favours ephemeral containers where the data disappears when the container is destroyed.

These services also have a strong need for identity. A database “primary” server is different than the “slave”. In Cassandra, certain nodes are designated as seed nodes, and so on.

OpenDJ is an open source LDAP directory server from ForgeRock. LDAP servers are inherently “pet like” insomuch as the directory data must persist beyond the container lifetime. OpenDJ nodes also replicate data between themselves to provide high-availability and therefore need some kind of stable network identity.

Kubernetes 1.3  introduces a feature called “Pet Sets” that is designed specifically for these kinds of stateful applications.   A Kubernetes PetSet provides applications with:

  • Permanent hostnames that persist across restarts
  • Automatically provisioned persistent disks per container that live beyond the life of a container
  • Unique identities in a group to allow for clustering and leader election
  • Initialization containers which are critical for starting up clustered applications

These features are exactly what we need to deploy OpenDJ instances.  If you want to give this a try, read on…

You will need access to a Kubernetes 1.3 environment. Using minikube is the recommended way to get started on the desktop.
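
Spinning up a local cluster is typically a one-liner (exact flags vary by minikube version):

minikube start
kubectl version   # confirm the server reports 1.3+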

You will need to fork and clone the ForgeRock Docker repository to build the OpenDJ base image. The repository is on our stash server:

 https://stash.forgerock.org/projects/DOCKER/repos/docker/browse

To build the OpenDJ image, you will do something like:

cd opendj
docker build -t forgerock/opendj:latest .

If you are using minikube,  you should connect your docker client to the docker daemon running in your minikube cluster (use minikube docker-env).  Kubernetes will not need to “pull” the image from a registry – it will already be loaded.  For development this approach will speed things up considerably.
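
Concretely, that usually amounts to:

eval $(minikube docker-env)
docker build -t forgerock/opendj:latest .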

Take a look at the README for the OpenDJ image. There are a few environment variables that the container uses to determine how it is bootstrapped and configured.  The most important ones:

  • BOOTSTRAP: Path to a shell script that will initialize OpenDJ. This is only executed if the data/config directory is empty. Defaults to /opt/opendj/bootstrap/setup.sh
  • BASE_DN: The base DN to create. Used in setup and replication
  • DJ_MASTER_SERVER: If set, run.sh will call bootstrap/replicate.sh to enable replication to this master. This only happens if the data/config directory does not exist

There are sample bootstrap setup.sh scripts provided as part of the container, but you can override these and provide your own script.
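
For example, a quick local smoke test of the image might look like the following sketch. The environment values and the mounted bootstrap directory are hypothetical; check the README for the full set of required variables, ports and paths:

docker run -d --name opendj-test \
  -e BASE_DN=dc=example,dc=com \
  -e BOOTSTRAP=/opt/opendj/bootstrap/setup.sh \
  -v $(pwd)/my-bootstrap:/opt/opendj/bootstrap \
  forgerock/opendj:latest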

Next,  fork and clone the ForgeRock Kubernetes project here:
https://stash.forgerock.org/projects/DOCKER/repos/fretes/browse

The opendj directory contains the Pet Set example.  You must edit the files to suit your needs, but as provided, the artifacts do the following:

  • Configures two OpenDJ servers (opendj-0 and opendj-1) in a pet set.
  • Runs the  cts/setup.sh script provided as part of the docker image to configure OpenDJ as an OpenAM CTS server.
  • Optionally assigns persistent volumes to each pet, so the data will live across restarts
  • Assigns “opendj-0” as the master.  The replicate.sh script provided as part of the Docker image will replicate each node to this master.  The script ignores any attempt by the master to replicate to itself.  As each pet is added (Kubernetes creates them in order) replication will be configured between that pet and the opendj-0 master.
  • Creates a Kubernetes service to access the OpenDJ instances. Instances can be addressed by their unique name (opendj-1), or by a service name (opendj) which will go through a primitive load balancing function (at this time round robin).  Applications can also perform DNS lookup on the opendj SRV record to obtain a list of all the OpenDJ instances in the cluster.

The replication topology is quite simple. We simply replicate each OpenDJ instance to opendj-0. This is going to work fine for small OpenDJ clusters. For more complex installations you will need to enhance this example.

To create the petset:

kubectl create -f opendj/

If you bring up the minikube dashboard:

minikube dashboard

You should see the two pets being created (be patient, this takes a while).

Take a look at the pod logs using the dashboard or:

kubectl logs opendj-0 -f

Now try scaling up your PetSet. In the dashboard, edit the Pet Set object and change the number of replicas from 2 to 3.
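
Alternatively, something like this sketch should work from the command line, assuming the Pet Set object is named opendj (kubectl support for scaling Pet Sets varies by version, so the dashboard edit is the safe path):

kubectl patch petset opendj -p '{"spec":{"replicas":3}}'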

You should see a new OpenDJ instance being created. If you examine the logs for that instance, you will see it has joined the replication topology.

Note: Scaling down the Pet Set is not implemented at this time. Kubernetes will remove the pod, but the OpenDJ instances will still think the scaled down node is in the replication topology.

This blog post was first published @ blog.warrenstrange.com, included here with permission.

Creating an internal CA and signed server certificates for OpenDJ using cfssl, keytool and openssl


Yes, that title is quite a mouthful, and mostly intended to get the Google juice if I need to find this entry again.

I spent a couple of hours figuring out the magical incantations, so thought I would document this here.

The problem: You want OpenDJ to use something other than the default self-signed certificate for SSL connections.   A "real" certificate signed by a CA (Certificate Authority) is expensive and a pain to procure and install.

The next best alternative is to create your own "internal" CA, and  have that CA sign certificates for your services.   In most cases, this is going to work fine for *internal* services that do not need to be trusted by a browser.

You might ask why is this better than just using self-signed certificates?  The idea is that you can import your CA certificate once into the truststore for your various clients, and thereafter those clients will trust any certificate presented that is signed by your CA.

For example, assume I have OpenDJ servers server-1, server-2 and server-3. Using only self-signed certificates, I will need to import the certs for each server (three in this case) into my client's truststore. If instead I use a CA, I need only import a single CA certificate. The OpenDJ server certificates will be trusted because they are signed by my CA. Once you have a lot of services deployed, using self-signed certificates becomes super painful. Hopefully that all makes sense...

Now how do you create all these certificates?  Using CloudFlare's open source  cfssl utility, Java keytool, and a little openssl.

I'll spare you the details and instead point you to the shell script, embedded as a gist in the original post, which you can edit for your environment.
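
In the meantime, here is a rough sketch of the flow. The JSON inputs (ca-csr.json, ca-config.json, server-csr.json), file names, aliases and passwords are placeholders to adapt; the actual script differs in the details.

# 1. Create the internal CA (reads ca-csr.json, writes ca.pem / ca-key.pem)
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

# 2. Issue a server certificate signed by the CA (writes server.pem / server-key.pem)
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
  -profile=server server-csr.json | cfssljson -bare server

# 3. Bundle the server key and cert into a PKCS12 file
openssl pkcs12 -export -in server.pem -inkey server-key.pem -certfile ca.pem \
  -name server-cert -out server.p12 -passout pass:changeit

# 4. Import the PKCS12 bundle into a JKS keystore for OpenDJ to use
keytool -importkeystore -srckeystore server.p12 -srcstoretype pkcs12 \
  -srcstorepass changeit -destkeystore keystore.jks -deststorepass changeit

# 5. Import the CA certificate into the client truststore
keytool -importcert -alias internal-ca -file ca.pem \
  -keystore truststore.jks -storepass changeit -noprompt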



Kubernetes Namespaces and OpenAM




I have been conducting some experiments running the ForgeRock stack on Kubernetes. I recently stumbled on namespaces.

In a nutshell, Kubernetes (k8) namespaces provide isolation between groups of resources running on the same cluster. The typical use case is to provide isolated environments for dev, QA, production and so on.

I had an "Aha!" moment when it occurred to me that namespaces could also provide multi-tenancy on a k8 cluster. How might this work?

Let's create a two node OpenAM cluster using an external OpenDJ instance:

See https://github.com/ForgeRock/fretes  for some samples used in this article

kubectl create -f am-dj-idm/

The above command launches all the containers defined in the given directory, wires them together (updates DNS records), and creates a load balancer on GCE.

 If I look at my services:

 kubectl get service 

I see something like this:

NAME         LABELS            SELECTOR     IP(S)             PORT(S)
openam-svc   name=openam-svc   site=site1   10.215.249.206    80/TCP
                                            104.197.122.164

(note: I am eliding a bit of the output here for brevity)

That second IP for openam-svc is the external IP of the load balancer configured by Kubernetes. If you bring up this IP address you will see the OpenAM login page. 

Now, let's switch to another namespace, say "tenant1" (I previously created this namespace).
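
If you have not set one up, creating the namespace and a matching kubeconfig context looks something like this sketch (the cluster and user names are placeholders from your own kubeconfig):

kubectl create -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: tenant1
EOF
kubectl config set-context tenant1 --namespace=tenant1 \
  --cluster=<your-cluster> --user=<your-user>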

kubectl config use-context tenant1

A kubectl get services  should be empty, as we have no services running yet in the tenant1 namespace. 

So let's create some: 

kubectl create -f am-dj-idm/ 


This is the same command we ran before - but this time we are running against a different namespace.  

Looking at our services, we see:

NAME         LABELS            SELECTOR     IP(S)             PORT(S)
openam-svc   name=openam-svc   site=site1   10.215.255.185    80/TCP
                                            23.251.153.176

Pretty cool. We now have two OpenAM instances deployed, with complete isolation, on the same cluster, using only a handful of commands. 

Hopefully this gives you a sense of why I am so excited about Kubernetes.



Running OpenAM and OpenDJ on Kubernetes with Google Container Engine



Still quite experimental, but if you are adventurous, have a look at:

https://github.com/ForgeRock/frstack/tree/master/docker/k8


This will set up a two node Kubernetes cluster running OpenAM and OpenDJ.  This uses images on the Docker hub that provide nightly builds for OpenAM and OpenDJ.

I will be presenting this at the ForgeRock IRM summit this Thursday. Fingers crossed that the demo gods smile down on me!