2020: Machine Learning, Post Quantum Crypto & Zero Trust

Welcome to a digital identity project in 2020! You'll be expected to have a plan for post-quantum cryptography.  Your network will be littered with "zero trust" buzzwords that will make you suspect everyone, everything and every transaction.  Add to that, “machines” will be learning everything, from how you like your coffee through to every network, authentication and authorisation decision.  OK, are you ready?

Machine Learning

I'm not going to do an entire blog on machine learning (ML) and artificial intelligence (AI).  Firstly, I'm not qualified enough on the topic, and secondly, I want to focus on the security implications.  Needless to say, within 3 years most organisations will have relatively experienced teams handling big data capture from an identity, access management and network perspective.

That data will be fed into ML platforms, either on-premise or via cloud services.  Leveraging either supervised or unsupervised learning, data from events such as login (authentication) for end users and devices, as well as authorization decisions, can be analysed not only to increase assurance and security, but also to improve the user experience.  How?  Well, if the output from ML can be used to update existing signatures (a bit legacy, but still) whilst simultaneously working out which logins are less risky, end user journeys can be made less intrusive.

Step one is finding the correct data sources to feed into the ML “model”.  What data is available, especially within the sign-up, sign-in and authorization flows?  General auditing data will capture ML “tasks” such as successful sign-ins and any other metadata associated with them – such as time, location, IP, device data, behavioural biometrics and so on.  Having vast amounts of this data available is the first step, which in turn can be used to “feed” the ML engine.  Other data points would be needed too.  What resources, applications and API calls are being made to complete certain business processes?  Can patterns be identified and tied to “typical” behaviour across user and device communities?  Being able to identify and track critical data and the services that process that data would be a first step, before extracting task-based data samples to help identify trusted and untrusted activities.
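
As a purely illustrative sketch (not tied to any particular product, and assuming scikit-learn is available), login events like those above could be flattened into numeric features and scored with an off-the-shelf unsupervised anomaly detector; the event fields and values here are made up for the example.

from sklearn.ensemble import IsolationForest

# Hypothetical login events pulled from audit data: hour of day, new-device flag,
# distance (km) from the user's usual location, and recent failed attempts.
events = [
    {"hour": 9,  "new_device": 0, "geo_km": 2,    "failures": 0},
    {"hour": 10, "new_device": 0, "geo_km": 5,    "failures": 0},
    {"hour": 3,  "new_device": 1, "geo_km": 8500, "failures": 4},
]

# Flatten each event into a numeric feature vector.
X = [[e["hour"], e["new_device"], e["geo_km"], e["failures"]] for e in events]

# Fit an unsupervised anomaly detector and score every login;
# lower scores indicate more anomalous (riskier) sign-ins.
model = IsolationForest(random_state=42).fit(X)
for event, score in zip(events, model.decision_function(X)):
    print(event, round(float(score), 3))

In practice the feature set would come from the audit and authentication event streams described above, and the scores would feed back into the sign-in journey to decide when to add or remove friction.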


Post Quantum Crypto

Quantum computing is coming.  Which is great.  Even in 2020 it might not be ready, but you need to be ready for it.  But, and there’s always a but, the main concern is that the super power of quantum will blow away the ability of existing encryption and hashing algorithms to remain secure.  Why?  Well, quantum computing ushers in the paradigm of “qubits” – a superpositional state between the classic binary 1 and 0.  Ultimately, that means that the “solutioneering” of certain complex problems can be completed in a much more efficient and non-sequential way.

Quantum machines can basically solve certain problems faster, and the mathematics behind cryptography is one of those problems.  A basic estimate for the future effectiveness of something like AES-256 drops to 128 bits of security.  Scary stuff.  Commonly used approaches today for key exchange rely on protocols such as Diffie-Hellman (DH) or Elliptic Curve Diffie-Hellman (ECDH), while public key encryption and digital signatures are handled by the likes of Rivest-Shamir-Adleman (RSA) and the Elliptic Curve Digital Signature Algorithm (ECDSA).
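
The halving for symmetric algorithms comes from Grover's algorithm, which searches an unstructured space of N keys in roughly √N quantum steps.  A quick back-of-the-envelope check of where the 128-bit figure comes from:

import math

# Grover's algorithm needs roughly sqrt(N) steps to search N possibilities,
# so a 2^256 key space costs a quantum attacker only about 2^128 work.
quantum_steps = math.isqrt(2 ** 256)
print(quantum_steps.bit_length() - 1)  # 128 effective bits of security

The asymmetric schemes above fare worse: Shor's algorithm solves the underlying factoring and discrete-log problems outright, which is why they are described as broken rather than merely weakened below.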

In the post-quantum (PQ) world they’re basically broken.  Clearly, the material impact on your organisation or services will largely depend on an impact assessment – there’s no point putting a $100 lock on a $20 bike.  But everyone wants encryption, right?  All that data flying around is likely to need even more protection, both in transit and at rest.

Some of the potentially “safe” PQ algorithms include XMSS and SPHINCS for hash-based signatures – the former going through IETF standardization.  Ring Learning With Errors (RLWE) is basically an enhanced public key cryptosystem that alters the structure of the private key; it is still under research, but no weaknesses have yet been found.  NTRU is another algorithm for the PQ world, using a hefty 12881-bit key.  NTRU is also already standardized by the IEEE, which helps with the maturity aspect.

But how to decide?  There is a useful body called the PQCRYPTO Consortium that provides guidance on current research.  Clearly you’re not going to build your own alternatives, but information assurance and crypto specialists within your organisation will need to start data impact assessments, to understand where cryptography is currently used for transport, identification and data-at-rest protection, and so uncover any potential future exposure.


Zero Trust Identities

“Zero Trust” (ZT) networking has been around for a while.  The concept of organisations having a “safe” internal network versus the untrusted, “hostile” public network, separated by a firewall, is long gone.  Organisations are perimeter-less.

Assume every device, identity and transaction is hostile until proven otherwise.  ZT for identity especially will look to bind not only a physical identity to a digital representation (session ID, token, JWT), but also that representation to a vehicle – a mobile, tablet or other device.  In turn, every transaction that tuple interacts with is then verified – checking for contextual or behavioural changes that could indicate malicious intent.  That introduces a lot of complexity to transaction, data and application protection.

Every transaction potentially requires introspection or validation.  Add to this mix an increased number of devices and data flows, and that paves the way for distributed authorization coupled with continuous session validation.

How will that look?  Well, we’re starting to see the use of things like stateless JSON Web Tokens (JWTs) as a means of hyper-scale assertion issuance, along with token binding to sessions and devices.  Couple that with fine-grained authentication processes that use 20+ signals of data to identify a user or thing, and we’re starting to see the foundations of ZT identity infrastructures.  Microservice or hyper-mesh related application infrastructures are going to need rapid introspection and re-validation on every call, so the likes of distributed authorization look likely.
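
As a rough, vendor-neutral sketch of what “binding a token to a device” can look like, the snippet below verifies an HMAC-signed JWT with only the standard library and then checks a confirmation-style claim (in the spirit of RFC 7800 proof-of-possession) against a device key thumbprint; the claim layout and thumbprint handling are illustrative assumptions rather than a prescribed flow.

import base64, hashlib, hmac, json

def b64url_decode(part):
    # JWTs use unpadded base64url, so restore the padding before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def verify_bound_jwt(token, secret, expected_device_thumbprint):
    header_b64, payload_b64, sig_b64 = token.split(".")

    # 1. Check the HS256 signature over "header.payload" (secret is bytes).
    expected_sig = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected_sig, b64url_decode(sig_b64)):
        return False

    # 2. Check the token is bound to the presenting device via a
    #    confirmation claim carrying a thumbprint of the device key.
    claims = json.loads(b64url_decode(payload_b64))
    return claims.get("cnf", {}).get("jkt") == expected_device_thumbprint

In a ZT model, checks like these (plus expiry, audience and contextual signals) would be repeated on every call rather than relying on a session established once at login.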


So the future is now.  As always.  We know that secure identity and access management functions have never in the last 20 years been more needed, popular or advanced.  The next 3-5 years will be critical in defining a backbone of security services that can nimbly be applied to users, devices, data and the billions of transactions that will result.

Using IDM and DS to synchronise hashed passwords

Overview

In this post I will describe a technique for synchronising a hashed password from ForgeRock IDM to DS.

Out of the box, IDM has a Managed User object that encrypts a password with symmetric (reversible) encryption.  One reason for this is that it is sometimes necessary to pass the password in clear text to a destination directory, so that the directory can perform its own hashing before storing it.  Therefore the out of the box synchronisation model for IDM is to take the encrypted password from its own store, decrypt it, and pass it in clear text (typically over a secure channel!) to DS for it to hash and store.

You can see this in the samples for IDM.

However, there are times when storing an encrypted, rather than hashed, value for a password is not acceptable.  IDM includes the capability to hash properties (such as passwords), not just encrypt them.  In that scenario, given that password hashes are one-way, it’s not possible to decrypt the password before synchronisation with other systems such as DS.
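
To make the distinction concrete, here is a tiny, purely illustrative contrast (Python, using the third-party cryptography package as a stand-in for what any product does internally – it is not IDM code):

import hashlib
from cryptography.fernet import Fernet

secret = b"S3cretPassw0rd"

# Reversible (symmetric) encryption: the clear text can be recovered later.
key = Fernet.generate_key()
assert Fernet(key).decrypt(Fernet(key).encrypt(secret)) == secret

# One-way hash: there is no decrypt step, only re-hash-and-compare.
print(hashlib.sha512(secret).hexdigest()[:32], "...")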

Fortunately, DS offers the capability to accept pre-hashed passwords, so IDM is able to pass the hash to DS for storage.  DS obviously needs to know this value is a hash, otherwise it will try to hash the hash!

So, what are the steps required?

  1. Ensure that DS is configured to accept hashed passwords.
  2. Ensure the IDM data model uses ‘Hashing’ for the password property.
  3. Ensure the IDM mapping is set up correctly.

Ensure DS is configured to accept hashed passwords

This topic is covered excellently by Mark Craig in this article:
https://marginnotes2.wordpress.com/2011/07/21/opendj-using-pre-encoded-passwords/

I’m using ForgeRock DS v5.0 here, but Mark references the old name for DS (OpenDJ) because this capability has been around for a while.  The key thing to note in the article is that the allow-pre-encoded-passwords advanced password policy property needs to be set for the appropriate password policy.  I’m only dealing with one password policy – the default one – so Mark’s article covers everything I need.

(I will be using a salted SHA-512 algorithm, so if you want to follow all the steps, including testing out the change of a user’s password, specify {SSHA512} in the userPassword value rather than {SSHA}.  This test isn’t necessary for the later steps in this article, but may help you understand what’s going on.)
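
If you want to generate a {SSHA512} test value by hand, here is a minimal sketch.  It assumes the conventional salted-SHA layout – base64 of the SHA-512 digest of the password bytes plus salt, with the salt appended after the digest – and an arbitrary 8-byte salt, so verify the result against your own DS instance before relying on it.

import base64, hashlib, os

def make_ssha512(password):
    salt = os.urandom(8)  # salt length chosen arbitrarily for the example
    digest = hashlib.sha512(password.encode() + salt).digest()
    return "{SSHA512}" + base64.b64encode(digest + salt).decode()

print(make_ssha512("Passw0rd"))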

Ensure IDM uses hashing

Like everything in IDM, you can modify configuration by changing the various config .json files, or via the UI (which updates the JSON config files for you!)

I’ll use IDM v5.0 here and show the UI.

By default, the Managed User object includes a password property that is defined as ‘Encrypted’:

We need to change this to be Hashed:

And I’m using the SHA-512 algorithm here (which is in fact a salted SHA-512 algorithm).

Note that making this change does not update all the user passwords that exist.  It will only take effect when a new value is saved to the property.

Now the value of a password, when it is saved, is a string representation of a complex JSON object (just as it is when encrypted), and will look something like this:

{"$crypto":

  {"value":

    {"algorithm":"SHA-512","data":"Quxh/PEBXMa2wfh9Jmm5xkgMwbLdQfytGRy9VFP12Bb5I2w4fcpAkgZIiMPX0tcPg8OSo+UbeJRdnNPMV8Kxc354Nj12j0DXyJpzgqkdiWE="},

    "type":"salted-hash"

  }

}
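
For debugging, it can be useful to check off-line that a stored value like this really does correspond to a known password, and to preview the {SSHA512} string that will eventually be sent to DS.  The sketch below assumes the data field is base64(SHA-512(password + salt) || salt) with a 16-byte salt appended after the digest – consistent with the length of the example above – but treat that layout as an assumption to verify against your own data.

import base64, hashlib, json

def check_idm_salted_hash(stored_json, candidate_password, salt_len=16):
    data = json.loads(stored_json)["$crypto"]["value"]["data"]
    raw = base64.b64decode(data)
    digest, salt = raw[:-salt_len], raw[-salt_len:]   # digest first, salt appended
    recomputed = hashlib.sha512(candidate_password.encode() + salt).digest()
    return recomputed == digest

def to_ds_value(stored_json):
    # The same concatenation the mapping script below performs: add the scheme marker.
    data = json.loads(stored_json)["$crypto"]["value"]["data"]
    return "{SSHA512}" + data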

Ensure the IDM mapping is set up correctly

Now we need to configure the mapping.

As you may have noted in the first step, DS is told that the password is pre-hashed by the presence of {SSHA512} at the beginning of the password hash value.  Therefore we need a transformation script that takes the algorithm and hash value from IDM and concatenates them in a way suited to DS.

The script is fairly simple, but does need some logic to convert the IDM algorithm representation (SHA-512) into the DS representation ({SSHA512}).

This is the transformation script (in Groovy) I used, which can of course be extended for other algorithms:

String strHash;

// Map the IDM algorithm name to the DS pre-encoded scheme prefix.
if (source.$crypto.value.algorithm == "SHA-512") {
  // DS expects the scheme marker followed by the base64 digest-plus-salt value.
  strHash = "{SSHA512}" + source.$crypto.value.data
}

// The last expression is the value returned to the mapping.
strHash;

This script replaces the default IDM script that does the decryption of the password.

(You might want to extend the script to cope with both hashed and encrypted password values if you already have data.  Look at functions such as openidm.isHashed and openidm.isEncrypted in the IDM Integrator’s Guide.)

Now when a password is changed, it is stored in hashed form in IDM.  The mapping is then triggered to synchronise to DS, applying the transformation script that passes the pre-hashed password value.

Now there is no need to store passwords using reversible encryption!

This blog post was first published @ yaunap.blogspot.no, included here with permission from the author.

Kubernetes: Why won’t that deployment roll?

Kubernetes Deployments provide a declarative way of managing replica sets and pods.  A deployment specifies how many pods to run as part of a replica set, where to place pods, how to scale them and how to manage their availability.

Deployments are also used to perform rolling updates to a service. They can support a number of different update strategies such as blue/green and canary deployments.

The examples provided in the Kubernetes documentation explain how to trigger a rolling update by changing a Docker image.  For example, suppose you run kubectl edit deployment/my-app and change the image tag from acme/my-app:v0.2.3 to acme/my-app:v0.3.0.

When Kubernetes sees that the image tag has changed, a rolling update is triggered according to the declared strategy. This results in the pods with the 0.2.3 image being taken down and replaced with pods running the 0.3.0 image.  This is done in a rolling fashion so that the service remains available during the transition.

If your application is packaged up as Helm chart, the helm upgrade command can be used to trigger a rollout:

helm upgrade my-release my-chart

Under the covers, Helm is just applying any updates in your chart to the deployment and sending them to Kubernetes. You can achieve the same thing yourself by applying updates using kubectl.
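
For example, the image change described earlier could be applied programmatically with the Kubernetes Python client instead of kubectl edit.  This is only a sketch, reusing the hypothetical my-app names from above and assuming the client is installed and your kubeconfig already works:

from kubernetes import client, config

# Use the same credentials kubectl uses.
config.load_kube_config()

# Strategic-merge patch that bumps the container image in the pod template.
patch = {
    "spec": {
        "template": {
            "spec": {
                "containers": [{"name": "my-app", "image": "acme/my-app:v0.3.0"}]
            }
        }
    }
}

# Patching spec.template is what triggers the rolling update.
client.AppsV1Api().patch_namespaced_deployment(
    name="my-app", namespace="default", body=patch)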

Now let’s suppose that you want to trigger a rolling update even if the docker image has not changed. A likely scenario is that you want to apply an update to your application’s configuration.

As an example, the ForgeRock Identity Gateway (IG) Helm chart pulls its configuration from a git repository. If the configuration changes in git, we’d like to roll out a new deployment.

My first attempt at this was to perform a helm upgrade on the release, updating the ConfigMap in the chart with the new git branch.  Our Helm chart uses the ConfigMap to set an environment variable to the git branch (or commit) that we want to check out:

kind: ConfigMap
data:
  GIT_CHECKOUT_BRANCH: test

After editing the ConfigMap and doing a helm upgrade, nothing terribly exciting happened.  My deployment did not “roll” as expected with the new git configuration.

As it turns out, Kubernetes needs to see a change in the pod’s spec.template before it triggers a new deployment.  Changing the image tag is one way to do that, but any change to the template will work.  As I discovered, changes to a ConfigMap *do not* trigger deployment updates.

The solution here is to move the git branch variable out of the ConfigMap and into the pod’s spec.template in the deployment object:

spec:
  initContainers:
  - name: git-init
    env:
    - name: GIT_CHECKOUT_BRANCH
      value: "{{ .Values.global.git.branch }}"
When the IG helm chart is updated (we supply a new value to the template variable above), the template’s spec changes, and Kubernetes will roll out the new deployment.

Here is what it looks like when we put it all together (you can check out the full IG chart here).

# install IG using helm. The git branch defaults to "master"
$ helm install openig
NAME:   hissing-yak

$ kubectl get deployment
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
openig    1         1         1            1           4m

# Take a look at our deployment history. So far, there is only one revision:
$ kubectl rollout history deployment
deployments "openig"
REVISION  CHANGE-CAUSE
1

# Let's deploy a different configuration. We can use a commit hash here instead
# of a branch name:
$ helm upgrade --set global.git.branch=d0562faec4b1621ad53b852aa5ee61f1072575cc \
    hissing-yak openig

# Have a look at our deployment. We now see a new revision:
$ kubectl rollout history deploy
deployments "openig"
REVISION  CHANGE-CAUSE
1
2

# Describe the deployment. You should see events where the original
# replica set is scaled down and a new set is scaled up:
$ kubectl describe deployment
Events:
Type    Reason             Age   From                   Message
----    ------             ----  ----                   -------
# ---> Time=19m ago. This is the original RS
Normal  ScalingReplicaSet  19m   deployment-controller  Scaled up replica set openig-6f6575cdfd to 1
# ---> Time=3m ago. The new RS being brought up
Normal  ScalingReplicaSet  3m    deployment-controller  Scaled up replica set openig-7b5788659c to 1
# ---> Time=3m ago. The old RS being scaled down
Normal  ScalingReplicaSet  3m    deployment-controller  Scaled down replica set openig-6f6575cdfd to 0

One of the neat things we can do with deployments is roll back to a previous revision. For example, if we have an error in our configuration, and want to restore the previous release:

$ kubectl rollout undo deployment/openig
deployment "openig" rolled back

$ kubectl describe deployment
Normal  DeploymentRollback  1m  deployment-controller  Rolled back deployment "openig" to revision 1

# We can see the old pod is being terminated and the new one has started:
$ kubectl get pod
NAME                      READY     STATUS        RESTARTS   AGE
openig-6f6575cdfd-tvmgj   1/1       Running       0          28s
openig-7b5788659c-h7kcb   0/1       Terminating   0          8m

And that my friends, is how we roll.

This blog post was first published @ warrenstrange.blogspot.ca, included here with permission.