Deploying the ForgeRock platform on Kubernetes using Skaffold and Kustomize

If you are following along with the ForgeOps repository, you will see some significant changes in the way we deploy the ForgeRock IAM platform to Kubernetes.  These changes are aimed at dramatically simplifying the workflow to configure, test and deploy ForgeRock Access Manager, Identity Manager, Directory Services and the Identity Gateway.

To understand the motivation for the change, let’s recap the current deployment approach:

  • The Kubernetes manifests are maintained in one git repository (forgeops), while the product configuration is maintained in another (forgeops-init).
  • At runtime, Kubernetes init containers clone the configuration from git and make it available to the component using a shared volume.

The advantage of this approach is that the docker container for a product can be (relatively) stable. Usually it is the configuration that is changing, not the product binary.

This approach seemed like a good idea at the time, but in retrospect it created a lot of complexity in the deployment:
  • The runtime configuration is complex, requiring orchestration (init containers) to make the configuration available to the product.
  • It creates a runtime dependency on a git repository being available. This isn’t a show stopper (you can create a local mirror), but it is one more moving part to manage.
  • The helm charts are complicated. We need to weave git repository information throughout the deployment. For example, putting git secrets and configuration into each product chart. We had to invent a mechanism to allow the user to switch to a different git repo or configuration – adding further complexity. Feedback from users indicated this was a frequent source of errors. 
  • Iterating on configuration during development is slow. Changes need to be committed to git and the deployment rolled to test out a simple configuration change.
  • Kubernetes rolling deployments are tricky. The product container version must be in sync with the git configuration. A mistake here might not get caught until runtime. 
It became clear that it would be *much* simpler if the products could just bundle the configuration in the docker image so that it is “ready to run” without any complex orchestration or runtime dependency on git.

[As an aside, we often get asked why we don’t store configuration in ConfigMaps. The short answer is: we do – for top level configuration such as domain names and global environment variables. Products like AM have large and complex configurations (~1000 json files for a full AM export). Managing these in ConfigMaps gets to be cumbersome. We also need a hierarchical directory structure – which is an outstanding ConfigMap RFE.]

The challenge with the “bake the configuration in the docker image” approach is that it creates *a lot* of docker images. If each configuration change results in a new (and unique) image, you quickly realize that automation is required to be successful.

About a year ago, one of my colleagues happened to stumble across a new tool from Google called skaffold. From the documentation:

“Skaffold handles the workflow for building, pushing and deploying your application.
So you can focus more on application development”
To some extent skaffold is syntactic sugar on top of this workflow:
docker build; docker tag; docker push;
kustomize build | kubectl apply -f -
Calling it syntactic sugar doesn’t really do it justice, so do read through their excellent documentation. 
There isn’t anything that skaffold does that you can’t accomplish with other tools (or a whack of bash scripts), but skaffold focuses on smoothing out and automating this basic workflow.
A key element of Skaffold is its tagging strategy. Skaffold will apply a unique tag to each docker image (the tagging strategy is pluggable, but is generally a sha256 hash or a git commit). This is essential for our workflow, where we want to ensure that the combination of the product (say AM) and a specific configuration is guaranteed to be unique. By using a git commit tag on the final image, we can be confident that we know exactly how a container was built, including its configuration. This also makes rolling deployments much more tractable, as we can update a deployment tag and let Kubernetes spin down the older container and replace it with the new one.
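
As a rough sketch only (the image name, paths and apiVersion string here are illustrative and depend on your skaffold release), the relevant parts of a skaffold.yaml look something like this:

apiVersion: skaffold/v1beta13    # version string varies by skaffold release
kind: Config
build:
  artifacts:
  - image: gcr.io/my-project/idm # illustrative image name
    context: docker/idm          # directory containing the Dockerfile
  tagPolicy:
    gitCommit: {}                # tag images with the git commit (sha256 is another option)
deploy:
  kustomize:
    path: kustomize/overlay/dev  # illustrative overlay path
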
If it isn’t clear from the above, the configuration for the product lives inside the docker image, and that in turn is tracked in a git repository. If for example you check out the source for the IDM container: https://github.com/ForgeRock/forgeops/tree/master/docker/idm 
You will see that the Dockerfile COPYs the configuration into the final image. When IDM runs, its configuration will be right there, ready to go. 
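
A simplified sketch of that pattern (the base image tag and paths are illustrative, not the exact forgeops Dockerfile):

# Start from the product base image (illustrative tag)
FROM gcr.io/forgerock-io/idm:6.5.2
# Bake the configuration into the image so it is ready to run, with no git clone at runtime
COPY conf/ /opt/openidm/conf/
COPY script/ /opt/openidm/script/
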
Skaffold has two major modes of operation. The “run” mode is a one shot build, tag, push and deploy. You will typically use skaffold run as part of a CD pipeline: watch for a git commit, and invoke skaffold to deploy the change. Again – you can do this with other tools, but Skaffold just makes it super convenient.
Where Skaffold really shines is in “dev” mode. If you run skaffold dev, it will run a continual loop watching the file system for changes, and rebuilding and deploying as you edit files.
This diagram (lifted from the skaffold project) shows the workflow:
This process is really snappy. We find that we can deploy changes within 20-30 seconds (most of that is just container restarts). When pushing to a remote GKE cluster, the first deployment is a little slower as we need to push all those containers to gcr.io, but subsequent updates are fast as you are pushing configuration deltas that are just a few KB in size.
Note that git commits are not required during development.  A developer will iterate on the desired configuration, and only when they are happy will they commit the changes to git and create a pull request. At this stage a CD process will pick up the commit and deploy the change to a QA environment. We have a simple CD sample using Google Cloudbuild.
At this point we haven’t said anything about helm and why we decided to move to Kustomize.  
Once our runtime deployments became simpler (no more git init containers, simpler ConfigMaps, etc.), we found ourselves questioning the need for complex helm templates. There was some internal resistance from our developers to using golang templates (they *are* pretty ugly when combined with yaml), and the security issues around Helm’s Tiller component raised additional red flags.
Suffice to say, there was no compelling reason to stick with Helm, and transitioning to Kustomize was painless. A shout out to the folks at Replicated, who have a very nifty tool called ship that will convert your helm charts to Kustomize. The “port” from Helm to Kustomize took a couple of days. We might look at Helm 3 in the future, but for now our requirements are being met by Kustomize. One nice side effect that we noticed is that Kustomize deployments with skaffold are really fast.
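
For a flavour of what that looks like, here is a minimal kustomization.yaml for an overlay (the resource and patch names are illustrative):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: prod                  # illustrative target namespace
resources:
- ../../base/idm                 # reuse the base manifests
- ../../base/am
patchesStrategicMerge:
- idm-deployment-patch.yaml      # environment-specific overrides
configMapGenerator:
- name: platform-config
  literals:
  - FQDN=login.example.com       # top level configuration such as domain names
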
This work is being done on the master branch of forgeops (targeting the 7.0 release), but if you would like to try out this new workflow with the current (6.5.2) products, you are in luck! We have a preview branch that uses the current products.
The following should just work(TM) on minikube:
cd forgeops
git checkout skaffold-6.5
skaffold dev 
There are some prerequisites that you need to install. See the README-skaffold.
The initial feedback on this workflow has been very positive. We’d love for folks to try it out and let us know what you think. Feel free to reach out to me at my ForgeRock email (warren dot strange at forgerock.com).

This blog post was first published @ warrenstrange.blogspot.ca, included here with permission.

1H2019 Identity Management Funding Analysis

As the first half of 2019 has now been and gone, I've taken a quick look at the funding rounds that have taken place so far this year within the identity and access management space, and attempted some coarse-grained analysis.  The focus is global, and the sector definition is quite broad, based on the categories Crunchbase uses.

Key Facts January to June 2017 / 2018 / 2019

Funding increased roughly 2.6x year on year for the first half of 2019, compared to the same period in 2018.  There were some pretty large, later-stage investments, which appear to have skewed the numbers somewhat.

The number of organisations actually receiving funding dropped, as did the percentage of organisations receiving seed funding.  This is pretty typical as the market matures and stabilises in some sub-categories.



1H2019
  • ~$604 million overall funding
  • Seed funding accounted for 20%
  • Median announcement date March 27th
  • 35 companies funded

1H2018
  • ~$231 million overall funding
  • Seed funding accounted for 45.3%
  • Median announcement date March 30th
  • 53 companies funded

1H2017
  • ~$237 million overall funding
  • Seed funding accounted for 32.1%
  • Median announcement date April 7th
  • 56 companies funded


1H2019 Company Analysis

A coarse-grained analysis of the 2019 numbers shows a pretty typical geographic spread, with North America, as ever, the major centre, not just for funded companies but for the funding entities too.



The types of companies funded are also interesting.  The categories are based on what Crunchbase curates and maps against company descriptions, so there might be some overlap or ambiguity when performing detailed analysis on that.



It's certainly interesting to see a range covering B2C fraud, payments and analytics, typically because the return on investment on such products is very tangible.

The stage of funding covers a broad and distributed range.  Seed is the clear leader, but the long tail indicates a maturing market, with many investors expecting strong returns.


1H2019 Top 10 Companies By Funding Amounts

The following is a simple top-down list of the companies that received the highest funding, and at what stage that funding was received:
  1. Dashlane ($110m, Series D) - http://www.dashlane.com/ (https://techcrunch.com/2019/05/30/dashlane-series-d/)
  2. Auth0 ($103m, Series E) - https://auth0.com/ (https://auth0.com/blog/auth0-closes-103m-in-funding-passes-1b-valuation/)
  3. OneLogin ($100m, Series D) - http://onelogin.com/ (https://venturebeat.com/2019/01/10/onelogin-raises-100-million-to-help-enterprises-manage-access-and-identity/)
  4. Onfido ($50m, Series C) - http://www.onfido.com/ (https://venturebeat.com/2019/04/03/onfido-raises-50-million-for-ai-powered-identity-verification/)
  5. Socure ($30m, Series C) - http://www.socure.com/ (https://www.socure.com/about/news/socure-raises-30-million-in-additional-financing-to-identify-the-human-race)
  6. Dashlane ($30m, Debt Financing) - http://www.dashlane.com/
  7. Payfone ($24m, Series G) - http://www.payfone.com/ (https://www.alleywatch.com/2019/05/nyc-startup-funding-top-largest-april-2019-vc/7/)
  8. Evident ($20m, Series B) - https://www.evidentid.com/ (http://www.finsmes.com/2019/05/evident-raises-20m-in-series-b-funding.html)
  9. Bamboocloud ($15m, Series B) - http://www.bamboocloud.cn/ (https://www.volanews.com/portal/article/index/id/1806.html)
  10. Proxy ($13.6m, Series A) - https://proxy.com/ (https://www.globenewswire.com/news-release/2019/03/27/1774258/0/en/Proxy-Raises-13-6M-in-Series-A-Funding-Led-by-Kleiner-Perkins-Emerges-from-Stealth-to-Launch-Its-Universal-Identity-Signal-for-Frictionless-Access-to-Everything-in-the-Physical-Wor.html)

NB - all data and reporting done via Crunchbase.

Next Generation Distributed Authorization

Many of today’s security models spend a lot of time focusing upon network segmentation and authentication.  Both of these concepts are critical in building out a baseline defensive security posture.  However, there is a major area that is often overlooked, or at least simplified to a level of limited use: authorization.  Working out what a user, service, or thing should be able to do within another service.  The permissions.  Entitlements.  The access control entries.  I don’t want to give an introduction to the many, sometimes academic, acronyms and ideas around authorization (see RBAC, MAC, DAC, ABAC, PDP/PEP amongst others). I want to spend a page delving into some of the current and future requirements surrounding distributed authorization.

New Authorization Requirements

Classic authorization modelling tends to have a centralised policy decision point (PDP): a central location that applications, agents and other SDKs call in order to get a decision regarding a subject/object/action combination.  The PDP contains signatures (or policies) that map the objects and actions to a bunch of users and services.

That call-out process is now a bottleneck, for several reasons.  Firstly, the number of systems being protected is rapidly increasing, with the era of microservices, APIs and IoT devices all needing some sort of access control.  Having them all hitting a central PDP doesn’t seem a good use of network bandwidth or central processing power.  Secondly, that increase in objects also gives way to a more meshed and federated set of interactions, where microservices and IoT are more common.

Distributed Enforcement

This gives way to a more distributed enforcement requirement.  How can the protected object perform an access control evaluation without having to go back to the mother ship?  There are a few things that could help.

Firstly, we probably need to achieve three things: work out how we identify the calling user or service (the authentication token), map that identity to what it can do, and finally make sure that actually happens.  The first part is often completed using tokens – and in the distributed world, a token that has been cryptographically signed by a central authority.  JSON Web Tokens (JWTs) are popular, but not the only approach.

The second part – working out what they can do – could be handled in two slightly different ways.  One is that the calling subject brings with them what they can do.  They could do this by having the cryptographically signed token contain their access control entries.  This approach would require the service that issues tokens to also know what the calling user or service can do, so it would need knowledge of the access control entries to use.  That list of entries would also need things like governance, audit and version control, but that is needed regardless of where those entries are stored.

So here, a token gets issued and the objects being protected have a method to cryptographically validate the presented token, extract the access control entries (ACEs) and enforce what is being asked.

Having a token that contains the actual ACEs is not that new.  Capability Based Access Control (CBAC) follows this concept, where the token could contain the object and associated actions.  It could also contain the subject identifier, or perhaps that could be delivered as a separate token.  A similar practical implementation is described in Google’s Macaroons project.
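
Purely as an illustration (the claim names here are hypothetical rather than taken from any particular standard), a signed token carrying its own entries might look something like this:

{
  "iss": "https://tokens.example.com",
  "sub": "bob",
  "exp": 1735689600,
  "entitlements": [
    { "object": "meeting-room", "actions": ["open"] },
    { "object": "safe-room", "actions": ["open"] }
  ]
}

The protected service only needs the issuer’s public key to validate the signature before enforcing the entries locally.
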
What we’ve achieved here is to basically remove the access control logic from the object or service, but equally we’ve removed the need to perform a call back to a policy mother ship.

A subtly different approach is to pass the access control logic back down to the object – but instead of it originating within the service itself, it is still owned and managed by a central authority, just distributed to the edges.

This allows for local enforcement, but central governance and management.  Modern distribution technologies like web sockets, or even flat file formats like JSON and YAML, allow for “repave and replace” approaches as policy definitions change, which fits nicely into devops deployment models.

The object itself would still need to know a few things to make the enforcement complete: a token representing the user or service, and some context to help validate the request.

Contextual Integration

Access control decisions generally require the subject, the object and any associated actions.  For example, subject=Bob could perform actions=open on object=Meeting Room.  Another dimension that is now required, especially within zero trust based approaches, is that of context.  In Bob’s example, context may include time of day, day of the week, or even the project he is working on.  They could all impact the decision.  Previous access control requests and decisions could also come into play here.  For example, say Bob was just given access to the Safe Room where the gold bullion is stored; perhaps his request two minutes later to gain access to the Back Door is denied.  If that first request hadn’t occurred, perhaps the request to open the Back Door would be legitimate and permitted.

The capturing of context, both during authentication and during authorization evaluation, is now critical, as it allows the object to have a much clearer understanding of how to handle access requests.

ML – Defining Normal

I’ve talked a lot so far about access control logic and where it should sit.  Well, how do we know what that access control logic looks like?  I spent many a year designing role based access control systems (wow, that was 10+ years ago), using a process known as role mining.  Big data crunching before machine learning was in vogue.  Taking groups of users and trying to understand what access control patterns existed, and trying to shoehorn the results into business and technical roles.

Today, there are loads of great SaaS based machine learning systems that can take user activity logs (logs that describe user to application interactions) and provide views on whether their activity levels are “normal” – normal for them, normal for their peers, their business unit, location, purchasing patterns and so on.  The output of that process can be used to help define the initial baseline policies.

Enforcing access based on policies alone, though, is not enough.  It is time consuming and open to many avenues of circumvention.  Machine learning also has a huge role to play within the enforcement aspect, especially as the idea of context, and what is valid or not, becomes a highly complicated and ever changing question.

One of the key issues of modern authorization is the distinction between the access control logic, the enforcement, and the vehicles used to deliver the necessary parts to the protected services.

If at all possible, these should be kept as modular as possible, to allow for future proofing and the ability to design a secure system that is flexible enough to meet business requirements, scale out to millions of transactions a second and integrate thousands of services simultaneously.

This blog post was first published @ www.infosecprofessional.com, included here with permission.

Implementing JWT Profile for OAuth2 Access Tokens

There is a new IETF draft called JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens.  This is a very early version 0 draft that looks to describe the format of OAuth2-issued access_tokens.

Access tokens are typically bearer tokens, but the OAuth2 spec doesn’t really describe what format they should be.  They typically end up being one of two high level types – stateless and stateful.  Stateful just means “by reference”, with a long opaque random string being issued to the requestor, which resource servers can then send back to the authorization service in order to introspect and validate.  On their own, stateful or reference tokens don’t really provide the resource servers with any detail.

The alternative is to use a stateless token – namely a JSON Web Token (JWT).  This new spec aims to standardise what the content and format should be.

From a ForgeRock AM perspective, this is good news.  AM has delivered JWT based tokens (web session, OIDC id_tokens and OAuth2 access_tokens) for a long time.  The format and content of the access_tokens, out of the box, generally look something like the following:

The out of the box header (using RS256 signing):
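
For illustration, an RS256-signed JWT header generally looks something like the following (the kid value is deployment specific):

{
  "typ": "JWT",
  "kid": "...",
  "alg": "RS256"
}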

The out of the box payload:
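
Again purely as an illustration (the values below are placeholders and the exact claim set varies by deployment), the payload carries claims along these lines:

{
  "sub": "demo",
  "cts": "...",
  "auth_time": 1563280840,
  "iss": "https://openam.example.com/openam/oauth2",
  "tokenName": "access_token",
  "authGrantId": "...",
  "aud": "exampleClient",
  "nbf": 1563280840,
  "grant_type": "authorization_code",
  "realm": "/",
  "exp": 1563284440,
  "expires_in": 3600,
  "jti": "...",
  "client_id": "exampleClient",
  "cnf": { "jwk": "..." }
}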

Note there is a lot of stuff in that access_token.  In particular, note the cnf claim (confirmation key).  This is used for proof of possession support, which is of course optional, so you can easily reduce the size by not implementing that.  There are several claims that are specific to the AM authorization service, which may not always be needed in a stateless JWT world where perhaps the RS is performing offline validation away from the AS.

In AM 6.5.2 and above, new functionality allows you to rapidly customize the content of the access_token.  You can add custom claims, remove out of the box fields and generally build token formats that suit your deployment.  We do this by the addition of scriptable support.  Within the settings of the OAuth2 provider, note the new field for OAuth2 Access Token Modification Script.

The scripting ability was already in place for OIDC id_tokens.  Similar concepts now apply.

The draft JWT Profile spec basically mandates iss, exp, aud, sub and client_id, with auth_time and jti as optional.  The AM token already contains those claims.  Perhaps the only differing component is that the JWT Profile spec (section 2.1) recommends the header typ value be set to “at+JWT” (meaning access token JWT), so the RS does not confuse the token with an id_token.  The AM scripting support does not allow changes to the typ, but the payload already contains a tokenName claim (with value access_token) to help make this distinction.

If we add a couple of lines to the out of the box script, namely the following, we cut back the token content to the recommended JWT Profile:

accessToken.removeField("cts");
accessToken.removeField("expires_in");
accessToken.removeField("realm");
accessToken.removeField("grant_type");
accessToken.removeField("nbf");
accessToken.removeField("authGrantId");
accessToken.removeField("cnf");

The new token payload is now much more slimmed down:
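
Which leaves something much closer to the profile’s recommended claim set (again, the values are placeholders):

{
  "sub": "demo",
  "auth_time": 1563280840,
  "iss": "https://openam.example.com/openam/oauth2",
  "tokenName": "access_token",
  "aud": "exampleClient",
  "exp": 1563284440,
  "jti": "...",
  "client_id": "exampleClient"
}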

The accessToken.setField("name", "value") method allows simple extension and alteration of standard claims.
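
For example, to add a hypothetical custom claim:

accessToken.setField("tenant", "acme");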

For further details see the following documentation on scripted token content – https://backstage.forgerock.com/docs/am/6.5/oauth2-guide/#modifying-access-tokens-scripts

This blog post was first published @ http://www.theidentitycookbook.com/, included here with permission from the author.

Directory Services – Docker, Kubernetes: Friends or Foes?

Two weeks ago, at the ForgeRock Identity Live conference, I did a talk about ForgeRock Directory Services (DS) in the Docker/Kubernetes (K8S) world, trying to answer the question whether DS and Docker/K8S were friends or foes.

Before I dive into the question, let me say that it’s obvious that our whole industry is moving to the Cloud, and that Docker/Kubernetes are becoming the standard way to deploy software in the Cloud, in any Cloud. Therefore whether DS and K8S are ultimately friends or foes is not the right question. I believe it is unavoidable and that in the near future we will deploy and fully support Directory Services in K8S. But is it a good idea to do it today? Let’s examine why we are questioning this today, the benefits of using Kubernetes to deploy software, the constraints of deploying the current version of Directory Services (6.5) in Kubernetes, and what ForgeRock is working on to improve DS in K8S. Finally I will highlight why Directory Services is a good solution to persist data, whether it’s on premise or in the Cloud.

Why the discussion about DS and K8S?

The main reason we are having this discussion is due to the nature of Directory Services. DS is not the usual stateless web application. Directory Services is both a stateful application and a distributed one. These are the two main aspects that require special care when trying to deploy in containers. First, Directory Services is a stateful application because it is the place where one can store the state for all those stateless web applications. In our platform, we use DS to store ForgeRock Access Management data, whether it’s runtime configuration data, tokens or user identities. Second, Directory Services is a distributed application because instances need to talk with each other so that the data is replicated and consistent. Because databases and distributed applications require stronger orchestration and coordination between elements of the system, they are implemented as StatefulSets in the Kubernetes world, and make use of Persistent Volumes (PVs). Therefore our Cloud Deployment Model of ForgeRock Directory Services is also implemented this way.

It’s worth noting that Persistent Volumes are a Kubernetes API, and there are several types of volumes and many different provider implementations. Some of the PV types are very recent and still in beta. So, when using Kubernetes for applications that persist data, you should have a good understanding of the characteristics and the performance of the Persistent Volume choices that are available in your environment.
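
To make that concrete, here is a stripped-down sketch of such a StatefulSet (the names, image, port and storage size are purely illustrative, not taken from the actual Cloud Deployment Model manifests):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ds
spec:
  serviceName: ds
  replicas: 3
  selector:
    matchLabels:
      app: ds
  template:
    metadata:
      labels:
        app: ds
    spec:
      containers:
      - name: ds
        image: my-registry/ds:6.5        # illustrative image name
        ports:
        - containerPort: 1389            # LDAP port
        volumeMounts:
        - name: data
          mountPath: /opt/opendj/data    # illustrative data path
  volumeClaimTemplates:                  # each replica gets its own Persistent Volume
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi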

Benefits of Containers and Kubernetes

Developers are making great use of containers because it simplifies their focus on what they have to build and test. Instead of spending hours figuring out how to install and configure a database, and building a monitoring platform to validate their work, they can pull one or more docker images that will automate this task.

When going into production, the automation is a key aspect. Kubernetes and its family of tools, allow administrators to describe their target architectures, automate deployment, monitoring and incident response. Typically in a Kubernetes cluster, if the administrator requires at least 3 instances of an application, Kubernetes will react to the disappearance of an instance and will restart a new one immediately. Another key benefit of Kubernetes is auto-scalability. The Kubernetes deployment can react to monitoring alerts or external signals to add or remove instances of an application in order to support a greater or smaller workload. This optimises the cost of running the solution, balancing the capacity to absorb peak loads with the cost of running at normal or low usage levels.

Directory Services 6.5 constraints in K8S

But auto-scaling is not something that is suitable for all applications, and Directory Services, like most databases, does not scale automatically by adding more running instances. Because databases have state and data, and expect exclusive access to the files, adding a new replica is a costly operation. The data needs to be duplicated in order to let another instance use it. Also, adding a Directory Services instance only helps to scale read operations. A write operation on any server will need to be replicated to all other servers, so all servers will have the same write throughput and the same amount of disk I/O. In the world of databases, the only way to scale write operations is to distribute (shard) the data to multiple servers. Such a capability is not yet available in Directory Services, but it’s planned for future releases. (Note that Directory Proxy Services 6.5 already has support for sharding, but with some constraints. And the proxy is not yet part of the Cloud Deployment Model.)

Another constraint of Directory Services 6.5 is how replication works. The DS replication feature was designed years ago when customers would deploy servers and would not touch them unless they were broken. Servers had stable hostnames or IP addresses and would know all of their peers. In the container world, the address of an instance is only known after the instance is started. And sometimes you want to start several instances at the same time. The current ForgeRock Cloud Deployment Model and the Directory Services docker images that we propose, work around the design limitation of replication management, by pre-configuring replication for a fixed (and small) maximum number of replicas. It’s not possible to dynamically add another replica after that. Also, the “dsreplication” utility cannot be used in Kubernetes. Luckily, monitoring replication and more importantly its latency is possible with Prometheus which is the default monitoring technology in Kubernetes.

Coming Improvements in Directory Services

For the past year, we’ve been working hard on redesigning how we manage and bootstrap replication between Directory Services instances. Our main challenge with that work has been to do it in a way that allows us to continue to replicate with previous versions. Interoperability and compatibility of replication between different versions of Directory Services has been, and will remain, a key value of the product, allowing customers to roll out new versions with zero downtime of the service. We’re moving towards using full CA-based certificates and mutual TLS authentication for establishing trust between replicas. Configuring a new replica will no longer require updating all servers in the topology, and replicas that are uninstalled or stopped for some time will be automatically removed from the topology (and so will their associated change logs and meta-data). When starting a new replica, it will only need to know of one other running replica (or be told that it is the first one). These changes will make automating the deployment of new replicas much simpler and remove the limit on the number of replicas. We are also improving the way we do backup and restore of a database backend or the whole server, allowing direct use of cloud buckets such as S3 or GCS. All of these things are planned for the next major release due in the first half of 2020. Most of these features will be used by our own ForgeRock Identity Platform as a Service offering that will go through stages of Early Access and Beta later this year.

Once we have the ability to fully automate the deployment and upgrade of a cluster of Directory Services instances, in one or more data centres, we will start working on horizontal scalability for Directory Services, and provide a way to scale the number of servers as the stored data grows, allowing a consistent level of write throughput. All of this fully automated, to be deployed in the Cloud using Kubernetes.

Benefits of using Directory Services as a data store

Often people ask me why they should use ForgeRock Directory Services rather than a real database. First of all, Directory Services is a database. It’s a specialised database, built on a standard data model and a standard access protocol: Lightweight Directory Access Protocol aka LDAP. Several people in the past have pointed out that LDAP might have even been the first successful NoSQL database! 🙂  Furthermore, Directory Services also exposes all of the data through a REST/JSON API, while still providing the same security and fine-grained access control mechanisms as through LDAP. But the main value of Directory Services is that you can achieve very high availability of the data (in the 5 9’s), using standard systems (whether they are bare metal systems, virtual hosts or containers), even with worldwide geographic distribution. We have many customers that have deployed a single directory service distributed across 3 to 6 data centers around the globe. The LDAP data model has a flexible schema that can be extended and customised without having to rebuild the database or even restart the servers. The data can even be exposed through versioned APIs using our REST API. Finally, the combination of a flexible and extensible schema with fine-grained access controls allows multiple applications to access the data, but with great control of which application can read or write which data. This results in a single identity and credentials for a user, but multiple sets of attributes that can be shared by applications or restricted to a single one: a single central view of the user that is then easier and more cost effective to manage.

Conclusion

Coming back to Kubernetes: because of the constraints of the current Directory Services Cloud Deployment Model with version 6.5, we would recommend that you keep your Directory Services deployments in VMs or on bare metal for now. But with the next release, which underpins the ForgeRock Cloud offering, we will fully support deploying Directory Services on Docker/Kubernetes. We will continue our investment in the product to be able to support auto-scaling (using data sharding) in subsequent releases. Building these solutions is not extremely difficult, but we need time to prove that they are 100% reliable in all conditions, because in the end, the most wanted and appreciated feature of ForgeRock Directory Services is its reliability.

This blog post was first published @ ludopoitou.com, included here with permission.

ForgeRock Identity Live Berlin

Last week, the IdentityLive tour stopped in Berlin for the first European event of 2019 (the second one will be in London on October 8th-9th).

It was a good opportunity to meet and discuss with our European customers (or the European teams of our global customers). For me, the main topic of discussion was Kubernetes and running Directory Services in Docker/K8S. It was also something that I’ve discussed a little bit during the Nashville Identity Live, but not as much as I did in Berlin. I also did a talk on that subject at the Identity Live Cloud Workshop (the second day of the event is focusing on the technical aspects of our products and solutions). I’ve started to write another article to detail my talk. I hope to publish it here in the next few days. Meanwhile, you can find all the photos from Identity Live Berlin on my Flickr page as usual.

ForgeRock IdentityLive Berlin 2019

Note that Identity Live Berlin took place at the “Classic Remise” which is a showroom for old and sports cars. An unusual place for a conference, but a good opportunity to admire some pretty old cars and try to take a different kind of photos.

Cars from Classic Remise Berlin

ForgeRock Identity Live Nashville, TN

Two weeks ago, the ForgeRock Identity Live series of events debuted. This year the USA based event moved to Nashville, TN.


This was my first visit to the city of Country Music and honky-tonks. It was fun listening to the live music everywhere, trying (and buying) boots, visiting the Country Music Hall of Fame, although we didn’t really have much time for leisure.


The Identity Live event itself was really good and very well attended. The engagement of our Customers and Partners was great, and we’ve had a myriad of discussions, feedback and questions about our products, our roadmap and our progress on our move to the Cloud.


The videos of the sessions are already available on the ForgeRock website. And you can also see the photos that I took during the event.

Next is Berlin Identity Live, on June 6-7. Registration is still open! I’m looking forward to seeing you in Berlin!