Many of today's security models spend a lot of time focusing on network segmentation and authentication. Both of these concepts are critical in building out a baseline defensive security posture. However, there is a major area that is often overlooked, or at least simplified to a level of limited use: authorization. That is, working out what a user, service or thing should be able to do within another service. The permissions. The entitlements. The access control entries. I don't want to give an introduction to the many, sometimes academic, acronyms and ideas around authorization (see RBAC amongst others). I want to spend a page delving into some of the current and future requirements surrounding distributed authorization.
New Authorization Requirements
Classic authorization modelling tends to have a centralised policy decision point (PDP) - a central location that applications, agents and other SDKs call in order to get a decision regarding a subject/object/action combination. The PDP contains signatures (or policies) that map the objects and actions to a bunch of users and services.
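To make the classic model concrete, here is a minimal sketch of a central PDP. The `Policy` and `CentralPDP` names, and the policy shape, are my own assumptions for illustration rather than any particular product's API. Every protected service would have to call `decide` remotely for each request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which subjects may perform an action on an object."""
    obj: str
    action: str
    subjects: frozenset


class CentralPDP:
    """Illustrative central policy decision point (not a real product API)."""

    def __init__(self, policies):
        # Index the policies by (object, action) for a direct lookup.
        self._index = {(p.obj, p.action): p.subjects for p in policies}

    def decide(self, subject: str, obj: str, action: str) -> bool:
        # Anything without a matching policy is denied by default.
        return subject in self._index.get((obj, action), frozenset())


pdp = CentralPDP([Policy("meeting-room", "open", frozenset({"bob", "alice"}))])
print(pdp.decide("bob", "meeting-room", "open"))
print(pdp.decide("bob", "safe-room", "open"))
```

In a real deployment the `decide` call sits behind a network hop, which is exactly the round trip that becomes expensive as the number of callers grows.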
That call-out process is now a bottleneck, for several reasons. Firstly, the number of systems being protected is rapidly increasing, with the era of microservices, APIs and IoT devices all needing some sort of access control. Having them all hit a central PDP doesn't seem a good use of network bandwidth or central processing power. Secondly, that increase in objects also gives way to a more meshed and federated set of interactions, where microservices and IoT are more common.
This gives way to a more distributed enforcement requirement. How can the protected object perform an access control evaluation without having to go back to the mother ship? There are a few things that could help.
Firstly, we probably need to achieve three things: work out how to identify the calling user or service (aka the authentication token), map that identity to what it can do, and finally make sure that is what actually happens. The first part is often completed using tokens - and in the distributed world, a token that has been cryptographically signed by a central authority. JSON Web Tokens (JWTs) are popular, but not the only approach.
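As a rough illustration of the signed token idea, the sketch below builds and checks a JWT-shaped token using only the Python standard library. It assumes a shared HMAC key (the HS256 style), which is a simplification: in a distributed setup the central authority would more likely sign with a private key and hand out only the public key. The function names are mine, and expiry checks are left out for brevity.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, key: bytes) -> str:
    """Central authority side: sign header.payload with HMAC-SHA256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_token(token: str, key: bytes) -> dict:
    """Protected object side: recompute the signature, then trust the claims."""
    header, payload, signature = token.split(".")  # ValueError if malformed
    # The algorithm is fixed here rather than read from the header.
    expected = hmac.new(key, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload))


key = b"demo-key-for-illustration-only"  # assumption: shared secret
token = issue_token({"sub": "bob"}, key)
print(verify_token(token, key)["sub"])
```

The point of interest is that `verify_token` needs no network call: the protected object only needs the key material to decide whether the token is genuine.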
The second part - working out what they can do - could be handled in two slightly different ways. One is that the calling subject brings with them what they can do. They could do this by having the cryptographically signed token contain their access control entries. This approach would require the service that issues tokens to also know what the calling user or service could do, so it would need to have knowledge of the access control entries to use. That list of entries would also need things like governance, audit, version control and so on, but that is needed regardless of where those entries are stored.
So here, a token gets issued and the objects being protected have a method to cryptographically validate the presented token, extract the access control entries (ACEs) and enforce what is being asked.
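Once the token has been validated, the enforcement step itself can be very small. The sketch below assumes the verified claims arrive as a dictionary with an `aces` list of `{"object": ..., "actions": [...]}` entries. That claim layout, and the `MeetingRoomService` class, are hypothetical choices of mine, not a standard.

```python
def is_permitted(claims: dict, obj: str, action: str) -> bool:
    """Check the ACEs carried inside an already-verified token."""
    for ace in claims.get("aces", []):
        if ace.get("object") == obj and action in ace.get("actions", []):
            return True
    return False  # deny by default


class MeetingRoomService:
    """Hypothetical protected object that enforces locally."""

    name = "meeting-room"

    def handle(self, claims: dict, action: str) -> str:
        if not is_permitted(claims, self.name, action):
            raise PermissionError(f"{claims.get('sub')} may not {action} {self.name}")
        return f"{claims['sub']} performed {action} on {self.name}"


# Assumption: these claims came out of a signature check like the one sketched earlier.
claims = {
    "sub": "bob",
    "aces": [{"object": "meeting-room", "actions": ["open"]}],
}
print(MeetingRoomService().handle(claims, "open"))
```

Note that the service holds no policy of its own here. Everything it needs to decide arrives inside the token.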
Having a token that contains the actual ACE is not that new. Capability Based Access Control (CBAC) follows this concept, where the token could contain the object and associated actions. It could also contain the subject identifier, or perhaps that could be delivered as a separate token. A similar practical implementation is described in Google's Macaroons.
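For a feel of how a capability style token can be narrowed down by whoever holds it, here is a simplified sketch in the spirit of the chained HMAC construction from the Macaroons paper. It covers only simple "key = value" restrictions checked against the request, and it is not the wire format or API of any Macaroons library. The function names and the caveat syntax are my own assumptions.

```python
import hashlib
import hmac


def _mac(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Issuer: the first signature is keyed with the root key."""
    return {"id": identifier, "caveats": [], "sig": _mac(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Holder: add a restriction. The old signature becomes the new HMAC key,
    so no root key is needed and the caveat cannot later be stripped off."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _mac(token["sig"], caveat),
    }


def verify(token: dict, root_key: bytes, request: dict) -> bool:
    """Protected object: replay the chain, then check every caveat holds."""
    sig = _mac(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _mac(sig, caveat)
    if not hmac.compare_digest(sig, token["sig"]):
        return False
    for caveat in token["caveats"]:
        key, _, value = caveat.partition(" = ")
        if request.get(key) != value:
            return False
    return True


root_key = b"demo-root-key"  # assumption: known to issuer and verifier only
token = attenuate(attenuate(mint(root_key, "bob"), "object = meeting-room"), "action = open")
print(verify(token, root_key, {"object": "meeting-room", "action": "open"}))
print(verify(token, root_key, {"object": "meeting-room", "action": "close"}))
```

The design choice worth noticing is that restrictions can only ever be added. Dropping a caveat changes the expected signature chain, so the token fails verification.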
What we've achieved here is to basically remove the access control logic from the object or service, but equally to remove the need to perform a call back to a policy mother ship.
A subtly different approach is to pass the access control logic back down to the object - but instead of it originating within the service itself, it is still owned and managed by a central authority, just distributed to the edges.
This allows for local enforcement, but central governance and management. Modern distribution technologies like web sockets could be useful for this. In addition, even flat file formats like JSON and YAML could allow for a "repave and replace" approach as policy definitions change, which fits nicely into DevOps deployment models.
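A "repave and replace" policy store at the edge could look something like the sketch below. The JSON bundle layout (`version` plus a list of `rules`) and the `LocalPolicyStore` class are assumptions for illustration. How the bundle reaches the service (web socket push, file drop, deployment pipeline) and how its integrity is checked are deliberately left out.

```python
import json


class LocalPolicyStore:
    """Hypothetical edge-side store: policy is authored centrally, enforced locally."""

    def __init__(self):
        self.version = None
        self._rules = frozenset()

    def repave(self, bundle_json: str) -> None:
        """Replace the whole rule set with a new bundle; nothing is patched in place."""
        bundle = json.loads(bundle_json)
        rules = frozenset(
            (rule["subject"], rule["object"], rule["action"]) for rule in bundle["rules"]
        )
        # Only swap once the new bundle has parsed cleanly.
        self.version, self._rules = bundle["version"], rules

    def is_permitted(self, subject: str, obj: str, action: str) -> bool:
        return (subject, obj, action) in self._rules


store = LocalPolicyStore()
store.repave(json.dumps({
    "version": 1,
    "rules": [{"subject": "bob", "object": "meeting-room", "action": "open"}],
}))
print(store.version, store.is_permitted("bob", "meeting-room", "open"))

# A later bundle from the central authority withdraws Bob's access.
store.repave(json.dumps({"version": 2, "rules": []}))
print(store.version, store.is_permitted("bob", "meeting-room", "open"))
```

Because each bundle is a complete replacement, the version number alone tells an auditor exactly which policy a service was enforcing at a given time.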
The object itself would still need to know a few things to make the enforcement complete - a token representing the user or service, and some context to help validate the request.
Access control decisions generally require the subject, the object and any associated actions. For example, subject=Bob could perform actions=open on object=Meeting Room. Another dimension that is now required, especially within zero trust based approaches, is that of context. In Bob's example, context may include time of day, day of the week, or even the project he is working on. They could all impact the decision.
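Bob's example can be written down as a small context-aware rule. In this sketch the `Rule` fields, the office hours and the project name are all made-up values for illustration. The point is simply that the decision takes context as an input alongside subject, action and object.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule combining subject/action/object with context."""
    subject: str
    action: str
    obj: str
    weekdays: frozenset  # datetime.weekday() values, 0 = Monday
    start_hour: int      # inclusive
    end_hour: int        # exclusive
    project: str


def decide(rule: Rule, subject: str, action: str, obj: str,
           when: datetime, project: str) -> bool:
    return (
        (subject, action, obj) == (rule.subject, rule.action, rule.obj)
        and when.weekday() in rule.weekdays
        and rule.start_hour <= when.hour < rule.end_hour
        and project == rule.project
    )


rule = Rule("Bob", "open", "Meeting Room",
            weekdays=frozenset({0, 1, 2, 3, 4}), start_hour=8, end_hour=18,
            project="apollo")

monday_morning = datetime(2024, 1, 8, 10, 0)
saturday_morning = datetime(2024, 1, 13, 10, 0)
print(decide(rule, "Bob", "open", "Meeting Room", monday_morning, "apollo"))
print(decide(rule, "Bob", "open", "Meeting Room", saturday_morning, "apollo"))
```

The same subject, action and object give different answers purely because the context changed.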
Previous access control requests and decisions could also come into play here. For example, say Bob was just given access to the Safe Room where the gold bullion was stored. Maybe his request two minutes later to gain access to the Back Door is denied. If that first request didn't occur, perhaps his request to open the Back Door is legitimate and is permitted.
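The Safe Room and Back Door scenario can be sketched as a guard that remembers recently granted requests. The five minute window, the hard-coded rule and the `HistoryAwareGuard` name are assumptions of mine. A real system would combine this with the normal permission check, which is skipped here.

```python
class HistoryAwareGuard:
    """Hypothetical guard whose decisions depend on recently granted requests."""

    def __init__(self, window_seconds: float = 300.0):
        self.window = window_seconds
        self._granted = {}  # subject -> list of (timestamp, object)

    def request(self, subject: str, obj: str, now: float) -> bool:
        history = self._granted.setdefault(subject, [])
        # Forget grants that have fallen outside the time window.
        history[:] = [(t, o) for t, o in history if now - t <= self.window]
        # Illustrative rule: no Back Door shortly after the Safe Room.
        if obj == "Back Door" and any(o == "Safe Room" for _, o in history):
            return False
        history.append((now, obj))
        return True


guard = HistoryAwareGuard()
print(guard.request("Bob", "Safe Room", now=0.0))
print(guard.request("Bob", "Back Door", now=120.0))
print(guard.request("Alice", "Back Door", now=120.0))
```

This kind of state is awkward to carry in a token alone, which is one reason the protected object, or something close to it, needs to capture context for itself.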
The capturing of context, both at authentication time and at authorization evaluation time, is now critical, as it allows the object to have a much clearer understanding of how to handle access requests.
ML - Defining Normal
I've talked a lot so far about access control logic and where that should sit. Well, how do we know what that access control logic looks like? I spent many a year designing role based access control systems (wow, that was 10+ years ago), using an approach known as role mining. Big data crunching before machine learning was in vogue. Taking groups of users and trying to understand what access control patterns existed, and trying to shoehorn the results into business and technical roles.
Today, there are loads of great SaaS based machine learning systems that can take user activity logs (logs that describe user to application interactions) and provide views on whether activity levels are "normal" - normal for the user, normal for their peers, their business unit, location, purchasing patterns and so on. The typical "access path analytics". The output of that process can be used to help define the initial baseline policies.
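To show the shape of that pipeline, the sketch below derives a baseline from activity logs using a plain frequency threshold. This is a deliberately crude stand-in for the machine learning described above, not a model of how any SaaS product works. The log format and the `min_count` threshold are assumptions.

```python
from collections import Counter


def baseline_policies(activity_log, min_count: int = 3) -> set:
    """Treat a (user, application) pair as 'normal' once it has been seen often enough."""
    counts = Counter(activity_log)
    return {pair for pair, seen in counts.items() if seen >= min_count}


def is_normal(baseline: set, user: str, application: str) -> bool:
    return (user, application) in baseline


# Assumption: each log entry is a (user, application) tuple.
activity_log = [
    ("bob", "payroll"), ("bob", "payroll"), ("bob", "payroll"),
    ("bob", "crm"),
    ("alice", "crm"), ("alice", "crm"), ("alice", "crm"), ("alice", "crm"),
]

baseline = baseline_policies(activity_log)
print(sorted(baseline))
print(is_normal(baseline, "bob", "crm"))
```

A real system would also compare each user against their peers and business unit, but the output has the same shape: a candidate set of entries that can seed the initial policies.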
Enforcing access based on policies alone is not enough though. It is time consuming and open to many avenues of circumvention. Machine learning also has a huge role to play within the enforcement aspect, especially as the idea of context, and what is valid or not, becomes a highly complicated and ever changing question.
One of the key issues of modern authorization is the distinction between access control logic, enforcement and the vehicles used to deliver the necessary parts to the protected services.
Where possible, these should be kept modular, to allow for future proofing and the ability to design a secure system that is flexible enough to meet business requirements, scale out to millions of transactions a second and integrate thousands of services simultaneously.