Logstash configuration for collecting OpenAM and OpenIDM logs


Following on from my previous post, here is a logstash configuration that collects logs from both OpenAM and OpenIDM and feeds them into Elasticsearch:



input {
  file {
    type => "idmRecon"
    start_position => "beginning"
    path => "/opt/openidm/audit/recon.csv"
  }
  file {
    type => "idmActivity"
    start_position => "beginning"
    path => "/opt/openidm/audit/activity.csv"
  }
  file {
    type => "amAccess"
    # start_position => "beginning"
    path => "/opt/openam/openam-config/openam/log/amAuthentication.*"
  }
}
filter {
  if [type] == "idmRecon" {
    csv {
      columns => [
        "_id","action","actionId","ambiguousTargetObjectIds","entryType","message","reconciling","reconId",
        "rootActionId","situation","sourceObjectId","status","targetObjectId","timestamp"
      ]
    }
    date {
      match => ["timestamp", "ISO8601"]
    }
  }
  if [type] == "idmActivity" {
    csv {
      columns => [
        "_id","action","activityId","after","before","changedFields","message","objectId","parentActionid",
        "passwordChanged","requester","rev","rootActionId","status","timestamp"
      ]
    }
    date {
      match => ["timestamp", "ISO8601"]
    }
  }
  if [type] == "amAccess" {
    csv {
      columns => [
        "time","Data","LoginID","ContextID","IPAddr","LogLevel",
        "Domain","LoggedBy","MessageID","ModuleName","NameID","HostName"
      ]
      # OpenAM access log fields are tab separated - the separator below is a literal tab
      separator => "	"
    }
    date {
      match => ["time", "yyyy-MM-dd HH:mm:ss"]
    }
    geoip {
      database => "/usr/share/GeoIP/GeoIP.dat"
      source => "IPAddr"
    }
  }
}
output {
  # Use stdout in debug mode again to see what logstash makes of the event.
  stdout {
    debug => true
    codec => rubydebug
  }
  elasticsearch { embedded => true }
}
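
To try this out, save the configuration to a file and point logstash at it. A minimal invocation, assuming the flat-jar distribution and a configuration file named openam-openidm.conf (the file name is just an example):

 java -jar logstash.jar agent -f openam-openidm.conf

The embedded Elasticsearch instance starts automatically thanks to the embedded => true setting in the output section.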



Now we can issue Elasticsearch queries across all of the data sets. Here is a very simple Kibana dashboard showing events over time and their source:

[Kibana dashboard screenshot: events over time, broken down by source]

While this configuration is quite basic, it allows us to find and correlate events of interest across OpenAM and OpenIDM.

Try searching for a sample user "fred" by entering the string into the top search box. You will see all OpenAM and OpenIDM events that contain this string in any field. You can of course build more specific queries, but the default free-form search does an excellent job.
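
You can run the same kind of free-form query directly against Elasticsearch's REST API. A quick sketch, assuming the embedded instance is listening on its default port of 9200:

 curl 'http://localhost:9200/_search?q=fred&pretty'

This performs a query-string search across all indices and fields and pretty-prints the matching events.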

5 Steps To Protecting Customer Identities

Bringing customers closer to an organization's services and applications is a key part of many Chief Digital Officers' (CDO) online strategy.  Organizations that have previously never provided their products and services online - I'm thinking traditional insurance providers, pension providers and other financial services - are now in a place where digitization of customer-purchased assets is critical to future business success.

The main priority of the CDO is often to deliver new or improved online services quickly, so that market opportunities can be seized.  Security and privacy are not necessarily their primary concern.  Historically, these functions have been seen as inhibiting user convenience, or as a slowing factor in the software development cycle, and are often applied retrospectively via audit and penetration testing.

So what main steps are important to securing customer identities?


1 - Identify & Register


Customers need a mechanism to register and identify themselves before they can access your online services, assets or applications.  This is generally done using a mixture of self-service, call centre and manual registration.  Unique usernames - if not using email-address-based identification - need to be enforced, as well as the ability to gather other personal attributes such as contact information.  These attributes can also be pulled from existing social network accounts using standards such as OAuth2 or OpenID Connect, as sketched below.
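
As an illustration of the social registration route, an application that has obtained an OAuth2 access token can pull profile attributes from the provider's OpenID Connect userinfo endpoint. The host, path and token below are placeholders - each provider publishes its own endpoint:

 curl -H "Authorization: Bearer $ACCESS_TOKEN" 'https://provider.example.com/oauth2/userinfo'

The JSON response carries standard claims such as name and email that can seed the new customer profile.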

2 - Verify, Correlate & Store


If using self-registration, a mechanism needs to be in place to verify that the end user is who they say they are.  This becomes vitally important when dealing with financial assets, policies and so on.  Verification can occur using several methods, including correlating attribute values such as account numbers, ZIP codes and other personal information back to an internally managed authoritative store.  Two-factor verification processes are also common here: issuing verification codes to either a registered email address or, more securely, to a pre-registered physical mailing address are two options.  The customer identity then needs storing in a globally available, highly scalable directory.  Depending on business requirements, existing customers may well number in the hundreds of thousands, whilst potential customers could run into the tens of millions, and this sort of scale needs to be considered.  The storage of passwords and other sensitive data also needs care, with hashing and salting applied throughout.  The algorithms and their implementation should come from existing, vetted frameworks rather than being homegrown - see the sketch below.
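
As a minimal illustration of "existing framework, not homegrown", here is one way to produce a salted password hash from the command line with OpenSSL (1.1.1 or later; the password is obviously just an example):

 # generate a random 16-character salt and hash with SHA-512 crypt
 openssl passwd -6 -salt "$(openssl rand -hex 8)" 'correct horse battery staple'

The same principle applies in application code: call a maintained implementation of bcrypt, scrypt or PBKDF2 rather than writing your own.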


3 - Context Over Risk


Risk is of course subjective, but methods must be in place to help identify risk and apply the necessary steps to reduce business exposure to things like fake accounts, incorrect access, redundant accounts and so on.  Applying the same rules to all users only masks the risk rather than identifying it.  Requiring two-factor authentication from previously unknown devices, for example, is a simple way to limit trust to known machines.  Device fingerprinting, and risk scoring based on when a user logs in, from which network and via which authentication source, go a long way towards providing identity assurance levels.  A user logging in from an unknown device using a social network account, for example, may carry a lower assurance level than a fully registered user authenticating against your customer directory.

4 - Give Them What They Want - But Not More Than They Need


Bringing customers closer to your brand, services or assets not only makes good business sense (opportunities for up-selling and cross-selling), but also provides the customer with the information they want. Each customer is unique and will require access to their personal policy data, account information, purchases, unique history and service choices.  That information needs to be delivered effectively across multiple device types, without the worry of cross-pollination of information or the risk of misaligned access.  Provide the customer with the information and services they need, either based on what they have purchased or what you want them to purchase.  This can be done via conditional policies, enforcement points and continual resource access checking.

5 - Be Adaptive


Most digital strategies are based on agile development and rapid go-to-market approaches. Taking 9-12 months to implement an online service is often too slow to be effective in keeping and gaining customers.  Generation Y users (not to mention Digital Natives) require mobile-ready content that is not inhibited by poorly constructed security and registration processes. The ability to rapidly build out new applications and services on top of the existing customer security platform is key to driving revenue and keeping customers close.  The security platform should expose loosely coupled interfaces, often based on things like REST, that allow key identity and access management services to be integrated without inhibiting the agile development of the core business services.  A sketch of what such an interface looks like follows.
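
By way of example, OpenAM exposes authentication as a simple REST call that any new application or service can consume. A minimal sketch - the hostname is a placeholder and demo/changeit are the stock sample credentials:

 curl -X POST \
      -H "X-OpenAM-Username: demo" \
      -H "X-OpenAM-Password: changeit" \
      -H "Content-Type: application/json" \
      'https://openam.example.com/openam/json/authenticate'

A successful response contains a tokenId that the application presents on subsequent requests, keeping authentication logic out of the business code entirely.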

By Simon Moffatt

[1] Image courtesy of http://www.sxc.hu/photo/854540

Collecting OpenAM logs with logstash


Logstash is a general purpose log collector that can read, transform and ship logs from a variety of sources.

The following logstash configuration will collect OpenAM access logs. The default target here is Elasticsearch - a document-oriented NoSQL database optimized for text search (perfect for log files).

In a future blog I will show you how you can use Kibana to make some sexy charts of your access data.

input {
  file {
    type => "amAccess"
    start_position => "beginning"
    path => "/path_to_your_install/openam/openam/log/amAuthentication.access"
  }
}
filter {
  if [type] == "amAccess" {
    csv {
      columns => [
        "time","Data","LoginID","ContextID","IPAddr","LogLevel",
        "Domain","LoggedBy","MessageID","ModuleName","NameID","HostName"
      ]
      # OpenAM access log fields are tab separated - the separator below is a literal tab
      separator => "	"
    }
    date {
      match => ["time", "yyyy-MM-dd HH:mm:ss"]
    }
    geoip {
      database => "/path_to_your/GeoIP.dat"
      source => "IPAddr"
    }
  }
}


Here is an upstart config file to start logstash:

 # logstash - indexer instance  
#
description "logstash indexer instance"
start on virtual-filesystems
stop on runlevel [06]
respawn
respawn limit 5 30
limit nofile 65550 65550
# set HOME to point to where you want the embedded elasticsearch
# data directory to be created and ensure /opt/logstash is owned
# by logstash:adm
env HOME=/opt/logstash
#env JAVA_OPTS='-Xms512m -Xmx512m'
chdir /opt/logstash
setuid ubuntu
setgid ubuntu
#setuid logstash
#setgid adm
console log
# for versions 1.1.1 - 1.1.4 the internal web service crashes when touched
# and the current workaround is to just not run it and run Kibana instead
script
exec /opt/java/bin/java -jar logstash.jar agent -f /opt/logstash/access.conf --log /opt/logstash/log.out
end script
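
To install the job, copy the file into upstart's configuration directory and start it (the job name simply follows the file name):

 sudo cp logstash.conf /etc/init/logstash.conf
 sudo start logstash
 # follow the collector's output
 tail -f /opt/logstash/log.out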

Protection & The Internet of Things

The 'Internet of Things' is one of the technical heatwaves that has genuinely got me excited over the last 24 months or so.  I've been playing with computers since I was 8 and like to think of myself as pretty tech-savvy.  I can code in a number of languages, understand different architectural approaches easily and pick up new technical trends naturally.  However, the concept of a truly connected world, with 'things' interconnected and graphed together, is truly mind-blowing.  The exciting thing for me is that I don't see the outcome.  I don't see the natural technical conclusion of devices and objects being linked to a single unique identity, where information can flow in multiple directions, originating from different sources and being made available in contextual bundles.  There is no limit.



There'll Be No 'Connected', Just 'On'

Today we talk about connectivity, wifi hotspots and 4G network coverage.  There is a powerful difference between being online and offline.  As soon as you're offline, you're invisible: lost, unable to get the information you need or to interact with your personal and professional networks. This concept is slowly dying.  The 'Internet' is no longer a separate object that we connect to explicitly.  Very soon, the internet will be so intrinsically tied to us that, without it, basic human interactions and decision making will become stunted.  That is why I refer to objects as just being 'on' - or maybe just 'being', but that is a little too sci-fi for me.  Switching an object on, purchasing it, enabling it or checking in to it will make that device become 'smart' and tied to us.  It will have an IP address and be able to communicate, send messages, register, interact and contain specific contextual information.  A simple example is the many running shoe companies that now provide GPS, tracking and training support information for a new running shoe.  That information is specific to an individual, centrally correlated and controlled, and then shared socially to allow better route planning and training techniques to be created and exchanged.


Protection, Identity & Context

But what about protection?  What sort of protection?  Why does this stuff need protecting in the first place, and from what?  The more we tie individual devices to our own unique identity, the more information, services and objects we can consume, purchase and share.  Retailers see the benefit in being able to provide additional services and contextual information to a customer, as it makes them stickier to the brand.  The consumer and potential customer receives a more unique service, requiring less explicit searching and decision making.  Everything becomes personalised, which results in faster and more tailored acquisition of services and products.

However, that information exchange requires protection.  Unique identities need to be created - either for the physical person, or for the devices being interacted with.  These identities will also need owners, custodians and access policies that govern the who, what and when of interactions.  The running shoe example may seem unimportant, but apply that logic to your fridge: being able to manage and monitor the contents of your refrigerator, with automatic ordering and so on, seems like a dream.  But how might that affect your health insurance policy?  What about when you go on holiday and don't order any food for 3 weeks?  Ideal fodder for a burglar.  The more we connect to our own digital persona, the more those interactions need authentication, authorization and identity management.

Context plays an important part here too.  Objects - like the people in our own social graphs - have many touch points and information flows.  A car is a simple example.  It will have a manufacturer (interested in safety, performance and so on), a retailer (interested in usage and years of ownership), an owner (perhaps interested in servicing and crash history) and then other parties such as governments and the police, not to mention potential future owners and insurance companies.  The context from which an interacting party comes will obviously determine what information they can consume and contribute to.  That access will also need managing from an authorization perspective.


Whilst the 'Internet of Things' may seem like buzz, it has a profound impact on how we interact with physical, previously inanimate objects.  As soon as we digitize and contextualize them, we can reap significant benefits when it comes to implicit information searching and tailor-made services.  But for that to work effectively, a correct balance with identity and access control needs to be found.

By Simon Moffatt

Image courtesy of http://www.sxc.hu/photo/472281