Blocking on Promises (Hard-learned lessons on asynchronous programming)

OpenIG is now 100% asynchronous! In other words, we’re using a lot of Promises. Recently, we faced a strange issue where a thread remained in the WAITING state, waiting for an HTTP response to arrive.

Here is the thread dump we got:

"I/O dispatcher 1" #13 prio=5 os_prio=31 tid=0x00007f8f930c3000 nid=0x5b03 in Object.wait() [0x000070000185d000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x000000076b155b80> (a org.forgerock.util.promise.PromiseImpl)
	at java.lang.Object.wait(Object.java:502)
	at org.forgerock.util.promise.PromiseImpl.await(PromiseImpl.java:618)
	- locked <0x000000076b155b80> (a org.forgerock.util.promise.PromiseImpl)
	at org.forgerock.util.promise.PromiseImpl.getOrThrow(PromiseImpl.java:144)

OK, to tell the truth, the code was performing a blocking call on a Promise<Response>, so we got what we deserved, right? Well, that code had been around (in more or less the same form) for a long time, and, AFAIK, nobody had ever experienced a thread blockage issue.

Here is the code where the blocking call happened:

try {
  Promise<JsonValue, OAuth2ErrorException> promise = registration.getUserInfo(context, session);
  return promise.getOrThrow(); // < - - - - - - block here
} catch (OAuth2ErrorException e) {
  logger.error(...);
} catch (InterruptedException e) {
  logger.error(...);
}

Dead simple, isn’t it?

The strangest thing happened when we set a timeout on the promise (using getOrThrow(10, SECONDS)). After the timeout expired, the Promise unblocked and we saw a real Response inside (with an associated SocketTimeoutException), as if it had been there all along, but without the promise ever triggering its callbacks.

How could this be possible? We had a thread waiting for the result of another HTTP request, even though the HTTP client library in use (Apache HttpAsyncClient in our case) is supposed to manage its threads by itself (and correctly).

Well, we had to dig, but we found the key deep inside the HTTP library:

// Distribute new channels among the workers
final int i = Math.abs(this.currentWorker++ % this.workerCount);
this.dispatchers[i].addChannel(entry);

What is this code doing?

This code is called when an NIO event reaches the HTTP library (such as the content of a response). It basically selects one of the worker threads to be responsible for processing the response.

Is this wrong?

It depends on your point of view ;) Initially, I thought it was plain wrong: this code doesn’t know whether the selected thread is busy doing something else or blocked.

After a bit more thought, it’s not that obvious: because responses are processed asynchronously, the request and response flows are clearly decoupled, so there is no easy way to know whether the requestor thread is the same thread that will process the response.

So what happened?

The scenario is quite simple:

  • Create a CHF HttpClientHandler
  • Send the first HTTP request
  • When the response is there, trigger another HTTP call
  • See the blocked thread

In practice, you may have to adjust the number of workers until you find a setting where the distribution function assigns the response to the requestor’s thread. The easiest configuration is to use a single thread :)

Here is a code sample to reproduce the “issue”:

// Create an HTTP Client with a single thread
Options options = Options.defaultOptions()
                         .set(AsyncHttpClientProvider.OPTION_WORKER_THREADS, 1);
HttpClientHandler client = new HttpClientHandler(options);

// Perform a first request
Promise<Response, NeverThrowsException> main;
Request first = new Request().setMethod("GET").setUri("http://forgerock.org");
main = client.handle(new RootContext(), first)
             .then(value -> {
                 // Perform a second request on the thread used to receive the response
                 try {
                     Request second = new Request().setMethod("GET")
                                                   .setUri(URI.create("http://www.apache.org"));
                     return client.handle(new RootContext(), second)
                                  // and block here
                                  .getOrThrow(5, TimeUnit.SECONDS);
                 } catch (InterruptedException e) {
                     return newInternalServerError(e);
                 } catch (TimeoutException e) {
                     return newInternalServerError(e);
                 }
             });

// Get the response on the "main" thread
Response response = main.getOrThrow();
long length = response.getHeaders().get(ContentLengthHeader.class).getLength();
System.out.printf("response size: %d bytes%n", length);

Note that you can clone the sauthieg/blocking-on-promise GitHub repository if you want to play with that code by yourself.

The solution

Avoid the blocking call and use the Promise with appropriately typed callbacks at every step of the processing.

Registering callbacks (ResultHandler, Function or AsyncFunction) instead of actively waiting for a result/failure prevents any form of thread blockage.

So now the caller thread is not blocked: it becomes available for its next task as soon as all callbacks are registered on the promise.
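Applied to the reproduction sample above, the same idea means chaining the second request instead of waiting for it inside the callback. Here is a rough sketch only (assuming Promise.thenAsync from the same Promise API, and reusing the client and first objects from the sample above):

Promise<Response, NeverThrowsException> main =
        client.handle(new RootContext(), first)
              .thenAsync(firstResponse -> {
                  Request second = new Request().setMethod("GET")
                                                .setUri(URI.create("http://www.apache.org"));
                  // Return the Promise instead of calling getOrThrow():
                  // the worker thread is released immediately and the second
                  // response is processed whenever it arrives.
                  return client.handle(new RootContext(), second);
              });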

Bad code example

try {
  Promise<JsonValue, OAuth2ErrorException> promise = registration.getUserInfo(context, session);
  JsonValue info = promise.getOrThrow(); // < - - - - - - block here
  return new Response(Status.OK).setEntity(info);
} catch (OAuth2ErrorException e) {
  return newInternalServerError(e);
} catch (InterruptedException e) {
  logger.error(...);
}

Good code example

return registration.getUserInfo(context, session)
                   .then((info) -> {
                     // process the result when it will be available
                     return new Response(Status.OK).setEntity(info);
                   },
                   (e) -> {
                     // Convert exception
                     return newInternalServerError(e);
                   });

The conclusion

Never block any threads when you’re doing asynchronous processing.

The async programming model is designed to maximize use of the machine’s resources, and it implicitly requires that there are no blocking calls on the stack. As there should be no blocked threads at any time, any thread can be selected to process a response. That explains why our HTTP library does not even try to check whether the elected thread is busy.

More pragmatically, when using our Promise API, you’ll know that you’re in trouble (and a potential victim of that threading issue) if you see code that uses one of the get() method variants of the Promise interface.

In OpenIG, this can be in any Filter or Handler that you write yourself, or in any Groovy script. So take a look at the code you execute in OpenIG: we make a point of writing 100% asynchronous, non-blocking code; what about you?

Exhaustive list of blocking methods in Promise

  • Promise.get() / Promise.get(long, TimeUnit)
  • Promise.getOrThrow() / Promise.getOrThrow(long, TimeUnit)
  • Promise.getOrThrowUninterruptibly() / Promise.getOrThrowUninterruptibly(long, TimeUnit)

OpenIG 4.0 is now available

This blog post was first published @ sauthieg.github.io, included here with permission.

January’s release of the ForgeRock Identity Platform includes OpenIG 4. This release brings new API gateway features, better integration with OpenAM, extended support for standards, and increased performance.

OpenIG 4’s new audit framework now handles audit events in a common way across the whole ForgeRock platform. For example, OpenIG 4 can track interactions across OpenAM, OpenDJ, and OpenIDM. Audit logs can be centralized and transactions can be traced across the platform. Additionally, the audit framework supports logging to files, databases, and the UNIX system log.

Improved monitoring data for the servers, applications, and APIs provides a better view of how OpenIG 4 and its routes are used. Delivered through REST endpoints, data includes request and response statistics, such as the number of requests, time to respond, and throughput.

The new throttling feature limits access to applications and APIs, increasing security and fairness. Throttling can enforce flexible rate limits for a variety of use cases, such as to limit the number of requests per minute from clients at the same network address.

Several new features improve integration with OpenAM:

  • A new policy enforcement filter allows only authorized access to protected resources. You can now use OpenIG instead of an OpenAM agent for authorization, and centralize all your access control policies in OpenAM.
  • SSO and federation for applications has been extended by a token transformation filter to use with the OpenAM REST Security Token Service. By using the filter, a mobile app with an OpenID Connect token can now access resources held by a federated service provider.
  • A new password replay filter simplifies the configuration for replaying credentials in common use cases.

Support for standards has been extended:

  • OpenID Connect Discovery makes it possible for users themselves, instead of system administrators, to select identity providers.
  • Initial support is available for a User Managed Access resource server, where users can control who accesses their resources, when, and under what conditions.

Behind the scenes, OpenIG 4 internals have been refactored to improve scalability – because we are no longer blocking threads, a single deployment can handle more requests at the same time.

These are just some of the changes in OpenIG 4. Check the Release Notes for a full list of what’s new in this release, and download the software from ForgeRock’s BackStage.

We love your feedback. Please feel free to ask questions, make suggestions, and tell us what you think of OpenIG by joining the community and getting on the forum and mailing list.

Note: If you happened to notice that my English has noticeably improved in this post, there is a reason :) I would therefore like to give full credit to Joanne, our new tech writer, who wrote all of this.

The break is over

It has been a year since my last post…

In the meantime, quite a few things happened: a baby boy in February, a new flat, some major OpenIG refactorings, and two releases.

Quite a busy year after all :)

Let’s have some kind of retrospective on the year…

January under the West Coast Sun

2015 started (from a professional point of view) very smoothly: we had our yearly company meeting in the first week of January!

Heading to San Diego, US west coast, 7 time zones to cross, we started the journey at 4am (CET) and finally arrived at 7pm (PST).

We had 3 wonderful days, meeting with ForgeRockers from all around the globe. Nice place and weather, cool team-building activities, interesting people: everything was there for a great week!

It was a pleasure to finally meet people you usually only interact with through HipChat/Skype. We got good feedback on OpenIG from sales and sales engineering. Engineering breakout sessions were organized (well, un-organized, unconference style :)): stateless sessions and new (common) projects were on the plate.

That was a very pleasant (to say the least) experience, leaving us both eager to attend the next one and energized for the year :)

New forgerock.org Web Site

You probably noticed already, since this is not really new at the time of writing, but we launched a whole new forgerock.org web site: good-bye Maven-generated sites (well, they are still around because some content has not yet found its new home), welcome to the future:

  • Gamification support (you gain points when you participate)
  • Easier access to online resources (downloads, docs, sources, blogs, …)
  • General and per-project forums
  • CSS harmony ;)

Rebranded Documentation

Mark and his team did an amazing job this year refreshing the documentation’s style.

I have to admit that the first time I saw the new documentation, it was like … Wow!

That was so refreshing; reading the doc became a pleasure again.

Note that the documentation team kept going, and they also provided documentation that fits perfectly into ForgeRock’s BackStage site!

Good job guys!

Great Git Migration

Ahhh, a technical item, finally (I can hear you, you know ;)).

Since day one, OpenIG’s source code (like that of most other ForgeRock projects, most probably) had been hosted on a Subversion server.

It was time to move on.

Frankly, I don’t recall ever having used Subversion for OpenIG :) The first thing I did when I was hired was to git svn clone the OpenIG source code!

Over time, I demoed Git features to co-workers and team members, and gradually they made their own clones and started enjoying working on local branches, reworking history, …

So, in the end, I think we were the team most ready for the Git migration: from developers to QA and doc writers, everybody felt quite comfortable with Git!

We are lucky to be a small (but still complex enough) project. That made OpenIG the candidate of choice for trying imports, giving feedback, and, most importantly, being the first product migrated!

Kudos to the release engineering team for achieving this huge task, providing support for the whole company!

New Hires!

The OpenIG project welcomed two new hires in 2015: Laurent Vaills (Senior Developer) and Joanne Henry (Doc Writer).

Laurent started in difficult conditions: his first days were all about preparing the San Diego trip and then heading to the US! It could have been worse :)

Technically, Joanne started at the beginning of 2016, just like Laurent: jumping straight in with the annual company meeting.

Welcome to both of you.

Not One but Two OpenIG Releases

The team did a tremendous job releasing two OpenIG versions in 2015:

  • OpenIG 3.1.1, a sustaining release with important bug fixes for customers
  • OpenIG 4.0.0, a major release with loads of features: UMA, PEP, STS (more on these weird acronyms in a later dedicated post), …

This year in numbers in OpenIG-land:

  • 68384 lines were added
  • 101208 lines were deleted
  • 521 commits
  • 13 contributors
  • 172 pull requests
  • 2512 comments

That was a busy year, I told you so :)

OpenIG 3.0 Migration CLI

ForgeRock OpenIG 3.1 was recently released (on the 12th of December, on time ;) ) and it's time for you to give it a try! Already done that? Great! But wait a minute, what are all these deprecation warning messages you see in the logs?

WED DEC 17 16:46:03 CET 2014 (WARNING) file:/.../config/config.json
The configuration field heap/objects has been deprecated. Heap objects should now be listed directly in the top level "heap" field, e.g. { "heap" : [ objects... ] }.
------------------------------
WED DEC 17 16:46:03 CET 2014 (WARNING) file:/.../config/config.json
[/] The 'handlerObject' attribute is deprecated, please use 'handler' instead
------------------------------

Don't worry, these messages are only warnings; your old configurations can still be parsed by OpenIG 3.1: everything is backward compatible. We just warn you that, in the next major release, these elements are likely to be unsupported (in other words, they will probably be ignored).

Fortunately, to reduce the burden of migrating your config files manually (always an error-prone operation), I wrote a small toolkit that will massage your existing JSON configuration and produce a 3.1-compatible JSON.

At the time of writing, the openig-migration toolkit supports the following migration actions:

  • heap/objects array has been simplified to just heap
  • Inline heap object declarations (when possible)
  • Remove empty "config": {} elements
  • Rename RedirectFilter to LocationHeaderFilter
  • Rename handlerObject to handler
  • Rename OAuth2ResourceServerFilter deprecated attributes to new names

Usage

As there is no binary available yet, you have to build it yourself (make sure you use JDK 8 at both compile time and runtime):

git clone https://github.com/sauthieg/openig-migration.git
cd openig-migration
mvn clean install

Then execute it, passing the path to the JSON document to migrate:

java -jar target/openig-migration-1.0-SNAPSHOT-jar-with-dependencies.jar .../config.json

This command outputs the transformed JSON to System.out.

Example

This example is extracted from the OpenIG 3.0 documentation:

{
  "heap": {
    "objects": [
      {
        "name": "DispatchHandler",
        "type": "DispatchHandler",
        "config": {
          "bindings": [
            {
              "condition": "${exchange.request.uri.path == '/login'}",
              "handler": "LoginChain",
              "baseURI": "http://TARGETIP"
            },
            {
              "handler": "OutgoingChain",
              "baseURI": "http://TARGETIP"
            }
          ]
        }
      },
      {
        "name": "LoginChain",
        "type": "Chain",
        "config": {
          "filters": [
            "LoginRequest"
          ],
          "handler": "OutgoingChain"
        }
      },
      {
        "name": "LoginRequest",
        "type": "StaticRequestFilter",
        "config": {
          "method": "POST",
          "uri": "https://TARGETIP/login",
          "form": {
            "USER": [
              "myusername"
            ],
            "PASSWORD": [
              "mypassword"
            ]
          }
        }
      },
      {
        "name": "OutgoingChain",
        "type": "Chain",
        "config": {
          "filters": [
            "CaptureFilter"
          ],
          "handler": "ClientHandler"
        }
      },
      {
        "name": "CaptureFilter",
        "type": "CaptureFilter",
        "config": {
          "captureEntity": false,
          "file": "/tmp/gateway.log"
        }
      },
      {
        "name": "ClientHandler",
        "comment": "Responsible for sending all requests to remote servers.",
        "type": "ClientHandler",
        "config": {}
      }
    ]
  },
  "handlerObject": "DispatchHandler"
}

... and is migrated to:

{
  "heap": [
    {
      "name": "DispatchHandler",
      "type": "DispatchHandler",
      "config": {
        "bindings": [
          {
            "condition": "${exchange.request.uri.path == '/login'}",
            "handler": {
              "name": "LoginChain",
              "type": "Chain",
              "config": {
                "filters": [
                  {
                    "name": "LoginRequest",
                    "type": "StaticRequestFilter",
                    "config": {
                      "method": "POST",
                      "uri": "https://TARGETIP/login",
                      "form": {
                        "USER": [
                          "myusername"
                        ],
                        "PASSWORD": [
                          "mypassword"
                        ]
                      }
                    }
                  }
                ],
                "handler": "OutgoingChain"
              }
            },
            "baseURI": "http://TARGETIP"
          },
          {
            "handler": "OutgoingChain",
            "baseURI": "http://TARGETIP"
          }
        ]
      }
    },
    {
      "name": "OutgoingChain",
      "type": "Chain",
      "config": {
        "filters": [
          {
            "name": "CaptureFilter",
            "type": "CaptureFilter",
            "config": {
              "captureEntity": false,
              "file": "/tmp/gateway.log"
            }
          }
        ],
        "handler": {
          "name": "ClientHandler",
          "comment": "Responsible for sending all requests to remote servers.",
          "type": "ClientHandler"
        }
      }
    }
  ],
  "handler": "DispatchHandler"
}

Notice the now-inlined object declarations, which make it easier to follow the execution flow and understand what happens to your request. The empty, unnecessary elements have been removed too.

Limitations

At the time of writing, some old JSON files may be incorrectly handled by the migration CLI:

  • Incorrectly escaped regular expressions (in EntityExtractFilter)
  • Multi-line Strings (without a line ending)

Contributions

Feel free to open issues, and/or fork the repository for any improvements you can think of.

I'll be happy to consider all pull requests.

Happy hacking!

OpenIG 3.1, minor release but loads of improvements

Ho ho ho, Christmas is coming early this year! I'm very pleased to announce OpenIG 3.1. This minor release focused on usability improvements, monitoring enablement, session management and, obviously... bug fixing.

For this Christmas release, the OpenIG team has been hard at work over the last few weeks to deliver (on time) a delightful version.

Improved configuration file readability/usability

We learned the hard way that trying to understand config.json and sibling route configuration files can give you headaches! Mentally reconstructing a graph of objects when its representation is completely flat (a list with named pointers) is just too hard.

So we decided to make it simpler: life will be easier for you when writing your configurations, and for us when you send us your non-working configurations to debug ;)

Here is the list of improvements with regard to ease of configuration:

  • Object declarations can be inlined: when an attribute is a reference to another defined heap object, you can now directly include the whole object configuration in place of the object name. Notice that inlining only makes sense when the referenced object is not used elsewhere in your configuration.
    • Inline declarations don't require a name attribute (but this is still useful for identifying source objects when looking at log messages)
    • If your configuration has just one main handler (as is usually the case), you can even omit the heap object array and directly define your object inside the handler element
  • Empty "config": { } elements can be removed
  • Aligned configuration attributes to use the same name when possible:
    • OAuth 2.0 Client and ResourceServer filter have been aligned on providerHandler, requireHttps, cacheExpiration and scopes (that is now, in both cases, a list of Expression instead of just String)
    • config.json: handlerObject is now handler (like in route config files)

This is easier to see with an example. Here is your config before:

{
  "heap": {
    "objects": [
      {
        "name": "Chain",
        "type": "Chain",
        "config": {
          "filters": [
            "ReplaceHostFilter"
          ],
          "handler": "Router"
        }
      },
      {
        "name": "ReplaceHostFilter",
        "type": "HeaderFilter",
        "config": {
          "messageType": "REQUEST",
          "remove": [
            "host"
          ],
          "add": {
            "host": [
              "example.com"
            ]
          }
        }
      },
      {
        "name": "Router",
        "type": "Router",
        "config": {}
      }
    ]
  },
  "handler": "Chain"
}

And now:

{
  "handler": {
    "type": "Chain",
    "config": {
      "filters": [
        {
          "type": "HeaderFilter",
          "config": {
            "messageType": "REQUEST",
            "remove": [
              "host"
            ],
            "add": {
              "host": [
                "example.com"
              ]
            }
          }
        }
      ],
      "handler": {
        "type": "Router"
      }
    }
  }
}

Notice that most of this tedious work can be done with the OpenIG Migration Tool.

I did not lie when I said it was Christmas ;)

Console Logs

I can't resist showing you an excerpt of your new logs (at least when you use a ConsoleLogSink):

TUE DEC 02 17:36:10 CET 2014 (INFO) _Router
Added route 'oauth2-resources.json' defined in file '/Users/guillaume/tmp/demo/config/routes/oauth2-resources.json'
------------------------------
TUE DEC 02 17:36:11 CET 2014 (INFO) _Router
Added route 'monitor.json' defined in file '/Users/guillaume/tmp/demo/config/routes/monitor.json'
------------------------------
TUE DEC 02 17:36:12 CET 2014 (INFO) {SamlFederationHandler}/heap/1/config/bindings/0/handler
FederationServlet init directory: /Users/guillaume/tmp/demo/SAML
------------------------------
TUE DEC 02 17:36:12 CET 2014 (WARNING) SamlSession
JWT session support has been enabled but no encryption keys have been configured. A temporary key pair will be used but this means that OpenIG will not be able to decrypt any JWT session cookies after a configuration change, a server restart, nor will it be able to decrypt JWT session cookies encrypted by another OpenIG server.
------------------------------
TUE DEC 02 17:36:12 CET 2014 (INFO) _Router
Added route 'wordpress-federation.json' defined in file '/Users/guillaume/tmp/demo/config/routes/wordpress-federation.json'
------------------------------
TUE DEC 02 17:36:19 CET 2014 (ERROR) _Router
The route defined in file '/Users/guillaume/tmp/demo/config/routes/openid-connect-carousel.json' cannot be added
------------------------------
TUE DEC 02 17:36:19 CET 2014 (ERROR) _Router
/heap/0/type: java.lang.ClassNotFoundException: ConsoleLogSinkA
[       JsonValueException] > /heap/0/type: java.lang.ClassNotFoundException: ConsoleLogSinkA
[   ClassNotFoundException] > ConsoleLogSinkA
------------------------------
TUE DEC 02 17:36:46 CET 2014 (WARNING) {HttpClient}/heap/2/config/httpClient
[/heap/2/config/httpClient/config] The 'truststore' attribute is deprecated, please use 'trustManager' instead
------------------------------

The logs are now much more readable and concise: the first line is a header giving you the log timestamp (date formatted according to your Locale), the message's log level and its source name (the name of the heap object that produced the message); then comes the message itself, until the blank-line separator is reached.

The attentive reader will have noticed that Exception messages are handled differently: each line corresponds to a summary of one exception in the chain, down to the root cause.

If the ConsoleLogSink is configured with the DEBUG level, the full exception stack trace will be printed instead of just this condensed rendering.

Performance

On the performance side, there were also a number of enhancements:

  • A brand new clustering section has been added to the documentation. After reading it, the techniques for load-balancing OpenIG and configuring it for fail-over will have no secrets for you. The doc shows how to achieve this with Tomcat and Jetty as examples.
  • OAuth 2.0 cache enablement: the OAuth2ResourceServerFilter was already capable of caching token info data from the IDP, minimizing network latency; now the OAuth2ClientFilter can also cache (and load on demand) the content of the user-info endpoint.
  • JWT session support, in other words: how to move session storage from the server side to the client side without sacrificing confidentiality. This technique for storing session data outside the server is perfect when you want scalability: no content is stored on the server, so it is stateless and can be replicated easily. JSON Web Tokens provide both an easily serializable/deserializable session format and message confidentiality (the content is encrypted; only OpenIG can read it back).

Features

OK, we couldn't resist adding at least a few new features in OpenIG 3.1!

The most fundamental one is decorator support: this feature provides an easy way to add behaviours to existing heap objects without having to change the code of these objects.

Let's take the following example: in OpenIG 3.0, when you wanted to capture the messages flowing in and out of a given Handler, you had to:

  • Create a CaptureFilter
  • Create a Chain, configure it with the CaptureFilter, and use your observed handler as the terminal handler
  • Update the source reference to your observed handler to use the Chain name

Definitely not user-friendly, and in 3.0 there was no inline heap object declaration, making this process even more tedious :'(

{
  "heap": {
    "objects": [
      {
        "name": "Chain",
        "type": "Chain",
        "config": {
          "filters": [
            "CaptureFilter"
          ],
          "handler": "Observed"
        }
      },
      {
        "name": "CaptureFilter",
        "type": "CaptureFilter",
        "config": {
          "file": "..."
        }
      },
      {
        "name": "ConsumerOfObserved",
        "type": "....",
        "config": {
          "handler": "Chain"
        }
      },
      {
        "name": "Observed",
        "type": "ClientHandler",
        "config": { }
      }
    ]
  }
}

Now, in 3.1, you just add a capture decorator to your observed object declaration and you're done!

{
  "heap": [
    {
      "name": "Observed",
      "type": "ClientHandler",
      "capture": [ "request", "response" ]
    }
  ]
}

You just added a capture decorator to the Observed object; now, on each object invocation, the decorator will be called and will do its job (here, capturing the Exchange's content). Easier!

A number of decorators are available out-of-the-box:

  • capture: as seen earlier, this permits easy route debugging and error finding
  • timer: computes the time spent inside objects, for performance tracking
  • audit: tracks messages inside the system (monitoring)

Monitoring

Monitoring how your service is behaving is now as easy as it should be :)

Thanks to the decorator framework, you can audit your routes and receive notifications when observed components (or routes) are traversed.

This is as simple as adding an audit 'tag' to your route configuration (it is used to qualify the notifications):

{
  "handler": {
    "...": "..."
  },
  "audit": [ "wordpress" ]
}

Then, you just add a route that activates the MonitoringEndpointHandler (a simple audit agent), just as @markcraig explained in his blog.

In the end, you should have a nice monitoring console:

Monitoring Console

See @ludomp's blog for details on the monitoring console.

Thanks

Let's finish with a huge thank you to everyone involved in producing this release: developers, QA engineers, technical writers, architects, product managers, ...!

Now it's up to you to confirm this success: download it and try it for yourself :)

And have a nice Christmas vacation!


Google and OAuth 2.0 Compatibility (Eliminate all other factors, and the one which remains must be the truth. (Sherlock Holmes))

Over the weekend I was pinged by my QA engineers about new failures in our OAuth 2.0 test suite. Not a good sign, just a few days before the OpenIG 3.1 release date! Anyway, I got my hands dirty and tracked the problem down to its source. Here is the story :)

This issue was fixed on the 12th of December 2014 (see the Stack Overflow answer).

Everything starts with ...

... a failure. At least I had the screenshot of a nice Exception, pretty explicit by the way:

The failure

... then you start digging

... on your side. Yeah, you must have done something wrong in your code; Google is just working (Apple ©), right?

The part of the code referred to by that stack trace deals with the Access Token Response JSON structure returned from the OAuth 2.0 token endpoint. But it had not been changed since it was first checked in (or almost; nothing recent at least), and the QA tests that were failing on Friday had been working previously.

I tried to think about every code check-in and its potential impact on the expected behaviour: I found nothing.

On the QA side, they re-ran an older OpenIG version (that was working, remember?), and now it's failing too!?

WTF?

... looking more closely at what Google sends

At that point, it's time to open your IDE and debug OpenIG; this way we'll see what OpenIG sees...

Sure enough, it's really receiving a successful access token response of the following form:

{
  "access_token": "ya29.1gD56tBWtHW3K7oZ0FINTnsqa4VYiE2YGZeQXgJ4ID79E-mZxNWoyYi7pKrs_Vyxj8FZbuxh_RGTJw",
  "token_type": "Bearer",
  "expires_in": "3600",
  "refresh_token": "1/dGjGYC7sDFaBwpdUVpkJP2mYFYTU8HAh7T6szsKGYTs"
}

expires_in is really a String, even though semantically it's a Number!

OK, Google is now sending a String instead of a Number for this attribute.

My world collapsed: Google can't be wrong about something that simple!

That reminds me of a famous Sherlock Holmes quote: Eliminate all other factors, and the one which remains must be the truth.

Sad...

... what about the spec?

As per the OAuth 2.0 Spec, a successful response looks like the following:

HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache
{
  "access_token":"2YotnFZFEjr1zCsicMWpAA",
  "token_type":"example",
  "expires_in":3600,
  "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA",
  "example_parameter":"example_value"
}

They even give a syntax for the expires_in attribute:

expires-in = 1*DIGIT

Pretty simple, right?

... let's find a solution

Fortunately, it was not very hard to fix: I just made one of our classes a little bit smarter (specifically, handling the String case).

Tested and committed, ready for the release :)

... wait a minute, what changed on Google's side?

They seem to have updated their OAuth 2.0 service implementation.

Take a look at the JSON returned by the OpenID Connect Discovery endpoint:

{
  "issuer": "accounts.google.com",
  "authorization_endpoint": "https://accounts.google.com/o/oauth2/auth",
  "token_endpoint": "https://www.googleapis.com/oauth2/v3/token",
  "userinfo_endpoint": "https://www.googleapis.com/plus/v1/people/me/openIdConnect",
  "revocation_endpoint": "https://accounts.google.com/o/oauth2/revoke",
  "jwks_uri": "https://www.googleapis.com/oauth2/v2/certs",
  "response_types_supported": [
    "code", "token", "id_token",
    "code token", "code id_token", "token id_token", "code token id_token",
    "none"
  ],
  "subject_types_supported": [ "public" ],
  "id_token_alg_values_supported": [ "RS256" ]
}

And, looking more closely at the token_endpoint value: it's using a v3 API. It's not even advertised in the Google API Explorer (yet?).

Another hint that something had changed was the last-edited timestamp at the bottom of the OpenIDConnect and OAuth2WebServer Google developers pages:

Edited on Friday the 5th

Funny facts

I played with the Google OAuth Playground this morning (BTW, it's pretty neat and usable, good stuff).

Surprisingly, when you choose the Google OAuth endpoints set, you'll see they're not using the v3 one yet: they're using https://accounts.google.com/o/oauth2/token.

When manually re-configured to use the v3 API, you clearly see the String:

Configured with v3 endpoints


Introduction to OpenIG (Part 5: Ease your development with Routes)

A route is an OpenIG configuration fragment that supports hot reloading. This is the perfect tool for developing your configuration: trying it and fixing it with very fast feedback. It is also very handy for managing configuration complexity by splitting it into smaller, more cohesive sections.

Route

Here is a route configuration file ... just to give you a taste.
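For example, a minimal route could look like this (a sketch only, reusing the StaticResponseHandler and the condition attribute discussed below):

{
  "heap": {
    "objects": [
      {
        "name": "EntryPoint",
        "type": "StaticResponseHandler",
        "config": {
          "status": 200
        }
      }
    ]
  },
  "handler": "EntryPoint",
  "condition": "${matches(exchange.request.uri, '^/wordpress/')}"
}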

Except for the condition property, it really looks like regular config.json content, right?

You'll recognise some attributes from the main configuration file:

  • heap/objects: where you describe all of your route components and how they're linked to each other
  • handler: specify the main entry point of your route
  • baseURI: where the request should be re-routed (optional)

Inheritance

An interesting feature of routes is that they inherit objects from their parent route (the config.json content being the primary route, ancestor of all others). That means that any named object defined in a parent config file (let's say config.json) can be re-used by the components defined in a child route.

Assuming that config.json defines a ClientHandler instance named Forwarder, someone could include this Chain in their own route:

{
  "name": "OutgoingChain",
  "type": "Chain",
  "config": {
    "filters": [ ],
    "handler": "Forwarder"
  }
}

This feature makes it very easy to share pieces of configuration logic in a common/shared place (the parent route or config.json).

Isolation

Each route has a dedicated namespace (some kind of private area) where the objects declared in its heap/objects array are kept. That means that a route cannot access objects defined in another route (unless the object is declared in a parent route, thanks to inheritance), effectively providing content isolation.

That's extremely useful for multiple reasons:

  • When writing a configuration, errors happen (we cannot always be right on the first try); isolation limits the number of issues that may arise from unmanaged (or unseen) dependencies.
  • When hot reloading is enabled (the default), it permits fragmented updates of your system (just update what you need, not the whole configuration)

Exchange processing

The heap/objects array contains all of your route components (declaration, configuration and bindings with other components). One of the declared Handlers has to be referenced through the handler top-level attribute:

{
  "heap": {
    "objects": [
      {
        "name": "EntryPoint",
        "type": "StaticResponseHandler",
        "config": {
          "status": 200
        }
      }
    ]
  },
  "handler": "EntryPoint"
}

Notice that the handler attribute is required; an error will be thrown if it is not set.

Conditional execution

Unlike config.json, a route is invoked conditionally, depending on the result of a condition:

{
  "condition": "${exchange.request.form['forward'] == true}"
}

The condition is expressed as an Expression and gives you access to all of the properties of the Exchange being processed. It has to return a boolean (any other return type will be considered false).

If the condition attribute is absent (or null), the route will always accept the Exchange (if it is proposed).

Here are some examples of useful conditions:

  • Only requests whose paths start with /wordpress/:
{
  "condition": "${matches(exchange.request.uri, '^/wordpress/')}"
}
  • Request paths starting with one value or another:
  {
    "condition": "${matches(exchange.request.uri.path, '^(/carousel|/openid)')}"
  }

URI Rebasing

When an Exchange is accepted into a route, its request.uri may be rebased (if there is a baseURI top-level attribute). Notice this is the very first thing that happens to the Exchange inside the route (even before it is handled by your main handler object).

Activation / Deactivation

The RouterHandler (the heap object that manages routes) observes a given directory (${openig.base}/routes/ unless configured otherwise) for route files. Each file ending in .json is considered a route, and the router tries to activate it. If everything goes well, the new route becomes available in the system; otherwise, an error is displayed in the logs.

Activating a route is as simple as dropping a .json file into the right folder.

And deactivating it is as simple as removing that file.

Notice that you can simply rename the file with a different extension to have it ignored by the system (and thus uninstalled, if it was previously active):

routes/
  wordpress.json -> wordpress.json.disabled

Scan interval

The route file scan is done at most once per scanInterval (default value: 10 seconds), and this scan is not performed by a background thread but by the thread that handles the request (i.e., don't wait for a log message saying a new route was discovered :) ).
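If the default doesn't suit you, the interval can be adjusted on the Router declaration (a sketch only; the directory and scanInterval values here are just examples, scanInterval being expressed in seconds):

{
  "name": "Router",
  "type": "Router",
  "config": {
    "directory": "${openig.base}/routes",
    "scanInterval": 20
  }
}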

Ordering

The RouterHandler imposes a lexicographical order when trying to hand off the current Exchange to one of the available routes. The file name is used as a default when there is no name top-level attribute in the route configuration.

For example, given the following routes/ folder content (no name attribute defined in any of the routes), the routes would be tried in this order: 00-main.json, next.json and zz-default.json.

routes/
  00-main.json    (1)
  zz-default.json (3)
  next.json       (2)

You can override the default system-provided name (based on the file name) by specifying a name attribute in the route:

{
  "name": "my-name"
}

Conclusion

Routes are a very handy tool for OpenIG users; there is even a global Router in the default configuration: just drop your JSON files in ${openig.base}/routes/ and you're good to go!

Next

In the next post we'll talk about OAuth 2.0.

Introduction to OpenIG (Part 4: Troubleshooting)

As transformations are dictated by the set of filters/handlers in your configuration, and are not always trivial, it quickly becomes very important to be able to capture the messages at different phases of the processing.

See the flow

First thing to understand when trying to debug a configuration is "where the hell are all my messages going?" :)

This is achievable simply by activating DEBUG traces in your LogSink heap object:
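With the default ConsoleLogSink, that boils down to something like this (a sketch of the LogSink heap object with its level raised to DEBUG):

{
  "name": "LogSink",
  "type": "ConsoleLogSink",
  "config": {
    "level": "DEBUG"
  }
}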

When the DEBUG traces are on, you should see something like:

You'll see a new line each time an Exchange comes into a Handler/Filter and each time it flows out of the element (you also get a performance measurement).

Capture the messages (requests/responses)

OpenIG provides a simple way to see an HTTP message (whether a request or a response), including both the headers and (optionally) the entity (if it is textual content): the CaptureFilter.

Here is an output example you can obtain when you install a CaptureFilter:

Install a CaptureFilter

Being a filter, it has to be installed as part of a Chain:
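For example (a sketch only; the object names and the log file location are just illustrations):

{
  "heap": {
    "objects": [
      {
        "name": "Chain",
        "type": "Chain",
        "config": {
          "filters": [ "CaptureFilter" ],
          "handler": "ClientHandler"
        }
      },
      {
        "name": "CaptureFilter",
        "type": "CaptureFilter",
        "config": {
          "captureEntity": true,
          "file": "/tmp/gateway.log"
        }
      },
      {
        "name": "ClientHandler",
        "type": "ClientHandler",
        "config": {}
      }
    ]
  },
  "handler": "Chain"
}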

It is usually best placed either at the OpenIG entry point (the first element to be invoked), which helps you see what the User-Agent sends and receives (as perceived by OpenIG), or just before a ClientHandler (which represents a sort of endpoint, usually your protected application).

Capture what you want

CaptureFilter is sufficient for simple capturing needs. When what you want to observe is not contained in the HTTP message, we have to use the OpenIG Swiss Army knife: the ScriptableFilter.

This is a special filter that allows you to execute a Groovy script when traversed by an Exchange.

Here is a sample script that prints the content of the Exchange's session:
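Something along these lines should do the trick (just a sketch, assuming the usual exchange, logger and next bindings that OpenIG exposes to Groovy scripts):

// PrintSessionFilter.groovy
// Print every session entry, then let the exchange continue down the chain
exchange.session.each { key, value ->
    logger.info("session[${key}] = ${value}")
}
next.handle(exchange)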

Copy this script into ~/.openig/scripts/groovy/PrintSessionFilter.groovy and configure your heap object:
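The heap object declaration could look like this (a sketch; the ScriptableFilter takes the script's media type and file name):

{
  "name": "PrintSessionFilter",
  "type": "ScriptableFilter",
  "config": {
    "type": "application/x-groovy",
    "file": "PrintSessionFilter.groovy"
  }
}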

Seeing the messages on the wire

Sometimes none of the previous solutions is applicable, because you want to see the on-wire message content (as opposed to the content as modelled by OpenIG).

In that case, the only solution is to start your OpenIG with a couple of system properties that activate deep traces in the HTTP client library we're using: Apache HTTP Client.

$ bin/catalina.sh -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog \
                  -Dorg.apache.commons.logging.simplelog.log.httpclient.wire=debug \
                  -Dorg.apache.commons.logging.simplelog.log.org.apache.commons.httpclient=debug \
                  run

See the HTTP Client Logging page for more information.

Next

In the next post we'll explain how routes can speed up your configuration preparation.

Introduction to OpenIG (Part 3: Concepts)

The previous posts exposed you to OpenIG: use cases and an initial configuration example. Before going further, I would like to introduce the underlying concepts that you should understand.

HTTP Exchanges, Requests and Responses

OpenIG is a specialized HTTP reverse proxy: it only deals with HTTP messages. The purpose of OpenIG is to give you complete control over the messages flowing through it (incoming requests and outgoing responses).

In order to ease handling of the whole message processing (request and response), OpenIG uses the concept of Exchange: it's a simple way to link a Request and a Response together. It also contains a Session instance that can be used to store session-scoped properties. The Exchange itself is a Map so it's a natural container for request-scoped properties.

When a request comes through OpenIG, an Exchange object is created and populated with an initial Request and a Session instance.

A Request is the OpenIG model for an incoming HTTP message; it captures both the message's entity and all of its headers. It also has dedicated accessors for the message's target URI, cookies, and form/query parameters.

A Response is the complementary model object for outgoing HTTP messages. In addition to the entity and headers accessors, Response provides a setter for the HTTP status code (2xx -> OK, 3xx -> redirect, 5xx -> server error, ...).

Exchange processing

Now that we have a model of the message's content, how can we apply transformations to it?

OpenIG offers a simple, but powerful, API to process exchanges:

  • Handlers, which are responsible for producing a Response object into the Exchange
  • Filters, which can intercept the flowing Exchange (incoming and outgoing)

Handler

What does the doc say about Handler.handle()?

Called to request the handler respond to the request.

A handler that doesn't hand-off an exchange to another handler downstream is responsible for creating the response in the exchange object.

Hmmm, an example would help, right?

OpenIG offers a rich set of Handlers; the most significant one is probably the ClientHandler. This heavily used component usually ends the Exchange's processing: it simply forwards the Request to the target URI and wraps the returned HTTP message into the Exchange's Response.

In other words, it acts as a client to the protected resource (hence the name).

Filter

Again, what does the doc say about Filter.filter()?

Filters the request and/or response of an exchange.

Initially, exchange.request contains the request to be filtered. To pass the request to the next filter or handler in the chain, the filter calls next.handle(exchange). After this call, exchange.response contains the response that can be filtered.

This method may elect not to pass the request to the next filter or handler, and instead handle the request itself. It can achieve this by merely avoiding a call to next.handle(exchange) and creating its own response object in the exchange. The filter is also at liberty to replace a response with another of its own after the call to next.handle(exchange).

This is easier to understand, I think; everyone is used to interceptors nowadays...

The traditional example of a Filter is the CaptureFilter: this filter simply prints the content of the incoming request, then calls the next handler in the chain, and finally prints the content of the outgoing response.

Filters are contained inside a special Handler called a Chain. The chain is responsible for sequentially invoking each of the filters declared in its configuration, before handing the flow to its terminal Handler.

All together: a Chain example

If you have your OpenIG up and running, please shut it down and replace its configuration file (config.json) with the following content:
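Something like this (a sketch of the intended configuration; the HelloFilter name and the header value are just illustrations):

{
  "heap": {
    "objects": [
      {
        "name": "Chain",
        "type": "Chain",
        "config": {
          "filters": [ "HelloFilter" ],
          "handler": "ClientHandler"
        }
      },
      {
        "name": "HelloFilter",
        "type": "HeaderFilter",
        "config": {
          "messageType": "RESPONSE",
          "add": {
            "X-Hello": [ "World" ]
          }
        }
      },
      {
        "name": "ClientHandler",
        "type": "ClientHandler",
        "config": {}
      }
    ]
  },
  "handler": "Chain"
}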

Compared to the previous one, this configuration enhances the response message with an additional HTTP header named X-Hello.

The message flow is depicted in the following diagram:

Exchange Flow

A Filter can intercept both the Request and the Response flows. In this case, our HeaderFilter is configured to act only on the response flow (because the outgoing message is a handy way to observe a filter in action from an outside perspective).

Wrap up

OpenIG provides a low-level HTTP model API that lets you alter HTTP messages in many ways. The message processing is handled through a kind of pipeline composed of handlers and filters.

All the processing logic you want to apply to your messages ultimately depends on the way you compose your handlers and filters together.

Next

In the next post we'll see how to debug your configurations.