Title: Functional Core, Imperative Shell
Subheading: How to write very testable code using the functional core, imperative shell technique.
Alias: /post/functional-core-imperative-shell/

_Testability_ is the idea that code should be designed so that it's easy to write tests for. In [Boundaries](https://www.destroyallsoftware.com/talks/boundaries) Gary Bernhardt introduces a programming technique he calls _functional core, imperative shell_ that (among other advantages) helps to make code very easy to test.

I tried to apply the approach to some real-world code: an OAuth 2.0 plugin for [CKAN](http://ckan.org/).

The idea is that each function or method in your code is one of two types:

Functional core
:   These are functions that take values in as parameters, and give values out as return values. They use explicit parameters instead of implicit dependencies. They have few or no side effects, and each one of these functions is very well isolated from the rest of the code. You try to put all the logic and decisions of your code into these "pure functions", so there are many potential paths through one of these functions. But you try to keep things like the network, the database, rendering to the screen, disk, etc. out of these functions.

Imperative shell
:   These "shell functions" are where all the side effects and persistent state go: the network, the database, etc. They are highly integrated with the rest of your code and with other code that you're using (e.g. your web framework). You try to keep logic and decisions out of the shell functions; there should be few possible paths through a shell function (ideally only one). If possible, they should be trivial one-liner functions.

Writing code in this way has many advantages. All the logic is in the pure functions, which are very modular and easy to understand. They're also *much* easier to test, and you can get 99% coverage by just testing your pure functions and ignoring the shell.

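To make the split concrete, here is a minimal sketch in Python 3. It is unrelated to CKAN, and the function names (`word_counts`, `format_report`, `print_report`) are made up for illustration. The two core functions hold all the decisions and touch nothing outside their arguments, while the shell does the disk and screen I/O along a single path.

```python
def word_counts(text):
    """Functional core: a value in, a value out, no side effects."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def format_report(counts, limit=3):
    """Functional core: the decisions (ordering, truncation) live here."""
    # Most frequent words first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ["{}: {}".format(word, n) for word, n in ranked[:limit]]


def print_report(path):
    """Imperative shell: reads the disk and writes to the screen.

    There is only one path through this function and no logic to test.
    """
    with open(path) as f:
        text = f.read()
    print("\n".join(format_report(word_counts(text))))
```

The core functions can be tested with plain values and assertions, with no files, mocks or setup. The shell is small enough that you might choose not to unit test it at all.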
## An OAuth 2.0 Plugin for CKAN

[ckanext-oauth2waad](https://github.com/ckan/ckanext-oauth2waad) is a [CKAN](http://ckan.org/) plugin that lets users log in to a CKAN site using their Windows Azure Active Directory (WAAD) account instead of creating a new username and password for the site. The way it works is:

1. On the CKAN login page the user clicks a "Login with WAAD" link that sends them to the WAAD login server. This link's URL contains a **redirect URI** as a URL parameter that tells the WAAD server where to redirect the user's browser to after a successful login: `?redirect_uri=http://demo.ckan.org/_waad_redirect_uri`.

2. The user enters their WAAD username and password into the WAAD login page and clicks sign in. After a successful sign in the WAAD server redirects the user's browser to the redirect URI. The server appends an **authorization code** to the redirect URI as a URL param: `?code=ffhjkl434387jlkmdfsas`.

3. CKAN receives the request with the authorization code. The `oauth2waad` plugin's `login()` method is called, and handles the request in four steps:

    1. It makes a request to the WAAD server for the user name corresponding to the given authorization code, and receives the user name back from the WAAD server.
    2. It finds the user account with the given user name in CKAN's database. If no account exists, it silently creates a new account.
    3. It logs the user in to CKAN by saving the user name in CKAN's session.
    4. It redirects the user's browser to the dashboard page for the account they've just logged in to.

There's also a cross-site request forgery (CSRF) check, but we'll ignore that for this explanation.

This is how the "login experience" looks, from the user's point of view:

The `oauth2waad` plugin's `login()` method that handles actually logging the user into CKAN is the method we're going to try to test using a functional core, imperative shell approach.

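Steps 1 and 2 above both pass data around in URL query parameters. As a rough illustration (not the plugin's actual code), the sketch below builds a login link carrying a redirect URI and reads the authorization code back out of the URL that the browser is redirected to. It uses Python 3's standard `urllib.parse`. The function names and the `login.example.invalid` endpoint are made up, and a real OAuth 2.0 request carries more parameters than this simplified flow shows.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_login_url(auth_endpoint, redirect_uri):
    """Return the "Login with WAAD" link for step 1.

    urlencode() percent-encodes the redirect URI, so it will not look
    exactly like the unencoded example in the text above.
    """
    return auth_endpoint + "?" + urlencode({"redirect_uri": redirect_uri})


def extract_auth_code(url):
    """Return the `code` URL param from step 2, or None if it's missing."""
    params = parse_qs(urlsplit(url).query)
    codes = params.get("code")
    return codes[0] if codes else None


# Hypothetical endpoint, for illustration only.
login_url = build_login_url(
    "https://login.example.invalid/authorize",
    "http://demo.ckan.org/_waad_redirect_uri")

auth_code = extract_auth_code(
    "http://demo.ckan.org/_waad_redirect_uri?code=ffhjkl434387jlkmdfsas")
```

Both helpers take values and return values, so they can be checked with plain assertions.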
## A Naïve Implementation

Here's a naïve implementation of the `login()` method:

```python
import pylons
import requests

import ckan.plugins.toolkit as toolkit


class WAADRedirectController(toolkit.BaseController):

    def login(self):
        """Handle a login request from the WAAD server.

        When the WAAD server wants to log a user into our CKAN site, it
        redirects the user's browser to a CKAN URL that routes to this login()
        method. The URL params contain an auth code which we can use to get
        the user's name from the WAAD server, and then log the user in to CKAN
        by saving the name in CKAN's session.

        """
        # Look for the auth code in the URL that the WAAD server requested
        # from CKAN.
        auth_code = pylons.request.params['code']

        # The data that we're going to post to the WAAD server.
        # This includes a secret client ID from CKAN's config file
        # (which we access via the pylons.config object).
        data = {
            'client_id': pylons.config['client_id'],
            'code': auth_code,
        }

        # Use the requests library to make an HTTP POST request to the WAAD
        # server.
        response = requests.post(
            pylons.config['waad_server_url'], data=data)

        # Parse the response to get the user's name.
        name = response.json()['name']

        # Find the CKAN user account corresponding to this WAAD user account.
        try:
            user = toolkit.get_action('user_show')(data_dict={'id': name})
        except toolkit.ObjectNotFound:
            # The user doesn't exist in CKAN yet, create it by calling the
            # CKAN API.
            user = toolkit.get_action('user_create')(
                context={'ignore_auth': True},
                data_dict={'name': name})

        # Log the user in to CKAN by adding their name to the Pylons
        # session.
        pylons.session['user'] = name
        pylons.session.save()

        # Finally, show the user their dashboard page.
        toolkit.redirect_to(controller='user', action='dashboard', id=name)
```

(This is a simplified version of the method. The production code contains a lot of error handling and other details, but the above code is closely based on the real thing.)

## Testing the Naïve Implementation

***TLDR**: The naïve implementation is very difficult to test because it requires a lot of mocking, patching and simulating. The test code takes a long time to write and is complicated, tightly coupled to CKAN internals, and slow. The rest of this section details the problems with testing the naïve implementation. Skip to [A better implementation](#better) to see the results.*

The code is quite simple, straightforward and readable. But the naïve implementation is *very* difficult to test. Some of the things you'll have to do to write a test for this include:

* Resetting CKAN's test database before each test, because the CKAN API functions that the method calls write things to the database that may change the outcome of the next tests.

* Running CKAN inside a test web server that can simulate HTTP requests for us. (We can do this using a [webtest](http://webtest.pythonpaste.org/) `TestApp` object.)

    You can't just initialize a `WAADRedirectController` object and call its `login()` method. `WAADRedirectController` inherits from `ckan.plugins.toolkit.BaseController`, which means it depends on a bunch of CKAN and Pylons internals and will crash if initialized outside of a Pylons HTTP request. If you do this:

        #!python
        import ckanext.oauth2waad.plugin as plugin

        def test_login():
            controller = plugin.WAADRedirectController()
            controller.login()

    You'll get this:

        TypeError: No object (name: request) has been registered for this thread

* Inserting test values into the Pylons config. Simply initializing a `TestApp` for CKAN won't work because the `oauth2waad` plugin won't be activated, and the various config settings that the plugin needs will be missing. Each test function needs to insert the particular settings that it needs into the `pylons.config` first and then create the test app.

* Mocking the WAAD server, because when we request CKAN's login URL the `login()` method tries to make an HTTP request to the WAAD server to get the user's name.
    We need to mock the response to this HTTP request, which we can do using the [HTTPretty](http://falcao.it/HTTPretty/) library.

* Mocking the Pylons session. Our test needs to access the Pylons session so that it can check that the `login()` method did what it was supposed to do: save the user's name in the session.

    We can't simply `import pylons` and access the Pylons session from our test function. If we try to do this at the end of our test function:

        #!python
        assert pylons.session['user'] == 'fred', (
            "login() should add the user's name to the Pylons session")

    We'll get:

        TypeError: No object (name: session) has been registered for this thread

    The Pylons session is only available during an HTTP request. To get around this, we'll have to use the [mock](https://pypi.python.org/pypi/mock) library to patch `pylons.session` and replace it with a mock session object.

    If we simply do `@mock.patch('pylons.session')` we'll get mock objects leaking into CKAN's internals and a variety of confusing error messages from CKAN, Pylons and SQLAlchemy (as SQLAlchemy tries to save mock objects into database tables, for example).

    The reason for this leakage is that `pylons.session` has many different names in different parts of CKAN (which is poor design in CKAN, imho). Lots of CKAN modules do this:

        #!python
        from pylons import session

    When `ckan/lib/base.py` does that, for example, we now need to patch both `pylons.session` _and_ `ckan.lib.base.session`. We have to find each different name for `pylons.session` that our test happens to hit and patch each of them separately. This tightly couples our test code to CKAN internal details in a way that will be difficult to debug when those internals change.

It takes hours to write a test for this and find ways around all of these obstacles, and it requires deep knowledge of CKAN and Pylons, as well as experience with test libraries for mocking, patching and simulating.

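The aliasing problem with `from pylons import session` can be reproduced without CKAN or Pylons. The sketch below is only an illustration under stated assumptions: `fake_pylons` and `fake_base` are made-up stand-in modules built at runtime, and it uses Python 3's standard `unittest.mock` rather than the standalone `mock` package. It shows that patching the name in the original module leaves the copied name in the importing module untouched.

```python
import contextlib
import sys
import types
from unittest import mock

# Made-up stand-in for the pylons package (not the real thing).
fake_pylons = types.ModuleType("fake_pylons")
fake_pylons.session = {"user": "real"}
sys.modules["fake_pylons"] = fake_pylons

# Made-up stand-in for a module such as ckan/lib/base.py.
fake_base = types.ModuleType("fake_base")
# This assignment has the same effect as `from fake_pylons import session`
# inside fake_base: a second name bound to the same object.
fake_base.session = fake_pylons.session
fake_base.current_user = lambda: fake_base.session["user"]
sys.modules["fake_base"] = fake_base


def user_seen_by_base(patch_targets):
    """Patch each target with a fake session, then ask fake_base who's logged in."""
    with contextlib.ExitStack() as stack:
        for target in patch_targets:
            stack.enter_context(mock.patch(target, {"user": "mocked"}))
        return fake_base.current_user()


# Patching only the original name: fake_base still sees the real session.
only_original = user_seen_by_base(["fake_pylons.session"])

# Patching every name the code under test uses: now the fake is seen.
both_names = user_seen_by_base(["fake_pylons.session", "fake_base.session"])
```

Here `only_original` ends up as `"real"` and `both_names` as `"mocked"`, which is why the test has to list every module-level alias of the session that the request happens to touch.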
The test code that you'll finally end up with will look something like this:

```python
import json

import webtest
import httpretty
import mock
from pylons import config

import ckan.config.middleware
import ckan.model as model


@mock.patch('pylons.session')
@mock.patch('ckan.lib.helpers.session')
@mock.patch('ckan.lib.base.session')
@httpretty.activate
def test_login(mock_base_session, mock_helpers_session, mock_session):
    """login() should add the user's name to the session."""

    # Reset the database contents before each test.
    model.Session.close_all()
    model.repo.rebuild_db()

    # Mock the Pylons session.
    session_dict = {}

    def getitem(name):
        return session_dict[name]

    def get(name):
        return session_dict.get(name)

    def setitem(name, val):
        session_dict[name] = val

    mock_session.__getitem__.side_effect = getitem
    mock_session.get.side_effect = get
    mock_session.__setitem__.side_effect = setitem

    # Mock the WAAD server.
    waad_server_url = 'https://fake.auth.endpoint/tenant/token'

    def request_callback(request, url, headers):
        """Our mock WAAD server response."""
        # The params that will go in the response's JSON body.
        params = {'name': 'fred'}
        body = json.dumps(params)
        return (200, headers, body)

    httpretty.register_uri(httpretty.POST, waad_server_url,
                           body=request_callback)

    # Insert the settings we need into the Pylons config.
    config['ckan.plugins'] = 'oauth2waad'
    config['client_id'] = 'mock_client_id'
    config['waad_server_url'] = waad_server_url

    # Make a CKAN test app.
    app = ckan.config.middleware.make_app(config['global_conf'], **config)
    app = webtest.TestApp(app)

    # Make a simulated HTTP POST request to the login URL.
    app.post('/_waad_redirect_uri', {'code': 'mock_auth_code'})

    # Test that the login() method added the user name into the mock Pylons
    # session.
    assert mock_session['user'] == 'fred'
```

This single test takes about *twenty seconds* to run, mostly because of the time needed to boot the whole CKAN web app inside the test web server and initialize its database.
Even after this initialization the test isn't particularly fast, because it's exercising the full CKAN app.

This is a simplified version of the test, for a simplified version of the `login()` method. The real test would be much longer, and you'd need many such tests to test all the paths through the real `login()` method with all its error handling and everything.
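The dict-backed mock session is the one part of that test that can be tried on its own, without CKAN. The sketch below is a minimal standalone version of the same idea, assuming Python 3's standard `unittest.mock`; `make_fake_session` is a made-up helper name, not part of any library.

```python
from unittest import mock


def make_fake_session():
    """Return a MagicMock that stores items in a plain dict."""
    store = {}
    session = mock.MagicMock()
    # Side effects on magic methods are called without `self`, so each
    # one receives just the key (and the value, for __setitem__).
    session.__getitem__.side_effect = lambda name: store[name]
    session.__setitem__.side_effect = (
        lambda name, val: store.__setitem__(name, val))
    session.get.side_effect = (
        lambda name, default=None: store.get(name, default))
    return session


session = make_fake_session()

# What login() does to the session:
session['user'] = 'fred'
session.save()

# What the test then checks:
logged_in_user = session['user']
```

Because `save()` is an ordinary mock method, the test can also assert that it was called. This small piece is quick to run; the slow and fragile parts of the test are everything around it.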