Thursday, May 20, 2010

Drupal Node Access

Most website permissions work like this: you've got the anonymous public, who can see everything you've published, and a webmaster or two who can see and edit everything.

But once you add the complexity of users publishing their own content, or adding an 'intranet' where only some people are able to see private content, it gets tricky. Unlike Wordpress and Joomla that are very focussed on the simple website permissions model, Drupal was originally built as a platform for collaboration, and it's got some great tools for addressing all kinds of access use cases. In particular, sometime around when I started working with Drupal, the node access system of access control was born.

Unfortunately, the problem of access control is inherently challenging due to its complexity, and every time I try to implement it, I have to reread about the Drupal node access model to remember how it works. I've just had to do this again while implementing a customized Open Atrium installation for the Centre for Social Innovation, and so this posting is my attempt at writing it down to help me remember it for next time, and to share it with others.

When you step beyond the basic web publishing model, then the key question is: who can do what (view, edit, etc.) with what page? The obvious model is then to define those access permissions for each page, but after trying that once, you'll realise that you probably don't want to do that because:

a. it takes too long to set up (n nodes x m users x p permissions = n x m x p checkboxes to add).
b. it's impossibly cumbersome to maintain.

Or, put another way, we usually have a different mental model that includes some intermediate abstractions that helps us answer the question "who can do what". Actually, I'm oversimplifying already - I've been describing the problem of access to "node" pages - but what about those listing pages? You'll also want to restrict what's listed, depending on who's looking, right? Let's forget that problem for a second though.

As an example, in Drupal, users are grouped according to 'role', and nodes are grouped by 'type'. In a common scenario, you might want to restrict nodes of type "secret" to only be visible by users with the role "spy". In fact, that scenario is nicely covered by some standard modules, so you can implement that one without thinking about it too much. Another important example is in the case of "Organic Groups", where access for group nodes depends on a users' membership in and administrative permissions for the group.

Those two examples are not only the most common implementations of node access, they're also an excellent demonstration of how to understand and implement Drupal's node access system. The key to simplifying node access in Drupal is the abstraction of groups of related users and nodes. This is done with the concept of a 'realm' which is best thought of as the context of groups of users accessing a particular group of nodes. If you were just thinking of the first example, then this seems like excessive abstraction, but if you think of the Organic Groups example, you'll see why we need this, where the "group of users" abstraction is contextually dependent on the node you're looking at.

So: here's a recipe for implementing Drupal's node access system.

1. write down in english all the access control restrictions that your system needs. Give each group of users and nodes sensible names.

2. reread the example implementation in the api.drupal.org (search for node_grants and the node_access_records hooks).

3. define the node_access_records function per node: for a given node, what are the relevant 'grants', i.e. permissions for each user group, where 'realm' corresponds to the user group within the context of that node.

4. define the node_grants function, which then converts the abstraction of the realm, operation and user into the actual yes or no.

So the machine actual flow will go like this:

1. find the corresponding row(s) in the node_access table for the current node.

2. for each row, run the corresponding node_grants function with the current user and realm to determine that user's permission for that node.

Be sure to test the outcome, particularly if you're protecting important content!

And finally - let's go back to the question of listing pages. If you've been following along so far, you may wonder why we have to use all these abstractions, and in fact, you wouldn't have to. You could (and can) just intervene with the nodeapi hook and hide nodes that the current user shouldn't have access to.

But it turns out that by using the 'realm' concept, and breaking the answer to the question into these two hooks, we can very efficiently generate lists of nodes per user that respect the node access.

If it sort of makes sense now, but you think someone has already solved your particular use case, then check out this overview of node access modules.