Skip to main content

CiviCRM's invoice_id field and why you should love the hash

I've been banging my head against a distracted cabal of developers who seem to think that a particular CiviCRM core design, which I'm invested in via my contributed code, is bad, and that it's okay to break it.

This post is my attempt to explain why it was a good idea in the first place.

The design in question is the use of a hash function to populate a field called 'invoice_id' in CiviCRM's contribution table. The complaint was that this string is illegible to humans, and not necessary. So a few years ago some code was added to core, that ignores the current value of invoice_id and will overwrite it, when a human-readable invoice is generated.

The complaint about human-readability of course is valid, and the label on the field is misleading, but the solution is terrible for several reasons I've already written about.

In this post, I'd like to explain why the use of the hash value in the invoice_id field is actually a brilliant idea and should be embraced. And sure, let's give it a different label.

The key issue is reconciliation with payment processors. Paypal was the first payment processor to be implemented, and still relies on that invoice_id as far as I know.

Reconciliation between CiviCRM and any payment processors can (and usually should) be done in two ways - you'll want to ensure that payments in CiviCRM have matching ones in the payment processor's records, and vice-versa. There are also cases where you really need reconciliation of some kind - e.g. when visitor has paid by Paypal comes back to CiviCRM, it needs to confirm the contribution, and also after an ACH/EFT payment request, there needs to be a confirmation in a few days when the payment has actually been attempted. In fact: instant, done and forever payments are really just a dangerous illusion from the use of credit cards. If your mental model of payment processing is exclusively credit card based, you're going to mess up eventually.

Now, most payments match up just fine, and for an on-site processor doing credit cards, reconciliation is often ignored, but there are a number of ways/times when it is important (assuming the data in your CiviCRM is important).

Case 1. A payment processes fine, but is later reversed (manually, or by the donor, for example) in the payment processor interface.

Case 2. A payment completes in the payment processor, but that information doesn't get communicated back to CiviCRM. This could happen for both externally hosted payment pages like PayPal, but equally an on-site payment processor that makes a request for a payment that goes through, but fails to capture the result (e.g. due networking or server issues).

Case 3. A payment is made manually through the payment processor.

CiviCRM has two fields to help us do reconciliation:

a. the invoice string that it sends to the payment processor (or at least, it does by default, any individual processor plugin may choose not to) and saves into the invoice_id field.
b. the transaction string that it gets back. That gets saved in both the contribution table and the transaction table (because now you can have more than one payment per contribution).

The two contribution table fields in CiviCRM have a hard-coded requirement to be unique (or empty), which means that when used properly, and with enough tools provided by the payment processor, we can do reconciliation, both manual and automated.

The important thing is that we need BOTH these fields if we are to cover all three of those cases - we need to identify matching entries, as well as unique unmatched entries in CiviCRM and unique unmatched entries in the processor.

And yes, if there is more than one payment against a contribution, we don't have uniqueness of the invoice number at the payment processor end of things for each payment, but that actually doesn't break anything.

Okay, having established that we need both fields, now the question is just - why do we need such an ugly string to send to the payment processor as an invoice id?

For that, there are actually two good answers:

1. When a payment via a payment processor is attempted in CiviCRM, typically no contribution record has been created. So there is no nice integer id that we can use to generate a human-friendly invoice id. We could do some gymnastics and add some extra code so that we were generating incremental ids each time, but that's not an easy problem to solve, and the numbers we generated would not be matched up with the contribution id numbers.

2. Global uniqueness is a good thing. The hash method for generating unique strings is also used for example in git. If we were to use some kind of incremental id and a different system (e.g. Drupal commerce?) was connecting to the same payment processor, we could have overlaps of invoice numbers, making reliable reconciliation impossible.

Okay, so enough already about the invoice_id field?

Addendum, April 28:

1. I discovered that the invoice_id field dates from CiviCRM version 1.3, so about the end of 2005.

2. My answer about not having a nice integer id available is incomplete - since 4.7, Eileen has been fixing stuff so that contributions get created as pending contributions before any payment attempts are made, and that almost allows us to change the default way that invoice_ids are generated, if we were to decide that makes sense (e.g. it might make sense for pay later contributions). But actually, there's still at least one code pathway where payments are attempted without contribution ids, so answer 1. is still a valid answer/reason.

Popular posts from this blog

The Tyee: Bricolage and Drupal Integration

The Tyee is a site I've been involved with since 2006 when I wrote the first, 4.7 version of a Drupal module to integrate Drupal content into a static site that was being generated from bricolage. About a year ago, I met with Dawn Buie and Phillip Smith and we mapped out a number of ways to improve the Drupal integration on the site, including upgrading the Drupal to version 5 from 4.7. Various parts of that grand plan have been slowly incorporated into the site, but as of next week, there'll be a big leap forward that coincides with a new design [implemented in Bricolage by David Wheeler who wrote and maintains Bricolage] as well as a new Drupal release of the Bricolage integration module . Plans Application integration is tricky, and my first time round had quite a few issues. Here's a list of the improvements in the latest version: File space separation. Before, Drupal was installed in the apache document root, which is where bricolage was publishing it's co

Orchestrating Drupal + CiviCRM containers into a working site: describing the challenge

In my previous posts, I've provided my rationale for making use of Docker and the microservices model for a boutique-sized Drupal + CiviCRM hosting service. I've also described how to build and maintain images that could be used for the web server (micro) service part of such a service. The other essential microservice for a Drupal + CiviCRM website is a database, and fortunately, that's reasonably standard. Here's a project that minimally tweaks the canonical Mariadb container by adding some small configuration bits: That leaves us now with the problem of "orchestration", i.e. how would you launch a collection of such containers that would serve a bunch of Drupal + CiviCRM sites. More interestingly, can we serve them in the real world, over time, in a way that is sustainable? i.e. handle code updates, OS updates, backups, monitoring, etc? Not to mention the various crons that need to run, and how about things

Building and maintaining Drupal + CiviCRM application containers

In my previous two posts, I provided some background into why I decided on using containers for a boutique Drupal + CiviCRM hosting platform, and why Docker and its micro-services approach is a good choice for building and maintaining containers. Although I promised to talk about orchestration, that was getting ahead of the story - first I'm going to look at the challenge of keeping your application containers up-to-date with OS and application-level updates. There's a fair amount of work in that, but the tooling is mature and there is lots of good documentation. A great place to start is to visit the official Drupal docker hub page . From there, you can pull a working Drupal code container, and it gets re-built frequently with all the OS and Drupal-code updates, so you just refresh your containers whenever you want (i.e. whenever a security release comes out, or more often to stay up-to-date). A nice thing about that project is that it demonstrates a technique for mainta