Sharing code to strengthen the open data ecosystem

A principle of Open Data Services is that we routinely reuse code across different projects and publish our work on GitHub. We think this makes sense: it lets us spend more time improving and tailoring our systems for a specific project, rather than reinventing the wheel.

Beyond our co-operative, we think sharing code strengthens the open data ecosystem overall. It avoids duplicating work, makes bugs easier to find and fix, and builds trust in tools.

But while GitHub repositories and open source licenses provide the means, we know that passively publishing open source code is not enough. Over the last five years, our co-operative has learned that building networks and collaborating are crucial if shared code is to make an impact.

We’ve recently completed some work to standardise the code used to test data for the Open Contracting Data Standard (OCDS) and the Beneficial Ownership Data Standard (BODS). Because we publish our work openly and work closely with other organisations, we knew the work was technically possible, so we began to build something more generically applicable.

Both OCDS and BODS are described in a similar way, using JSON Schemas and CSV codelists. To reuse code written for OCDS with BODS, we had to separate out the parts of the code that were specific to the way OCDS works.
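
To illustrate the idea of separating the generic checks from the parts that belong to one standard, here is a minimal sketch in Python. It is not the code from our library: the names `StandardConfig`, `load_codes` and `check_record`, the toy schema and the `Code` column in the codelist are all assumptions made up for this example, and the check covers only the `required` keyword rather than full JSON Schema validation.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class StandardConfig:
    """Everything specific to one data standard (hypothetical structure)."""
    name: str
    schema: dict      # parsed JSON Schema describing one record
    codelists: dict   # field name -> CSV text, assumed to have a "Code" column


def load_codes(csv_text):
    """Return the set of values in the 'Code' column of a CSV codelist."""
    return {row["Code"] for row in csv.DictReader(io.StringIO(csv_text))}


def check_record(record, config):
    """Generic checks: nothing here knows about any particular standard."""
    errors = []
    # Only the 'required' keyword is checked; this is not a full validator.
    for field in config.schema.get("required", []):
        if field not in record:
            errors.append(f"missing required field: {field}")
    # Any field that has a codelist must use one of the listed codes.
    for field, csv_text in config.codelists.items():
        if field in record and record[field] not in load_codes(csv_text):
            errors.append(f"{field}: {record[field]!r} is not in the codelist")
    return errors


# A toy configuration, invented for illustration; it is neither OCDS nor BODS.
EXAMPLE = StandardConfig(
    name="example-standard",
    schema={"type": "object", "required": ["id", "status"]},
    codelists={"status": "Code,Title\nactive,Active\nclosed,Closed\n"},
)

errors = check_record({"id": "1", "status": "pending"}, EXAMPLE)
```

In this sketch, supporting another standard would mean writing a new `StandardConfig` rather than changing `check_record`, which is the kind of separation described above.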

As part of the project, we wanted to make it easier for other people to reuse this code on other data standards that are described in a technically similar way. With the help of James McKinney from Open Contracting Partnership, we’ve moved the generic code into its own repository, where it can be developed independently.

You can read the documentation and find out how to reuse the library. Our aim is to create code that will work as a testing tool on any data standard defined in a similar way.