Crowdsourcing information about government in the early 21st century

I actually had no commitments today (why is it we all seem so much busier than we were before the pandemic?), so I’d set aside the afternoon to read and write about the new project being organised by The ODI, in conjunction with Gavin Freeguard (of Data Bites fame). It’s an effort to start collecting information about all of the organisations within the UK government that have some responsibility related to data.

You can read a better explanation in the original blog post.

One of the key things about this work is that it is being crowdsourced. Anyone can add comments to, or directly update, the in-progress document.

It turned out that I don’t actually have a lot to say about this document. If you’re interested in what I and others have to say about specifics, you can read the comments on the document itself. The TL;DR is that I think it’s an important effort, that we still need a lot more transparency (e.g. about the Cabinet Office Transparency Team) and that, as well as listing the organisations hierarchically, it would be great to additionally list them by theme (e.g. “data quality”, “artificial intelligence” or “data ethics”) in order to highlight where a lot of groups seem to have very similar, overlapping remits.

However, what spending a couple of hours reviewing the 60+ page document did do was give me time to reflect on the whole nature of how we’re still collecting and publishing such government information in 2021.

This crowdsourcing document is only open until the 10th of September, after which my assumption is that it will be refined, reviewed, approved and published as a report, to go in the large pile of 100+ page interesting-looking PDFs that many of us never find time to read (I’m still trying to get around to reading “Socioeconomic background and career progression within the Civil Service” from the Social Mobility Foundation).

This leads me to four key questions, specifically about this piece of work — but also about so many reports related to government in general.

  1. Who is the work for?
  2. What is its purpose?
  3. How is it curated?
  4. How is it paid for?

As with most things, working out the answer to the first question leads to answers to much of the rest — you can’t collate your user needs until you’ve identified your users.

In this case, it pays to speculate on who government reports are usually aimed at. My assumption is Ministers, MPs / Lords, senior civil servants, lobbyists, think tank staff, the press and, some way behind, casually interested third parties such as myself.

Once that’s established, it becomes easier to understand the purpose of the effort. I believe it’s two-fold. Firstly, to use short blocks of information to influence decision makers (e.g. “look how many government organisations say they are responsible for AI ethics”), especially those that only have time to read the executive summary. Secondly, for people who have more time, to provide enough depth to build a compelling narrative, either to change people’s minds or, for those already on board, to supply ammunition to change the minds of others.

Either way, what both purposes have in common is that they likely give the document in question a very short active shelf-life. It’s created to make a specific point and then, probably within three months, is filed away in Google Drive and on a shelf in the Parliamentary Library, unlikely to ever be looked at again.

This process leads to simple answers to questions three and four, namely that because the report has to be put together relatively quickly with a strong narrative, it will require a very small group (often one person) to do the vast majority of research, collate it, decide what is important, draft the document, organise any reviewing and then arrange to have it published. This naturally means that a specific pot of money must be set aside to pay the people involved.

The crowdsourcing we see for this paper is great — more contributors means more breadth and depth, as well as better error-checking (see Linus’s Law). But what it doesn’t do is fundamentally challenge the model of how this kind of information is gathered, stored and made available for those people who are interested in it.

In order to discuss a better way, we need to return to question one above. Is there an opportunity to open up the value captured in papers like this to more people? More generally, can the information be collected and stored in a way that makes it easily useful to civil servants and others who are directly involved in day-to-day policy, delivery and operational work?

As the many data-oversight groups referenced in the crowdsourced document will tell you, in order for data to be useful it has to fulfil a number of criteria. It must be at least the following:

  • Owned
  • Timely
  • Accurate
  • Traceable to its origin
  • Accompanied by appropriate metadata

For this kind of information to keep meeting these criteria, it must be constantly updated. For that to happen, there would need to be a relaxation in the rules about who is allowed to make updates (trust more people but track what they do). That, in turn, points to a pool of volunteers, possibly including people in existing government roles where updating such a resource becomes a minor part of their job.

In case I’ve not yet been obvious enough about where this is heading, what I’m talking about is a wiki.

There have been discussions about public-facing wikis in the civil service for at least the whole 10 years I’ve worked in the public sector. Some of these conversations have been quite high-level, including with the Knowledge, Information and Records Management team in the Cabinet Office. It’s also a frequent topic of discussion on the cross-gov Slack. Yet, it’s 2021 and perhaps the closest step in that direction, the GDS Service Manual, is still a half-way house from the old model, being locked down to a small team in that organisation who are paid to look after it.

We know the government is risk averse, and that can be a good thing, particularly in these especially trying times. However, it’s now 20 years since the foundation and incredible success of Wikipedia and the government still shows no signs of even trying out this thoroughly tried and tested model. If there are still too many valid objections, it could start as a small-scale test with a limited number of initial editors from a spread of departments and an approval process to add new ones (although I would personally enable anyone with an email address to become an editor, as I trust people to do the right thing and there’s always the ability to revert pages).

Despite the continued conversations about this and occasional Slack threads along the lines of “we should just set one up ourselves: ask for forgiveness, not permission”, I unfortunately don’t see this happening any time soon (blame the tabloid press amongst other things).

Which brings us full circle back to this report. I’m sure the ODI will produce an excellent publication and, for a short time, it may very well be influential for its targeted audience. However, quickly, both the authors and the readers will move on and, six months later, the questions about how government data teams are organised will remain, but this effort to produce a response will be past its sell-by date.

It’s one thing to crowdsource information to go into a report; it’s quite another to set up a long-running resource and build a community around it, so that everyone in the public sector, not just those that read long PDFs, has something they can find useful for their jobs over the long term.

Senior Delivery Manager at HackIT. Ex-GDSer. Co-organiser of unconferences. Opinionated when awake, often asleep.