
I’ll come to the vampire thing in a minute.
This post smushes two things together.
First: I’m testing a prototype writing/output structure that (1) is built to be modular, iterative and as me-friendly as possible, so work cycles are more fluid but can build to larger outputs; (2) is reproducible/open, with each stage permanently stored with its own digital object identifier (DOI via Zenodo); (3) doesn’t silo itself in academia / is open to (and acknowledges) all contributors wherever they are, and however small/large the contribution (including e.g. a useful LinkedIn comment); (4) has tools built in to make sure LLM use in writing and code is clearly delineated, just as an academic would with any other source, including automatic inclusion of LLM back-and-forths.
Here’s a first go. What this looks like:
- A self-contained project folder with all the code, data and write-ups (this is very rrtools-like) and a landing page that auto-populates from the docs folder (github action here).
- A docs folder containing discrete ‘chunks’ of work. Here’s the first chunk - more on that below. When each of these is reasonably complete, I can mark it as a release with a DOI number (this 1st chunk is v0.1.1). Each release is archived, and shorter-chunk work has some provenance that can be referred back to.
- The docs folder can contain anything else for public consumption like data viz, as it does here.
- There’s also room in there to produce punchier, shorter summary docs targeted at different audiences but building on the same material e.g. slides for policymakers.
- README and FEEDBACK markdown files explaining the project scope and roadmap, and how to comment / feed back / discuss, including instructions for creating a github account, though that’s not necessary. (See here for Claude-Code drafted sources on ways to do better than the author CRediT system / build on github tools.)
- I’ve also tried to build in tools to make LLM use clearly separable from my own work - as I would with any other work that isn’t mine. See here for an attempt at draft LLM guidelines. Short version: un-cited LLM use isn’t different from plagiarism (though there are arguments to be had - see below again). The project has folders both for LLM-produced summaries/memory docs (marked as non-human-produced) and another for all LLM prompt back-and-forths used in project development (output as human-readable markdown via the export_all_convos.py script; currently only for Claude Code, and I’m working in Visual Studio Code). The first chunk output makes very clear where LLMs were used.
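To give a flavour of what that export step involves: this is a minimal sketch, not the project's actual export_all_convos.py. It assumes a simple JSONL transcript where each line has `role` and `content` keys - the real Claude Code on-disk format and file locations may well differ, so treat the field names and sample data as hypothetical.

```python
import json

def transcript_to_markdown(jsonl_lines):
    """Convert a JSONL chat transcript into human-readable markdown.

    Assumes each line is a JSON object with 'role' and 'content' keys;
    this is a sketch, not the real Claude Code transcript schema.
    """
    parts = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        role = entry.get("role", "unknown").capitalize()
        content = entry.get("content", "")
        parts.append(f"### {role}\n\n{content}\n")
    return "\n".join(parts)

# Tiny demo with made-up data:
sample = [
    '{"role": "user", "content": "Draft the README intro."}',
    '{"role": "assistant", "content": "Here is a first pass..."}',
]
markdown_out = transcript_to_markdown(sample)
print(markdown_out)
```

The point of having it as plain markdown is exactly the provenance argument above: anyone reading the repo can see the full back-and-forth without special tooling.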
In this first project - RegEconWorks - I’m aiming to gather my work on all things regional-GVA, synthesise thoughts on it, and cycle through any feedback I can pester out of people, in whatever form.
The actual output
This first chunk is asking:
What difference could adding uncertainty to regional GVA numbers make?
It’s common to acknowledge that ‘spurious accuracy’ is an issue. But lacking any decent uncertainty guesses, we can’t think through the implications very easily. So this makes a data-driven guess at what the uncertainty could be and how it changes GVA at regional level. The ONS rules out producing error rates for GVA as ‘too complex’ - which it is, within a national accounts framework. But we’re doing some “what ifs” here because I think it’s worth thinking through, with some actual numbers, how much difference it could make.
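To illustrate the kind of “what if” involved (not the chunk's actual method): perturb point estimates with an assumed relative error and see how often regional rankings flip. The regions, GVA figures and the ±5% error size below are all made up for illustration.

```python
import random

# Hypothetical GVA point estimates (in £bn) for three made-up regions;
# the figures and the 5% relative error are purely illustrative.
gva = {"Region A": 30.0, "Region B": 29.0, "Region C": 24.0}
REL_ERROR = 0.05   # assumed relative standard error
DRAWS = 10_000

random.seed(42)
top_counts = {r: 0 for r in gva}
for _ in range(DRAWS):
    # Perturb each estimate with Gaussian noise proportional to its size
    draw = {r: v * (1 + random.gauss(0, REL_ERROR)) for r, v in gva.items()}
    top = max(draw, key=draw.get)
    top_counts[top] += 1

for region, n in top_counts.items():
    print(f"{region}: ranked first in {100 * n / DRAWS:.1f}% of draws")
```

With numbers like these, the close pair swap first place in a substantial share of draws while the clearly-smaller region almost never leads - which is the point: some rankings survive the fog and some don't.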
As it says:
Uncertainty is powerful because it short-circuits many unhelpful ways of thinking about growth and industrial strategy, including spurious rank-building. Error bars direct us back towards our regions, what we concretely know about them, and how that knowledge is built into the choices and structures we make.
Right, the vampires. The chunk starts with a scene from The Lost Boys, mapped to type I and II errors, where the protagonist finds himself racing bloodsuckers through fog towards what transpires to be (spoilers) a deadly cliff edge. This hammers the point home a bit bluntly, but… is head vamp David’s ‘incredible certitude’ (Manski) like using ‘exact’ numbers to steer our economies by?
My point: introducing uncertainty doesn’t mean making things murkier. Quite the reverse - by understanding what’s fog and what’s not, we can extract signal from the mist. That doesn’t just mean ‘better decisions’ - it can change how we think and how we steer. E.g. what does it do to the kind of horse-race analysis that ‘exact’ numbers make so beguiling? (See Prof. Richard Harris’ excellent old piece on what uncertainty in university rankings would mean, for comparison.)
The chunk is written somewhere between academic and informal blogpost style. I have deliberately included “this is what I think and why”, and I’ve ended with some open questions to chew over for the next piece, which I hope will examine the following question:
“If we accept this uncertainty into our regional growth data, what are the decision-making implications?”
So…
Why modular?
Because I want to do work that:
- Is nimble enough to build up smaller chunks of work sequentially and change direction in the light of new input, while also keeping a permanent record of progress and contributions, and a DOI number so versions are never lost;
- Doesn’t get stuck in a silo, allowing me to talk to anyone across academia, policy, community, wherever;
- Doesn’t force things into the slow, awful academic paper cycle of death, but instead lets me build from smaller chunks to larger pieces - some way earlier than the ‘preprint’ stage, which still has to be a completed paper;
- Is open to input from / sharing to anywhere, and acknowledgement for anyone.
That makes it sound more planned than it was. This has evolved. Some background:
It’s been about 2.5 years since I started my secondment to the South Yorkshire Mayoral Combined Authority (SYMCA) through Y-PERN. My elevator pitch has always been: we want to strengthen the glue between the region’s universities and other anchor institutions, as well as Yorkshire and the Humber’s communities. Some of this has been easy, gone with the grain. For other things - including research/report outputs - the cogs tend to grind more. Almost every single institutional structure and incentive differs between universities and regional government, which has meant progress is often made despite how we work, not because of it, built on relationships between people who care about the same things.
At the same time, change is happening stupidly fast. We’re firmly in a ‘no-one knows where this will land’ phase, I think. Universities are in crisis as old funding models fold, with the government showing little sign of comprehending the scale of that. Throw in this new LLM world where humans have just lost their monopoly on the written word. Just in the past couple of months, systems like Claude Code (more on that below) are realising the promise/threat of LLMs to upend how knowledge works (see Naomi Alderman’s Don’t Burn Anyone at the Stake Today for an amazing long view of this).
That’s all been sloshing around in the back of my skull as I’ve mulled how to digest my work with SYMCA and Y-PERN so far. (I’ve been keeping a list of open code and outputs here, and stuck much of the work including how-tos here.)
And personally, I’ve been trying to design a me/task/environment combo that actually goes with my own brain-grain. Traditional academic papers don’t do that so much. That’s true for many people, but we continue to bash ourselves against it, even while the whole system is making ominous creaking, groaning noises.
Oh no, LLMs
Yeah, afraid so. As I’ve mentioned before, I’m aware most of us are sick to death of People’s Opinions About LLMs, and the fact that 90% of social media posts now mention them.
The last two months have been a turning point for me. Claude Code has shifted this technology from ‘fascinating but deeply flawed’ to ‘my job and probably entire sectors just changed forever’. Pretending this isn’t happening probably won’t help, though I get the temptation. So we need some rules for ourselves as we try to navigate.
(I’ve hived off a whole ramble about LLM ethics, what it might do to my sector and just how terrifyingly powerful Claude Code is; we’ll come back to that.)
So I’ve put ‘acknowledgement of LLM use’ fully into the project structure, and you can see LLM use acknowledgements listed in the first chunk here.
This piece has my first attempt at some principles I’m using (linking to some LLM-gathered sources on the subject) to make sure mine and anyone else’s work is clearly separable from robot output. As mentioned, the project folder also has a (Claude-Code-written) python tool that automatically converts the LLM back-and-forths taking place locally in the project folder into human-readable markdown, so there’s a full trace of how work was produced.
The fundamental principle seems simple enough - it should be super-clear which words and code I did and didn’t make. This is just basic academic integrity, yes?
That piece on experimental principles acknowledges there are blurry and difficult cases, and that it pushes against the LLM marketing - which is “full of suggestions that co-pilot should quietly slip into your workflow and write all your words for you, whether email, teams comments, slide prep.”
There may be work situations where that’s OK. But for this kind of project, where it’s imperative everyone gets a correct nod for their input, it seemed like a good chance to test how we can try to (a) use these tools for our benefit while (b) making it super-visible who’s producing what.
This connects back to the ‘different incentives’ point above. The reason academic integrity rules exist, the reason plagiarism (including self-plagiarism) is a thing is precisely because credit is the lifeblood of academia. Credit should of course always go to the right people, university or not. But academia is particularly vulnerable, since credit equals career more explicitly than any other sector.
In that environment, the urge to use LLMs to appear more productive must be difficult to resist for many. But the fundamental point - you shouldn’t pass off anyone else’s work as your own - hasn’t gone away. LLMs can, in fact, make us more productive - that’s now very true for coding. But it needs to happen openly and ethically. Otherwise it’s no different to plagiarism.
I’m not saying I’ve got that right here, but I am trying to build it in. Feedback on this, and anything else, welcome.
…
OK, let’s see how this experiment goes. Now to harangue some people for views on the first chunk, and discover what dreadful mistakes I’ve made.