Creating renovate-packagedata-diff to diff Renovate package data dumps
Over the last couple of years, I've been working with Renovate's package data dumps, as part of renovate-graph and work towards dependency-management-data.
These data dumps are produced either via renovate-graph, from the debug logs, or using the experimental reportTypes, and consist of a (large) JSON blob that contains information about the packages detected for a given repository.
These blobs are super important for the operation of dependency-management-data, and for anyone wanting to programmatically work out the dependencies they have in a given repository, for instance using Renovate maintainer Sebastian Poxhofer's work on a Backstage plugin.
At least for usage with dependency-management-data, the recommended approach is to commit these blobs to source control (un-prettified), and then, for example, periodically rebuild the dependency-management-data database from them.
One problem with them being large JSON blobs, however, is that they're pretty unwieldy to look at.
They're purposefully stored as un-prettified JSON, to avoid an ever-so-slight but unnecessary increase in storage space, but that means that showing a git diff is super unhelpful, at least out-of-the-box.
Additionally, there's no way to see at-a-glance whether any of the diff between files is useful without doing a diff of the pretty-printed before/after, and even then you, the user, need to know which fields are important and which are not.
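To make the "diff of the pretty-printed before/after" idea concrete, here is a minimal sketch using only the Python standard library. The function name pretty_diff and the sample blobs are made up for illustration, and this is not how renovate-packagedata-diff works internally.

```python
import difflib
import json


def pretty_diff(before_raw: str, after_raw: str) -> list[str]:
    """Pretty-print two un-prettified JSON blobs and return a unified diff."""
    # Sorting keys keeps the output stable even if key order differs.
    before = json.dumps(json.loads(before_raw), indent=2, sort_keys=True).splitlines()
    after = json.dumps(json.loads(after_raw), indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(before, after, fromfile="before", tofile="after", lineterm="")
    )


if __name__ == "__main__":
    # Illustrative blobs only: the version is unchanged, only the age differs.
    before = '{"depName":"actions/cache","currentVersion":"v4.1.2","currentVersionAgeInDays":43}'
    after = '{"depName":"actions/cache","currentVersion":"v4.1.2","currentVersionAgeInDays":44}'
    print("\n".join(pretty_diff(before, after)))
```

Even in this tiny example the diff reports a change, although nothing about the dependency in use is actually different, which is exactly the noise problem.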
You could use a fancy diff tool such as icdiff to do the diff, which results in something like:
But as noted, which of these fields are actually important?
(also note that icdiff takes a while to compute the diff here, given this file is ~8300 lines prettified, or 196K of un-prettified text!)
For instance, for some package ecosystems, Renovate will indicate the currentVersionAgeInDays:
{
  "autoReplaceStringTemplate": "{{depName}}/restore@{{#if newDigest}}{{newDigest}}{{#if newValue}} # {{newValue}}{{/if}}{{/if}}{{#unless newDigest}}{{newValue}}{{/unless}}",
  "commitMessageTopic": "{{{depName}}} action",
  "currentDigest": "6849a6489940f00c2f30c0fb92c6274307ccb58a",
  "currentValue": "v4.1.2",
  "currentVersion": "v4.1.2",
  "currentVersionAgeInDays": 43,
  "currentVersionTimestamp": "2024-10-22T12:33:17.000Z",
  "datasource": "github-tags",
  "depName": "actions/cache",
  "depType": "action",
  "fixedVersion": "v4.1.2",
  "packageName": "actions/cache",
  "registryUrl": "https://github.com",
  "replaceString": "actions/cache/restore@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2",
  "sourceUrl": "https://github.com/actions/cache",
  "updates": [],
  "versioning": "docker",
  "warnings": []
},
This isn't as important as knowing that the version actually in use has changed, or that a dependency was deleted, and can make the diff far too noisy.
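As a rough sketch of what "only report the changes that matter" could look like, the snippet below compares just a handful of fields per dependency and ignores everything else, such as currentVersionAgeInDays. It assumes the data is shaped as a mapping of manager name to a list of package files, each with a packageFile and a deps list; that layout, the choice of IMPORTANT_FIELDS, and the function names are assumptions for illustration, not the tool's actual implementation.

```python
# Assumed layout: {manager: [{"packageFile": str, "deps": [dep, ...]}, ...]}
# Which fields count as "important" is an illustrative choice.
IMPORTANT_FIELDS = ("currentValue", "currentVersion", "currentDigest")


def index_deps(package_data: dict) -> dict:
    """Key each dependency by (manager, package file, dependency name)."""
    index = {}
    for manager, package_files in package_data.items():
        for package_file in package_files:
            for dep in package_file.get("deps", []):
                key = (
                    manager,
                    package_file.get("packageFile") or "",
                    dep.get("depName") or "",
                )
                # Note: this naive key collapses a dependency that appears
                # more than once in the same package file.
                index[key] = {field: dep.get(field) for field in IMPORTANT_FIELDS}
    return index


def diff_package_data(before: dict, after: dict) -> dict:
    """Report added, removed and modified dependencies, ignoring noisy fields."""
    old, new = index_deps(before), index_deps(after)
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "modified": sorted(k for k in new if k in old and new[k] != old[k]),
    }


if __name__ == "__main__":
    workflow = ".github/workflows/ci.yml"
    before = {"github-actions": [{"packageFile": workflow, "deps": [
        {"depName": "actions/cache", "currentVersion": "v4.1.2", "currentVersionAgeInDays": 43},
    ]}]}
    after = {"github-actions": [{"packageFile": workflow, "deps": [
        {"depName": "actions/cache", "currentVersion": "v4.1.2", "currentVersionAgeInDays": 44},
    ]}]}
    # Only the age changed, so nothing is reported.
    print(diff_package_data(before, after))
```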
Because I spend a surprising amount of my time looking at these diffs, and because I want to start having a means to perform diffs of package data in CI, I set about building something to provide a human-readable diff of these data dumps, roughly 7 weeks ago (of on and off work).
With this, I've now released a new CLI in the dependency-management-data ecosystem, renovate-packagedata-diff, which aims to do this for you.
Now, instead of seeing a horribly unhelpful diff like so:
You will now get a much prettier - and human-readable - diff, indicating what's been added/modified/removed:
This is a huge quality-of-life improvement, and I've found this to already be thoroughly useful.
This should work between different types of data exports, but it does not currently support diffing Renovate reports, as they're a little bit more complex; those are supported by the full dependency-management-data tooling.
As written about in the docs, this is something you can wire in via your local Git config and .gitattributes, and then have it on by default (ish).
I say ish because Git requires you to add the --ext-diff flag to allow calling the external command, e.g. git log -p --ext-diff.
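For context on what "calling the external command" means: Git invokes an external diff command with seven arguments (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode). The sketch below is a hypothetical stand-in that only parses those arguments; the names ExtDiffArgs and parse_ext_diff_args are made up, and how renovate-packagedata-diff itself consumes them is not shown here, so the docs remain the reference for the real setup.

```python
import sys
from typing import NamedTuple


class ExtDiffArgs(NamedTuple):
    """The seven arguments Git passes to an external diff command."""
    path: str
    old_file: str
    old_hex: str
    old_mode: str
    new_file: str
    new_hex: str
    new_mode: str


def parse_ext_diff_args(argv: list[str]) -> ExtDiffArgs:
    """Parse the external-diff arguments (argv excludes the program name)."""
    if len(argv) < 7:
        raise ValueError(f"expected at least 7 arguments from Git, got {len(argv)}")
    return ExtDiffArgs(*argv[:7])


if __name__ == "__main__" and len(sys.argv) >= 8:
    args = parse_ext_diff_args(sys.argv[1:])
    # A real driver would read both temporary files and print a diff here.
    print(f"would diff {args.old_file} against {args.new_file} for {args.path}")
```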
In the future I'll finalise the changes to expose this for CI purposes (with a --json flag) and fix a bug where the same dependency (with multiple versions) in the same package_file results in incorrect diffs, but for now, I'm really happy with the result.