Creating renovate-packagedata-diff to diff Renovate package data dumps


Over the last couple of years, I've been working with Renovate's package data dumps, as part of renovate-graph and work towards dependency-management-data.

These data dumps are produced either via renovate-graph, from Renovate's debug logs, or using the experimental reportTypes, and consist of a (large) JSON blob that contains information about the packages detected for a given repository.

These blobs are super important for the operation of dependency-management-data, and for anyone wanting to programmatically work out the dependencies they have in a given repository, for instance using Renovate maintainer Sebastian Poxhofer's work on a Backstage plugin.

At least for usage with dependency-management-data, the recommended approach is to commit these blobs to source control (un-prettified), and then, for example, periodically rebuild the dependency-management-data database from them.

One problem with them being large JSON blobs, however, is that they're pretty unwieldy to look at.

They're purposefully stored as un-prettified JSON, to avoid an ever-so-slight but unnecessary increase in storage space, but that means that a git diff is super unhelpful, at least out-of-the-box.

Additionally, there's no way to see at a glance whether a given part of the diff between files is useful without diffing the pretty-printed before/after, and even then you, the user, need to know which fields are important and which are not.
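As a rough illustration of that manual approach, here is a minimal Python sketch that pretty-prints two compact JSON documents and produces a unified diff of the result. The two input strings are made-up, cut-down examples rather than real package data, and the function names are only for illustration.

```python
import difflib
import json


def pretty_lines(raw: str) -> list[str]:
    """Re-serialise compact JSON with indentation and a stable key order."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True).splitlines()


def pretty_diff(before_raw: str, after_raw: str) -> list[str]:
    """Unified diff of the pretty-printed forms of two JSON documents."""
    return list(
        difflib.unified_diff(
            pretty_lines(before_raw),
            pretty_lines(after_raw),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


# Made-up, heavily trimmed examples of a single dependency entry.
before = '{"depName":"actions/cache","currentValue":"v4.1.1","currentVersionAgeInDays":42}'
after = '{"depName":"actions/cache","currentValue":"v4.1.2","currentVersionAgeInDays":43}'

for line in pretty_diff(before, after):
    print(line)
```

Even in this tiny case, the version change and the age change show up with equal weight, so the reader still has to know which line matters.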

You could use a fancy diff tool, e.g. icdiff, to do the diff of the pretty-printed files.

But as noted, which of these fields are actually important?

(Also note that icdiff takes a while to compute the diff here, given that this file is ~8300 lines prettified, or 196K of un-prettified text!)

For instance, for some package ecosystems, Renovate will indicate the currentVersionAgeInDays:

{
  "autoReplaceStringTemplate": "{{depName}}/restore@{{#if newDigest}}{{newDigest}}{{#if newValue}} # {{newValue}}{{/if}}{{/if}}{{#unless newDigest}}{{newValue}}{{/unless}}",
  "commitMessageTopic": "{{{depName}}} action",
  "currentDigest": "6849a6489940f00c2f30c0fb92c6274307ccb58a",
  "currentValue": "v4.1.2",
  "currentVersion": "v4.1.2",
  "currentVersionAgeInDays": 43,
  "currentVersionTimestamp": "2024-10-22T12:33:17.000Z",
  "datasource": "github-tags",
  "depName": "actions/cache",
  "depType": "action",
  "fixedVersion": "v4.1.2",
  "packageName": "actions/cache",
  "registryUrl": "https://github.com",
  "replaceString": "actions/cache/restore@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2",
  "sourceUrl": "https://github.com/actions/cache",
  "updates": [

  ],
  "versioning": "docker",
  "warnings": [

  ]
},

A change to this field isn't as important as knowing that the version actually in use has changed, or that a dependency was deleted, and it can make the diff far too noisy.
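One way to think about this is to drop the noisy fields before comparing two versions of the same dependency entry. The sketch below is only an illustration of that idea in Python, not how renovate-packagedata-diff is implemented. The set of ignored fields is an assumption (only currentVersionAgeInDays is named above), and the sample entries are trimmed, partly made-up versions of the example.

```python
# Fields treated as noise. Only currentVersionAgeInDays comes from the article;
# anything else you add here would be your own judgement call.
NOISY_FIELDS = frozenset({"currentVersionAgeInDays"})


def without_noise(dep: dict) -> dict:
    """Copy of a dependency entry with the noisy fields removed."""
    return {key: value for key, value in dep.items() if key not in NOISY_FIELDS}


def meaningfully_changed(before: dict, after: dict) -> bool:
    """True if two entries differ in anything other than the noisy fields."""
    return without_noise(before) != without_noise(after)


yesterday = {
    "depName": "actions/cache",
    "currentValue": "v4.1.2",
    "currentVersionAgeInDays": 43,
}
# Same dependency a day later: only the age has moved on.
today = {**yesterday, "currentVersionAgeInDays": 44}
# Illustrative version bump (the new value is made up for this example).
bumped = {**yesterday, "currentValue": "v4.2.0", "currentVersionAgeInDays": 0}

print(meaningfully_changed(yesterday, today))
print(meaningfully_changed(yesterday, bumped))
```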

Because I spend a surprising amount of my time looking at these diffs, and because I want to start having a means to perform diffs of package data in CI, I set about building something to provide a human-readable diff of these data dumps roughly 7 weeks ago, and have been working on it on and off since.

With this, I've now released a new CLI in the dependency-management-data ecosystem, renovate-packagedata-diff, which aims to do this for you.

Now, instead of seeing a horribly unhelpful diff, you will get a much prettier - and human-readable - diff, indicating what's been added/modified/removed.

This is a huge quality-of-life improvement, and I've found this to already be thoroughly useful.
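To make the added/modified/removed idea concrete, here is a minimal Python sketch that classifies dependencies between two flat lists of entries. It is a simplified illustration, not the tool's implementation or output format. It assumes each entry has depName and currentValue fields like the example above, keys on depName alone, and uses made-up sample data.

```python
from typing import NamedTuple, Optional


class DepDiff(NamedTuple):
    added: list[str]
    removed: list[str]
    # (depName, old currentValue, new currentValue)
    modified: list[tuple[str, Optional[str], Optional[str]]]


def diff_deps(before: list[dict], after: list[dict]) -> DepDiff:
    """Classify dependencies as added, removed or modified, keyed by depName."""
    old = {dep["depName"]: dep.get("currentValue") for dep in before}
    new = {dep["depName"]: dep.get("currentValue") for dep in after}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        (name, old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    )
    return DepDiff(added, removed, modified)


# Made-up sample data for illustration.
before = [
    {"depName": "actions/cache", "currentValue": "v4.1.1"},
    {"depName": "actions/checkout", "currentValue": "v4"},
]
after = [
    {"depName": "actions/cache", "currentValue": "v4.1.2"},
    {"depName": "actions/setup-go", "currentValue": "v5"},
]

result = diff_deps(before, after)
for name in result.added:
    print(f"+ {name}")
for name in result.removed:
    print(f"- {name}")
for name, old_value, new_value in result.modified:
    print(f"~ {name}: {old_value} -> {new_value}")
```

Note that keying on depName alone means two entries with the same name collapse into one, so a real implementation needs a richer key.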

renovate-packagedata-diff should work between different types of data exports, but it does not currently support diffing Renovate reports, as they're a little bit more complex, although they are supported by the full dependency-management-data tooling.

As written about in the docs, this is something you can wire in via your local Git config and .gitattributes, and then have it on by default (ish).

I say ish because Git requires you to add the --ext-diff flag to allow calling the external command, e.g. git log -p --ext-diff.
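For context on how Git's external diff mechanism works in general, here is a small Python sketch of a stand-in diff driver. This is not the documented setup for renovate-packagedata-diff (the docs remain the authority there); it only shows the generic calling convention, where Git invokes the configured command with path, old-file, old-hex, old-mode, new-file, new-hex and new-mode as arguments. The driver name packagedata and the script name below are hypothetical: you would map files to the driver with a .gitattributes line such as `*.json diff=packagedata`, and point `diff.packagedata.command` in your Git config at the script.

```python
import json
import sys
from typing import Any, Optional


def load_json(path: str) -> Optional[Any]:
    """Parse a JSON file; Git passes /dev/null for a missing side of the diff."""
    if path == "/dev/null":
        return None
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarise(path: str, old_file: str, new_file: str) -> str:
    """One-line summary that ignores whitespace-only differences in the JSON."""
    old, new = load_json(old_file), load_json(new_file)
    if old is None and new is not None:
        state = "added"
    elif new is None:
        state = "deleted"
    elif old == new:
        state = "unchanged"
    else:
        state = "changed"
    return f"{path}: {state}"


# Git calls: <command> path old-file old-hex old-mode new-file new-hex new-mode
# Only the first seven arguments are used; any further ones are ignored.
# Hypothetical invocation by hand:
#   python3 packagedata_driver.py out.json /tmp/old.json . . /tmp/new.json . .
if __name__ == "__main__" and len(sys.argv) >= 8:
    print(summarise(sys.argv[1], sys.argv[2], sys.argv[5]))
```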

In the future I'll finalise the changes to expose this for CI purposes (with a --json flag) and fix a bug where the same dependency (with multiple versions) in the same package_file results in incorrect diffs, but for now, I'm really happy with the result 🚀

Written by Jamie Tanna.

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.

#renovate #dependency-management-data.

This post was filed under articles.
