Dependency Management Data is now a lot easier to work with when using Software Bill of Materials
On Sunday night I released v0.100.0 of dependency-management-data, a huge release in a few ways.
First of all, v0.100.0 is a big milestone in name alone; it is incidentally release number 158, and it coincided with me hitting post number 1000 on my blog.
Secondly, it introduces a much revised model for consuming and interacting with Software Bill of Materials (SBOMs).
When I first added support for SBOMs in September 2023 (as part of v0.38.0) I was primarily working with SBOMs that GitHub produced, and while I was there, added general support for a few SBOM formats.
As became clear to me through feedback from a few prospective users, including some discussions at State of Open Con 2024, the SBOM support in dependency-management-data was a little shoddy.
In particular, the shoddiness came from an assumption that we would always have a Repo Key for our dependencies, which allows tracking which source repository a dependency scan comes from.
However, one thing I soon found out with SBOMs is that they don't have that information available, for instance because the SBOM may be provided to you by a vendor, or built from a container or a binary, or because you simply can't easily work out where it came from.
Assuming that a Repo Key exists when performing a scan with renovate-graph is very reasonable, as renovate-graph is pointed at a source repository. But when you're scanning binaries, that's not the case.
This assumption led to quite a rough edge when working with SBOMs, which was slightly reduced with the dmd import bulk command, but the requirement to know which repo a scan is for up-front was still the big concern.
After starting to look at this 3 months ago I ended up deprioritising it: it was quite a hard problem, because the dependency-management-data codebase expects that Repo Key to be used.
Well, as of Sunday's release, SBOMs are much easier to work with, as the Repo Key is now optional for SBOMs, and we instead track the SBOM based on a "component name".
Previously, you would have to run:
dmd import sbom --db dmd.db sbom/requests_earthboundkid_9ab79a0c8b462518a3adac7c4ca21289eb2557f4.json --platform github --organisation earthboundkid --repo requests
However, now you can run:
# auto-detect the `component_name` column
dmd import sbom --db dmd.db sbom/requests_earthboundkid_9ab79a0c8b462518a3adac7c4ca21289eb2557f4.json
And if you manage to work out which repository the SBOM was produced from, you can add that metadata later, instead of requiring it up-front.
Additionally, this allows for a few other ways to import the SBOM data:
# or override the `component_name`
dmd import sbom --db dmd.db sbom/requests_earthboundkid_9ab79a0c8b462518a3adac7c4ca21289eb2557f4.json --component-name github.com/earthboundkid/requests
# or say that this is a vendor-provided SBOM
dmd import sbom --db dmd.db sbom/mux_gorilla_db9d1d0073d27a0a2d9a8c1bc52aa0af4374d265.json --vendor ExampleCorp --product 'Web Server'
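One practical consequence of not needing the Repo Key up-front is that a directory of SBOMs can be imported with a plain loop, with no per-file flags. The following is a minimal sketch, assuming each SBOM carries enough information for the component name to be auto-detected. The import_sboms function and the DMD variable are hypothetical helpers for illustration, not part of dmd, and the existing dmd import bulk command may well be a better fit.

```shell
#!/bin/sh
# Hypothetical helper, not part of dmd: import every *.json SBOM in a
# directory using only the flags shown earlier in this post.
# Setting DMD=echo prints the commands instead of running them (a dry run).
import_sboms() {
  db="$1"
  dir="$2"
  for sbom in "$dir"/*.json; do
    # Skip the unexpanded glob pattern when the directory has no SBOMs
    [ -e "$sbom" ] || continue
    # No --platform/--organisation/--repo needed any more
    ${DMD:-dmd} import sbom --db "$db" "$sbom"
  done
}
```

For example, import_sboms dmd.db sbom would import everything under sbom/, and any repository metadata can then be added afterwards for the SBOMs where it is known.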
Hope to hear feedback about the new approach, in particular how the new component_metadata table can be used to provide that metadata.