Utilising Renovate's local platform to make renovate-graph more efficient
Last year I built renovate-graph, a tool to extract the dependency trees for a given repository, which uses Renovate under the hood. I've been getting tonnes of value from it as part of the wider dependency-management-data ecosystem, where it provides more actionable data for understanding how you use internal and external dependencies in your projects.
However, I've found that when running this against several larger repositories, the performance starts to suffer, largely because renovate-graph is a rather hacky wrapper around Renovate.
Whereas Renovate is expected to run against a fully cloned repository, so it can create branches with expected package changes, renovate-graph just needs to run against (generally) the latest branch of a repository.
One option I'd been investigating to improve performance was to expose the ability to tune the arguments that Renovate passes to git clone, so we could perform a shallow clone, but then I stumbled upon Renovate's local platform.
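For context, this is roughly what a shallow clone looks like when done by hand. It's a minimal sketch of the underlying git behaviour, not how Renovate or renovate-graph invoke git clone. It builds a throwaway repository (the paths, commit messages and user details are all made up for illustration) and compares a --depth 1 clone against a full one:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a throwaway repository with three commits to clone from.
workdir=$(mktemp -d)
git init --quiet "$workdir/origin"
for i in 1 2 3; do
  echo "change $i" > "$workdir/origin/file.txt"
  git -C "$workdir/origin" add file.txt
  git -C "$workdir/origin" \
    -c user.name=example -c user.email=example@example.com -c commit.gpgsign=false \
    commit --quiet -m "commit $i"
done

# --depth 1 fetches only the most recent commit. The file:// URL matters here:
# git ignores --depth when cloning from a plain local path.
git clone --quiet --depth 1 "file://$workdir/origin" "$workdir/shallow"
git clone --quiet "file://$workdir/origin" "$workdir/full"

# Count how many commits each clone can see.
shallow_commits=$(git -C "$workdir/shallow" rev-list --count HEAD)
full_commits=$(git -C "$workdir/full" rev-list --count HEAD)
echo "shallow clone: $shallow_commits commit(s), full clone: $full_commits commit(s)"
```

On a repository with a long history, skipping everything but the latest commit is where the time and disk savings would come from.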
The local platform allows us to run against a local directory (that doesn't even need to have a .git folder), which is perfect for renovate-graph, as extracting the dependencies for a given repo is a purely read-only operation.
So what are the performance gains? Note that we're using renovate-graph v0.15.1 for these comparisons.
For a somewhat unscientific comparison, we'll focus on Kibana, which is a significant size: the repository checks out at 6.7GB, and the source-only archive comes to 670MB.
We'll first run renovate-graph against GitHub, so it clones and then processes the repo, without looking up dependency updates:
time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false npx @jamietanna/renovate-graph@v0.15.1 --token $GITHUB_TOKEN elastic/kibana
# 78.37s user 29.07s system 13% cpu 13:01.17 total
Next, we perform the same process, but by pulling a source-only archive from GitHub and then processing the repo with the local platform, again without looking up dependency updates:
time gh api /repos/elastic/kibana/zipball/HEAD > kibana.zip
# 8.38s user 13.89s system 4% cpu 7:27.44 total
unzip kibana.zip
# 5.14s user 1.65s system 97% cpu 6.940 total
cd elastic-kibana-*
# HACK to avoid https://github.com/renovatebot/renovate/discussions/25202
rm renovate.json
time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false RG_LOCAL_PLATFORM=github RG_LOCAL_ORGANISATION=elastic RG_LOCAL_REPO=kibana npx @jamietanna/renovate-graph@v0.15.1 --platform local
# 17.66s user 3.08s system 126% cpu 16.433 total
To compare the total wall-clock times, in seconds:
platform=github | platform=local |
---|---|
781 | 454 |
So we can see that going via the local platform takes roughly 42% less time overall or, put the other way around, going via the GitHub platform takes about 72% longer (if I've done that maths correctly).
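As a sanity check on that arithmetic, the sketch below converts the time totals quoted above into seconds and works out the relative difference. The helper names to_seconds and compare are made up for illustration. One thing it surfaces: 454 appears to match the download and unzip steps only (447.44 + 6.94), so it also shows the figure with the 16.4s processing run added in.

```shell
#!/usr/bin/env bash

# Hypothetical helper: convert a `time` total such as "13:01.17" (MM:SS.ss)
# or "6.940" (plain seconds) into seconds, rounded to two decimal places.
to_seconds() {
  LC_ALL=C awk -v t="$1" 'BEGIN {
    n = split(t, parts, ":")
    secs = 0
    for (i = 1; i <= n; i++) secs = secs * 60 + parts[i]
    printf "%.2f\n", secs
  }'
}

# Hypothetical helper: how much less time the fast run takes, and how much
# longer the slow run takes, both as whole percentages.
compare() {
  LC_ALL=C awk -v slow="$1" -v fast="$2" 'BEGIN {
    printf "%.0f%% less time, or %.0f%% longer the other way\n", (1 - fast / slow) * 100, (slow / fast - 1) * 100
  }'
}

# Totals copied from the `time` output earlier in the post.
github_total=$(to_seconds "13:01.17")  # clone + process via the GitHub platform
download=$(to_seconds "7:27.44")       # gh api zipball download
unzip_total=$(to_seconds "6.940")      # unzip
process=$(to_seconds "16.433")         # renovate-graph with --platform local

# Sum of all three steps on the local platform route.
local_total=$(LC_ALL=C awk -v a="$download" -v b="$unzip_total" -v c="$process" \
  'BEGIN { printf "%.2f\n", a + b + c }')

echo "Using the figures from the table (781 vs 454):"
compare 781 454

echo "Summing all three local steps ($github_total vs $local_total):"
compare "$github_total" "$local_total"
```

With all three steps included, the local platform route comes out at roughly 471 seconds, which is still about 40% less time than going via the GitHub platform.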