How to use Dependency Management Data to discover which dependencies are participating in Hacktoberfest
As I've mentioned before, the fact that it's September means that it's almost October, and October primarily means one thing for me: Hacktoberfest!
Two years ago, the precursor to dependency-management-data was created as part of the blog post Analysing our dependency trees to determine where we should send Open Source contributions for Hacktoberfest, which has a more fleshed out history of inception if you're interested.
As this is the first full year since the project was started, following its official birthday in February, I wanted to take this opportunity to consider how I would do the same thing in 2024, given that I now have a much better understanding of how I use Open Source, thanks to dependency-management-data and the data it understands.
I'd hoped to finish this for September 1st, but I didn't end up doing so (as you can see from the publish date), and in the last few hours I noticed that the new Hacktoberfest website and branding are live, so this is a perfect time to ride the coattails of the hype and get this post out.
One of the key things with dependency-management-data is that once you've ingested your dependency data, you can then start querying it, for instance using the pre-built "reports".
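The database dmd produces is plain SQLite, so if the pre-built reports don't cover what you need, you can also poke at the data directly. As a minimal sketch, with the caveat that the table names depend on which data sources you've ingested, so list them first:

```sh
# the dmd database is a regular SQLite file, so any SQLite tooling works
sqlite3 dmd.db

# then, inside the sqlite3 shell, you can explore what was ingested:
#   .tables                              (list the tables dmd has populated)
#   .schema some_table                   (inspect a given table's columns)
#   select * from some_table limit 5;    (sample whichever table holds your dependencies)
```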
To make it possible to query for which of your dependencies' repositories are participating in Hacktoberfest, there's now a new report in dependency-management-data v0.106.0, which allows you to run a command such as:
```sh
# linebreaks added for readability
dmd report hacktoberfest --db dmd.db \
  --perform-external-lookup \
  --platform gitlab \
  --organisation tanna.dev \
  --repo ghprstats
```
This then provides you a view of which dependencies, if any, are participating in Hacktoberfest, by using the `hacktoberfest` topic on their GitHub or GitLab repos.
Notice that when calling the report, we need to explicitly specify the repo key (the platform, organisation and repo) to query a specific repository in our database.
This is because there are a few external lookups, illustrated below:

- discover the URL for the repository, via Ecosystems
- if it's a GitLab repo, call out to gitlab.com's APIs to check the repository topics
- if a `GITHUB_TOKEN` environment variable is set, call out to github.com's APIs to check the repository topics
- otherwise (if there's no `GITHUB_TOKEN`, or if it's a non-GitHub or non-GitLab repo), we use the data straight from packages.ecosyste.ms
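As a rough illustration of what those topic checks involve, here's how you could do the equivalent by hand with curl. This isn't dmd's actual code, and the specific endpoints are my assumption of the relevant ones (the ORG/REPO/OWNER placeholders are yours to fill in):

```sh
# GitLab: the Projects API includes the repository's topics
# (the project path is URL-encoded, e.g. ORG%2FREPO)
curl -s "https://gitlab.com/api/v4/projects/ORG%2FREPO" | jq '.topics'

# GitHub, when GITHUB_TOKEN is set: the repository topics API
curl -s \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OWNER/REPO/topics" | jq '.names'

# otherwise, dmd falls back to the metadata that ecosyste.ms already holds for
# the repository, so no additional forge API call is made
```

dmd handles this branching for you as part of `--perform-external-lookup`; the above is only to show where the topic data comes from.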
To avoid a significant amount of outbound traffic, I've made the choice, for now, to only support looking up one repo at a time, as a couple of datasets I've queried this with have tens of thousands of dependencies to look up, which would be quite a lot of requests 🫣
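If you do want to check a handful of repositories, one option is to loop over them yourself, pausing between lookups. A minimal sketch, assuming the repos all live under the same GitLab organisation (the second repo name is purely illustrative):

```sh
# look up a few repositories one at a time, being kind to the upstream APIs
# ("another-repo" is a placeholder: swap in your own repo keys)
for repo in ghprstats another-repo; do
  dmd report hacktoberfest --db dmd.db \
    --perform-external-lookup \
    --platform gitlab \
    --organisation tanna.dev \
    --repo "$repo"
  sleep 5
done
```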
So what does this look like?
For instance, on my `ghprstats` project, which does have some repos participating:
And for a repository that doesn't have any dependencies that are participating, like my `readme-generator` project:
This will hopefully be useful for folks who are looking to take their existing pre-built dataset of dependencies and work out where they can find dependencies that are of high value to the organisation (either because they're used in a large number of repositories across the org, or because they make up a high percentage of the dependencies in use at the org) and that would be a good opportunity to give back to the community.
What other information do you think would be useful to add here? Should I add dependency-management-data to projects participating in Hacktoberfest?