Quick ecosystem analysis #3688
---
@pawamoy Glad that pipdeptree was useful in analysing dependencies! I have not been actively involved in contributing to/maintaining pipdeptree for quite some time now. So I'd like to extend the shout-out to @gaborbernat and the other contributors, @kemzeb and @xiacunshun, who have been contributing in a major way recently. Cheers!
---
I had fun writing a quick script that fetches dependencies for all projects in the catalog and outputs simple Mermaid diagrams for a given project, taking into account its dependencies and reverse dependencies (projects that depend on this project).

```python
# deps.py
import json
import re
from collections import defaultdict
from pathlib import Path

import httpx
from packaging.requirements import Requirement
from yaml import safe_load


def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()


def get_direct_dependencies(client, package_name):
    response = client.get(f"https://pypi.org/pypi/{package_name}/json")
    if response.status_code == 200:
        data = response.json()
        return data["info"].get("requires_dist", None) or []
    print(f"Error fetching package information for {package_name}")
    return []


catalog = safe_load(Path("projects.yaml").read_text())
projects = {normalize(project["pypi_id"]) for project in catalog["projects"] if "pypi_id" in project}
projects |= {"mkdocs", "markdown", "jinja2"}

dependencies = defaultdict(list)
reverse_dependencies = defaultdict(list)

with httpx.Client() as client:
    for project in projects:
        print(f"Fetching dependencies for {project}")
        deps = get_direct_dependencies(client, project)
        for dep in deps:
            try:
                req = Requirement(dep)
            except Exception:
                continue
            name = normalize(req.name)
            if name in projects and name != project:
                dependencies[project].append(name)
                reverse_dependencies[name].append(project)

with Path("dependencies.json").open("w") as f:
    json.dump({"dependencies": dependencies, "reverse_dependencies": reverse_dependencies}, f, indent=2)
```

Install packaging, HTTPX and PyYAML in a venv, run this script once inside the directory containing `projects.yaml`, then use the second script to print a diagram for a given project:
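For reference, the `requires_dist` entries returned by the PyPI JSON API are PEP 508 requirement strings, which is why the script above parses them with `packaging.requirements.Requirement`. A minimal sketch of what that parsing yields (the requirement strings below are illustrative, not taken from any real package's metadata):

```python
from packaging.requirements import Requirement

# PEP 508 strings, shaped like the entries in data["info"]["requires_dist"].
# .name is the project name (to be normalized), .specifier the version range,
# and .marker any environment/extra condition attached to the requirement.
for spec in ['jinja2>=2.11.1', 'mkdocs>=1.4', 'babel>=2.9.0; extra == "i18n"']:
    req = Requirement(spec)
    print(req.name, str(req.specifier), req.marker)
```

Strings that `Requirement` can't parse (some older packages publish malformed metadata) raise an error, which is what the bare `except Exception: continue` in the script is for.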
```python
# graph.py
import json
import sys

with open("dependencies.json") as f:
    dependencies = json.load(f)

project = sys.argv[1]

print("flowchart LR")
for dep in dependencies["dependencies"].get(project, ()):
    print(f"    {project} --> {dep}")
for rev_dep in dependencies["reverse_dependencies"].get(project, ()):
    print(f"    {rev_dep} --> {project}")
```

Note that some projects use optional dependencies to store their development dependencies. These will appear in the graph, as there's no way to know they are development dependencies.
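A partial mitigation, assuming well-formed metadata: requirements that belong to an extra carry an `extra == "..."` marker in `requires_dist`, so purely optional requirements could be skipped when building the graph. This still can't distinguish a genuine `docs`/`test` extra from one that ships real features, hence the caveat above. A hedged sketch (the requirement strings are illustrative):

```python
from packaging.requirements import Requirement

def is_extra_dependency(spec: str) -> bool:
    """Return True if this PEP 508 string is gated behind an extra."""
    marker = Requirement(spec).marker
    return marker is not None and "extra ==" in str(marker)

print(is_extra_dependency('pytest>=7; extra == "test"'))  # gated behind an extra
print(is_extra_dependency("jinja2>=3.1"))                 # regular dependency
```

Dropping entries for which `is_extra_dependency` returns True in the deps.py loop would keep only mandatory runtime dependencies in the graph.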
---
I've taken all PyPI ids from the catalog, installed them in a venv, and used

```bash
pipdeptree ... --mermaid
```

to create a big Mermaid diagram of the ecosystem. It uses a left-right layout, but a more organic layout could yield a better output; feel free to use another tool like Graphviz to experiment 🙂

Shout-out to @gaborbernat and @naiquevin, maintainers of `pipdeptree` 😉

requirements.txt
To obtain the following list:
You'll notice some packages commented out: they were causing resolution conflicts, couldn't be compiled, or depended on packages that couldn't be compiled. I was pleasantly surprised to see that only 7 packages couldn't be installed, and that everything else is "compatible" (as in how they specify their dependencies).
I recommend using uv to install this rather long list of packages:
```bash
cd /tmp
uv venv --seed
uv pip install -r requirements.txt
uv pip install pipdeptree
```
Additional notes:
While cleaning up the output of pipdeptree (to remove packages that aren't relevant to MkDocs and its ecosystem), I noticed a lot of packages depending on BeautifulSoup. This could be valuable data if we want to start a "performance" campaign, where we would reach out to plugin authors that use BeautifulSoup to parse HTML and show them how to use Markdown extensions instead. There are possibly other insights within `pipdeptree`'s output 🙂