0.4.0 (2024-04-26)

This release's major additions are:

Java as a new language,
an automatic Markdown report with an SVG chart,
and extensive automation and testing to make the evaluation benchmark more reliable.

Features

Java language adapter with “java/plain” repository #62
Scoring through metric points and ranking of models #42
Automatic categorization of models depending on their worst result #36 #39 #48
Full logs per model and repository, stored as results #25 #53
Migrate to “symflower test” instead of reimplementing test execution logic #62
Automatic installation of Symflower for RAG and general source code analytics, to avoid reinventing the wheel #50
Generate test file paths through language adapters #60
Generate import / package paths through language adapters #63
Generate test framework name through language adapters #63
Human-readable categories with descriptions #57
Summary report as Markdown file with links to results #57 #77
Summary bar chart of the overall results per category as SVG in the Markdown file #57