v0.4.0

@zimmski zimmski released this 26 Apr 12:42
· 123 commits to main since this release
8a38762

0.4.0 (2024-04-26)

For a deep dive into the evaluation results of this version, see: https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/

This release's major additions are:

  • Java as a new language,
  • an automatic Markdown report with an SVG chart,
  • and extensive automation and testing to make the evaluation benchmark more reliable.

Features

  • Java language adapter with “java/plain” repository #62
  • Scoring through metric points and ranking of models #42
  • Automatic categorization of models depending on their worst result #36 #39 #48
  • Full logs per model and repository, stored as results #25 #53
  • Migrate to symflower test instead of redoing test execution logic #62
  • Automatic installation of Symflower for RAG and general source code analytics, instead of reinventing the wheel #50
  • Generate test file paths through language adapters #60
  • Generate import / package paths through language adapters #63
  • Generate test framework name through language adapters #63
  • Human-readable categories with description #57
  • Summary report as Markdown file with links to results #57 #77
  • Summary bar chart for overall results of categories as SVG in Markdown file #57
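
To illustrate what generating test file paths and test framework names through language adapters could look like, here is a minimal Go sketch. The `Language` interface, the `Golang` and `Java` types, and their method names are hypothetical and not taken from the project's source. The path rules follow the common Go (`_test.go` suffix) and Maven-style Java (`src/test/java`, `Test` suffix) conventions, and the framework names are assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Language is a hypothetical adapter interface (illustrative only).
type Language interface {
	// TestFilePath maps a source file path to its conventional test file path.
	TestFilePath(sourcePath string) string
	// TestFramework names the test framework generated tests should use.
	TestFramework() string
}

// Golang follows the standard Go convention: "x.go" is tested by "x_test.go".
type Golang struct{}

func (Golang) TestFilePath(sourcePath string) string {
	return strings.TrimSuffix(sourcePath, ".go") + "_test.go"
}

// TestFramework is an assumption: Go's standard "testing" package.
func (Golang) TestFramework() string { return "testing" }

// Java follows the Maven-style layout: tests mirror sources under "src/test/java".
type Java struct{}

func (Java) TestFilePath(sourcePath string) string {
	p := strings.Replace(filepath.ToSlash(sourcePath), "src/main/java/", "src/test/java/", 1)
	return strings.TrimSuffix(p, ".java") + "Test.java"
}

// TestFramework is an assumption: JUnit 5.
func (Java) TestFramework() string { return "JUnit 5" }

func main() {
	adapters := []struct {
		language Language
		source   string
	}{
		{Golang{}, "plain.go"},
		{Java{}, "src/main/java/com/eval/Plain.java"},
	}
	for _, a := range adapters {
		fmt.Printf("%s -> %s (%s)\n", a.source, a.language.TestFilePath(a.source), a.language.TestFramework())
	}
}
```

Keeping these per-language details behind one interface means the evaluation logic itself does not need to know any language-specific naming rules.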

Bug fixes

  • More reliable parsing of code fences #70 #69
  • Panic instead of exiting the process for reliable testing and stack traces #69
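
As a rough illustration of what parsing code fences from a model response involves, the following Go sketch extracts the content of the first fenced block and falls back to the whole response when no fence is present. The `ExtractCode` function and its fallback behavior are assumptions made for this example. The project's actual parser may work differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is the Markdown code fence delimiter (three backticks).
var fence = strings.Repeat("`", 3)

// codeFence matches the first fenced block: an opening fence with an optional
// language tag, the body (captured lazily), and the closing fence.
var codeFence = regexp.MustCompile("(?s)" + fence + "[^\n]*\n(.*?)" + fence)

// ExtractCode is a hypothetical helper: it returns the body of the first
// fenced code block, or the trimmed response if there is no fence at all.
func ExtractCode(response string) string {
	if m := codeFence.FindStringSubmatch(response); m != nil {
		return strings.TrimSpace(m[1])
	}
	return strings.TrimSpace(response)
}

func main() {
	response := "Here is the test:\n" + fence + "go\npackage plain\n" + fence + "\nDone."
	fmt.Println(ExtractCode(response))
}
```

Models often wrap code in explanatory prose or omit fences entirely, so a fallback like the one sketched here avoids discarding an otherwise usable response.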