
Checkstyle GSoC 2015 Project Ideas

Roman Ivanov edited this page Mar 6, 2015 · 25 revisions

Project Name: Multi-thread mode for Java files processing

Skills required: intermediate Java, knowledge of Checkstyle code base.

Project type: new feature implementation.

Project goal: to improve validation performance by introducing a dedicated multi-threaded mode for Java file processing.

Description:

Validating source code in big projects can take a long time, which can push users to drop Checkstyle from on-commit builds or to stop using it in IDEs, where it can slow the IDE down significantly. Fixing validation issues only right before a release or once a week is not a good idea, as it can result in major refactoring and take significantly more time than fixing them while the code is being written. Validating the whole Guava project (1700 files) against the Google Java Style Guide takes about 3 min; validating the openjdk sources (16400 files) takes 2 hours 30 min. Improving validation performance will make Checkstyle a more attractive tool for users.

Tasks to be done during project:

  • analyze the whole set of Rules, find the Rules that already comply with multi-thread requirements, mark them with a special annotation, and run them in multi-thread mode;
  • use standard Java threads or lightweight threads (Quasar, Akka, ...) to process Java files;
  • provide a detailed report on the performance improvement, with recommendations on how to design a Rule implementation to be multi-thread compliant;
  • provide a report on how much performance can improve after a Rule is rewritten with a multi-thread compliant algorithm;
  • test extensively on a variety of open-source projects.
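
The first two tasks can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ThreadSafeRule` annotation, the `Rule` interface, and `LineLengthRule` are hypothetical names invented for the sketch and are not part of Checkstyle's API, and a plain fixed-size thread pool stands in for whichever threading library is finally chosen.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelValidationSketch {

    /** Hypothetical marker annotation: the Rule shares no mutable state between files. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ThreadSafeRule { }

    /** Hypothetical, simplified Rule contract (not Checkstyle's real API). */
    interface Rule {
        List<String> validate(String fileName, String content);
    }

    /** Toy Rule that only reads its arguments, so it is safe to run concurrently. */
    @ThreadSafeRule
    static final class LineLengthRule implements Rule {
        private final int max;

        LineLengthRule(int max) {
            this.max = max;
        }

        @Override
        public List<String> validate(String fileName, String content) {
            List<String> violations = new ArrayList<>();
            String[] lines = content.split("\n", -1);
            for (int i = 0; i < lines.length; i++) {
                if (lines[i].length() > max) {
                    violations.add(fileName + ":" + (i + 1) + ": line longer than " + max);
                }
            }
            return violations;
        }
    }

    /** Validates every file as a separate task; results follow the map's iteration order. */
    static List<String> validateAll(Map<String, String> files, List<Rule> rules, int threads) {
        // Refuse Rules that were not explicitly marked as multi-thread compliant.
        for (Rule rule : rules) {
            if (!rule.getClass().isAnnotationPresent(ThreadSafeRule.class)) {
                throw new IllegalArgumentException("Not marked thread-safe: " + rule.getClass());
            }
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (Map.Entry<String, String> file : files.entrySet()) {
                pending.add(pool.submit(() -> {
                    List<String> found = new ArrayList<>();
                    for (Rule rule : rules) {
                        found.addAll(rule.validate(file.getKey(), file.getValue()));
                    }
                    return found;
                }));
            }
            List<String> all = new ArrayList<>();
            for (Future<List<String>> result : pending) {
                all.addAll(result.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Validation was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A Rule failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Treating each file as the unit of work keeps a Rule's per-file state confined to one thread; Rules that keep state across files would need either a rewrite or a separate single-threaded pass.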

Proof of necessity: issue138; issue600; results of open survey.

====================================

Project Name: Flexible Suppression model

Skills required: intermediate Java.

Project type: new feature implementation.

Project goal: to make it easier for users to introduce Checkstyle to a project.

Description:

Starting to use Checkstyle in a big project is a big challenge:

  • code clean-up (which will be a result of Checkstyle validation) is usually outside the project development plan;
  • developers are overloaded with functional issues that cause problems for users, so functional problems are always a priority;
  • all Checkstyle fixes have to be applied gradually and without postponing releases, but the current Suppression model does not make this easy.

The only thing that prevents users from using Checkstyle is that it requires a huge and unavoidable code refactoring at the beginning.

Checkstyle needs to give users the ability to enforce Rules on newly created code lines and to suppress violations in old/legacy code until engineers are ready for clean-up. We need to invent a new Suppression model that is flexible to code changes. This will allow users to have legacy code and new code in the same file without having to update the suppression configuration on every code change. Unlike suppression based on line numbers in a file, the new model should be based on the [Abstract Syntax Tree (AST)](http://en.wikipedia.org/wiki/Abstract_syntax_tree) structure.

Within the scope of the project, it is required to create a Suppression file generator to ease the creation of suppression configuration for legacy code.
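
One possible shape for such a model, shown only as a sketch: suppressions are keyed by Rule name plus the chain of enclosing declarations instead of a line number, and a small generator derives them from the violations found in legacy code. The `Violation` and `Suppression` classes and the idea of an "AST path" made of class and method names are assumptions made for this illustration; the real model would have to be designed against Checkstyle's actual AST.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AstSuppressionSketch {

    /** Hypothetical violation: a Rule name, a line, and the enclosing declarations. */
    static final class Violation {
        final String rule;
        final int line;
        final List<String> astPath; // e.g. [Foo, legacyMethod]; assumed to be read from the AST

        Violation(String rule, int line, String... astPath) {
            this.rule = rule;
            this.line = line;
            this.astPath = Arrays.asList(astPath);
        }
    }

    /** Hypothetical suppression entry keyed by Rule name and AST path, never by line. */
    static final class Suppression {
        final String rule;
        final List<String> astPath;

        Suppression(String rule, List<String> astPath) {
            this.rule = rule;
            this.astPath = astPath;
        }

        /** Matches a violation of the same Rule located at or below this AST path. */
        boolean matches(Violation v) {
            return rule.equals(v.rule)
                    && v.astPath.size() >= astPath.size()
                    && v.astPath.subList(0, astPath.size()).equals(astPath);
        }
    }

    /** True if any suppression covers the violation. */
    static boolean isSuppressed(Violation v, List<Suppression> suppressions) {
        for (Suppression s : suppressions) {
            if (s.matches(v)) {
                return true;
            }
        }
        return false;
    }

    /** Generator: adds a suppression for each legacy violation not already covered. */
    static List<Suppression> generate(List<Violation> legacyViolations) {
        List<Suppression> result = new ArrayList<>();
        for (Violation v : legacyViolations) {
            if (!isSuppressed(v, result)) {
                result.add(new Suppression(v.rule, v.astPath));
            }
        }
        return result;
    }
}
```

Because the line number plays no part in matching, a violation inside `Foo.legacyMethod` stays suppressed when code is added above it, while the same Rule still fires in a newly added `Foo.newMethod`.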

Proof of necessity: discussion with Google Guava team; discussion with openjdk team; results of open survey.

====================================

Project Name: Regression Testing Tool and HTML Report Generator

Skills required: basic Java or Shell or Groovy or Scala, basic understanding of testing principles.

Project type: creation of testing tools.

Project goal: to enforce quality and ease the acceptance of new Rules into the project.

Description:

Regression testing tool:

Checkstyle needs a tool that will do regression testing on all existing Rules. The tool needs to ensure that fixing an issue does not introduce new problems. That will help us assure the quality of the Rules that belong to the Google and old Sun conventions. The tool should simply generate reports on specified open-source projects and compare them with previously generated reports.
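
The comparison step could look roughly like the sketch below. It assumes, purely for illustration, that each report has already been reduced to a set of `file:line: message` strings and that the projects under test are pinned to a fixed revision, so that any difference between two reports comes from a change in Checkstyle itself. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class ReportDiffSketch {

    /** Returns the entries of {@code first} that are absent from {@code second}, sorted. */
    static Set<String> onlyIn(Set<String> first, Set<String> second) {
        Set<String> result = new TreeSet<>(first);
        result.removeAll(second);
        return result;
    }

    /** Two runs agree only when they report exactly the same violations. */
    static boolean sameReport(Set<String> baseline, Set<String> current) {
        return baseline.equals(current);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for two generated reports.
        Set<String> baseline = new TreeSet<>(Arrays.asList(
                "A.java:3: Missing a Javadoc comment.",
                "B.java:7: Line is longer than 100 characters."));
        Set<String> current = new TreeSet<>(Arrays.asList(
                "A.java:3: Missing a Javadoc comment.",
                "C.java:1: Unused import."));

        System.out.println("Reports identical: " + sameReport(baseline, current));
        System.out.println("New violations: " + onlyIn(current, baseline));
        System.out.println("Vanished violations: " + onlyIn(baseline, current));
    }
}
```

Both directions of the difference matter: a new violation may be a false positive introduced by the fix, and a vanished one may be a false negative.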

Detailed HTML report generator:

Because validation Rules are simple to write, Checkstyle is becoming more and more popular. More and more generic and organization-specific Rules are appearing. There are already several projects (see links below) that are extensions to Checkstyle. So a lot of Rules are waiting to be integrated into the main project and used in the main distribution package by the whole Java community.

Introducing a new Rule is not an easy process, since it requires not only unit tests but also extensive testing on several open-source projects.

We need a tool that will help us generate an easy-to-review validation report, so that we can test new Rules on several big projects and ensure that most cases are covered and there are no crashes. The report should be easy to host on the web so that the engineers involved can review it anywhere and at any time; example of possible view.

Proof of necessity: issues on github requesting new validations; mail-list thread that describes the reason for the temporary moratorium on new Rules in Checkstyle; official sandbox project with about 40 additional Rules; validation ideas that would be good to borrow from Groovy experience; other custom Rules for Checkstyle: 1, 2, 3, 3, 4, 5, 6, ...; results of open survey.

====================================

Project Name: Practice What You Preach

Skills required: intermediate Java.

Project type: infrastructure update, code refactoring.

Project goal: to assure quality and prove that following standards is possible and beneficial.

Description: The main principle of quality assurance is to use your own tool. All Rules that Checkstyle has should be applied to its own code, proving that all the best practices the tool proposes are reasonable and possible to follow. It is also required that the build fail if a Rule is present in the Checkstyle code repository but missing from the configuration that validates the whole Checkstyle code.
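
The "fail the build on a missing Rule" requirement could be checked along the following lines. This is a sketch under assumptions: it takes the set of Rule class names as input rather than scanning the repository, it pulls module names out of the configuration with a regular expression instead of a real XML parser, and it assumes a Rule may be configured either by its class name or by that name without the `Check` suffix. The class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConfigCoverageSketch {

    /** Matches the name attribute of a module element in the configuration text. */
    private static final Pattern MODULE = Pattern.compile("<module\\s+name=\"([^\"]+)\"");

    /** Collects every module name mentioned in the configuration. */
    static Set<String> configuredModules(String configXml) {
        Set<String> names = new TreeSet<>();
        Matcher matcher = MODULE.matcher(configXml);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    /** Returns the short names of Rules that exist in code but are not configured. */
    static Set<String> missingRules(Set<String> ruleClassNames, String configXml) {
        Set<String> configured = configuredModules(configXml);
        Set<String> missing = new TreeSet<>();
        for (String className : ruleClassNames) {
            String shortName = className.endsWith("Check")
                    ? className.substring(0, className.length() - "Check".length())
                    : className;
            if (!configured.contains(shortName) && !configured.contains(className)) {
                missing.add(shortName);
            }
        }
        return missing;
    }

    /** Throws, and so fails a build step that calls it, when any Rule is unconfigured. */
    static void failBuildIfIncomplete(Set<String> ruleClassNames, String configXml) {
        Set<String> missing = missingRules(ruleClassNames, configXml);
        if (!missing.isEmpty()) {
            throw new IllegalStateException("Rules missing from configuration: " + missing);
        }
    }
}
```

A check like this could run as an ordinary unit test, so that adding a Rule without configuring it breaks the build in the same commit.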

Using a newly introduced Rule on our own code helps us understand how many problems and how much refactoring it will entail. This helps us offer users not only detection of a problem but also a solution for gradual or semi-automated code updates that make the code compliant with the new Rule.

A contributor who introduces a Rule will at the same time have to upgrade the whole Checkstyle code base to follow that Rule. The contributor also needs to persuade the whole Checkstyle team not only that the Rule is necessary but also that updating the whole Checkstyle code base is necessary. Such an approach will definitely improve the quality of Rules, as acceptance will be done by a few engineers and the first project that completely follows a new Rule will be Checkstyle itself.

Additionally, within the scope of this project we need to resolve all validation problems found by other tools: FindBugs, PMD, Sonar, DSM. The Checkstyle build should be adjusted to fail on any FindBugs, PMD, or Sonar violation. That will speed up patch acceptance, as the first part of code review will be done automatically by these tools.

Proof of necessity: example of proposed changes to a Rule that were rejected at the moment of applying the new behavior to Checkstyle code, issue on github pull request with long code-review; mail-list that shows how many code-review stages we do before accepting new code, see number of posts; bugs on SourceForge; feature requests on SourceForge; issues on github; results of open survey.

====================================

Project Name: Upgrade Java Grammar from ANTLR2 to ANTLR4

Skills required: basic Java and experience with syntax analysis.

Project type: new feature implementation.

Project goal: to update core library to the latest version in order to simplify Java grammar support.

Description: Checkstyle needs a new Java grammar based on ANTLR4. This task is very difficult, but it is critical for Checkstyle, as the ANTLR2 library has not been supported since 2006 and is far less efficient. The old version has a number of syntax analysis limitations that have already been resolved in ANTLR4. Our team is already having difficulty maintaining the current grammar, as it is too complicated due to the limited parsing abilities of ANTLR2.

New features of ANTLR4 that we need:

  • ANTLR4 supports direct left recursion, which will simplify the grammar significantly. We already have a lot of warnings about non-deterministic behaviour that cannot be resolved in ANTLR2, example.
  • ANTLR4 has a number of UI tools that help users debug a grammar and see how the parser will work: IDE plugins and the Parse Tree Inspector UI application from the ANTLR distribution package.

Proof of necessity: results of open survey.

====================================

Project Name: Sun Code Convention and Documentation Comments Style Guide

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Description: The Sun Code Convention and the Documentation Comments Style Guide were among the first guidelines on how to write Java code. The Sun Code Convention is marked as outdated (because of the date of its last update), but the best practices described there do not have an expiration date.

The Sun Code Convention is already partly covered by Checkstyle. A lot of validation Rules have been added and changed in Checkstyle since Sun's configuration was created (in 2004). During the project it is required to review both documents in more detail and show publicly that we have covered all possible guidelines, in the same way as we did for the Google Style Guide.

The project will mainly focus on automating the Documentation Comments (javadoc) guidelines, as reliable comment parsing was a major improvement in Checkstyle during GSoC 2014.

The result of this project will be an updated configuration with the maximum possible coverage of the Sun convention and the Comment style guide, coupled with a detailed report on Sun's Style page to make it look like Google's Style page.

Proof of necessity: javadoc bugs on SourceForge; javadoc feature requests on SourceForge; javadoc issues on github; results of open survey.
