Checkstyle GSoC 2024 Project Ideas

Roman Ivanov edited this page Mar 10, 2024 · 36 revisions

Project Name: Java 21 Language Features Support

Skills required: Java, basic understanding of testing principles, basic understanding of static analysis

Project goal: Support static analysis of Java 21 language features

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Nick Mancuso, Roman Ivanov, Vyom Yadav, Andrei Paikin

Description:

Developers are enthusiastic about leveraging Java's latest language features, which offer more powerful, declarative, and expressive code; these features include unnamed variables, record patterns, and string templates. However, Checkstyle currently lacks robust support for these features. This project aims to bridge this gap by updating existing checks and potentially introducing new ones to ensure thorough coverage of Java 21 syntax and conventions proposed by the JEPs associated with these language advancements. The objective is to deliver comprehensive support for the new language features through revising check modules, exhaustive testing, and detailed documentation updates. This effort not only aligns Checkstyle with cutting-edge best practices in the Java community but also contributes to the project's ongoing evolution.
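As a concrete reference point for the kind of syntax that checks will need to handle, the sketch below uses record patterns and pattern matching for switch, both finalized in Java 21 (JEP 440 and JEP 441). The class and type names are made up for illustration. Unnamed variables and string templates are left out because they are preview features in Java 21 and require `--enable-preview`.

```java
class Java21Sample {
    // A sealed hierarchy of records, so the switch below can be exhaustive.
    sealed interface Shape permits Circle, Rect {}
    record Circle(double radius) implements Shape {}
    record Rect(double width, double height) implements Shape {}

    // Each case deconstructs the record directly into its components.
    static double area(Shape shape) {
        return switch (shape) {
            case Circle(double radius) -> Math.PI * radius * radius;
            case Rect(double width, double height) -> width * height;
        };
    }
}
```

Constructs like `case Rect(double width, double height)` introduce variable declarations in positions that naming, whitespace, and unused-variable checks may not yet expect.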

Deliverables:

  • Analysis of new language features and possible Java parser update to support new features
  • Analysis of possible static analysis coverage (new Checks) for new language features
  • Updates in existing Checks to ensure no false positives and negatives for new language features
  • Documentation improvements

QnA: https://discord.com/channels/845645228467159061/1214568284955877386 (invite)


Project Name: Auto-fix Module

Skills required: intermediate Java

Project type: new feature implementation.

Project goal: implement new module, test it on real projects

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Baratali Izmailov, Vyom Yadav

Description: Checkstyle is known as a tool that raises numerous minor issues. There are so many of them, and each is so minor, that it is hard to find the time and an engineer to fix them. Most of the issues are easy to fix, but navigating to the relevant part of the code and making the fix takes time that engineers could spend on something more valuable. Auto-fix functionality could significantly simplify the introduction of Checkstyle to a project, as it would do most of the tedious work automatically.

The majority of Checkstyle violations target code formatting. IDE formatting settings are often out of sync with the Checkstyle configuration. An IDE can fix the code itself as part of its auto-formatting, and Checkstyle should be able to do the same. Each Check that targets code formatting should have built-in “Fix” functionality that converts code with a violation into compliant code without any user interaction. Such functionality is in huge demand by users.

Within the scope of this project, it is required to review all existing auto-fix functionality in plugins and tools, to learn the challenges they face and to compile the full list of requirements for such a task. Auto-fix for formatting Checks should be implemented as a special Module that takes all reported violations and fixes those that support auto-fix. If the resulting functionality proves easy to maintain and can be reused by Checkstyle plugins, API changes can be proposed for the core library so that any plugin can reuse it.

More details at https://github.com/checkstyle/checkstyle/issues/7427

Links to similar tools: https://docs.openrewrite.org/tutorials/automatically-fix-checkstyle-violations, https://github.com/solven-eu/cleanthat

AI auto-fix for Checkstyle: https://link.springer.com/article/10.1007/s10664-021-10107-0

Auto-fix in Eclipse: https://github.com/checkstyle/eclipse-cs/pull/566/files#diff-13e277cb135ea2a474dad0b4ac46b5cb020f9c03a2eb6676b15de010f8aec369R549

Deliverables:

  • selection of an existing library for code modification, or creation of our own implementation
  • definition of an API for triggering code changes
  • selection of Checks that produce auto-fixable violations
  • implementation of auto-fix for the selected Checks
  • a model for avoiding conflicts between auto-fixes
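One possible conflict model, shown here only as a minimal sketch, is to treat each fix as a text replacement over a character range, accept fixes in source order, skip any fix that overlaps an already accepted one, and leave the skipped fixes for a later pass. The `Fix` class and `FixApplier` name are hypothetical and are not part of Checkstyle's API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FixApplier {
    /** Hypothetical fix: replace source[start, end) with the replacement text. */
    static final class Fix {
        final int start;
        final int end;
        final String replacement;

        Fix(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /** Applies all fixes that do not overlap an earlier accepted fix. */
    static String apply(String source, List<Fix> fixes) {
        List<Fix> sorted = new ArrayList<>(fixes);
        sorted.sort(Comparator.comparingInt((Fix fix) -> fix.start));

        List<Fix> accepted = new ArrayList<>();
        int lastEnd = 0;
        for (Fix fix : sorted) {
            if (fix.start >= lastEnd) {
                accepted.add(fix);
                lastEnd = fix.end;
            }
        }

        // Apply from the end of the file so earlier offsets stay valid.
        StringBuilder result = new StringBuilder(source);
        for (int i = accepted.size() - 1; i >= 0; i--) {
            Fix fix = accepted.get(i);
            result.replace(fix.start, fix.end, fix.replacement);
        }
        return result.toString();
    }
}
```

Skipped fixes are not lost under this model: re-running the checks on the modified source would report them again with fresh offsets.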

QnA: https://discord.com/channels/845645228467159061/1214569225247793282 (invite)


Project Name: Optimization of distance between methods in single Java class

Skills required: basic Java, good analytical abilities, good background in mathematics.

Project type: new feature implementation.

Project goal: to make quality practices automated and publicly available.

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Baratali Izmailov, Ruslan Diachenko

Description:

This task is an ambitious attempt to improve code readability by minimizing the jumps and scrolls a reader makes in a source file to see a method's implementation after encountering its first usage.

It is required to analyse a lot of code and find a model that minimizes the distance between a method's first usage and its declaration in the same file, while respecting user preferences to keep overloaded and overridden methods grouped together. Other preferences may appear during investigation of open-source projects.

The first step has already been done by our team: we created a web service, methods-distance, that calculates distances between methods and builds a DSM matrix to ease analysis. We already use it in our project.

As a second step, it is required to take the matrix of distances between methods and optimize it with an empirical algorithm, letting the user define the expected model of a class through arguments. This will allow the algorithm to be used as a Check that enforces code structure automatically at build time.
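To make the quantity being optimized concrete, here is a minimal sketch of one possible cost function. It assumes distance is measured in declaration positions rather than lines, and it ignores grouping preferences for overloaded and overridden methods. The class and method names are hypothetical and unrelated to the methods-distance service.

```java
import java.util.List;
import java.util.Map;

class MethodDistance {
    /**
     * Sums, over every caller/callee pair, how many declaration positions
     * separate the callee from its caller in the given declaration order.
     */
    static int totalDistance(List<String> declarationOrder,
                             Map<String, List<String>> calls) {
        int total = 0;
        for (Map.Entry<String, List<String>> entry : calls.entrySet()) {
            int callerIndex = declarationOrder.indexOf(entry.getKey());
            for (String callee : entry.getValue()) {
                int calleeIndex = declarationOrder.indexOf(callee);
                total += Math.abs(calleeIndex - callerIndex);
            }
        }
        return total;
    }
}
```

An optimizer would compare candidate orderings by this cost: with `run` calling `parse` and `parse` calling `tokenize`, the order `run, parse, tokenize, toString` scores 2, while `run, toString, tokenize, parse` scores 4.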

Proof of necessity: we have a number of PRs where contributors placed new methods at arbitrary positions in a class, although the better place is close to the first usage. Example #1, Example #2, Example #3, ....

Deliverables:

  • new Checkstyle Check with the optimization algorithm, to share the algorithm with the whole Java community
  • analytical report that justifies the default values selected for the Check parameters
  • article with all details of the analysis and the algorithm

QnA: https://discord.com/channels/845645228467159061/1214569693336182864 (invite)


Project Name: Reconcile formatters of Eclipse, NetBeans and IntelliJ IDEA IDEs by Checkstyle config.

Skills required: basic Java.

Project type: new feature implementation, analysis of existing IDE features.

Project goal: to make well-known quality practices publicly available.

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Andrei Paikin

Description:

Use of different IDEs within the same team is already a serious problem, as each IDE formats code based on its own rules and configuration. Unwanted formatting changes creep into the code and complicate the code-review process. The problem becomes more acute when a project uses a static analysis tool like Checkstyle, which has a wide range of code formatting Checks.

It is required to make it possible to use the same Checkstyle config in IDEs without conflicts with the IDEs' internal formatters. This will let team members choose their IDE independently while keeping the same format and code style throughout the team.

The main focus of this project is the analysis of the formatting abilities of IDEs (indentation, import order, declaration order, separator/operator wrap, ...). Existing Checkstyle Rules should be updated to work in a similar, non-conflicting way.

Deliverables:

  • create configuration for IDEs for Checkstyle project to let Checkstyle team use it and auto-format code to conform with checkstyle_check.xml file that is used by Continuous Integration.
  • create Checkstyle config that follows default Eclipse formatting + inspection rules
  • create Checkstyle config that follows default IntelliJ IDEA formatting + inspection rules
  • create Checkstyle config that follows default NetBeans formatting + inspection rules
  • Deep refactoring of Indentation Check to fix its numerous problems.

Proof of necessity: mail-list post #1, mail-list post #2, mail-list post #3, discussion #1

QnA: https://discord.com/channels/845645228467159061/1214571037451100180 (invite)


Project Name: OpenJDK Code Convention coverage

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Baratali Izmailov, Vyom Yadav

Description:

OpenJDK Code Convention was one of the first guidelines on how to write Java code. It is marked as outdated (because of the date of its last update), but the best practices described there do not have an expiration date. The new OpenJDK Java Style Guidelines document is close to its final version and will most likely succeed the OpenJDK Code Convention. However, a number of Apache projects still follow the OpenJDK rules, so the community needs both configurations.

OpenJDK Code Convention is already partly covered by Checkstyle, where it is known as the Sun Code Convention. Many validation Rules have been added and changed in Checkstyle since Sun's configuration was created (in 2004).

During the project it is required to review both documents in detail and prove publicly that Checkstyle covers all guideline rules. Missing functionality needs to be created, and blocking bugs need to be fixed. The page OpenJdk Java Style Checkstyle Coverage needs to be updated. A new page, "New OpenJDK's Java Style Checkstyle Coverage", needs to be created. Both pages need to be formatted in the same way as Google's Java Style Checkstyle Coverage.

Proof of necessity: javadoc issues on GitHub; results of an open survey; requests from users for OpenJDK coverage support.

Deliverables:

  • embedded config file with all modules that are required for coverage
  • HTML page that explains how each paragraph in the style guide is covered by Checkstyle

QnA: https://discord.com/channels/845645228467159061/1214571550783840307 (invite)


Project Name: Coverage of Documentation Comments Style Guide

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Baratali Izmailov

Description:

The project will mainly focus on automating Documentation Comments (javadoc) guidelines with Checkstyle Checks. Reliable comment parsing was a major improvement made to Checkstyle during GSoC 2014; the achieved results need to be reused to reliably implement automation of Javadoc best practices.

A separate configuration file with the newly created Checks needs to be created. Documentation best practices do not make sense for all projects: Javadoc validation matters only for library projects that need to publish their documentation online.

Deliverables: The result of this project will be a configuration file with the maximum possible coverage of the Comment style guide. The report should look like Google's Java Style Checkstyle Coverage. If there is time left, we can focus on coverage of guidelines from https://blog.joda.org/2012/11/javadoc-coding-standards.html

Proof of necessity: javadoc issues on GitHub.

QnA: https://discord.com/channels/845645228467159061/1214571282776064130 (invite)


Project Name: Spellcheck of Identifiers by English dictionary

Skills required: intermediate Java.

Project type: new feature implementation.

Project goal: implement spell checking of all identifiers in Java code.

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Andrei Paikin

Description:

The correct spelling of words in code is very important, since a typo in the name of a method that is part of an API could result in a serious problem. Mistakes in names also make reading code frustrating and misleading, especially when a one-letter typo forces a developer to read the javadoc or even the implementation of the method. The two most popular IDEs (Eclipse and IntelliJ IDEA) already have spell-check ability. It will be beneficial for Checkstyle to have the same functionality, usable in any Continuous Integration system through the Command Line Interface or as part of a build tool (maven, ant, gradle, ...), with a wide range of options to customize it to users' needs. Features of existing spell-checkers need to be analyzed: IntelliJ IDEA Spellchecking, Eclipse Spelling. There are a number of open-source projects that do spell-checking. It is OK to reuse them if the license is compatible. Examples: https://code.google.com/archive/p/bspell/ , http://www.softcorporation.com/products/spellcheck/, ... https://github.com/giraciopide/shellcheck-maven-plugin, https://github.com/codespell-project/codespell
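The core of such a module could look roughly like the sketch below: split each identifier into words at case and underscore boundaries, then look each word up in a dictionary. This is only an illustration under simple assumptions (an in-memory word set, ASCII identifiers); the class and method names are hypothetical, and a real module would also need options for custom word lists and abbreviations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class IdentifierSpelling {
    /** Splits camelCase, PascalCase and UPPER_SNAKE identifiers into lowercase words. */
    static List<String> words(String identifier) {
        // Split on underscores, before an uppercase letter that follows a
        // lowercase letter or digit, and before the last capital of an
        // acronym ("XMLLogger" becomes "XML" and "Logger").
        String[] parts = identifier.split(
            "_|(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");
        List<String> result = new ArrayList<>();
        for (String part : parts) {
            if (!part.isEmpty()) {
                result.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return result;
    }

    /** Returns the words of the identifier that are missing from the dictionary. */
    static List<String> misspelled(String identifier, Set<String> dictionary) {
        List<String> unknown = new ArrayList<>();
        for (String word : words(identifier)) {
            if (!dictionary.contains(word)) {
                unknown.add(word);
            }
        }
        return unknown;
    }
}
```

With a dictionary containing `receive` and `message`, the identifier `recieveMessage` would be reported for the word `recieve`.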

Deliverables:

  • regular Checkstyle module that does validation
  • the module should be applied to all sources of our code
  • disabling of the shell-based implementation of spellcheck in our project for Java sources
  • documentation on how to use module

QnA: https://discord.com/channels/845645228467159061/1214572273038786631 (invite)


Project Name: Automated Website Generation

Skills required: Java, basic understanding of testing principles, technical writing, continuous integration

Project goal: organize documentation and automate its maintenance

Project size: medium (175 hours)

Complexity Rating: intermediate

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

This project is designed to tackle the persistent challenge of maintaining accurate and current documentation in our dynamic development environment. Acknowledging the limitations of manual documentation processes, this initiative introduces automation to streamline content creation, with a focus on ensuring consistent formats and robust verification checks. The project's goal is to provide users with reliable, standardized, and regularly updated information while equipping contributors with templates and automated tools to simplify the incorporation of details for new modules. By elevating documentation practices, this project aligns with industry best practices, fostering clarity for both users and contributors within the Checkstyle project.

Deliverables:

  • Reuse of the xdoc templates model that we already have
  • Introduction of description macros that would take content from javadoc of module
  • Resolution of edge cases in documentation generation
  • Extend and make consistent all check usage examples
  • Reduce/eliminate manual documentation updates for examples
  • Introduce checks to ensure that all configuration options are covered in examples
  • Moving website generation logic to separate project to avoid extra classes and dependencies in checkstyle jar artifact.
  • HTML Enhancements for our website to ease navigation and user experience

QnA: https://discord.com/channels/845645228467159061/1214574452021530686 (invite)


Project Name: Internal Tooling for Regression Testing

Skills required: Java, Groovy, BASH, continuous integration, basic understanding of testing principles

Project goal:

Project size: medium (175 hours)

Complexity Rating: intermediate

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

Checkstyle requires a dedicated tool for handling check regression testing based on changes in a pull request. This tool must intelligently identify which check modules were altered in the pull request to conduct targeted testing, ensuring the avoidance of bugs or loss of functionality. It should dynamically generate testing configurations (checkstyle configuration files) based on the modified check modules and perform regression tests on the pull request code against the master branch of Checkstyle across a specified project list. Subsequently, the tool should generate a comprehensive check regression report, highlighting any differences in violations, and seamlessly share it within the pull request. While there may be manually prepared configuration chunks for each module, the project emphasizes the need for full automation, including the generation of configurations directly from the check modules themselves.
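The first two steps (detecting changed check modules and generating a configuration for them) might be sketched as follows. This is a simplified illustration, not the intended tool: it assumes check classes follow the `<Name>Check.java` naming pattern, that the changed paths come from something like `git diff --name-only`, and that every changed check is a `TreeWalker` child, which is not true for all Checkstyle modules. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class RegressionConfigSketch {
    /** Derives module names from changed paths like ".../coding/EmptyStatementCheck.java". */
    static List<String> changedModules(List<String> changedPaths) {
        TreeSet<String> modules = new TreeSet<>();
        for (String path : changedPaths) {
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            boolean isCheckSource = path.contains("src/main/")
                && fileName.endsWith("Check.java")
                && !fileName.startsWith("Abstract");
            if (isCheckSource) {
                modules.add(fileName.substring(
                    0, fileName.length() - "Check.java".length()));
            }
        }
        return new ArrayList<>(modules);
    }

    /** Builds a configuration that enables only the given modules with default options. */
    static String config(List<String> modules) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n")
            .append("<!DOCTYPE module PUBLIC\n")
            .append("    \"-//Checkstyle//DTD Checkstyle Configuration 1.3//EN\"\n")
            .append("    \"https://checkstyle.org/dtds/configuration_1_3.dtd\">\n")
            .append("<module name=\"Checker\">\n")
            .append("  <module name=\"TreeWalker\">\n");
        for (String module : modules) {
            xml.append("    <module name=\"").append(module).append("\"/>\n");
        }
        xml.append("  </module>\n").append("</module>\n");
        return xml.toString();
    }
}
```

A full implementation would also have to vary module properties, since a regression may only show up under non-default options.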

Deliverables:

  • Internal tool for determining changed modules
  • Internal tool for generating check configurations
  • Integration testing
  • Integration of new tools with existing report generation system
  • Documentation, including examples

QnA: https://discord.com/channels/845645228467159061/1214573452959154186 (invite)


Project Name: Enhance Mutation Testing Coverage

Project goal: reduce technical debt and improve code quality

Skills required: Java, basic understanding of testing principles

Project size: medium (175 hours)

Complexity Rating: intermediate

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

Checkstyle has recently enriched its mutation testing suite with a set of new mutators powered by pitest, a state-of-the-art mutation testing system renowned for providing gold standard test coverage in Java and the JVM. This project focuses on a meticulous review of suppressions employed within Checkstyle to manage pitest violations, aiming to identify opportunities for new tests or adjustments to existing ones that can effectively resolve these suppressions. The objective is to ensure the continued functional soundness of the code, potentially involving a deep dive into module logic to facilitate test identification and contribute to the resolution of suppression-related issues.
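For readers new to mutation testing, the sketch below illustrates what a surviving mutant looks like, using pitest's conditionals boundary mutation (`>` changed to `>=`) as the example. The method is made up for illustration and is not taken from Checkstyle; the mutant is written out by hand here, whereas pitest generates it in bytecode.

```java
class BoundaryExample {
    /** Original logic: a line is too long only when it exceeds the limit. */
    static boolean isTooLong(int length, int max) {
        return length > max;
    }

    /** Hand-written equivalent of the mutant: ">" replaced with ">=". */
    static boolean isTooLongMutant(int length, int max) {
        return length >= max;
    }
}
```

A test that only checks `isTooLong(81, 80)` passes against both versions, so the mutant survives and would need a suppression. Adding a test at the boundary, `isTooLong(80, 80)`, distinguishes the two and kills the mutant.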

Deliverables:

  • Review of existing suppressions of pitest survivals
  • New tests or improvements to existing tests
  • Resolution of 100% of existing suppressions
  • Documentation, including examples

QnA: https://discord.com/channels/845645228467159061/1214573720056762468 (invite)


Project Name: Eliminate Maven Plugin Usage

Skills required: Java, Groovy, Maven

Project goal: remove all usages of maven-checkstyle-plugin in our tools

Project size: medium (175 hours)

Complexity Rating: intermediate

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav, Richard Veach

Description:

Checkstyle serves as a widely used library across various tools, with a notable dependency on the maven-checkstyle-plugin for continuous integration and regression testing. However, this reliance on an external tool has restricted our ability to introduce breaking changes to the Checkstyle project, given the potential disruptions it causes in testing. Consequently, we've had to implement workarounds to maintain the connection and dependence on the maven-checkstyle-plugin. To foster autonomy and minimize dependencies, Checkstyle is undertaking efforts to break away from this plugin and shift towards relying solely on tools under our maintenance. The list of connected issues below outlines specific areas that require modification to facilitate this transition.

Deliverables:

  • Remove all usages of maven-checkstyle-plugin in our tools
  • Update documentation to reflect changes
  • Update build, CI, and regression testing to use internal tools exclusively

Connected Issues:

Example of Plugin Issue: Upgrade XML logger to XML 1.1

QnA: https://discord.com/channels/845645228467159061/1214574180591214592 (invite)


Project Name: Refine Google Style Guide Implementation

Skills required: Java, basic understanding of testing principles, basic understanding of static analysis

Project goal: improve quality of google style guide implementation

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Vyom Yadav, Andrei Paikin

Description:

Checkstyle boasts a robust implementation of the Google style, well-documented on our coverage page specifying the supported style version. To maintain alignment with updates in the Google style guide, we aim to systematically review changes such as this commit and adapt our Checks configuration accordingly. Addressing reported defects and user issues regarding style guide mismatches is pivotal. All concerns reported for Modules/Checks present in the Google Style config will be thoroughly reviewed, appropriately labeled for easy filtration, and promptly rectified. Conceptually, a refactoring of our integration tests is imperative; transitioning from a per-module approach to a chapter-wise testing methodology aligned with the guideline structure. By mapping chapter requirements to sets of modules loaded from google_checks.xml, we can conduct comprehensive validations over input files, presenting a consolidated result.

Deliverables:

  • Review and update Google Style config to most recent content of style guide
  • Resolve known issues with modules/checks
  • Refactor integration tests to be chapter-wise
  • Resolve all reported issues with Google Style config

QnA: https://discord.com/channels/845645228467159061/1214568865019985931 (invite)


Project Name: Patch Suppression improvement

Skills required: basic Java

Project type: extension of existing feature implementation.

Project goal: implement new strategies for existing filter/suppression module or improve existing

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Roman Ivanov, Ruslan Diachenko

Description: Introducing Checkstyle to a project can be challenging, especially when the project has a massive amount of code, is very active in development, and has no resources to start a new code cleanup process. It may require extensive effort, especially when legacy code from previous contributors turns the cleanup into a monotonous job that everyone tries to avoid. It is easy to say what code should look like, but it may be hard to actually enforce rules in an existing codebase.

For example, Guava does not follow Google style, and it is hard to assign somebody to fix ALL problems left by previous contributors. Good practice from OpenJDK actually discourages code changes without good reason.

A better approach is to leave existing code as is and validate only new code. Checkstyle already has a wide array of filter functionality that can suppress certain violations if the user classifies a violation as “won’t fix”. But just setting up the initial suppressions still requires a huge effort to review all the violations, or to organize a team for a special cleanup process.

This project was originally done at GSoC 2020, but during usage we found that Checkstyle violations still reach beyond the changed code, which creates an avalanche of changes and complicates usage in a real project.

We need to focus on parsing patch files to get a more precise location of changes and to be able to skip a violation if its fix would go outside the changed lines. For example, if a user changes the line wrapping of a long method signature, we should not demand that they reduce the number of parameters or fix names, as this would trigger changes in other parts of the code.
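The patch-parsing part can be sketched as below, under the assumption that the input is a standard unified diff for a single file. The class and method names are hypothetical and do not reflect the existing filter's implementation. The sketch only answers "was this line added by the patch"; deciding whether a fix would spill outside the changed lines needs per-check knowledge on top of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatchLines {
    // Hunk header such as "@@ -10,4 +12,6 @@"; an omitted count means 1.
    private static final Pattern HUNK =
        Pattern.compile("^@@ -(\\d+)(?:,(\\d+))? \\+(\\d+)(?:,(\\d+))? @@.*$");

    /** Returns the line numbers in the new file that the patch added. */
    static List<Integer> addedLines(String patch) {
        List<Integer> added = new ArrayList<>();
        int newLine = 0;
        int oldRemaining = 0;
        int newRemaining = 0;
        for (String line : patch.split("\n")) {
            if (oldRemaining == 0 && newRemaining == 0) {
                // Outside a hunk: skip file headers until the next hunk header.
                Matcher hunk = HUNK.matcher(line);
                if (hunk.matches()) {
                    oldRemaining = hunk.group(2) == null ? 1 : Integer.parseInt(hunk.group(2));
                    newLine = Integer.parseInt(hunk.group(3));
                    newRemaining = hunk.group(4) == null ? 1 : Integer.parseInt(hunk.group(4));
                }
            } else if (line.startsWith("+")) {
                added.add(newLine++);
                newRemaining--;
            } else if (line.startsWith("-")) {
                oldRemaining--;
            } else if (!line.startsWith("\\")) {
                // Context line present in both files.
                newLine++;
                oldRemaining--;
                newRemaining--;
            }
        }
        return added;
    }

    /** A violation is kept only if it sits on a line the patch added. */
    static boolean shouldReport(int violationLine, List<Integer> addedLines) {
        return addedLines.contains(violationLine);
    }
}
```

Counting lines from the hunk header, rather than relying on prefixes alone, keeps `---` and `+++` file headers from being mistaken for removed or added lines.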

As proof of success for this project, it is required to get some open-source project on board to use Checkstyle and this new feature. It would be good to try to collaborate with the Guava project once more, or we can ask our friends in the Eclipse-CS, Spring, or HBase projects.

Deliverables:

  • new Filter in Checkstyle that is applied to our code base.
  • documentation on how to use the new filter.
  • apply the filter to the eclipse-cs project to work on each update (address feedback from usage).

QnA: https://discord.com/channels/845645228467159061/1214572538043043890 (invite)


Project Name: Extend Checker Framework Integration

Skills required: Java, basic understanding of testing principles, basic understanding of Java type system

Project goal: Further usage of Checker Framework and increase internal knowledge base

Project size: large (350 hours)

Complexity Rating: hard

Mentors: Nick Mancuso, Roman Ivanov

Description:

The goal of this project is to advance the integration of the Checker Framework into our existing codebase, enhancing code quality, correctness, and maintainability. In addition to refining the setup already present in our build, the project will focus on incorporating the Checker Framework's type system into key components of our code and creating comprehensive documentation and best practices to guide developers in utilizing the framework effectively.

Deliverables:

  • Integrate Checker type system with codebase
  • Refine existing build
  • Develop internal documentation about our usage of Checker
  • Provide examples, guidelines and best practices for developers to follow

QnA: https://discord.com/channels/845645228467159061/1214572824736571472 (invite)