Contributing to librdkafka

(This document is based on curl's CONTRIBUTE.md - thank you!)

This document is intended to offer guidelines on how to best contribute to the librdkafka project. This concerns new features as well as bug fixes and general improvements.

License and copyright

When contributing with code, you agree to put your changes and new code under the same license librdkafka is already using unless stated and agreed otherwise.

When changing existing source code, you do not alter the copyright of the original file(s). The copyright will still be owned by the original creator(s) or those who have been assigned copyright by the original author(s).

By submitting a patch to librdkafka, you are assumed to have the right to the code and to be allowed by your employer, or whoever else may hold rights to it, to hand over that patch/code to us. We will credit you for your changes as far as possible, to give credit but also to keep a trace back to who made what changes. Please always provide us with your full real name when contributing!

Official librdkafka project maintainer(s) assume ownership and copyright ownership of all accepted submissions.

Write a good patch

API and ABI compatibility guarantees

librdkafka maintains a strict API and ABI compatibility guarantee: we guarantee not to break existing applications and we honour the SONAME version.

Note: ABI compatibility is guaranteed only for the C library, not C++.

Note to librdkafka maintainers:

Don't assume we can or should bump the SONAME version: doing so would break all existing applications relying on librdkafka, and there's no change important enough to warrant that. Instead, deprecate (but keep) old APIs and add new, better APIs as required. Deprecate APIs through documentation (@deprecate ..) rather than compiler hints (RD_DEPRECATED), since the latter will cause compilation warnings/errors for users.
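As a rough sketch of the note above: the old function stays exported with its original signature, is marked as deprecated only in its doxygen comment, and a new function is added next to it. All example_* names are hypothetical and not part of librdkafka.

```c
#include <stddef.h>
#include <string.h>

/**
 * @brief Copy \p src to \p dst (hypothetical old API, kept for ABI reasons).
 *
 * @deprecated Use example_name_copy2(), which takes the destination size.
 *
 * @returns the length of \p src.
 */
size_t example_name_copy(char *dst, const char *src);

/**
 * @brief Copy at most \p dst_size - 1 bytes of \p src to \p dst and
 *        nul-terminate it (hypothetical replacement API).
 *
 * @returns the length of \p src.
 */
size_t example_name_copy2(char *dst, size_t dst_size, const char *src);

size_t example_name_copy2(char *dst, size_t dst_size, const char *src) {
        size_t len = strlen(src);
        size_t n;

        if (dst_size > 0) {
                n = (len < (dst_size - 1)) ? len : (dst_size - 1);
                memcpy(dst, src, n);
                dst[n] = '\0';
        }

        return len;
}

/* The old symbol keeps its original signature and behaviour. */
size_t example_name_copy(char *dst, const char *src) {
        size_t len = strlen(src);

        memcpy(dst, src, len + 1);
        return len;
}
```

Existing applications that call example_name_copy() keep compiling and linking without warnings, while the documentation steers new code to example_name_copy2().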

Changes to existing APIs

Existing public APIs MUST NEVER be changed, as this would be a breaking API and ABI change. This line must never be crossed.

This means that no changes are allowed to:

  • public function or method signatures - arguments, types, return values.
  • public structs - existing fields may not be modified and new fields must not be added.

As for semantic changes (i.e., a function changes its behaviour), these are allowed under the following conditions:

  • the existing behaviour that is changed is not documented and not widely relied upon. Typically this revolves around what error codes a function returns.
  • the existing behaviour is well known but is clearly wrong and consistently trips people up.

All such changes must be clearly stated in the "Upgrade considerations" section of the release in CHANGELOG.md.

New public APIs

Since changes to existing APIs are strictly limited to the above rules, it is also clear that new APIs must be delicately designed to be complete and future proof, since once they've been introduced they can never be changed.

  • Never add public structs - there are some public structs in librdkafka and they were all mistakes, they've all been headaches. Instead add private types and provide accessor methods to set/get values. This allows future extension without breaking existing applications.
  • Avoid adding synchronous APIs; try to make them asynchronous by the use of rd_kafka_queue_t result queues, if possible. This may complicate the APIs a bit, but they're most of the time abstracted in higher-level language clients and it allows both synchronous and asynchronous usage.
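The first point above (private types with accessor methods) could look roughly like the following sketch. The example_opts_* names, the timeout field and its default value are hypothetical and only illustrate the pattern; they are not librdkafka APIs.

```c
#include <stdlib.h>

/* Public header part: only an opaque typedef and accessors are exposed. */
typedef struct example_opts_s example_opts_t;

example_opts_t *example_opts_new(void);
void example_opts_destroy(example_opts_t *opts);
void example_opts_set_timeout(example_opts_t *opts, int timeout_ms);
int example_opts_get_timeout(const example_opts_t *opts);

/* Private implementation part: fields can be added here later without
 * changing anything that applications were compiled against. */
struct example_opts_s {
        int timeout_ms;
};

example_opts_t *example_opts_new(void) {
        example_opts_t *opts = calloc(1, sizeof(*opts));

        if (opts)
                opts->timeout_ms = 1000; /* hypothetical default */
        return opts;
}

void example_opts_destroy(example_opts_t *opts) {
        free(opts);
}

void example_opts_set_timeout(example_opts_t *opts, int timeout_ms) {
        opts->timeout_ms = timeout_ms;
}

int example_opts_get_timeout(const example_opts_t *opts) {
        return opts->timeout_ms;
}
```

Since applications never see the struct layout or its size, a later release can add fields to struct example_opts_s without breaking the ABI.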

Portability

librdkafka is highly portable and needs to stay that way; this means we're limited to almost-but-not-quite C99, and standard library (libc, et al.) functions that are generally available across platforms.

Also avoid adding new dependencies since dependency availability across platforms and package managers is a common problem.

If an external dependency is required, make sure that it is available as a vcpkg package, and also add it as a source build dependency to mklove (see mklove/modules/configure.libcurl for an example) so that it can be built and linked statically into librdkafka as part of the packaging process.

Less is more. Don't try to be fancy, be boring.

Follow code style

When writing C code, follow the code style already established in the project. Consistent style makes code easier to read and mistakes less likely to happen.

clang-format is used to check, and fix, the style for C/C++ files, while flake8 and autopep8 are used for the Python scripts.

You must check the style before committing by running make style-check-changed from the top-level directory, and if any style errors are reported you can automatically fix them using make style-fix-changed (or just run that command directly).

The Python code may need some manual fixing since autopep8 is unable to fix all warnings reported by flake8. In particular, it will not split long lines, in which case a # noqa: E501 may be needed to turn off the warning.

See the end of this document for the C style guide to use in librdkafka.

Write Separate Changes

It is annoying when you get a huge patch from someone that is said to fix 511 odd problems, but discussions and opinions don't agree with 510 of them - or 509 of them were already fixed in a different way. Then the person merging this change needs to extract the single interesting patch from somewhere within the huge pile of source, and that gives a lot of extra work.

Preferably, each fix that corrects a problem should be in its own patch/commit with its own description/commit message stating exactly what it corrects, so that all changes can be selectively applied by the maintainer or other interested parties.

Also, separate changes make bisecting much easier when we track down problems and regressions in the future.

Patch Against Recent Sources

Please try to make your patches against the latest master branch.

Test Cases

Bugfixes should also include a new test case in the regression test suite that verifies the bug is fixed. Create a new tests/00-<short_bug_description>.c file and try to reproduce the issue in its most simple form. Verify that the test case fails for earlier versions and passes with your bugfix in-place.

New features and APIs should also result in an added test case.

Submitted patches must pass all existing tests. For more information on the test suite see tests/README.md.

How to get your changes into the main sources

File a pull request on GitHub

Your change will be reviewed and discussed there, and you will be expected to correct flaws pointed out and update accordingly, or the change risks stalling and eventually just getting deleted without action. As a submitter of a change, you are the owner of that change until it has been merged.

Make sure to monitor your PR on GitHub and answer questions and/or fix nits/flaws. This is very important. We will take lack of replies as a sign that you're not very anxious to get your patch accepted and we tend to simply drop such changes.

When you adjust your pull requests after review, please squash the commits so that we can review the full updated version more easily and keep history cleaner.

For example:

# Interactive rebase to let you squash/fixup commits
$ git rebase -i master

# Mark fixes-on-fixes commits as 'fixup' (or just 'f') in the
# first column. These will be silently integrated into the
# previous commit, so make sure to move the fixup-commit to
# the line beneath the parent commit.

# Since this probably rewrote the history of previously pushed
# commits you will need to make a force push, which is usually
# a bad idea but works well for pull requests.
$ git push --force origin your_feature_branch

Write good commit messages

A short guide to how to write good commit messages.

---- start ----
[area]: [short line describing the main effect] [(#issuenumber)]
       -- empty line --
[full description, no wider than 72 columns, that describes as much as
possible as to why this change is made, and possibly what things
it fixes and everything else that is related]
---- stop ----

Example:

cgrp: Restart query timer on all heartbeat failures (#10023)

If unhandled errors were received in HeartbeatResponse
the cgrp could get stuck in a state where it would not
refresh its coordinator.

Important: Rebase your PR branch on top of master (git rebase -i master) and squash interim commits (to make a clean and readable git history) before pushing. Use force push to keep your history clean even after the initial PR push.

Note: Good PRs with bad commit messages or messy commit history, such as "fixed review comment", will be squashed into a single commit with a proper commit message.

Add changelog

If the changes in the PR affect the end user in any way, such as a user-visible bug fix, new feature, API or doc change, etc., a release changelog item needs to be added to CHANGELOG.md for the next release.

Add a single line to the appropriate section (Enhancements, Fixes, ..) outlining the change, an issue number (if any), and your name or GitHub user id for attribution.

E.g.:

## Enhancements
 * Improve commit() async parameter documentation (Paul Nit, #123)

librdkafka C style and naming guide

Note: The code format style is enforced by our clang-format and pep8 rules, so that is not covered here.

Minimum C standard: "gnu90"

This is the GCC default before 5.1.0, present in CentOS 7, still supported up to its EOL in 2024.

To test it, configure with GCC and CFLAGS="-std=gnu90".

It has the following notable limitations:

  • No in-line variable declarations.

Note: the "No variable declarations after statements" (-Wdeclaration-after-statement) requirement has been dropped. Visual Studio 2012, the last version not implementing C99, has reached EOL, and there were violations already.
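As a small illustration, assuming the limitation above refers to declaring the loop variable inside the for statement itself (which GCC rejects in -std=gnu90 mode), the loop counter is declared up front instead. The example_sum() helper is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: sums the first cnt elements of arr. */
int example_sum(const int *arr, size_t cnt) {
        size_t i; /* declared up front, not as for (size_t i = 0; ..) */
        int sum = 0;

        for (i = 0; i < cnt; i++)
                sum += arr[i];

        return sum;
}
```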

Function and globals naming

Use self-explanatory hierarchical snake-case naming. Pretty much all symbols should start with rd_kafka_, followed by their subsystem (e.g., cgrp, broker, buf, etc.), followed by an action (e.g., find, get, clear, ..).

The exceptions are:

  • Protocol requests and fields, which use their Apache Kafka CamelCase names, e.g.: rd_kafka_ProduceRequest() and int16_t ErrorCode.
  • Public APIs that closely mimic the Apache Kafka Java counterpart, e.g., the Admin API: rd_kafka_DescribeConsumerGroups().

Variable naming

For existing types use the type prefix as variable name. The type prefix is typically the first part of struct member fields. Example:

  • rd_kafka_broker_t has field names starting with rkb_.., thus broker variables should be named rkb.

Be consistent in using the same variable name for the same type throughout the code; it makes reading the code much easier as the type can be easily inferred from the variable.

For other types use reasonably concise but descriptive names. i and j are typical int iterators.
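The function and variable naming rules can be sketched with a made-up "example" subsystem. None of the symbols below exist in librdkafka; the rd_kafka_example_* names and the rkex prefix are assumptions chosen purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "example" subsystem: none of these symbols exist in
 * librdkafka, they only illustrate the naming scheme. */
typedef struct rd_kafka_example_s {
        const char *rkex_name; /* fields share the rkex_ type prefix */
        int16_t rkex_err;
} rd_kafka_example_t;

/* rd_kafka_ + subsystem (example) + action (find) */
rd_kafka_example_t *rd_kafka_example_find(rd_kafka_example_t *rkexs,
                                          size_t cnt,
                                          const char *name) {
        size_t i;

        for (i = 0; i < cnt; i++) {
                if (!strcmp(rkexs[i].rkex_name, name))
                        return &rkexs[i];
        }

        return NULL;
}

/* A protocol field keeps its Apache Kafka CamelCase name (ErrorCode),
 * while a variable of type rd_kafka_example_t is always named rkex. */
void rd_kafka_example_set_error(rd_kafka_example_t *rkex, int16_t ErrorCode) {
        rkex->rkex_err = ErrorCode;
}
```

A reader who sees rkex anywhere in such code can infer the type without looking up the declaration, which is the point of reusing the type prefix as the variable name.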

Variable declaration

Variables must be declared at the head of a scope, no in-line variable declarations after statements are allowed.

Function parameters/arguments

For internal functions, assume that all function parameters are properly specified; there is no need to check arguments for non-NULL, etc. Any internal misuse is a bug, and not something we need to preemptively protect against - the test suites should cover most of the code anyway - so put your efforts there instead.

For arguments that may be NULL, i.e., optional arguments, we explicitly document in the function docstring that the argument is optional (NULL), but there is no need to do this for non-optional arguments.
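A minimal sketch of these two rules, using a hypothetical internal helper (example_vals_sum() is not a librdkafka function): the mandatory vals argument is used without a NULL check, while the optional maxp argument is documented as such and checked before use.

```c
#include <stddef.h>

/**
 * @brief Sum the \p cnt values in \p vals (hypothetical internal helper).
 *
 * @param vals Values to sum.
 * @param cnt Number of elements in \p vals.
 * @param maxp Optional (NULL): set to the largest value in \p vals,
 *             left untouched if \p cnt is 0.
 *
 * @returns the sum of all values.
 */
int example_vals_sum(const int *vals, size_t cnt, int *maxp) {
        size_t i;
        int sum = 0;

        /* No NULL check for vals: internal callers must pass a valid
         * array, anything else is a bug in the caller. */
        for (i = 0; i < cnt; i++) {
                sum += vals[i];
                if (maxp && ((i == 0) || (vals[i] > *maxp)))
                        *maxp = vals[i];
        }

        return sum;
}
```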

Indenting

Use 8 spaces indent, no tabs, same as the Linux kernel. In emacs, use c-set-style "linux". For C++, use Google's C++ style.

Fix formatting issues by running make style-fix-changed prior to committing.

Comments

Use /* .. */ comments, not // ..

For functions, use doxygen syntax, e.g.:

/**
 * @brief <short description>
 * ..
 * @returns <something..>
 */

Make sure to comment non-obvious code and situations where the full context of an operation is not easily graspable.

Also make sure to update existing comments when the code changes.

Line length

Try hard to keep line length below 80 characters; when this is not possible, exceed it only with good reason.

Braces

Braces go on the same line as their enveloping statement:

int some_func (..) {
  while (1) {
    if (1) {
      do something;
      ..
    } else {
      do something else;
      ..
    }
  }

  /* Single line scopes should not have braces */
  if (1)
    hi();
  else if (2)
    /* Say hello */
    hello();
  else
    bye();
}

Spaces

All expression parentheses should be prefixed and suffixed with a single space:

int some_func (int a) {

    if (1)
      ....;

    for (i = 0 ; i < 19 ; i++) {


    }
}

Use space around operators:

int a = 2;

if (b >= 3)
   c += 2;

Except for these:

d++;
--e;

New block on new line

New blocks should be on a new line:

if (1)
  new();
else
  old();

Parentheses

Don't assume the reader knows C operator precedence by heart for complex statements; add parentheses to ease readability and make the intent clear.
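For instance, in the following sketch the parentheses spell out how the bit tests and the logical operators group. The EXAMPLE_F_* flags and example_may_access() are hypothetical.

```c
/* Hypothetical flag bits, for illustration only. */
#define EXAMPLE_F_READ 0x1
#define EXAMPLE_F_WRITE 0x2

/* Returns 1 if flags has the READ bit set, or if it has the WRITE bit
 * set while force is non-zero, else 0. */
int example_may_access(int flags, int force) {
        /* Without the parentheses the reader has to recall that !=
         * binds tighter than &, and && tighter than ||. */
        return ((flags & EXAMPLE_F_READ) != 0) ||
               (((flags & EXAMPLE_F_WRITE) != 0) && (force != 0));
}
```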

ifdef hell

Avoid ifdef's as much as possible. Platform support checking should be performed in configure.librdkafka.
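One possible way to keep ifdefs contained is sketched below, assuming a hypothetical WITH_EXAMPLE_FEATURE define that the configure step would emit when the feature is available. The single #ifdef lives in one small wrapper, and the rest of the code branches on that wrapper instead.

```c
/* Assumption: WITH_EXAMPLE_FEATURE is a hypothetical define that the
 * configure step would emit when the feature is available. */

/* The only #ifdef lives here, around one small wrapper. */
int example_feature_enabled(void) {
#ifdef WITH_EXAMPLE_FEATURE
        return 1;
#else
        return 0;
#endif
}

/* Callers branch on the wrapper at run time instead of repeating
 * the #ifdef at every call site. */
const char *example_feature_describe(void) {
        if (example_feature_enabled())
                return "example feature: enabled";
        else
                return "example feature: disabled";
}
```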

librdkafka C++ style guide

Follow Google's C++ style guide.