Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve guide to contributing #3912

Closed
jnothman opened this issue Nov 30, 2014 · 28 comments
Closed

Improve guide to contributing #3912

jnothman opened this issue Nov 30, 2014 · 28 comments
Labels
Documentation Moderate Anything that requires some knowledge of conventions and best practices

Comments

@jnothman
Copy link
Member

I don't think the contributor's guide at present is very easy to read, and it should contain a bit more on certain matters. (At the moment it is mostly focused on technical how-tos.) The main additional things to get across are:

  • what belongs in scikit-learn, and what doesn't (and if something doesn't, where can it go)? (This may belong on a separate page)
  • more ways to contribute
  • what makes a good code contribution
  • the pull request process

This needs to be a concise-as-possible one-stop shop for the hundreds of requests to contribute.

I would like to see the guidelines have a shape like the following:

Scikit-learn's scope

This may belong in the contributor's guide, or may belong elsewhere in the documentation (a new "About the project", or in the FAQ). In any case it should be referenced from the contributor's guide.

  • well-established algorithms (and occasionally, recent variants of well-established algorithms), to be mostly usable out of the box; in distinction to pylearn2 et al.
  • specifically for ML problems whose targets have minimal structure: i.e. binary, regression, multiclass, multilabel and multitask problems from more-or-less flat feature vectors. These all involve (sometimes spare) arrays as input and targets.

Ways to contribute

This is already mostly there, but needs some reordering. It needs to begin with an emphasis on how overtasked the core developer group is, and how important it is to help complete existing highlights a contributor's competence, leading way for larger contributions.

  • reporting bugs
  • easy issues and how to adopt an issue
  • finding and adopting a stagnant pull request (PR): perhaps look for old WIPs
  • new features (see scope and discussion of contributions below first; suggest following mailing list)
  • documentation
  • examples
  • testing and improving consistency
  • reviewing (an extremely valuable contribution after showing you are familiar with what makes a good contribution)
  • spreading the word

How to contribute

  • the pull request process: perhaps adopting an issue; WIP status; MRG status; MRG+1; MRG+2; adding to what's new. Refinement and rebasing along the way.
  • the ideal code contribution: well-written (clear and reasonably efficient) code consistent with the project's style; tests that ensure it works; understandable, complete docstrings; narrative documentation in the User Guide; examples that highlight interesting features of the algorithm, how to inspect its model, and how to interpret its hyperparameters.
  • code management (i.e. git)
  • coding guidelines
  • API design
  • documentation guidelines
@jnothman jnothman added Documentation Moderate Anything that requires some knowledge of conventions and best practices labels Nov 30, 2014
@raghavrv
Copy link
Member

I'd like to work on this... this will help me learn more about good contribution practices too... (and some complex rebase techniques that we may document in the code management section, if needed)

But is it okay for novices to take this up? If I did could you guide me?

@jnothman
Copy link
Member Author

If I did could you guide me?

Of course!

@raghavrv
Copy link
Member

Thanks!! I am starting work on this by 22nd... :) soon :p

@raghavrv
Copy link
Member

I am thinking of using the following as reference :

  1. Sympy's detailed Development Workflow
  2. Sympy's writing documentation guide
  3. Scipy's guide
  4. Azure contributors guide

Please let me know if you have any additional reference or contradicting view with the above references...

@raghavrv
Copy link
Member

raghavrv commented Jan 5, 2015

I have an additional suggestion...

More often than not several new devs tend to make the same set of mistakes / run into the same set of difficulty... We could add a numbered FAQ like list of common errors / suggestions... ( Apart from the contrib guide which most newbies might not read fully )

So reviewers can save a lot of time by simply saying something like... Refer #34 in that list...

I can prepare that list by scavenging through old PRs and reviews? Do you think such a list would be worth the time spent on it? Please let me know your views... :)

@amueller
Copy link
Member

amueller commented Jan 5, 2015

I would maybe do this in a format of a checklist?
Not sure what common mistakes are. Maybe initializing variables in __init__? Not using proper numpydoc format? pep8?

@raghavrv
Copy link
Member

raghavrv commented Jan 5, 2015

For example check this I am sure this must have come across in some other occasssion... Though I am learning a lot from your comments :p , It would be painstaking for you to review multiple PRs with several repeated comments / advices... Instead you could rather simply direct me to a link which has content similar to this :

#1. Use assert_warn_messages wherever appropriate, to check if the message content of the warning matches the expected value... This would be more specific than simply doing assert_warns.

And for this :

#2. Always test that the sphinx html output is rendered correctly whenever you make a change to the rsts / docstrings.

Follow this to check the rendered html output of the docs.

code

You could simply say... "Make sure you follow #2 of that list"

#3. Similarly for this.

I've pointed out my mistakes alone just to be safe... :p but I guess there must be a few more like these around in other PRs?

@amueller
Copy link
Member

amueller commented Jan 5, 2015

True, but I would really rather have it as a checklist so that people can go through it before submitting the PR so that we don't have to comment at all ;)
The assert_warn_message is not really a common one, as it is mostly used for common tests and api compatibility things. #2 is certainly common and #3 also. And whitespace before : in docstrings ;)

@raghavrv
Copy link
Member

raghavrv commented Jan 5, 2015

so that people can go through it before submitting the PR so that we don't have to comment at all ;)

Hah :D okay I'll add a checklist instead...

@jnothman
Copy link
Member Author

jnothman commented Jan 5, 2015

Any reference, more so a checklist, needs to be concise. It needs to list
the most common issues, or those that are helpful to know in advance of
implementation, but shouldn't get bogged down in detail.

On 6 January 2015 at 06:07, ragv notifications@github.com wrote:

so that people can go through it before submitting the PR so that we
don't have to comment at all ;)

Hah :D okay I'll add a checklist instead...


Reply to this email directly or view it on GitHub
#3912 (comment)
.

@raghavrv
Copy link
Member

raghavrv commented Jan 5, 2015

Ok... thanks... I'll keep that in mind... and sorry for the delay in working on this...

@jnothman
Copy link
Member Author

jnothman commented Jan 5, 2015

Delay is par for the course in open source!@

On 6 January 2015 at 08:13, ragv notifications@github.com wrote:

Ok... thanks... I'll keep that in mind... and sorry for the delay in
working on this...


Reply to this email directly or view it on GitHub
#3912 (comment)
.

@raghavrv
Copy link
Member

raghavrv commented Feb 8, 2015

Have summarized your comment into sections for the proposed new contributions page and added it to wiki for live preview!

Please take a look and feel free to modify anything!

https://github.com/scikit-learn/scikit-learn/wiki/Contributors-Guide

@sotte
Copy link
Contributor

sotte commented Mar 5, 2015

I think the contributing docs should not be spread between CONTRIBUTING.md and doc/developers/index.rst. CONTRIBUTING.md is a thinned out copy of doc/developers/index.rst anyway and DRY.

Maybe just reference the webpage and doc/developers/index.rst from CONTRIBUTING.md.

@amueller
Copy link
Member

amueller commented Mar 5, 2015

The CONTRIBUTING.md is shown by github when opening a new issue, though.

@sotte
Copy link
Contributor

sotte commented Mar 5, 2015

That's why I'd put the link to the webpage and doc/developers/index.rst into CONTRIBUTING.md.

For my semi-personal projects I often include README.rst from sphinx into the documentation. This would work for CONTRIBUTING.md/CONTRIBUTING.rst as well.

@raghavrv
Copy link
Member

raghavrv commented Mar 6, 2015

NOTE: Document format for references as described in #4344

@raghavrv
Copy link
Member

raghavrv commented Apr 9, 2015

Hey @jnothman :) Could you take a look at this wiki page and let me know if the structure as defined in "Pages and Sections" seems fine... Once you approve of the structure I'll start the work on this :)

@amueller
Copy link
Member

I just realized that our docs don't specify "don't merge master". I think they should.

@hlin117
Copy link
Contributor

hlin117 commented Jun 5, 2015

It seems that there was a pull request for this issue already. Should this issue be closed?

@raghavrv
Copy link
Member

raghavrv commented Jun 5, 2015

@hlin117 Any issue for which a PR has been raised will automatically / manually be closed as soon as that PR gets merged into master. :)

@amueller
Copy link
Member

amueller commented Jun 5, 2015

@rvraghav93 not really true ;) we often forget about that. But this one is not addressed yet.

@devashishd12
Copy link
Contributor

@amueller @rvraghav93 how about adding this with a link to this to contributing.rst under the title "Rebasing and squashing commits". Although I'm not sure whether this should go into that or somewhere else.

@amueller
Copy link
Member

we should totally add that. I'm not sure what the state of @rvraghav93's edits is. He had a draft in the wiki, but I think incremental improvements might be easier.
I think it should go to the end of http://scikit-learn.org/dev/developers/contributing.html#how-to-contribute

@devashishd12
Copy link
Contributor

@amueller alright awesome I'll add it as soon as I get a go from @rvraghav93 :) should I start working on the contributor's guide too?

@raghavrv
Copy link
Member

Please go ahead. And yes if you want to go ahead with that too. I will try to help if possible as it is a big change.

@amueller
Copy link
Member

@jnothman Going through the initial list, I feel like we have all of that. What do you think?

@jnothman
Copy link
Member Author

Yes, I think it's roughly all covered. Let it be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Moderate Anything that requires some knowledge of conventions and best practices
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants