This repository has been archived by the owner on Mar 29, 2022. It is now read-only.

re-evaluate and discuss participation rules #609

Open
hertg opened this issue Oct 3, 2020 · 22 comments
Labels
feature-request Unaccepted user submitted new feature suggestion

Comments

@hertg

hertg commented Oct 3, 2020

This issue is regarding PR #596 which recently changed the participation rules of hacktoberfest. A lot of feedback from the community went into the comments of the Pull Request.

The PR gets polluted from issue references and, unfortunately, some non-constructive comments, so I think that an issue would be a better place to continue the discussion and community feedback for the participation rules.

Please keep in mind that Hacktoberfest is an event organized by DigitalOcean, and it is entirely up to the team actually working on the event which rules get implemented. But since the event is all about working with the open-source community, I hope that this feedback can provide some value for the Hacktoberfest team.


I'd like to start by summarizing the most constructive feedback from the PR comments here and adding some of my own thoughts. Please feel free to debate any of the points made below and add your own feedback. :)

Questionable Rules

  • Opt-In / hacktoberfest topic requirement
    The event being opt-in drastically reduces the positive impact Hacktoberfest could have on smaller Open-Source projects. Especially those where the maintainers don't know anything about Hacktoberfest. While it reduces the negative impact the first few days had on larger Open-Source projects, there are less drastic measures to give maintainers more power if they get overwhelmed by spam PRs (for example: easy opt-out via no-hacktoberfest topic).
    The new requirement of having a hacktoberfest topic on the repository doesn't necessarily reduce spam, because people might assume the maintainer just doesn't know about Hacktoberfest and bother them by requesting that they add the label.

    This also makes the event a lot less wholesome in my opinion. Why not let maintainers of smaller / halfway abandoned projects be positively surprised when a sudden PR arrives that might have been partially motivated by Hacktoberfest? A contributor bothering the maintainer to add a hacktoberfest-accepted label to the PR, or even to add an unrelated hacktoberfest topic to the repository, makes it seem as if getting a free T-shirt was the contributor's only goal. That might hold some more self-aware people back from opening PRs in the first place. Not really a strong point to make, just a thought.

  • Always requiring the hacktoberfest-approved label on PRs
    That is probably a good idea, allowing maintainers to set a PR as hacktoberfest-approved when the PR can't be merged immediately. But if a PR already does get merged, why does it need that label? Isn't it already proof enough that a PR was meaningful if it got merged into the source? I don't see any benefit if a contributor needs to bother the maintainer to add this label if the PR already got accepted.

    The only positive side of requiring the label would be that the maintainer can merge a PR that just fixes a simple typo, but decide to not make it eligible to count in Hacktoberfest. However, this might cause a heated discussion between the contributor who wants to participate in Hacktoberfest and the maintainer, leaving pressure on the maintainer again. Solving that via additional minimum line changes or excluding documentation changes (see below) might solve that without leaving pressure on the maintainers.

    My info was wrong, as @fer22f and @MattIPv4 pointed out, every PR that is either approved, merged or labelled hacktoberfest-accepted actually gets counted.

Further Improvements

  • Require an account age minimum in order to participate
    I don't know if newly created accounts play even a big role in the spam PRs, but I can imagine that some spammers probably open new accounts to spam the PRs, rather than risking the "reputation" of their actual account.

  • Require a repository age minimum in order for PRs to count
    Might prevent some hello-world repositories that are just created to get the PRs

  • Require a manual review of any repository having "Hacktoberfest" in the repository name, whether they are actually eligible for the event.
    Searching for Hacktoberfest on Github lists a lot of repos that are just there for people to get the free T-Shirt. Although I'm guessing a lot of them get reported and are no longer considered for the PR count.

    Also, while browsing some repositories with the hacktoberfest topic, I found that there are a lot of repositories from colleges / universities that already let their students do assignments via Github. I'm not sure whether such repositories should be counted towards Hacktoberfest or not, they don't really provide a lot of value to people outside of those classes, but I understand that this is pretty hard to differentiate.

  • Exclude contributions to own repos.
    Or maybe only count them if the repository has a certain amount of stars, that way maintainers of meaningful repositories could still participate themselves.

  • Exclude documentation only changes to .md, .adoc, etc. and require a minimum line count
    Might be hard to implement. After all, adding documentation to projects is a very meaningful contribution, but just fixing 1 typo or adding an unnecessary sentence isn't that meaningful, so that's hard to differentiate

  • Allow maintainers to completely opt-out easily if they get overwhelmed with spam (i.e. with a no-hacktoberfest label or similar)
    Obviously, only if the event is opt-out. With the current opt-in strategy, this proposal doesn't make sense

  • Disqualify anyone who has 1+ PR marked as spam immediately.
    This seems harsh at first, but there is a big difference between the labels invalid and spam.
    Whether or not that gives too much power to the maintainer, and whether that might get abused is open for discussion. But maintainers aren't the bad actors here, the spammers / cheaters are.

@hertg hertg added the feature-request Unaccepted user submitted new feature suggestion label Oct 3, 2020
@fer22f

fer22f commented Oct 3, 2020

I'll just comment that "Always requiring the hacktoberfest-approved label on PRs" is not correct. As per the code:

https://github.com/digitalocean/hacktoberfest/blob/daed0a32bb838307066aff80291009b97cae5d8a/app/models/pull_request.rb#L107-L112

It's either merged, approved (by review), or with label hacktoberfest-accepted.
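
To make the "any one of three" rule concrete, here is a minimal sketch. It is not the code behind the link above: `SketchPR` and its fields are hypothetical names chosen only for illustration.

```ruby
# Hypothetical stand-in for a pull request. The field names are assumptions
# for illustration, not the attributes of the real PullRequest model.
SketchPR = Struct.new(:merged, :approved, :labels, keyword_init: true) do
  # A PR counts as accepted by a maintainer if any one of the three holds.
  def maintainer_accepted?
    merged || approved || labels.include?("hacktoberfest-accepted")
  end
end

labelled_only = SketchPR.new(merged: false, approved: false,
                             labels: ["hacktoberfest-accepted"])
untouched     = SketchPR.new(merged: false, approved: false, labels: [])

puts labelled_only.maintainer_accepted?  # true
puts untouched.maintainer_accepted?      # false
```

The point of the sketch is only that the three conditions are joined with OR, so a merged PR needs no label.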

@MattIPv4
Member

MattIPv4 commented Oct 3, 2020

👋 Some initial thoughts before I head to bed:

Opt-In / hacktoberfest topic requirement

It has been suggested to extend opt-in to be EITHER at the repo level with the hacktoberfest topic OR at the PR level with the hacktoberfest-accepted label -- this is very much in the suggestion phase but I think this would be a simple change to the new rules that'd help with extending the reach.

Moving away from an opt-in system is no longer an option unfortunately.

Always requiring the hacktoberfest-approved label on PRs

Not quite sure what the point/paragraph is about really, we DO NOT require the hacktoberfest-approved label on all PRs. (The label we're using is hacktoberfest-accepted). A maintainer can accept a PR (in an opted-in repo) in three different ways; merging the PR OR approving the PR OR labelling the PR as hacktoberfest-accepted.

Require an account age minimum in order to participate

This is very likely off the cards for us as a big part of Hacktoberfest is about education and introducing folks to open-source for the first time. Many will look to get involved with Hacktoberfest and attend an event, having never used git or GitHub, so the first step for them during the event is often to create an account. Adding a restriction on this would essentially gate-keep Hacktoberfest from new folks.

Require a repository age minimum in order for PRs to count
Might prevent some hello-world repositories that are just created to get the PRs

One of our spam mitigation techniques is actually almost the opposite of this. For new folks that are just set on getting a t-shirt, we are allowing and essentially encouraging them to go and create their own repo, where they can create practice PRs without disturbing actual open source projects.

Require a manual review of any repository having "Hacktoberfest" in the repository name, whether they are actually eligible for the event.

We do have a system for reporting repos and do act on super duper spammy things, but see my previous point in that we're mostly allowing folks to do what they want, with the intention being to keep spam away from legitimate projects. Trying to block everything cheaty would undoubtedly result in more spam being directed back at legitimate projects, which we want to avoid doing.

Exclude contributions to own repos.

Contributing to your own projects is an incredibly valid thing to do, I do not see why we'd ever want to block this. Sure, Hacktoberfest is about going out there and contributing to projects in need of help, but sometimes that project might actually be your own, so we don't want to stop people from doing that if that's what they want to do.

Equally, see my above point about spam mitigation: allowing folks to create their own repos and PRs there gives them their own space to do their own thing without disturbing actual projects.

Exclude documentation only changes to .md, .adoc, etc. and require a minimum line count

This would just further limit the scope of Hacktoberfest and be borderline gate-keeping what is an open-source contribution. Not something we want to do.

Allow maintainers to completely opt-out easily if they get overwhelmed with spam (i.e. with a no-hacktoberfest label or similar)
Obviously, only if the event is opt-out. With the current opt-in strategy, this proposal doesn't make sense

Opt-out is the system we had before, folks just had to email us and we'd exclude their repos. That didn't work, the community asked for opt-in and so that's what we're doing now.

Disqualify anyone who has 1+ PR marked as spam immediately.

As noted in our update post, we are considering implementing logic to ban users that continue to not follow our values. I'm sure we will continue to evaluate the spam situation over the coming week and this'll likely be a good step for us to take and implement if the spam problem continues.

@m10653

m10653 commented Oct 4, 2020

I really disliked the Opt-in approach. There are a few repos that I am wanting to contribute to that are somewhat obscure or outdated/ unmaintained. Some of these I have personally patched to use and would like to contribute my changes to make them useful again. Examples are https://github.com/Kulestar/powerui and https://github.com/dox187/powerui. This by no means will prevent me from making these contributions. But it does leave a sour taste in my mouth for Hacktoberfest.

I do like the idea of including a label or tag within GitHub that is automated to opt out of the event. This would be far better than requiring the repo owner to send an email, as it would be faster and more straightforward. Making this opt-out information more public/available to repo owners is also a must.

Maybe allow pull requests that include a #hacktoberfest tag to count on repos that haven't opted in or out, to give the repo owner a heads-up about the potential reason for the pull requests so they can opt out if they like?
I don't know, just spitballing.

@kakurasan

I really disliked the Opt-in approach. There are a few repos that I am wanting to contribute to that are somewhat obscure or outdated/ unmaintained. ... This by no means will prevent me from making these contributions. But it does leave a sour taste in my mouth for Hacktoberfest.

Absolutely.
I also want to (continue to) contribute to some less active repositories and Hacktoberfest was a big incentive to make such contributions...

Another disappointing thing was that the rules were changed in October.

@lnicola

lnicola commented Oct 4, 2020

How about counting merged PRs in any repository and all PRs in repositories that opted in (except the ones marked as invalid or spam)? I still dislike this because I sometimes contribute to repositories that see PRs merged after a year or so, but it seems more fair. Think of it as "Hacktoberfest on hard mode". You could also show only opted-in repos on the project page and make it easier to opt out with a project topic instead of an email.

I would also cancel the timer for merged PRs. What does it protect against, malicious co-maintainers? Right now there's a large amount of uncertainty -- you might have seven PRs or zero, and you'll only find out two weeks after the event ends.

@hertg
Author

hertg commented Oct 4, 2020

@MattIPv4

Thanks for your work on the event, and thank you for your response, it clarified a lot of things for me.
I've added some thoughts on your response below.


Quote 1

It has been suggested to extend opt-in to be EITHER at the repo level with the hacktoberfest topic OR at the PR level with the hacktoberfest-accepted label -- this is very much in the suggestion phase but I think this would be a simple change to the new rules that'd help with extending the reach.

This sounds like a reasonable update. I can't think of any reason why both should be required.


Quote 2

Moving away from an opt-in system is no longer an option unfortunately.

I understand that changing it from opt-out to opt-in and then back to opt-out would just cause further confusion to everyone. But this just indicates that the change to opt-in might have been decided a little bit prematurely. As already pointed out, I completely understand why it happened and why Hacktoberfest probably can't go back to opt-out, but it may be some valuable input for the next Hacktoberfest.


Quote 3

Not quite sure what the point/paragraph is about really, we DO NOT require the hacktoberfest-approved label on all PRs. (The label we're using is hacktoberfest-accepted). A maintainer can accept a PR (in an opted-in repo) in three different ways; merging the PR OR approving the PR OR labelling the PR as hacktoberfest-accepted.

Yes, this was based on wrong information I had. I've updated my post to make that clear :)


Quote 4

This is very likely off the cards for us as a big part of Hacktoberfest is about education and introducing folks to open-source for the first time. Many will look to get involved with Hacktoberfest and attend an event, having never used git or GitHub, so the first step for them during the event is often to create an account. Adding a restriction on this would essentially gate-keep Hacktoberfest from new folks.

Quote 5

One of our spam mitigation techniques is actually almost the opposite of this. For new folks that are just set on getting a t-shirt, we are allowing and essentially encouraging them to go and create their own repo, where they can create practice PRs without disturbing actual open source projects.

These two points actually clarify a lot for me. My understanding was that this event is something like a "Thank you" to anybody working in open source by further motivating them to open more PRs during october and letting them win a T-Shirt if they go through with it.

With your points made above, it becomes clear to me that people who are already working in open source are not the main target audience of the event. The target audience for participants seems to be students and people who have never contributed to open source, and the goal is to motivate them to learn how GitHub and PRs work in general, to lower the bar for them to open actual meaningful PRs even after October ends. There's nothing wrong with that goal, it was just a misunderstanding on my part.

Please correct me if that interpretation is wrong, but the current participation rules and your arguments communicate this pretty clearly now.


Quote 6

We do have a system for reporting repos and do act on super duper spammy things, but see my previous point in that we're mostly allowing folks to do what they want, with the intention being to keep spam away from legitimate projects. Trying to block everything cheaty would undoubtedly result in more spam being directed back at legitimate projects, which we want to avoid doing.

Quote 7

Contributing to your own projects is an incredibly valid thing to do, I do not see why we'd ever want to block this. Sure, Hacktoberfest is about going out there and contributing to projects in need of help, but sometimes that project might actually be your own, so we don't want to stop people from doing that if that's what they want to do.

Equally, see my above point about spam mitigation: allowing folks to create their own repos and PRs there gives them their own space to do their own thing without disturbing actual projects.

Similar to the previous points, this clarifies some more things for me. But if the only requirement for people to win a T-Shirt is that they can prove how to use Github and how to do a PR inside their own repository, without requiring the PRs to actually be a meaningful contribution to an open-source project, this might be a bit frustrating to people already working in open source.

Yes, making the rules stricter on what counts as a contribution and what doesn't could be considered "gate-keeping" to students and newbies, who are apparently the main target audience of the event, so I completely understand this point. However, it does draw quite a thin line on which PR gets counted and which doesn't. Right now it seems that every PR counts towards Hacktoberfest as long as it doesn't spam unusable code to others. Which is a valid rule to apply, after all, you got 70'000 shirts to get rid of 😛


Quote 8

Opt-out is the system we had before, folks just had to email us and we'd exclude their repos. That didn't work, the community asked for opt-in and so that's what we're doing now.

To be honest, I wouldn't consider opting out by sending an email to be "an easy way" to opt-out. While it certainly is technically an easy way and anybody can send an email, it's just not a user-friendly way to implement that and I can understand if this implementation of the opt-out frustrated a lot of maintainers.

I think you could compare that to a website where you have to send an email in order to delete your account instead of them just providing a "delete my account" button on the settings page. I think most of us have had this experience where a website handled account deletion that way, and I'd argue that everyone hated it.

But sure, if a large number of maintainers requested that, it's probably a good idea to do it this way. I just think that applying stricter participation rules from the start and implementing an easier way to opt out would have mitigated a lot of the negative feedback you got from maintainers. But if that isn't possible because it contradicts the main goal of the event, I understand that opt-in might be the best way for you to handle this issue, which is unfortunate for some participants, especially those who are already working in open source.

@Andrew-Chen-Wang

just ref: #608 as part of this discussion.

@jamesmcm

jamesmcm commented Oct 5, 2020

As a maintainer and a contributor, the main issues I see are:

  • With the opt-in approach I have to tag each repo, this is a lot of effort for a temporary event. It also restricts contributions to only these repos.

  • If malicious users are still able to bypass the rules by creating fake PRs in their own repos, then they will take up the limited number of rewards available and harm the public perception of the event, as has already happened (why put in effort for good PRs when cheaters will get the rewards first?)

I think it would have been best to require from the start that PRs be merged, and to limit participation to accounts and repositories that existed before October 2020 and to repositories that have more than 10 stars. This would have drastically cut the spam, as spammers would realise that they wouldn't get their fake PRs merged, and the opt-out approach may have worked okay.

But the damage has been done and we can't close Pandora's box. From Github's point of view I can totally understand insisting on opt-in events now since it looks terrible when large, professional projects like React, OpenJDK, etc. are receiving dozens of spam PRs a day.

With that in mind, I think it'd be best to also accept PRs that were merged with the hacktoberfest or hacktoberfest-accepted tags (regardless of repo topics) so it's a bit easier on maintainers, and to enforce the above requirements on account and repo age and repo stars to help cut down on spam repos and accounts automatically.
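
As a rough sketch of what the automatic checks proposed here could look like (the method name, its arguments, and the exact cutoff date of 1 October 2020 are assumptions for illustration, not anything the Hacktoberfest app does):

```ruby
require "date"

# Assumed reading of "existed before October 2020".
CUTOFF    = Date.new(2020, 10, 1)
# "More than 10 stars" in the proposal, so the comparison below is strict.
MIN_STARS = 10

# Hypothetical filter: all inputs would have to come from the GitHub API.
def passes_proposed_filter?(merged:, account_created:, repo_created:, stars:)
  merged &&
    account_created < CUTOFF &&
    repo_created < CUTOFF &&
    stars > MIN_STARS
end

puts passes_proposed_filter?(merged: true,
                             account_created: Date.new(2019, 3, 1),
                             repo_created: Date.new(2018, 6, 1),
                             stars: 42)  # true
puts passes_proposed_filter?(merged: true,
                             account_created: Date.new(2019, 3, 1),
                             repo_created: Date.new(2020, 10, 2),
                             stars: 42)  # false
```

The second call fails only because the repository was created after the cutoff, which is the "hello-world repo" case the proposal is aimed at.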

@lnicola

lnicola commented Oct 5, 2020

I just got this in the mail:

  • Your pull requests will count toward your participation if they are in a repository with the hacktoberfest topic and once they have been merged, approved by a maintainer or labelled as hacktoberfest-accepted.

  • Additionally, any pull request with the hacktoberfest-accepted label, submitted to any public GitHub repository, with or without the hacktoberfest topic, will be considered valid for Hacktoberfest.
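
Read together, the two bullets seem to boil down to one boolean rule. A minimal sketch of that reading follows; the method and argument names are hypothetical, and it ignores the review period and the invalid/spam labels.

```ruby
# Hypothetical helper expressing the two bullets from the email.
def counts_for_hacktoberfest?(repo_has_topic:, merged:, approved:, labels:)
  labelled = labels.include?("hacktoberfest-accepted")
  # Bullet 2: the label alone is enough, on any public repository.
  # Bullet 1: in a topic repo, a merge or an approving review also counts.
  labelled || (repo_has_topic && (merged || approved))
end

# Merged, but no label and no topic on the repository.
puts counts_for_hacktoberfest?(repo_has_topic: false, merged: true,
                               approved: false, labels: [])  # false

# Labelled in a repository without the topic.
puts counts_for_hacktoberfest?(repo_has_topic: false, merged: false,
                               approved: false,
                               labels: ["hacktoberfest-accepted"])  # true
```
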

So it looks like counting merged PRs (but without the hacktoberfest-accepted label) in repositories without the topic is now off the table, although I don't understand the reasoning behind it. Maybe to make sure to placate the two very vocal blog post authors that asked for the event to be opt-in?

@hertg

This comment has been minimized.

@lnicola

lnicola commented Oct 5, 2020

Please read the second bullet-point you cited again, it literally says the opposite.

I meant PRs that are merged, but not labelled as hacktoberfest-accepted, while the second bullet only refers to those unless I'm still misreading it.

My point is that if I make a PR to a random repository somebody posted on Reddit asking for a code review, I'm not going to ask the owner to add a label.

@hertg
Author

hertg commented Oct 5, 2020

@lnicola
Yes, your edited comment makes sense now :)

Based on MattIPv4's answer above, Hacktoberfest is opt-in now, and it won't change back to opt-out, at least not for this year. The mail you've received further confirms what was already talked about above.

My point is that if I make a PR to a random repository somebody posted on Reddit asking for a code review, I'm not going to ask the owner to add a label.

I agree with you on that. That's the first point I've made in my initial post and it is very unfortunate for people that are already working in open-source. Although it's probably something any maintainer would be willing to do, I can relate to the discomfort of having to bother a maintainer that they should add some arbitrary label to a PR.

I'd like to emphasize that the event seems to be geared towards students and newcomers, not towards people who already are part of the open-source community. At least that's what the statements and actions from the Hacktoberfest team lead one to believe. The recent changes to the participation rules were consciously decided in favor of students/newcomers rather than already active contributors.

But it seems that they were forced to choose one over the other in order to react fast to the backlash they've received. I'm not a big fan of this either, and I believe that stricter participation rules may have been a better way to go. At least that would have led to more meaningful contributions. The current rule set doesn't require a contribution to be meaningful at all, which is frustrating for more experienced open source developers.

However, giving a chance to newcomers isn't just bad. One should not completely discredit that decision just because one is on the losing side. Sure, maybe Hacktoberfest doesn't lead to as many meaningful contributions as it could have with stricter rules, but maybe the pool of developers who will contribute meaningful stuff in the future increases by some amount because of this event. Just taking away all chances for people with less experience isn't really that cool either, and there's little to no benefit in deciding in an elitist manner, especially not in open source.

That being said, I still would prefer some stricter rules. Right now, getting a Hacktoberfest T-Shirt doesn't really say "I care about open-source", but it just says "I know how to open a PR on Github", which is pretty sad. At least the design looks cool 😉

@peternewman

peternewman commented Oct 5, 2020

I would also cancel the timer for merged PRs. What does it protect against, malicious co-maintainers? Right now there's a large amount of uncertainty -- you might have seven PRs or zero, and you'll only find out two weeks after the event ends.

I'd go further, the "good" states seem to be:
https://github.com/digitalocean/hacktoberfest/blob/fcce147be3dc1c1b9c5e086e4f70c82cf4d880c4/app/models/pull_request.rb#L107-L112

All of which look to me like someone has pro-actively said the PR is good. Yes you probably want some hours or a couple of days in case I hit the button by mistake, but after that can't they go eligible? Or are Digital Ocean also reviewing everything themselves too or something?

Currently it goes from waiting->eligible if the following is met:

    pr.passed_review_period? &&
      !pr.spammy? &&
      !pr.labelled_invalid? &&
      pr.in_topic_repo? &&
      pr.maintainer_accepted?

I think it should be more like:

    pr.passed_accepted_review_period? &&
      !pr.spammy? &&
      !pr.labelled_invalid? &&
      pr.in_topic_repo? &&
      pr.maintainer_accepted?

Where passed_accepted_review_period is < passed_review_period or perhaps just do that for merged PRs and not labelled ones.
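
A minimal sketch of that idea, with everything in it assumed for illustration: the method signature, the two-day figure for merged PRs, and the choice to start the clock at acceptance rather than at submission. The fourteen-day default is the review period mentioned elsewhere in this thread.

```ruby
DAY = 24 * 60 * 60  # seconds in a day

# Hypothetical durations. The short one is a guess at "a couple of days".
DEFAULT_REVIEW_PERIOD = 14 * DAY
MERGED_REVIEW_PERIOD  = 2 * DAY

# True once the PR has waited long enough since a maintainer accepted it.
# Merged PRs get the shorter period; approved or labelled ones the default.
def passed_accepted_review_period?(accepted_at:, merged:, now:)
  period = merged ? MERGED_REVIEW_PERIOD : DEFAULT_REVIEW_PERIOD
  now - accepted_at >= period
end

accepted = Time.utc(2020, 10, 5)
now      = Time.utc(2020, 10, 8)  # three days later

puts passed_accepted_review_period?(accepted_at: accepted, merged: true,  now: now)  # true
puts passed_accepted_review_period?(accepted_at: accepted, merged: false, now: now)  # false
```
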

Also, looking at the tests, as I was considering opening a PR to potentially change this, it looks like the behaviour around LATE_ARRAY means this Hacktoberfest will finish on the 17th Oct or so, as after that there won't be enough time for PRs to mature and be merged, unless I've misread something @MattIPv4 ?

@MattIPv4
Copy link
Member

MattIPv4 commented Oct 5, 2020

We intentionally extended the review period from seven days to fourteen days as part of our rule changes, as well as requiring active acceptance from maintainers that a PR is valid. This is done so that, yes, if a maintainer makes a mistake they can correct it, but it also gives us more time and flexibility to react to any further spam issues that we encounter and to block any repositories that we believe don't follow our values. For users, it has no real impact other than that they have to wait a bit longer to get their swag. Everyone has to wait this period, so it isn't unfair -- making it shorter or non-existent for merged PRs would likely result in folks calling it unfair etc.

The data in LATE_ARRAY is actually used to test that we allow PRs in the last fourteen days of the event to mature correctly, even if the waiting timer goes beyond October. PRs submitted on the last day of October are still valid and the app will allow them time to mature before deciding if you win.
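
The date arithmetic can be sketched as follows. This is an illustration only, not the app's code: the method names are hypothetical, and it assumes the fourteen-day clock starts on the day the PR is submitted.

```ruby
require "date"

REVIEW_DAYS = 14                      # review period stated above
EVENT_END   = Date.new(2020, 10, 31)  # last day of the event

# Date on which a PR's waiting period ends (assumed to start at submission).
def matures_on(submitted_on)
  submitted_on + REVIEW_DAYS
end

# A PR can still count if it was submitted on or before the last day of the
# event, even when its waiting period only ends in November.
def submitted_in_time?(submitted_on)
  submitted_on <= EVENT_END
end

last_day = Date.new(2020, 10, 31)
puts matures_on(last_day)          # 2020-11-14
puts submitted_in_time?(last_day)  # true
```

Under these assumptions a PR opened on 31 October would mature on 14 November, so the cutoff applies to when a PR is submitted, not to when it matures.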

@MzHub

MzHub commented Oct 6, 2020

Would be great if the hacktoberfest participants could somehow review each other's changes before the PR reaches the maintainers.

It would allow the event to stay as it used to be, reduce the spam drastically and maybe even help people learn some reviewing as well.

@peternewman

but it also gives us more time and flexibility to react to any further spam issues that we encounter and to block any repositories that we believe don't follow our values.

Thanks for clarifying @MattIPv4 .

Linked to this, the email templates don't seem to reflect this, e.g. "Congrats on submitting your first Hacktoberfest pull request!" says:
"Project maintainers have a week to review each of your contributions."

Likewise similar text in other emails. But I couldn't find the source for this to open a PR to fix it.

@MattIPv4
Member

MattIPv4 commented Oct 6, 2020

Ah, yeah. Emails aren't version controlled. Will poke the team responsible for those so we can get the copy updated -- good catch!

@Andre601

Andre601 commented Oct 7, 2020

One thing I wonder is the following.
I PRed to a repo which doesn't have the hacktoberfest topic, so it shows as not supported/opt-in on the site.
The maintainer said they're participating with this repo. If that PR now receives the hacktoberfest-accepted label, will it become valid, or is the topic required?

@MattIPv4
Member

MattIPv4 commented Oct 7, 2020

Label will opt-in a PR on any repo, or merge/approve/label on a PR in a repo with the topic

@DanielJoyce

The issue of dummy repos just for t-shirts existed prior to this whole mess. If DO wants to hand out t-shirts based on naive Github queries without curation, that is on them. They're just t-shirts, not merit badges or war medals.

@MattIPv4
Member

MattIPv4 commented Oct 7, 2020

It's definitely something we'll continue to fight (we did last year through the repo reporting feature), but at the end of the day, they're only really being a nuisance to themselves in dummy repos and not causing a headache for open source maintainers.

@DanielJoyce

Maybe work with MS so that there is a way for Orgs or Users to tag themselves to opt in all their repos. Would also be useful for other things....
