This repository has been archived by the owner on Mar 29, 2022. It is now read-only.

Add invalid/spam threshold for exclusion #447

Open
MattIPv4 opened this issue Jan 6, 2020 · 8 comments
Labels
feature-request Unaccepted user submitted new feature suggestion
Projects

Comments

@MattIPv4
Member

MattIPv4 commented Jan 6, 2020

Feature description

We saw quite a few users who were cheating the system in Hacktoberfest 2019 by creating many spam PRs across repositories on GitHub, far more than four, in the hopes that four of them wouldn't get labelled as "invalid" so that they could win.

Something that might help to reduce this is to add a threshold into the app logic for excluding a user from winning when they have too many invalid PRs.

I'm unsure what the threshold should be there, but let's say three for now. If a user were to get three PRs marked as "invalid" or in a spam repository during the Hacktoberfest period, they'd be excluded from participating and could not win, no matter how many other legitimate PRs they had.
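A minimal sketch of the threshold rule described above, not the app's actual logic. The `PullRequest` type, the `in_spam_repo` flag, and the function names are hypothetical, and the value of three is only the placeholder suggested in this issue.

```python
from dataclasses import dataclass

# Placeholder value: the issue only suggests "three for now".
INVALID_THRESHOLD = 3


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical, simplified view of a PR for this sketch."""
    labels: frozenset            # label names on the PR, lowercased
    in_spam_repo: bool = False   # True if the repository itself was flagged as spam


def is_strike(pr: PullRequest) -> bool:
    """A PR counts against the user if it is labelled "invalid" or sits in a spam repo."""
    return "invalid" in pr.labels or pr.in_spam_repo


def is_excluded(prs: list[PullRequest], threshold: int = INVALID_THRESHOLD) -> bool:
    """Exclude the user once strikes reach the threshold, however many valid PRs exist."""
    strikes = sum(1 for pr in prs if is_strike(pr))
    return strikes >= threshold
```

With this shape, ten legitimate PRs do not offset three strikes, which is the "no matter how many other legitimate PRs" behaviour proposed here.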

We should also ensure there is an appeal system in place (maybe just a CTA to email hf@do) and a way for Hacktoberfest staff to manually remove someone from exclusion if there is a legitimate reason for them to have invalid PRs.

Further, it might be worth also having a way for Hacktoberfest staff to manually add someone to the exclusion list if they spot a spammer that the system hasn't recognised.
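One way the two manual paths (removing someone after an appeal, adding a spammer the system missed) could sit on top of an automatic check is sketched below. The `Override` enum and `effective_exclusion` function are hypothetical names for illustration, and giving staff decisions precedence is an assumption.

```python
from enum import Enum
from typing import Optional


class Override(Enum):
    """Hypothetical staff decision recorded against a user."""
    FORCE_EXCLUDE = "force_exclude"  # staff spotted a spammer the system missed
    FORCE_ALLOW = "force_allow"      # staff accepted an appeal


def effective_exclusion(auto_excluded: bool, override: Optional[Override]) -> bool:
    """Staff decisions take precedence over the automatic result (an assumption)."""
    if override is Override.FORCE_EXCLUDE:
        return True
    if override is Override.FORCE_ALLOW:
        return False
    # No staff decision recorded: fall back to the automatic check.
    return auto_excluded
```

Keeping the override separate from the automatic result means an appeal decision would survive later recalculation of the user's PR counts.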

@MattIPv4 MattIPv4 added the feature-request Unaccepted user submitted new feature suggestion label Jan 6, 2020
@MattIPv4 MattIPv4 added this to Triage in 2020 via automation Jul 24, 2020
@MattIPv4 MattIPv4 moved this from Triage to P2 - If time allows in 2020 Jul 24, 2020
@Mariatta

I support this feature request.

I was thinking that if they make 2-3 invalid PRs in the same week, that should be flagged somehow. Perhaps at that point they should receive a notification about it, so that they can learn from their mistake and have the opportunity to improve during the remainder of the month.
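A rough sketch of the weekly flag, assuming "the same week" means any rolling seven-day window and taking the lower end (2) of the suggested range. `should_notify` and its constants are hypothetical names.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)  # assumption: "the same week" read as any rolling 7 days
FLAG_AT = 2                 # assumption: lower end of the suggested 2-3


def should_notify(invalid_times: list[datetime], flag_at: int = FLAG_AT) -> bool:
    """Return True if any rolling 7-day window holds at least `flag_at` invalid PRs."""
    times = sorted(invalid_times)
    start = 0
    for end, current in enumerate(times):
        # Move the window start forward until it is less than 7 days before `current`.
        while current - times[start] >= WINDOW:
            start += 1
        if end - start + 1 >= flag_at:
            return True
    return False
```

A warning at this point would be a notification only; whether it also feeds into exclusion is a separate decision.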

@Carreau

Carreau commented Oct 1, 2020

I think a threshold might be hard, though I think PRs marked as "spam" should count as -1, maybe -2, toward the total count (leave invalid as 0; people make mistakes).

One could also count "merged" as an extra +1 to push for quality, potentially requiring 6 instead of 4 to get a shirt. That would also encourage people to follow up.
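The scoring idea above could look roughly like this. It is only a sketch: the constant and function names are made up, the spam penalty is set to -1 (the comment floats -1 or -2), and a plain valid PR is assumed to be worth one point.

```python
SPAM_PENALTY = -1    # the comment suggests -1, maybe -2
INVALID_POINTS = 0   # invalid PRs neither help nor hurt
VALID_POINTS = 1     # assumption: a plain valid PR is worth one point
MERGED_BONUS = 1     # extra point on top of VALID_POINTS for merged PRs
REQUIRED_SCORE = 6   # raised from 4 to offset the merged bonus


def pr_points(labels: set[str], merged: bool) -> int:
    """Score one PR; a spam label wins over everything else."""
    if "spam" in labels:
        return SPAM_PENALTY
    if "invalid" in labels:
        return INVALID_POINTS
    return VALID_POINTS + (MERGED_BONUS if merged else 0)


def qualifies(prs: list[tuple[set[str], bool]]) -> bool:
    """Each entry is (labels, merged); the user qualifies at REQUIRED_SCORE or above."""
    total = sum(pr_points(labels, merged) for labels, merged in prs)
    return total >= REQUIRED_SCORE
```

Under these numbers three merged PRs reach the bar, four unmerged PRs do not, and a single spam PR cancels one unmerged valid PR.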

@bsipocz

bsipocz commented Oct 1, 2020

I'm not sure about the extra for "merged", as it's very easy to merge a very minor improvement, while genuinely helpful but slightly more complex PRs need more maintainer input, and thus maintainers' availability during Hacktoberfest should not benefit or penalize the participants' "score". But having a -1/-2 would certainly be great for spammy PRs.

@Moeplhausen

I just had to close over 80 (spam) pull requests in one of my repos that were submitted because of Hacktoberfest. Please put something in place that stops public repos from getting swamped with this kind of pull request!
A threshold is exactly what is needed - something as low as 3 sounds right.

@Carreau

Carreau commented Oct 1, 2020

I'm not sure about the extra for "merged", as it's very easy to merge a very minor improvement, while genuinely helpful but slightly more complex PRs need more maintainer input, and thus maintainers' availability during Hacktoberfest should not benefit or penalize the participants' "score".

I understand, but I think it would be good to push people toward quality and followup.
Maybe have 2 tiers of T-shirt, with one for "enough merged and long-term contribution" if you are still around and contributing across a full year? But that's getting away from the subject. I think that for this October something should be done really soon to deter freeloaders and spam PRs, something stronger than "invalid"/"spam" PRs simply not counting.

@NiklasBr

NiklasBr commented Oct 2, 2020

Seeing so many spam PRs as a result of this Hacktoberfest. It's sad that the incentives have been set up to reward this behaviour.

@Hacktoberfest Hacktoberfest deleted a comment from hayderimran7 Oct 2, 2020
@nightlark

I think a threshold might be hard, though I think PRs marked as "spam" should count as -1, maybe -2, toward the total count (leave invalid as 0; people make mistakes).

Leaving invalid as 0 is a good idea. I've seen some suggestions in the Hacktoberfest Discord server that maintainers can label PRs to their own projects as invalid if they don't want PRs that are part of their usual workflow to count towards Hacktoberfest.

@nightlark

I did some further research into how the invalid and spam labels have been getting used.

Invalid is being used as a way to make PRs not count towards Hacktoberfest (as the website says), yet 5-10% of the PRs given the label since Hacktoberfest started on Oct 1 still get merged. It's pretty clear that these uses aren't intended to count towards a ban. In addition, the label saw a decent amount of use before Hacktoberfest (and continues to see it) for things like a PR being superseded by a newer PR, or being automatically marked invalid when a particular change is no longer desired (often it's either the person who opened the PR or a bot that adds the label when the PR gets closed).

On the other hand, the spam label sees about 1/3 the use of the invalid label, but seems to get merged less than 1% of the time, making it a ~10x stronger signal that the PR is actual garbage and shouldn’t count, and should likely contribute to a ban.
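The "~10x" figure can be checked against the merge rates quoted in this comment. This is just the arithmetic, with hypothetical names. Because the spam rate is given as "less than 1%", the result is a lower bound on the ratio.

```python
# Merge rates as quoted in the comment (fraction of labelled PRs that still got merged).
INVALID_MERGE_RATE = (0.05, 0.10)  # "5-10%"
SPAM_MERGE_RATE_MAX = 0.01         # "less than 1%", so an upper bound


def signal_ratio(invalid_rate: float, spam_rate: float) -> float:
    """How many times more often an invalid-labelled PR is merged than a spam-labelled one."""
    return invalid_rate / spam_rate


# Roughly 5x at the low end and 10x at the high end of the quoted range.
low, high = (signal_ratio(rate, SPAM_MERGE_RATE_MAX) for rate in INVALID_MERGE_RATE)
```

So the spam label is at least about 5 to 10 times less likely than the invalid label to sit on a PR that a maintainer actually wanted.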

If desired I can share links to some of the repositories/PRs that are using invalid labels in a way that makes it pretty clear they don’t intend for a ban to happen - in one case a maintainer is even explaining the reason for adding the label with screenshots that have relevant parts of the hacktoberfest guidelines underlined for deciding what types of PRs should count.

Projects
2020
  
P2 - If time allows

7 participants