Check network-wide #66

Open
wizzwizz4 opened this issue Feb 25, 2017 · 7 comments

Comments

@wizzwizz4

Stack Overflow is the biggest site on the SE network, but the other sites have their fair share of plagiarism. Adding these should approximately double the load on the bot, so this might not be feasible at the moment.

@FelixSFD
Member

At the moment, Guttenberg only checks whether the post (the "target") is copied from answers to one of the linked/related questions. Since this doesn't work across SE sites, and doesn't even find many plagiarized posts on SO, we need to find another source, such as search engines.
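To make the idea of comparing a "target" against candidate answers concrete, here is a minimal sketch. It is not Guttenberg's actual algorithm: the word-shingle Jaccard similarity, the function names, and the `0.5` threshold are all assumptions chosen for illustration, and the candidate answers are assumed to have been fetched already.

```python
import re


def shingles(text, size=3):
    """Return the set of lowercase word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two texts, in [0, 1]."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def find_likely_sources(target, candidates, threshold=0.5):
    """Return (answer_id, score) pairs at or above the threshold, best first.

    `candidates` maps a hypothetical answer id to its body text.
    The threshold is an illustrative value, not a tuned one.
    """
    scored = [(cid, jaccard(target, body)) for cid, body in candidates.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

A comparison like this only works once the candidates are known. Where to get them from for other sites is exactly the open question in this issue.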

We'll discuss this in our next room meeting: SOBotics/SOBotics.github.io#9

If you have any ideas about where we can find possibly related sources/posts to locate the original of a "target", feel free to add them here (or in the linked issue for the room meeting).


The bot's API quota would become a problem if we had a way to access possibly related posts via the API. The CPU and RAM load won't be a problem, since the bot will be moved from my Raspberry Pi to a VPS in 6-8 weeks.

@Bhargav-Rao
Member

Expanding to other sites on the network is one of our plans too. SOBotics/SOBotics.github.io#4

Gut is almost production-ready for other sites at the moment. There have been many moderator flags from its reports. (There is a strong chance that its accuracy would be better on other sites with text-only answers.) But there are a couple of small issues that we face:

  1. The ChatExchange library that we are using for the bot is not compatible with Stack Exchange Chat.
  2. We are not aware of the number of posts per minute on other sites. The bot uses up 50% of the API calls already.

The initial plan for expansion is to first go ahead with a few other similar sites, like Ask Ubuntu (need to speak to Thomas Ward) and Unix & Linux (need to speak to terdon) before going on to the other sites.

@ArtOfCode-
Member

@Bhargav-Rao Is that Tuna's Java CE? The original CE (Python) is compatible with any chat server.

@Bhargav-Rao
Member

Yep, @ArtOfCode-. Tuna's lib was originally written only for SOCVFinder, which was intended to work only on Stack Overflow. In the latest version, there has been some progress towards making it compatible with all 3 hosts, but it still does not support SE and MSE chat.

@jdd-software
Member

However, this only relates to the chat room; it does not directly influence the implementation. For now I see it as a minor limitation (the room has to be on SO), and I bet Tuna can solve it if the need for rooms on other chat servers grows.

@Tunaki

Tunaki commented Feb 27, 2017

I'm sure the problem just revolves around authentication, but I haven't had a chance to look into it yet. I'm betting this is a simple fix. I will raise the priority of this on the todo list :).

@Bhargav-Rao
Member

@Tunaki has fixed the issue. I've set up Guttenberg for unix.se on my local machine.

http://chat.stackexchange.com/transcript/message/35733717#35733717

It looks like the unix site gets very little traffic, and most of the API calls are made only to check whether a new answer has arrived. After running it for 2 hrs, there were 20 answers. So I think we should be fine if we run the check every 20 mins or so.
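As a rough back-of-the-envelope check of those numbers, the sketch below estimates daily API usage for a given polling interval. It assumes one API call per poll and, purely hypothetically, one extra call per new answer (`calls_per_answer`); the real per-answer cost of the bot is not stated in this thread, and the function names are made up for illustration.

```python
def polls_per_day(interval_minutes):
    """Number of polls in 24 hours at a fixed polling interval."""
    return (24 * 60) // interval_minutes


def answers_per_day(answers_seen, hours_observed):
    """Extrapolate an observed answer count to a full day."""
    return answers_seen * 24 / hours_observed


def estimated_daily_calls(interval_minutes, answers_seen, hours_observed,
                          calls_per_answer=1):
    """Polls plus per-answer calls; calls_per_answer is a hypothetical cost."""
    per_answer = answers_per_day(answers_seen, hours_observed) * calls_per_answer
    return polls_per_day(interval_minutes) + per_answer


if __name__ == "__main__":
    # Figures from the 2-hour test run: 20 answers, polling every 20 minutes.
    print(polls_per_day(20))
    print(answers_per_day(20, 2))
    print(estimated_daily_calls(20, 20, 2))
```

With the figures from the test run, this works out to 72 polls and about 240 answers per day, assuming traffic stays at the observed rate around the clock.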
