Adding Punctuation in Devnagari Script (Hindi) reduces 'Confidence'. #68

Open
abhishekkr opened this issue Feb 11, 2021 · 2 comments

@abhishekkr

I tested it with Hindi (in Devnagari script).

With a random sentence, the confidence was 39.x%.
So I tested it with the sentence "It is What Language.", which I guessed might be close to home for this.

In Devnagari script, the full stop is written as '|'. When I tested the sentence without it, the confidence was 100%. But on including it, the confidence dropped a few points.

I'm a n00b when it comes to Rust, but I can try fixing it if you don't have time and can point me in the right direction.

Also, I can help with training models if you need it and have a guide I can follow.

PS: Thanks for this. I was looking for an interesting project to restart my Rust journey.

[Attached screenshots: Screenshot_20210211-082028_Chrome.jpg, Screenshot_20210211-082019_Chrome.jpg]

@greyblake
Owner

Hi @abhishekkr,
thank you for the report!
Right now whatlang is undergoing heavy refactoring and improvement.
The old version is based on trigrams only.
The new one will also take alphabets into consideration (see: https://github.com/greyblake/whatlang-rs/tree/alphabet/src/alphabets)

The reason for your results may be the following: | is not recognized as punctuation, so it is used to build trigrams, which dilutes the confidence in the final result.
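To illustrate the effect, here is a minimal standalone sketch. It is not whatlang's actual code: the `PUNCTUATION` list and the `trigrams` helper are hypothetical names made up for this example. It treats the ASCII '|' as well as the Devanagari danda (U+0964) and double danda (U+0965) as punctuation, and shows how an unrecognized full stop adds extra trigrams that a language profile is unlikely to contain.

```rust
/// Characters treated as sentence-ending punctuation in this sketch.
/// This list is an assumption for illustration, not whatlang's real table.
const PUNCTUATION: [char; 3] = ['|', '\u{0964}', '\u{0965}'];

/// Split `text` into overlapping character trigrams.
/// When `strip_punctuation` is true, the characters in `PUNCTUATION`
/// are dropped before the trigrams are built.
fn trigrams(text: &str, strip_punctuation: bool) -> Vec<String> {
    // Drop punctuation only when asked to.
    let cleaned: String = text
        .chars()
        .filter(|c| !(strip_punctuation && PUNCTUATION.contains(c)))
        .collect();

    // Trim surrounding whitespace, then slide a window of 3 characters.
    let chars: Vec<char> = cleaned.trim().chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}
```

For the input `"abc |"`, calling `trigrams` with `strip_punctuation` set to `false` yields `"abc"`, `"bc "` and `"c |"`, while setting it to `true` yields only `"abc"`. The two extra trigrams carry no information about the language, which is one plausible way the confidence could drop.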

Unfortunately I have zero knowledge about Hindi, so if you're available for assisting with Devnagari languages, that would be very helpful!

@abhishekkr
Author

Sure, I'd be glad to help with that (Hindi being my first language) and with any other feature you might not be able to pick up due to your schedule.

Just two things: I've only done minimal introductory Rust, in its pre-stable releases, and I might not always be on time due to prior commitments. I won't need hand-holding with Rust or the project in general; that I'll manage (with feedback). But I can't do away with those commitments.
