ArSarcasm-v2 Dataset

ArSarcasm-v2 is an extension of the original ArSarcasm dataset, which was published along with the paper "From Arabic Sentiment Analysis to Sarcasm Detection: The ArSarcasm Dataset". ArSarcasm-v2 consists of ArSarcasm along with portions of the DAICT corpus and some new tweets. Each tweet was annotated for sarcasm, sentiment and dialect. The final dataset consists of 15,548 tweets divided into 12,548 training tweets and 3,000 testing tweets. ArSarcasm-v2 was used and released as part of the shared task on sarcasm detection and sentiment analysis in Arabic. You can find more details in the paper "Overview of the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic".

Dataset details:

ArSarcasm-v2 is provided in CSV format, with the same split that was used for the shared task. The training set contains 12,548 tweets, while the test set contains 3,000 tweets.

The dataset contains the following fields:

tweet: the original tweet text.
sarcasm: a boolean that indicates whether the tweet is sarcastic.
sentiment: the sentiment of the tweet (positive, negative, neutral).
dialect: the dialect used in the tweet. We used the five main regions of the Arab world; the labels and their meanings are as follows:
- msa: modern standard Arabic.
- egypt: the dialect of Egypt and Sudan.
- levant: the Levantine dialect including Palestine, Jordan, Syria and Lebanon.
- gulf: the Gulf countries including Saudi Arabia, UAE, Qatar, Bahrain, Yemen, Oman, Iraq and Kuwait.
- magreb: the North African Arab countries including Algeria, Libya, Tunisia and Morocco.
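
As a rough illustration of how the fields above might be read, here is a minimal sketch using only the Python standard library. It assumes the CSV has a header row with the four column names listed above; the accepted spellings of the sarcasm values and the helper names (`parse_sarcasm`, `load_rows`, `label_counts`) are assumptions made for this example, and the sample rows are made-up placeholders, not tweets from the dataset.

```python
import csv
import io
from collections import Counter

# Dialect labels as listed in this README.
DIALECT_LABELS = {"msa", "egypt", "levant", "gulf", "magreb"}


def parse_sarcasm(value):
    """Convert a CSV cell to bool. The accepted spellings are an assumption."""
    text = value.strip().lower()
    if text in {"true", "1"}:
        return True
    if text in {"false", "0"}:
        return False
    raise ValueError(f"unexpected sarcasm value: {value!r}")


def load_rows(handle):
    """Read rows from an open CSV file that has the four documented fields."""
    rows = []
    for record in csv.DictReader(handle):
        dialect = record["dialect"].strip()
        if dialect not in DIALECT_LABELS:
            raise ValueError(f"unexpected dialect label: {dialect!r}")
        rows.append({
            "tweet": record["tweet"],
            "sarcasm": parse_sarcasm(record["sarcasm"]),
            "sentiment": record["sentiment"].strip(),
            "dialect": dialect,
        })
    return rows


def label_counts(rows):
    """Count how often each label occurs for the three annotated fields."""
    return {
        field: Counter(row[field] for row in rows)
        for field in ("sarcasm", "sentiment", "dialect")
    }


# Made-up placeholder rows, NOT taken from the dataset.
SAMPLE_CSV = """tweet,sarcasm,sentiment,dialect
placeholder tweet one,True,negative,egypt
placeholder tweet two,False,neutral,msa
placeholder tweet three,False,positive,gulf
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = label_counts(rows)
print(counts["dialect"])
```

To read one of the real files instead of the sample, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `load_rows`; the path depends on where you saved the CSV. If the actual files spell the sarcasm values differently, `parse_sarcasm` would need adjusting.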

Citation

Please use the following citation if you use ArSarcasm-v2:

@inproceedings{abufarha-etal-2021-arsarcasm-v2,
    title = "Overview of the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic",
    author = "Abu Farha, Ibrahim  and
    Zaghouani, Wajdi  and
    Magdy, Walid",
    booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
    month = apr,
    year = "2021",
    }

Other resources

If you are interested in other Arabic NLP resources check:
