Dataset: Reddit Crypto Sentiment

Bradley Webb
Jun 14, 2022
The GameStop saga, fueled by Discord memes and chatter on /r/WallStreetBets, showed the power of social media sentiment. So how do you know whether to buy the crypto dip or HODL?

We built a free dataset of Reddit crypto comments, labeled by sentiment, to help your machine learning models find out.

The sentiment analysis dataset

The dataset contains 1,000 Reddit comments about crypto, each labeled as Positive or Negative sentiment.

Download it here!
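Once you have the file, a quick first step is to check how the two labels are balanced. Here's a minimal sketch of that, assuming the download is a CSV with a `comment` column and a `sentiment` column. Those column names are our guess for illustration, so check the header of the actual file and adjust. The sample rows are made-up placeholders, not comments from the dataset.

```python
import csv
import io
from collections import Counter

# Assumed column names; check the header of the downloaded file and adjust.
TEXT_COLUMN = "comment"
LABEL_COLUMN = "sentiment"


def load_rows(handle):
    """Read (text, label) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return [(row[TEXT_COLUMN], row[LABEL_COLUMN]) for row in reader]


def label_counts(rows):
    """Count how many comments carry each sentiment label."""
    return Counter(label for _, label in rows)


# Made-up placeholder rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "comment,sentiment\n"
    "placeholder bullish comment,Positive\n"
    "placeholder bearish comment,Negative\n"
    "another placeholder bullish comment,Positive\n"
)
rows = load_rows(sample)
counts = label_counts(rows)
print(counts)
```

To run this on the real file, replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")`, where `path` points at wherever you saved the download.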

Examples

Here are some example Reddit comments from the dataset.

Positive Sentiment
Negative Sentiment

How we labeled it

Data Labeling Workforce

Labeling crypto sentiment can be surprisingly tricky. To do a good job, you need to understand the community and its jargon: what are diamond hands, tendies, and DCA? Does HODLing mean you’re a bull or a bear?

Unless you’re familiar with the crypto community, it’s difficult to label these! That’s why having data labelers with the right skills is essential to creating quality datasets.

For this project, we built a team of Surgers who are both interested in cryptocurrency and heavy Reddit users, and who've worked on our other financial categorization and social media data labeling projects.

Interface

Here's a peek at our labeling UI. Our platform makes it fast to create new labeling jobs, whether through our API or our WYSIWYG editor.

More Surge AI Datasets

Want to build a custom financial dataset? Sign up and create a new labeling project in seconds, or reach out to us for help at hello@surgehq.ai!

Check out our other free datasets:

Bradley Webb

Bradley runs Surge AI's Product and Growth teams. He previously led Integrity and Data Operations teams at Facebook, and graduated from Dartmouth.
