Reddit Flair Predictor

Project hosted on: Heroku

Machine Learning Model

Technologies Used:

Frontend - HTML/CSS
Backend - Flask
Database - None; the data is loaded directly from a CSV file with the pandas library
APIs - Reddit API, accessed through PRAW (the Python Reddit API Wrapper)

Overall Approach towards the problem

Data Acquisition - I collected the data from the subreddit r/India using the Reddit API through PRAW. I collected almost 200 posts for each flair, including AskIndia, Non-Political, [R]eddiquette, Scheduled, Photography, Science/Technology, Politics, Business/Finance, Policy/Economy, Sports, Food and AMA. In total this gave approximately 2200 posts, which I visualized using graphs, word clouds, etc. (attached below).
Flair Detection - I am not yet fluent in machine learning, but I built models with several algorithms: Naive Bayes, SVM, logistic regression, random forest and an MLP classifier. I split the data into 70% training and 30% testing. Random forest gave the best accuracy, 78%, using the combination of URL, comments and title of each post as input, so it is the model used in the application. (References attached.)
Web Application - Built with Flask as the backend and HTML/CSS as the frontend; all the screenshots are attached below. Unfortunately I have not been able to push the CSS file to Heroku due to a memory shortage (working on this), but the application itself works fine.
Results - Reported the results using graphs and visualizations.
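
The flair detection step above can be sketched roughly as follows. This is a minimal illustration, not the code in model.py: the field names (`title`, `comments`, `url`), the helper names, the TF-IDF text features and the hyperparameters are all assumptions. Only the 70/30 split, the random forest classifier and the title + comments + URL combination come from the description above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def combine_fields(post):
    """Join title, comments and URL of one post into a single text feature.

    The dictionary keys are hypothetical; the real CSV columns may differ.
    """
    parts = (post.get("title"), post.get("comments"), post.get("url"))
    return " ".join(part for part in parts if part)


def train_flair_model(posts, flairs, seed=42):
    """Train a random forest on a 70/30 split and return (model, test accuracy)."""
    texts = [combine_fields(post) for post in posts]

    # 70% training, 30% testing; stratify so every flair appears in both parts.
    x_train, x_test, y_train, y_test = train_test_split(
        texts, flairs, test_size=0.3, stratify=flairs, random_state=seed
    )

    # TF-IDF is an assumed text representation; the README does not name one.
    model = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])
    model.fit(x_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(x_test))
    return model, accuracy


def make_toy_posts():
    """Build a tiny made-up dataset (two flairs, ten posts each) for a dry run."""
    posts, flairs = [], []
    for i in range(10):
        posts.append({"title": f"cricket match result {i}",
                      "comments": "great game and innings",
                      "url": "https://example.com/sports"})
        flairs.append("Sports")
        posts.append({"title": f"biryani recipe idea {i}",
                      "comments": "tasty dish and spices",
                      "url": "https://example.com/food"})
        flairs.append("Food")
    return posts, flairs


if __name__ == "__main__":
    toy_posts, toy_flairs = make_toy_posts()
    toy_model, toy_accuracy = train_flair_model(toy_posts, toy_flairs)
    print(f"Test accuracy on toy data: {toy_accuracy:.2f}")
```

With the real data, each row of the CSV loaded through pandas would play the role of one `post` dictionary, and the flair column would supply the labels. The toy data here exists only to show the call pattern and says nothing about the 78% figure reported above.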

Project Screenshots on LocalHost

Starting Screen

Predicted Flair

About Subreddit

Data Analysis with collected data

WordCloud

Title Length

Comment Length

Number of Upvotes/Downvotes

Distribution of Score

Correlation Heatmap

Libraries/Dependencies

beautifulsoup
Flask
scikit-learn (imported as sklearn)
nltk
etc. (listed in requirements.txt)

References

ManthanKeim/Reddit-Flair-Predictor

Folders and files

Latest commit

History

Repository files navigation

Reddit Flair Predictor

Technologies Used:

Overall Approach towards the problem

Project Screenshots on LocalHost

Data Analysis with collected data

Libraries/Dependencies

Refrences

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages