Cosmos-DB Post Processing
Author: Joseph Kready
Generates sentiment and toxicity scores for the YouTube, Blog, and Twitter databases on COSMOS-DB. Any databases added in the future should use this repo as well.
Info
Location:
- COSMOS-Crawler, 144.167.35.49
- C:\COSMOS\PostProcessing
Dashboard: http://cosmos-starmap.host.ualr.edu/d/6CO4j–Gz/sentiment-and-toxicity?orgId=1&refresh=10s
Runs daily
Uses the database account ‘post_processing’
Design
This script uses Python async processing to speed up reading from and updating the database.
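As a rough illustration of why async processing helps here, the sketch below overlaps many simulated database reads and updates instead of waiting on each one in turn. It is not the repo's code: the functions fetch_text and save_score, the placeholder score, and the max_concurrent limit are all hypothetical stand-ins for whatever main.py actually does.

```python
import asyncio


# Hypothetical stand-in for a database read; the real query is not shown on this page.
async def fetch_text(record_id: int) -> str:
    await asyncio.sleep(0.01)  # simulates waiting on the database
    return f"text for record {record_id}"


# Hypothetical stand-in for a database update.
async def save_score(record_id: int, score: float, store: dict) -> None:
    await asyncio.sleep(0.01)  # simulates waiting on the database
    store[record_id] = score


async def process_record(record_id: int, store: dict, limit: asyncio.Semaphore) -> None:
    # The semaphore caps how many records are in flight at once.
    async with limit:
        text = await fetch_text(record_id)
        score = float(len(text))  # placeholder for real sentiment/toxicity scoring
        await save_score(record_id, score, store)


async def process_all(record_ids, max_concurrent: int = 5) -> dict:
    store = {}
    limit = asyncio.Semaphore(max_concurrent)
    # gather() runs all the coroutines concurrently, so waits on I/O overlap.
    await asyncio.gather(*(process_record(r, store, limit) for r in record_ids))
    return store


if __name__ == "__main__":
    print(asyncio.run(process_all(range(10))))
```

With ten records and a limit of five, the simulated waits overlap in two batches rather than running twenty waits back to back, which is the kind of gain async I/O is meant to provide.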
Sentiment Analysis: https://textblob.readthedocs.io/en/dev/quickstart.html
Toxicity Analysis: https://github.com/unitaryai/detoxify (multilingual model)
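A minimal sketch of how the two libraries linked above can be combined to score one piece of text. The function names and the output keys "sentiment" and "toxicity" are assumptions for illustration, not the fields this repo writes to the database. The libraries are imported lazily, so the helper can be exercised with stand-in scorers when TextBlob or Detoxify is not installed.

```python
from typing import Callable, Dict

_detoxify_model = None  # cached so the model is loaded only once


def textblob_polarity(text: str) -> float:
    """Return TextBlob's sentiment polarity, a float in [-1.0, 1.0]."""
    from textblob import TextBlob  # requires: pip install textblob
    return TextBlob(text).sentiment.polarity


def detoxify_toxicity(text: str) -> float:
    """Return the 'toxicity' score from Detoxify's multilingual model."""
    global _detoxify_model
    from detoxify import Detoxify  # requires: pip install detoxify
    if _detoxify_model is None:
        # Downloads the model weights on first use.
        _detoxify_model = Detoxify("multilingual")
    return float(_detoxify_model.predict(text)["toxicity"])


def score_text(
    text: str,
    sentiment_fn: Callable[[str], float] = textblob_polarity,
    toxicity_fn: Callable[[str], float] = detoxify_toxicity,
) -> Dict[str, float]:
    """Score one text; the key names here are hypothetical."""
    return {"sentiment": sentiment_fn(text), "toxicity": toxicity_fn(text)}
```

Calling score_text("some post text") with the defaults uses both libraries; passing custom sentiment_fn and toxicity_fn makes it possible to swap in another scorer for a future database without changing the caller.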
Setup
- Have Python 3.7 or greater installed
- Install the dependencies listed in requirements.txt (pip install -r requirements.txt)
- Run main.py
- You can also run the ‘Tox-Sent.bat’ file on Windows. It is primarily used by Task Scheduler to run this task daily.
- There is also one command-line argument: -p (bool). If True, the progress bars are printed in a pretty format. Leave it False to have the progress bars saved to the log file.
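For reference, a boolean flag like -p could be parsed as in the sketch below. This is an assumption about how such a flag might be handled, not a copy of main.py: the helper str_to_bool, the attribute name pretty, and the accepted spellings of True/False are all hypothetical. A converter is used because argparse's type=bool would treat any non-empty string, including "False", as True.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert common spellings of true/false into a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Sentiment and toxicity post-processing (illustrative sketch)."
    )
    parser.add_argument(
        "-p",
        dest="pretty",  # hypothetical attribute name
        type=str_to_bool,
        default=False,
        help="True: print pretty progress bars; False: save progress to the log file.",
    )
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-p", "True"])
print(args.pretty)
```

Under this sketch, omitting -p leaves pretty as False, which matches the log-file behavior wanted for the scheduled daily run.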