FAQ
Q: What issues are you trying to address?
We are attempting to address media bias.
There are many types of media bias.
Here we try to address the following ones:
- Gatekeeping bias: the selection or exclusion of stories on ideological grounds. We show this using a "Missing Topic" section on our Graphs page.
- Statement bias: a verbal slant against a particular subject being covered. We show this bias using sentiment analysis, which can be seen on our Graphs page and News page.
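As a rough illustration of the gatekeeping idea above, the sketch below lists, for each source, the topics that other sources covered but it did not. It assumes topics have already been extracted per source; the function name `missing_topics`, the source names, and the topics are all hypothetical and are not taken from our actual pipeline.

```python
from typing import Dict, Set


def missing_topics(topics_by_source: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each source, return the topics covered elsewhere but absent from it."""
    # Union of every topic seen in any source.
    all_topics: Set[str] = set().union(*topics_by_source.values())
    # A source's gap is everything in the union that it did not cover itself.
    return {source: all_topics - covered for source, covered in topics_by_source.items()}


# Hypothetical example data: these sources and topics are made up.
coverage = {
    "source_a": {"economy", "election", "climate"},
    "source_b": {"economy", "election"},
    "source_c": {"election", "climate", "health"},
}
gaps = missing_topics(coverage)
for source, topics in sorted(gaps.items()):
    print(source, sorted(topics))
```

A topic that appears in every source never shows up as missing, so this only highlights differences between the chosen sources.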
Future work will cover Coverage bias by displaying where a story lands on the website (e.g. the main story at the top, or a lower-coverage story at the bottom of the page).
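To make the statement bias measurement described above more concrete, here is a minimal sketch of sentiment scoring averaged per source. It is only an assumption about how such a score could be computed: the tiny word lists, the function names, and the example articles are invented for illustration, and a real analysis would rely on a trained model or a published lexicon.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy lexicon, purely illustrative (hypothetical word lists).
POSITIVE = {"good", "great", "success", "praised"}
NEGATIVE = {"bad", "failure", "crisis", "criticized"}


def sentiment_score(text: str) -> float:
    """Return (positive - negative) / matched words, in [-1, 1]; 0.0 if nothing matches."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def mean_sentiment_by_source(articles: List[Tuple[str, str]]) -> Dict[str, float]:
    """Average the per-article scores for each source."""
    scores = defaultdict(list)
    for source, text in articles:
        scores[source].append(sentiment_score(text))
    return {source: sum(vals) / len(vals) for source, vals in scores.items()}


# Hypothetical (source, text) pairs about the same subject.
articles = [
    ("source_a", "The policy was a great success"),
    ("source_a", "Critics called the rollout a failure"),
    ("source_b", "The policy was criticized as a bad failure"),
]
result = mean_sentiment_by_source(articles)
print(result)
```

Comparing the averages across sources for the same subject is what would hint at a slant; a single score on its own says little.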
Q: How is processing done?
Our current processing steps in AWS are as follows:
- News sources are scraped using NodeJS, Puppeteer, and Kafka messaging
- Data analysis is carried out using Spark and Python for natural language processing, cosine similarity, and topic extraction.
- All data are stored in MongoDB
- The frontend presentation uses ReactJS, Bootstrap, and Plotly
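Of the analysis steps listed above, cosine similarity is the easiest to show in a few lines. The sketch below compares two texts by their word counts: the score is the dot product of the two count vectors divided by the product of their lengths. It uses plain Python rather than Spark, and the bag-of-words representation and the helper names `term_counts` and `cosine_similarity` are assumptions made for illustration, not a description of our production code.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens (a simple bag of words)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    # Only terms present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical headlines.
h1 = term_counts("Senate passes budget bill")
h2 = term_counts("Senate rejects budget bill")
h3 = term_counts("Local team wins championship")
print(cosine_similarity(h1, h2))
print(cosine_similarity(h1, h3))
```

Headlines that share most of their words score close to 1, while headlines with no words in common score 0, which is what makes the measure useful for grouping stories about the same event.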
Q: How are sources and content selected?
Sources are picked to represent a range of left- and right-wing political views.
The sources are chosen by us.
All content is scraped from the sources, going 2 layers deep into each website. The scrapers extract a title and text, which are sent to the analysis step.
There is no filtering of content, and all sources are treated the same way (in weighting, combining, etc.).
There is a selection effect because we choose only certain sources.
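The "2 layers deep" scraping described above can be pictured as a breadth-first walk that stops following links after two hops from the front page. The sketch below works under that assumption, which is one possible reading of "layers". It runs over an in-memory map of links instead of a real browser, so the `crawl` function, the `links` map, and the paths are hypothetical stand-ins for what the NodeJS and Puppeteer scrapers do.

```python
from collections import deque
from typing import Dict, List, Set


def crawl(start: str, links: Dict[str, List[str]], max_depth: int = 2) -> List[str]:
    """Visit pages breadth-first from `start`, following links at most `max_depth` hops."""
    seen: Set[str] = {start}
    order: List[str] = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)  # a real scraper would extract the title and text here
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order


# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/politics", "/world"],
    "/politics": ["/politics/story-1"],
    "/world": ["/world/story-2", "/"],
    "/politics/story-1": ["/politics/story-1/comments"],
}
visited = crawl("/", site)
print(visited)
```

The `seen` set keeps pages that link back to each other from being visited twice, and anything more than two hops from the front page (here the comments page) is never reached.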