FAQ

Q: What issues are you trying to address?

We are attempting to address media bias. There are many types of media bias; here we try to address the following:

Gatekeeping bias: the selection or exclusion of stories on ideological grounds. We show this in the "Missing Topic" section on our Graphs page.
Statement bias: a verbal slant against a particular subject being covered. We show this bias using sentiment analysis, which can be seen on our Graphs page and News page.
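
As a rough illustration of the two biases above, the Python sketch below flags topics that some sources cover and others skip, and scores a text against a tiny word list. The source names, topic sets, and word lists are hypothetical, and the toy scorer is only a stand-in for the sentiment analysis the site actually runs.

```python
# Hypothetical data: topics extracted per source (not real results).
topics_by_source = {
    "source_a": {"economy", "health", "election"},
    "source_b": {"economy", "election"},
    "source_c": {"economy", "health", "sports"},
}

def missing_topics(topics_by_source):
    """For each source, return the topics covered elsewhere but not by it."""
    all_topics = set().union(*topics_by_source.values())
    return {src: all_topics - topics for src, topics in topics_by_source.items()}

# Toy lexicon, assumed for illustration only.
POSITIVE = {"good", "great", "success"}
NEGATIVE = {"bad", "failure", "crisis"}

def toy_sentiment(text):
    """Return (positive hits - negative hits) / word count, in [-1, 1]."""
    words = text.lower().split()  # naive split; punctuation is not stripped
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / len(words)

missing = missing_topics(topics_by_source)
```

Here `missing["source_b"]` holds the topics that the other sources covered and `source_b` did not, which is the idea behind a "Missing Topic" view.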

Future work will include coverage bias, which will show where a story lands on the website (e.g., the main story at the top, or a less prominent story at the bottom of the page).

Q: How is processing done?

Our current processing steps in AWS are as follows:

News sources are scraped using NodeJS, Puppeteer, and Kafka messaging
Data analysis is carried out using Spark and Python for natural language processing, cosine similarities, and topic extraction
All data are stored in MongoDB
The frontend presentation uses ReactJS, Bootstrap, and Plotly
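
For readers unfamiliar with cosine similarity, here is a minimal plain-Python sketch of the measure. It assumes raw word-count vectors and whitespace tokenization. The pipeline described above runs in Spark and may build its vectors differently, so treat this as an explanation of the formula and not as the site's code.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw word-count vectors for two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)
```

Two articles that use the same words in the same proportions score close to 1, and two articles with no words in common score 0.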

Q: How are sources and content selected?

Sources are picked to represent a range of left- and right-wing political views, and they are chosen by us. All content is scraped from each source, going two layers deep into the website. The scrapers extract a title and text, which are sent to the analysis stage. There is no filtering of content, and all sources are treated the same (weighting, combining, etc.). There is a selection effect because we chose only certain sources.
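
To make "two layers deep" concrete, the sketch below walks a small, made-up link graph breadth-first and stops following links after two hops from the front page. It is written in Python for illustration only; the real scrapers use NodeJS and Puppeteer, and both the `LINKS` data and the reading of "layer" as "link hop from the start page" are assumptions.

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
LINKS = {
    "home": ["politics", "world"],
    "politics": ["article-1", "article-2"],
    "world": ["article-3"],
    "article-1": ["related-1"],
}

def crawl(start, get_links, max_depth=2):
    """Breadth-first walk that visits pages up to max_depth links from start."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

pages = crawl("home", lambda p: LINKS.get(p, []))
```

With this data the walk reaches the three articles but never visits `related-1`, which sits three hops from the front page.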