Code and data

Anomaly detection

The code for our algorithm MADAN, published in the paper Gutiérrez-Gómez, L., Bovet, A. & Delvenne, J.-C. Multi-scale Anomaly Detection on Attributed Networks. Proceedings of the AAAI Conference on Artificial Intelligence. Vol 34 No 01: AAAI-20 Technical Tracks 1 (2020), is available here.

IPython notebooks for reproducing the toy example and the case study of the paper are also provided.

Twitter Analysis

The tools hydrator, twarc and tweepy can be used to “rehydrate” the tweet IDs, i.e. to download the full tweet objects from their IDs.
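As a rough illustration of what rehydration involves, the sketch below splits a list of tweet IDs into batches and passes each batch to a lookup function. It is not the code used in our papers. The names `batched`, `rehydrate` and `fake_fetch` are hypothetical, and the default batch size of 100 is an assumption based on the limit of the Twitter tweet lookup endpoints. In practice, `fetch` would wrap a call to one of the tools above (for instance, something like tweepy's `Client.get_tweets`); here an offline stand-in is used so that the example runs without credentials.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def batched(ids: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` non-empty tweet IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for tweet_id in ids:
        tweet_id = str(tweet_id).strip()
        if not tweet_id:
            continue  # skip blank lines, e.g. from an ID file
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # last, possibly shorter, batch


def rehydrate(
    ids: Iterable[str],
    fetch: Callable[[List[str]], List[Dict]],
    size: int = 100,
) -> List[Dict]:
    """Call `fetch` once per batch and collect the returned tweet objects.

    `fetch` is a placeholder for a real API call. It may return fewer
    tweets than requested, since deleted or protected tweets cannot be
    downloaded again.
    """
    tweets: List[Dict] = []
    for batch in batched(ids, size):
        tweets.extend(fetch(batch))
    return tweets


def fake_fetch(batch: List[str]) -> List[Dict]:
    """Offline stand-in for an API lookup (hypothetical, for illustration)."""
    return [{"id": tweet_id, "text": "..."} for tweet_id in batch]


# Example: 250 IDs are looked up in three requests of 100, 100 and 50 IDs.
example_ids = [str(i) for i in range(250)]
example_tweets = rehydrate(example_ids, fake_fetch)
```

Keeping the API call behind a single `fetch` argument makes it easy to swap one tool for another, or to add rate-limit handling, without touching the batching logic.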

See also Hernan Makse’s website.

Fake news influence

This dataset contains the retweet networks and the IDs of the tweets that contain a URL pointing to a news outlet website of the corresponding media category.

It is used in our paper: Bovet, A. & Makse, H. A. Influence of fake news in Twitter during the 2016 US presidential election. Nat. Commun. 10, 7 (2019).

See the README file and the dataset.

The code used for the analysis of this dataset is available here.

The curated list of websites spreading fake and extremely biased news used in our paper is available here.

Opinion mining

This dataset contains the tweet IDs of 170 million tweets from 11 million users posting about the election between June 1st 2016 and November 9th 2016.

It is used in our paper: Bovet, A., Morone, F. & Makse, H. A. Validation of Twitter opinion trends with national polling aggregates: Hillary Clinton vs Donald Trump. Sci. Rep. 8, 8673 (2018).

See the README file and the dataset.

The code of the algorithm that we developed is available here, and the following IPython notebook explains how to use it.

Twitter social network and sentiment analysis

You will find here the IPython notebooks for the lecture I gave at The Graduate Center of the City University of New York in 2017.

The notebooks also cover the basics of using tweepy to connect to the Twitter API and collect tweets.