detect bots on social media
data/russian-troll-tweets init 2024-01-12 09:27:09 -05:00
docs init 2024-01-12 09:27:09 -05:00
.gitignore init 2024-01-12 09:27:09 -05:00
ira_eda.ipynb init 2024-01-12 09:27:09 -05:00
README.md init 2024-01-12 09:27:09 -05:00
Untitled.ipynb init 2024-01-12 09:27:09 -05:00

Better Data:

https://www.kaggle.com/datasets/bwandowando/ukraine-russian-crisis-twitter-dataset-1-2-m-rows

Data:

Russian Troll Tweets

Great stuff in here for targets: https://github.com/fivethirtyeight/russian-troll-tweets/

Dictionary:

| Header | Definition |
| --- | --- |
| `external_author_id` | An author account ID from Twitter |
| `author` | The handle sending the tweet |
| `content` | The text of the tweet |
| `region` | A region classification, as determined by Social Studio |
| `language` | The language of the tweet |
| `publish_date` | The date and time the tweet was sent |
| `harvested_date` | The date and time the tweet was collected by Social Studio |
| `following` | The number of accounts the handle was following at the time of the tweet |
| `followers` | The number of followers the handle had at the time of the tweet |
| `updates` | The number of “update actions” on the account that authored the tweet, including tweets, retweets and likes |
| `post_type` | Indicates if the tweet was a retweet or a quote-tweet |
| `account_type` | Specific account theme, as coded by Linvill and Warren |
| `retweet` | A binary indicator of whether or not the tweet is a retweet |
| `account_category` | General account theme, as coded by Linvill and Warren |
| `new_june_2018` | A binary indicator of whether the handle was newly listed in June 2018 |
| `alt_external_id` | Reconstruction of author account ID from Twitter, derived from `article_url` variable and the first list provided to Congress |
| `tweet_id` | Unique ID assigned by Twitter to each status update, derived from `article_url` |
| `article_url` | Link to original tweet. Now redirects to "Account Suspended" page |
| `tco1_step1` | First redirect for the first http(s)://t.co/ link in a tweet, if it exists |
| `tco2_step1` | First redirect for the second http(s)://t.co/ link in a tweet, if it exists |
| `tco3_step1` | First redirect for the third http(s)://t.co/ link in a tweet, if it exists |
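
The dictionary above implies a type for each column, so a loader can convert values as it reads them. Below is a minimal sketch using only the standard library. The timestamp layout in `DATE_FORMAT`, the `"1"`/`"0"` encoding of the binary columns, and the helper names `parse_row` and `load_tweets` are assumptions made for illustration; check them against the actual CSV files before relying on this. The sample row is made up and is not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout (month/day/year hour:minute); verify against the real files.
DATE_FORMAT = "%m/%d/%Y %H:%M"

INT_COLUMNS = ("following", "followers", "updates")
BOOL_COLUMNS = ("retweet", "new_june_2018")  # assumed to be stored as "1" / "0"
DATE_COLUMNS = ("publish_date", "harvested_date")


def parse_row(row):
    """Return a copy of one CSV row with values converted per the data dictionary."""
    typed = dict(row)
    for col in INT_COLUMNS:
        typed[col] = int(row[col]) if row.get(col) else None
    for col in BOOL_COLUMNS:
        typed[col] = row.get(col) == "1"
    for col in DATE_COLUMNS:
        typed[col] = datetime.strptime(row[col], DATE_FORMAT) if row.get(col) else None
    return typed


def load_tweets(handle):
    """Read every row from an open CSV file handle and convert its types."""
    return [parse_row(r) for r in csv.DictReader(handle)]


# Made-up sample row, for illustration only.
SAMPLE_CSV = (
    "author,content,publish_date,harvested_date,"
    "following,followers,updates,retweet,new_june_2018\n"
    "EXAMPLE_HANDLE,hello world,10/1/2017 19:58,10/1/2017 19:59,10,20,30,1,0\n"
)

tweets = load_tweets(io.StringIO(SAMPLE_CSV))
print(tweets[0]["publish_date"], tweets[0]["followers"], tweets[0]["retweet"])
```

To read a real file, pass an open handle instead of the in-memory sample, for example `load_tweets(open(path, newline="", encoding="utf-8"))`, where `path` points at one of the CSVs under `data/russian-troll-tweets`.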

Now what?

Precise:

Lazy:

  • Grab any other collection of tweets
  • Check for duplicates
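
One possible reading of the lazy approach is to look for tweets in the other collection whose text also appears in the troll tweets. The sketch below assumes that reading. The normalization rules (lowercasing, dropping URLs, collapsing whitespace) and the function names `normalize` and `find_duplicates` are illustrative choices, not something the notes specify, and the example strings are made up.

```python
import re


def normalize(text):
    """Lowercase, drop URLs, and collapse whitespace so near-identical tweets compare equal."""
    text = re.sub(r"https?://\S+", "", text.lower())
    return " ".join(text.split())


def find_duplicates(troll_tweets, other_tweets):
    """Return the tweets in other_tweets whose normalized text also occurs in troll_tweets."""
    seen = {normalize(t) for t in troll_tweets}
    seen.discard("")  # link-only tweets normalize to "" and would match each other spuriously
    return [t for t in other_tweets if normalize(t) in seen]


# Made-up example strings, for illustration only.
troll = ["Breaking news!  https://t.co/abc", "Hello World"]
other = ["breaking news! https://t.co/xyz", "something unrelated", "hello world"]

print(find_duplicates(troll, other))
```

Matching on normalized text rather than on `tweet_id` is a deliberate choice here: it also catches the same message posted from different accounts, at the cost of flagging short, common phrases.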