Requirements
This application uses OAuth 1 for user authentication. To obtain a consumer key and a consumer key secret, register the application on the Twitter Developer Platform at https://developer.twitter.com.
Usage
Help
gowipetweet h
gowipetweet -h
gowipetweet help
gowipetweet --help
Common parameters
- -c / --config - path to the configuration file. See config.example.yaml. The default value is config.yaml, so the configuration file is read from the current directory.
A typical workflow
- Register a client Twitter app at https://developer.twitter.com, obtain the consumer API keys, and save them to config.yaml. Note: this is NOT needed if you only want to generate a list of tweets to delete without actually deleting them; Twitter credentials are needed only for API calls.
- Generate and download a Twitter archive
- Convert the data/tweet.js file from the archive to JSON Lines format
- Generate a list of tweets to delete from the JSON Lines file by filtering tweets
- Delete tweets using the list file
Twitter provides dumps as JavaScript files, which are inappropriate for analysis and filtering of records. This command converts the original JavaScript file into JSON Lines format.
gowipetweet tweets:dump:to_jsonl \
-i /home/john/somefolder/twitter_dump/data/tweet.js \
-o /home/john/somefolder/twitter_dump_processed/tweets.js
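To illustrate what such a conversion involves, here is a minimal Go sketch. It is not the tool's actual implementation. It assumes the archive file is a single JavaScript assignment of the form `window.YTD.tweet.part0 = [ ... ]` whose right-hand side is a valid JSON array with nothing after it; the function name `toJSONLines` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// toJSONLines converts the text of an archive JavaScript file into JSON Lines.
// Assumption: the input is one assignment such as
// `window.YTD.tweet.part0 = [ {...}, {...} ]`, with a JSON array on the right.
func toJSONLines(js string) (string, error) {
	// Everything after the first "=" is expected to be the JSON array.
	eq := strings.Index(js, "=")
	if eq < 0 {
		return "", fmt.Errorf("no assignment found in input")
	}
	var records []json.RawMessage
	if err := json.Unmarshal([]byte(js[eq+1:]), &records); err != nil {
		return "", fmt.Errorf("decode array: %w", err)
	}
	var out bytes.Buffer
	for _, rec := range records {
		// Compact strips insignificant whitespace so each record fits on one line.
		if err := json.Compact(&out, rec); err != nil {
			return "", fmt.Errorf("compact record: %w", err)
		}
		out.WriteByte('\n')
	}
	return out.String(), nil
}

func main() {
	js := "window.YTD.tweet.part0 = [\n  {\"tweet\": {\"id\": \"1\"}},\n  {\"tweet\": {\"id\": \"2\"}}\n]"
	lines, err := toJSONLines(js)
	if err != nil {
		panic(err)
	}
	fmt.Print(lines)
}
```

Each record ends up on its own line, so later steps can process the file line by line without loading the whole array. Whether the real command keeps or unwraps the outer `tweet` object is not covered by this sketch.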
gowipetweet tweets:to_delete_list:from_jsonl \
-i /home/john/somefolder/twitter_dump/tweets.json \
-o /home/john/somefolder/twitter_dump/todo_delete.txt \
-e "created_time >= '2022-01-01 00:00:00' && created_time < '2022-02-01 00:00:00'"
Parameters
- -i / --input-file - path to the CSV file that lists the IDs of tweets to delete. Each line contains a single column, the tweet ID. There is no CSV header. Note: it was probably a bad idea to mention the CSV format in this command, because records are not actually comma-separated :D This name will probably be changed later.
gowipetweet tweets:delete:using_csv \
-c $PWD/.local/config.yaml \
-i /home/john/somefolder/tweets_to_delete.csv
Filtering expressions
The govaluate library is used to parse expressions. The following tweet properties are defined for each record:
- created_time
- favorite_count
- full_text
- id
- retweet_count
Examples of expressions
favorite_count + (retweet_count * 20) > 100
created_time >= '2022-01-01 00:00:00' && created_time < '2022-02-01 00:00:00'