tndx
Twitter Indexer & Archiver
Summary
tndx fetches user details, timelines, followers, friends, and favorites from Twitter, then stores them in an AWS S3 bucket via a Kinesis Delivery Stream for querying with Athena/Trino. tndx also extracts "entities" (media) URLs from tweets, fetches the media, and processes it with AWS Rekognition, storing the media in S3 and the resulting media metadata in DynamoDB.
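As a rough illustration of the media-extraction step, the sketch below pulls media URLs out of a single tweet. It assumes the tweet is a dict shaped like a Twitter API v1.1 payload (`entities` / `extended_entities` with a `media` list); the function name `extract_media_urls` is hypothetical and not taken from the tndx code.

```python
from __future__ import annotations

from typing import Any


def extract_media_urls(tweet: dict[str, Any]) -> list[str]:
    """Return the unique media URLs found in a tweet, in first-seen order."""
    urls: list[str] = []
    # Assumption: v1.1-style payloads, where extended_entities lists every
    # attached item and entities may list only the first one.
    for section in ("extended_entities", "entities"):
        for item in (tweet.get(section) or {}).get("media", []):
            url = item.get("media_url_https") or item.get("media_url")
            if url and url not in urls:
                urls.append(url)
    return urls
```

Each returned URL could then be downloaded, passed to Rekognition, and written to S3, with the analysis result stored in DynamoDB.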
AWS Services
An AWS account and a local .aws credentials file must be set up to run tndx.
SSM Parameter Store
Configuration data including Twitter API keys and S3 bucket details are stored in the AWS SSM Parameter Store.
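One way to read such configuration is to fetch every parameter under a common path. The sketch below is a minimal example, not the project's actual loader: the `/tndx` prefix and the parameter names are assumptions, and `load_config` is a hypothetical helper. It expects a boto3 SSM client (or any object with the same `get_parameters_by_path` method).

```python
from __future__ import annotations

from typing import Any


def load_config(ssm_client: Any, prefix: str) -> dict[str, str]:
    """Collect all parameters under `prefix` into a dict keyed by relative name."""
    config: dict[str, str] = {}
    kwargs: dict[str, Any] = {
        "Path": prefix,
        "Recursive": True,
        "WithDecryption": True,  # needed to read SecureString values such as API keys
    }
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for param in page["Parameters"]:
            # Strip the shared prefix so "/tndx/s3/bucket" becomes "s3/bucket".
            name = param["Name"][len(prefix):].lstrip("/")
            config[name] = param["Value"]
        token = page.get("NextToken")
        if not token:
            return config
        kwargs["NextToken"] = token  # results are paginated
```

With boto3 installed and credentials configured, this could be called as `load_config(boto3.client("ssm"), "/tndx")`.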
S3
ORC data files are stored in the S3 bucket specified in the SSM Parameter Store. The files are laid out so that AWS Glue can "crawl" the data and Athena can query the resulting tables.
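The exact key layout is not described here. A common choice that Glue crawlers recognize is Hive-style `key=value` partition folders, and the sketch below builds an object key that way. The table name, the year/month/day partition scheme, and the helper `partitioned_key` are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def partitioned_key(table: str, when: datetime, filename: str) -> str:
    """Build an S3 object key with Hive-style date partitions (assumed layout)."""
    day = when.astimezone(timezone.utc)  # partition on the UTC date
    return f"{table}/year={day:%Y}/month={day:%m}/day={day:%d}/{filename}"
```

With keys of this shape, a crawler can typically register `year`, `month`, and `day` as partition columns, which lets Athena queries skip data outside the requested dates.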
A Twitter project and application must be configured for this project. API/Consumer keys are stored in the AWS SSM Parameter Store. tndx uses Twitter's OAuth 2.0 services.
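For reference, the sketch below builds the request for Twitter's OAuth 2.0 application-only (client credentials) flow, which exchanges the consumer key and secret for a bearer token. Whether tndx uses this particular flow is an assumption, and `build_token_request` is a hypothetical helper; the request is only constructed here, not sent.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.twitter.com/oauth2/token"


def build_token_request(consumer_key: str, consumer_secret: str) -> urllib.request.Request:
    """Build (but do not send) an app-only bearer token request."""
    # The key and secret are URL-encoded, joined with ":", then Base64-encoded.
    key = urllib.parse.quote(consumer_key, safe="")
    secret = urllib.parse.quote(consumer_secret, safe="")
    basic = base64.b64encode(f"{key}:{secret}".encode("ascii")).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=b"grant_type=client_credentials",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` should return a JSON body whose `access_token` field is then sent as `Authorization: Bearer <token>` on later API calls.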