Diego Cabello

Twitter Tools

Date: 31 Mar 2025

Words: 439

Draft: 1 (Most recent)

I am building a suite of tools to automate Twitter functions outside the paid API.

Bookmarks Scraper (July 2024) Github

I wanted to download and index all my bookmarked images and posts from Twitter, but doing that through the official Twitter API cost $100/mo. So, I built a cost-effective workaround.

The Method

  1. Scraping the data
    • log into Twitter in the browser, go to the page you want to scrape, and locate one of the GET requests to https://x.com/i/graphql/$PAGE in the browser's network tab
    • copy the cookies and request headers and paste them as arguments for a curl command
    • run the command from Python using the subprocess library inside a while loop
    • write the JSON responses to a text file for later parsing
    • extract the bottom cursor from the last response and then use that as an argument for the next iteration
    • this runs about 90 times, returning 20 posts each, until Twitter rate-limits you and blocks further requests
  2. Analyzing the data
    • parse each response for all the information about the posts, their authors, and media content
    • store the parsed data in an SQLite database
    • download the images or videos using the python requests library
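The scraping loop in step 1 can be sketched roughly like this. The query URL, headers, and the shape of the cursor node are assumptions standing in for whatever you copy out of the browser's network tab:

```python
import json
import subprocess
import time
from urllib.parse import quote

# Hypothetical placeholders -- the real query ID, bearer token, cookies,
# and CSRF token all come from the request copied out of the browser.
GRAPHQL_URL = "https://x.com/i/graphql/QUERY_ID/Bookmarks"
HEADERS = [
    "-H", "authorization: Bearer <token copied from the browser>",
    "-H", "cookie: <cookie string copied from the browser>",
    "-H", "x-csrf-token: <csrf token copied from the browser>",
]

def fetch_page(cursor=None):
    """Replay the copied GET request with curl and return the parsed JSON."""
    variables = {"count": 20}
    if cursor:
        variables["cursor"] = cursor
    url = f"{GRAPHQL_URL}?variables={quote(json.dumps(variables))}"
    result = subprocess.run(
        ["curl", "-s", url, *HEADERS],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def find_bottom_cursor(obj):
    """Walk the response for a node that looks like {"cursorType": "Bottom",
    "value": ...}. The exact nesting varies, so search the whole tree."""
    if isinstance(obj, dict):
        if obj.get("cursorType") == "Bottom":
            return obj.get("value")
        for v in obj.values():
            found = find_bottom_cursor(v)
            if found:
                return found
    elif isinstance(obj, list):
        for item in obj:
            found = find_bottom_cursor(item)
            if found:
                return found
    return None

def scrape_all(out_path="responses.jsonl"):
    """Page through the endpoint, appending each raw response to a file,
    until no bottom cursor comes back (or the endpoint blocks you)."""
    cursor = None
    with open(out_path, "a") as out:
        while True:
            page = fetch_page(cursor)
            out.write(json.dumps(page) + "\n")
            cursor = find_bottom_cursor(page)
            if cursor is None:
                break
            time.sleep(2)  # gentle pacing; the endpoint rate-limits hard
```

Writing the raw responses to disk before parsing anything means a mid-run block doesn't cost you the pages you already fetched.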
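The analysis step (step 2) can be sketched as below. Since the GraphQL response nesting varies by page, this walks the whole JSON tree for anything tweet-shaped; the "full_text"/"id_str" heuristic and the table schema are assumptions, and the stdlib stands in for the requests library to keep the sketch self-contained:

```python
import json
import sqlite3
import urllib.request

def iter_tweets(obj):
    """Yield dicts that look like tweet payloads. Assumes a tweet node
    carries 'full_text' and 'id_str' fields; the nesting varies, so this
    recursively walks the entire response."""
    if isinstance(obj, dict):
        if "full_text" in obj and "id_str" in obj:
            yield obj
        for v in obj.values():
            yield from iter_tweets(v)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_tweets(item)

def build_db(jsonl_path="responses.jsonl", db_path="bookmarks.db"):
    """Parse each saved response and load the tweets into SQLite."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS posts (id TEXT PRIMARY KEY, text TEXT, raw TEXT)"
    )
    with open(jsonl_path) as f:
        for line in f:
            for tweet in iter_tweets(json.loads(line)):
                con.execute(
                    "INSERT OR IGNORE INTO posts VALUES (?, ?, ?)",
                    (tweet["id_str"], tweet["full_text"], json.dumps(tweet)),
                )
    con.commit()
    con.close()

def download_media(url, dest):
    """Fetch one image/video URL to disk (urllib stand-in for requests)."""
    urllib.request.urlretrieve(url, dest)
```

Storing the raw tweet JSON alongside the extracted columns means the database can be re-parsed later without re-scraping.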

Areas for Expansion and Improvement

This scraper is an ongoing project with potential research-level scalability (as the paid API effectively prices out a lot of researchers), and there are several directions it could still grow in.

Previous Iterations

  1. The initial attempt used Selenium to automate scrolling and BeautifulSoup to parse the resulting HTML. This was slow, as all the images had to load each time; painful, as parsing HTML with BeautifulSoup sucks; and fragile, as it would break whenever Twitter changed their site structure.
  2. The next attempt used tshark (terminal Wireshark) to analyze the network traffic. Decrypting HTTPS (port 443) traffic with the session cookies was a pain, and I could only get it working on Windows. curl handled all of this for me, so I switched to that.

Mass Unfollow Dashboard (Oct 2024)

I was following too many accounts on my facename Twitter account, so I made a dashboard to go through them all quickly and unfollow them.

  1. Get all the accounts I follow with curl using this script
  2. Put them all into an HTML document in table format using this script. The table shows relevant information about each account and has checkbox columns for unfollowing and for sorting into categories, and it exports those selections to JSON.
  3. This script uses Selenium to walk through the JSON, add the selected accounts to lists, and unfollow them.
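Step 2's table generation might look something like this sketch; the account field names and the checkbox class names are hypothetical stand-ins (the export-to-JSON JavaScript is omitted):

```python
import html

def render_dashboard(accounts):
    """Render a minimal HTML table of followed accounts with checkbox
    columns for 'unfollow' and 'categorize'. The 'screen_name'/'name'
    field names are assumptions about the shape of the curl output."""
    rows = []
    for acct in accounts:
        handle = html.escape(acct["screen_name"])
        name = html.escape(acct.get("name", ""))
        rows.append(
            f"<tr><td>@{handle}</td><td>{name}</td>"
            f'<td><input type="checkbox" class="unfollow" data-handle="{handle}"></td>'
            f'<td><input type="checkbox" class="categorize" data-handle="{handle}"></td></tr>'
        )
    return (
        "<table>\n"
        "<tr><th>Handle</th><th>Name</th><th>Unfollow</th><th>Category</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>"
    )
```

Tagging each checkbox with a data-handle attribute is what lets a small export script later map the checked boxes back to account handles in the JSON.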

Made with Sculblog