[https://github.com/imperatrona/twitter-scraper] Scrape the Twitter frontend API without authentication with Golang.
Veetaha 9c3764f484 Parse GIFs for in the GetTweet API
I am writing an app that needs to get info about all media in a tweet and forward it to a Telegram chat.

Today animated GIFs are ignored in the response of TweetDetail, although they are there (except for the caveat mentioned below). So without this change the GIFs are not present in the twitterscraper.Tweet struct.

Following the analogy with the split between Photos and Videos I added GIFs to the Tweet type.

There is one caveat that I found during testing and can't really explain: GIFs don't occur in the response unless bearerToken2 is set. I don't know what this token means; maybe it somehow identifies a desktop-browser variant of the Twitter frontend, but with this token the GIFs are present in the response.

Please note that I never wrote Go code before in my life. I am using this library via the FFI to link it to my Rust codebase.
2023-06-18 19:35:34 +02:00
.github/workflows Skip test with authentication 2023-05-21 01:29:08 +03:00
.gitignore add scrap tweets for any search query feature 2020-05-14 14:59:33 +02:00
api.go Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
api_test.go Separate test package 2021-12-07 10:18:01 +02:00
auth.go Support 2FA 2023-06-12 18:23:26 +03:00
auth_test.go Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
go.mod Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
go.sum Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
LICENSE Add MIT license 2020-02-11 14:40:05 +02:00
profile.go improve request 2023-06-01 23:05:37 +03:00
profile_test.go Fix TestGetProfilePrivate 2023-01-10 13:02:35 +02:00
README.md Support 2FA 2023-06-12 18:23:26 +03:00
scraper.go Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
search.go Use GraphQL API with timeline v2 2023-06-01 23:20:11 +03:00
search_test.go Add LoginOpenAccount 2023-05-30 17:31:00 +03:00
timeline_v1.go Use GraphQL API with timeline v2 2023-06-01 23:20:11 +03:00
timeline_v2.go Parse GIFs for in the GetTweet API 2023-06-18 19:35:34 +02:00
trends.go Use GraphQL API with timeline v2 2023-06-01 23:20:11 +03:00
trends_test.go Fix trends 2023-05-30 17:14:38 +03:00
tweets.go Parse GIFs for in the GetTweet API 2023-06-18 19:35:34 +02:00
tweets_test.go Parse GIFs for in the GetTweet API 2023-06-18 19:35:34 +02:00
types.go Parse GIFs for in the GetTweet API 2023-06-18 19:35:34 +02:00
util.go Remove skip pinned tweet 2023-06-04 13:12:21 +03:00

Twitter Scraper


Twitter's API is annoying to work with and has lots of limitations. Luckily, their frontend (JavaScript) has its own API, which I reverse-engineered. No API rate limits. No tokens needed. No restrictions. Extremely fast.

You can use this library to get the text of any user's Tweets trivially.

Installation

go get -u github.com/n0madic/twitter-scraper

Usage

Get user tweets

package main

import (
    "context"
    "fmt"
    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New()

    for tweet := range scraper.GetTweets(context.Background(), "Twitter", 50) {
        if tweet.Error != nil {
            panic(tweet.Error)
        }
        fmt.Println(tweet.Text)
    }
}

The last argument is the maximum number of tweets to fetch (50 in this example). The timeline itself appears to be limited to about 3200 tweets.
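The loop above reads from a channel whose items carry either a tweet or an error. The following is a minimal sketch of that consumption pattern, using only the standard library and a stand-in producer. The names `result`, `fakeTweets`, and `firstN` are hypothetical and not part of twitter-scraper, and the sketch assumes the producer stops when the context is cancelled, which the source does not state for the real library.

```go
package main

import (
	"context"
	"fmt"
)

// result is a hypothetical stand-in for the library's per-tweet item:
// either Text is set or Error is non-nil.
type result struct {
	Text  string
	Error error
}

// fakeTweets is a hypothetical producer. It emits up to max results and
// returns as soon as ctx is cancelled.
func fakeTweets(ctx context.Context, max int) <-chan result {
	ch := make(chan result)
	go func() {
		defer close(ch)
		for i := 1; i <= max; i++ {
			select {
			case ch <- result{Text: fmt.Sprintf("tweet %d", i)}:
			case <-ctx.Done():
				return
			}
		}
	}()
	return ch
}

// firstN collects up to n texts (n must be positive) and then cancels the
// context so the producer goroutine can exit instead of blocking forever.
func firstN(n, max int) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var texts []string
	for r := range fakeTweets(ctx, max) {
		if r.Error != nil {
			return texts, r.Error
		}
		texts = append(texts, r.Text)
		if len(texts) == n {
			break
		}
	}
	return texts, nil
}

func main() {
	texts, err := firstN(3, 50)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(texts)
}
```

Returning the error instead of calling panic lets the caller decide whether to retry or give up.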

Get single tweet

package main

import (
    "fmt"

    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New()
    tweet, err := scraper.GetTweet("1328684389388185600")
    if err != nil {
        panic(err)
    }
    fmt.Println(tweet.Text)
}

Search tweets by query standard operators

Search now works only for authenticated users!

Tweets containing “twitter” and “scraper” and “data”, filtering out retweets:

package main

import (
    "context"
    "fmt"
    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New()
    err := scraper.LoginOpenAccount()
    if err != nil {
        panic(err)
    }
    for tweet := range scraper.SearchTweets(context.Background(),
        "twitter scraper data -filter:retweets", 50) {
        if tweet.Error != nil {
            panic(tweet.Error)
        }
        fmt.Println(tweet.Text)
    }
}

The search ends once 50 tweets have been collected.

See Rules and filtering for how to build standard queries.
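Query strings like the one above can be assembled programmatically. The sketch below is one way to do it with the standard library only. `buildQuery` is a hypothetical helper, not part of twitter-scraper, and it covers just two standard operators: quoted exact phrases and `-filter:retweets`.

```go
package main

import (
	"fmt"
	"strings"
)

// buildQuery joins the given terms with spaces and optionally appends the
// "-filter:retweets" operator. Hypothetical helper for illustration only.
func buildQuery(terms []string, excludeRetweets bool) string {
	parts := make([]string, 0, len(terms)+1)
	for _, t := range terms {
		t = strings.TrimSpace(t)
		if t == "" {
			continue // skip empty terms
		}
		// Quote multi-word phrases so they are matched exactly.
		if strings.ContainsAny(t, " \t") {
			t = `"` + t + `"`
		}
		parts = append(parts, t)
	}
	if excludeRetweets {
		parts = append(parts, "-filter:retweets")
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(buildQuery([]string{"twitter", "scraper", "data"}, true))
}
```

The resulting string would be passed as the query argument of scraper.SearchTweets.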

Set search mode

scraper.SetSearchMode(twitterscraper.SearchLatest)

Options:

  • twitterscraper.SearchTop - default mode
  • twitterscraper.SearchLatest - live mode
  • twitterscraper.SearchPhotos - image mode
  • twitterscraper.SearchVideos - video mode
  • twitterscraper.SearchUsers - user mode

Get profile

package main

import (
    "fmt"
    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New()
    profile, err := scraper.GetProfile("Twitter")
    if err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", profile)
}

Search profiles by query

package main

import (
    "context"
    "fmt"
    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New().SetSearchMode(twitterscraper.SearchUsers)
    // replace the placeholders with real credentials
    err := scraper.Login("username", "password")
    if err != nil {
        panic(err)
    }
    for profile := range scraper.SearchProfiles(context.Background(), "Twitter", 50) {
        if profile.Error != nil {
            panic(profile.Error)
        }
        fmt.Println(profile.Name)
    }
}

Get trends

package main

import (
    "fmt"
    twitterscraper "github.com/n0madic/twitter-scraper"
)

func main() {
    scraper := twitterscraper.New()
    trends, err := scraper.GetTrends()
    if err != nil {
        panic(err)
    }
    fmt.Println(trends)
}

Use authentication

Some users' tweets are protected: you must log in and follow those users to read them. Logging in is also required for search.

Login

err := scraper.Login("username", "password")

Use your username to log in, not your email! If your account requires email confirmation, pass the email address as well:

err := scraper.Login("username", "password", "email")

If you have two-factor authentication enabled, pass the code:

err := scraper.Login("username", "password", "code")

Status of login can be checked with:

scraper.IsLoggedIn()

Logout (clear session):

scraper.Logout()

If you want to save the session between restarts, you can save cookies with scraper.GetCookies() and restore them with scraper.SetCookies().

For example, save cookies:

cookies := scraper.GetCookies()
// serialize to JSON
js, _ := json.Marshal(cookies)
// save to file
f, _ := os.Create("cookies.json")
defer f.Close()
f.Write(js)

and load cookies:

f, _ := os.Open("cookies.json")
// deserialize from JSON
var cookies []*http.Cookie
json.NewDecoder(f).Decode(&cookies)
// load cookies
scraper.SetCookies(cookies)
// check login status
scraper.IsLoggedIn()
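The snippets above discard every error. Below is a minimal sketch of the same save and load round trip with the errors checked, using only the standard library. `saveCookies`, `loadCookies`, and `demo` are hypothetical helper names, and the cookie in `demo` is made up. In real use the slice would come from scraper.GetCookies() and go back into scraper.SetCookies().

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"path/filepath"
)

// saveCookies writes cookies to path as JSON.
func saveCookies(path string, cookies []*http.Cookie) error {
	data, err := json.Marshal(cookies)
	if err != nil {
		return err
	}
	// 0600 because the file holds session credentials.
	return os.WriteFile(path, data, 0o600)
}

// loadCookies reads cookies previously written by saveCookies.
func loadCookies(path string) ([]*http.Cookie, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cookies []*http.Cookie
	if err := json.Unmarshal(data, &cookies); err != nil {
		return nil, err
	}
	return cookies, nil
}

// demo round-trips one made-up cookie through a temporary file and returns
// it as "name=value".
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cookies")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "cookies.json")
	in := []*http.Cookie{{Name: "session", Value: "abc123"}}
	if err := saveCookies(path, in); err != nil {
		return "", err
	}
	out, err := loadCookies(path)
	if err != nil {
		return "", err
	}
	if len(out) != 1 {
		return "", fmt.Errorf("expected 1 cookie, got %d", len(out))
	}
	return out[0].Name + "=" + out[0].Value, nil
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

If loadCookies fails because the file is missing or corrupt, falling back to a fresh login is a reasonable default.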

Open account

If you don't want to use your own account, you can log in as a Twitter app:

err := scraper.LoginOpenAccount()

Use Proxy

HTTP(S) and SOCKS5 proxies are supported.

with HTTP

err := scraper.SetProxy("http://localhost:3128")
if err != nil {
    panic(err)
}

with SOCKS5

err := scraper.SetProxy("socks5://localhost:1080")
if err != nil {
    panic(err)
}

Delay requests

Add a delay between API requests (in seconds):

scraper.WithDelay(5)

Load timeline with tweet replies

scraper.WithReplies(true)