# Twitter Scraper
[Go Reference](https://pkg.go.dev/github.com/imperatrona/twitter-scraper) [Go build status](https://github.com/imperatrona/twitter-scraper/actions/workflows/go.yml)

Twitter’s API is pricey and has lots of limitations. But their frontend has its own API, which was reverse-engineered by [@n0madic](https://github.com/n0madic) and is maintained by [@imperatrona](https://github.com/imperatrona). Some endpoints require authentication, but it is easy to scale by buying new accounts and proxies.

You can use this library to get tweets, profiles, and trends trivially.
<details>
<summary><h2>Table of Contents</h2></summary>

- [Installation](#installation)
- [Quick start](#quick-start)
- [Rate limits](#rate-limits)
- [Authentication](#authentication)
  - [Using cookies](#using-cookies)
  - [Using AuthToken](#using-authtoken)
  - [OpenAccount](#openaccount)
  - [Login & Password](#login--password)
  - [Check if login](#check-if-login)
  - [Log out](#log-out)
- [Methods](#methods)
  - [Get tweet](#get-tweet)
  - [Get user tweets](#get-user-tweets)
  - [Get user medias](#get-user-medias)
  - [Get bookmarks](#get-bookmarks)
  - [Search tweets](#search-tweets)
    - [Search params](#search-params)
  - [Get profile](#get-profile)
  - [Search profile](#search-profile)
  - [Get trends](#get-trends)
  - [Get following](#get-following)
  - [Get followers](#get-followers)
  - [Get scheduled tweets](#get-scheduled-tweets)
  - [Create scheduled tweet](#create-scheduled-tweet)
  - [Delete scheduled tweet](#delete-scheduled-tweet)
  - [Upload media](#upload-media)
- [Connection](#connection)
  - [Proxy](#proxy)
    - [HTTP(s)](#https)
    - [SOCKS5](#socks5)
  - [Delay](#delay)
  - [Load timeline with tweet replies](#load-timeline-with-tweet-replies)
- [Contributing](#contributing)
  - [Testing](#testing)

</details>

## Installation
```shell
go get -u github.com/imperatrona/twitter-scraper
```
## Quick start
```golang
package main

import (
	"context"
	"fmt"

	twitterscraper "github.com/imperatrona/twitter-scraper"
)

func main() {
	scraper := twitterscraper.New()
	// SetAuthToken takes an AuthToken struct (see "Using AuthToken").
	scraper.SetAuthToken(twitterscraper.AuthToken{Token: "auth_token", CSRFToken: "ct0"})

	// After setting cookies or an AuthToken you have to call IsLoggedIn.
	// Without it, the scraper can't make requests that require authentication.
	if !scraper.IsLoggedIn() {
		panic("Invalid AuthToken")
	}

	for tweet := range scraper.GetTweets(context.Background(), "x", 50) {
		if tweet.Error != nil {
			panic(tweet.Error)
		}
		fmt.Println(tweet.Text)
	}
}
```
## Rate limits

The API has a global limit on requests per second: don’t make more than one request every 1.5 seconds from a single account. Each endpoint also has its own limit; for most of them it is 150 requests per 15 minutes.

Twitter does not appear to limit the number of accounts used from one IP address, though this could change at any time. As of February 2024, I have been running 20 accounts per IP address for several months without a ban.

`OpenAccount` used to be a good option, but Twitter has since restricted it. Open accounts are allowed 180 requests instead of 150, but only one account can be created per month per IP address. If you use `OpenAccount`, save the credentials and reuse them later with the `WithOpenAccount` method.
## Authentication
Most endpoints require authentication. The preferred way is to use `SetCookies`. You can also use `SetAuthToken`, but `POST` endpoints will not work with it. Logging in with a password may require email confirmation and is a frequent cause of account bans.

Endpoints that work without authentication will not return sensitive content. To get sensitive content, authenticate with any available method, including `OpenAccount`.
### Using cookies
```golang
// Deserialize cookies from JSON (needs "encoding/json", "net/http" and "os").
var cookies []*http.Cookie
f, err := os.Open("cookies.json")
if err != nil {
	panic(err)
}
defer f.Close()
if err := json.NewDecoder(f).Decode(&cookies); err != nil {
	panic(err)
}

scraper.SetCookies(cookies)
if !scraper.IsLoggedIn() {
	panic("Invalid cookies")
}
```
To save cookies from an authorized client to a file, use `GetCookies`:
```golang
cookies := scraper.GetCookies()
data, err := json.Marshal(cookies)
if err != nil {
	panic(err)
}
if err := os.WriteFile("cookies.json", data, 0o600); err != nil {
	panic(err)
}
```
### Using AuthToken
The `SetAuthToken` method simply sets the required cookies `auth_token` and `ct0`.
```golang
scraper.SetAuthToken(twitterscraper.AuthToken{Token: "auth_token", CSRFToken: "ct0"})
if !scraper.IsLoggedIn() {
	panic("Invalid AuthToken")
}
```
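
Because `SetAuthToken` only sets two cookies, the same pair can also be written in the JSON format used in the Using cookies section. The sketch below builds such a cookie list with the standard library only. The `authCookies` helper and the `.twitter.com` domain are assumptions for illustration, not values taken from this library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// authCookies builds the two cookies named in this section.
// The domain is an assumption; adjust it to match your session.
func authCookies(authToken, ct0 string) []*http.Cookie {
	return []*http.Cookie{
		{Name: "auth_token", Value: authToken, Domain: ".twitter.com", Path: "/"},
		{Name: "ct0", Value: ct0, Domain: ".twitter.com", Path: "/"},
	}
}

func main() {
	data, err := json.Marshal(authCookies("auth_token", "ct0"))
	if err != nil {
		panic(err)
	}

	// Decode again to confirm the JSON round-trips into []*http.Cookie.
	var decoded []*http.Cookie
	if err := json.Unmarshal(data, &decoded); err != nil {
		panic(err)
	}
	for _, c := range decoded {
		fmt.Printf("%s=%s\n", c.Name, c.Value)
	}
}
```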
### OpenAccount
> [!WARNING]
> Deprecated. Restricted by Twitter; doesn't support new endpoints.

`LoginOpenAccount` is now limited to one new account per month per IP address.
```golang
account, err := scraper.LoginOpenAccount()
```
You should save `OpenAccount` returned by `LoginOpenAccount` to reuse it later.
```golang
scraper.WithOpenAccount(twitterscraper.OpenAccount{
	OAuthToken:       "TOKEN",
	OAuthTokenSecret: "TOKEN_SECRET",
})
```
### Login & Password
To log in, use your username, not your email address:
```golang
err := scraper.Login("username", "password")
```
If your account requires email confirmation, pass your email address as an additional argument:
```golang
err := scraper.Login("username", "password", "email")
```
If you have two-factor authentication enabled, pass the code:
```golang
err := scraper.Login("username", "password", "code")
```
### Check if login
You can check the login status with the `IsLoggedIn` method:
```golang
scraper.IsLoggedIn()
```
### Log out
```golang
scraper.Logout()
```
## Methods
### Get tweet
150 requests / 15 minutes

The `TweetDetail` endpoint requires authentication, so the `TweetResultByRestId` endpoint is used instead when no authentication is provided. That endpoint doesn't return `InReplyToStatus` or `Thread` tweets.
```golang
tweet, err := scraper.GetTweet("1328684389388185600")
```
### Get user tweets
150 requests / 15 minutes

`GetTweets` returns a channel with the specified number of user tweets. It uses the `FetchTweets` method under the hood.
```golang
for tweet := range scraper.GetTweets(context.Background(), "taylorswift13", 50) {
	if tweet.Error != nil {
		panic(tweet.Error)
	}
	fmt.Println(tweet.Text)
}
```
`FetchTweets` returns tweets and a cursor for fetching the next page. Each request returns up to 20 tweets.
```golang
var cursor string
tweets, cursor, err := scraper.FetchTweets("taylorswift13", 20, cursor)
```
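
The cursor returned by `FetchTweets` is fed back into the next call to walk through pages. The sketch below shows that loop with only the standard library: `fetchPage` is a hypothetical stand-in for a call such as `scraper.FetchTweets(user, 20, cursor)`, and stopping on an empty page or an empty cursor is an assumption about how the end of a timeline is signalled.

```go
package main

import "fmt"

// fetchPage is a hypothetical stand-in for a paged call such as
// scraper.FetchTweets(user, 20, cursor): it returns one page of items
// and the cursor for the next page.
type fetchPage func(cursor string) (items []string, next string, err error)

// collect follows cursors until limit items are gathered, a page comes
// back empty, or no further cursor is returned (assumed end-of-data signals).
func collect(fetch fetchPage, limit int) ([]string, error) {
	var all []string
	cursor := ""
	for len(all) < limit {
		items, next, err := fetch(cursor)
		if err != nil {
			return all, err
		}
		all = append(all, items...)
		if len(items) == 0 || next == "" || next == cursor {
			break
		}
		cursor = next
	}
	if len(all) > limit {
		all = all[:limit]
	}
	return all, nil
}

// fakePages serves three fixed pages so the loop can run without network access.
func fakePages(cursor string) ([]string, string, error) {
	pages := map[string]struct {
		items []string
		next  string
	}{
		"":   {[]string{"t1", "t2"}, "c1"},
		"c1": {[]string{"t3", "t4"}, "c2"},
		"c2": {[]string{"t5"}, ""},
	}
	p := pages[cursor]
	return p.items, p.next, nil
}

func main() {
	tweets, err := collect(fakePages, 4)
	if err != nil {
		panic(err)
	}
	fmt.Println(tweets)
}
```

In real use, a pause between iterations keeps the loop within the rate limits described earlier.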
### Get user medias

500 requests / 15 minutes

`GetMediaTweets` returns a channel with the specified number of user tweets that contain media. It uses the `FetchMediaTweets` method under the hood.
```golang
for tweet := range scraper.GetMediaTweets(context.Background(), "taylorswift13", 50) {
	if tweet.Error != nil {
		panic(tweet.Error)
	}
	fmt.Println(tweet.Text)
}
```
`FetchMediaTweets` returns tweets and a cursor for fetching the next page. Each request returns up to 20 tweets.
```golang
var cursor string
tweets, cursor, err := scraper.FetchMediaTweets("taylorswift13", 20, cursor)
```
### Get bookmarks
> [!IMPORTANT]
> Requires authentication!

500 requests / 15 minutes

`GetBookmarks` returns a channel with the specified number of bookmarked tweets. It uses the `FetchBookmarks` method under the hood.
```golang
for tweet := range scraper.GetBookmarks(context.Background(), 50) {
	if tweet.Error != nil {
		panic(tweet.Error)
	}
	fmt.Println(tweet.Text)
}
```
`FetchBookmarks` returns bookmarked tweets and a cursor for fetching the next page. Each request returns up to 20 tweets.
```golang
var cursor string
tweets, cursor, err := scraper.FetchBookmarks(20, cursor)
```
### Search tweets

> [!IMPORTANT]
> Requires authentication!

150 requests / 15 minutes

`SearchTweets` returns a channel with the specified number of tweets matching the query. It uses the `FetchSearchTweets` method under the hood.
```golang
for tweet := range scraper.SearchTweets(context.Background(),
	"twitter scraper data -filter:retweets", 50) {
	if tweet.Error != nil {
		panic(tweet.Error)
	}
	fmt.Println(tweet.Text)
}
```
`FetchSearchTweets` returns tweets and a cursor for fetching the next page. Each request returns up to 20 tweets.
```golang
tweets, cursor, err := scraper.FetchSearchTweets("taylorswift13", 20, cursor)
```
By default, search returns top tweets. You can change this by setting the search mode before making requests. Supported modes are `SearchTop`, `SearchLatest`, `SearchPhotos`, `SearchVideos`, and `SearchUsers`.
```golang
scraper.SetSearchMode(twitterscraper.SearchLatest)
```
#### Search params
See [Rules and filtering](https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators) for how to build standard queries.
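
Search queries are plain strings that combine keywords with standard operators. As a small sketch, the hypothetical `buildQuery` helper below (not part of this library) joins keywords with a few well-known operators such as `from:`, `since:`, and `-filter:retweets`. The resulting string is what would be passed to `SearchTweets`.

```go
package main

import (
	"fmt"
	"strings"
)

// query describes a search; every field is optional.
type query struct {
	Keywords       []string
	From           string // author's username, without "@"
	Since          string // date in YYYY-MM-DD form
	ExcludeRetweet bool
}

// buildQuery renders q as a single search string using standard operators.
func buildQuery(q query) string {
	parts := append([]string{}, q.Keywords...)
	if q.From != "" {
		parts = append(parts, "from:"+q.From)
	}
	if q.Since != "" {
		parts = append(parts, "since:"+q.Since)
	}
	if q.ExcludeRetweet {
		parts = append(parts, "-filter:retweets")
	}
	return strings.Join(parts, " ")
}

func main() {
	q := buildQuery(query{
		Keywords:       []string{"twitter", "scraper"},
		From:           "x",
		Since:          "2024-01-01",
		ExcludeRetweet: true,
	})
	fmt.Println(q)
}
```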
### Get profile
95 requests / 15 minutes
```golang
profile, err := scraper.GetProfile("taylorswift13")
```
### Search profile

> [!IMPORTANT]
> Requires authentication!

150 requests / 15 minutes

`SearchProfiles` returns a channel with the specified number of profiles matching the query. It uses the `FetchSearchProfiles` method under the hood.
```golang
for profile := range scraper.SearchProfiles(context.Background(), "Twitter", 50) {
	if profile.Error != nil {
		panic(profile.Error)
	}
	fmt.Println(profile.Name)
}
```
`FetchSearchProfiles` returns profiles and a cursor for fetching the next page. Each request returns up to 20 profiles.
```golang
profiles, cursor, err := scraper.FetchSearchProfiles("taylorswift13", 20, cursor)
```
### Get trends
```golang
trends, err := scraper.GetTrends()
```
### Get following
> [!IMPORTANT]
> Requires authentication!

500 requests / 15 minutes
```golang
var cursor string
users, cursor, err := scraper.FetchFollowing("Support", 20, cursor)
```
### Get followers
> [!IMPORTANT]
> Requires authentication!

50 requests / 15 minutes
```golang
var cursor string
users, cursor, err := scraper.FetchFollowers("Support", 20, cursor)
```
### Get scheduled tweets
> [!IMPORTANT]
> Requires authentication!

500 requests / 15 minutes
```golang
tweets, err := scraper.FetchScheduledTweets()
```
### Create scheduled tweet
> [!IMPORTANT]
> Requires authentication!

500 requests / 15 minutes
```golang
tweets, err := scraper.CreateScheduledTweet(twitterscraper.TweetSchedule{
	Text:   "New scheduled tweet text",
	Date:   time.Now().Add(time.Hour * 24 * 31),
	Medias: nil,
})
```
### Delete scheduled tweet
> [!IMPORTANT]
> Requires authentication!

500 requests / 15 minutes
```golang
err := scraper.DeleteScheduledTweet("123")
```
### Upload media
> [!IMPORTANT]
> Requires authentication!

50 requests / 15 minutes

Uploads a photo, video, or GIF for later posting or scheduling. The upload expires after 24 hours if not used.
```golang
media, err := scraper.UploadMedia("./files/movie.mp4")
```
## Connection
### Proxy
#### HTTP(s)
```golang
err := scraper.SetProxy("http://localhost:3128")
```
#### SOCKS5
```golang
err := scraper.SetProxy("socks5://localhost:1080")
```
SOCKS5 proxies support authentication:
```golang
err := scraper.SetProxy("socks5://user:pass@localhost:1080")
```
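
`SetProxy` takes the proxy as a URL string, so a malformed value (for example, one with stray whitespace) is easy to miss. The sketch below checks a proxy string with the standard library before it is handed over. `checkProxy` is a hypothetical helper, and the accepted schemes are assumed from the examples in this section.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// checkProxy parses a proxy address and reports its scheme, host, and
// whether credentials are embedded. It is a hypothetical helper.
func checkProxy(raw string) (scheme, host string, hasAuth bool, err error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", "", false, err
	}
	switch u.Scheme {
	case "http", "https", "socks5":
	default:
		return "", "", false, fmt.Errorf("unsupported proxy scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return "", "", false, fmt.Errorf("proxy %q has no host", raw)
	}
	return u.Scheme, u.Host, u.User != nil, nil
}

func main() {
	for _, p := range []string{
		"http://localhost:3128",
		"socks5://user:pass@localhost:1080",
		"ftp://localhost:21",
	} {
		scheme, host, auth, err := checkProxy(p)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s proxy at %s (credentials: %t)\n", scheme, host, auth)
	}
}
```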
### Delay
Add a delay between API requests (in seconds):
```golang
scraper.WithDelay(5)
```
### Load timeline with tweet replies
```golang
scraper.WithReplies(true)
```
## Contributing
### Testing
To run some of the tests, you need to provide some form of authentication via environment variables. All supported variables are listed in the `.vscode/settings.json` file. You can also set them in that file so VS Code picks them up automatically; just make sure you don’t commit them with your contribution.