Geoproxies documentation

The simpler way to proxy through the web

Geoproxies is a smart web proxy service compatible with virtually all HTTP clients. Geoproxies offers large pools of IP addresses that your programs can pick from via a powerful and easy-to-use API.

The best features of Geoproxies are:

  • a large pool of Residential IP addresses
  • a pool of inexpensive Datacenter IP addresses
  • a universal API that any proxy-enabled web client can use
  • IPs selectable by country
  • persistent sessions
  • traffic recording
  • per-session traffic accounting
  • adblocking
  • IP address banning

Try Geoproxies on https://www.geoproxies.com/.

Getting Started

Geoproxies is very easy to use. The Geoproxies API is designed to work with any proxy-enabled web client. To control Geoproxies (for example, to pick a country-specific IP address or enable session recording), all you need to do is pass a specific string as your proxy password.

Without any parameters, using Geoproxies is as simple as this curl command:

curl --proxy "http://24f2f5b3-1867-42ac-994c-57140022fb46@proxy.geoproxies.com:1080" \
http://httpbin.org/anything

Sign up for a free account

To use Geoproxies, you need to sign up at https://www.geoproxies.com. Geoproxies is a paid service that offers large, high-quality pools of IP addresses. We also offer free access to a number of URLs so you can develop and test your programs without paying. All examples in this tutorial can be followed with a free account.

Free target URLs include:

You can authenticate using one of two methods: HTTP basic authentication and an IP address whitelist.

Authenticating with the proxy server

1. HTTP Basic Authentication

With HTTP basic authentication, your web client sends credentials to the proxy as part of HTTP requests. This is the method that we recommend because it enables more features in Geoproxies than IP address-based authentication does.

When you use HTTP basic authentication with Geoproxies, your credentials consist only of a long token that you pass as the username. The password field is reserved for passing parameters to the proxy server, which we explain below. First, you need to retrieve your token.

You need to be logged in to your Geoproxies account on https://www.geoproxies.com, and navigate to the Proxy access section of the page. You will be shown a view similar to the one below. This view contains your proxy authentication and API tokens.

Account tokens

Suppose you got the authentication token 24f2f5b3-1867-42ac-994c-57140022fb46. A curl request proxied through Geoproxies would look like this:

curl --proxy "http://24f2f5b3-1867-42ac-994c-57140022fb46@proxy.geoproxies.com:1080" \
http://httpbin.org/anything

Since curl lets you configure a proxy in URL format, we pass the credentials as part of the URL, just before the @ character, with username and password separated by a colon. As you can see, we need to pass the token in the username field to gain access.

This leaves the password field, which lets us give special instructions to Geoproxies. For example, if you want Geoproxies to record the traffic between the proxy and the target server for debugging purposes, you should pass the wiretap parameter.

curl --proxy "http://24f2f5b3-1867-42ac-994c-57140022fb46:wiretap@proxy.geoproxies.com:1080" \
http://httpbin.org/anything

You can also pass multiple parameters. This command makes a request that goes through a German IP address:

curl --proxy "http://24f2f5b3-1867-42ac-994c-57140022fb46:wiretap|country=de@proxy.geoproxies.com:1080" \
http://httpbin.org/anything
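The same proxy URL can be assembled programmatically. The following is a minimal Python sketch, not part of any Geoproxies library: the helper name build_proxy_url and its argument layout are assumptions made for illustration. It places the token in the username field and joins the parameters with the pipe character in the password field.

```python
def build_proxy_url(token, params=None, host="proxy.geoproxies.com", port=1080):
    """Return a proxy URL with the token as username and parameters as password.

    `params` is a list of (name, value) pairs. A value of None marks a toggle
    parameter such as wiretap, which is written without "=value".
    (Hypothetical helper, written for this example only.)
    """
    parts = []
    for name, value in params or []:
        parts.append(name if value is None else "%s=%s" % (name, value))
    password = "|".join(parts)
    # Without parameters, only the token is sent (no colon, no password).
    credentials = token if not password else "%s:%s" % (token, password)
    return "http://%s@%s:%d" % (credentials, host, port)


TOKEN = "24f2f5b3-1867-42ac-994c-57140022fb46"  # example token from this page
url = build_proxy_url(TOKEN, [("wiretap", None), ("country", "de")])
print(url)
```

The resulting string can be handed to any client that accepts a proxy URL, for example the --proxy option of curl.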

Proxy credentials are passed to the proxy server in the Proxy-Authorization header. The value of this header is a string of the form username:password, which is base64-encoded and then prefixed with the string "Basic ". For example, a request that passes the parameters country=fr|wiretap=1 could be written as

curl --proxy "http://proxy.geoproxies.com:1080" \
-H "Proxy-Authorization: Basic MjRmMmY1YjMtMTg2Ny00MmFjLTk5NGMtNTcxNDAwMjJmYjQ2OmNvdW50cnk9ZnJ8d2lyZXRhcD0x" \
http://httpbin.org/anything
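The encoding step can be reproduced in a few lines of Python. This sketch uses only the standard library; the function name proxy_authorization is an assumption made for illustration. It rebuilds the header value used in the curl command above, which decodes to the example token followed by the parameter string country=fr|wiretap=1.

```python
import base64


def proxy_authorization(token, parameters=""):
    """Build a Proxy-Authorization header value for HTTP basic authentication.

    (Hypothetical helper name; the encoding itself is standard basic auth.)
    """
    credentials = "%s:%s" % (token, parameters)
    encoded = base64.b64encode(credentials.encode("ascii")).decode("ascii")
    return "Basic " + encoded


header = proxy_authorization(
    "24f2f5b3-1867-42ac-994c-57140022fb46",  # example token from this page
    "country=fr|wiretap=1",
)
print(header)
```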

Note

Geoproxies strips the Proxy-Authorization and Proxy-Connection headers before forwarding your requests to target servers. They are never transmitted to the sites you access. Everything else is left as-is.

2. IP address whitelist

In some cases it’s inconvenient to pass credentials to a proxy server. For those cases, Geoproxies lets you define a list of IP addresses that are allowed to use your Geoproxies account.

To authorize an IP address, log in to https://www.geoproxies.com and navigate to the Proxy control section. Geoproxies allows a maximum of 10 IP addresses in the whitelist.

IP Whitelist screenshot

Using sessions to keep IP addresses between requests

If you are making multiple requests to the same website, you may want successive requests to go through the same IP address.

Geoproxies provides three parameters for session management: session, sessionGroupId and failOnDuplicateIp. Only the session parameter is required to initiate a session. Its use is very simple: include session=XYZ in your parameter list and all requests that share the same session ID (in this case, the string XYZ) will share the same IP address, to the extent possible.
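As a sketch of how a program might reuse one session ID across several requests, the Python snippet below builds a standard-library urllib opener whose HTTP requests all go through the same proxy URL. The function name session_opener is an assumption made for illustration, and the network calls are left commented out so the snippet can be read and run without an account.

```python
import urllib.request

TOKEN = "24f2f5b3-1867-42ac-994c-57140022fb46"  # example token from this page


def session_opener(token, session_id):
    """Return (opener, proxy_url); every HTTP request made with the opener
    carries the same session ID in the proxy password field.
    (Hypothetical helper, written for this example only.)
    """
    proxy_url = "http://%s:session=%s@proxy.geoproxies.com:1080" % (token, session_id)
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler), proxy_url


opener, proxy_url = session_opener(TOKEN, "XYZ")
print(proxy_url)

# Uncomment to send two requests through the same session:
# for _ in range(2):
#     print(opener.open("http://httpbin.org/ip").read().decode())
```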

See also

For more on sessions, see Proxy Parameter Reference.

Standard versus residential IP addresses

Geoproxies lets you use two pools of IP addresses: standard and residential.

Here’s a comparison of both pools.

                           Standard IPs            Residential IPs
Hosting                    Hosted in datacenters   Hosted by the home connections of real end users
Pool size                  Hundreds of IPs         Millions of IPs
Geolocation                Yes                     Yes
Anti-detection capability  Medium to high          Very high
Price                      Economical              Premium

To use residential IPs, simply include the pool=residential parameter. For standard IPs, use pool=static. For example:

curl --proxy "http://24f2f5b3-1867-42ac-994c-57140022fb46:pool=residential@proxy.geoproxies.com:1080" \
http://httpbin.org/anything

See also

To learn more about pools, see IP Address Pools Reference.

Built-in ad blocker

Geoproxies lets you block domains that serve ads and analytics services or are otherwise known to be nefarious. Use the parameter blockAds=1 to enable ad blocking.

Ad blocking works at the proxy level. Since requests to blocked domains are never sent to the target servers, you can save significantly on useless traffic.

With the blockAds=1 toggle, the proxy returns a 551 Domain block response when a URL suspected to be a tracker is identified.
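A client can treat the 551 status as a signal to skip the request rather than retry it. The Python sketch below shows one possible way to label responses; the function name and the three labels are assumptions made for illustration, and only the 551 code comes from the text above.

```python
BLOCKED_STATUS = 551  # status returned by the proxy for a blocked domain


def classify_response(status_code):
    """Label a proxied response so blocked ad/tracker requests can be skipped.

    (Hypothetical helper; the labels are illustrative.)
    """
    if status_code == BLOCKED_STATUS:
        return "blocked"
    if 200 <= status_code < 400:
        return "ok"
    return "error"


for code in (200, 551, 502):
    print(code, classify_response(code))
```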

Handling banned IP addresses

Geoproxies can exclude an IP address or a set of IP addresses when serving a request. This feature is available through the RPC APIs.

See API Reference for detailed information.

Examples of Proxy Requests

The snippets in this section use basic authentication to connect to Geoproxies.

Curl

curl lets you specify the proxy to connect through with the -x flag (the short form of --proxy).

Python

  • Using the requests library
import requests

AUTHENTICATION_TOKEN = '24f2f5b3-1867-42ac-994c-57140022fb46'
PROXY_ARGUMENTS = 'session=XYZ|wiretap'

geoproxies_url = 'http://%s:%s@proxy.geoproxies.com:1080' % (AUTHENTICATION_TOKEN, PROXY_ARGUMENTS)

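# requests expects a dict that maps each URL scheme to a proxy URL
proxies = {'http': geoproxies_url, 'https': geoproxies_url}

r = requests.get('http://httpbin.org/ip', proxies=proxies)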

print(r.text)

Node

  • Using the request module
var AUTHENTICATION_TOKEN = '24f2f5b3-1867-42ac-994c-57140022fb46';
var PROXY_ARGUMENTS = 'session=XYZ|wiretap';

var proxy = `http://${AUTHENTICATION_TOKEN}:${PROXY_ARGUMENTS}@proxy.geoproxies.com:1080/`;

require('request')({
    url: 'http://httpbin.org/ip',
    method: 'GET',
    proxy: proxy,
}, function(err,httpResponse,body) {
    console.info(err, body);
});

PHP

  • Using the built-in curl extension
$AUTHENTICATION_TOKEN = '24f2f5b3-1867-42ac-994c-57140022fb46';
$PROXY_ARGUMENTS = 'session=XYZ|wiretap';

$ch = curl_init("http://httpbin.org/ip");

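$options = [
  CURLOPT_PROXY => "http://proxy.geoproxies.com:1080",
  CURLOPT_PROXYUSERPWD => "{$AUTHENTICATION_TOKEN}:{$PROXY_ARGUMENTS}",
  // Return the response body from curl_exec() instead of printing it directly
  CURLOPT_RETURNTRANSFER => true
];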

curl_setopt_array($ch, $options);
$curl_result = curl_exec($ch);
curl_close($ch);

print($curl_result);

Firefox

Firefox lets you modify its proxy settings via the menu Preferences ‣ Advanced ‣ Network ‣ Connection ‣ Settings. Choose Manual proxy configuration, then set the HTTP Proxy field to proxy.geoproxies.com and its Port to 1080.

Firefox proxy settings screenshot

The Geoproxies SDK for Python, JavaScript and TypeScript

Getting started with Geoproxies may involve a learning curve for some developers. For this reason, Geoproxies provides helper modules in various languages that encapsulate its proxy parameters in a programmable and intuitive way, for use in scraping frameworks or any other project.

Python SDK tutorial

  • Requirements and installation

The SDK requires Python 2.7 or later.

To install using pip:

pip install geoproxies

Node.Js SDK Tutorial

  • Requirements and installation

To install using npm:

npm install geoproxies

Using the Geoproxies SDK with Scrapy

API references for Python and Node.Js

The API classes and methods are the same on both the Python and Node.js platforms, apart from each language's code style conventions.

class geoproxies.ProxySettings(token, country=None, tracking_id=None, project_id=None, target_id=None, session_id=None, session_group_id=None, fail_on_duplicate_ip=None, wiretap=None, prefer_ip=None, blockads=None, pool=None)

disable_blockads()

Toggles off the blockads functionality and returns the ProxySettings.

disable_fail_on_duplicate_ip()

Toggles off the fail_on_duplicate_ip functionality and returns the ProxySettings.

disable_wiretap()

Toggles off the wiretap functionality and returns the ProxySettings.

enable_blockads()

Toggles on the blockads functionality and returns the ProxySettings.

enable_fail_on_duplicate_ip()

Toggles on the fail_on_duplicate_ip functionality and returns the ProxySettings.

enable_wiretap()

Toggles on the wiretap functionality and returns the ProxySettings.

get_api_url(name)

Returns the proxy URL for the specified API name.

get_components()

Returns a dictionary of the proxy strings/components required by most request libraries.

get_country()

Returns the current value of the country attribute.

get_pool()

Returns the current value of the pool attribute.

get_prefer_ip()

Returns the current value of the prefer_ip attribute.

get_project_id()

Returns the current value of the project_id attribute.

get_session_group_id()

Returns the current value of the session_group_id attribute.

get_session_id()

Returns the current value of the session_id attribute.

get_target_id()

Returns the current value of the target_id attribute.

get_token()

Returns the current value of the token attribute.

get_tracking_id()

Returns the current value of the tracking_id attribute.

is_blockads()

Returns the current value of the blockads attribute as a boolean.

is_fail_on_duplicate_ip()

Returns the current value of the fail_on_duplicate_ip attribute as a boolean.

is_wiretap()

Returns the current value of the wiretap attribute as a boolean.

new_project_id()

Clones the ProxySettings, sets the project_id attribute to a randomly generated number and returns the new object.

new_session_group_id()

Clones the ProxySettings, sets the session_group_id attribute to a randomly generated number and returns the new object.

new_session_id()

Clones the ProxySettings, sets the session_id attribute to a randomly generated number and returns the new object.

new_target_id()

Clones the ProxySettings, sets the target_id attribute to a randomly generated number and returns the new object.

new_tracking_id()

Clones the ProxySettings, sets the tracking_id attribute to a randomly generated number and returns the new object.

set_blockads(value)

Clones and returns the ProxySettings with the updated blockads attribute.

set_country(value)

Clones and returns the ProxySettings with the updated country attribute.

set_fail_on_duplicate_ip(value)

Clones and returns the ProxySettings with the updated fail_on_duplicate_ip attribute.

set_pool(value)

Selects either the residential or the hosted IP address pool and returns the ProxySettings.

set_prefer_ip(value)

Clones and returns the ProxySettings with the updated prefer_ip attribute.

set_project_id(value)

Clones and returns the ProxySettings with the updated project_id attribute.

set_session_group_id(value)

Clones and returns the ProxySettings with the updated session_group_id attribute.

set_session_id(value)

Clones and returns the ProxySettings with the updated session_id attribute.

set_target_id(value)

Clones and returns the ProxySettings with the updated target_id attribute.

set_token(value)

Clones and returns the ProxySettings with the updated token attribute.

set_tracking_id(value)

Clones and returns the ProxySettings with the updated tracking_id attribute.

set_wiretap(value)

Clones and returns the ProxySettings with the updated wiretap attribute.

update(token=None, country=None, tracking_id=None, project_id=None, target_id=None, session_id=None, session_group_id=None, fail_on_duplicate_ip=None, wiretap=None, prefer_ip=None, blockads=None, pool=None)

Returns an instance of the object with updated members.

Named arguments

token: The authentication token provided by Geoproxies. The value is a UUID.

country: Two-letter country code as defined in ISO 3166-1, for example mu for Mauritius or fr for France.

tracking_id: Identifies a single request, so it should be unique at all times. You can use the new_tracking_id() method instead to generate a unique value.

project_id: Categorizes data usage in whatever way the proxy user sees fit.

target_id: Represents a domain. Required when you need fine-grained control over IP usage per domain.

session_id: Where tracking_id represents a single request, session_id represents a set of related requests.

session_group_id: Used for grouping sessions. It can also be used to control the provision of IPs.

fail_on_duplicate_ip: An optional truthy value that changes the proxy behaviour when no unique IP is available to serve a request. Note: this option must be used with the session_group_id parameter.

wiretap: A boolean value that enables or disables the wiretapping functionality.

prefer_ip: Requests a particular IP address. That IP is used when available; otherwise another IP is used.

blockads: Toggle that rejects URLs of known advert, analytics or attack hosts, as listed on https://github.com/notracking/hosts-blocklists/

pool: Optional parameter to select either Residential or Static IPs. Defaults to Static.

For more detailed information about all the proxy options, see http://geoproxies.com/en/pDocs
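Because the setters are described as cloning the settings object, the return value is what carries the change. The Python sketch below illustrates that clone-and-return pattern with a small stand-in class. MiniSettings is hypothetical and is not the SDK's implementation; it only mirrors the behaviour documented for the set_* methods.

```python
import copy


class MiniSettings(object):
    """Hypothetical stand-in for ProxySettings, reduced to two attributes."""

    def __init__(self, token, country=None):
        self.token = token
        self.country = country

    def set_country(self, value):
        # Clone first, then change the clone; the original is left untouched.
        clone = copy.copy(self)
        clone.country = value
        return clone


base = MiniSettings("24f2f5b3-1867-42ac-994c-57140022fb46")
french = base.set_country("fr")
print(base.country, french.country)
```

With this pattern, a call whose result is not assigned to a variable has no lasting effect, so keep the returned object.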

Using SDK

  • Simplest Request

A valid ProxySettings is created by initializing the ProxySettings class with the customer token value.

>>> from geoproxies import ProxySettings
>>> proxysettings = ProxySettings('24f2f5b3-1867-42ac-994c-57140022fb46')

This is all it takes to make a request with Geoproxies.

  • Tracking a Request

The tracking_id identifies each request. After a request completes, its data can be collected via the Geoproxies API. Because the setters return a clone, keep the returned object.

>>> proxysettings = proxysettings.set_tracking_id('any id works')

or, even better, generate a new tracking id for every request with the method

>>> proxysettings = proxysettings.new_tracking_id()

Note

This value is required if you want to retrieve request information or wiretap data afterwards.

  • Sessions and Session Groups

Using a session allows the same IP address to be used and re-used. Think of a session as being associated with an IP address.

Note

Including the country or session_group option results in a different IP. A logical way to think of this: a session is identified by the combination of session_id, country and session_group.

Session Group, as the name suggests, allows sessions to be grouped. This is particularly useful when using the same set of sessions for different requirements. Here each requirement would be represented by a different session_group name.

Note

The fail_on_duplicate_ip option, in conjunction with session_group, instructs the proxy to fail when Geoproxies is unable to dispatch a unique IP address in the batch of IPs associated with the Session Group.

  • Making a Wiretap Request

    >>> proxysettings = proxysettings.set_wiretap(True)

Wiretap records the data transferred up and down, most notably for debugging.

  • Using the output

    >>> print(str(proxysettings))
    http://24f2f5b3-1867-42ac-994c-57140022fb46:wiretap=True@proxy.geoproxies.com:1080
    

Proxy Parameter Reference

Parameter syntax

Geoproxies allows the combination of its parameters in any order.

Each option/argument must be separated with the pipe (|) character. For example, using the country and project parameters looks like this: country=fi|project=analytics423.

The parameter names are case-insensitive.
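To make the syntax concrete, here is a small Python sketch that splits a parameter string into a dictionary. The function name parse_parameters is an assumption made for illustration, and lower-casing the names is only a client-side way to model the case insensitivity described above, not a description of how the proxy server parses them.

```python
def parse_parameters(password):
    """Split a parameter string into a dict keyed by lower-cased name.

    Toggle parameters without "=value" (for example wiretap) map to True.
    (Hypothetical helper, written for this example only.)
    """
    parsed = {}
    for item in password.split("|"):
        if not item:
            continue  # ignore empty segments such as a trailing pipe
        name, sep, value = item.partition("=")
        parsed[name.lower()] = value if sep else True
    return parsed


print(parse_parameters("country=fi|project=analytics423"))
print(parse_parameters("Wiretap|COUNTRY=de"))
```

Since the result is a dictionary, the order in which the parameters were written does not matter.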

IP address selection

  • pool: string

The pool parameter is used to select the preferred IP pool. It accepts one of two values: static or residential. Defaults to pool=static.

  • country: ISO alpha-2 country code

Pass the country option to request an IP from a specific country. Only standard ISO 3166-1 alpha-2 country codes are recognized.

For example: fr for France, nl for the Netherlands.

  • preferIP: IPv4 address

This parameter passes the preferred IPv4 address to serve a request with. Should that IP be unavailable, another IP address will be used.

Note

You can query for the IP used to serve a request with a call to getRequestData API. See API Reference for more information on using this API.

Sessions

  • session: string

This parameter allows the persistent use of a single IP address on Geoproxies. It is similar to the preferIP option, but much more extensive. See the sessionGroupId and failOnDuplicateIp parameters for advanced combinations.

  • sessionGroupId: string

The sessionGroupId parameter categorizes a batch of requests and, most importantly, hints to Geoproxies that it should serve distinct IP addresses when possible.

Note

This option requires the session parameter to be provided.

In addition, see the failOnDuplicateIp option to enforce a failure when Geoproxies is unable to serve a new IP address based on all the arguments specified.

  • failOnDuplicateIp: mixed

Optionally fails the request when a unique IP cannot be dispatched for the given session and session group.

Note

This requires the sessionGroupId and session parameters to be provided.
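The dependencies between the session parameters can be checked on the client before a request is sent. The Python sketch below encodes the two rules stated in the notes above; the function name, the REQUIRES table and the message format are assumptions made for illustration.

```python
# Prerequisites taken from the notes above: each key needs every listed name.
REQUIRES = {
    "sessionGroupId": ["session"],
    "failOnDuplicateIp": ["sessionGroupId", "session"],
}


def check_session_parameters(names):
    """Return messages for session parameters whose prerequisites are missing.

    Names are compared case-insensitively. (Hypothetical helper.)
    """
    present = {name.lower() for name in names}
    problems = []
    for name, required in REQUIRES.items():
        if name.lower() not in present:
            continue
        for other in required:
            if other.lower() not in present:
                problems.append("%s requires %s" % (name, other))
    return problems


print(check_session_parameters(["session", "sessionGroupId"]))
print(check_session_parameters(["failOnDuplicateIp"]))
```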

Ban management

  • targetId: string

This parameter is intended to represent a body of requests to a destination. It is synonymous with a domain.

Debugging

  • wiretap: void

A toggle parameter to capture the request and response exchanged between the proxy and the target destination.

Note

Using this option also requires the trackingId parameter to be provided.

  • trackingId: string

A user-provided value that represents a request. Use this parameter when you need to extract information about the request at a later time via API calls.

Traffic control

  • blockAds: void

The blockAds option filters out hosts and domains known to serve analytics, adverts or attacks, among others. See https://github.com/notracking/hosts-blocklists/ for the list of blocked URLs.

Accounting

  • project: string

The project option is used for data grouping, most commonly to group traffic usage statistics.

IP Address Pools Reference

Geoproxies' IPs are divided into two categories, giving businesses and individuals the best of both for easier management of cost, speed, security, stability and expectations.

Datacenter

More than 5000 hosted IP addresses are available. These IPs are hosted on Geoproxies servers and can be counted on with little to no downtime.

Pros
  • Sticky IP
  • Cheap
  • Hosted
  • Availability
Cons
  • Limited

Residential

Our residential IPs are distributed all around the world via our P2P solutions, providing our customers with changing IP addresses to counter complex IP blacklist services.

Pros
  • Unlimited number of IPs
  • Ideal for evading very complex web services
Cons
  • Possibly shorter life cycle

API Reference

Geoproxies enables further features through its simple-to-use RPC APIs.

Note

Like the proxy parameters, Geoproxies RPC API names are case-insensitive. They are also style-neutral: create_access_token and CreateAccessToken return the same result.
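One way to picture this name matching is to reduce every spelling to a common form before comparing. The Python sketch below does that by dropping underscores and lower-casing; it is an assumption about how such matching could work, not a description of the Geoproxies server, and the function name is hypothetical.

```python
def normalize_api_name(name):
    """Reduce an API name to a form that ignores case and underscores.

    (Hypothetical helper, written for this example only.)
    """
    return name.replace("_", "").lower()


# Both spellings from the note above reduce to the same key.
print(normalize_api_name("create_access_token") == normalize_api_name("CreateAccessToken"))
```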

Attention

  • Parameter names preceded by an asterisk are compulsory; all others are optional.
  • Parameters prefixed with [p] are POST parameters.
  • Parameters prefixed with [c] are passed with the basic authentication header.
  • api_token must be provided for every request. This token value can be found on your Geoproxies account proxy access information page.

Ban

Use this call to temporarily disable the IP address associated with a session for a given amount of time.

PARAMETERS

  • [c] * session - The session identifier currently holding a reference to the IP address to ban.
  • [p] ttl - An optional delay value in seconds. If not provided, a default ttl value of 3600 is assumed.

RETURNS

“success” or failure reason

GetRequestData

Fetches all information associated with a request including wiretap.

PARAMETERS

  • [c] or [p] * trackingId - Same as the trackingId passed for each request.
  • [c] or [p] * wiretap - An always true toggle parameter.
  • [p] * tcpdump - An always true toggle parameter.

RETURNS

{
    destination: "",
    bytes_up: "",
    bytes_down: "",
    ip_address: "",
    start_datetime: "",
    end_datetime: "",
    start_timestamp: "",
    end_timestamp: "",
    boundaries: {},
    data: ""
}

The structure above is returned when the tcpdump option is passed. Otherwise, the well-formatted JSON below is returned.

{
    host: "",
    path: "",
    port: 0,
    method: "",
    start: "",
    end: "",
    ip_address: "",
    start_timestamp: "",
    end_timestamp: "",
    up: {
        bytes: "",
        headers: "",
        body: ""
    },
    down: {
        bytes: "",
        headers: "",
        body: ""
    }
}
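Once such a record has been retrieved, it can be summarized with a few lines of Python. The sketch below works on a made-up sample shaped like the non-tcpdump structure above: every value in SAMPLE is illustrative, the function name summarize is hypothetical, and it assumes that the response is valid JSON with quoted keys and that the bytes fields hold numeric strings.

```python
import json

# Made-up sample shaped like the structure above; all values are illustrative.
SAMPLE = """
{"host": "httpbin.org", "path": "/ip", "port": 80, "method": "GET",
 "ip_address": "203.0.113.7",
 "up": {"bytes": "120", "headers": "", "body": ""},
 "down": {"bytes": "340", "headers": "", "body": ""}}
"""


def summarize(record):
    """Return the target, the IP address used and the total bytes transferred."""
    up = int(record["up"]["bytes"] or 0)      # an empty string counts as 0
    down = int(record["down"]["bytes"] or 0)
    return {
        "target": "%s %s%s" % (record["method"], record["host"], record["path"]),
        "ip_address": record["ip_address"],
        "total_bytes": up + down,
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```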

GetWiretap

Returns the recorded output data for a request, which is useful for debugging. Prefer GetRequestData to fetch all available information about a request, including the ip_address used for it.

PARAMETERS

  • [c] or [p] * trackingId - Same as the trackingId passed for each request.
  • [p] tcpdump - An optional toggle parameter to return result as a dump.

RETURNS

{
    host: "",
    path: "",
    port: 0,
    method: "",
    start: "",
    end: "",
    start_timestamp: "",
    end_timestamp: "",
    up: {
        bytes: "",
        headers: "",
        body: ""
    },
    down: {
        bytes: "",
        headers: "",
        body: ""
    }
}

CreateAccessToken

This call is used to create a short-lived API token.

PARAMETERS

  • [p] token - .
  • [p] ttl - An optional delay value in seconds. If not provided, a default ttl value of 600 is assumed.

RETURNS

{
    newToken: "uuid-token",
    expires: 600
}

ClearSession

Removes an active session

PARAMETERS

  • [p] * session: -

RETURNS

“success” or failure reason
