The Geoproxies SDK for Python, JavaScript and TypeScript
Using Geoproxies may involve a learning curve for some developers. For this reason, Geoproxies provides helper modules in several languages that encapsulate its proxy parameters in a programmable and intuitive way, for use in scraping frameworks or any other project.
Python SDK tutorial
- Requirements and installation
The SDK requires Python 2.7 or later.
To install using pip:
pip install geoproxies
Using the Geoproxies SDK with Scrapy
API reference for Python and Node.js
The API classes and methods are the same on both the Python and Node.js platforms, apart from each language's code style conventions.
class geoproxies.ProxySettings(token, country=None, tracking_id=None, project_id=None, target_id=None, session_id=None, session_group_id=None, fail_on_duplicate_ip=None, wiretap=None, prefer_ip=None, blockads=None, pool=None)
- disable_fail_on_duplicate_ip()
  Turns off the fail_on_duplicate_ip functionality and returns the ProxySettings.
- enable_fail_on_duplicate_ip()
  Turns on the fail_on_duplicate_ip functionality and returns the ProxySettings.
- get_components()
  Returns a dictionary of the proxy strings/components required by most request libraries.
- is_fail_on_duplicate_ip()
  Returns the current value of the fail_on_duplicate_ip attribute as a boolean.
- new_project_id()
  Clones the ProxySettings, sets the project_id attribute to a randomly generated number and returns the new object.
- new_session_group_id()
  Clones the ProxySettings, sets the session_group_id attribute to a randomly generated number and returns the new object.
- new_session_id()
  Clones the ProxySettings, sets the session_id attribute to a randomly generated number and returns the new object.
- new_target_id()
  Clones the ProxySettings, sets the target_id attribute to a randomly generated number and returns the new object.
- new_tracking_id()
  Clones the ProxySettings, sets the tracking_id attribute to a randomly generated number and returns the new object.
- set_blockads(value)
  Clones and returns the ProxySettings with the updated blockads attribute.
- set_fail_on_duplicate_ip(value)
  Clones and returns the ProxySettings with the updated fail_on_duplicate_ip attribute.
- set_pool(value)
  Selects either the residential or the hosted IP address pool and returns the ProxySettings.
- set_prefer_ip(value)
  Clones and returns the ProxySettings with the updated prefer_ip attribute.
- set_project_id(value)
  Clones and returns the ProxySettings with the updated project_id attribute.
- set_session_group_id(value)
  Clones and returns the ProxySettings with the updated session_group_id attribute.
- set_session_id(value)
  Clones and returns the ProxySettings with the updated session_id attribute.
- set_target_id(value)
  Clones and returns the ProxySettings with the updated target_id attribute.
- set_tracking_id(value)
  Clones and returns the ProxySettings with the updated tracking_id attribute.
- update(token=None, country=None, tracking_id=None, project_id=None, target_id=None, session_id=None, session_group_id=None, fail_on_duplicate_ip=None, wiretap=None, prefer_ip=None, blockads=None, pool=None)
  Returns an instance of the object with the updated members.
Named arguments
token: The authentication token provided by Geoproxies. The value is a UUID.
country: Two-letter country code as defined in ISO 3166-1, for example mu for Mauritius or fr for France.
tracking_id: Identifies a single request and should therefore always be unique. Alternatively, use the new_tracking_id() method to generate a unique value.
project_id: Optional label for categorizing data use in whatever way the proxy user sees fit.
target_id: Represents a domain. It is needed when fine-grained control over IP usage per domain is required.
session_id: Where tracking_id represents a single request, session_id represents a set of related requests.
session_group_id: Used for grouping sessions. It can also be used to control how IPs are provisioned.
fail_on_duplicate_ip: An optional truthy value that changes the proxy behaviour when no unique IP is available to serve a request. Note: this option must be used together with the session_group_id parameter.
wiretap: A truthy value enables the wiretapping functionality; a falsy value disables it.
prefer_ip: Used for requesting a particular IP address. That IP is used when available; otherwise another IP is used.
blockads: Toggle option to reject known advert, analytics or attack URLs as listed at https://github.com/notracking/hosts-blocklists/
pool: Optional parameter to select either Residential or Static IPs. Defaults to Static.
For more detailed information about all the proxy options, see http://geoproxies.com/en/pDocs
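The named arguments above can also be supplied from a configuration mapping. The following is a minimal sketch rather than part of the SDK: apply_options and UPDATE_KEYS are hypothetical names, and the only SDK call it assumes is update() with the keyword arguments listed in its signature.

```python
# Keyword names copied from the documented update() signature.
UPDATE_KEYS = (
    "token", "country", "tracking_id", "project_id", "target_id",
    "session_id", "session_group_id", "fail_on_duplicate_ip",
    "wiretap", "prefer_ip", "blockads", "pool",
)


def apply_options(settings, options):
    """Call settings.update() with only the documented keyword arguments.

    `settings` is any object exposing update(**kwargs), such as a
    ProxySettings instance. Unknown keys raise instead of being dropped.
    This helper is hypothetical and not provided by the SDK.
    """
    unknown = sorted(set(options) - set(UPDATE_KEYS))
    if unknown:
        raise ValueError("unsupported proxy options: " + ", ".join(unknown))
    # None values are left out; update() already defaults every argument to None.
    kwargs = {key: value for key, value in options.items() if value is not None}
    return settings.update(**kwargs)
```

With the SDK installed, a call might look like apply_options(ProxySettings(token), {"country": "fr"}). Rejecting unknown keys early keeps a misspelled option name from going unnoticed.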
Using the SDK
- Simplest Request
A valid ProxySettings object is created by instantiating the ProxySettings class with the customer token.
>>> from geoproxies import ProxySettings
>>> proxysettings = ProxySettings('24f2f5b3-1867-42ac-994c-57140022fb46')
This is all it takes to make a request with Geoproxies.
- Tracking a Request
The tracking_id identifies each request. Once a request has completed, its data can be retrieved through the Geoproxies API.
>>> proxysettings.set_tracking_id('any id works')
Better still, generate a new tracking ID for every request with the new_tracking_id() method:
>>> proxysettings.new_tracking_id()
Note
This value is required if request information or wiretap data is to be retrieved afterwards.
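Because new_tracking_id() is documented as returning a clone, the returned object is the one that carries the new ID and should be kept. A minimal sketch of giving every request its own tracked settings follows; settings_per_request is a hypothetical helper, not an SDK function, and it relies only on the documented new_tracking_id() method.

```python
def settings_per_request(base_settings, urls):
    """Return (url, settings) pairs, one freshly tracked clone per URL.

    `base_settings` is assumed to be a ProxySettings; only its documented
    new_tracking_id() method is called. The base object itself is never
    reassigned here, so it can be reused for later batches.
    """
    return [(url, base_settings.new_tracking_id()) for url in urls]
```

With the SDK installed, this would be called as settings_per_request(ProxySettings(token), urls), and each returned settings object would then be passed to the HTTP client for its URL.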
- Sessions and Session Groups
Using a session allows the same IP address to be used and reused. Think of a session as being associated with an IP address.
Note
Including the country or session_group option results in a different IP. A logical way to think of this: a session is identified by the combination of the provided session_id, country and session_group.
A session group, as the name suggests, allows sessions to be grouped. This is particularly useful when the same set of sessions is used for different requirements; each requirement is then represented by a different session_group name.
Note
The fail_on_duplicate_ip option, used in conjunction with session_group, instructs the proxy to fail when Geoproxies is unable to dispatch an IP address that is unique within the batch of IPs associated with the session group.
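The relationship between sessions, session groups and fail_on_duplicate_ip can be sketched with the documented setters. sessions_for_group is a hypothetical helper, not part of the SDK, and it assumes each setter returns an updated clone as described in the API reference.

```python
def sessions_for_group(base_settings, group_name, session_ids, strict=False):
    """Build one settings clone per session, all in the same session group.

    With strict=True, fail_on_duplicate_ip is switched on, asking the proxy
    to fail rather than reuse an IP already associated with the group.
    Only documented setters are called on `base_settings`.
    """
    grouped = base_settings.set_session_group_id(group_name)
    if strict:
        grouped = grouped.set_fail_on_duplicate_ip(True)
    # Each session gets its own clone derived from the grouped settings.
    return {sid: grouped.set_session_id(sid) for sid in session_ids}
```

Reusing the same session IDs under a second group name would, per the note above, correspond to a different set of IPs for the second requirement.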
- Making a Wiretap Request
>>> proxysettings.set_wiretap(True)
Wiretapping records the data transferred in both directions. It is useful for various purposes, most notably debugging.
- Using the output
>>> print(str(proxysettings))
http://24f2f5b3-1867-42ac-994c-57140022fb46:wiretap=True@proxy.geoproxies.com:1080
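The printed value is an ordinary proxy URL, so it can be inspected with the Python 3 standard library and handed to an HTTP client. The sketch below takes the string shown above as given. describe_proxy_url is a hypothetical helper, and routing both the http and https schemes through the same URL is an assumption, not something this page states.

```python
from urllib.parse import urlsplit


def describe_proxy_url(proxy_url):
    """Split a proxy URL into its parts and build a scheme-to-proxy mapping.

    The mapping has the shape accepted by urllib.request.ProxyHandler and by
    the `proxies` argument of the requests library. Using the same URL for
    both schemes is an assumption made for this sketch.
    """
    parts = urlsplit(proxy_url)
    return {
        "token": parts.username,      # customer token, in the user field
        "options": parts.password,    # proxy options, in the password field
        "host": parts.hostname,
        "port": parts.port,
        "proxies": {"http": proxy_url, "https": proxy_url},
    }


PROXY_URL = (
    "http://24f2f5b3-1867-42ac-994c-57140022fb46:wiretap=True"
    "@proxy.geoproxies.com:1080"
)
info = describe_proxy_url(PROXY_URL)
```

With the SDK installed, str(proxysettings) would take the place of the hard-coded PROXY_URL.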