How to Solve reCAPTCHA When Web Scraping in 2024 with a reCAPTCHA Solver#
reCAPTCHA Overview#
reCAPTCHA is a free service provided by Google that helps protect websites from spam and abuse. It uses advanced risk analysis techniques and adaptive challenges to distinguish humans from bots. The term "CAPTCHA" stands for "Completely Automated Public Turing test to tell Computers and Humans Apart."
There have been several versions of reCAPTCHA over the years:
reCAPTCHA v1: This was the original version which presented users with distorted text that they had to decipher and enter into a box. This was useful for digitizing books and other printed materials, but it was often difficult for humans to solve.
reCAPTCHA v2: This version introduced the "I'm not a robot" checkbox that users are asked to click. If the system isn't sure whether the user is human or not, it will present additional challenges, such as identifying objects in images.
reCAPTCHA v3: This version runs in the background and doesn't interrupt users with challenges. Instead, it assigns a risk score to each visitor based on their interactions with the website. Website owners can then use this score to decide how to handle the visitor (e.g., block, present a challenge, etc.).
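To make the v3 flow concrete, here is a minimal sketch of how a site owner might act on the score. Scores range from 0.0 to 1.0, with higher values meaning the visitor is more likely human. The function name `decide_action` and the thresholds 0.7 and 0.3 are illustrative assumptions, not values taken from Google or from NextCaptcha.

```python
def decide_action(score: float, allow_at: float = 0.7, block_below: float = 0.3) -> str:
    """Map a reCAPTCHA v3 score to an action (thresholds are hypothetical)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= allow_at:
        return "allow"      # likely human, let the request through
    if score < block_below:
        return "block"      # likely a bot, reject the request
    return "challenge"      # uncertain, ask for extra verification


if __name__ == "__main__":
    for s in (0.9, 0.5, 0.1):
        print(s, decide_action(s))
```

This is why a scraper facing v3 needs a token with a good score rather than a solved puzzle: the site decides what to do based on the number alone.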
Solving reCAPTCHA#
Prerequisites#
Requests library
The example code uses the Python requests library.
NextCaptcha client key
Get your NextCaptcha client key from the dashboard#
Sign up for NextCaptcha to get your free API key and free trial credits immediately.
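The requests library can be installed with `pip install requests`. For the client key, one common approach is to keep it out of the source code and read it from an environment variable. The sketch below assumes a variable named `NEXTCAPTCHA_CLIENT_KEY`; that name and the helper `load_client_key` are hypothetical and not defined by NextCaptcha.

```python
import os


def load_client_key(env_var: str = "NEXTCAPTCHA_CLIENT_KEY") -> str:
    """Read the client key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your NextCaptcha client key"
        )
    return key
```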
Python code for solving reCAPTCHA#
import time

import requests

HOST = "https://api.nextcaptcha.com"
TIMEOUT = 45              # seconds to wait for a result (assumed value, tune as needed)
READY_STATUS = "ready"    # assumed status string for a solved task
FAILED_STATUS = "failed"  # matches the "failed" status used in the timeout result below


# create task
def send_task(task, client_key, solft_id="", callback_url=""):
    """
    Create a NextCaptcha CAPTCHA solver task and poll until it finishes.
    :param task: dict describing the captcha task.
    :param client_key: the client key from the NextCaptcha dashboard.
    :param solft_id: Optional. The value of the 'solft_id'.
    :param callback_url: Optional. URL called back when the captcha task finishes.
    :return: A dictionary containing the solution of the reCAPTCHA.
    """
    session = requests.session()
    data = {
        "clientKey": client_key,
        "solftId": solft_id,
        "callbackUrl": callback_url,
        "task": task,
    }
    resp = session.post(url=HOST + "/createTask", json=data)
    if resp.status_code != 200:
        return resp.json()
    task_id = resp.json().get("taskId")
    start_time = time.time()
    # poll once per second until the task is ready, has failed, or times out
    while True:
        if time.time() - start_time > TIMEOUT:
            return {"errorId": 12, "errorDescription": "Timeout", "status": "failed"}
        resp = session.post(url=HOST + "/getTaskResult",
                            json={"clientKey": client_key, "taskId": task_id})
        if resp.status_code != 200:
            return resp.json()
        result = resp.json()
        status = result.get("status")
        if status in (READY_STATUS, FAILED_STATUS):
            return result
        time.sleep(1)
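The dictionary returned by `send_task` still has to be unpacked before the token can be used. The sketch below assumes the result follows the common solver-API shape, with a `"ready"` status and the token under `solution.gRecaptchaResponse`. Both the field names and the helper `extract_token` are assumptions here, so check them against the NextCaptcha API documentation.

```python
from typing import Optional


def extract_token(result: dict) -> Optional[str]:
    """Return the reCAPTCHA token from a task result, or None if it is not ready.

    Assumes a result shaped like:
    {"status": "ready", "solution": {"gRecaptchaResponse": "<token>"}}
    """
    if result.get("status") != "ready":
        return None
    solution = result.get("solution") or {}
    return solution.get("gRecaptchaResponse")
```

Returning `None` instead of raising lets the caller decide whether to retry, log the `errorDescription`, or give up.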
Python example code for solving reCAPTCHA v2#
def recaptchav2(client_key: str, website_url: str, website_key: str, recaptcha_data_s_value: str = "",
                is_invisible: bool = False, api_domain: str = "", page_action: str = "") -> dict:
    """
    Solve reCAPTCHA v2 challenge.
    :param client_key: The client key from the NextCaptcha dashboard.
    :param website_url: The URL of the website where the reCAPTCHA is located.
    :param website_key: The sitekey of the reCAPTCHA.
    :param recaptcha_data_s_value: Optional. The value of the 'data-s' parameter if present.
    :param is_invisible: Optional. Whether the reCAPTCHA is invisible or not.
    :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
    :param page_action: Optional. The action parameter to use for the reCAPTCHA.
    :return: A dictionary containing the solution of the reCAPTCHA.
    """
    task = {
        "type": "RecaptchaV2TaskProxyless",
        "websiteURL": website_url,
        "websiteKey": website_key,
        "recaptchaDataSValue": recaptcha_data_s_value,
        "isInvisible": is_invisible,
        "apiDomain": api_domain,
        "pageAction": page_action,
    }
    # solft_id and callback_url are optional and left empty here
    return send_task(task, client_key, solft_id="", callback_url="")
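The `website_key` argument is the sitekey embedded in the target page. On pages that render the v2 widget declaratively, it appears as a `data-sitekey` attribute, so a minimal sketch for locating it with the standard library could look like this. The class `SiteKeyParser` and the function `find_site_keys` are illustrative names, and pages that pass the key to `grecaptcha.render` from JavaScript will need a different approach.

```python
from html.parser import HTMLParser
from typing import List


class SiteKeyParser(HTMLParser):
    """Collect the value of every data-sitekey attribute in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.site_keys: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names
        for name, value in attrs:
            if name == "data-sitekey" and value:
                self.site_keys.append(value)


def find_site_keys(html: str) -> List[str]:
    """Return all sitekeys found in the given HTML, in document order."""
    parser = SiteKeyParser()
    parser.feed(html)
    return parser.site_keys
```

With the page HTML already downloaded, `find_site_keys(html)` returns a list, and an empty list is a hint that the key is injected by script instead.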
Python example code for solving reCAPTCHA v2 Enterprise#
def recaptchav2enterprise(client_key: str, website_url: str, website_key: str, enterprise_payload: dict = None,
                          is_invisible: bool = False, api_domain: str = "", page_action: str = "") -> dict:
    """
    Solve reCAPTCHA v2 Enterprise challenge.
    :param client_key: The client key from the NextCaptcha dashboard.
    :param website_url: The URL of the website where the reCAPTCHA is located.
    :param website_key: The sitekey of the reCAPTCHA.
    :param enterprise_payload: Optional. Additional enterprise payload parameters.
    :param is_invisible: Optional. Whether the reCAPTCHA is invisible or not.
    :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
    :param page_action: Optional. The action parameter to use for the reCAPTCHA.
    :return: A dictionary containing the solution of the reCAPTCHA.
    """
    task = {
        "type": "RecaptchaV2EnterpriseTaskProxyless",
        "websiteURL": website_url,
        "websiteKey": website_key,
        # use a fresh dict per call instead of a shared mutable default
        "enterprisePayload": enterprise_payload or {},
        "isInvisible": is_invisible,
        "apiDomain": api_domain,
        "pageAction": page_action,
    }
    # solft_id and callback_url are optional and left empty here
    return send_task(task, client_key, solft_id="", callback_url="")
Python example code for solving reCAPTCHA v3#
def recaptchav3(client_key: str, website_url: str, website_key: str, page_action: str = "", api_domain: str = "",
                proxy_type: str = "", proxy_address: str = "", proxy_port: int = 0, proxy_login: str = "",
                proxy_password: str = "") -> dict:
    """
    Solve reCAPTCHA v3 challenge.
    :param client_key: The client key from the NextCaptcha dashboard.
    :param website_url: The URL of the website where the reCAPTCHA is located.
    :param website_key: The sitekey of the reCAPTCHA.
    :param page_action: Optional. The action parameter to use for the reCAPTCHA.
    :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
    :param proxy_type: Optional. The type of the proxy (HTTP, HTTPS, SOCKS4, SOCKS5).
    :param proxy_address: Optional. The address of the proxy.
    :param proxy_port: Optional. The port of the proxy.
    :param proxy_login: Optional. The login for the proxy.
    :param proxy_password: Optional. The password for the proxy.
    :return: A dictionary containing the solution of the reCAPTCHA.
    """
    task = {
        "type": "RecaptchaV3TaskProxyless",
        "websiteURL": website_url,
        "websiteKey": website_key,
        "pageAction": page_action,
        "apiDomain": api_domain,
    }
    # switch to the proxy-based task type only when a proxy address is given
    if proxy_address:
        task["type"] = "RecaptchaV3Task"
        task["proxyType"] = proxy_type
        task["proxyAddress"] = proxy_address
        task["proxyPort"] = proxy_port
        task["proxyLogin"] = proxy_login
        task["proxyPassword"] = proxy_password
    # solft_id and callback_url are optional and left empty here
    return send_task(task, client_key, solft_id="", callback_url="")
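Proxies are often stored as a single URL rather than as five separate values. As a convenience, the sketch below splits such a URL into the `proxy_type`, `proxy_address`, `proxy_port`, `proxy_login`, and `proxy_password` arguments used by `recaptchav3`. The helper name `proxy_fields` is hypothetical, and the upper-cased proxy type follows the values listed in the docstring (HTTP, HTTPS, SOCKS4, SOCKS5).

```python
from urllib.parse import urlsplit


def proxy_fields(proxy_url: str) -> dict:
    """Split a proxy URL like scheme://user:pass@host:port into keyword arguments."""
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError("expected a proxy URL like scheme://user:pass@host:port")
    return {
        "proxy_type": parts.scheme.upper(),
        "proxy_address": parts.hostname,
        "proxy_port": parts.port,
        # credentials are optional; fall back to empty strings
        "proxy_login": parts.username or "",
        "proxy_password": parts.password or "",
    }
```

Because the keys match the parameter names, the result can be unpacked straight into the call with `**proxy_fields(url)` alongside the other arguments.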
Python example code for solving reCAPTCHA Mobile#
def recaptcha_mobile(client_key: str, app_key: str, app_package_name: str = "", app_action: str = "") -> dict:
    """
    Solve Mobile reCAPTCHA challenge.
    :param client_key: The client key from the NextCaptcha dashboard.
    :param app_key: The app key of the Mobile reCAPTCHA.
    :param app_package_name: Optional. The package name of the mobile app.
    :param app_action: Optional. The action parameter to use for the Mobile reCAPTCHA.
    :return: A dictionary containing the solution of the Mobile reCAPTCHA.
    """
    task = {
        "type": "RecaptchaMobileProxyless",
        "appKey": app_key,
        "appPackageName": app_package_name,
        "appAction": app_action,
    }
    # solft_id and callback_url are optional and left empty here
    return send_task(task, client_key, solft_id="", callback_url="")
Conclusion#
NextCaptcha is a highly maintained, up-to-date, and low-cost reCAPTCHA solver service, including reCAPTCHA Mobile, with stable 24/7 support.
For successful data retrieval, you need a tool you can rely on to handle CAPTCHAs. NextCaptcha provides an easy-to-set-up API that helps you overcome anti-bot challenges, and you can try it for free today.