Join And Get Free Trial!

如何使用 recaptcha 求解器在 2024 年进行网页抓取时解决 recaptcha#

logoNextCaptcha
May 20,2024

reCAPTCHA 概述#

reCAPTCHA 是 Google 提供的一项免费服务,可帮助保护网站免受垃圾邮件和滥用的侵害。它使用先进的风险分析技术和自适应挑战来区分人类和机器人。术语“验证码”代表“完全自动化的公共图灵测试,用于区分计算机和人类”。 多年来,reCAPTCHA 有过多个版本:
  • reCAPTCHA v1: 这是最初的版本,它向用户呈现扭曲的文本,他们必须解读并输入到一个框中。这对于数字化书籍和其他印刷材料很有用,但对于人类来说通常很难解决。

    • nextcaptcha recaptcha-badge

  • reCAPTCHA v2: 此版本引入了“我不是机器人”复选框,要求用户点击。如果系统不确定用户是否是人类,它将带来额外的挑战,例如识别图像中的物体。

    • nextcaptcha RecaptchaLogo

  • reCAPTCHA v3: 此版本在后台运行,不会用挑战打扰用户。相反,它会根据访问者与网站的互动情况为其分配风险分数。然后,网站所有者可以使用此分数来决定如何处理访问者(例如,阻止、提出挑战等)。

    • nextcaptcha solve-recaptcha-enterprise

解决 reCAPTCHA#

先决条件#

  • 请求库
    • 我们使用 python 请求作为示例代码
  • NextCaptcha客户端密钥

仪表板获取NextCaptcha客户端密钥#

注册 NextCaptcha即可立即获取免费 API 密钥和免费试用积分。

解决 reCAPTCHA 的 Python代码#

# create task
    """
        Create NextCaptcha CAPTCHA solver task.
 
        :param task: task of captcha dict.
        :param client_key: the client key form nextcaptcha dashboard.
        :param solft_id: Optional. The value of the 'solft_id'.
        :param callback_url: Optional. callback when the captcha task finish.
        :return: A dictionary containing the solution of the reCAPTCHA.
      """
    def send_task(task, client_key, solft_id, callback_url):
        HOST = "https://api.nextcaptcha.com"
        session = requests.session()
        data = {
            "clientKey": client_key,
            "solftId": solft_id,
            "callbackUrl": callback_url,
            "task": task,
        }
        resp = session.post(url=HOST + "/createTask", json=data)
        if resp.status_code != 200:
            return resp.json()
        resp = resp.json()
        task_id = resp.get("taskId")
 
        start_time = time.time()
        while True:
            if time.time() - start_time > TIMEOUT:
                return {"errorId": 12, "errorDescription": "Timeout", "status": "failed"}
 
            resp = session.post(url=HOST + "/getTaskResult",
                                     json={"clientKey": client_key, "taskId": task_id})
            if resp.status_code != 200:
                return resp.json()
            status = resp.json().get("status")
            if status == READY_STATUS:
                return resp.json()
            if status == FAILED_STATUS:
                return resp.json()
            time.sleep(1)
 

解决 reCAPTCHA v2 的 python 示例代码#

  def recaptchav2(self, website_url: str, website_key: str, recaptcha_data_s_value: str = "",
                    is_invisible: bool = False, api_domain: str = "", page_action: str = "") -> dict:
        """
        Solve reCAPTCHA v2 challenge.
 
        :param website_url: The URL of the website where the reCAPTCHA is located.
        :param website_key: The sitekey of the reCAPTCHA.
        :param recaptcha_data_s_value: Optional. The value of the 'data-s' parameter if present.
        :param is_invisible: Optional. Whether the reCAPTCHA is invisible or not.
        :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
        :return: A dictionary containing the solution of the reCAPTCHA.
        """
        task = {
            "type": "RecaptchaV2TaskProxyless",
            "websiteURL": website_url,
            "websiteKey": website_key,
            "recaptchaDataSValue": recaptcha_data_s_value,
            "isInvisible": is_invisible,
            "apiDomain": api_domain,
            "pageAction": page_action,
        }
        return send_task(task)

解决 reCAPTCHA v2 Enterprise 的 python 示例代码 #

 
    def recaptchav2enterprise(self, website_url: str, website_key: str, enterprise_payload: dict = {},
                              is_invisible: bool = False, api_domain: str = "", page_action: str = "") -> dict:
        """
        Solve reCAPTCHA v2 Enterprise challenge.
 
        :param website_url: The URL of the website where the reCAPTCHA is located.
        :param website_key: The sitekey of the reCAPTCHA.
        :param enterprise_payload: Optional. Additional enterprise payload parameters.
        :param is_invisible: Optional. Whether the reCAPTCHA is invisible or not.
        :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
        :return: A dictionary containing the solution of the reCAPTCHA.
        """
        task = {
            "type": "RecaptchaV2EnterpriseTaskProxyless",
            "websiteURL": website_url,
            "websiteKey": website_key,
            "enterprisePayload": enterprise_payload,
            "isInvisible": is_invisible,
            "apiDomain": api_domain,
            "pageAction": page_action,
        }
        return send_task(task)

解决 reCAPTCHA v3 的 python 示例代码 #

    def recaptchav3(self, website_url: str, website_key: str, page_action: str = "", api_domain: str = "",
                    proxy_type: str = "", proxy_address: str = "", proxy_port: int = 0, proxy_login: str = "",
                    proxy_password: str = "") -> dict:
        """
        Solve reCAPTCHA v3 challenge.
 
        :param website_url: The URL of the website where the reCAPTCHA is located.
        :param website_key: The sitekey of the reCAPTCHA.
        :param page_action: Optional. The action parameter to use for the reCAPTCHA.
        :param api_domain: Optional. The domain of the reCAPTCHA API if different from the default.
        :param proxy_type: Optional. The type of the proxy (HTTP, HTTPS, SOCKS4, SOCKS5).
        :param proxy_address: Optional. The address of the proxy.
        :param proxy_port: Optional. The port of the proxy.
        :param proxy_login: Optional. The login for the proxy.
        :param proxy_password: Optional. The password for the proxy.
        :return: A dictionary containing the solution of the reCAPTCHA.
        """
        task = {
            "type": "RecaptchaV3TaskProxyless",
            "websiteURL": website_url,
            "websiteKey": website_key,
            "pageAction": page_action,
            "apiDomain": api_domain,
        }
        if proxy_address:
            task["type"] = "RecaptchaV3Task"
            task["proxyType"] = proxy_type
            task["proxyAddress"] = proxy_address
            task["proxyPort"] = proxy_port
            task["proxyLogin"] = proxy_login
            task["proxyPassword"] = proxy_password
        return send_task(task)
 

解决 reCAPTCHA 移动版的 python 示例代码 #

 
    def recaptcha_mobile(self, app_key: str, app_package_name: str = "", app_action: str = "") -> dict:
        """
        Solve Mobile reCAPTCHA challenge.
 
        :param app_key: The app key of the Mobile reCAPTCHA.
        :param app_package_name: Optional. The package name of the mobile app.
        :param app_action: Optional. The action parameter to use for the Mobile reCAPTCHA.
        :return: A dictionary containing the solution of the Mobile reCAPTCHA.
        """
        task = {
            "type": "RecaptchaMobileProxyless",
            "appKey": app_key,
            "appPackageName": app_package_name,
            "appAction": app_action,
        }
        return send_task(task)
 

结论#

NextCaptcha高度维护、最新且最便宜的 ReCaptcha 移动求解器服务,稳定地提供 24/7支持
为了成功检索数据,您需要一个完全可信赖的强大工具来处理验证码提供了一个易于设置的 API,可让您克服所有反机器人挑战,您今天可以免费试用

更多的#