Selenium CAPTCHA Bypass with Tokens or Clicks? Continuing the Speed Comparison


In my previous article, Puppeteer CAPTCHA bypass: Tokens or Clicks? Let’s Break It Down (which I also published on Dev.to), I compared two CAPTCHA bypass methods (clicks and tokens) using Puppeteer. I also announced that in the next article (this one), I would conduct a practical comparison of the same methods using Selenium. This will complete the CAPTCHA bypass picture, so to speak. Well, let’s not waste time and get straight to the point.

Selenium Google CAPTCHA Bypass: Module Preparation

This time, I used modules from the same service provider, but for Python, since I drive Selenium from Python while Puppeteer is a Node.js library. I hoped to find a Python module similar to the JavaScript version, where changing a single setting is enough to switch the recognition method. However, either my technical skills fell short or such a module doesn’t yet exist for Python. Therefore, I used two different modules for the comparison: one for token-based bypass and one for click-based bypass (selenium-recaptcha-solver-using-grid).

Modifications in the Selenium CAPTCHA Bypass Module

I admit, finding a suitable module for token-based CAPTCHA bypass took some effort, as it wasn’t immediately clear which one to use. Eventually, I found a module, but it was configured by default to work with the 2captcha demo page rather than the official Google reCAPTCHA demo page. Thus, I made minor adjustments to the source code to resolve this issue.

The adjustments I made:

    # CONFIGURATION
    url = "https://www.google.com/recaptcha/api2/demo"
    apikey = os.getenv('API KEY')

    # LOCATORS
    sitekey_locator = "//div[@id='g-recaptcha']"
    submit_button_captcha_locator = "//button[@data-action='demo_action']"
    success_message_locator = "//p[contains(@class,'successMessage')]"

  • Configuration (first two lines): replace the URL with the Google demo page and supply your 2captcha API key. In this listing the key is read from an environment variable via os.getenv.

  • Locators (last three lines): XPath selectors for the sitekey element, the submit button, and the success message. They must match the page under test, and they vary between websites.

For my task, I used the following values:

    # LOCATORS
    sitekey_locator = "//div[@id='recaptcha-demo']"
    submit_button_captcha_locator = "//input[@id='recaptcha-demo-submit']"
    success_message_locator = "//div[contains(@class,'recaptcha-success')]"
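
To make the role of these locators concrete, here is a minimal sketch of what a token-based flow roughly does with them. It is not the module's actual source: the function name solve_with_token, the duck-typed driver and solver arguments, and the shape of the solver's response are my assumptions for illustration.

```python
# Values from the configuration above (Google reCAPTCHA demo page).
URL = "https://www.google.com/recaptcha/api2/demo"
SITEKEY_LOCATOR = "//div[@id='recaptcha-demo']"
SUBMIT_LOCATOR = "//input[@id='recaptcha-demo-submit']"

XPATH = "xpath"  # same string as selenium's By.XPATH; keeps the sketch import-free


def solve_with_token(driver, solver, url=URL):
    """Open the page, obtain a token from the solving service, submit the form."""
    driver.get(url)
    # reCAPTCHA v2 widgets carry their public key in the data-sitekey attribute.
    widget = driver.find_element(XPATH, SITEKEY_LOCATOR)
    sitekey = widget.get_attribute("data-sitekey")
    # Assumption: the solver client returns a dict with the token under "code".
    token = solver.recaptcha(sitekey=sitekey, url=url)["code"]
    # The form reads the answer from the hidden g-recaptcha-response textarea.
    driver.execute_script(
        "document.getElementById('g-recaptcha-response').value = arguments[0];",
        token,
    )
    driver.find_element(XPATH, SUBMIT_LOCATOR).click()
    return token
```

In a real run, driver would be a Selenium WebDriver such as webdriver.Chrome() and solver a client object from the service's Python package. The recaptcha(sitekey=..., url=...) call returning a dict with a "code" field is an assumption about that client, so check it against the module you actually use.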

Minor Modifications in the Click-Based CAPTCHA Bypass Module

The selenium-recaptcha-solver-using-grid module uses the Grid method for CAPTCHA bypass. This approach applies to images divided into a grid, where you must click on specific tiles (e.g., in reCAPTCHA V2). You can read more about the Grid method in the article: Puppeteer CAPTCHA bypass: Tokens or Clicks? Let’s Break It Down.
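
To give a feel for how a Grid answer turns into clicks, here is a small sketch. It assumes the service returns the tiles as a string like "click:1/4/9", numbered from 1, left to right and top to bottom; that format and the helper name parse_grid_answer are assumptions for illustration, not taken from the module's code.

```python
def parse_grid_answer(answer, columns):
    """Convert an answer such as 'click:1/4/9' into zero-based (row, column) pairs."""
    prefix = "click:"
    if not answer.startswith(prefix):
        raise ValueError(f"unexpected answer format: {answer!r}")
    tiles = [int(n) for n in answer[len(prefix):].split("/") if n]
    # Tile 1 is the top-left cell; numbering runs row by row.
    return [divmod(n - 1, columns) for n in tiles]


print(parse_grid_answer("click:1/4/9", columns=3))
# prints [(0, 0), (1, 0), (2, 2)]
```

Each (row, column) pair can then be mapped to the matching tile element on the page and clicked in turn.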

An interesting fact: the Grid method description on the service’s website mentions machine recognition (likely some neural network) as a feature for speeding up the process. By default, a human solves the CAPTCHA, but adding a specific parameter enables machine recognition, significantly increasing speed.

I didn’t enable machine recognition and barely modified the code, since the module worked well out of the box.

Only two lines were changed:

  1. Line 9: Replaced the demo page URL with https://2captcha.com/demo/recaptcha-v2.

  2. Line 10: Replaced APIKEY_2CAPTCHA with my own API key from the 2captcha service homepage.

No other changes were made.

Which Selenium CAPTCHA Solver Is Faster? Speed Comparison

Everything was ready, so I ran each module individually. I recorded a video demonstrating the difference in recognition speed. If you’re not inclined to watch the video (it’s only 40 seconds), the test results are summarized below.

Test Results:

  1. Google CAPTCHA bypass with tokens: 1 minute 30 seconds.

  2. Google CAPTCHA bypass with clicks: 2 minutes 30 seconds.
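
The times above were taken from the recorded runs. If you want to reproduce the measurement in code, a minimal sketch of a timing wrapper could look like this; time_solver and the stand-in solver are hypothetical names, not part of either module.

```python
import time


def time_solver(solve, *args, **kwargs):
    """Run solve(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic clock, suited to measuring durations
    result = solve(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_solver(label):
    # Stand-in for a real solving routine, so the sketch runs anywhere.
    time.sleep(0.01)
    return f"solved:{label}"


result, elapsed = time_solver(fake_solver, "demo")
print(f"{result} in {elapsed:.2f} s")
```

Wrapping each module's entry point this way gives comparable wall-clock figures, including the time spent waiting for the service.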

Final Comparison of Selenium CAPTCHA Solvers:

To better illustrate, let’s convert this into the number of CAPTCHAs solved and time saved per day (assuming a single-threaded process running continuously):

  • Token method: Up to 960 CAPTCHAs per day.

  • Click method: Up to 576 CAPTCHAs per day.

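  • Time saved: 1 minute per CAPTCHA. The 576 CAPTCHAs that the click method handles in a day would take about 14.4 hours with tokens, a saving of roughly 9.6 hours per day.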
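
The arithmetic behind these figures can be checked with a few lines. The daily counts follow directly from the measured times; the "time saved" value depends on how the saving is defined, and the sketch assumes it means the hours freed when tokens handle the click method's daily volume. Other definitions give other figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60

TOKEN_SECONDS = 90   # 1 minute 30 seconds per CAPTCHA
CLICK_SECONDS = 150  # 2 minutes 30 seconds per CAPTCHA


def captchas_per_day(seconds_per_captcha):
    """Whole CAPTCHAs a single thread can solve in 24 hours."""
    return SECONDS_PER_DAY // seconds_per_captcha


token_daily = captchas_per_day(TOKEN_SECONDS)
click_daily = captchas_per_day(CLICK_SECONDS)

# Assumed definition: hours tokens need for the click method's daily volume.
hours_for_same_volume = click_daily * TOKEN_SECONDS / 3600
hours_saved = 24 - hours_for_same_volume

print(f"token: {token_daily}/day, click: {click_daily}/day")
print(f"hours saved on {click_daily} CAPTCHAs: {hours_saved:.1f}")
```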

Conclusion:

In a similar comparison using Puppeteer, the token method also outperformed clicks. Moreover, click recognition with Puppeteer was embarrassingly slow, taking over 4 minutes.

The choice is yours: tokens or clicks?


Link to the original article: https://habr.com/ru/articles/861974/

