{"id":440249,"date":"2024-11-29T15:01:42","date_gmt":"2024-11-29T15:01:42","guid":{"rendered":"http:\/\/savepearlharbor.com\/?p=440249"},"modified":"-0001-11-30T00:00:00","modified_gmt":"-0001-11-29T21:00:00","slug":"","status":"publish","type":"post","link":"https:\/\/savepearlharbor.com\/?p=440249","title":{"rendered":"<span>Selenium CAPTCHA Bypass with Tokens or Clicks? Continuing the Speed Comparison<\/span>"},"content":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<p>In my previous article, <a href=\"https:\/\/medium.com\/@koshka00009\/captcha-solving-with-puppeteer-tokens-or-clicks-lets-break-it-down-d8cf5988f2b7\" rel=\"noopener noreferrer nofollow\"><em>Puppeteer\u00a0CAPTCHA bypass: Tokens or Clicks? Let\u2019s Break It Down<\/em><\/a> (which I also published on <a href=\"https:\/\/dev.to\/markus009\/tools-for-automation-why-token-based-captcha-solving-wins-over-clicks-30c1\" rel=\"noopener noreferrer nofollow\">Dev.to<\/a>), I compared two CAPTCHA bypass methods (clicks and tokens) using Puppeteer. I also announced that in the next article (this one), I would conduct a practical comparison of the same methods using Selenium. This will complete the CAPTCHA bypass picture, so to speak. 
Well, let\u2019s not waste time and get straight to the point.<\/p>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w780q1\/getpro\/habr\/upload_files\/104\/221\/03e\/10422103e1aed78321c217d9cc185b7f.jpg\" width=\"666\" height=\"500\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/104\/221\/03e\/10422103e1aed78321c217d9cc185b7f.jpg\" data-blurred=\"true\"\/><\/figure>\n<h3>Selenium Google CAPTCHA Bypass: Module Preparation<\/h3>\n<p>This time, I used modules from the same service provider but for Python (Puppeteer is a Node.js library, whereas I drive Selenium from Python here). I hoped to find a Python module similar to the JavaScript version, where simply changing a setting would suffice to switch the recognition method. However, either my technical skills fell short, or such a module doesn\u2019t yet exist for Python. Therefore, I used two different modules for the comparison:<\/p>\n<ul>\n<li>\n<p><strong>Module for Google reCAPTCHA Bypass Using Tokens<\/strong>: <a href=\"https:\/\/github.com\/2captcha\/captcha-solver-selenium-python-examples\/blob\/main\/examples\/reCAPTCHA\/recaptcha_v2.py\" rel=\"noopener noreferrer nofollow\">recaptcha_v2<\/a>.<\/p>\n<\/li>\n<li>\n<p><strong>Module for Google reCAPTCHA Bypass Using Clicks<\/strong>: <a href=\"https:\/\/github.com\/2captcha\/selenium-recaptcha-solver-using-grid\" rel=\"noopener noreferrer nofollow\">selenium-recaptcha-solver-using-grid<\/a>.<\/p>\n<\/li>\n<\/ul>\n<h4>Modifications in the Token-Based CAPTCHA Bypass Module<\/h4>\n<p>I admit, finding a suitable module for token-based CAPTCHA bypass took some effort, as it wasn\u2019t immediately clear which one to use. Eventually, I found a module, but it was configured by default to work with the 2captcha demo page rather than the official Google reCAPTCHA demo page. 
Thus, I made minor adjustments to the source code to resolve this issue.<\/p>\n<h3>The adjustments I made:<\/h3>\n<pre><code class=\"python\"># CONFIGURATION\nurl = \"https:\/\/www.google.com\/recaptcha\/api2\/demo\"\napikey = os.getenv('API KEY')  # placeholder: name of the environment variable that stores your 2captcha API key\n\n# LOCATORS\nsitekey_locator = \"\/\/div[@id='g-recaptcha']\"\nsubmit_button_captcha_locator = \"\/\/button[@data-action='demo_action']\"\nsuccess_message_locator = \"\/\/p[contains(@class,'successMessage')]\"\n<\/code><\/pre>\n<ul>\n<li>\n<p><strong>Configuration (first two lines)<\/strong>: Simple enough: replace the URL with the Google demo page and supply your 2captcha API key through the environment variable read by <code>os.getenv<\/code>.<\/p>\n<\/li>\n<li>\n<p><strong>Locators (last three lines)<\/strong>: These must be element selectors that match the target page, here the Google reCAPTCHA demo page. Keep in mind that selectors vary between websites.<\/p>\n<\/li>\n<\/ul>\n<p>For my task, I used the following values:<\/p>\n<pre><code class=\"python\"># LOCATORS\nsitekey_locator = \"\/\/div[@id='recaptcha-demo']\"\nsubmit_button_captcha_locator = \"\/\/input[@id='recaptcha-demo-submit']\"\nsuccess_message_locator = \"\/\/div[contains(@class,'recaptcha-success')]\"<\/code><\/pre>\n<h4>Minor Modifications in the Click-Based CAPTCHA Bypass Module<\/h4>\n<p>The <code>selenium-recaptcha-solver-using-grid<\/code> module uses the Grid method for CAPTCHA bypass. This approach applies to images divided into a grid, where you must click on specific tiles (e.g., in reCAPTCHA v2). You can read more about the Grid method in the article: <a href=\"https:\/\/medium.com\/@koshka00009\/captcha-solving-with-puppeteer-tokens-or-clicks-lets-break-it-down-d8cf5988f2b7\" rel=\"noopener noreferrer nofollow\">Puppeteer\u00a0CAPTCHA bypass: Tokens or Clicks? Let\u2019s Break It Down<\/a>.<\/p>\n<p>An interesting fact: the Grid method description on the service&#8217;s website mentions machine recognition (likely some neural network) as a feature for speeding up the process. 
By default, a human solves the CAPTCHA, but adding a specific parameter enables machine recognition, which should significantly increase speed.<\/p>\n<p>I didn\u2019t enable machine recognition and barely modified the code since the module worked well out of the box.<\/p>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w780q1\/getpro\/habr\/upload_files\/a6e\/90f\/5b1\/a6e90f5b13bbadd527581f2b89455468.jpg\" width=\"581\" height=\"430\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/a6e\/90f\/5b1\/a6e90f5b13bbadd527581f2b89455468.jpg\" data-blurred=\"true\"\/><\/figure>\n<h3>Only two lines were changed:<\/h3>\n<ol>\n<li>\n<p><strong>Line 9<\/strong>: Replaced the demo page URL with <a href=\"https:\/\/2captcha.com\/demo\/recaptcha-v2\" rel=\"noopener noreferrer nofollow\">https:\/\/2captcha.com\/demo\/recaptcha-v2<\/a>.<\/p>\n<\/li>\n<li>\n<p><strong>Line 10<\/strong>: Replaced <code>APIKEY_2CAPTCHA<\/code> with my own API key from the 2captcha service homepage.<\/p>\n<\/li>\n<\/ol>\n<p>No other changes were made.<\/p>\n<h3>Which Selenium CAPTCHA Solver Is Faster? Speed Comparison<\/h3>\n<p>Everything was ready, so I ran each module individually. I recorded a video demonstrating the difference in recognition speed. 
If you\u2019re not inclined to watch the video (it\u2019s only 40 seconds), the test results are summarized below.<\/p>\n<div class=\"tm-iframe_temp\" data-src=\"https:\/\/embedd.srv.habr.com\/iframe\/674821c1f3c34e13a237f4da\" data-style=\"\" id=\"674821c1f3c34e13a237f4da\" width=\"\"><\/div>\n<h3>Test Results:<\/h3>\n<ol>\n<li>\n<p><strong>Google CAPTCHA bypass with tokens<\/strong>: 1 minute 30 seconds.<\/p>\n<\/li>\n<li>\n<p><strong>Google CAPTCHA bypass with clicks<\/strong>: 2 minutes 30 seconds.<\/p>\n<\/li>\n<\/ol>\n<h3>Final Comparison of Selenium CAPTCHA Solvers:<\/h3>\n<p>To put these times in perspective, let\u2019s convert them into the number of CAPTCHAs solved and time saved per day (assuming a single-threaded process running continuously, i.e. 86,400 seconds per day):<\/p>\n<ul>\n<li>\n<p><strong>Token method<\/strong>: Up to <strong>960 CAPTCHAs per day<\/strong> (86,400 s \/ 90 s).<\/p>\n<\/li>\n<li>\n<p><strong>Click method<\/strong>: Up to <strong>576 CAPTCHAs per day<\/strong> (86,400 s \/ 150 s).<\/p>\n<\/li>\n<li>\n<p><strong>Time saved<\/strong>: Approximately <strong>6.5 hours per day<\/strong> with the token method.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w780q1\/getpro\/habr\/upload_files\/a7c\/cbe\/3b0\/a7ccbe3b0122131e8f315fca77704f6b.jpg\" width=\"752\" height=\"500\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/a7c\/cbe\/3b0\/a7ccbe3b0122131e8f315fca77704f6b.jpg\" data-blurred=\"true\"\/><\/figure>\n<h3>Conclusion:<\/h3>\n<p>In a similar comparison using Puppeteer, the token method also outperformed clicks. 
Moreover, click recognition with Puppeteer was embarrassingly slow, taking over 4 minutes.<\/p>\n<p><strong>The choice is yours: tokens or clicks?<\/strong><\/p>\n<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><!----><!----><\/div>\n<p><!----><!----><br \/> Link to the original article: <a href=\"https:\/\/habr.com\/ru\/articles\/861974\/\"> https:\/\/habr.com\/ru\/articles\/861974\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<p>In my previous article, <a href=\"https:\/\/medium.com\/@koshka00009\/captcha-solving-with-puppeteer-tokens-or-clicks-lets-break-it-down-d8cf5988f2b7\" rel=\"noopener noreferrer nofollow\"><em>Puppeteer\u00a0CAPTCHA bypass: Tokens or Clicks? Let\u2019s Break It Down<\/em><\/a> (which I also published on <a href=\"https:\/\/dev.to\/markus009\/tools-for-automation-why-token-based-captcha-solving-wins-over-clicks-30c1\" rel=\"noopener noreferrer nofollow\">Dev.to<\/a>), I compared two CAPTCHA bypass methods (clicks and tokens) using Puppeteer. I also announced that in the next article (this one), I would conduct a practical comparison of the same methods using Selenium. This will complete the CAPTCHA bypass picture, so to speak. Well, let\u2019s not waste time and get straight to the point.<\/p>\n<figure class=\"full-width\"><\/figure>\n<h3>Selenium Google CAPTCHA Bypass: Module Preparation<\/h3>\n<p>This time, I used modules from the same service provider but for Python (Puppeteer is a Node.js library, whereas I drive Selenium from Python here). I hoped to find a Python module similar to the JavaScript version, where simply changing a setting would suffice to switch the recognition method. 
However, either my technical skills fell short, or such a module doesn\u2019t yet exist for Python.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-440249","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/440249","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=440249"}],"version-history":[{"count":0,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/440249\/revisions"}],"wp:attachment":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=440249"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=440249"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=440249"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}