My Thoughts: Automating (Breaking) captcha using Selenium WebDriver

Monday, November 26

Automating (Breaking) captcha using Selenium WebDriver

Most companies use either their own captchas or a third-party one (Google, jQuery plugins) on the user registration pages of their sites, so these pages can't be fully automated. In fact, a captcha is implemented precisely to prevent automation. As the official captcha site puts it:

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot.

Captchas are generally not breakable, but some third-party captchas are, and one example is the "jQuery Real Person" captcha. Here is the documentation :)

Vamshi Kurra- Real Person Captcha


Here is the sample code to break the "jQuery Real Person" captcha using Selenium WebDriver.

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class CaptchaAutomation {

    private WebDriver driver;

    @BeforeTest
    public void start() {
        // Launch a fresh Firefox session before the test runs
        driver = new FirefoxDriver();
    }

    @Test
    public void breakRealPersonCaptcha() {
        // Load the jQuery Real Person captcha demonstration page
        driver.get("http://keith-wood.name/realPerson.html");
        JavascriptExecutor js = (JavascriptExecutor) driver;
        // Overwrite the hidden hash field with the hash of a known captcha text
        js.executeScript("document.getElementsByName('defaultRealHash')[0].setAttribute('value', '-897204064')");
        // Type the captcha text that matches the hash set above
        driver.findElement(By.name("defaultReal")).sendKeys("QNXCUL");
        // Submit the form
        driver.findElement(By.xpath(".//*[@id='default']/form/p[2]/input")).submit();
    }

}
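Where does the pair "-897204064" / "QNXCUL" come from? As far as I can tell, the version of the plugin on the demo page derives the hidden hash from the captcha text with a simple djb2-style string hash, so a matching pair can be computed for any text. Below is a minimal sketch of that hash in Java. The class name RealPersonHash and the method rpHash are hypothetical names used only for illustration, and the algorithm itself is an assumption based on the plugin's demo, so check it against the page you are testing (other versions may hash differently).

```java
// Sketch only: assumes the Real Person plugin hashes the captcha text with a
// djb2-style hash (start at 5381, then hash = hash * 33 + character code).
final class RealPersonHash {

    // Hypothetical helper: returns the value expected in the hidden hash field
    // for the given captcha text (assumed to be upper-case letters).
    static long rpHash(String captchaText) {
        long hash = 5381;
        for (int i = 0; i < captchaText.length(); i++) {
            // In JavaScript, << works on 32-bit integers but + does not wrap,
            // so only the shifted term is truncated to int here.
            hash = (((int) hash) << 5) + hash + captchaText.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // The pair used in the test above
        System.out.println(rpHash("QNXCUL")); // prints -897204064
    }
}
```

If this assumption holds for your page, you can pick any text, compute its hash, and use the two values in place of the hard-coded pair.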

Do share with me any captcha plugins that are breakable :P

31 comments:

  1. This code is working fine here... but when I tried it on another portal with the same captcha, an "invalid captcha" error is displayed...

  2. reCAPTCHA: is there any option to break it??

  3. Ya, thanks mate... it was a problem of degradation of FF.... it's working now

  4. I am not sure, but try updating the jars. I am using Selenium 2.2 and Firefox 20.0. (Even the TestNG plugin is up to date.)

  5. I tried but I'm getting the following error:

    FAILED CONFIGURATION: @BeforeTest start

    java.lang.NoClassDefFoundError: org/openqa/selenium/internal/TemporaryFilesystem..

    I tried to sort it out... but I'm unable to rectify it... Can you please suggest something..??

  6. I tried with Firefox but it is throwing an error.... is this working for Firefox??

  7. Thanks.. It was really helpful. But the link which I shared with you a few days back doesn't have any key-value pair. So how to break this captcha??

  8. Vinay,

    Right now I don't have an answer to your question. I am not even sure there is a way. We have to explore a little bit.

  9. The jQuery Real Person captcha is a kind of "key-value" pair. The key is the hidden element named "defaultRealHash", whereas the value is the text shown in the captcha image. Here is the documentation: http://www.exploit-db.com/exploits/18167/

    I used the "key - value" pair "-897204064 - QNXCUL".

    Not all captchas are breakable.

  10. js.executeScript("document.getElementsByName('defaultRealHash')[0].setAttribute('value', '-897204064')");
    driver.findElement(By.name("defaultReal")).sendKeys("QNXCUL");

    Can you guide me on how you came up with these two lines of code?? Are all captcha scenarios breakable using Selenium WebDriver??

  11. If we manually break the captcha on the "windows live login" page, then we can easily automate it.

    Once I find a method, I will share it with you. Have a good day..!!

  12. CAPTCHA is an acronym: Completely Automated Public Turing test to tell Computers and Humans Apart. Computers can solve them, but they are much less efficient than humans. Captchas which use actual words are "very easy" to break, as are clutter captchas. I believe reCAPTCHA uses scans from books which the highest quality text interpretation software cannot interpret, and since it is unlikely that someone would write a better program for the sole sake of trying many password attempts or whatever, this is relatively secure. Google has a team dedicated to breaking their own captchas, and those people are pretty smart.

  13. https://signup.live.com/signup.aspx?ru=http%3a%2f%2fmail.live.com%2f%3frru%3dinbox&wa=wsignin1.0&rpsnv=10&ct=1242258428&rver=5%20.5.4177.0&wp=MBI&wreply=http:%2F%2Fmail.live.com%2Fdefault.aspx&lc=2057&id=64855&mkt=en-GB&bk=1242258418&rollrs=12&lic=1 This is the link which has a captcha at the end. How to break this??? Your post is fantastic. I executed it and it was like magic. Can you please send the code to break the captcha of the link given above?? Thanks!!

  14. Check the source code of the portal site you are referring to, get the correct key-value pairs for the captcha, and replace them in the code.

  15. Ya, tried it... changed the key-value pairs and entered new ones... but it is not working...

  16. Which site are you referring to?

  17. Not a public site... internal sites we use..

  18. Hi Vamshi... You were using document.getElementsByName('defaultRealHash')[0].. what does that array function represent...

  19. It is JavaScript code.

    JavaScript has no getElementByName method, so I used getElementsByName, which returns a list of all elements with that name, and [0] picks the first one. The code finds the element named "defaultRealHash" and sets its value to "-897204064".

  20. Hey, I am using Selenium for test automation. When I use the above code to remove the readonly property of a textbox, it gives an error with stack trace information. What should I do?

  21. I did not get what exactly you are doing here. Can you please paste your sample code here?

  22. Hi Vamshi..
    But the captcha values keep changing every time you refresh the page. This particular script will not work the second time. For example, if we have a registration page which uses online captchas, how will you handle that during multiple registrations?.. We can't keep changing the script every time.

  23. Hi Shanti,

    This script will work every time, as long as the registration page uses the "Real Person" captcha. Using JavaScript, we always overwrite the captcha values with our own.

  24. Hi Aditya,

    reCAPTCHA automation is not possible (at least for now :P)

  25. I also used the same code but found that the following error appeared in the result:

    document.getElementsByName("defaultRealHash")[0] is undefined
    Command duration or timeout: 93 milliseconds
    Build info: version: '2.35.0', revision: '8df0c6bedf70ff9f22c647788f9fe9c8d22210e2', time: '2013-08-17 12:46:41'

    I also observed that even though I passed document.getElementsByName('defaultRealHash')[0] with single quotes, the error message shows document.getElementsByName("defaultRealHash")[0], with "defaultRealHash" in double quotes.

  26. hmm...seems it will work.

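Several comments above report an "invalid captcha" message or a "...[0] is undefined" error on other pages. One thing worth checking is whether the hidden field has a different name there. Here is a minimal sketch of a helper that builds the script for any field name and fails with a clearer message when the field is missing. RealPersonScript and setHashScript are hypothetical names, and the "<name>Hash" naming is only an assumption taken from the demo page (defaultReal / defaultRealHash), so confirm the real names in the page source.

```java
final class RealPersonScript {

    // Hypothetical helper: builds the JavaScript to pass to executeScript.
    // Assumes the hidden field is named "<inputName>Hash", as on the demo page.
    static String setHashScript(String inputName, long hash) {
        String hashField = inputName + "Hash";
        return "var fields = document.getElementsByName('" + hashField + "');"
             + "if (fields.length === 0) {"
             + "  throw new Error('No element named " + hashField + " on this page');"
             + "}"
             + "fields[0].setAttribute('value', '" + hash + "');";
    }

    public static void main(String[] args) {
        // Prints the script for the demo page's field and the pair from the post
        System.out.println(setHashScript("defaultReal", -897204064L));
    }
}
```

The returned string can be passed to js.executeScript(...) in place of the hard-coded script from the post.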