Verifying PDF content is also part of testing, but WebDriver (Selenium 2) has no direct methods for it.
If you would like to extract PDF content, you can use the Apache PDFBox API.
Download the JAR files and add them to your Eclipse classpath. Then you are ready to extract text from a PDF file :)
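Before running anything against a PDF, it can help to confirm that the JARs really ended up on the classpath. The following is only a minimal sketch: `ClasspathCheck` and `isOnClasspath` are hypothetical names made up for this illustration (they are not part of PDFBox or Selenium), and the class name being looked up is the one imported by the sample script.

```java
// Hypothetical helper for checking the classpath setup; not part of PDFBox.
class ClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the imports of the sample script.
        String parser = "org.apache.pdfbox.pdfparser.PDFParser";
        System.out.println(parser + " on classpath: " + isOnClasspath(parser));
    }
}
```

If this reports that the class is not found, the PDFBox JAR is most likely missing from the project's build path.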
Here is a sample script that extracts text from the PDF file below:
http://www.votigo.com/pdf/corp/CASE_STUDY_EarthBox.pdf
import java.io.BufferedInputStream;
import java.io.IOException;
import java.net.URL;
import java.util.concurrent.TimeUnit;

import org.apache.pdfbox.pdfparser.PDFParser;
import org.apache.pdfbox.pdmodel.PDDocument;
// Written against the PDFBox 1.x API; in PDFBox 2.x PDFTextStripper moved to org.apache.pdfbox.text.
import org.apache.pdfbox.util.PDFTextStripper;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.Reporter;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class ReadPdfFile {

    WebDriver driver;

    @BeforeTest
    public void setUpDriver() {
        driver = new FirefoxDriver();
        Reporter.log("I am done");
    }

    @Test
    public void start() throws IOException {
        // Open the PDF in the browser, then read the same URL as a stream.
        driver.get("http://www.votigo.com/pdf/corp/CASE_STUDY_EarthBox.pdf");
        driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
        URL url = new URL(driver.getCurrentUrl());
        BufferedInputStream fileToParse = new BufferedInputStream(url.openStream());
        try {
            // parse() reads the stream and populates the COSDocument object,
            // which is the in-memory representation of the PDF document.
            PDFParser parser = new PDFParser(fileToParse);
            parser.parse();

            // getPDDocument() returns the PD document that was parsed.
            // When you are done with it, you must call close() to release resources.
            PDDocument document = parser.getPDDocument();
            try {
                // PDFTextStripper takes a PDF document and strips out all of the text,
                // ignoring the formatting.
                String output = new PDFTextStripper().getText(document);
                System.out.println(output);
            } finally {
                document.close();
            }
        } finally {
            fileToParse.close();
        }
    }
}

Here is the output of the above program:
EarthBox a Day Giveaway
Objectives
EarthBox wanted to engage their Facebook audience with an Earth Day promotion that would also increase their Facebook likes. They needed a simple solution that would allow them to create a sweepstakes application themselves.
Solution
EarthBox utilized the Votigo platform to create a like- gated sweepstakes. Utilizing a theme and uploading a custom graphic they were able to create a branded promotion.
Details
• 1 prize awarded each day for the entire Month of April
• A grand prize given away on Earth Day
• Daily winner announcements on Facebook
• Promoted through email newsletter blast
Results (4 weeks)
• 6,550 entries
Facebook
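
The script only prints the extracted text. To actually verify the PDF content, the text can be compared against the phrases you expect. Below is a rough sketch of one way to do that in plain Java. `PdfTextCheck`, `normalize` and `missingPhrases` are hypothetical names invented for this example (not PDFBox or Selenium APIs), and the sample string merely stands in for the value returned by `PDFTextStripper.getText(...)`. Whitespace is collapsed first because extracted text usually contains line breaks wherever the PDF layout wraps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for checking extracted PDF text; not part of PDFBox or Selenium.
class PdfTextCheck {

    // Collapse every run of whitespace (spaces, tabs, line breaks) into a single space.
    static String normalize(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    // Return the expected phrases that do NOT occur in the extracted text.
    static List<String> missingPhrases(String extracted, String... expected) {
        String haystack = normalize(extracted);
        List<String> missing = new ArrayList<>();
        for (String phrase : expected) {
            if (!haystack.contains(normalize(phrase))) {
                missing.add(phrase);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by PDFTextStripper.getText(...).
        String output = "EarthBox a Day Giveaway\nObjectives\n"
                + "Results (4 weeks)\n6,550 entries\n";

        List<String> missing = missingPhrases(output,
                "EarthBox a Day Giveaway", "6,550 entries");
        if (!missing.isEmpty()) {
            throw new AssertionError("Missing from PDF text: " + missing);
        }
        System.out.println("All expected phrases found");
    }
}
```

Inside the TestNG test, the same check could be written as `Assert.assertTrue(missing.isEmpty(), "Missing from PDF text: " + missing)` in place of the `System.out.println(output)` call.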