How to Search a PDF for a String Using VBA for Excel

In this article we will see how you can search for a string in a PDF file and highlight it using Excel VBA. We will look at two examples, one that works with Acrobat Reader and a second that should be used if you have Acrobat Pro licensed version.

Example 1: Using SendKeys in Acrobat Reader

In this we will simply open the PDF file specified, using FollowHyperlink. Then we will use the SendKeys function to first invoke search (using Ctrl + F), type the search string and send “enter” key to perform the actual search operation. Here is the complete code. You can follow along the comments in the code.

Sub SearchStringInPDF()

    Dim PDF_path As String
    Dim searchString As String
    Dim waitTime As Date
    
    Application.DisplayAlerts = False

    'Make sure to provide a complete path to the PDF here
    'if your Excel and PDF are in separate folders
    PDF_path = "search.pdf"

    'Your search string - case-insensitive, as in actual PDF search
    searchString = "sample"

    'Open the pdf file
    ThisWorkbook.FollowHyperlink PDF_path

    'Wait time for a couple of seconds for the PDF file to fully load
    waitTime = Now
    While DateDiff("s", waitTime, Now) > 2
        DoEvents
    Wend

    'Send "CTRL + F" keys to the opened PDF to invoke find within that file
    'The second argument "true" implies that Excel will wait for the keys to be processed 
    'before returning control to the macro.
    SendKeys "^f", True
    
    'Type the search string in the find box that opens
    SendKeys searchString, True
    
    'Hit enter to perform the search
    SendKeys "{Enter}", True

    Application.DisplayAlerts = True

End Sub

 

When you run this code, the PDF file will get focus and search string will be highlighted, like this

Note that SendKeys can be unreliable mainly because it literally sends the command (keys) to the active window. So, while the macro is running if you accidentally make another window active, it will not give the desired results. It is not always possible to guarantee that the active window is the intended window.

The better solution is always to go with the API approach as you can see in the next example. API uses an object to identify and target a specific window. So, when you use objects you can be sure you are interacting with the correct window regardless of whether it is active or inactive.

Example 2: Using Adobe Application object for Acrobat Pro

If you have the Acrobat Pro licensed version, you should go with the code below. First we will create the required objects, open the PDF, bring it to the front and then search for the string using the FindText function

Sub searchUsingAcrobatPro()

    Dim searchString As String
    Dim PDF_path As String
    Dim appObj As Object, AVDocObj As Object
               
    searchString = "PDF"
    PDF_path = "search.pdf"
   
    'Check if the file exists.
    If Dir(PDF_path) = "" Then
        MsgBox "File not found..."
        Exit Sub
    End If
   
    On Error Resume Next
    
    'Create Adobe Application object.
    Set appObj = CreateObject("AcroExch.App")
    
    'Check for any errors.
    If Err.Number <> 0 Then
        MsgBox "Error in creating the Adobe Application object..."
        Set appObj = Nothing
        Exit Sub
    End If
    
    'Create the AVDoc object.
    Set AVDocObj = CreateObject("AcroExch.AVDoc")
    
    'Check for any errors.
    If Err.Number <> 0 Then
        MsgBox "Error in creating the AVDoc object..."
        Set AVDocObj = Nothing
        Set appObj = Nothing
        Exit Sub
    End If
    
    On Error GoTo 0
    
    'Open the PDF file and check if the open was successful.
    If AVDocObj.Open(PDF_path, "") = True Then
        
        'Bring the PDF file to the front.
        AVDocObj.BringToFront
        
        'Search for the string and check if the the string was found.
        'If text is found, it will be highlighted (PDF is already in focus)
        If AVDocObj.findtext(searchString, False, False, False) = False Then
            
            'If text was not found, close the PDF file and perform clean-up
            AVDocObj.Close True
            appObj.Exit
               
            'Release the objects.
            Set AVDocObj = Nothing
            Set appObj = Nothing
            
            MsgBox "The string not found in the PDF file..."
            
        End If
        
    Else
        'PDF file failed to open
        appObj.Exit

        'Release the objects.
        Set AVDocObj = Nothing
        Set appObj = Nothing
        
        MsgBox "Could not open the PDF file..."
        
    End If
    
End Sub

 

The text will again be highlighted like before. However, here you will not see the search text box on the top right corner.

Let us have a closer look at the FindText function now. The syntax of the function is given below

FindText(szText As String, bCaseSensitive As Long, bWholeWordsOnly As Long, bReset As Long) As Boolean

szText: Text to search

bCaseSensitive: Whether the search should be case sensitive (True is case-sensitive)

bWholeWordsOnly: Match the entire word or even a part of a word (True is match whole word)

bReset: If a positive number, the search begins on the first page of the document. If 0, it
begins on the current page.

Return value: The function returns true if the specified string was found. Else false is returned

The Adobe API contains many more useful functions that you can use to interact with a PDF file using Excel VBA. A comprehensive document is available on the Abode site

This is a more efficient and reliable way to search a PDF file and it’s recommended you use it.

Read more on Excel text and strings here:

  1. VBA Excel String Processing and Manipulation
  2. Excel Formulas and Function, Finding Text Within Another Text, Find()

Leave a Reply

Your email address will not be published. Required fields are marked *