How to Use VBA to Pull Excel Data from Hyperlinks

In this article, we will look at how to extract data from a website using VBA. For this we will be using the navigate method on the InternetExplorer object. Let us see this in detail.

Example 1:

Say, you want to pull the current temperature of the city of Vancouver from this site. Before we begin our coding, we need to know the tag element that contains the required data in the web site.

To do that, navigate to the site in the browser. Right click on the data to be retrieved and click on inspect element.

The element will be highlighted in the Developer Tools window that will be opened.

So, we basically want to extract the inner text of a “p” tag with class “temperature”. This can be done as follows.

First of all, we need to make sure that the required libraries are included in your VBA project, namely, Microsoft HTML Object Library and Microsoft Internet Controls.

Now, let’s start with the code. Create a new InternetExplorer object using

Dim IE As New InternetExplorer

Specify the URL from which to extract data and navigate to it using

url = "https://www.theweathernetwork.com/ca/weather/british-columbia/vancouver"

IE.navigate url

 

For troubleshooting purpose, you can make the browser window visible by using:

 IE.Visible = True

After navigating to a URL, you need to wait till the web page is entirely loaded. This can be done by using

 Do
 DoEvents
 Loop Until IE.readyState = READYSTATE_COMPLETE

(After the webpage is completely loaded, the readyState property of the InternetExplorer object is set to READYSTATE_COMPLETE)

Next, you save the contents of the webpage loaded using the document property

  Dim Doc As HTMLDocument

Set Doc = IE.document

 

You can then extract all elements of a particular type from the HTML document using

 Dim tagElements As Object

 Set tagElements = Doc.all.tags("p")

To loop through all these “p” tag elements

 Dim element As Object

For Each element In tagElements

'Code to pull temperature goes here

Next

From all the p tags, we are looking for the tag that contains temperature (°C) and has class name “temperature”. For this we use the element.innerText and element.className properties as follows

 If InStr(element.innerText, "°C") And InStr(element.className, "temperature") Then
 Sheet1.Cells(1, 1).Value = element.innerText
 End If

Finally, we do the cleanup using

 IE.Quit
 Set IE = Nothing

Here is the entire thing put together

Sub extractVancouverData()

Dim IE As New InternetExplorer
 Dim url As String
 Dim item As HTMLHtmlElement
 Dim Doc As HTMLDocument
 Dim tagElements As Object
 Dim element As Object
 Dim lastRow
 
 Application.ScreenUpdating = False
 Application.DisplayAlerts = False
 Application.EnableEvents = False
 Application.Calculation = xlCalculationManual
 
 url = "https://www.theweathernetwork.com/ca/weather/british-columbia/vancouver"

IE.navigate url
 
 IE.Visible = True
 
 Do
 DoEvents
 Loop Until IE.readyState = READYSTATE_COMPLETE

Set Doc = IE.document

lastRow = Sheet1.UsedRange.Rows.Count + 1
 
 Set tagElements = Doc.all.tags("p")
 For Each element In tagElements

  If InStr(element.innerText, "°C") > 0 And InStr(element.className, "temperature") > 0 Then
    Sheet1.Cells(lastRow, 1).Value = element.innerText
    ' Exit the for loop once you get the temperature to avoid unnecessary processing
    Exit For
  End If
 Next

IE.Quit
 Set IE = Nothing
 
 Application.ScreenUpdating = True
 Application.DisplayAlerts = True
 Application.EnableEvents = True
 Application.Calculation = xlCalculationAutomatic

End Sub

You can also extract all elements from the HTMLDocument by the element’s ID, className, Name or tagName using the respective method as shown below

Set tagElements = Doc.getElementsByClassName("temperature")

For Each element In tagElements

If InStr(element.innerText, "°C") Then
Sheet1.Cells(lastRow, 1).Value = element.innerText
End If

Next

 

Example 2:

Many times while extracting data from a website, you need to go to a page and click on a link as you cannot directly access the results using a URL. For this you can use the element.Click method. Say you click on the link “Excel” on the home page of this site. This is how you can do it

Sub clickOnLink()

Dim IE As New InternetExplorer
Dim str As String
Dim Doc As HTMLDocument
Dim tagElements As Object
Dim element As HTMLObjectElement

str = "https://software-solutions-online.com/"

IE.navigate str
IE.Visible = True

Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE

Set Doc = IE.document

'First get all the links

Set tagElements = Doc.all.tags("a")

' Loop through all the links
For Each element In tagElements

' Look for the link that contains the text Excel
If element.innerText = "Excel" Then

'Click on the link
element.Click

Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE

Exit For
End If
Next

If you run this code with the IE visible, you will see that the code clicks on the Excel link. Now you can extract the required data using the methods described above.

If you’re interested in doing something similar via recording a macro, please have a look at this article and for using a query table check out this one.

 

Leave a Reply

Your email address will not be published. Required fields are marked *