How to Use VBA to Pull Excel Data from Hyperlinks

In this article, we will look at how to extract data from a website using VBA. For this we will be using the navigate method on the InternetExplorer object. Let us see this in detail.

Example 1:

Say, you want to pull the current temperature of the city of Vancouver from this site. Before we begin our coding, we need to know the tag element that contains the required data in the web site.

To do that, navigate to the site in the browser. Right click on the data to be retrieved and click on inspect element.

The element will be highlighted in the Developer Tools window that will be opened.

So, we basically want to extract the inner text of a “p” tag with class “temperature”. This can be done as follows.

First of all, we need to make sure that the required libraries are included in your VBA project, namely, Microsoft HTML Object Library and Microsoft Internet Controls.

Now, let’s start with the code. Create a new InternetExplorer object using

Dim IE As New InternetExplorer

Specify the URL from which to extract data and navigate to it using

url = "https://www.theweathernetwork.com/ca/weather/british-columbia/vancouver"
IE.navigate url

 

For troubleshooting purpose, you can make the browser window visible by using:

IE.Visible = True

After navigating to a URL, you need to wait till the web page is entirely loaded. This can be done by using

Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE

(After the webpage is completely loaded, the readyState property of the InternetExplorer object is set to READYSTATE_COMPLETE)

Next, you save the contents of the webpage loaded using the document property

Dim Doc As HTMLDocument
Set Doc = IE.document

 

You can then extract all elements of a particular type from the HTML document using

Dim tagElements As Object
Set tagElements = Doc.all.tags("p")

To loop through all these “p” tag elements

Dim element As Object
For Each element In tagElements
'Code to pull temperature goes here
Next

From all the p tags, we are looking for the tag that contains temperature (°C) and has class name “temperature”. For this we use the element.innerText and element.className properties as follows

If InStr(element.innerText, "°C") And InStr(element.className, "temperature") Then
Sheet1.Cells(1, 1).Value = element.innerText
End If

Finally, we do the cleanup using

IE.Quit
Set IE = Nothing

Here is the entire thing put together

Sub extractVancouverData()
Dim IE As New InternetExplorer
Dim url As String
Dim item As HTMLHtmlElement
Dim Doc As HTMLDocument
Dim tagElements As Object
Dim element As Object
Dim lastRow
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Application.EnableEvents = False
Application.Calculation = xlCalculationManual
url = "https://www.theweathernetwork.com/ca/weather/british-columbia/vancouver"
IE.navigate url
IE.Visible = True
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
lastRow = Sheet1.UsedRange.Rows.Count + 1
Set tagElements = Doc.all.tags("p")
For Each element In tagElements
If InStr(element.innerText, "°C") > 0 And InStr(element.className, "temperature") > 0 Then
Sheet1.Cells(lastRow, 1).Value = element.innerText
' Exit the for loop once you get the temperature to avoid unnecessary processing
Exit For
End If
Next
IE.Quit
Set IE = Nothing
Application.ScreenUpdating = True
Application.DisplayAlerts = True
Application.EnableEvents = True
Application.Calculation = xlCalculationAutomatic
End Sub

You can also extract all elements from the HTMLDocument by the element’s ID, className, Name or tagName using the respective method as shown below

Set tagElements = Doc.getElementsByClassName("temperature")
For Each element In tagElements
If InStr(element.innerText, "°C") Then
Sheet1.Cells(lastRow, 1).Value = element.innerText
End If
Next

 

Example 2:

Many times while extracting data from a website, you need to go to a page and click on a link as you cannot directly access the results using a URL. For this you can use the element.Click method. Say you click on the link “Excel” on the home page of this site. This is how you can do it

Sub clickOnLink()
Dim IE As New InternetExplorer
Dim str As String
Dim Doc As HTMLDocument
Dim tagElements As Object
Dim element As HTMLObjectElement
str = "http://software-solutions-online.com/"
IE.navigate str
IE.Visible = True
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
'First get all the links
Set tagElements = Doc.all.tags("a")
' Loop through all the links
For Each element In tagElements
' Look for the link that contains the text Excel
If element.innerText = "Excel" Then
'Click on the link
element.Click
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Exit For
End If
Next

If you run this code with the IE visible, you will see that the code clicks on the Excel link. Now you can extract the required data using the methods described above.

If you’re interested in doing something similar via recording a macro, please have a look at this article and for using a query table check out this one.

 

Leave a Reply

Your email address will not be published. Required fields are marked *