Guide: The Document Object Model in VBA

The logical structure of a document (Excel, HTML, XML, Word, etc) is called a “document object model.” It can be used to access and work on the properties, methods, and events of the object.

For example, in a Word document, there could be shapes/pictures, text, and paragraphs that can be worked on. Similarly, in an Excel object, we may need to insert/delete rows or add some content or perform some other permitted actions.

In VBA, the Document Object model, called “DOM” in short, helps us work on documents. Using it is one way of automating documents through VBA code.

This is also referred to as web-scraping/VBA web-automation/VBA automation in the case of normal automation of Windows apps.

Accessing Elements of a Web Page

For example, let us drill down into a web page to see how we can access each of its elements

In this below code, we are creating a few objects — one of which is Internet Explorer itself. This is navigated to the amazon.in URL. The meaning of the rest of the code is explained in the comments.

Sub Automate_IE_amazon()

'Declaration of variables and objects
    Dim i As Long
    Dim str_URL As String
    Dim IE As Object
    Dim obj_Element As Object
    Dim obj_Collection As Object
    
    'Create Internet Explorer Object - This is the document object – An internet explorer browser opens up
    Set IE = CreateObject("InternetExplorer.Application")
    
    'Set IE.Visible = True to make IE visible, or False for IE to run in the background i.e. IE will remain invisible - set properties of DOM
    IE.Visible = True
    
    'Define str_URL – this is the url that we will open on this IE browser now.
    str_URL = "https://www.amazon.in/"
    
    'Navigate to str_URL on the IE browser – DOM.
    IE.Navigate str_URL
    
    ' Statusbar let's user know website is loading . This is something that we are displaying to keep the user engaged or make the user understand that some background process is going on.
    Application.StatusBar = str_URL & " is loading. Please wait..."
    
    ' Wait while IE loading...
    'IE.ReadyState = 4 signifies the webpage has loaded (the first loop is set in order to avoid skipping the second loop)
    Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
    Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until
    
    'Webpage Loaded – display in status bar.
    Application.StatusBar = str_URL & " Loaded"
    
‘ Here we can do that actions we wish to 
    'Unload IE – Emptying the objects in order to give up the used RAM space of system and increase efficiency.
    Set IE = Nothing
    Set obj_Element = Nothing
    Set obj_Collection = Nothing
    
End Sub

Output:

Output of code accessing web page elements

Now using the same DOM object, we will enter a value in the search box and click on the search icon.

Method 1: Using Watch Window

Code for using a watch window

As the code runs, we stop the code at a breakpoint and add the IE object to the watch window. Now this object on the watch window can be expanded to see the list of objects, properties, and methods it has under it. Based on this, we can explore and find the search text box object.

IE.Document is the DOM in this example. It is highlighted in the watch window below:

Internet Explorer DOM in Watch Window

The second method makes it even simpler for us to proceed further spying/identifying the objects we need to work on.

Method 2: Using the HTML Elements

Function F12 can help us view the HTML code as a part of the web page.

Function F12 to view HTML code of a webpage

Using the spy icon, we can click on the object whose properties we need to inspect.

Using spy icon to find out object properties

On clicking, we can see the element’s code.

Element's code after clicking

Here’s the HTML code of the element:

<input name="field-keywords" tabindex="0" class="nav-input nav-progressive-attribute" id="twotabsearchtextbox" dir="auto" aria-label="Search" type="text" placeholder="" value="" autocomplete="off">

From the code above, we can understand that the classname of the element is nav-input nav-progressive-attribute. So, we can select the object using this classname.

Set obj_Element = IE.document.getElementsByClassName("nav-input nav-progressive-attribute")

When we run this single line of code and watch the value of the assigned object, obj_Element, we see that there are two elements with the same classname. Now, we can try assigning value to the first item.

Two elements with the same classname

On expanding the two items, I found that the id of the first item is the searchtextbox and the second is the search submit button.

So, our job is almost done.

Now, we can write some code to type search text in the first object and click on the second object.

‘type text in the search box
obj_Element(0).innerText = "Olympiad books for Grade 2"
Code to type search in the first object

Click on the submit button.

obj_Element(1).Click
Output of search code

WOW!

We have automated the search successfully!

Here is a copy of the whole code you can try out if interested:

Sub Automate_IE_amazon()

'Declaration of variables and objects
    Dim i As Long
    Dim str_URL As String
    Dim IE As Object
    Dim obj_Element As Object
    Dim obj_Collection As Object
    
    'Create Internet Explorer Object - This is the document object
    Set IE = CreateObject("InternetExplorer.Application")
    
    'Set IE.Visible = True to make IE visible, or False for IE to run in the background- set properties of DOM
    IE.Visible = True
    
    'Define str_URL
    str_URL = "https://www.amazon.in/"
    
    'Navigate to str_URL
    IE.Navigate str_URL
    
    ' Statusbar let's user know website is loading
    Application.StatusBar = str_URL & " is loading. Please wait..."
    
    ' Wait while IE loading...
    'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set in order to avoid inadvertently skipping over the second loop)
    Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
    Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until
    
    'Webpage Loaded
    Application.StatusBar = str_URL & " Loaded"    
    Set obj_Element = IE.document.getElementsByClassName("nav-input nav-progressive-attribute")    
    obj_Element(0).innerText = "Olympiad books for Grade 2"    
    obj_Element(1).Click        

    'Unload IE
    Set IE = Nothing
    Set obj_Element = Nothing
    Set obj_Collection = Nothing
    
End Sub

Conclusion

Remember, we can not only get elements using the classname, but we can also use the input tag name or ID directly.

The search text here in the example was hardcoded. But it can very well be dynamically taken from your Excel sheets in the document. This way you may search for many items one after the other specifying them in the rows of the sheet.

The more we explore the web page, the more we learn. Web scraping is very popular nowadays for companies because it reduces manual work. This automation becomes easy with the “DOM” in VBA. So, use it wisely to get your page automated.

Leave a Reply

Your email address will not be published. Required fields are marked *