Guide: The Document Object Model in VBA
The logical structure of a document (Excel, HTML, XML, Word, etc) is called a “document object model.” It can be used to access and work on the properties, methods, and events of the object.
For example, in a Word document, there could be shapes/pictures, text, and paragraphs that can be worked on. Similarly, in an Excel object, we may need to insert/delete rows or add some content or perform some other permitted actions.
In VBA, the Document Object model, called “DOM” in short, helps us work on documents. Using it is one way of automating documents through VBA code.
This is also referred to as web-scraping/VBA web-automation/VBA automation in the case of normal automation of Windows apps.
Contents
Accessing Elements of a Web Page
For example, let us drill down into a web page to see how we can access each of its elements
In this below code, we are creating a few objects — one of which is Internet Explorer itself. This is navigated to the amazon.in
URL. The meaning of the rest of the code is explained in the comments.
Sub Automate_IE_amazon() 'Declaration of variables and objects Dim i As Long Dim str_URL As String Dim IE As Object Dim obj_Element As Object Dim obj_Collection As Object 'Create Internet Explorer Object - This is the document object – An internet explorer browser opens up Set IE = CreateObject("InternetExplorer.Application") 'Set IE.Visible = True to make IE visible, or False for IE to run in the background i.e. IE will remain invisible - set properties of DOM IE.Visible = True 'Define str_URL – this is the url that we will open on this IE browser now. str_URL = "https://www.amazon.in/" 'Navigate to str_URL on the IE browser – DOM. IE.Navigate str_URL ' Statusbar let's user know website is loading . This is something that we are displaying to keep the user engaged or make the user understand that some background process is going on. Application.StatusBar = str_URL & " is loading. Please wait..." ' Wait while IE loading... 'IE.ReadyState = 4 signifies the webpage has loaded (the first loop is set in order to avoid skipping the second loop) Do While IE.ReadyState = 4: DoEvents: Loop 'Do While Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until 'Webpage Loaded – display in status bar. Application.StatusBar = str_URL & " Loaded" ‘ Here we can do that actions we wish to 'Unload IE – Emptying the objects in order to give up the used RAM space of system and increase efficiency. Set IE = Nothing Set obj_Element = Nothing Set obj_Collection = Nothing End Sub
Output:
Now using the same DOM object, we will enter a value in the search box and click on the search icon.
Method 1: Using Watch Window
As the code runs, we stop the code at a breakpoint and add the IE object to the watch window. Now this object on the watch window can be expanded to see the list of objects, properties, and methods it has under it. Based on this, we can explore and find the search text box object.
IE.Document is the DOM in this example. It is highlighted in the watch window below:
The second method makes it even simpler for us to proceed further spying/identifying the objects we need to work on.
Method 2: Using the HTML Elements
Function F12 can help us view the HTML code as a part of the web page.
Using the spy icon, we can click on the object whose properties we need to inspect.
On clicking, we can see the element’s code.
Here’s the HTML code of the element:
<input name="field-keywords" tabindex="0" class="nav-input nav-progressive-attribute" id="twotabsearchtextbox" dir="auto" aria-label="Search" type="text" placeholder="" value="" autocomplete="off">
From the code above, we can understand that the classname
of the element is nav-input nav-progressive-attribute
. So, we can select the object using this classname.
Set obj_Element = IE.document.getElementsByClassName("nav-input nav-progressive-attribute")
When we run this single line of code and watch the value of the assigned object, obj_Element, we see that there are two elements with the same classname. Now, we can try assigning value to the first item.
On expanding the two items, I found that the id of the first item is the searchtextbox and the second is the search submit button.
So, our job is almost done.
Now, we can write some code to type search text in the first object and click on the second object.
‘type text in the search box obj_Element(0).innerText = "Olympiad books for Grade 2"
Click on the submit button.
obj_Element(1).Click
WOW!
We have automated the search successfully!
Here is a copy of the whole code you can try out if interested:
Sub Automate_IE_amazon() 'Declaration of variables and objects Dim i As Long Dim str_URL As String Dim IE As Object Dim obj_Element As Object Dim obj_Collection As Object 'Create Internet Explorer Object - This is the document object Set IE = CreateObject("InternetExplorer.Application") 'Set IE.Visible = True to make IE visible, or False for IE to run in the background- set properties of DOM IE.Visible = True 'Define str_URL str_URL = "https://www.amazon.in/" 'Navigate to str_URL IE.Navigate str_URL ' Statusbar let's user know website is loading Application.StatusBar = str_URL & " is loading. Please wait..." ' Wait while IE loading... 'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set in order to avoid inadvertently skipping over the second loop) Do While IE.ReadyState = 4: DoEvents: Loop 'Do While Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until 'Webpage Loaded Application.StatusBar = str_URL & " Loaded" Set obj_Element = IE.document.getElementsByClassName("nav-input nav-progressive-attribute") obj_Element(0).innerText = "Olympiad books for Grade 2" obj_Element(1).Click 'Unload IE Set IE = Nothing Set obj_Element = Nothing Set obj_Collection = Nothing End Sub
Conclusion
Remember, we can not only get elements using the classname, but we can also use the input tag name or ID directly.
The search text here in the example was hardcoded. But it can very well be dynamically taken from your Excel sheets in the document. This way you may search for many items one after the other specifying them in the rows of the sheet.
The more we explore the web page, the more we learn. Web scraping is very popular nowadays for companies because it reduces manual work. This automation becomes easy with the “DOM” in VBA. So, use it wisely to get your page automated.