The IMPORT functions are among the biggest time-savers when you’re working with large amounts of data from outside sources. The IMPORTHTML Google Sheets function is among the most useful, as it brings tables and lists over from websites quickly and easily.
This guide will cover how to use the IMPORTHTML function with simple step-by-step instructions. We’ll also briefly cover some of the other IMPORT functions in Google Sheets so they’re easier to tackle when you encounter them in the future.
Read on to get a thorough understanding of IMPORTHTML and an introduction to IMPORT functions in Google Sheets.
HTML, or HyperText Markup Language, is used to create web pages. The language describes the structure of a page: developers use it to lay out the elements the browser displays, such as text, media, and hyperlinks.
HTML is also how hyperlinks are added to a page, which lets users navigate from one page to another. The language also makes it possible to format and organize documents, much as you would in Google Docs.
The Google Sheets IMPORTHTML function finds and extracts data from an HTML table or list. It is intended for pulling lists or tables from an external website. Before we look at how to use the formula, let’s look at its format. Here is the formula:
=IMPORTHTML(URL, query, index)
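The formula requires three parameters. URL is the address of the page, including the protocol (for example, https://), enclosed in quotation marks. Query is either “table” or “list”, depending on the structure that holds the data you want. Index is a number, starting at 1, that identifies which table or list on the page to return.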
In this example, we want to import the table of highest-paid film actors from its Wikipedia page. Copying it manually can take a lot of time and effort, which is why we will use the Google Sheets IMPORTHTML function.
Here is how to use IMPORTHTML in Google Sheets to get a table:
If the web page has a list, you can import it into Google Sheets using the same steps you use to import a table, with “list” as the query. Here is how to import a list from a website into Google Sheets:
You can also use cell references as the parameters for the IMPORTHTML function. In the example below, we’ve used the formula:
=IMPORTHTML(C1,C2,C3)
This replaces typing the URL, query, and index directly into the formula.
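To make the two forms concrete, here is a minimal JavaScript sketch that builds the formula text either from literal values or from cell references. The helper names are hypothetical and exist only for illustration; the code only produces the text you would type into a cell, and Google Sheets still performs the import.

```javascript
// Hypothetical helpers: they build formula text only.

// Build =IMPORTHTML(...) with the URL, query, and index typed in directly.
function importHtmlFormula(url, query, index) {
  if (query !== "table" && query !== "list") {
    throw new Error('query must be "table" or "list"');
  }
  if (!Number.isInteger(index) || index < 1) {
    throw new Error("index must be a whole number starting at 1");
  }
  return '=IMPORTHTML("' + url + '","' + query + '",' + index + ")";
}

// Build the same formula from cell references such as C1, C2, and C3.
function importHtmlFormulaFromCells(urlCell, queryCell, indexCell) {
  return "=IMPORTHTML(" + [urlCell, queryCell, indexCell].join(",") + ")";
}

const pageUrl = "https://en.wikipedia.org/wiki/List_of_highest-paid_film_actors";
console.log(importHtmlFormula(pageUrl, "table", 2));
console.log(importHtmlFormulaFromCells("C1", "C2", "C3"));
```

The cell-reference form is easier to maintain, because changing the value in C1, C2, or C3 changes the import without editing the formula.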
To import only a specific row and column with IMPORTHTML, nest it inside the INDEX function. In the example below, we used the formula:
=INDEX(IMPORTHTML("https://en.wikipedia.org/wiki/List_of_highest-paid_film_actors","table",2),3,2)
The ,3,2) after IMPORTHTML’s closing bracket tells the INDEX function to pull the value from row 3, column 2 of the imported table.
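The row and column numbers passed to INDEX count from 1, not 0. As a rough illustration of that lookup, the sketch below picks one cell out of a two-dimensional array. The function name and the sample rows are made up for this example and are not real data from the Wikipedia table.

```javascript
// Pick one cell from a 2D array using 1-based row and column numbers,
// mirroring INDEX(range, row, column) for the simple single-cell case.
function indexCell(rows, row, column) {
  if (row < 1 || row > rows.length) {
    throw new RangeError("row out of bounds");
  }
  const cells = rows[row - 1];
  if (column < 1 || column > cells.length) {
    throw new RangeError("column out of bounds");
  }
  return cells[column - 1];
}

// Placeholder data standing in for an imported table (not real values).
const imported = [
  ["Rank", "Name", "Year"],
  ["1", "Actor A", "2001"],
  ["2", "Actor B", "2002"],
];

console.log(indexCell(imported, 3, 2)); // Actor B
```

Note that the header row counts as row 1, so row 3 is the second data row.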
The list or table should be displayed within a few seconds if the formula is executed properly. However, if no data is being displayed or you get an error prompt, it may be due to the following reasons:
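To check which index a table has, you can run the following in your browser’s developer console while the page is open:
var index = 1;
[].forEach.call(document.getElementsByTagName("table"), function(elements) { console.log("Index: " + index++, elements); });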
If you want to import lists instead, match the ul and ol elements instead of table. Because getElementsByTagName accepts only a single tag name, use querySelectorAll with the selector “ul,ol” like so:
var index = 1;
[].forEach.call(document.querySelectorAll("ul,ol"), function(elements) { console.log("Index: " + index++, elements); });
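You can change how often the import is updated by appending a query string to the URL and using Google Apps Script to change it. When the URL text changes, Sheets treats the formula as changed and fetches the page again.
=IMPORTHTML("https://en.wikipedia.org/wiki/List_of_highest-paid_film_actors" & "?refresh", "table", 1)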
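One way to do this could look like the Apps Script sketch below. It is only an outline under several assumptions: the sheet name "Sheet1", the target cell A1, and the parameter name "refresh" are all placeholders, and the script would need a time-driven trigger that you set up yourself in the Apps Script editor. The helper withRefreshParam is a hypothetical name, not part of any Google API.

```javascript
// Append a throwaway query parameter so the URL text changes on every run.
// The parameter name "refresh" is arbitrary.
function withRefreshParam(url, token) {
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "refresh=" + encodeURIComponent(token);
}

// Apps Script entry point, meant to be run by a time-driven trigger.
// It is only defined here, not called, because SpreadsheetApp exists
// only inside Google Apps Script.
function refreshImport() {
  const url = "https://en.wikipedia.org/wiki/List_of_highest-paid_film_actors";
  const formula =
    '=IMPORTHTML("' + withRefreshParam(url, Date.now()) + '","table",1)';
  SpreadsheetApp.getActiveSpreadsheet()
    .getSheetByName("Sheet1") // assumed sheet name
    .getRange("A1")           // assumed target cell
    .setFormula(formula);
}
```

Using the current timestamp as the token guarantees a different URL on each run. Keep the trigger interval modest so you don’t send the source website more requests than you need.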
You can use several other functions to scrape content into Google Sheets. Let’s take a look at some of them.
XML is a markup language similar to HTML, with one key difference: XML does not have predefined tags. Instead, you define your own tags to suit your needs. The IMPORTXML function in Google Sheets can be used to import XML data into Sheets.
Here is the syntax for the formula:
=IMPORTXML(link, xpath_query)
The formula uses two parameters: link and xpath_query. The link parameter is the URL of the web page you want to examine. The xpath_query parameter is the XPath query you want to run on the data. Enclose the value for each parameter in quotation marks.
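XPath queries often contain double quotes of their own, which clash with the quotation marks around the parameter. In Sheets, a double quote inside a text value is written as two double quotes. The sketch below shows that escaping with two hypothetical helper functions; the URL and the XPath query are placeholders chosen for illustration.

```javascript
// Quote a value as a Sheets text literal: wrap it in double quotes
// and double any double quotes inside it.
function sheetsString(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Build the text of an =IMPORTXML(...) formula.
function importXmlFormula(link, xpathQuery) {
  return "=IMPORTXML(" + sheetsString(link) + "," + sheetsString(xpathQuery) + ")";
}

console.log(importXmlFormula("https://example.com", '//a[@class="external"]/@href'));
// =IMPORTXML("https://example.com","//a[@class=""external""]/@href")
```

Writing the XPath with single quotes, as in //a[@class='external']/@href, avoids the need for escaping altogether.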
You can learn more about the formula in our IMPORTXML Google Sheets function guide.
The IMPORTRANGE formula in Google Sheets allows you to access data from another spreadsheet, provided that you have access permission for it. The imported data stays in sync with the source, and you can import exact ranges from the other spreadsheet.
Here is the syntax for the formula:
=IMPORTRANGE(spreadsheet_url, range_string)
The formula uses two parameters: spreadsheet_url and range_string. The spreadsheet_url parameter is the URL of the source spreadsheet. Enclose the URL in quotation marks. The range_string parameter specifies the range of cells you want to import into the current spreadsheet, for example "Sheet1!A1:C10".
The IMPORTFEED formula in Sheets lets you get data from Atom and RSS feeds. This helps you keep track of any news or blog post items on a website.
Here is the syntax for the formula:
=IMPORTFEED(URL, query, headers, num_items)
The formula uses four parameters: URL, query, headers, and num_items. Only URL is required. The URL parameter is the link to the website’s Atom or RSS feed. The query parameter defines which elements you want to get from the feed. The headers parameter specifies whether to include a header row. The num_items parameter sets the number of feed items to return.
The IMPORTDATA function in Sheets lets you quickly get the data from a URL containing a .tsv or a .csv file. It can be useful if you are working with data only available in a CSV or a TSV format. Google Sheets will import the data and format it appropriately.
Here is the syntax for the formula:
=IMPORTDATA(URL)
The formula only requires one parameter to work. The URL parameter is the URL of the file’s location. Ensure the parameter is in quotation marks.
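To picture what ends up in the sheet, the sketch below splits delimited text into rows and columns, choosing the delimiter from the file extension. It illustrates the resulting grid only and is not how Google Sheets implements IMPORTDATA. The function name, URL, and sample data are placeholders, and the parser is deliberately naive: it does not handle quoted fields that contain delimiters or line breaks.

```javascript
// Naive sketch: pick the delimiter from the file extension, then split
// the text into rows and each row into cells.
function parseDelimited(text, fileUrl) {
  const delimiter = fileUrl.toLowerCase().endsWith(".tsv") ? "\t" : ",";
  return text
    .split(/\r?\n/)                       // one entry per line
    .filter((line) => line.length > 0)    // drop empty lines
    .map((line) => line.split(delimiter)); // one entry per cell
}

const rows = parseDelimited("name,score\nAda,90\n", "https://example.com/data.csv");
console.log(rows); // [ [ 'name', 'score' ], [ 'Ada', '90' ] ]
```

Each inner array corresponds to one spreadsheet row, starting at the cell that holds the formula.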
You can refresh the IMPORTHTML function in Google Sheets in multiple ways. The function updates automatically every hour, whether or not the user refreshes the formula. You can also use the NOW function to trigger a refresh of the IMPORTHTML function every minute or thirty seconds.
Google Sheets automatically checks for updates every hour while the document is open to keep getting fresh data, even if the user doesn’t change the formula or the sheet. The formula is also recalculated if the user changes the formula or if any cell that the function references is updated. However, if you close and reopen the document, it won’t cause a refresh of any of the IMPORT functions.
You should now have everything you need to start working with the IMPORTHTML Google Sheets function. Knowing how to use IMPORTHTML also makes the other IMPORT functions in Google Sheets much easier to use, as they work very similarly.