Structured Data Basics: Using Schema.Org to Help Search Engines Understand Your Content
What is structured data and why is it important for SEO?
Search engines have evolved over the years from providing simple lists of links to providing search result features tailored to the type of query. To better understand the content of websites, search engines use structured data to gather more information about the page and its content.
Structured data is formatted into a data vocabulary maintained by Schema.org. Schema.org is a collaboration between Google, Bing, Yahoo! and Yandex. (Baidu has its own version, called “Baidu Open”, and does not officially support Schema.org.) This collaboration makes it easier for webmasters to provide data in a format that is shared and understood by the major search engines.
In Google’s introductory example, they point to a recipe page which includes content around ingredients, cooking time, calories, etc.
There are different schema types available, all based on the content of your page and the available data you have to publish. That content is then eligible for rich result features in the search engines.
We can’t be sure how rich results will evolve over time, but those sites that implemented the Schema.org formats for recipes are able to compete for premium placement in the SERP.
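To make the recipe example concrete, here is a minimal sketch of what such markup might look like when built in Python and serialized as JSON-LD. The recipe name, ingredients, and values are placeholders invented for illustration; the property names (recipeIngredient, cookTime, nutrition) come from the Schema.org Recipe type.

```python
import json

# Hypothetical recipe data; every value here is a placeholder for illustration.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Example Banana Bread",
    "recipeIngredient": ["3 ripe bananas", "2 cups flour", "1 egg"],
    "cookTime": "PT1H",  # ISO 8601 duration: one hour
    "nutrition": {
        "@type": "NutritionInformation",
        "calories": "240 calories",
    },
}

# Serialize to the JSON text that would be placed on the page.
recipe_json = json.dumps(recipe, indent=2)
print(recipe_json)
```

A real recipe page would likely carry more properties than this; Google's documentation lists which ones each rich result needs.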
Why is this important?
While structured data by itself won’t help a web page rank higher, the page schema does help Google better understand the content of the page, which can indirectly help improve rankings. (https://www.seroundtable.com/google-structured-data-ranking-factor-25510.html)
Besides the potential for better understanding of your page’s content, being eligible for the rich results is important for other reasons:
- Improved CTR from the SERP: Having a more attractive listing in the search result can boost the click through rate (CTR) from the SERP to your page. Features such as breadcrumbs, aggregate ratings, and stock status can help improve the CTR.
- Increased ownership of the results page: Features such as FAQs and How-To provide a larger search result for the page in Google’s results. That ends up pushing competing pages lower on the page, and even off of page 1. (For a number of queries which include FAQ results, the number of page 1 results has decreased from 10 to 7 due to the increased vertical space the FAQ results occupy.)
This result from TripAdvisor includes FAQ markup and takes up the vertical search real estate of 2 listings.
- Better context for the click: Rich results provide the searcher with more context around what they’re going to get when they click on the link to your page.
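As an illustration of the FAQ feature mentioned above, the following sketch builds FAQPage markup from a list of question and answer pairs. The helper name build_faq_page and the sample questions are assumptions made for this example; the FAQPage, Question, and Answer types are part of the Schema.org vocabulary.

```python
import json

def build_faq_page(pairs):
    """Return a schema.org FAQPage dict from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Placeholder questions; real markup should mirror the FAQ shown on the page.
faq = build_faq_page([
    ("What is structured data?", "Markup that describes page content to search engines."),
    ("Which format does Google prefer?", "JSON-LD."),
])
print(json.dumps(faq, indent=2))
```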
Types of Rich Results powered by structured data
Google uses structured data to help power rich results in the SERP across a wide variety of formats. Some of the most common are breadcrumbs and rating stars, but there are also features for recipes, events, how-to articles, and more. (Google provides a list and examples of the rich results it currently uses: https://developers.google.com/search/docs/guides/search-gallery)
Common Schema.org types called out are:
- Articles
- Recipes
- Reviews
- Products
- Events
- People
- Organizations
- Local Businesses
- Medical conditions
In March and April 2020, Schema.org also rolled out new vocabulary to support COVID-19 response efforts.
Schema.org is a living vocabulary
One of the great things about the Schema.org vocabulary is that it’s always updating based on the needs of the market. During the COVID-19 outbreak, Schema.org made updates in March and April 2020 to allow websites to provide more data around Special Announcements, Testing Facilities, and even incorporated CDC data formats.
Schema.org vocabulary can update as often as monthly. It’s important to keep abreast of changes for your industry. (You can review the release history at: https://schema.org/docs/releases.html)
Structured data formats
There are a few methods to include structured data on a page. The most popular are using in-line markup (microdata) and JSON-LD, which allows you to put the structured data markup in a single object on the web page.
Here’s a run-down of the supported markup types for structured data. Structured data in all formats follows the Schema.org data vocabulary.
- JSON-LD: JavaScript Object Notation for Linked Data. This is Google’s preferred format for delivering structured data. The structured data is contained in a <script type="application/ld+json"> element in the page head or body. Webmasters can also inject this element into the page dynamically.
- Microdata: Uses an HTML specification to nest the structured data within your HTML content. It uses HTML tag attributes such as itemscope, itemtype, and itemprop to specify the markup and is mostly used in the page body.
- RDFa: An HTML5 extension that, like Microdata, allows webmasters to provide structured data markup as HTML tag attributes.
At this time, Google prefers JSON-LD but you can also use other formats depending on what works best for your site’s code base and the rich result you are targeting. JSON-LD and Microdata are the most common formats.
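For a sense of how a JSON-LD object ends up on a page, here is a minimal sketch that wraps a Python dict in the script element described above. The function name to_script_tag and the sample Article are assumptions for illustration, not part of any library.

```python
import json

def to_script_tag(data):
    """Serialize a dict as a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"

# Placeholder article data for illustration.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Structured Data Basics",
}
tag = to_script_tag(article)
print(tag)
```

The escaping step is a defensive choice: "\/" is a valid JSON escape for a slash, so the data is unchanged when parsed, but the HTML parser no longer sees a closing tag inside a string value.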
What about OpenGraph?
OpenGraph is used for sharing on social networks (Facebook, Twitter) to specify how your web page will display when the URL is shared via those networks.
Schema.org data gives search engines a better understanding of your content for indexing and ranking, while OpenGraph tells social networks how to display your page by specifying the title, URL, featured image, and author when it’s shared. Ideally you would use both.
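To show how the two differ in practice, here is a small sketch that generates OpenGraph meta tags, which sit alongside (not inside) any Schema.org markup. The function name open_graph_tags and the example.com URLs are placeholders; og:title, og:url, and og:image are standard Open Graph properties.

```python
from html import escape

def open_graph_tags(title, url, image):
    """Return Open Graph <meta> tags for the given page properties."""
    properties = {"og:title": title, "og:url": url, "og:image": image}
    return "\n".join(
        # Escape attribute values so quotes or ampersands cannot break the tag.
        f'<meta property="{name}" content="{escape(value, quote=True)}">'
        for name, value in properties.items()
    )

og_tags = open_graph_tags(
    "Structured Data Basics",
    "https://example.com/structured-data-basics",
    "https://example.com/images/cover.png",
)
print(og_tags)
```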
JSON-LD allows greater flexibility for implementing Schema.org publishing
One of the main advantages of using JSON-LD for delivering page schema is the ability to nest data elements. JSON-LD separates the page data from the HTML code structure, which makes the structured data object more reliable. When using Microdata, any changes in your page layout can end up breaking the structured data.
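The nesting described here might look like the following sketch, where a Product carries its rating and offer as nested objects. All values are invented placeholders; the types and property names (AggregateRating, Offer, priceCurrency, availability) are from the Schema.org vocabulary.

```python
import json

# Hypothetical product; the rating and offer are nested inside the Product
# object rather than being tied to the position of elements in the HTML.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.4",
        "reviewCount": "89",
    },
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

product_json = json.dumps(product, indent=2)
print(product_json)
```

Because the object is built from data rather than from the page layout, a template redesign would not change its structure.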
Eligibility guidelines for structured data
In order for content to be eligible for structured data markup, it has to be visible on the rendered page. This follows the search engines’ philosophy of making sure that the search features they are publishing match up with the content on the target pages.
Google has additional guidelines for structured data that pretty much mirror their policies towards quality content. In other words, don’t spam and make sure your content is relevant and accurate.
When using JSON-LD objects in the head, that means you should ensure that the content within the JSON-LD code is also visible on the page for users. (That content can be behind tabs or drop-downs and still be valid.)
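One rough way to spot-check this rule is to compare the values in your markup against the text of the page. The sketch below is a heuristic under stated assumptions: the class VisibleText and the function missing_from_page are hypothetical names, and the check only looks at text present in the HTML outside script and style elements. It cannot tell whether CSS hides that text, and it is not how Google evaluates pages.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

def missing_from_page(html, values):
    """Return the values that do not appear in the page's text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Normalize whitespace and case before comparing.
    text = " ".join(" ".join(parser.parts).split()).lower()
    return [v for v in values if v.lower() not in text]

# Placeholder page: "Hidden" exists only inside a script, not in the content.
page = "<html><body><h1>Example Widget</h1><script>var x = 'Hidden';</script></body></html>"
print(missing_from_page(page, ["Example Widget", "Hidden"]))
```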
Check the guidelines and code examples for each of the objects in Google’s search gallery: https://developers.google.com/search/docs/guides/search-gallery
How to create and validate your structured data
Looking at a detailed JSON-LD output for a list of restaurants or products can be intimidating. It’s important to break down the fields within the output and determine what data you have available, what other fields you have access to with more work, and what you don’t have.
As long as you are able to publish all of the required fields for the rich result you’re targeting, then you’re good to go. Any additional fields you can provide are nice to have and provide even better context.
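A simple pre-flight check for required fields could be sketched as follows. The REQUIRED table is an assumption made for illustration and is deliberately incomplete; the authoritative list for each rich result type is in Google's documentation and may change.

```python
# Assumed required properties, for illustration only; confirm the current
# list for each rich result type in Google's documentation.
REQUIRED = {
    "Recipe": ["name", "image"],
    "FAQPage": ["mainEntity"],
}

def missing_required(data):
    """Return the assumed-required fields that are absent or empty."""
    required = REQUIRED.get(data.get("@type"), [])
    return [field for field in required if not data.get(field)]

# Placeholder draft markup that lacks an image.
draft = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Example Banana Bread",
}
print(missing_required(draft))
```

Types that are not in the table return an empty list, so the check stays silent rather than guessing at requirements it does not know.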
Creating your Structured Data
First, start with the examples given by Google. Explore their search gallery and click the “Get Started” button.
You can also work on your markup at the JSON-LD Playground.
Testing and Validation of your Structured Data
Once you have sample code, test your code snippet or a published page in Google’s Rich Results test: https://search.google.com/test/rich-results
For bulk validation, you’ll need to use your favorite site crawling tool’s bulk validation features, or wait for Google to crawl your pages and review the Enhancements reports in Google Search Console (pictured).
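Before relying on a crawler or Search Console, a quick local syntax check can catch malformed JSON-LD across many pages. The sketch below is a minimal example under stated assumptions: JsonLdExtractor and check_page are hypothetical names, the pages dict stands in for HTML you have already fetched, and the check only confirms that each block parses as JSON. It does not validate against Schema.org or Google's rich result requirements.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every JSON-LD <script> element."""

    def __init__(self):
        super().__init__()
        self.in_json_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_json_ld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_json_ld = False

    def handle_data(self, data):
        if self.in_json_ld:
            self.blocks[-1] += data

def check_page(html):
    """Return (parsed_objects, error_messages) for one page's JSON-LD."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    parsed, errors = [], []
    for block in extractor.blocks:
        try:
            parsed.append(json.loads(block))
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return parsed, errors

# Placeholder pages; the second has a trailing comma, which is invalid JSON.
pages = {
    "/good": '<script type="application/ld+json">{"@type": "Article"}</script>',
    "/bad": '<script type="application/ld+json">{"@type": "Article",}</script>',
}
for path, html in pages.items():
    parsed, errors = check_page(html)
    print(path, len(parsed), "valid,", len(errors), "invalid")
```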
Common errors implementing structured data
Implementing structured data across a large site is a challenging task. Different page types often require different structured data implementations, and with frequent vocabulary updates you’ll need to stay on top of the available enhancements.
Some of the common errors to avoid around implementing Schema.org markup on your site include:
- Marking up content that is invisible to users. The purpose of structured data is to give search engines more context so they can provide a good result to their users. If users can’t find that content on your page, then Google may view the page as “deceptive or misleading” and apply actions against those pages.
- Applying page-specific markup to an entire site or category. Search engines want your structured data to be as specific as possible. Avoid adding review snippet schema code to category pages; limit those to individual product pages.
- Improper nesting of data. With inline formatting it’s easy to get the nesting of your structured data wrong, and avoiding that is one of the main advantages of JSON-LD. Even with JSON-LD, though, you can mess up the nesting and order of your data. The more data points you’re providing within the page, the greater the chance of an error.
- Using data-vocabulary for breadcrumbs. In the past, data-vocabulary was an acceptable format for highlighting the breadcrumb links for a page. Google announced that it will deprecate support for data-vocabulary. This was originally going to go into effect in April 2020, but due to COVID-19 Google is delaying implementation. (https://webmasters.googleblog.com/2020/01/data-vocabulary.html)
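For sites moving their breadcrumbs off data-vocabulary, the Schema.org equivalent is the BreadcrumbList type. The sketch below shows one way it might be generated; the helper name build_breadcrumbs and the example.com trail are placeholders for illustration.

```python
import json

def build_breadcrumbs(trail):
    """Return a schema.org BreadcrumbList from (name, url) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            # Positions start at 1 and follow the order of the trail.
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

# Placeholder trail; use the breadcrumb links actually shown on the page.
breadcrumbs = build_breadcrumbs([
    ("Home", "https://example.com/"),
    ("Learn SEO", "https://example.com/learn-seo/"),
])
print(json.dumps(breadcrumbs, indent=2))
```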
Continuous development opportunities for SEO
Implementing Schema.org for your site can be a complicated process, but the benefits to search visibility, CTR, and your own site traffic and conversion are well worth the efforts.
In some verticals, proper implementation of Schema.org is a must, while in others it can provide you with a competitive advantage.
For technical SEO teams, Schema.org implementation and testing should be on your technical roadmap, and evaluating which items will best serve your site should be part of your ongoing backlog.
This is a space which will continue to evolve quickly as new needs and search behaviors emerge.