You’re thinking one of two things at this very moment:
- “OMG, YES. I cannot wait to dig into schema markup.”
- “Ugh. THIS sucks. Not looking forward to learning about this confusing coding stuff.”
There is no in-between.
If you’re like most, you fall into Camp 2. We feel you, schema beginners.
If you’re not like most, you fall into Camp 1. Carry the hell on, technical SEO specialists and web developers.
Good news, Camp 2: Schema markup doesn’t need to be difficult.
Good news, Camp 1: Schema markup doesn’t need to be difficult (even though you’d relish a challenge).
In this article, we’re going to demystify the often complicated world of schema structured data so you can understand its value to your search engine optimization program and implement it with ease.
When you’re finished, not only will you know how Google uses schema markup to rank websites and serve content within search engine results pages (SERPs), but you’ll also know why ignoring it can cost you rankings, clicks, traffic, and customers.
Let’s go!
Schema markup (aka schema.org markup or simply schema) is a structured data vocabulary created by Google, Yahoo, Yandex, and Microsoft (Bing) to help search engines better understand the meaning of web pages.
Think of schema like the language of search engines: it’s a vocabulary of types and properties that you attach to specific parts of your HTML to tell Google exactly what they mean and how they relate to other topics or subtopics across the web.
Schema markup gives search engines explicit instructions, and it describes the relationships between words on the page. No guessing.
Let’s explore an example.
This is what standard HTML (no schema) looks like for a section of a website that describes a person (Jane Doe), her address, and links to related information.
Now here is an example of the same HTML, only with schema microdata added:
By using Person schema, this example tells search engines that Jane is a professor who lives in Seattle, can be reached via email or phone, and works with two notable graduate students, Alice and Bob.
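As a rough sketch, Person microdata along these lines might look like the following. It is reconstructed from the details described above rather than copied from any real page, and the street address, postal code, phone number, email, and URLs are placeholder assumptions.

```html
<!-- Illustrative only: contact details and URLs are placeholders -->
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <span itemprop="jobTitle">Professor</span>
  <!-- A nested item: the address is its own PostalAddress entity -->
  <div itemprop="address" itemscope itemtype="https://schema.org/PostalAddress">
    <span itemprop="streetAddress">123 Example Street</span>
    <span itemprop="addressLocality">Seattle</span>,
    <span itemprop="addressRegion">WA</span>
    <span itemprop="postalCode">98101</span>
  </div>
  <span itemprop="telephone">(555) 555-0100</span>
  <a href="mailto:jane.doe@example.com" itemprop="email">jane.doe@example.com</a>
  Graduate students:
  <a href="https://example.com/alice" itemprop="colleague">Alice</a>
  <a href="https://example.com/bob" itemprop="colleague">Bob</a>
</div>
```

Everything a visitor sees stays the same. The itemscope, itemtype, and itemprop attributes only add a machine-readable layer on top.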
You or I can look at the words on the page and immediately understand the context of the related information and how it’s linked together. Search engines (ahem, machines) have a hard time understanding context and relationships (they don’t speak English). Schema markup uses a vocabulary they can understand.
In a word, disambiguation.
Schema helps add meaning to the web. It connects facts in a way that search engines can understand, without ambiguity, so they can better answer search queries and rank web pages.
But let’s peel this onion back to learn just how important structured data is to your SEO.
Essentially, Google’s algorithm goes through four steps to rank websites:
- First, Google wants to know, with certainty, what topic (aka “entity”) a page is talking about
- Then they want to analyze the related attributes of that topic to figure out how well a certain page explores said topic (i.e. relevance)
- Then they want to check how many other people think that page explores the topic well by analyzing who links to it and how often (i.e. authority)
- Last, using relevance and authority inputs, they’ll score and rank web pages
That’s pretty much it.
But how do they know with certainty what topic the page is about?
Knowledge graphs.
Knowledge graphs are structured databases full of entities (distinct topics like people, places, objects, things, or concepts). And each entity has a web of connected attributes (subtopics).
Schema is like the sticky spiral silk holding this web of information together: it takes otherwise ambiguous, disconnected pieces of information and connects them.
Like the example above, Google has its own knowledge graph, called (you guessed it) the Knowledge Graph.
The Knowledge Graph (with a capital K) is home to over 500B facts about 5B entities, and Google built it using data from sources like Wikipedia, Wikidata, the CIA World Factbook, and schema structured data across the web.
Let’s take a closer look:
Using structured data, Google algorithms are able to look at this knowledge graph of shoes and know with certainty that high boots are a type of boot, or that running shoes are a type of sport shoe. That’s because the data is explicitly linked (see green lines).
But they’re also able to infer that “Ronverse” are a type of sport shoe, that bright crimson is a warm color, and that high boots are a type of shoe, even though they’re not explicitly linked.
That’s the power of structured data. By linking enough information together, Google is able to infer the rest.
And that’s exactly what search engines try to do every time someone types a query: to infer meaning based on what they already know, then rank pages that meet those expectations.
More structured data makes it easier for search engines to interpret the meaning of your page. The more they can interpret the meaning of your page, the better they can infer new facts about it. The more new facts they can infer, the more relevant (or irrelevant) they’ll find your page. The more relevant your page, the higher you’ll rank.
Pretty neat, ya?
Back in 2015, everyone anticipated that schema would be the next big ranking factor.
After all, if you went the extra mile to mark up your HTML with explicit instructions, making search engines’ lives easier in the process, why wouldn’t you get rewarded with higher rankings?
It never happened 🙅.
But that doesn’t mean schema structured data doesn’t benefit your SEO.
It does, even if indirectly.
At this point in the article, hopefully, one thing is clear: schema helps Google better understand you, which can help you rank, even if indirectly.
But at the top of this totem pole of structured data and knowledge graphs is the fire-breathing head: semantic search.
Semantic search is how we describe the search of today: evolved algorithms that use artificial intelligence, machine learning, and natural language processing to understand the intent and meaning of search queries so they can better rank and serve content.
Bottom line: search engines are becoming more human (cue the theme song from I, Robot).
Semantic SEO, by extension, refers to optimizing for this new age of search: focusing on search intent and helping Google better understand the meaning of your content, not obsessing over individual keywords and keyword density.
So if Google and company have already moved to a semantic web, and if they’ve made it clear that meaning is going to be the cornerstone of search moving forward, and if they already use structured data to populate SERPs, doesn’t it make sense to move with the current?
Of course. And schema is your paddle.
A snippet is another name for an organic search result.
A rich snippet (or rich result) refers to an organic Google search result that has more than the standard blue link, title, URL, and meta description. It may have a recipe, star rating, carousel, product quantity or price label, or something else.
For example, this is a rich snippet of a star review rating on Yelp:
Using Google’s free rich snippet tester, you can see Yelp includes structured data for four different elements, including reviews:
If you drill down deeper (by clicking on a structured data dropdown), you can even see the exact schema code used:
Pretty awesome, right?
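For a sense of what review markup can look like, here is a minimal JSON-LD sketch. It is not Yelp’s actual code: the business name, location, and rating figures are made up for illustration, and whether a rich snippet actually appears is always up to Google.

```html
<!-- Illustrative only: the business and its rating figures are invented -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Restaurant",
  "name": "Example Bistro",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Seattle",
    "addressRegion": "WA"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "212"
  }
}
</script>
```

The aggregateRating property is the part that describes the star rating: an average score (ratingValue) and the number of reviews behind it (reviewCount).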
Currently, you can add structured schema to the following categories and receive a rich snippet:
- Reviews
- Recipes
- Movies
- Products
- Events
- Fact-checks
- Carousels
- Videos
- Music
- FAQs
- Q&A
- How-to articles
- Apps (software)
And though not all schema categories get rich snippets now, Google has stated that they may in the future:
Why do rich snippets matter for SEO?
More search real estate. More opportunities to make your result stand out from others. More click-throughs from search (i.e. higher CTR). More traffic to your website.
Bonus: Within Google’s Search Console, rich snippets get their own data: clicks, impressions, click-through rate (CTR), and ranking position.
That makes it easy to monitor how the click-through rates of your rich snippets compare with those of your standard snippets.
Think of Google’s Knowledge Graph like one big knowledge graph full of other knowledge graphs (like a Russian nesting doll of knowledge graphs).
Every entity has its own knowledge graph, which in turn has its own web of related attributes, which in turn function as their own entities with attributes.
This means that you, your business, and the services you sell are each entities with their own knowledge graphs that search engines learn from.
For example, here’s a knowledge graph of a person (the CEO of SchemaApp):
Using schema, you can create your own knowledge graphs.
For example, say you have an explainer video on your homepage. Google sees a video, content, your logo, and your URL. But they don’t necessarily know the video is about your business.
With schema, you can tell Google that the video is about your business.
You can also tell Google that you are a provider of certain services. Or that a page is about a main topic. Or that you serve a specific region. Or that your product costs $14.99.
Starting to see how building a more informed knowledge graph about you and your business can help search engines better understand what you do and who you’re for so they can serve the best search result?
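Here is a minimal sketch of how those statements could be expressed in JSON-LD. The organization, video, service, URLs, upload date, and region are all hypothetical placeholders, and the $14.99 price is attached to a service rather than a product purely for brevity.

```html
<!-- Illustrative only: names, URLs, dates, and the price are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Agency",
      "url": "https://example.com/",
      "logo": "https://example.com/logo.png"
    },
    {
      "@type": "VideoObject",
      "name": "What Example Agency does",
      "description": "A short explainer video about Example Agency.",
      "thumbnailUrl": "https://example.com/video-thumbnail.jpg",
      "uploadDate": "2024-01-15",
      "contentUrl": "https://example.com/explainer.mp4",
      "about": { "@id": "https://example.com/#organization" }
    },
    {
      "@type": "Service",
      "name": "SEO audit",
      "provider": { "@id": "https://example.com/#organization" },
      "areaServed": { "@type": "Country", "name": "United States" },
      "offers": {
        "@type": "Offer",
        "price": "14.99",
        "priceCurrency": "USD"
      }
    }
  ]
}
</script>
```

The "@id" on the organization works like a name tag. The video’s about property and the service’s provider property both point back to it, which is what turns three separate facts into one small connected graph.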
Bonus: schema helps inform your Knowledge Panel too.
A Knowledge Panel refers to the box of additional information to the right of SERP results.
For example, this is what KlientBoost’s Knowledge Panel looks like:
The more structured your knowledge graph, the more informative your Knowledge Panel.
First things first: err on the side of caution and seek help from a webmaster if you’re not one yourself.
It’s not that schema is terribly difficult to figure out on your own. But implementing it without any technical background might prove difficult (and cause more headaches than anything else).
Either way, follow these five steps to ensure you get it right.
You know what schema does. But what are the mechanics that make it possible?
Essentially, with every schema you create, you're linking a subject and object together.
“This is talking about that.”
To do that, not only do you need a subject and an object, but you also need a schema property to function as the link (otherwise ‘this’ and ‘that’ are just floating off in space).
Martha Van Berkel of SchemaApp calls this a triple: subject (1), schema property link (2), object (3).
This (subject) belongs to (property) that (object).
Let’s look at two of her triple examples:
In the above example, we see a schema that states “The Queen’s spouse is Prince Philip.”
The Queen is the subject, and “spouse” is the schema property that links to Prince Philip who is the object.
In this example, we see a schema that states “Harrods services the area of Britain.”
Harrods is the subject, and areaServed is the schema property that links to the object Britain.
Subject. Property (the link). Object.
This belongs to That.
If you can remember that, you’re 90% of the way there.
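To make the pattern concrete, here is one way the two triples above could be written as JSON-LD. This is a sketch, not SchemaApp’s own markup, and the choice of the Person, DepartmentStore, and Place types is our assumption.

```html
<!-- Illustrative only: the types chosen for each entity are assumptions -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Person",
      "name": "The Queen",
      "spouse": { "@type": "Person", "name": "Prince Philip" }
    },
    {
      "@type": "DepartmentStore",
      "name": "Harrods",
      "areaServed": { "@type": "Place", "name": "Britain" }
    }
  ]
}
</script>
```

In each entry, the outer object is the subject, the key (spouse, areaServed) is the property, and the nested object is the object of the triple.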
The schema vocabulary is vast.
How many subjects can you actually link to objects? Like, a ton.
So instead of exploring each one in detail, we encourage you to explore Schema.org’s full list of available schemas, organized in hierarchical order: Schema.org vocabulary of schemas.
You’ll find at least a handful of available schemas that fit perfectly with your business that you never expected were possible. Promise.
Otherwise, here are the most popular schema types that every business should consider, along with links to their documentation:
Google supports schema via three methods:
- Microdata
- RDFa
- JSON-LD (best/recommended)
While you can use any of the three, JSON-LD has clear advantages.
Let’s explore.
Microdata is the syntax Schema.org originally launched with for nesting structured data within HTML.
However, it has its limitations: not only is it tedious and unscalable, but it’s difficult to mix vocabularies or invert the direction of a property relationship since it requires that information be close together on the page.
It’s not the easiest method, but still effective in many cases.
Microdata schema includes three building blocks:
- Itemscope
- Itemtype
- Itemprop
<div itemscope itemtype="https://schema.org/Movie">
  <h1 itemprop="name">Dune</h1>
  <span>Director: <span itemprop="director">Denis Villeneuve</span> (born October 3, 1967)</span>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/dune-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
It sounds foreign, I know. But stay with me.
- Itemscope is a container (like a box). It tells search engines that the information contained within the <div></div> is part of the same family. That’s it.
- Itemtype, which comes immediately after the scope, tells search engines what specifically the information within the container is talking about. Is it a movie? A product? A review? A person? Think of the itemtype as the subject of the sentence. But it’s still pretty broad.
- Itemprop is where we get specific. The itemprop specifies a property, or related attribute, of the itemtype. And you can include as many as you want. Sticking with our movie example: you can have a director property, release date property, genre property, trailer property, and much more. The itemprop is the connector, and whatever it specifies is the object.
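Because the object of an itemprop can be an entity in its own right, scopes can be nested. The sketch below extends the Dune example so that the director becomes a Person with its own properties. The nesting and the birthDate markup are our illustration, not part of the original snippet.

```html
<div itemscope itemtype="https://schema.org/Movie">
  <h1 itemprop="name">Dune</h1>
  <!-- The director is an object in its own right, so it gets its own scope and type -->
  <div itemprop="director" itemscope itemtype="https://schema.org/Person">
    Director: <span itemprop="name">Denis Villeneuve</span>
    (born <time itemprop="birthDate" datetime="1967-10-03">October 3, 1967</time>)
  </div>
  <span itemprop="genre">Science fiction</span>
</div>
```

Read it as two linked statements: the movie has a director, and that director has a name and a birth date.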
RDFa is an acronym for Resource Description Framework in Attributes.
Conceptually, RDFa’s syntax is almost identical to microdata’s:
- Microdata: <p itemscope itemtype="http://schema.org/Person"> <span itemprop="name">Jimmy Dean</span> is his name. </p>
- RDFa: <p typeof="schema:Person"> <span property="schema:name">Jimmy Dean</span> is his name. </p>
However, they have important differences.
In general, RDFa is a more flexible, standardized method than microdata that lets you mix vocabularies and invert properties.
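As a rough sketch of both ideas, the RDFa below mixes schema.org with a second vocabulary (FOAF) and uses the rev attribute to invert a property. The FOAF prefix and the "Example Foods" organization are assumptions for illustration, and how much of this any given search engine actually reads is not something this example can promise.

```html
<!-- Illustrative only: the FOAF prefix and "Example Foods" are assumptions -->
<div vocab="https://schema.org/"
     prefix="foaf: http://xmlns.com/foaf/0.1/"
     resource="#jimmy" typeof="Person">
  <!-- One element, two vocabularies: schema.org name and FOAF name -->
  <span property="name foaf:name">Jimmy Dean</span>
  <!-- rev inverts the direction: the organization has Jimmy as a member -->
  <div rev="member" typeof="Organization">
    works at <span property="name">Example Foods</span>
  </div>
</div>
```

Without rev, the statement would have to start from the organization ("Example Foods has member Jimmy Dean"). With rev, it can be written from Jimmy’s side of the page while still meaning the same thing.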
JSON-LD is an acronym for JavaScript Object Notation for Linked Data.
In layman’s terms, JSON-LD is a way to describe and link information on a page using JSON, a plain-text data format borrowed from JavaScript.
JSON-LD is the easiest (and least tedious) method for implementing schema because you don’t have to wrap individual HTML elements with microdata. Instead, you can just paste your JSON-LD, wrapped in a <script> tag, into the <head> (or <body>) section of your HTML document.
It’s also easier to express nested data using JSON-LD since it doesn’t require that the information be close together on the page (i.e. within the same HTML container).
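For comparison, here is a sketch of the Dune microdata from earlier rewritten as JSON-LD. The example.com trailer URL and the trailer’s name are placeholders we added, and nesting the director as a Person is our choice.

```html
<!-- Illustrative only: the trailer URL and trailer name are placeholders -->
<head>
  <title>Dune</title>
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Movie",
    "name": "Dune",
    "genre": "Science fiction",
    "director": {
      "@type": "Person",
      "name": "Denis Villeneuve",
      "birthDate": "1967-10-03"
    },
    "trailer": {
      "@type": "VideoObject",
      "name": "Dune theatrical trailer",
      "url": "https://example.com/movies/dune-theatrical-trailer.html"
    }
  }
  </script>
</head>
```

Notice that none of the visible HTML is touched. All of the structured data lives in one block, which is why JSON-LD is easier to maintain.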
JSON-LD uses a strict, non-negotiable syntax: every brace, quotation mark, and comma has to be in the right place. Mess up the syntax and you mess up your schema. Which is why you should consult your webmaster (or hire one) for the job.
Good news: A handful of tools exist that will generate JSON-LD for you (though not for all schemas; only the popular ones).
Our favorite is the Merkle Schema Generator (JSON-LD):
Using Merkle, you can select which schema you’d like to create, fill in the fields, and they’ll automatically generate your JSON-LD. No coding experience required.
Bonus: You can also use Google Tag Manager to insert your JSON-LD (advanced users only).
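One common pattern is a Custom HTML tag that builds the JSON-LD script element with a few lines of JavaScript. Treat the snippet below as a sketch under our own assumptions, not an official Google Tag Manager recipe. The organization name and URL are placeholders.

```html
<!-- Illustrative Custom HTML tag: organization name and URL are placeholders -->
<script>
  (function () {
    var data = {
      "@context": "https://schema.org",
      "@type": "Organization",
      "name": "Example Agency",
      "url": "https://example.com/"
    };
    // Build a JSON-LD script element and attach it to the page
    var script = document.createElement("script");
    script.type = "application/ld+json";
    script.text = JSON.stringify(data);
    document.head.appendChild(script);
  })();
</script>
```

Because the markup is injected after the page loads, test the live URL (not just the pasted code) to confirm the structured data is actually picked up.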
Use JSON-LD to implement your schema. Not only is it the method Google recommends, but it’s also an official W3C (World Wide Web Consortium) standard.
There are certain instances where microdata or RDFa might make more sense, like if you’re only implementing a small piece of schema or if you want to edit existing HTML instead of inserting new data. Otherwise, use JSON-LD.
Bonus: Depending on your content management system (CMS), you might be able to use a plugin to implement schema markup. For example, if you use WordPress, the Yoast Plugin will do the heavy schema lifting for you.
Before pushing your schema markup live, always validate for accuracy.
Luckily, two free structured data testing tools make it easy:
- Schema Markup Validator (Schema.org): Simply paste your HTML markup into the left side of the validator, click the green arrow in the center (bottom), then check to see if it passes on the right. If there are errors in your markup, Schema.org will notify you and provide instructions on how to fix it.
- Rich Results Test (Google): This tool only tests for markup that produces rich results, which is only a fraction of available schemas. But it’s still a useful tool for previewing rich results before they go live. With the Rich Results Test, you can enter a URL (Google will scan the page for markup), or you can paste your code directly (for JSON-LD).
More good news: Google has a tool, the Structured Data Markup Helper, that will produce JSON-LD markup for you if you just highlight parts of the page with your mouse.
Seriously.
Just highlight the elements of your page that you want to mark up, and it will generate the markup for you. Then you can paste the microdata or JSON-LD.
Here’s how it works:
Step 1: Input the URL or HTML, and select the type of markup
Step 2: Highlight elements of the page you want to mark up
Step 3: Check the right panel to make sure you marked everything up
Step 4: Hit “Create HTML” in the top right corner. Then copy and paste to your website.
Congrats, you made it to the end! 👏 👏 👏
We know: learning how to make your content machine-readable isn’t every SEO’s favorite topic. But it’s important nonetheless.
Bottom line: We’re living in the world of semantic search, and structured data like schema is the semantic vocabulary search engines understand best.
Without structured data, the world wide web is a universe of disconnected stars. With structured data, that universe has meaning to search engines.
Ignore schema markup at your own risk.
But answer this one question first: Would you rather Google guess what your website is about, or would you rather tell them?
Thought so ;)