Expert - Google Employee
Justin Quizon

Structured data markup for datasets

Have a look before you post -- maybe the answer to your question is already here!

If not, kick off a new thread in the Structured Data section.

Q: What is Google Dataset Search and how do I find out more?

There are several resources that help you learn more about Google Dataset Search:

Q: How do I add my dataset to Dataset Search?

If you have a web page that describes a dataset (or many such web pages), you need to do the following to have it included in Dataset Search:

  • [Required] Add metadata in schema.org to each page that describes a dataset (documentation).

  • Verify in the Structured Data Testing Tool that the markup produces the structured data you expect.

  • If you have multiple pages, create a sitemap and submit it in Search Console.

If the page has been crawled but after a week or two you still don't see it in Dataset Search, please file a bug using the "Feedback" button.
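
As a rough illustration of the first step, the sketch below builds a minimal schema.org/Dataset description and wraps it in the script element that carries JSON-LD on a page. The function names, the dataset name, and the example.org URL are hypothetical placeholders, and the set of properties shown is an assumption rather than the full list in the documentation.

```python
import json


def dataset_jsonld(name, description, url):
    """Return a minimal schema.org/Dataset description as a dict."""
    return {
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }


def as_script_tag(markup):
    """Wrap JSON-LD in the script element that is embedded in the page."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset page; replace these values with your own.
example = dataset_jsonld(
    name="Example rainfall measurements",
    description="Daily rainfall totals for an example region.",
    url="https://example.org/datasets/rainfall",
)
print(as_script_tag(example))
```

The printed element can be pasted into the page source and then checked in the Structured Data Testing Tool.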

Q: Why is a specific dataset not showing up in Dataset search results?

Most likely, this happens because there is no structured data on the page that describes the datasets. To verify, copy the link for the page that you expect to see in Dataset search results, and put it into the Structured Data Testing Tool. If you do not see any "Dataset" on the right-hand side, this means there is no schema.org/Dataset markup on the page or it is incorrect. If you own the page, you can fix it (instructions) or you can contact the owner of the page.

Even if there is markup on the page, we may not have crawled it yet. If you own the page, you can check its crawl status in Search Console.
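
For a quick local check before using the Structured Data Testing Tool, something like the following sketch can tell you whether a page's HTML contains any JSON-LD item typed as Dataset. It is only an approximation: the class and function names are hypothetical, it looks only at top-level JSON-LD items, and it does not validate the markup against the guidelines.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def has_dataset_markup(html):
    """Return True if any top-level JSON-LD item in the page has @type Dataset."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is treated as absent
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            if types == "Dataset" or (isinstance(types, list) and "Dataset" in types):
                return True
    return False


# Hypothetical page source used only for illustration.
page = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "http://schema.org/", "@type": "Dataset", "name": "Example"}'
    "</script></head><body></body></html>"
)
print(has_dataset_markup(page))
```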

Q: Is a contract required to have the results be listed in Dataset Search?

No. Dataset Search relies on schema.org structured data markup, an open standard widely used around the web. Adding markup to a site is purely voluntary, and partners can remove the markup at any time.

Q: Will my site rank higher if I implement this feature?

No, your result will maintain its organic ranking. However, implementing this markup is a requirement for results to show up in Dataset Search.

Q: Can I expect a traffic increase?

As with Search in general, there are no traffic guarantees. However, we believe users will discover more of your content, and the users who click or tap through to your site are likely to have much higher intent.

Q: How do I delete a dataset from Dataset search results?

If you don't want a dataset to show up in Dataset search results, and you own the dataset page, simply delete the structured data for schema.org/Dataset on the page. Keep in mind that it might take some time (days or weeks, depending on the crawl schedule) for the changes to be reflected on the Dataset search side.

Q: Why is some information from structured-data markup not displayed in the Dataset search results?

The goal for the results pages is to provide our users with the most reliable and predictable experience across the data that we collect from thousands of repositories. While we use all the structured data in our product, the decision of what to display and how to display it is guided by many different factors. As owners of dataset repositories provide more high-quality structured data describing their content, we will continue to enrich the result pages in the product.

Q: What tools are available to help with markup?

You can use the Markup Helper to generate sample JSON-LD markup for a page that describes a dataset. Simply select "Dataset" in the Markup Helper, put in an address for one of your pages, and select and tag different components.

When you already have structured data on the page, the Structured Data Testing Tool (SDTT) is useful to verify the data. You can also use this tool to look at pages from other sites for examples of markup.

Q: How should the publication DOI and Dataset DOI be specified in the markup?

{
  "@context" : "http://schema.org/",
  "@type" : "Dataset",
  "@id" : "https://doi.org/10.5061/dryad.8nm16",
  "url" : "https://doi.org/10.5061/dryad.8nm16",
  "identifier": "10.5061/dryad.8nm16",
  "citation": "doi:10.1111/jav.01596"
}

We are not sure why SDTT marks this as a problem since the value for citation can be text, but please be assured that if you plan to include markup as shown in these examples, it should be okay.

An even better markup example would be:

{
  "@context" : "http://schema.org/",
  "@type" : "Dataset",
  "@id" : "https://doi.org/10.5061/dryad.8nm16",
  "url" : "https://doi.org/10.5061/dryad.8nm16",
  "identifier": "10.5061/dryad.8nm16",
  "citation": {
    "@type" : "Article",
    "identifier" : "doi:10.1111/jav.01596"
  }
}

as this example specifies what kind of DOI is being provided in the citation. Please note that if you cannot include the “citation” property, just specifying the "identifier" property should be fine, but it should follow the same format as in the sample above.
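
If you generate markup for many datasets, the pattern above could be produced by a small helper along these lines. This is a sketch only: the function name and its parameters are hypothetical, and it assumes the DOIs are passed without the "https://doi.org/" or "doi:" prefix.

```python
import json


def dataset_with_dois(dataset_doi, article_doi=None):
    """Build Dataset markup with the dataset DOI and, optionally, the cited article DOI."""
    markup = {
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "@id": "https://doi.org/" + dataset_doi,
        "url": "https://doi.org/" + dataset_doi,
        "identifier": dataset_doi,
    }
    if article_doi is not None:
        # Typing the citation as an Article states what kind of DOI this is.
        markup["citation"] = {
            "@type": "Article",
            "identifier": "doi:" + article_doi,
        }
    return markup


# DOIs taken from the samples in this answer.
markup = dataset_with_dois("10.5061/dryad.8nm16", "10.1111/jav.01596")
print(json.dumps(markup, indent=2))
```

Calling the helper without an article DOI leaves out the "citation" property and keeps only "identifier", which matches the fallback described above.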

Q: Will Google's crawler be loading JavaScript from pages when looking for markup?

The Structured Data Testing Tool and crawlers should be able to execute JavaScript and load the markup. That being said, if there are issues with the tool, it would be best to adjust the JavaScript so that the markup is visible in the tool as well. Otherwise, you can paste the markup from the page directly into the Structured Data Testing Tool to verify that it is good.

It is also worth noting that it would be best to use the Structured Data Testing Tool provided by Google rather than a third-party extension, as our tool more closely follows our dataset guidelines.

Q: The “identifier” property is not appearing on the SDTT, why is that?

This is a known issue with the tool which will be addressed by our team in a future iteration. While the Structured Data Testing Tool is a great way to validate the syntax of your markup, it may show errors or warnings that do not need to be addressed. So long as the markup in the page source adheres to the guidelines provided in our documentation, there should be no issues.

Q: How should multiple authors be specified in the markup?

We highly recommend using the "citation" property to include this information. You can create an array of citations and order the authors as best fits the dataset.

Q: How would you add acknowledgement for funding in the dataset markup?

Although it is currently not a listed property in our documentation, you can refer to details about the "funder" property on http://schema.org/Dataset.
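
As a hedged illustration, funding could be acknowledged by adding a "funder" value to the Dataset markup, as in the sketch below. The dataset name and the funding organization are hypothetical, and because the property is not listed in the Dataset Search documentation, there is no guarantee about how it is used.

```python
import json

# Hypothetical dataset with a hypothetical funding organization.
dataset = {
    "@context": "http://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "funder": {
        "@type": "Organization",
        "name": "Example Funding Agency",
    },
}
print(json.dumps(dataset, indent=2))
```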

Q: How would we raise any issues we have about the current Dataset properties or suggest new properties to be added to the feature?

We suggest going to https://github.com/schemaorg/schemaorg/issues to discuss any changes to markup properties or any issues that your team is seeing.

Q: Is there something formal about actually submitting the sitemap to Search Console versus just having dataset content present on the web pages to be discovered organically?

The submission is to ensure that the sitemap will be crawled if it has not already been crawled naturally. That being said, there is some latency in crawling the pages, so please allow some time for them to be detected.

Q: We noticed that the structured data tool seems to strip newlines from the output on the right hand side; is that correct?

So long as the newline characters are included in the "description" property in the source code of the page, that should be fine.
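
To see what this looks like in the page source, the sketch below serializes a description containing a newline and reads it back. The dataset name and description text are made up for illustration. In the serialized JSON the newline appears as the escape sequence \n inside the string, and it is restored when the JSON is parsed.

```python
import json

# Hypothetical description with a line break between two paragraphs.
description = "First paragraph of the description.\nSecond paragraph on a new line."
markup = {
    "@context": "http://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "description": description,
}

serialized = json.dumps(markup)           # the newline is written as the escape \n
restored = json.loads(serialized)         # parsing turns the escape back into a newline
print(serialized)
print(restored["description"] == description)
```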

Q: Is it possible to create a sitemap for a subsection of our web site?

You are allowed to create and submit a sitemap specific to a subsection of your website rather than a sitemap of the entire site. Once you have a sitemap ready, you can submit it in Google Search Console.
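
A subsection sitemap can be as simple as a list of the dataset page URLs in the standard sitemap XML format. The sketch below is one way to generate it, assuming a hypothetical site at example.org whose dataset pages all live under /datasets/; the function name and URLs are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site: only the pages under /datasets/ go into this sitemap.
all_pages = [
    "https://example.org/about",
    "https://example.org/datasets/rainfall",
    "https://example.org/datasets/temperature",
]
dataset_pages = [u for u in all_pages if u.startswith("https://example.org/datasets/")]
sitemap_xml = build_sitemap(dataset_pages)
print(sitemap_xml)
```

Note that ET.tostring with encoding="unicode" does not emit an XML declaration, so you may want to add one when writing the result to a file.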

Q: How would you go about adding multiple “GeoCoordinates” or “GeoShapes”?

You can use an array of values for the “spatialCoverage” property to specify multiple points, multiple shapes, or a combination of both. A sample is included below for reference:

"spatialCoverage": [{
    "@type": "Place",
    "geo": {
      "@type": "GeoCoordinates",
      "latitude": 39.3280,
      "longitude": 120.1633
    }
  },
  {
    "@type": "Place",
    "geo": {
      "@type": "GeoShape",
      "box": "39.3280 120.1633 40.445 123.7878"
    }
  }]

Q: Our company logo is not appearing or is appearing incorrectly next to our dataset search results, how can we adjust this?

There are several steps to getting the image to show up. You should be able to see your logo in the panel on the right-hand side of the main Google search results, similar to what you see if you run a search such as [pangaea]. If you don't see the image, or it is not the image that you want, you will need to follow the blog entry and this developer resource to update the image.

All Replies (20)
Marten Hogeweg
I have a DCAT JSON file rendered from my metadata catalog (sample: http://gptogc.esri.com/geoportal/dcat.json). This is being used to let Data.gov harvest my catalog. Is there a way to register such a DCAT file instead of exposing the same content as a sitemap file?
Vincent Armentano Jr.
Searching seems to prefer different phrasing than regular Google. Is there a template for how to make searching most effective? (e.g. "weather site:noaa.gov")
Ellie Kesselman
My question is the same as Vincent's, but more generally phrased:  What is the search syntax for Google Dataset Search?

As Vincent said, it is clearly different from Google regular or special search syntax. Is there a page that lists everything comprehensively? I checked the newest Google blog posts on Dataset Search, https://www.blog.google/products/search/making-it-easier-discover-datasets/ and also this one from July 2018, about dataset search for Google News initiative, but couldn't find a dataset search syntax listing. 
Peter Sefton
The dataset  guidelines give an example of a contactPoint property on an Organization with the Organization referenced by a creator property on the Dataset. Would this also work if the property was "publisher" rather than "creator"? Better still, couldn't you add contactPoint as a property directly on Dataset?

(I am working on a standard for packaging research data which is already using Schema.org, your feedback would be appreciated: https://github.com/UTS-eResearch/datacrate/blob/master/spec/0.3/data_crate_specification_v0.3.md)
Peter Sefton
In reference to my other comment: would Dataset search index a contactPoint property on a person if they are a creator?
Mindey I.
Do you plan to offer a better option for marking up data than JSON-LD?