2020.08.25

Fun With JSON-LD

Learning what JSON-LD is all about, and why we should use it.

Working with Adam Riemer on SmugMug’s SEO has been a really illuminating experience. SEO consulting has always been flagged in my mind as “Snake Oil Business”, but Adam really is the best in the field. Almost all of his SEO suggestions focus on performance and accessibility, and he has some clear, hard metrics to define “good”. This squares with my fundamental understanding of good SEO practices, and has broadened my horizons and understanding of the practice.

Something that Adam introduced me to is JSON-LD, a way of creating structured metadata for pages that’s more explicit than microdata formats. Here’s what I’ve learned about JSON-LD so far.

JSON-LD is Google’s preferred format for accurately and succinctly structuring metadata for pages. This gives them insight into what’s on your page and why, and they use The Algorithm to interact with and consume this data. Using their standards gives you the opportunity to get top, fancy search results, but there’s no guarantee of that. The best thing to do is to use your structured data to give the best, most accurate, and most complete picture of what content your page has for your audience. Trying to game SEO here is probably going to backfire; just describe things as they are, as clearly as possible.

The primary purpose of structured data is to create machine-readable, algorithm-friendly metadata for your content. This allows the content to be consumed by the crawlers and the robots, and to join the mesh of content that Google exposes to users when they perform searches or ask questions of it.

Clearly this is a double-edged proposition. By using structured data you’re explicitly buying in to the ecosystem that Google is creating, and allowing your content to be trawled and used and understood however they want. You undoubtedly end up providing value to Google in excess of what they are providing to you. Not to mention participating in the project of making the world machine-readable, which has its own philosophical freight.

Schema.org has a lot of data types that might be ap­pro­pri­ate for your pro­ject: Articles, Books, Breadcrumbs, Carousel, Course, Critic Review, Dataset, Event, How-to, Local Business, Movie, Podcast, Product, Software App, and Video are all ones that look in­ter­est­ing to me.

For something like this site, we’re using pretty much entirely WebSite and Article, and connecting them with a CollectionPage and a Person who is me! Maybe some of the art will be a CreativeWork.

Some in­for­ma­tion on these types:

Let’s work through Google’s example of an article, maybe for this article!

Here’s the script tag that is home to our struc­tured data:

<script type="application/ld+json"></script>

We fill it with a JSON ob­ject that de­scribes our data struc­ture:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article headline",
  "datePublished": "2020-08-25T16:42:53.786Z",
  "dateModified": "2020-08-25T16:42:53.786Z"
}

The @context key clues the robot in to the data definition we’re going to be using, which is the schema.org definitions. The @type key associates the following data with the pre-defined structure. From there on it’s relevant data! headline, datePublished and dateModified are all directly pulled from the content itself. In our case our data looks like this:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Fun With JSON-LD",
  "datePublished": "2020-08-12T08:00:00+08:00",
  "dateModified": "2020-08-12T08:00:00+08:00"
}
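On a site with more than a handful of posts, this object would probably be generated from each post's metadata rather than written by hand. Here is a minimal Python sketch of that idea. The helper names `article_jsonld` and `as_script_tag` and their parameters are made up for illustration; only the keys and values come from the example above.

```python
import json
from datetime import datetime, timedelta, timezone


def article_jsonld(headline: str, published: datetime, modified: datetime) -> dict:
    """Build the minimal Article object described above (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        # isoformat() on a timezone-aware datetime gives an ISO 8601 string with offset.
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


def as_script_tag(data: dict) -> str:
    """Serialize the object and wrap it in the JSON-LD script tag."""
    # Escape "</" so a string value can never close the script element early;
    # "\/" is a valid JSON escape for "/", so the parsed data is unchanged.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


published = datetime(2020, 8, 12, 8, 0, tzinfo=timezone(timedelta(hours=8)))
tag = as_script_tag(article_jsonld("Fun With JSON-LD", published, published))
print(tag)
```

Going through a real JSON serializer also avoids the small syntax slips that are easy to make when hand-editing JSON.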

Open question: BlogPosting or Article? I’m going to stick with BlogPosting since these texts are really just that. I would use Article if I were writing a news piece or a review, or something maybe more scholarly.

The last re­quired field is an image:

For best re­sults, pro­vide mul­ti­ple high-res­o­lu­tion im­ages (minimum of 300,000 pix­els when mul­ti­ply­ing width and height) with the fol­low­ing as­pect ra­tios: 16x9, 4x3, and 1x1.

{
	…
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ]
}

This means that creating thumbnails for every Article is important, and those images need to exist on the page in a way that users can see.
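The size and ratio guidance is easy to check mechanically before an image goes into the `image` array. Here is a rough Python sketch. The 300,000 pixel minimum and the three ratios come from the guidance quoted above, while the `classify_image` name and the 1% tolerance are my own assumptions.

```python
from fractions import Fraction
from typing import Optional

MIN_PIXELS = 300_000  # minimum width * height from the guidance quoted above
RATIOS = {"16x9": Fraction(16, 9), "4x3": Fraction(4, 3), "1x1": Fraction(1, 1)}


def classify_image(width: int, height: int, tolerance: float = 0.01) -> Optional[str]:
    """Return the matching ratio name, or None if the image is too small or off-ratio.

    The tolerance is an assumption, there to forgive off-by-a-pixel crops.
    """
    if width * height < MIN_PIXELS:
        return None
    actual = Fraction(width, height)
    for name, ratio in RATIOS.items():
        if abs(actual - ratio) / ratio <= tolerance:
            return name
    return None


print(classify_image(1200, 675))  # 16x9
print(classify_image(800, 600))   # 4x3
print(classify_image(500, 500))   # None, since 500 * 500 is only 250,000 pixels
```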

For this site, the main use of these images is going to be for sharing thumbnails. The fact that the image needs to be on the page is interesting, since that really influences the design of the page. I’ve found that creating the necessity for a prominent thumbnail or hero image that accompanies each article is a recipe for a) not writing articles and b) bland stock photography. I want to avoid both. That means for this site I’m going to do illustrated images, small sketches and motif explorations that may or may not illustrate the article, and attach them to the bottom of the article.

There are two other sections I want to look at, even though they are not requirements according to Google. These are the author and the publisher fields. The goal of using these fields is to create an association between you and your work; or in the case of the publisher field, between an imprint entity and the creative works they’ve published. In our use case for this site, my goal is to create a machine-readable entity that is ‘Nikolas Wise’ and attach my articles and my work to that, in order to create a coherent entity that is exposed to the broader web.

The author field is a Person or an Organization, the publisher field is an Organization. Let’s start with Person:

A person (alive, dead, undead, or fictional). https://schema.org/Person

It gets added to our JSON-LD like this:

{
	…
  "author": {
	  "@type": "Person",
	  …
  }
}

There are a lot of prop­er­ties in this schema, like deathPlace and knows. One could re­ally get into this and make it a very ro­bust and com­plete data ob­ject, but I’m not sure how much value that would bring at the end of the day. There’s a fine line be­tween fol­low­ing specs and best prac­tices to achieve a goal and tick­ing boxes to struc­ture our lives solely in or­der to make them leg­i­ble to the al­go­rithm. I guess we each de­cide where that line is for our­selves.

For me, I’m going to stick with name, url, image, jobTitle, knowsLanguage, and sameAs. publishingPrinciples seems interesting, though, and I might write one of those.

Most of the fields are sim­ple text strings, and can get filled out like so:

{
  …
  "author": {
    "@type": "Person",
    "name": "Nikolas Wise",
    "url": "https://nikolas.ws",
    "image": "https://photos.smugmug.com/Portraits/i-ThnJCF5/0/f9013fdc/X4/wise-X4.jpg",
    "jobTitle": "Web Developer",
    "knowsLanguage": ["en", "fr"],
    "sameAs": …
  }
}

The language codes are from the IETF BCP 47 language code spec, and could also be Language schema objects. The job title could be a DefinedTerm schema object.

The sameAs key is an interesting one: it’s either a URL or an array of URLs that connect this Person with other parts of the web that are also that Person.

{
  …
  "author": {
    …
    "sameAs": [
      "https://twitter.com/nikolaswise",
      "https://github.com/nikolaswise",
      "https://www.instagram.com/nikolaswise/",
      "https://www.linkedin.com/in/nikolas-wise-6b170265/"
    ]
  }
}

This will connect “me” with this site and my twitter, github, instagram, and linkedin profiles. Those are the pages that I want the algorithm to associate with “me”.
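Because sameAs can hold either a single URL or an array, anything that reads it back has to handle both shapes. Here is a small Python sketch. The `same_as_urls` helper is hypothetical and not part of any library.

```python
def same_as_urls(person: dict) -> list:
    """Return the sameAs links of a Person object as a list (hypothetical helper)."""
    value = person.get("sameAs", [])
    # A bare string is the single-URL form; wrap it so callers always get a list.
    if isinstance(value, str):
        return [value]
    return list(value)


single = {"@type": "Person", "sameAs": "https://github.com/nikolaswise"}
several = {
    "@type": "Person",
    "sameAs": ["https://twitter.com/nikolaswise", "https://github.com/nikolaswise"],
}
print(same_as_urls(single))
print(same_as_urls(several))
```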

Organization is similar to Person in a lot of ways, and the fundamental idea is the same. The goal is to create a single entity that the algorithm can connect disparate pages and items to. I’m not going to set up an Organization here, but the Organization schema type has the spec for the object.

So that’s it! That means the entire JSON-LD for this article, and therefore for the rest of the texts as well, looks like this:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article headline",
  "datePublished": "2020-08-25T16:42:53.786Z",
  "dateModified": "2020-08-25T16:42:53.786Z",
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ],
  "author": {
    "@type": "Person",
    "name": "Nikolas Wise",
    "url": "https://nikolas.ws",
    "image": "https://photos.smugmug.com/Portraits/i-ThnJCF5/0/f9013fdc/X4/wise-X4.jpg",
    "jobTitle": "Web Developer",
    "knowsLanguage": ["en", "fr"],
    "sameAs": [
      "https://twitter.com/nikolaswise",
      "https://github.com/nikolaswise",
      "https://www.instagram.com/nikolaswise/",
      "https://www.linkedin.com/in/nikolas-wise-6b170265/"
    ]
  }
}
</script>
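A quick sanity check before publishing is to pull the JSON-LD back out of the rendered page and run it through a strict JSON parser, which will reject things like trailing commas. Here is a sketch using only the Python standard library. The `JsonLdExtractor` class and `parse_jsonld` function are names I made up for this example, and this only checks that the JSON parses, not that it satisfies schema.org or Google's requirements.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        # Script content can arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def parse_jsonld(html: str) -> list:
    """Parse each JSON-LD block; raises json.JSONDecodeError if one is invalid."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    return [json.loads(block) for block in extractor.blocks]


page = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Fun With JSON-LD"}
</script>"""
print(parse_jsonld(page)[0]["@type"])  # Article
```

For checking the content itself, Google's Rich Results Test is the better tool; this only catches syntax problems early.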