Monday 24 October 2016

Linked Data

You've probably heard Linked Data being talked about by library people.

You may have even heard jargon being thrown around like triples, the Semantic Web, RDF and BIBFRAME.

But if you're anything like me, it has not been totally clear what linked data is and how it could benefit libraries. But I've been let into the club and you can be too.

The WHAT and HOW?

I've heard a number of speakers at different events talk about linked data, and I thought I kind of understood it based on those talks. As someone who writes code for websites, I was comfortable enough with the descriptions of linked data in terms of the way it is encoded. Without going into the actual syntax of the code, it boils down to the relationships between uniquely identified things, expressed in a collection of triples.

For example, Tim Winton is a person. Cloudstreet is a book. Tim Winton is the author of Cloudstreet. That is a triple: it expresses a relationship between three defined concepts, [person] is the [author] of [work]. This particular relationship has high relevance for library data but it is not the only type of relationship. What about subjects? Mount Everest [place] is the [subject] of Into Thin Air [book]. Linked data is not purely a library thing. There are many ways that different data can be expressed as linked data. Tony Abbott [person] is a [friend] of Malcolm Turnbull [person].
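To make that concrete without touching real RDF syntax, here is a minimal sketch in Python that stores each statement as a plain (subject, predicate, object) tuple. The identifiers such as `person:tim-winton` and the predicate names are made up for illustration; real linked data would use URIs from a published vocabulary.

```python
# Each statement is a (subject, predicate, object) tuple.
# All identifiers below are hypothetical placeholders, not real URIs.
triples = [
    ("person:tim-winton", "type", "Person"),
    ("work:cloudstreet", "type", "Book"),
    ("person:tim-winton", "authorOf", "work:cloudstreet"),
    ("place:mount-everest", "subjectOf", "work:into-thin-air"),
    ("person:tony-abbott", "friendOf", "person:malcolm-turnbull"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Ask: what did Tim Winton write?
print(objects_of("person:tim-winton", "authorOf"))
```

The point is only that every fact has the same three-part shape, so one small lookup function can answer questions about authors, subjects or friendships alike.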

Linked Data is encoded in a structured way based on defined ontologies. Ontologies define concepts like a 'person' - without a person ontology you can't uniquely identify Tony Abbott. And when you have enough of these triples, there is real potential for computers to deduce answers to questions based on the relationships in the data.
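As a rough illustration of that kind of deduction, the sketch below answers "who has written about Mount Everest?" by following two relationships in a row. Nothing in the data states the answer directly. The identifiers and predicate names are again hypothetical, and a real system would run a query language over a proper triple store rather than loop over a Python set.

```python
# Hypothetical identifiers; a real dataset would use URIs from shared ontologies.
triples = {
    ("person:jon-krakauer", "authorOf", "work:into-thin-air"),
    ("person:tim-winton", "authorOf", "work:cloudstreet"),
    ("place:mount-everest", "subjectOf", "work:into-thin-air"),
}


def authors_writing_about(place):
    """Follow two links: place -> works about it -> authors of those works."""
    works = {o for s, p, o in triples if s == place and p == "subjectOf"}
    return sorted(s for s, p, o in triples if p == "authorOf" and o in works)


print(authors_writing_about("place:mount-everest"))
```

The more triples there are, the more of these chains a machine can follow without anyone having written the answer down in advance.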

All that's fine, but it left me totally lost when it came to libraries. I could see how the collection data in a library catalogue could be encoded as linked data. But this would have to be done programmatically - there's no way a few people can hand code all the relationships in a library catalogue.  And even if you could, so what? Where is the value?  I thought that it was potentially of use to the Library of Congress and other national libraries. That maybe those larger organisations might be able to make a case for it out of public interest. What could developers do if all of the Trove data was available to them as linked data?

I had a hazy picture of the what and how of linked data but I was no closer to seeing the potential.

The Penny Drops

So for a couple of years Linked Data has been at the back of my mind, in a place where I felt like the hype around it suggested I should be paying more attention. A couple of weeks ago I signed up for and watched a webinar chaired by Novelist, called Getting Your Library Visible on the Web, simply out of interest.

It turns out that Novelist (Ebsco) offers a Linked Data service where they will convert your bibliographic data to Linked Data and publish it online via the Library.Link network. I could be wrong about this part, but I gather that the actual work to convert the data and the hosting and running of Library.Link is done by a company called Zepheira, formed out of the various collaborative efforts to get library linked data up and running. To make it possible Ebsco are also collaborating with various library vendors to get the data; and at least one, Innovative Interfaces, is offering a similar Linked Data Service directly to their customers.

So you can pay for your library data to be converted and published as linked data through a selection of different vendors, but they all seem to be working together and ultimately I think Zepheira does the data conversion and publishing via Library.Link. That solves the HOW but still not the WHY.

Ultimately, the WHY is simple. Discoverability.

Publishing your library collection information as linked data allows it to be harvested by Google and other search engines. It gets your collection information out of the closed silo that is your library catalogue and into the world where searchers are looking.
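I don't know exactly what Library.Link publishes under the hood, so treat the following as a sketch of the general idea rather than a description of their service. It builds a description of a book using the schema.org vocabulary in JSON-LD, which is one of the structured formats search engines can read. The function name and the catalogue URL are placeholders of my own.

```python
import json


def book_as_jsonld(title, author, catalogue_url):
    """Build a schema.org Book description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "url": catalogue_url,  # placeholder link back to the catalogue record
    }


record = book_as_jsonld(
    "Cloudstreet",
    "Tim Winton",
    "https://catalogue.example.org/record/12345",  # hypothetical URL
)

# The JSON text is what would be embedded in, or published alongside, a web page.
print(json.dumps(record, indent=2))
```

A crawler that finds this on a page doesn't have to guess that "Cloudstreet" is a book or that "Tim Winton" is its author, because the markup says so explicitly.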

The first two results in this search link to Denver Library's catalogue data

The possibility that items from a library catalogue might show up in Google search results is an enticing proposition, for sure.  But it doesn't end there.

You have probably already seen an example of how Google is using Linked Data, probably without even knowing.  Have you ever searched for a movie and been presented with something like this, showing cinemas and session times along with information about the movie?

Google Search results for the movie Magnificent Seven showing session time, and movie information

Google shows these data cards for all types of search terms - people, places, things.  We are seeing the search industry moving from simple lists of results matching search terms to being answer engines. Google is aggregating data from all available sources and trying to present answers to searchers.

Even in a library context it is evident that people are becoming used to this sort of result. But at present, Google sees libraries as roughly the equivalent of a small business. Here's the Google search result for my library. Users can see the location, opening times and contact details right on Google in addition to getting a link to our website, even if they can't yet get collection results. They can get directions, or telephone the library simply by clicking on a link.


And we get analytics from Google that show us people are using that convenience and bypassing our website altogether.



What's more, with current-generation operating systems we are seeing these answer engines being decoupled from the web and web browsers. Siri, Cortana, and Google App are all about removing the search process from the user and focusing on answers.

All this relies on the machines having access to the data. "Hey Siri, where can I borrow a copy of Cloudstreet by Tim Winton?"

