Linked Data FAQ

May 26th, 2016

Anna Christiansen

Associate Marketing Writer

Get the Facts from our in-depth FAQ.

BLUEcloud Visibility adoption has gained momentum lately, with more and more libraries moving to make their records visible on the web. SirsiDynix is excited for all the BLUEcloud Visibility libraries that are expanding their presence on the web. We also understand that as more libraries move toward BLUEcloud Visibility, a great many more questions come up. Not to worry, we’ve got you covered.

To help your library better understand BLUEcloud Visibility, we’ve compiled a list of our frequently asked questions. These are the questions we often hear from libraries at demos and webinars. Some are straightforward questions about how the web visibility solution works; others are hard-hitting questions about the future of libraries and data. We’ve gathered them all here for you, ordered from simplest to most complex, with similar ideas grouped together.

Question: Who are some early BLUEcloud Visibility adopters?

Answer: Multiple SirsiDynix customers participated in Zepheira’s early adopter program, including Douglas County, Anythink, Edmonton Public Library, and Calgary Public Library.

Question: How often are the web records updated? Is there a lag time between bib modifications and updates of data to the discovery experience? How will the continuous update process impact crawling and indexing by search engines? How do we reconcile ensuring currency in publishing our bib data and the crawl rate by search engines?

Answer: At present, BLUEcloud Visibility sends twice-monthly updates of your bib records and institution data. With this schedule, we have seen new records, as well as changes to existing records, appear in search results in around 10 days. Keep in mind that these numbers will vary according to the richness of the published data and how quickly that data is indexed by each search engine.

Question: Can event information be transformed to BIBFRAME? Does it need to be in MARC format first?

Answer: MARC probably isn’t the way to go. After an event is cataloged in MARC, that record will need to be harvested, geo-located, transformed into Linked Data, and then crawled and indexed by Google and other search engines. By that time the event might already be over. Google can read structured event data directly, and that can start with just a simple HTML page.
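To make the "simple HTML page" idea concrete, here is a minimal sketch of how an event might be described with the schema.org Event vocabulary and embedded in a page as JSON-LD. The function names, the event, and the venue are all hypothetical, and this is not how BLUEcloud Visibility itself publishes data; it only illustrates the kind of structure a search engine can read without a MARC record.

```python
import json
from html import escape


def event_jsonld(name, start_date, venue_name, street_address):
    """Describe an event with schema.org's Event and Place types."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 date-time string
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": street_address,
        },
    }


def render_event_page(event):
    """Embed the event as JSON-LD inside a minimal HTML page."""
    payload = json.dumps(event, indent=2)
    # Escape "</" so a stray "</script>" in the data cannot end the tag early.
    payload = payload.replace("</", "<\\/")
    title = escape(event["name"])
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title>\n"
        '<script type="application/ld+json">\n'
        f"{payload}\n"
        "</script>\n"
        f"</head><body><h1>{title}</h1></body></html>\n"
    )


# Hypothetical event, for illustration only.
story_time = event_jsonld(
    name="Saturday Story Time",
    start_date="2016-06-04T10:00:00-06:00",
    venue_name="Example Public Library",
    street_address="123 Main Street, Anytown",
)
page = render_event_page(story_time)
print(page)
```

Because the description lives in the page itself, it can be published the moment the event is scheduled, with no cataloging or transformation step in between.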

Question: If BIBFRAME was designed to replace the MARC standards, are MARC and RDA becoming outmoded?

Answer: BIBFRAME was designed to replace the MARC standards, and to use linked data principles to make bibliographic data more useful both within and outside the library community. With any evolution of a standard, the responsible approach is to minimize the cost (social, technical, etc.) of the transition. The benefits we see from using BIBFRAME, specifically in the context of the library, bridge the gap between existing cataloging practices and the searchable Web.

Question: What resources are available to harvest data from MARC records?

Answer: The goal of BLUEcloud Visibility is to transform MARC records into a structure that the searchable web understands. This structure includes meaningful vocabulary, structured connected data, and geographic associations for both physical and virtual items. In the future we want every item, virtual and physical, to be harvested for BLUEcloud Visibility in order to teach the Web about library data and how to connect users back to their library for fulfillment.

Question: What technology does SirsiDynix use to link from Enterprise to external sources like EBSCO Discovery Service (EDS)?

Answer: SirsiDynix uses EBSCO’s APIs to communicate with the EDS system in real time. From Enterprise, we search their index for the same terms that the Enterprise user enters, and EDS gives us the results back. We also get their facets back and can narrow the EDS results by those facets. In addition, we use their web services to bring back Research Starter results and Publication Placards according to the search term entered. So basically the magic happens using shared APIs between SirsiDynix and EBSCO.

Question: How does BLUEcloud Visibility work with discovery layers?

Answer: BLUEcloud Visibility takes the library’s catalog records, transforms them into BIBFRAME records, and connects them to the corresponding records within the PAC, with the explicit goal of making these records discoverable through Google and other search engines. Please note that while third-party discovery layers are available for use with BLUEcloud Visibility, they may require additional testing and configuration to perform as a supported option.

Question: How is BIBFRAME different from schema.org in how it handles special collection records?

Answer: In essence, BIBFRAME is a new data model for bibliographic description that would replace the traditional MARC record. It was developed to better expose library records on the web using the Linked Data model. It’s an ontology that libraries can use to capture the specific, detailed description they apply to all things libraries, especially materials in Special Collections. Schema.org, on the other hand, was built for a universal purpose and might not be able to describe library materials and services the way BIBFRAME does. Schema.org does not take into account the FRBR conceptual model that BIBFRAME takes full advantage of.

Question: How will the relevance ranking in Google place the library catalogues?

Answer: Google considers hundreds of factors when determining relevancy, many of which are not only private, but are closely held secrets. It is generally accepted that much of this ranking is based on the richness and perceived value of the data.

In the Linked Data world, the more links or connections a library record has, the more valuable it will become in the eyes of search engines.

Question: What is the incentive for search engines to pull semantic data from libraries? Eventually, might libraries need to pony up for the search results page real estate?

Answer: Libraries have been describing records for centuries, and search engines are thrilled to have data curated by professionals as part of the open web. This well-developed data, combined with the vast market segmentation among library users, is a commercial search engine’s dream. As the floodgates open for library data, search engines would have to create some sort of special area or “real estate” just for library materials. It will be up to the search engines to aggregate all library data and present it in a way that’s not overwhelming to the user. One way is geo-location, which narrows library results to a certain radius around the user’s location.
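As a rough illustration of what narrowing results by location could look like, the sketch below filters a list of holding libraries by great-circle distance from the user, using the haversine formula. The library names, the 50 km cutoff, and the function names are hypothetical, and nothing here reflects how any search engine actually ranks or filters library data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def libraries_within(user_lat, user_lon, libraries, radius_km):
    """Return (name, distance) pairs inside the radius, nearest first."""
    hits = []
    for name, lat, lon in libraries:
        distance = haversine_km(user_lat, user_lon, lat, lon)
        if distance <= radius_km:
            hits.append((name, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical holdings; coordinates are approximate city centres.
holding_libraries = [
    ("Library near Denver", 39.7392, -104.9903),
    ("Library near Boulder", 40.0150, -105.2705),
    ("Library near Calgary", 51.0447, -114.0719),
]

nearby = libraries_within(39.7392, -104.9903, holding_libraries, radius_km=50)
for name, distance in nearby:
    print(f"{name}: {distance:.0f} km")
```

A user in Denver would see only the two nearby libraries; the Calgary copy of the same title is dropped because it is far outside the cutoff.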

Question: If a library uses Portfolio to manage digital assets, like scanned historical documents and images, can BIBFRAME “understand” the content of the PDF or only the metadata about the PDF?

Answer: The metadata is all that is currently available to act on. Any information extracted from the PDF and made available as structured data could be acted on. BLUEcloud Visibility currently transforms only MARC 21 records; however, we plan on adding Portfolio digital assets in the future.

Question: Many librarians catalog in OCLC or download records from OCLC. Will OCLC be bypassed?

Answer: The future of OCLC in a BIBFRAME context is uncertain. SirsiDynix is working on solutions for convenient access to records/metadata in a linked data format. In the near term, libraries can continue to grab records from OCLC and store them in their catalog, and BLUEcloud Visibility will transform those records. In the future, SirsiDynix’s CloudSource will be a source of Linked Data records and, as libraries catalog in BIBFRAME, data will be stored there for libraries to link to, retrieve, and make portable if needed.

Question: Are FAST headings in OCLC records in support of BIBFRAME? Or are they for something else?

Answer: In BLUEcloud Visibility, FAST headings in MARC records are transformed into Web Identifiers and then become linked authority control points.

FAST wasn’t designed specifically in support of BIBFRAME, but rather BIBFRAME was designed to take advantage of the libraries’ (OCLC, LC, etc.) capacity to surface authority information about Persons, Organizations, Concepts etc., as Web Identifiers.
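To show the general idea of turning a FAST heading into a Web Identifier, here is a minimal sketch that reads the control number from subfield $0 of a subject field and builds a URI from it. It assumes FAST URIs follow the pattern http://id.worldcat.org/fast/ plus the number without leading zeros, and it models fields as simple dictionaries; the helper names and sample data are hypothetical, and this is not BLUEcloud Visibility’s actual transformation.

```python
import re

FAST_BASE_URI = "http://id.worldcat.org/fast/"  # assumed URI pattern

# Matches control numbers such as "(OCoLC)fst01204155"
FAST_CONTROL_NUMBER = re.compile(r"fst(\d+)")


def fast_uri(subfield_0):
    """Turn a FAST control number from $0 into a Web identifier, or None."""
    match = FAST_CONTROL_NUMBER.search(subfield_0)
    if match is None:
        return None
    # Assumption: the URI carries the number without its leading zeros.
    number = match.group(1).lstrip("0")
    return FAST_BASE_URI + number if number else None


def fast_identifiers(subject_fields):
    """Collect URIs from subject fields whose $2 source code is 'fast'."""
    uris = []
    for field in subject_fields:
        if field.get("2") != "fast":
            continue
        uri = fast_uri(field.get("0", ""))
        if uri:
            uris.append(uri)
    return uris


# Hypothetical 650 fields, simplified to subfield-code -> value mappings.
# The control number is illustrative only.
fields = [
    {"a": "Example heading", "2": "fast", "0": "(OCoLC)fst01204155"},
    {"a": "Another heading"},  # no $2, so not treated as FAST
]
print(fast_identifiers(fields))
```

Once a heading is expressed as a URI rather than a text string, every record that uses it points at the same identifier, which is what lets it act as a linked authority control point.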

Question: Is a traditional ‘access point’ in a MARC record (i.e. subject heading) reinterpreted as a singular ‘resource’ in the BIBFRAME environment?

Answer: Traditional MARC ‘access points’ are a way of constructing common representations (identifiers) for the kinds of things MARC describes. In a Web environment the equivalent is a URI (Uniform Resource Identifier), and URIs are used to identify resources in BIBFRAME.

Question: With regard to the technical “plumbing” related to linked data, are there any libraries (as in code), APIs, and the like that you might recommend to someone wanting to start tinkering?

Answer: One of the important aspects Zepheira brought to the design of the BIBFRAME standard (and one that we’ve used in the past at W3C) was making sure there were at least two independent, open source code bases that could implement each aspect of the design. Open source transformation tools include GitHub for BIBFRAME, GitHub for MARC, and BIBFRAME.org.

Terry Reese has also begun experimenting with Linked Data and BIBFRAME in MarcEdit.

Question: As my library starts to research for a new library services platform, what big ideas should we consider? Is there anything significant about our legacy data that we should be mindful of? In regards to authority work, what about incorporating relationship designators into our practice? Is this significant as we move towards linked data?

Answer: Most definitely! There are also authority sources designated in $0 subfields that would be useful in linking to other linked data sources. Take a look at VIAF, LC NAF, Wikipedia, ORCID iDs, etc. RDA conversion would also be beneficial in transforming your records to this format. The more valuable metadata there is, the better the linking to other sources.

And finally…

Question: How are the costs for BLUEcloud Visibility figured?

Answer: Talk to your SirsiDynix sales rep to get pricing.

Talk to your LRM if you have further questions. We are just as excited as you to see your library’s catalog on the web!