One of the first talks I went to at BookExpo America (BEA) this year was called “Understanding Metadata in the Digital Age.” Bill Newlin, the publisher of Avalon, an imprint of Perseus, and Fran Toolan, the chief ignitor of Firebrand, presented. For this blog post, I will be focusing on external metadata (the kind that goes into feeds and onto websites, for example) for ebooks. Metadata for print books is also important, but this blog is more about digital publishing.
So first, what is metadata? Well, the simple answer is that metadata is data about data. For books, this can mean author name, publisher, title, and date published, among other things. But I think the presenters had a better answer.
“[Metadata] is communicating the best information you know about the product.”
One great example of metadata is the description of a book (also known as flap copy). For publishers, this means they have the responsibility to make sure all the information about their books is right.
There are two types of metadata: core metadata and enhanced metadata. Core metadata includes the title, subtitles, and author/contributors, as well as format, price, page count (for print books), illustrations, series, edition, publication date, photos, ISBN, BISAC/subject codes, and territorial rights. Enhanced metadata is what helps consumers discover a book. It is crucial, and it includes either a long description (2000 characters) or short description (250 characters), an author/contributor bio, reviews, and reader feedback. Other enhanced metadata could be Q&A, publicity, and supplementary material, such as author book tours, interviews, and even TV footage.
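To make the core/enhanced split concrete, here is a minimal sketch of how a record might be modeled. The class name, field names, and the idea of treating the 2000 and 250 character figures as upper limits are my own assumptions, not something the presenters specified.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: the character counts from the talk are treated as maximums.
LONG_DESCRIPTION_LIMIT = 2000
SHORT_DESCRIPTION_LIMIT = 250


@dataclass
class BookMetadata:  # hypothetical structure, for illustration only
    # Core metadata
    title: str
    contributors: List[str]
    isbn: str
    book_format: str
    price: float
    publication_date: str
    bisac_codes: List[str] = field(default_factory=list)
    # Enhanced metadata
    long_description: str = ""
    short_description: str = ""
    contributor_bio: str = ""
    reviews: List[str] = field(default_factory=list)


def description_problems(book: BookMetadata) -> List[str]:
    """Return a list of human-readable problems with the descriptions."""
    problems = []
    if not book.long_description and not book.short_description:
        problems.append("no description provided")
    if len(book.long_description) > LONG_DESCRIPTION_LIMIT:
        problems.append("long description exceeds 2000 characters")
    if len(book.short_description) > SHORT_DESCRIPTION_LIMIT:
        problems.append("short description exceeds 250 characters")
    return problems
```

A publisher could run a check like this over a whole catalog before sending a feed, so that missing or overlong descriptions are caught early.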
What’s interesting is how metadata can affect the discovery process. For example, say a book is called “The Big Book of Metadata.” In the metadata, the title may be incorrectly indexed as “Big Book of Metadata, The.” Readers searching for “The Big Book…” may then be unable to find it, because the search engine only has “Big Book of…” to match against. The same problem affects the way author names are entered into the metadata.
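As a rough illustration of the title problem, the sketch below turns an inverted title back into its display form and builds one search key from either version. The function names and the short list of articles are assumptions of mine; real retailer search engines will behave differently.

```python
# Assumption: only these English articles are moved to the end of a title.
ARTICLES = ("the", "a", "an")


def display_title(indexed_title: str) -> str:
    """Turn 'Big Book of Metadata, The' back into 'The Big Book of Metadata'."""
    head, separator, tail = indexed_title.rpartition(", ")
    if separator and tail.lower() in ARTICLES:
        return f"{tail} {head}"
    return indexed_title


def search_key(title: str) -> str:
    """Lowercase, whitespace-normalized key that is the same for both forms."""
    return " ".join(display_title(title).lower().split())


print(display_title("Big Book of Metadata, The"))
print(search_key("The Big Book of Metadata") == search_key("Big Book of Metadata, The"))
```

The point is simply that both forms of the title should lead to the same book, whichever one ends up in the feed.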
“Digital identifiers are crucial for consumers,” the presenters said. In fact, many retailers (Amazon, B&N, etc.) often assign additional identifiers. Amazon, for example, will add an ASIN through its KDP program, which serves as its substitute for an ISBN.
So why is enhanced metadata in particular so important? Well, according to Bill and Fran, it is critical for online discovery (for both print and ebooks). Most titles are discovered via search query, which means that if a book does not appear in the top 10-20 results, there’s almost no chance of it being discovered. After all, how many people look past the second page of a Google search?
For external metadata, there are many different systems of classification. Each company has its own system, including Amazon, Apple, Barnes & Noble, Bowker, Google, and Ingram. Oftentimes these systems are not compatible with each other, because they use different fields.
This is how ONIX, an XML-based standard for metadata feeds, came into existence. ONIX is constantly changing, and it allows publishers to easily add and update metadata in one place, while it takes care of customization for each vendor’s needs. Once the metadata is entered, it typically takes 1-2 weeks to become visible on a catalog or vendor’s site.
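To give a feel for what an XML feed record looks like, here is a heavily simplified sketch that builds one product entry with Python's standard library. The element names and code values follow my understanding of ONIX 3.0 and should be checked against the official specification; this is a fragment, not a complete or validated ONIX message, and the ISBN and record reference are placeholders.

```python
import xml.etree.ElementTree as ET


def build_product(record_reference, isbn13, title_prefix, title_without_prefix):
    """Build a simplified, ONIX-style <Product> element (not schema-validated)."""
    product = ET.Element("Product")
    ET.SubElement(product, "RecordReference").text = record_reference

    identifier = ET.SubElement(product, "ProductIdentifier")
    # Assumption: "15" is the code for ISBN-13 in the ONIX code lists.
    ET.SubElement(identifier, "ProductIDType").text = "15"
    ET.SubElement(identifier, "IDValue").text = isbn13

    detail = ET.SubElement(product, "DescriptiveDetail")
    title_detail = ET.SubElement(detail, "TitleDetail")
    # Assumption: "01" marks the main (distinctive) title.
    ET.SubElement(title_detail, "TitleType").text = "01"
    title_element = ET.SubElement(title_detail, "TitleElement")
    # Assumption: "01" means the title applies to the product itself.
    ET.SubElement(title_element, "TitleElementLevel").text = "01"
    # Storing the article separately lets each vendor decide how to sort.
    ET.SubElement(title_element, "TitlePrefix").text = title_prefix
    ET.SubElement(title_element, "TitleWithoutPrefix").text = title_without_prefix
    return product


product = build_product("example-0001", "9780000000002", "The", "Big Book of Metadata")
print(ET.tostring(product, encoding="unicode"))
```

Notice that the leading article is stored apart from the rest of the title, which is one way a feed can avoid the “Big Book of Metadata, The” problem described earlier.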
But it’s still up to the publisher to enter the correct metadata for each title. If there is incorrect metadata, there is a chance the title will drop out of one or more databases, which means readers will not be able to find the book. For example, an author page on Amazon will automatically capture all of an author’s titles, IF the author’s name is spelled correctly in the metadata for each of his/her titles. But if there is a misspelling, then the book will be missing from the author page and a reader will not be able to find it (and buy it).
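Here is a small sketch of why a single misspelling matters. It groups titles by the exact author string, the way I imagine an automatic author page might, and then flags names that look suspiciously similar. The sample records, the function names, and the 0.85 similarity cutoff are all made up for illustration; this is not how Amazon actually builds its author pages.

```python
from collections import defaultdict
from difflib import get_close_matches


def titles_by_author(records):
    """Group titles under the exact author string found in each record."""
    pages = defaultdict(list)
    for author, title in records:
        pages[author].append(title)
    return dict(pages)


def suspicious_names(pages, cutoff=0.85):
    """Return pairs of author names that are nearly, but not exactly, equal."""
    names = sorted(pages)
    flagged = []
    for index, name in enumerate(names):
        later_names = names[index + 1:]
        for other in get_close_matches(name, later_names, n=3, cutoff=cutoff):
            flagged.append((name, other))
    return flagged


# Hypothetical catalog: the third record misspells the author's name.
records = [
    ("Jane Doe", "First Book"),
    ("Jane Doe", "Second Book"),
    ("Jane Deo", "Third Book"),
    ("John Smith", "Another Book"),
]
pages = titles_by_author(records)
print(pages["Jane Doe"])        # the misspelled title is missing from this page
print(suspicious_names(pages))  # near-duplicate names worth a manual check
```

A check like this will not fix anything by itself, but it gives a publisher a short list of names to review by hand.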
Publishers should also take the time to experiment and see which inputs yield the best returns. There’s no risk, since there is no limit to the number of times the metadata fields may be changed. (The presenters did warn, though, that constantly changing the description in an ebook can occasionally cause problems.)
How does metadata work?
Well, it works best if considered at the beginning of a book’s cycle, and if everyone involved with the book helps out. According to our presenters, there are four phases: collection, production, distribution, and follow-through. Collection involves editors, production, marketers, and business management (contracts, publication plan, price, etc.). However, editors know the most about a work, so they should help out the most with metadata. Production is making the book, and distribution is packaging and delivering the book. Lastly, in the follow-through, publishers should spot check websites to make sure the book appears in the right places.
Fran and Bill suggest picking 2-3 keywords with high monthly searches and low competition, avoiding general terms in favor of ones that are targeted, relevant, and specific. Once you pick keywords, use them in the title and subtitle, if possible, as well as the description, keynote, and author bio.
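The keyword advice can be sketched as a simple filter. Everything here is hypothetical: the candidate keywords, the search volumes, the competition scores (0 to 1), and the thresholds are invented, and in practice the numbers would come from whatever keyword research tool you use.

```python
def pick_keywords(candidates, min_searches=1000, max_competition=0.3, limit=3):
    """Keep keywords with enough searches and low competition, best first."""
    eligible = [
        candidate for candidate in candidates
        if candidate["monthly_searches"] >= min_searches
        and candidate["competition"] <= max_competition
    ]
    # Highest search volume first among the keywords that passed both tests.
    eligible.sort(key=lambda candidate: candidate["monthly_searches"], reverse=True)
    return [candidate["keyword"] for candidate in eligible[:limit]]


# Invented figures, for illustration only.
candidates = [
    {"keyword": "books", "monthly_searches": 500000, "competition": 0.95},
    {"keyword": "ebook metadata", "monthly_searches": 2400, "competition": 0.20},
    {"keyword": "onix feed", "monthly_searches": 1300, "competition": 0.10},
    {"keyword": "bisac codes", "monthly_searches": 1900, "competition": 0.25},
    {"keyword": "metadata for cookbooks", "monthly_searches": 90, "competition": 0.05},
]
print(pick_keywords(candidates))
```

In this made-up data, the very general term “books” is dropped for being too competitive, and the very narrow phrase is dropped for having too few searches, which is the trade-off the presenters described.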
Another important marketing metadata tool is BISAC codes. These subject categories are used for retail and library classification as well as online discoverability. Often, BISAC codes are used in combination with proprietary classification systems, such as those of B&N, Amazon, and Google. BISAC has 52 main subject areas and 3600+ categories. Each subject area has a “general” subcategory, which should be avoided at all costs. Additionally, you should aim to assign at least 3 subcategory codes per title.
“It’s almost always better to assign more targeted BISAC codes,” the presenters said. Also, the primary code is the most important, since some databases, such as BookScan, only use the primary code.
One piece of advice the presenters gave was to NEVER apply an adult BISAC code and a juvenile code to the same book. But other than that, it’s good to experiment with different codes, especially with fast-selling titles.
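The BISAC advice above can be turned into a quick checklist. This sketch assumes that “general” codes end in 000000 and that juvenile codes start with JUV or JNF, which matches my understanding of the code format but should be confirmed against the current BISAC list. The function name is hypothetical, and the sample codes in the usage lines are placeholders that follow the pattern rather than verified subject headings.

```python
# Assumption: juvenile fiction and nonfiction codes use these prefixes.
JUVENILE_PREFIXES = ("JUV", "JNF")


def bisac_warnings(codes):
    """Check a title's BISAC codes (primary code first) against the advice."""
    warnings = []
    if len(codes) < 3:
        warnings.append("fewer than 3 codes assigned")
    # Assumption: a code ending in 000000 is a subject area's General heading.
    if any(code.endswith("000000") for code in codes):
        warnings.append("uses a General subcategory")
    is_juvenile = [code.startswith(JUVENILE_PREFIXES) for code in codes]
    if any(is_juvenile) and not all(is_juvenile):
        warnings.append("mixes adult and juvenile codes")
    return warnings


print(bisac_warnings(["BUS000000", "JUV000000"]))
print(bisac_warnings(["LAN025000", "LAN027000", "COM062000"]))
```

An empty list means the codes pass all three checks, though it says nothing about whether they are the best codes for the book.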
For ebooks, it’s also important to make sure the publication date on file works to the publisher’s advantage. It can’t be too early because that will cause confusion, and it can’t be too late because that will mean loss of sales. It’s also good to experiment with the price of ebooks to see what sells more.
With ebooks in particular, there are fewer metadata management partners, and they have tougher requirements. There is no standard approach (yet), but territory rights must be explicitly identified. Also, there is no such thing as having too much metadata, so long as all of it is correct.
Although not covered in this post, metadata in an EPUB file is also very important. I will revisit this topic once I’ve interviewed enough people who know about metadata in EPUB3. Stay tuned!