Metadata Schmetadata: What’s it Good For?

Metadata is literally data about data. That might sound dry, but it has become a little more interesting over the years, and a little more important to get right. This post covers the basics of what you need to know as an author or small publisher.

[Image: a library index card and an .xml file, metadata old and new]

Historically, you could find metadata hiding in a filing cabinet at the back of the library. The index card (aka metadata) told librarians and booksellers what a book was about (and even how heavy it was); today metadata also informs consumers and search engines. Metadata has come a long way since the 90s: self-publishing authors will likely have encountered it when Amazon prompted them to choose a subject area for their book.

Is it important?

Yes! With the digitisation of books and the rise of digital-only editions, metadata’s importance has surged: as far as a reader is concerned, an ebook without good metadata doesn’t exist.

Can metadata be used to market a book?

Digital Publishing 101 claims that metadata isn’t just a formula; it’s a tool publishers can—and should—adapt more creatively in order to experiment with different approaches to online bookselling. They list two types of metadata:

  • Core metadata includes essential information such as the title, price, author, category (classification), and so on.
  • Enhanced metadata is marketing-related. It helps to sell the book but it’s not essential to getting it listed on bookseller sites or in library catalogs. Examples include blurbs, author bios, quotes from reviews and sample chapters.
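
As a rough illustration of the core/enhanced split, here is a minimal Python sketch. The field names (`title`, `author`, `price`, `category`, `blurb` and so on) are hypothetical keys chosen to mirror the two bullets above, not names taken from any bookseller's system, and the sample record is invented.

```python
# Hypothetical field names mirroring the two bullets above.
CORE_FIELDS = ("title", "author", "price", "category")
ENHANCED_FIELDS = ("blurb", "author_bio", "review_quotes", "sample_chapter")


def split_metadata(record):
    """Split a record into core and enhanced parts and list missing core fields."""
    core = {key: value for key, value in record.items() if key in CORE_FIELDS}
    enhanced = {key: value for key, value in record.items() if key in ENHANCED_FIELDS}
    # A core field counts as missing if it is absent or empty.
    missing = [key for key in CORE_FIELDS if not record.get(key)]
    return core, enhanced, missing


# Invented sample record: no price has been set yet.
record = {
    "title": "An Example Title",
    "author": "A. N. Author",
    "category": "Media Studies",
    "blurb": "A short description for the listing page.",
}

core, enhanced, missing = split_metadata(record)
print(missing)  # ['price']
```

A check like this only flags gaps in the essential fields; what belongs in the enhanced set is a marketing decision rather than a technical one.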

For our Theory on Demand series, we use a mix of core and enhanced metadata in our ePubs. This makes it easier for readers to find an author or area of interest in our books on Lulu or any other platform we upload to.


In practical terms, there is no universally accepted standard for consumer metadata for out-of-the-container content yet. Different audiences need different formats:

Librarians rely on several metadata standards: the Dublin Core, Library of Congress, METS, MARC, and (to some degree) ONIX. All of these standards help librarians describe, locate, purchase, and recommend books (and ebooks).

The Theory on Demand metadata is aimed at readers and booksellers, so we have been using a simpler format: the .xml file (EPUB 3 uses the Dublin Core Metadata Element Set).

Where is the metadata found in an ePub?

All this information is organized in the package document, the .opf file, an XML file that is one of the fundamental components of an EPUB. (The extension .opf stands for Open Packaging Format, which was the precursor to the new Publications specification.)

If you unzip your ePub file, you can see it here (with the title highlighted on line 5):

[Image: the unzipped ePub, with the metadata found in the .opf file]
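
If you would rather not unzip the file by hand, the same lookup can be sketched in Python. An EPUB is a zip archive whose `META-INF/container.xml` names the package document. The helper `build_sample_epub` and the path `OEBPS/content.opf` are assumptions made up for this illustration; the in-memory archive is a stripped-down stand-in, not a valid EPUB.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Illustrative container.xml; the .opf path below is an assumption.
SAMPLE_CONTAINER = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    "<rootfiles>"
    '<rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/>'
    "</rootfiles>"
    "</container>"
)


def build_sample_epub():
    """Build a tiny in-memory stand-in for an EPUB (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", SAMPLE_CONTAINER)
        archive.writestr("OEBPS/content.opf", "<package/>")
    buffer.seek(0)
    return buffer


def find_opf_path(epub_file):
    """Return the path of the package document (.opf) inside an EPUB."""
    with zipfile.ZipFile(epub_file) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml does not name a package document")
    return rootfile.get("full-path")


print(find_opf_path(build_sample_epub()))  # OEBPS/content.opf
```

To try it on a real book, pass the path of your own .epub file to `find_opf_path` instead of the sample archive.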


What’s in the metadata?

When publishing, it is crucial to have an International Standard Book Number (ISBN), which will certainly go into your metadata.

Note: the ISBN for a digital publication differs depending on whether you are creating an ePub3 or an ePub2, so be sure you know the final format before you apply for one.

What else should you put in there, if metadata is a crucial ingredient in reaching your readers and having booksellers accept your book? This can become a complex decision. For Theory on Demand we included some key areas:

dc:identifier – which lists the ISBN of the book

dc:title id – the title of the book (include the subtitle here too; it should be the same on the cover, on the title page and in the metadata).

dc:publisher – the publisher (quite self-evident)

dc:date id – we include the month and year, i.e. ‘2015-08’

dc:language – English, ‘en-US’

dc:creator id – list the name(s) of the author(s) here

dc:rights – for the publisher it’s very important to distribute these books under a Creative Commons license

dc:subject – keywords; we included around 50, as this is an anthology with 18 authors, to help readers find content relevant to them.

dc:description – we included the blurb of the book here
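
To make the list above more concrete, here is a minimal Python sketch that reads these Dublin Core elements out of a package document. The `SAMPLE_OPF` string is an invented, trimmed-down example: every value in it (title, publisher, ISBN placeholder, keywords) is a placeholder, not our actual Theory on Demand metadata, and `read_dublin_core` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample package document; all values are placeholders.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:isbn:PLACEHOLDER</dc:identifier>
    <dc:title id="title">An Example Title: With a Subtitle</dc:title>
    <dc:publisher>Example Publisher</dc:publisher>
    <dc:date id="date">2015-08</dc:date>
    <dc:language>en-US</dc:language>
    <dc:creator id="creator">A. N. Author</dc:creator>
    <dc:rights>Placeholder for a Creative Commons license statement</dc:rights>
    <dc:subject>media theory</dc:subject>
    <dc:subject>publishing</dc:subject>
    <dc:description>Placeholder blurb of the book.</dc:description>
  </metadata>
</package>"""


def read_dublin_core(opf_xml):
    """Collect the dc:* elements of a package document into a dict of lists."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", NS)
    if metadata is None:
        raise ValueError("package document has no <metadata> element")
    dc_prefix = "{" + NS["dc"] + "}"
    fields = {}
    for element in metadata:
        # Keep only Dublin Core elements; repeated ones (e.g. subject) accumulate.
        if element.tag.startswith(dc_prefix):
            name = element.tag[len(dc_prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


fields = read_dublin_core(SAMPLE_OPF)
print(fields["title"])    # ['An Example Title: With a Subtitle']
print(fields["subject"])  # ['media theory', 'publishing']
```

Each field maps to a list because some elements, such as dc:subject, are expected to repeat once per keyword.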

If you’re wondering what else to include/exclude, Bill Kasdorf provides much more detail here.

This is a simple overview of how we dealt with metadata. Below you can find further resources to help you on your way; I highly recommend reading more from metadata expert Laura Dawson.




Digital Publishing 101,


Bill Kasdorf,

Read more:

Zen and the Art of Metadata Maintenance by John W. Warren