Rights Metadata Made Simple

Maureen Whalen

Since writing this chapter for the 2008 edition of Introduction to Metadata, I have found that people are now more aware of the importance of rights metadata and the need to collect and share it. More institutions are implementing digital asset management systems and seeking ways to expand the distribution of their collections through websites and social media. Underlying all such efforts are intellectual property laws governing copyright, privacy, publicity, and trademarks. Keeping track of what rights an institution has, who the rights holders are, and how to contact them is essential for institutions that want to participate actively and quickly in online environments. This chapter includes some tips and insights learned through experience over the last several years, so that mistakes need not be repeated and improvements can be considered for incorporation into ongoing rights metadata efforts. In addition, more and more institutions are now including rights metadata along with other information about works in their collections, and efforts to improve standardization of terms and definitions continue. The Digital Public Library of America1 is one of several organizations seeking to help institutions find simple and flexible solutions to rights metadata challenges. It is our hope that in the next few years, rights metadata will become an expected, routine component of any metadata record about a work and that its existence will improve public online access to digital surrogates of a wide variety of cultural materials.

Introduction

There are three common reactions when the issue of rights metadata arises:

  1. “It’s too complicated and overwhelming.”
  2. “We don’t have the staff or the money.”
  3. “It’s not the library’s [or archive’s, or museum’s] job; it’s up to users to figure out rights information if they want to publish something from our collections.”

Here are some reasoned responses:

  1. Yes, rights metadata can be complicated and overwhelming, but so is knitting a cardigan sweater until one simplifies the project by mastering a few basic techniques and following the instructions step by step.
  2. Your institution is probably already spending staff time and money on rights research. Capturing rights metadata in a shared information system as a routine, programmatic activity with structured data rules and values and an established workflow should not cost any more than ad hoc rights research—and it will provide longer-lasting benefits.
  3. In a world where “if it’s not digital, it doesn’t exist,” libraries, archives, and museums have new roles with respect to their users as well as the creators and authors of the works in their collections. Moreover, cultural heritage institutions need rights information for their own uses of the works in their collections. Rights metadata is not just about compliance with intellectual property laws. Rights metadata is about being responsible stewards of the works in our collections and their digital surrogates—and in a digital world, it is crucial to the institution’s broader mission of collection, preservation, and access.

Rights Metadata Dictionary

A major breakthrough in rights metadata efforts for the Getty Research Institute was the creation and implementation of a rights metadata dictionary for Special Collections.2 Two important improvements came from this project: (1) clarification of which work was being described in the record, and (2) the addition of terms to the drop-down menus that allow users to better understand some of the ambiguities or unknowns about the rights information provided.

Core Work

For various reasons, staff in Special Collections tended to be confused about which work was being described in the metadata record. This confusion was resolved with a clear definition that the work being described is the work in the library’s collections (neither the work depicted in a visual work nor the digital surrogate thereof), which we describe with the term core work. The core work may be a digital work.

People and Works Depicted

Frequently, the core work includes images of people or copyright-protected works. Metadata records in a digital asset management system or other information system may not provide fields for these kinds of “layers” of rights information. To ensure that rights information is collected about people or works depicted in a core work, the rights metadata dictionary instructs users to identify people and works depicted and to identify them as “potential claimants.” Detailed information for these potential claimants may be included in the rights metadata records, if known.

Unknown, Additional Research Required, and Not Researched

Not all the rights information will be available when people are creating the metadata records. Some granularity may be desired to give those using the records over time a better sense of the status of rights research. In addition, due-diligence research about rights holders for designated orphan works will be necessary to document what was done and when. For that reason, precise definitions for “unknown,” “additional research required,” and “not researched” should be used; depending on the collection, more nuanced definitions may be added to the rights metadata dictionary.

Priority Information

Usable, shareable, repurposable rights metadata can be obtained by capturing the following core information:3

  1. The name of the creator of the work or image, including the nationality, date of birth, and, if applicable, death date. Ideally, this information should be copied automatically from an authority file. (Generally, the “work” is the original work in the institution’s collection and NOT a digital surrogate. If the institution wants to create a rights metadata record for the digital surrogate, the rights metadata approach described herein would be valid, provided the digital “work” is described and differentiated from the original work.)
  2. The year the work was created. The year of creation may not be the same as the year of publication. Where two different dates exist, they should be identified separately. If the publication date is known, it should be recorded in the “publication status” field.
  3. Copyright status (one of these five options can be selected from a controlled pick list by staff tasked with recording rights metadata):
    • Copyright owned by the institution means that the copyright is assumed valid and is owned by the institution that holds the work.
    • Copyright owned by a third party assumes that the copyright is valid and is owned by someone or some entity other than the institution that holds the work; if known, the name of the third party should be captured in a database field or metadata element designated for that purpose.4
    • Public domain. If the work is determined to be in the public domain, it is helpful to identify the year in which the work entered (or will enter) the public domain, if known. Some institutions, depending on the nature of their collections, may want more information about why the work is in the public domain—for example, if it is a work of the federal government, the copyright term has expired, or the work was published without a copyright notice before 1978 and did not qualify for copyright restoration.
    • Orphan work is a work that may be protected by copyright law but for which the copyright owner or claimant cannot be identified or located.5 Given the two-prong definition, it is recommended that the reason why a work is characterized as an orphan work should be included in the rights metadata. Therefore, two terms should be used: Orphan work—rights holder cannot be identified and Orphan work—rights holder identified but cannot be located.
    • Not researched means that no research into the copyright status of the work has been conducted yet.
  4. Publication status (one of these four options can be selected from a controlled pick list by staff tasked with recording rights metadata):
    • Published. Include date, if known. Publication is defined in the Copyright Act as “the distribution of copies … of a work to the public by sale or other transfer of ownership, or by rental, lease, or lending.” Note that the offer to distribute copies—including the original work, even if there is only one copy of it—constitutes publication.6 Because of different treatment of foreign works under copyright law, some institutions may want to clarify where the work was published—in the United States only, in a foreign country only, or both.
    • Unpublished. Some materials such as manuscripts and correspondence may be easily determined to be unpublished. Other works, however, such as a speech or painting that is known to the public, can still be considered “unpublished” under the Copyright Act definition.
    • Unknown. It is sometimes difficult to determine whether or not a work has been published, particularly for photographs of which there may be multiple prints or for manuscripts from which a work was later published.
    • Not researched. No research into the publication status of the work has been conducted yet.
  5. Date that rights research was conducted (if there are multiple dates on which rights research was conducted, best practice would be to include all of those dates, along with the initials of the researcher[s]).
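The five priority elements above can be sketched as a small structured record with controlled pick lists. This is only an illustration, not the copyrightMD schema or any institution's actual data model: the class names, field names, and the two consistency checks are hypothetical, and the “not researched” values follow the controlled lists shown in Table 1. The sample values come from the second example in Table 1.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class CopyrightStatus(Enum):
    """Controlled pick list for copyright status (labels follow the text)."""
    OWNED_BY_INSTITUTION = "Copyright owned by the institution"
    OWNED_BY_THIRD_PARTY = "Copyright owned by a third party"
    PUBLIC_DOMAIN = "Public domain"
    ORPHAN_NOT_IDENTIFIED = "Orphan work: rights holder cannot be identified"
    ORPHAN_NOT_LOCATED = "Orphan work: rights holder identified but cannot be located"
    NOT_RESEARCHED = "Not researched"


class PublicationStatus(Enum):
    """Controlled pick list for publication status."""
    PUBLISHED = "Published"
    UNPUBLISHED = "Unpublished"
    UNKNOWN = "Unknown"
    NOT_RESEARCHED = "Not researched"


@dataclass
class ResearchEvent:
    """One dated round of rights research, with the researcher's initials."""
    researched_on: date
    researcher_initials: str


@dataclass
class RightsRecord:
    """Hypothetical container for the priority rights metadata elements."""
    creator_name: str
    copyright_status: CopyrightStatus
    publication_status: PublicationStatus
    creator_nationality: Optional[str] = None
    creator_birth_year: Optional[int] = None
    creator_death_year: Optional[int] = None
    creation_year: Optional[int] = None
    publication_year: Optional[int] = None
    third_party_owner: Optional[str] = None
    research_log: List[ResearchEvent] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return consistency warnings; an empty list means none were found."""
        found = []
        if (self.publication_year is not None
                and self.publication_status is not PublicationStatus.PUBLISHED):
            found.append("publication year recorded but status is not Published")
        if (self.third_party_owner is not None
                and self.copyright_status is not CopyrightStatus.OWNED_BY_THIRD_PARTY):
            found.append("third-party owner recorded but status is not third party")
        return found


# Values taken from the second example in Table 1 (Julius Shulman photograph).
shulman = RightsRecord(
    creator_name="Shulman, Julius",
    copyright_status=CopyrightStatus.OWNED_BY_INSTITUTION,
    publication_status=PublicationStatus.NOT_RESEARCHED,
    creator_nationality="American",
    creator_birth_year=1910,
    creation_year=1967,
    research_log=[ResearchEvent(date(2007, 9, 13), "MTW")],
)
print(shulman.problems())
```

Keeping the research log as a repeating list, rather than a single date, reflects the recommendation that every round of research be recorded with its date and the researcher's initials.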

Gathering rights metadata and including it in an institutional information system7 will allow users with some basic understanding of copyright to make thoughtful judgments about how the law may affect use of the work in accordance with a legal exception.8 It may also help guide determinations about how easy or difficult it might be to obtain permission, if needed.

Here are some examples of how the above rights metadata elements can be applied in day-to-day decision making:9

  • Knowing the birth and death dates of the creator, or the year(s) in which the work was created and published, will allow for some quick calculations about the copyright term for the work. To do the analysis and arithmetic, follow Peter Hirtle’s excellent chart “Copyright Term and the Public Domain in the United States.”10 Note: There are slightly different rules for works of foreign (non-US) origin, including restoration of copyrights in works of foreign origin that may have been in the public domain for a period of time before restoration; that is why it is good practice to identify the nationality of the creator, if known.
  • Unpublished works tend to have longer copyright terms than published works; therefore, if the work is assumed to be unpublished, the term of copyright protection should be calculated in accordance with the formula for unpublished works.
  • While the Copyright Act specifically states that unpublished works qualify for fair use, courts tend to protect the creator’s right to decide about first publication, so the standard for fair use of unpublished works is usually higher than for published materials.11 If the rights metadata states that a work is unpublished, the user can assess how that status affects the fair use.
  • For works published in the United States between 1923 and 1963, renewal of the original copyright registration was required.12 Therefore, a work published in 1945 with the correct copyright notice and registration would require a renewal of the original copyright in 1973 (1945 + 28 = 1973) in order for that copyright to be valid today. One study indicates that 15 percent or fewer of the works in their original copyright terms between 1923 and 1963 were renewed.13 This means the majority of works initially protected by copyright during this period are now in the public domain. Of course, the more famous the work, the greater likelihood the original copyright registration was renewed. By contrast, renewals of registrations for more obscure works may be less likely.
  • Creation date may determine when the copyright term begins and ends; it is especially important when the author is unknown, the work is a work made for hire, or the work is one of corporate authorship (i.e., a work created by a “corporate body” such as a movie studio or record company).
  • In 2006 the US Copyright Office issued its report on orphan works.14 Hearings on orphan works were held in both the House of Representatives and the Senate, and legislation amending the Copyright Act to reduce the legal liabilities relating to use of orphan works was introduced in the House. While this legislation did not pass, many experts think that orphan-works legislation could be enacted in the next few years. Indeed, the US Copyright Office issued a Notice of Inquiry seeking updated comments about current issues relating to orphan works. If new legislation is introduced and passed, many hope that penalties and remedies for use of orphan works will be reduced or eliminated altogether. For that reason, it makes sense to identify works in institutional collections as orphan works now. Moreover, regardless of whether or not orphan-work legislation passes, it seems reasonable that if an institution attempts to identify and/or locate the copyright claimant and cannot do so despite diligent efforts, and this is explained to the court, there may be some recognition of this good-faith activity by the judge if an infringement claim is brought by the emergent copyright claimant.
  • Prior to 1978 the law required that a copyright notice be affixed to published works. Failure to include a legally sufficient notice put into the public domain American works that were published in the United States (without the notice). Therefore, an institution may decide to classify works as in the public domain if they were purchased before January 1, 1978, or were believed to have been offered for sale to the public before that date, and there is no copyright notice affixed to the work.
  • Obviously, if one knows a work is in the public domain or if the institution owns the copyright, permission to use the work is not required by law, although local policy may require internal authorization.
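Two of the rules of thumb above reduce to simple arithmetic, sketched here as an illustration only. The function names are hypothetical, the rules are limited to what this chapter states (the renewal requirement for US works published between 1923 and 1963, and the pre-1978 notice requirement for American works published in the United States), and the results are prompts for further research, not legal determinations.

```python
from datetime import date
from typing import List, Optional

ORIGINAL_TERM_YEARS = 28           # length of the original copyright term, per the text
NOTICE_CUTOFF = date(1978, 1, 1)   # notice was required on works published before this date


def renewal_year(publication_year: int) -> Optional[int]:
    """Year in which renewal fell due for a US work published 1923-1963.

    Returns None outside that range, where the renewal rule described
    in the text does not apply.
    """
    if 1923 <= publication_year <= 1963:
        return publication_year + ORIGINAL_TERM_YEARS
    return None


def renewal_search_years(publication_year: int) -> List[int]:
    """Renewal-record years to check: the calculated year and its neighbors."""
    due = renewal_year(publication_year)
    if due is None:
        return []
    return [due - 1, due, due + 1]


def flag_missing_notice(offered_to_public: date, has_notice: bool) -> bool:
    """True if a work offered to the public before 1978 carries no notice.

    Assumes an American work published in the United States; a True
    result only marks the work as a public domain candidate for review.
    """
    return offered_to_public < NOTICE_CUTOFF and not has_notice


print(renewal_year(1945))
print(renewal_search_years(1945))
print(flag_missing_notice(date(1960, 6, 1), has_notice=False))
```

The search list includes the years on either side of the calculated renewal year because note 12 advises checking the adjacent years in the US Copyright Office renewal records.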

In order for catalogers and rights metadata analysts to be able to populate the recommended metadata elements, the institution will need some basic rules or assumptions to apply when copyright and publication status may not be clear and some suggestions for resources to help locate the sought-after information. There are numerous recommendations for where to look for the information requested. Currently, there is no resource that sets forth commonly accepted practices about what is legally reasonable to assume about copyright or publication when only limited information is available, so institutions will need to draft their own guidelines.15 Of course, local policy regarding use of material presumed to be protected by copyright and the institution’s risk tolerance for infringement claims that arise in case the assumption is wrong will govern use decisions.16 With a little bit of effort, however, the basic information needed to make informed decisions about rights for many works in an institution’s collections could be easily available and accessible if the suggested rights information is captured.

Any rights metadata effort should be viewed as dynamic and ongoing. New information may come from various sources: a user, a curator, a librarian, or even the creator of the work. Rights information needs to be updated and augmented, and additional information will need to be captured for works with more complicated rights, such as audiovisual materials. Therefore it is important that staff tasked with inputting rights metadata be identified to all those involved in cataloging and digitization efforts so that when new rights information is discovered, it can be input into the appropriate information system.

Now is the time to get started and not to be overwhelmed. Rights metadata can be made simple if everyone in the institution is aware of its long-term importance and there is a concerted, coordinated effort to research it, record it according to standards and best practices, and share it in fulfillment of the institutional mission in the digital age.

Table 1. Example of Core Elements for Rights Metadata

Each metadata element is listed with its valid data values, followed by two examples: a public domain work and a work not in the public domain.

Title
  Valid data values: The data values for this element should be copied (preferably in an automated manner) from the title element from the descriptive metadata record for the work or item. Per Cataloging Cultural Objects (CCO), this element, which is repeatable, can contain translated titles, brief titles, display titles, etc., in addition to the title that is inscribed on the item or object, if one exists. Include a subelement for the parent object/work (“title larger entity”) when applicable.
  Example, public domain work: Puzza in the Likeness of Isis, Seated on a Lotus Flower/Puzza sous une forme parallele à Isis, assise sur la fleur de lotos; from Cérémonies et coutumes religieuses de tous les peuples du monde
  Example, work not in the public domain: San Diego Stadium (San Diego, California); from Julius Shulman photography archive

Creator
  Valid data values (name): The name of the creator of the original object or work, taken from a published controlled vocabulary (e.g., Library of Congress Name Authority File, Library of Congress Subject Headings, the Getty’s Union List of Artist Names) or local authority file whenever possible.
  Example, public domain work: Picart, Bernard
  Example, work not in the public domain: Shulman, Julius
  Valid data values (life dates): The life dates in the case of individual creators, including the death date if applicable. Dates should be expressed according to a standard format (e.g., ISO 8601).
  Example, public domain work: b. 1673-11-06, d. 1733-08-05
  Example, work not in the public domain: b. 1910

Creation dates
  Valid data values: The date(s) of the creation of the work.* Dates should be expressed according to a standard format (e.g., ISO 8601).
  Example, public domain work: 1723–1743
  Example, work not in the public domain: 1967

Creator nationality
  Valid data values: The nationality or culture of the creator of the work, if known.
  Example, public domain work: French
  Example, work not in the public domain: American

Copyright status
  Valid data values: Valid values for this element should be selected from a controlled list. For example:
    • Copyright owned by the institution that holds the original object/work or item
    • Copyright owned by a third party (include a subelement for the name of the third party, taken from a published controlled vocabulary whenever possible)
    • Public domain
    • Orphan work
    • Not yet researched
  Example, public domain work: public domain
  Example, work not in the public domain: copyright owned by institution; © J. Paul Getty Trust

Publication status
  Valid data values: Valid values for this element should be selected from a controlled list. For example:
    • Published (include a subelement with the date of publication, if known, in a standard format such as ISO 8601). Note that date of creation and date of publication are not necessarily identical.
    • Unpublished (in which case, the creator dates and/or date of creation are extremely important)
    • Unknown, after research and due diligence
    • Not yet researched
  Example, public domain work: published 1723–1743
  Example, work not in the public domain: not researched

Date of rights-metadata research
  Valid data values: This should be a repeating element, since metadata research is often necessarily an incremental process to which more than one individual contributes. The individual’s name or initials should be provided by the information system and associated with the relevant dates of research. Dates should be expressed according to a standard format (e.g., ISO 8601).
  Example, public domain work: 2008-10-07 MTW
  Example, work not in the public domain: 2007-09-13 MTW

* Note that under current US copyright law, a work is protected for the life of an individual author/creator plus 70 years regardless of the date of creation. The copyright term for corporate works and works made for hire is 120 years from the date of creation, or 95 years from the date of publication, whichever expires first.
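
The terms in the note above can be worked through as a short calculation. This is a rough sketch under stated assumptions: it uses the statutory figures of life plus 70 years for individual creators and, for corporate works and works made for hire, the shorter of 120 years from creation or 95 years from publication (17 U.S.C. §302), and it assumes terms run through the end of the calendar year. The function names are hypothetical, the years in the usage lines are invented for illustration, and the special rules for older published works and works of foreign origin are not modeled.

```python
from typing import Optional

LIFE_PLUS_YEARS = 70              # individual author/creator
CORPORATE_FROM_CREATION = 120     # corporate works and works made for hire
CORPORATE_FROM_PUBLICATION = 95


def last_protected_year_individual(death_year: int) -> int:
    """Last calendar year of protection for a work by an individual creator."""
    return death_year + LIFE_PLUS_YEARS


def last_protected_year_corporate(creation_year: int,
                                  publication_year: Optional[int] = None) -> int:
    """Last calendar year of protection for a corporate work or work made for hire.

    Uses whichever term expires first; if the work is unpublished, only
    the term measured from creation applies.
    """
    from_creation = creation_year + CORPORATE_FROM_CREATION
    if publication_year is None:
        return from_creation
    return min(from_creation, publication_year + CORPORATE_FROM_PUBLICATION)


# Hypothetical inputs, chosen only to show the arithmetic.
print(last_protected_year_individual(1950))
print(last_protected_year_corporate(1967, 1970))
print(last_protected_year_corporate(1967))
```

A work would be treated as entering the public domain on January 1 of the year after the value returned, which is why recording creator death dates and creation and publication years separately matters.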

Author’s Note

The rights metadata proposal and examples provided herein are not legal advice. To answer specific questions of law or address policy matters with legal implications, professional advice from an attorney is always recommended.

  1. See http://dp.la/.

  2. For a detailed explanation of the process and results of the effort to create the rights metadata dictionary, see my article “Developing a Rights Metadata Dictionary for Digital Surrogates,” Journal of Library Metadata 9, nos. 1–3 (2009): 15–35.

  3. These suggestions for a simplified rights metadata approach are based on required rights metadata recommendations for copyrightMD, an XML schema for rights metadata developed by the California Digital Library. The copyrightMD schema is designed for incorporation with other XML schemas for descriptive and structural metadata (e.g., CDWA Lite, MARCXML, METS, MODS). See http://www.cdlib.org/groups/rmg/.

    Note that the title of the work is not identified herein as a rights metadata element per se; it is assumed that the title would be included in any metadata schema used to describe the work and, thus, that element could be copied into the rights metadata schema from the descriptive metadata record in an automated manner.

  4. There may be certain conditions under which a license for certain specified uses of the work may have been granted to the institution. A license is not the same as ownership. If desired, when the copyright is known to be owned by a third party, the pick list could include an option for “license granted to the institution”; such a notation by itself, however, would not be adequate to describe the various rights granted, or denied, or the specific term during which the license is valid, so a review of the specific licensed rights would be necessary.

  5. “An ‘orphan work’ is an original work of authorship for which a good faith, prospective user cannot readily identify and/or locate the copyright owner(s) in a situation where permission from the copyright owner(s) is necessary as a matter of law.” “Notices: Library of Congress, Copyright Office [Docket No. 12-2012], Orphan Works and Mass Digitization,” Federal Register 77, no. 204 (Monday, October 22, 2012): 64555; http://www.copyright.gov/fedreg/2012/77fr64555.pdf.

  6. See the Copyright Act of 1976, 17 U.S.C. §101.

  7. There is increasing discussion about embedding rights metadata into the same file as the digital surrogate of the image, thus avoiding the problem of two digital files that can and do get separated during transmission. To date, embedding rights data has been done only under limited circumstances, and the software necessary to embed the data and provide users with access to it using a free, downloadable reader is not yet widely available.

  8. The Copyright Act includes a number of limitations on (rights holders’) exclusive rights. The best known of these limitations is fair use (section 107), whereby use of copyrighted works without permission of the rights holder is permitted if the use meets the statutory four-factor test. Another important exception applies to libraries and archives (section 108). Under this exception, libraries and archives are permitted to make copies of works in their collections under certain circumstances without permission of the rights holder, including replacement copies of published works, preservation and security copies of unpublished works, and copies for users, provided the copy becomes the property of the user and is for private study, scholarship, or research.

  9. Examples include assumptions based on US copyright law; examples and assumptions for non-US jurisdictions are not provided herein.

  10. Available at http://copyright.cornell.edu/resources/publicdomain.cfm and http://copyright.cornell.edu/resources/docs/copyrightterm.pdf.

  11. From section 107 of the Copyright Act of 1976:

    Limitations on exclusive rights: Fair use. Notwithstanding the provisions of sections 106 and 106A, the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include—

    • (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
    • (2) the nature of the copyrighted work;
    • (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
    • (4) the effect of the use upon the potential market for or value of the copyrighted work.

    The fact that a work is unpublished shall not itself bar a finding of fair use if such finding is made upon consideration of all the above factors. (emphasis added)

    Prior to passage of the Copyright Act of 1976, fair use was based on court decisions. In 1985 the US Supreme Court, in Harper & Row Publishers Inc. v. Nation Enterprises (471 U.S. 539), ruled on the applicability of the fair-use defense to unpublished works, noting that the “author’s right to control the first public appearance of his undisseminated expression will outweigh a claim of fair use” (p. 555). To clarify how the unpublished nature of a work was to be evaluated under the four-factor fair-use test set forth above, and to reverse a growing presumption that fair use was not available as a defense against an infringement claim for any unpublished work, Congress amended the law in 1992, adding the last sentence of this section (the one marked “emphasis added” above). Notwithstanding this amendment, there is general legal consensus that courts will give greater weight to the unpublished nature of the work in fair-use cases than would be given if the work had already been published.

  12. All terms of original copyright run through the end of the twenty-eighth calendar year following publication, making the period for renewal registration in the above example December 31, 1973, to December 31, 1974. When checking the US Copyright Office renewal records, it is advisable to look at the years immediately preceding and following the calculated year for copyright-term expiration. This helps confirm that the work was not properly renewed in a different year.

  13. William M. Landes and Richard A. Posner, “Indefinitely Renewable Copyright” (John M. Olin Program in Law and Economics Working Paper No. 154, University of Chicago Law School, 2002), http://ssrn.com/abstract=319321 or DOI: 10.2139/ssrn.319321.

  14. Report on Orphan Works: A Report of the Register of Copyrights (Washington, DC: US Copyright Office, 2006), http://www.copyright.gov/orphan/orphan-report.pdf.

  15. Drafting the assumptions to be applied locally should not be used as an excuse to delay capturing rights metadata. If necessary, start with the rights information that is known and agree on the assumptions over time.

  16. Institutions may have zero risk tolerance or may have collections composed primarily of works of living artists. In either case, local policy may be to seek permission. Others may feel that a good-faith judgment based on reasonable assumptions applied to the law and the facts is sufficient to allow use and to defend against infringement claims.