Level One approach and compromises

Where does this fit?

Level One is the first metadata schema implemented in OAI4Courts, and corresponds to the mandatory unqualified Dublin Core schema required by OAI-PMH.

What are things called?

If you're not familiar with the operation of American courts, some definitions may be helpful. And if you are familiar with American courts, you may find that we've used terms in specialized, unfamiliar ways.
  • A decision represents an occasion on which a court issues an opinion about a legal matter. The decision may involve one or more cases or disputes; appellate courts like to combine similar cases for disposition.
  • A decision is expressed as one or more writings. We consider a writing to be a computer file.
  • Different courts bundle their writings differently. For example, the Supreme Court of the US issues its decisions in separate files, one each for a summary of the case, a majority opinion, and (if present) concurrences, dissents, and so on. By contrast, the New York Court of Appeals issues its decisions as a single computer file that contains all of these things as divisions of a single electronic document. For OAI purposes, a writing is an item about which metadata can be disseminated in a record. In conversation, this can be a little confusing, since when we speak casually of the "opinion" we might be referring collectively to all of the writings associated with the decision, or to the particular writing that is the majority opinion.
  • Writings have formal identifiers, as OAI requires for items. The syntax and semantics of those identifiers are addressed in a separate document. These identifiers may be considered retrieval keys for a particular repository. While they may well contain elements that relate them to existing systems of citation (and in particular the official citation), it is worth remembering that conventional, print-based citations are not unique identifiers, and are often not known at the time the decision is first disseminated. Most repositories will provide a series of alternative identifiers in the metadata record for the case, including official citation once it becomes available.
  • Writings also have authors, who are a subset of the judges involved in a case.
  • A decision might dispose of multiple cases. It is fairly common for an appellate court in the US to combine similar appeals for purposes of issuing a decision. For a really complicated illustration of this, see AT&T Corp. v. Iowa Utilities Board; the interesting stuff is in the asterisked footnote.
  • Cases have parties and advocates and other people associated with them, and those people in turn have roles (plaintiff, defendant, etc.).
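The relationships among these terms can be sketched as a small data model. This is only an illustration of the vocabulary above, not part of the standard: the class and field names (Decision, Writing, Case, Party, kind, caption) are our own assumptions, and the sample names are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Party:
    name: str
    role: str  # e.g. "plaintiff" or "defendant"


@dataclass
class Case:
    caption: str  # hypothetical field: the name the case goes by
    parties: List[Party] = field(default_factory=list)


@dataclass
class Writing:
    identifier: str  # formal identifier; its syntax is defined in a separate document
    kind: str        # hypothetical label such as "majority", "dissent", "bundle"
    authors: List[str] = field(default_factory=list)


@dataclass
class Decision:
    cases: List[Case]        # one decision may dispose of several cases
    writings: List[Writing]  # one decision is expressed as one or more writings
    judges: List[str] = field(default_factory=list)

    def items(self) -> List[Writing]:
        """Each writing is one OAI item, so a decision yields one or more items."""
        return list(self.writings)

    def authors_are_judges(self) -> bool:
        """Authors are expected to be a subset of the judges in the case."""
        return all(a in self.judges for w in self.writings for a in w.authors)


# Invented sample: two combined cases, decided in two separately issued writings.
decision = Decision(
    cases=[
        Case("A v. B", [Party("A", "plaintiff"), Party("B", "defendant")]),
        Case("C v. B", [Party("C", "plaintiff"), Party("B", "defendant")]),
    ],
    writings=[
        Writing("writing-1", "majority", ["Judge X"]),
        Writing("writing-2", "dissent", ["Judge Y"]),
    ],
    judges=["Judge X", "Judge Y", "Judge Z"],
)
print(len(decision.items()), decision.authors_are_judges())
```

A court that bundles everything into one file would have a single Writing with several authors, while the decision and case objects stay the same.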

How does this translate into unqualified Dublin Core metadata?

OAI-PMH mandates the use of unqualified Dublin Core elements as one of the available schemas. This forces difficult choices. In most cases, I have found ruthless pruning of metadata to be preferable to the use of implied semantics.

Dates provide a good example of how one might make choices about this. Most decisions have multiple significant dates associated with them: the date the case was argued, the date it was decided, and a variety of others related to procedural and publication milestones. Because unqualified Dublin Core provides no mechanism by which one may explicitly distinguish one flavor of date from another, it is tempting to imply qualification by (say) repeating the element three times and assuming that "everyone" knows that the first date is the date argued, the second the date decided, and the third the date posted to the Internet.

Experience shows that such practices are flimsy and dangerous. Better to prune ruthlessly and issue a single dc:date element for the date of decision, which is by any measure the most important of the lot. We make similar decisions throughout this standard. The only exception is overloading of the dc:description element to provide helpful hints when a particular item contains multiple subdocuments or disposes of multiple cases.
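A minimal sketch of this pruning, assuming a repository holds its dates in a simple mapping. The function name, the keys "argued", "decided", and "posted", and the sample dates are all hypothetical; only the dc:date element and the Dublin Core namespace come from the schema itself.

```python
from datetime import date
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def decision_date_element(dates):
    """Return one dc:date element carrying only the date of decision.

    `dates` is a hypothetical mapping of date kinds to datetime.date values.
    Every kind other than "decided" is deliberately dropped rather than
    emitted as an unlabeled, repeated dc:date.
    """
    elem = ET.Element(f"{{{DC_NS}}}date")
    elem.text = dates["decided"].isoformat()  # YYYY-MM-DD
    return elem


# Invented dates, for illustration only.
dates = {
    "argued": date(2000, 11, 14),
    "decided": date(2001, 3, 5),
    "posted": date(2001, 3, 6),
}
xml_text = ET.tostring(decision_date_element(dates), encoding="unicode")
print(xml_text)
```

The argued and posted dates never reach the record, so a harvester cannot mistake one for another.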

Decisions about representation

A discussion of these decisions follows, along with a best-practices guide. A simple summation of the standard by element appears elsewhere. No doubt there will be many situations in which (for example) choosing a single piece of metadata from competing candidates will be a real problem, with arguments to be made in favor of multiple different candidates. In such cases good practice would be to document the choice in one of the description elements associated with the OAI Identify response, which functions as a sort of colophon.

What follows on this page is a small (probably incomplete) set of guidelines for representation decisions. A summary of dc element usage that is also suitable for application to in-file markup (e.g., as a header using META tags) appears on another page. We also provide a few worked examples.

Choosing among multiple candidates when only a single element is available

  • The date of decision is represented using the dc:date element. It should be the official date of the judgment, represented in ISO 8601 YYYY-MM-DD format.
  • In the case of documents where there is no date of decision per se, the general guideline would be to use the date on which whatever is decided becomes effective -- such as a date of signature, effective date of an order, etc. We realize that there are many situations in which this presents difficult choices (one that comes to mind is when the signature date of an official document is not the one on which parties are compelled to take some sort of action).
  • Dates of argument and posting, effective dates, and other dates are not encoded at this level of the standard. Their treatment will be clarified and made explicit in the level 2 standard.
  • Problems of multiplicity
    • Unqualified DC does not deal well with situations where multiple authors have contributed sub-documents within a multiply-authored document. The presence of sitting judges who are not identified as authors can be an additional complication in many cases. The encodings described below are a deliberate compromise that emphasizes the importance of authorship somewhat at the expense of non-author participants in the process.
  • Author versus sitting judge
    • The primary author of the judgment is encoded using the dc:creator element. Use of multiple dc:creator elements for a single item can be interpreted as indicating group authorship of the item. Other sitting judges who are not authors are not represented at this level of the standard. While it would be tempting to use dc:contributor for this, the dc:contributor element is reserved for situations involving multiple authors of subdocuments within a compound document.
  • Group authorship
    • Group authorship is indicated through the use of multiple dc:creator elements with a single item.
    • In situations where a majority opinion is packaged in the same file with minority opinions under separate authorship, the dc:creator element should be reserved for the author of the majority opinion. The dc:contributor element is used for judges writing dissents, concurrences, and so on. Identification of contributors with particular subdocuments should be done in the markup of the file itself.
    • In situations where a decision is expressed in multiple files (each with some part of the decision, such as a concurrence, dissent, and so on), the primary author of each file will be identified with dc:creator. Use of dc:contributor to identify judges joining the primary author is optional, and possibly not a very good idea.
  • Names of parties
    • Party names are not encoded as such; rather, they appear in the dc:title element. Unfortunately, unqualified Dublin Core does not provide much support for names or roles other than those associated with authorship. As a practical matter, titles of judgments are typically designed to embed party names in ways that are easy to recognize and parse, so this may not be much of a problem.
  • Names of legal representatives
    • No attempt is made to encode the names of legal representatives (attorneys) at this level of the standard.
  • Names of judges
    • Treated under "authors". Sitting judges who are not authors are not encoded at this level of the standard.
  • Roles
    • Beyond authorship and participation in the decision-making process, no attempt is made to encode roles of parties or their representatives at this level of the standard.
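The authorship guidelines above might be applied along the following lines. This is a sketch under our own assumptions: the helper names, the parameter names, and the judges are invented, and returning the list of omitted judges is simply one way a repository could keep track of what this level leaves out.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dc(tag, text):
    """Build one unqualified Dublin Core element with the given text."""
    elem = ET.Element(f"{{{DC_NS}}}{tag}")
    elem.text = text
    return elem


def authorship_elements(majority_authors, other_authors=(), sitting_judges=()):
    """Encode authors of a single item.

    majority_authors -> dc:creator (several creators indicate group authorship)
    other_authors    -> dc:contributor (dissents, concurrences in the same file)
    Sitting judges who wrote nothing are not encoded; they are returned
    separately so the caller can see what was omitted.
    """
    elems = [dc("creator", name) for name in majority_authors]
    elems += [dc("contributor", name) for name in other_authors]
    written = set(majority_authors) | set(other_authors)
    omitted = [name for name in sitting_judges if name not in written]
    return elems, omitted


# Invented panel: A wrote the majority, B and C wrote separately, D only sat.
elems, omitted = authorship_elements(
    majority_authors=["Judge A"],
    other_authors=["Judge B", "Judge C"],
    sitting_judges=["Judge A", "Judge B", "Judge C", "Judge D"],
)
for elem in elems:
    print(ET.tostring(elem, encoding="unicode"))
print("not encoded:", omitted)
```

Which contributor wrote which subdocument is not recoverable from this metadata; that association belongs in the markup of the file itself.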

Single elements

Court or official body
Represented by the dc:coverage element. This seems an odd choice, but is consistent with DC practice. This is a place where some form of metadata registry would be particularly useful. At this level of the standard, the dc:coverage element should encode only the name of the court or official body. More complicated representations of the court's type, jurisdiction, and relationship with other courts are left to higher levels of the standard.

Identifiers
Use of a dc:identifier element that specifies a URI for retrieval of the actual item described is mandatory (this should not be confused with the OAI identifier, whose syntax and semantics are addressed in a separate document). Other identifier elements can be used for official citation, docket numbers, parallel citations, and the like, without limit. This level of the standard pays no attention to the internal semantics or structure of these other identifiers.
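As an illustration of the two single elements just described, the sketch below emits one dc:coverage element and one or more dc:identifier elements. The function name, the court name, the example.org URL, and the placeholder docket number and citation are all assumptions made for the example. Placing the retrieval URI first is our own convenience, not a rule: as noted earlier, order in unqualified Dublin Core should not be relied upon to carry meaning.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dc(tag, text):
    """Build one unqualified Dublin Core element with the given text."""
    elem = ET.Element(f"{{{DC_NS}}}{tag}")
    elem.text = text
    return elem


def court_and_identifiers(court_name, retrieval_uri, other_identifiers=()):
    """Encode the court name and the identifiers for one item.

    The retrieval URI is mandatory, so a value without a URI scheme is
    rejected. Other identifiers are passed through untouched, since this
    level ignores their internal structure.
    """
    if not urlparse(retrieval_uri).scheme:
        raise ValueError("a retrieval URI is required for dc:identifier")
    elems = [dc("coverage", court_name), dc("identifier", retrieval_uri)]
    elems += [dc("identifier", ident) for ident in other_identifiers]
    return elems


# Placeholder values, for illustration only.
record = court_and_identifiers(
    "Example Court of Appeals",
    "https://example.org/opinions/00-0000.html",
    other_identifiers=["No. 00-0000", "000 Ex. 000"],
)
for elem in record:
    print(ET.tostring(elem, encoding="unicode"))
```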

More on different packages of writing

Courts package their writings differently under different circumstances. Particular challenges are posed by the dissemination of a single judgment in multiple files (as when majority opinion, concurrences, and dissents are separately disseminated) or when a single writing or package of writings disposes of multiple cases or matters that the court has chosen to combine for purposes of decision. Similarly, the bundling of multiple subdocuments (majority opinion, concurrence, dissents) into a single item poses challenges. In general, this level of the standard makes compromises based on the assumption that these difficult cases are the exception rather than the rule. The rest of this section may be considered a "how-to" guide for the more difficult situations:

Multiply-authored judgments with a single majority opinion, such as per curiam decisions
  • Use multiple dc:creator elements to specify all authors.
  • The metadata is silent about sitting judges who are not authors.
Judgments with multiple subdocuments in a single dissemination
  • Use the dc:creator element to encode the author of the majority opinion.
  • Use the dc:contributor element to encode authors of other subdocuments.
  • Use a dc:description element with a multipart: token to indicate the presence of multiple subdocuments.
Judgments disseminated as multiple files or items
  • Use the dc:type element to describe what type of subdocument this item is.
  • Use the dc:relation element to specify other parts of the same judgment.
Judgments that dispose of multiple cases from lower courts
  • Use a dc:description element with a multicase: token to indicate the disposition of multiple cases.
Dissemination of judgments in multiple languages
  • Use the dc:language element to describe the language used.
  • Use the dc:relation element to specify a URI for another version.
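The packaging rules in this how-to list could be sketched as follows. Two things here are assumptions of ours rather than part of the standard: the function and parameter names, and above all the free text that follows the multipart: and multicase: tokens, since this page does not say what (if anything) should follow them. The URL and the type value "dissent" are likewise invented.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dc(tag, text):
    """Build one unqualified Dublin Core element with the given text."""
    elem = ET.Element(f"{{{DC_NS}}}{tag}")
    elem.text = text
    return elem


def packaging_elements(subdocuments=(), cases=(), part_type=None,
                       related_uris=(), language=None):
    """Describe how an item is packaged, following the how-to list.

    The text after each token is an ASSUMED payload (a comma-separated
    list); only the tokens themselves come from the guidelines.
    """
    elems = []
    if len(subdocuments) > 1:  # several subdocuments bundled in one item
        elems.append(dc("description", "multipart: " + ", ".join(subdocuments)))
    if len(cases) > 1:         # one item disposes of several cases
        elems.append(dc("description", "multicase: " + ", ".join(cases)))
    if part_type:              # this item is one file of a multi-file judgment
        elems.append(dc("type", part_type))
    for uri in related_uris:   # other parts, or other language versions
        elems.append(dc("relation", uri))
    if language:
        elems.append(dc("language", language))
    return elems


# A bundled file that disposes of two combined cases (invented values).
bundle = packaging_elements(
    subdocuments=["majority opinion", "dissent"],
    cases=["A v. B", "C v. B"],
)
# A separately disseminated dissent pointing at the majority opinion.
part = packaging_elements(
    part_type="dissent",
    related_uris=["https://example.org/opinions/00-0000-majority.html"],
    language="en",
)
for elem in bundle + part:
    print(ET.tostring(elem, encoding="unicode"))
```

An ordinary single-author, single-case item passes no arguments and gets no packaging elements at all, which matches the assumption that the difficult cases are the exception.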

Assertions about rights

In general this document assumes that assertions about intellectual property rights will apply uniformly to all items in a repository. This may be an overly simplistic approach born of American idiosyncrasies -- namely, the fact that in the US there is no copyright in government works. For this reason it seems reasonable to deprecate the use of separate statements about rights at the item level, and assume that they will be made at the repository level.