Child pages
  • Aggie input: metadata scheme


Before we fully implement the metadata scheme within the UCLDC framework, and in particular within the Nuxeo DAMS platform, we'd like your input and feedback on the digital object metadata model. Specifically, we'd like to confirm the following:

1) Scope of digital object metadata scheme: Please review the "Scope" statement on the UCLDC metadata scheme wiki page, and see also the spreadsheet (1st tab, "Digital Objects"). Does the scheme generally accommodate the range and kinds of metadata that you need to describe your resources?

2) Required digital object metadata: Please review the spreadsheet (1st tab, "Digital Objects"; see "Obligations" column). The specification indicates, for each metadata element, whether the element is required. We would like to confirm that the proposed requirements are acceptable. As we get further into defining the requirements for the UCLDC discovery/delivery interface, these requirements may need adjustment.

14 Comments

  1. Just a question for Adrian: Places is currently missing a DC property. Should it be dc:coverage?

  2. I think Places should be mapped to dc:coverage and dcterms:spatial (sub-property); on the other hand, Physical Location should be mapped to dc:description or dc:source?

  3. Hi Chrissy, Shu – oops, that's a mistake in the spreadsheet regarding the lack of a DC mapping for "Places."  It should indeed map to dc:coverage for the DC property (and dcterms:spatial for the DC sub-property). I'll update the spreadsheet to reflect this.

    "Physical Location" could be mapped to an alternative DC property, although there's no nice, clear candidate for handling of this type of information.  dc:coverage is a stretch, but details about the physical location of the item could potentially fall under this purview (http://www.dublincore.org/documents/usageguide/elements.shtml#coverage).

    Thanks!

    -- Adrian
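
As an illustration of the mapping confirmed in this comment, here is a minimal Python sketch of a crosswalk entry for "Places". The dictionary layout and the names `DC_CROSSWALK` and `dc_targets` are hypothetical and are not part of the UCLDC specification or Nuxeo. "Physical Location" is left out on purpose, since its mapping is still under discussion.

```python
# Hypothetical crosswalk: UCLDC data field -> (DC property, DC sub-property or None).
# Only the "Places" mapping confirmed in the thread is recorded here.
DC_CROSSWALK = {
    "Places": ("dc:coverage", "dcterms:spatial"),
}


def dc_targets(field_name):
    """Return the DC terms a field maps to, most general first."""
    prop, sub_prop = DC_CROSSWALK[field_name]
    return [term for term in (prop, sub_prop) if term is not None]


print(dc_targets("Places"))
```

Keeping the property and sub-property together in one entry would let an export fall back to plain dc:coverage for consumers that do not understand the dcterms refinements.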

  4. I thought coverage normally describes the intellectual content/scope of a resource - this is indeed a tough one!

  5. Hi all,

    Here are my initial thoughts on the draft UCLDC metadata model.

    1) The scheme does generally seem to accommodate the range and kinds of metadata needed to describe our digital resources. A few comments:

    • This may be a typo, but the example column of the draft scheme for publisher (for the original object) is color-coded as a controlled vocabulary field. We typically record a descriptive publication statement for publisher (e.g., New York : Harper) (mapping to MARC 260, mods:originInfo, dc:publisher). If you do plan to use this as a controlled vocabulary field, how would you accommodate descriptive publication statements?
    • Including an optional abstract or description field at the collection level might be nice for the user.
    • Is it too early to be talking about administrative metadata used to manage the digital object (e.g., date the record was last changed, descriptive standard used, etc.)?

    2) Yes, the proposed required fields are acceptable. Requirements can be tricky, especially when dealing with legacy digital objects from multiple campuses, but I think you have struck a nice balance.

    In terms of dates (which I recall Adrian mentioned briefly on our call), I would strongly encourage requiring, if at all possible, a machine-readable date whenever the date field is used.

  6. Hi Chrissy,

    Thank you for your comments and feedback!  Responses to your questions, below.

    -- Adrian

    1A) We were initially envisioning that the Publisher data field would hold the formal name of the publisher (personal name, organization name) -- derived according to a content standard or local guidelines, or taken from an authority file.  I.e., MARC 260 $b or MODS "publisher".  The date of publication could be indicated in the Date data field (which will have a type indicator for creation vs. publication dates).  We don't have a good analog in place yet for place of publication.

    That said, we could just configure the Publisher data field so that it could accommodate more descriptive, narrative publication info. -- basically, a free-text field -- if that's preferred by everyone.

    1B) Thanks for raising this. We haven't yet fully mapped out the descriptive data fields for Collection information, which would be managed primarily in the Collection Registry. This is something that we'd definitely like to start building out, and to get your collective feedback on, in upcoming releases. One thing that we'll need to balance is the degree to which we replicate information about collections in the Collection Registry when that information already exists in the form of EAD finding aids and MARC records (or on HTML web pages, etc.).

    1C) We also haven't mapped out non-descriptive or rights metadata that will be tracked in Nuxeo -- such as administrative metadata about files that are imported, etc. (creation date, modification, user that imported the file, version history, etc.).  That said, there are a number of default fields that we can enable and configure, which will automatically record and track this type of information.  For this upcoming Aggie release, you'll be able to see some examples of this in sample objects that we've loaded.  This is something we can also start to flesh out, in upcoming releases.

    3) The Date data fields will be modeled in Nuxeo to include both a descriptive form of the date information (e.g., "Circa 1920") and a normalized form of the date.
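
To make the descriptive-plus-normalized idea concrete, here is a minimal Python sketch of one date entry. It is not the Nuxeo model: the class name `DateEntry`, its field names, the use of a start/end range as the normalized form, and the ten-year window chosen for "Circa 1920" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DateEntry:
    """One date value: a descriptive form plus an optional normalized range."""
    display: str                  # descriptive form, e.g. "Circa 1920"
    date_type: str                # type indicator, e.g. "created" or "published"
    start: Optional[date] = None  # normalized start (assumed representation)
    end: Optional[date] = None    # normalized end (assumed representation)

    def covers(self, year: int) -> bool:
        """True if the normalized range includes the given year."""
        if self.start is None or self.end is None:
            return False
        return self.start.year <= year <= self.end.year


# The 1915-1925 window for "circa" is an arbitrary choice for this example.
circa_1920 = DateEntry(
    display="Circa 1920",
    date_type="created",
    start=date.fromisoformat("1915-01-01"),
    end=date.fromisoformat("1925-12-31"),
)
```

With something like this, the descriptive string would be what users see, while the normalized range could support sorting and date-based browsing.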

  7. Hi all,

    Here are my specific suggestions:

    • Alternative Title: should be repeatable; map to DPLA Title (dpla:SourceResource)
    • Campus: yes, should be repeatable; map to DPLA Publisher (dpla:SourceResource)
    • Contributors: should map to DPLA Contributor (dpla:SourceResource)
    • Dates: Single dates should be formatted in “YYYY-MM-DD”? Obligation should be yes for both instances?
    • Forms/Genres: should map to DPLA Type (dpla:SourceResource)?
    • Identifier: should be String? Can this be auto supplied?
    • Local Identifier: should be repeatable?
    • Places: map to dc:subject or dc:coverage (dcterms:spatial)
    • Provenance: should be repeatable
    • Related Resources: should be repeatable; use Text instead of List?
    • Repository: yes, should be repeatable
    • Subjects: obligation should be yes for both instances?
    • Temporal Coverage: make repeatable?
    • Title: obligation should be yes for both instances
    • Copyright Statement: obligation should be yes for both instances

    Here are my general questions:

    1. I'm confused by the DAMS data entry widget “List”. Under Scope it says “(Authority) terms are not dynamically pulled into the system from the external sources”; if that is true, why is there a DAMS widget “List”?
    2. What are the definitions of String, Text, and Textarea?
    3. The Scope lists two options for dealing with technical metadata. Which one is the way to go? It would be more consistent if we decided on one.
    4. Will the DAMS or Merritt have automated processes for handling preservation metadata according to PREMIS?

  8. Echo Shu's question about "List" for data entry. Not sure how this impacts elements like contributor or creator.

    Specifics:

    1. Access restrictions–this may be rare, but is there the ability to accommodate embargoes?
    2. Alt title - multivalued
    3. Campus - should be multivalued, esp. leaving open to new models of collaborative/collective acquisition. (In that sense, if it's a pick list, would we want to add "multi-campus" or "University of California" as a value?)
    4. Local identifier - multivalued
    5. Physical location - agree with Shu's suggestion; wouldn't map it to dc:coverage. It seems like dc:source would be more appropriate, but am not sure how that element could be elaborated.
    6. Provenance - multivalued
    7. Publishers: would funders or grants for a digitization project, for example, be acknowledged in the Publishers element? In that case a free-text field may be better.
    8. Related resources - multivalued
    9. Repository - multivalued. Not sure about values being auto-supplied from collection registry. Would this accommodate non-UC repositories? (Again, thinking of a digital resource which may be deposited at multiple locations, including a site outside of UC?)
    10. Licensing statements/terms - "vocab" or syntax could include CC licenses?

  9. Hi all,

    I think generally we should be able to fit our metadata into this schema with some minor re-mapping of our Dublin Core fields. Here are the fields we are using differently:

    • Campus/dc:publisher - right now we're only using that for the original publisher, not our campus name
    • Collection Title/dc:relation - we're putting this info in dc:title. Series titles are mapped to dc:relation
    • Format/Physical Description/dc:format - we use the field for "original format", "original size", and "scale" for maps. We could probably combine the separate fields into one.
    • Forms/genre/dc:type - we use limited terms for this field such as image, map, etc. The types of terms in the examples would go into our "original format" field, which is dc:format
    • Physical location/dc:coverage - Our physical location field is currently unmapped to DC, and we use dc:coverage for geographic location.
    • Places/dc:coverage - See above. We have 3 fields for this: Geographic Location.LCSH, Geographic Location.TGN, and Geographic Location.Local. Geographic coordinates also map to dc:coverage.
    • Provenance/dc:provenance - we have one field "donor and provenance" which maps to dc:source
    • Related Resources/dc:relation - we use this field for hyperlinks to related resources (finding aids, for example)
    • Repository/dc:publisher - we put this info in "Owning Institution and Contact Info" and "Owning Institution Homepage", which are currently unmapped to DC.
    • Temporal Coverage/dc:coverage - we put info in the date field

    Feel free to follow up with me or our metadata specialist, Belinda Egan, if you need more details or examples!

    And my answer to both of your questions above, Adrian, is yes - the scope and the required fields both seem reasonable to me.

  10. (1) I reviewed the scheme with our Metadata Librarian. We feel the scheme will work for our needs.

    Regarding the date field, we are glad to hear that there will be descriptive and normalized forms of the date. We'd advocate for defining what standard (ISO?) will be used for normalized dates and providing clear instructions and examples for the normalized dates.

    Regarding the publishers field discussion, will UCLDC maintain the authority list that participants would use?

    (2) The required fields as proposed are acceptable to us.   We think that copyright should be required for the discovery and delivery systems, but not for DAMS.

  11. OK, I just figured out that the DAMS widget "List" is a fairly static list rather than a dynamic one - see "Vocabulary schema" under "Nomenclatures used in the specification" on the "Metadata scheme" page.

  12. Here are UCLA's suggestions, which repeat some of the above!

    Suggest that Alternative Titles, Descriptions, and Related Resources be repeatable fields. In each case the contents can be of different types and it's nice to distinguish between them, even if only through visual separation in the display.

    Also suggest that Date be a required field, even if only to the century. It will help users browse and find relevant resources, and also help them make sense of things they find.

     

    Just for comparison, here are the elements UCLA requires that are in addition to the proposed CDL required elements:
    •  Creator (if available)
    • Date
    • Language
    • Physical Description




  13. 1) In terms of the scope of the digital object metadata scheme, one of the needs we have for a DAMS is management and re-use of assets (internal), so the more discovery-focused scope fits only part of our use case. Looking more narrowly at discovery and access, however, as well as sharing, the scheme seems sufficient for those applications.

    As for the specific data field definitions and mappings, I agree with much of what has already been noted, so will leave further discussion of the elements for when we dive into those more specifically.

     

    2) On the review of the Obligations/required digital object metadata fields, it seems the following should be required in the two instances (obligation for valid object and obligation for publication):

    - Date (oblig-valid/oblig-pub)

    - Language (oblig-pub)

    - Title (oblig-valid/oblig-pub)

    - Type (oblig-valid/oblig-pub) 
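
The two obligation levels in this comment could be checked along these lines. This is a minimal Python sketch that uses only the four fields this comment proposes, not the full Obligations column of the spreadsheet. The names `REQUIRED_FOR_VALID`, `REQUIRED_FOR_PUBLICATION`, and `missing_fields`, and the sample record, are hypothetical.

```python
# Required-field sets as proposed in this comment (not the adopted specification).
REQUIRED_FOR_VALID = {"Date", "Title", "Type"}
REQUIRED_FOR_PUBLICATION = REQUIRED_FOR_VALID | {"Language"}


def missing_fields(record, required):
    """Return the required field names that are absent or empty in the record."""
    return sorted(name for name in required if not record.get(name))


# Made-up sample record with no Language value.
record = {"Title": "Campus map", "Date": "1920", "Type": "image"}

print(missing_fields(record, REQUIRED_FOR_VALID))
print(missing_fields(record, REQUIRED_FOR_PUBLICATION))
```

Separating the two sets would let an object be saved as valid in the DAMS while still being held back from publication until the remaining fields are filled in.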

  14. Hi all,

     

    Thank you for reviewing the draft UCLDC metadata scheme, and for your feedback and input -- and sorry for the late follow-up.  I'm compiling everyone's comments into a summary, and will post this along with responses to specific questions and proposed changes to the scheme.

     

    Thanks again!

     

    -- Adrian