
Date

Attendees

Goals

  • Learn about opportunities for using Merritt to preserve web archive collections; connect on new leadership for WACKG and current campus collecting initiatives

Discussion items

Time | Item | Who | Notes
~30 minutes | Merritt for web archives digital preservation | Eric Lopatin, John Chodacki

presentation slides: https://docs.google.com/presentation/d/1gJ5zjhDNJPEVA28gmJmoSjOZawZjb9wuRt7Pifcd7_s/edit?usp=sharing

Eric's email address for follow ups: eric.lopatin@ucop.edu

WACKG-compiled questions doc 

  • post-presentation questions
    • CM: how to structure collections in Merritt and make fine-grained decisions about what to preserve, if not everything? JC: could set up a default for collecting all, or do a custom, selective approach. would use the existing collections structure in AIT.
    • KM: WARC playback potential outside of AIT/Wayback (if somehow we lose connection with AIT)? could Merritt function as an access repository? EL: WARCs + CDX files should ensure/position for playback. building playback could be an option down the road? JC: out of the box, pre-assigned URLs if both WARC and CDX are in Merritt; could pull them down and bring them into another system for playback, without mediation. could also generate the index off the WARC as a backup.
    • could bring in other IA files to back up as well using another IA API
    • CM: potential for digging deep into mechanics of WARC files, tech specs, etc. JC: could connect on this in the future for the CKG and perhaps with IA (Jefferson et al) as well.
    • CM: metadata in Merritt? JC: could add metadata. EL: can augment the local ID and also add e-resource citation, metadata, etc.
    • LO: metadata is crucial for UCSC and CA.gov. would love to get the anatomy of a WARC file to better understand. also, how flexible are the workflows and scheduling of harvests? how to signal wanting to do a reload/new upload? JC: yes, and APIs allow for querying for changes.
  • JC: costs - they've become relatively low. given your local decision making for costs, do you think the stated costs would be a concern for approval? interested in lowering any hurdles for folks... can explore more and get back to this group on this - please stay in touch with thoughts on interest and costs. CM: as with anything, yes - hasn't been an issue so far. EA: for Irvine, seems doable given that Merritt prices have gone down; thinking about sustainability, costs will increase over time as collections are added (this year will see an increase).
  • EA: has anyone tested downloading from AIT/uploading WARCs to Merritt? building a visual flowchart start to finish from nomination to management/access, but not sure about the preservation piece. TM: have downloaded, but not done anything with them.
  • NB that ingest could take time and iteration. thoughts on collections mirrored in Merritt? would be great to have them be the same as in AIT?
  • Next meeting: a continuation of this discussion once folks have had time to digest and consider details for working with Merritt.
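For reference on the "anatomy of a WARC file" and "generate the index off the WARC" points above, here is a minimal sketch of how a WARC is laid out and how a CDX-like listing could be derived from it. This is not Merritt's or AIT's tooling. The helper names (`make_record`, `index_warc`) and the sample records are made up for illustration, and the sketch assumes an uncompressed WARC/1.0 file. WARCs downloaded from AIT are typically gzip-compressed, so in practice a maintained library such as warcio would likely be a better starting point.

```python
import io


def make_record(warc_type, uri, payload):
    """Build one uncompressed WARC/1.0 record (illustrative sample data only)."""
    headers = (
        "WARC/1.0\r\n"
        f"WARC-Type: {warc_type}\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        "WARC-Date: 2020-01-01T00:00:00Z\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # A record is: header lines, blank line, content block, then two CRLFs.
    return headers + payload + b"\r\n\r\n"


def index_warc(stream):
    """Yield one dict per record: type, URI, date, byte offset, record length."""
    while True:
        offset = stream.tell()
        version = stream.readline()
        if not version:
            break  # end of file
        if not version.startswith(b"WARC/"):
            raise ValueError(f"not a WARC record at offset {offset}")
        fields = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the header section
            name, _, value = line.decode("utf-8").partition(":")
            fields[name.strip().lower()] = value.strip()
        # Skip the content block without reading it into memory.
        stream.seek(int(fields["content-length"]), io.SEEK_CUR)
        stream.readline()  # first trailing CRLF
        stream.readline()  # second trailing CRLF
        yield {
            "type": fields.get("warc-type"),
            "uri": fields.get("warc-target-uri"),
            "date": fields.get("warc-date"),
            "offset": offset,
            "length": stream.tell() - offset,
        }


sample = (
    make_record("response", "http://example.org/", b"HTTP/1.1 200 OK\r\n\r\nhello")
    + make_record("response", "http://example.org/about", b"HTTP/1.1 200 OK\r\n\r\nabout")
)
entries = list(index_warc(io.BytesIO(sample)))
for entry in entries:
    print(entry["date"], entry["uri"], entry["offset"], entry["length"])
```

The offset and length pairs are what a playback tool would use to jump straight to a capture inside a large WARC, which is why keeping (or being able to regenerate) the index alongside the WARCs matters for playback outside AIT.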
5-10 minutes | Administrivia | Kathryn
  • WACKG leadership going forward
  • Discussion topic backlog:

    • AIT hygiene (e.g., how are collections structured, metadata used to organize/describe, etc.)
    • capture tools scan/assessment

    • web archives use (including analytics from AIT)

    • web archives as data

    • deep dive on UCSB COVID-19 collecting?

10-15 minutes | News of Note: announcements, conference presentations/report backs, project updates, calls for participation, etc. | All

Action items

  •