The automated harvesting system will closely monitor publication sources, including public and licensed publication indexes, for any new materials published by UC authors. The system will gather as much information about each publication as possible and will notify the author by email when new publications have been detected. Author-approved publications will then be submitted to eScholarship, where they will be available to the public. Phase 1 is a Pilot Phase that includes three campus partners: UC Irvine, UCLA, and UCSF.
Requirements that will be addressed in the initial release (June 2014), following identification of the vendor in the RFP process
|One system for multiple campuses||The harvesting system must support activity by faculty from three separate campuses in one install while maintaining clear distinctions between those campuses. For example, faculty from UC Irvine should not have to filter out information about UCLA or UCSF.|
|Integration with OAPI Waiver/Embargo/Addendum Generator||The harvesting system must be able to integrate information faculty have supplied to the OAPI waiver/embargo/addendum generator. For instance, if the system harvests a publication record for an item for which a faculty member has already established the need for a 12-month embargo and that embargo period has not expired, the faculty member should not be prompted to submit that item. The harvesting system will also need to allow all faculty to apply a waiver or embargo through this system.|
|Harvesting of publication metadata from non-UC sources||The harvesting system will harvest publication records for UC faculty from key non-UC sources, such as Web of Science and CrossRef. It must support the ability to harvest from licensed sources, as well as non-licensed sources (e.g. arXiv). In addition to having connections to major existing publication sources, the harvester will also be able to connect to new external sources as they arise or become of interest.|
|Harvesting of publication metadata from UC sources||The harvesting system will draw from UC sources that may hold UC faculty publication information, including eScholarship and campus systems where faculty may be maintaining publication lists. These targets are likely to contain publication information from journals and other venues not included in the major indexes such as Web of Science, but that are equally important to capture in OAPI. The harvester will support the ability for faculty to correct metadata before it is submitted to eScholarship.|
|Automatic notification of faculty||Manual deposit is an insufficient strategy for populating repositories with previously published materials, as it is a burdensome practice that most faculty will not take on. To ease the load on UC researchers, CDL intends to provide faculty with a harvesting and notification system that will harvest metadata about their publications, alert them to new items by email, and allow them, with the fewest clicks possible, to claim a publication, upload the appropriate full-text version, and submit it to eScholarship.|
|Manual deposit||A manual deposit workflow is required for those publications not captured through harvesting. eScholarship has a manual submission workflow that may serve this purpose, but it may be less confusing for faculty to use a manual deposit workflow from within the same environment in which they manage harvested publications.|
|Faculty ability to refine harvesting search queries||Faculty will need to be able to add alternate name versions associated with specific publication targets in order to increase the precision and recall of the harvester. Ideally, a faculty member would be able to trigger a harvest immediately after modifying the search pattern in order to confirm the efficacy of that modification.|
|License assignment||Faculty will need to be able to choose from various license options, including Creative Commons, depending upon publication date and campus.|
|Metadata fields||We will need to be able to determine which fields are required or not, and which fields appear or not. The system must be as simple as possible and not present unused or unnecessary fields that faculty members will have to visually filter out.|
|Customizable UI||We will need to be able to reorder metadata fields and form sections, and to brand the system so that faculty immediately recognize it as a UC service that also reflects their campus identity.|
|Customizable workflow/support for business rules||Faculty will need to be given different guidance depending on whether a harvested article was published before, or on or after, the effective date of the applicable open access policy (UC's or UCSF's). The CRIS will support different workflows based on publication date and campus, since the UCSF policy currently differs from UC's.|
|Faculty profile page||Faculty publications will have to be clearly presented on a profile page in the harvesting system.|
|Proxy support||The harvester will support the ability for a faculty member to designate proxies to manage publications on their behalf. Administrators will need to have the ability to act for faculty as well.|
|Hierarchical listing of departments||Faculty profiles will be organized by departments, which will in turn be represented in the hierarchical structure for that campus.|
|Ingest of HR data||Faculty names, departments, etc. will be initially established and regularly updated through feeds from UC HR systems. Optimally this will come from UC Path, but may initially begin with campus HR systems.|
|Shibboleth||Shibboleth/UC Trust will be used to authenticate and authorize users for the harvesting system, so that an additional set of credentials will not be required.|
|Push to multiple designated repositories||The harvester will allow faculty to approve submission of the uploaded publication and metadata to eScholarship and will also support the direct submission to Merritt for dark archiving only.|
|APIs for integration/connection to third||Various campus systems will consume data from the harvester in order to function more effectively. For instance, campus profile systems and campus academic personnel systems will use the harvester's API in order to incorporate faculty publication data and eliminate the need for duplicate entry.|
|Reporting||The Harvester will provide default reports about activity, such as the number of faculty profiles; publications harvested, claimed, and approved over periods of time; etc. In addition, the Harvester will support the ability to create custom reports.|
Symbol Guide: = Completed; = In Progress; = On Hold / Awaiting Other Development
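As an illustration of the embargo rule in the "Integration with OAPI Waiver/Embargo/Addendum Generator" requirement above, the following is a minimal sketch of how a harvester might decide whether to prompt a faculty member. It is not the vendor's implementation. The names `HarvestedRecord`, `add_months`, and `should_prompt` are hypothetical, and the sketch assumes the embargo period is measured from the publication date, which the requirement does not specify.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Return the date `months` months after `start`, clamping the day
    to the last day of the target month (e.g. Jan 31 + 1 month = Feb 28)."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


@dataclass
class HarvestedRecord:
    """Hypothetical harvested publication record."""
    title: str
    published: date
    # Embargo length in months, as supplied to the waiver/embargo
    # generator; None means no embargo was requested.
    embargo_months: Optional[int] = None


def should_prompt(record: HarvestedRecord, today: date) -> bool:
    """Prompt the faculty member only when no embargo is still running.

    Assumption: the embargo starts on the publication date.
    """
    if record.embargo_months is None:
        return True
    embargo_end = add_months(record.published, record.embargo_months)
    return today >= embargo_end


record = HarvestedRecord("Example article", date(2014, 1, 15), embargo_months=12)
print(should_prompt(record, date(2014, 6, 1)))   # embargo still running
print(should_prompt(record, date(2015, 1, 15)))  # embargo has ended
```

The same check could be extended to waivers or to campus-specific policy dates, but those rules would need to come from the policies themselves rather than from this sketch.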
Requirements that will be addressed in future releases
|Entire UC System Implementation||All campuses will have access to the system.|
|Integration of other CRIS functionality||Harvesting systems are part of larger Current Research Information Systems that support the entire range of a scholar's activity. Future requirements will include functionality related to grants management, graduate student support, etc.|
|Expansion to other UC researchers||Graduate students and staff could also be included in the CRIS functionality.|
The released RFP, as well as planning schematics, can be accessed on the Automated Harvesting System Planning Documents page.
Outreach & Community Engagement
The scope of work for this project has been shaped directly by requirements we have gathered (and continue to gather) from UC faculty, OA project stakeholders and colleagues at other institutions.
For information about past and present opportunities to connect with this project, visit the Community Engagement page. We also invite you to add your voice to the project's discussion forum or contact the implementation team directly.