This is an 11-month project, running from February 2007 to January 2008 [10].
There are two components to the proposal: development and evaluation.
The rationale for this dual approach is that, in the spirit of the Tools and Innovation thread's investigation of new approaches and models, we are proposing a new paradigm for tag use. That is, semantic tagging introduces new uses for the tag concept, such as domain exploration and search by tag. While we have a clear development plan for the architecture and associated services for an initial deployment, as described in the architecture section below, we will be looking particularly at models for evolving the level of Semantic Web technology take up in Digital Repositories. We are starting with very lightweight semantics: developing taxonomies folksonomically [11], as it were, to be associated with tags. We will investigate the ways in which it may be possible and/or appropriate to evolve these taxonomies towards more formal ontologies. In sum, we will use this project to investigate the most appropriate logic mechanisms to support these services.
It will also be critical to investigate take up of semantic tagging, in order to understand how a semantic tagging model might be optimised for large-scale deployment. To that end, our second project strand is a series of iterative evaluations to be carried out throughout the project cycle.

WP1 Service Development (11 months)

Work Package 1 focuses on the development of the Semantic Tagging Web Service and the integration of the service with the EPrints Open Archive Interface. Each of the deliverables for WP1, referenced in the architecture section above, is described in more detail in Appendix B.
This work package involves the creation of the Semantic Tagging Web Service to provide the following functionality, and needs to be completed before WP1.2 or WP1.3 are started.
The deliverable for WP1.1 is a fully functional Web Service API providing both HTTP and SOAP interfaces as well as an exemplar central Semantic Tagging host which will be the repository for the tags and their related information. For the project, this host will be maintained at USouthampton.
Work package 1.3 will involve the creation of a direct web interface to the semantic tagging web service. The web interface will provide easy access to all of the functions the tagging service defines and will also make all semantic tags at the Host accessible to services which make calls to it via the Semantic Tagging API and Widgets.
The deliverable for this work package will be a fully functioning Web Service site that interacts with the tagging service and the central semantic tag host to allow creation, editing, annotation and linking of tags, and exploration of the tag data stored by the service. It is important to note that any third-party service that uses the Semantic Tagging API/Widgets will have access to all semantic tags and their associated resources via this central Web Service. It is this access to shared tagging that enables rapid discovery, across repositories, of information that can be inferred to be related to a given semantic tag. In other words, no third-party service such as Intute need reinvent the tags of another service such as EPrints at ECS Southampton.
There are two parts to our approach for investigating the usability and usefulness of semantic tagging in digital repositories: formal studies and a field study. The first of these, in WP 2.1-2.3, are formal design studies to develop best-practice design metrics that support the specific attributes of semantic tags, in order to facilitate creation, revision and exploration of tags and their associated artefacts. The results of these studies will inform the deployed design of the tag interfaces. Once the tag mechanisms are deployed, the study in WP 2.4 is designed to look at both the usability and the usefulness of the semantic tags over time. This combination of metrics will give us a richer picture, both for refining our services for the e-Framework and for making recommendations towards effective semantic tagging models for digital repositories in general.
Using interactive mock-ups, a study will be done to discover the best approach for representing and browsing tag associations. Candidate methods include tree diagrams and multi-column viewers. While multi-column viewers are often preferred for accessing large multi-layered structures, they may be overkill for the task of representing a tag and its associations. Multi-column viewers may be better for exploring through tags and documents, and tree diagrams may be best used for tag overviews. Understanding how best to represent semantic tag information is the core rationale for this study. Also, we have identified three areas for particular attention in terms of building a model for semantic tag creation and use: semantic depth, tag creation and temporal tag changes. We will be investigating the best approaches to address each of these tag attributes.
Semantic Depth. One concern for users is that of chain length when creating and browsing tag hierarchies. For example, "mSpace" may be subsumed under the tag "Semantic Web", but this in turn may be associated with "Grid Technology". However, "mSpace" is not directly concerned with Grid Technology, and so the user may want to exclude tags at such depths in a tag hierarchy. The extent of this problem is not known, as we do not yet know the advantages of exposing such depth to the user or of constraining search results by third-level terms ("Grid Technology" in this case), for example.
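The chain-length concern can be illustrated with a small sketch. It is not part of the proposed architecture: the `BROADER` mapping and the `broader_tags` function are hypothetical names, and the data simply restates the mSpace example. The sketch walks tag associations breadth-first and stops at a chosen depth, so that distant terms can be excluded or exposed as the interface requires.

```python
from collections import deque

# Hypothetical association data restating the example in the text:
# each tag maps to the tags it is filed under.
BROADER = {
    "mSpace": ["Semantic Web"],
    "Semantic Web": ["Grid Technology"],
    "Grid Technology": [],
}


def broader_tags(tag, max_depth, broader=BROADER):
    """Return {associated tag: depth} for tags within max_depth steps of tag."""
    depths = {}
    seen = {tag}
    queue = deque([(tag, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow associations beyond the cut-off
        for parent in broader.get(current, []):
            if parent not in seen:
                seen.add(parent)
                depths[parent] = depth + 1
                queue.append((parent, depth + 1))
    return depths


print(broader_tags("mSpace", max_depth=1))
print(broader_tags("mSpace", max_depth=2))
```

With a cut-off of one step, "Grid Technology" is left out for "mSpace"; with two steps it is included. Which cut-off serves users best is exactly what the study is intended to find out.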
Tag Creation. Another design challenge is found during the editing of new or existing tags. When the user is tagging a document, they can select from an auto-complete, which tries to expose various tags that could have the same label but have different connections. When creating a tag, the user may have to choose between tags in a similar way, in order to associate them to the new tag. This step requires the inclusion of tag lookup within tag creation and will be a key UI challenge during the research.
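As a rough illustration of the lookup step, the sketch below shows one way an auto-complete might distinguish tags that share a label by showing their connections. The `Tag` class, the `suggest` function and the sample tags are all hypothetical and chosen only for illustration; the actual interface design is a subject of the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    tag_id: str              # hypothetical unique identifier
    label: str               # what the user types and sees
    associations: tuple = () # labels of connected tags


# Illustrative data only: two distinct tags share the label "Java".
TAGS = [
    Tag("t1", "Java", ("Programming Languages",)),
    Tag("t2", "Java", ("Indonesia",)),
    Tag("t3", "JavaScript", ("Programming Languages",)),
    Tag("t4", "mSpace", ("Semantic Web",)),
]


def suggest(prefix, tags=TAGS):
    """List tags whose label starts with prefix, with connections shown."""
    wanted = prefix.casefold()
    matches = [t for t in tags if t.label.casefold().startswith(wanted)]
    matches.sort(key=lambda t: (t.label.casefold(), t.tag_id))
    return [f"{t.label} ({', '.join(t.associations)})" for t in matches]


for line in suggest("java"):
    print(line)
```

Showing the connections beside each label lets the user pick the intended tag without leaving the tagging action.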
Temporal Tag Changes. Finally, this project supports the exploration of tag changes over time. Some projects, for example Cloudalicious [12], have looked at the changes of tag clouds over time. Over time, the meaning of tags may vary. The tag representing the "Semantic Web" may evolve as new projects and technologies are developed. The changes made to tags, however, may affect their existing associations with documents positively or negatively. New technologies may be added to the concept of mSpace that were not used by mSpace at the time of publication, causing miscommunication over the content of the publication. Alternatively, the publication may benefit from automatic updates; new publicly recognised demonstrators could be used to recognise all publications by mSpace.
One aim of the project is to enhance the benefits of tagging, through using Semantic Technologies, without adding significantly greater cost to the user. Users currently tag by adding words freely to documents; these tags have no explicit relationship to other tags given to the same document. We want to maintain this ease in the tagging action, but support the connectivity of tags by making the relationships explicit. The study, therefore, is to evaluate any added cost of tag creation against the current tagging capabilities of EPrints, through Connotea [13]. The study would compare the time taken to tag documents using both new and existing tags. We expect that more time will be required to set up new tags.
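One way to reason about this trade-off is to keep a dated record of tag associations, so that a tag can be read either as it stood when a document was published or as it stands now. The following sketch assumes such a record; the `HISTORY` list, its dates, the "Faceted Browsing" association and the `associations_at` function are invented for illustration and do not describe the planned service.

```python
from datetime import date

# Hypothetical change log: (date added, tag, associated tag).
# Dates and associations are invented for illustration.
HISTORY = [
    (date(2005, 3, 1), "mSpace", "Semantic Web"),
    (date(2007, 6, 1), "mSpace", "Faceted Browsing"),
]


def associations_at(tag, when, history=HISTORY):
    """Return the associations a tag had on the given date, sorted by label."""
    return sorted(
        assoc for added, t, assoc in history if t == tag and added <= when
    )


# The tag as a reader of a 2006 publication would have understood it ...
print(associations_at("mSpace", date(2006, 1, 1)))
# ... and the same tag after later changes.
print(associations_at("mSpace", date(2008, 1, 1)))
```

Comparing the two views shows which associations were added after publication, which is the information needed to decide whether an update helps or misrepresents an older document.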
This study would assess the new benefits of semantic tagging. There are two elements to this second study: first, to show that the ease of current standards for tag-based document retrieval is maintained, and second, to expose the new tag-based retrieval methods afforded by the subsumption hierarchy. The study would compare the time taken to retrieve documents using, for example, first-level tags (those used to directly tag a document) and second-level tags (tags associated with first-level tags).
We will be doing prototype deployments of the full semantic tagging architecture by the mid-point of the project. A combination of longitudinal field studies of the system in use, interviews and analysis of system logs will allow us to look at how semantic tags are used in practice. This study will help us to refine models to support practice.
The deliverable of WP2.1 is a set of recommendations for the use of implementation resources, saving the time and money that would otherwise be spent developing throw-away software. WP2.2 and WP2.3 help us to evaluate the impact of semantic tags on the tagging communities, providing guidelines for both future development and deployment. WP2.4 will have implications for future research in this area, identifying further questions to be addressed.
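The distinction between first-level and second-level retrieval can be sketched as follows. The document identifiers, the `DOC_TAGS` and `TAG_ASSOCIATIONS` tables and both function names are hypothetical; the sketch only shows the two lookup paths that the study would compare.

```python
# Hypothetical data: tags applied directly to documents (first level) ...
DOC_TAGS = {
    "paper-a": {"mSpace"},
    "paper-b": {"Semantic Web"},
    "paper-c": {"EPrints"},
}

# ... and tags associated with those first-level tags (second level).
TAG_ASSOCIATIONS = {
    "mSpace": {"Semantic Web"},
}


def first_level(tag):
    """Documents tagged directly with the given tag."""
    return sorted(doc for doc, tags in DOC_TAGS.items() if tag in tags)


def second_level(tag):
    """Documents reached through a first-level tag associated with the given tag."""
    via = {t for t, assoc in TAG_ASSOCIATIONS.items() if tag in assoc}
    return sorted(doc for doc, tags in DOC_TAGS.items() if tags & via)


print(first_level("Semantic Web"))
print(second_level("Semantic Web"))
```

Here a search on "Semantic Web" finds paper-b directly and paper-a only through the association of "mSpace" with "Semantic Web"; plain free-text tagging would offer the first path but not the second.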
For Semantic Tagging to be successful as a technology and sustainable as a service, we recognize the importance of working with JISC partners throughout the lifecycle of the project to promote awareness of the resource, and to gather early participation and feedback in the development of the technology.

WP 3.1

To enable the above interaction, we will take advice from JISC to identify appropriate contacts and key stakeholders, via the Repositories Support Project and the proposed Intute Repository Search Service in particular. The goal of this work package is to develop an ongoing dialogue throughout the project, so that testing and take up by the Services and/or projects is as lightweight and sustainable as possible.
To help consolidate feedback for development and take up strategies (as distinct from the usability assessment in WP2), we propose to hold a post-WP2 workshop at month 9 of the project with the stakeholders and the community at large. The workshop will gather feedback on the services and help develop recommendations for a next iteration of a semantic tagging model. It will enable the Repositories Support Project and Intute to hear from users about the perceived benefit of the global service and to assess the benefit of uptake; it will also enable us, as a community of interest, to bring together approaches towards recommendations for sustainable deployment of Semantic Tags as a service.
The core outcomes of the engagement/workshops will be effective engagement with the key stakeholders throughout the project lifecycle. The core deliverable is that with this feedback we will have a strong plan for both post-project (a) takeup and (b) sustainability.
Semantic Tagging for Cross Repository Exploration is a new paradigm for knowledge building with repositories, providing new functions for research support such as domain exploration and automated background search. This project is pushing into new territory. While our proposal is technically highly feasible, and we have confidence in our usability approach, we know that these will be First Use deployments that will, if as successful as we imagine, open opportunities for general use and extensibility, beyond digital repositories, linking into other knowledge domains such as VREs. The overall outcome of each of the three main components to this project will give us an additional source of analysis for achieving the goals of the Tools and Innovation call: to understand workflow and common practices in terms of how new tools can better support these within the Digital Repositories context, and in particular to gather data from initial, functioning tools deployment and evaluation of same towards new approaches, models for practice and evaluative reports.
The technical outcomes associated with the Semantic Tag work will be (1) a semantic tagging web service, (2) an API through which any developer can connect to this service (similar to the Google Maps approach), and (3) a sample set of widgets, such as those illustrated in the proposal, which can be used or adapted by services and projects for their own use. We are deploying these tools as Open Source in accordance with JISC policy. For the duration of the project, the central semantic tagging web service will be hosted at ECS, USouthampton. We plan to work with JISC throughout the project to find appropriate outlets for the post-project maintenance and ongoing deployment of the Semantic Tag central service and associated code base. For example, Intute or the proposed Intute Repository Search Service may wish to host the semantic tag service while we work with the JISC eFramework to archive the code artefacts for access by the JISC community.