Rich Tags

Operationalising Semantic Tags as Repository Functionality

Our approach to semantic tagging is based on our experience that a little bit of semantics goes a long way. To that end, semantic tagging does not require previously existing ontologies to work. The act of tagging via association itself creates the meaning of the tag in both human- and machine-tractable terms. In effect, the act of semantic tagging builds lightweight ontologies. For instance, imagine someone is the first to create the tag Open Access. They use the s-tagging widget to tag a paper with this concept; the widget, seeing no other terms matching Open Access, requests that the person add some meanings to the tag, which might be listed as Policy, Software for, and Related to Open Source. The next time someone wishes to tag a document with Open Access, they can choose which of those meanings is most apt for what they propose, add a new meaning, or refine an existing meaning. Or, if someone wishes to pull together material related to Open Access, they can gather all material either tagged directly with Open Access or inferred to be connected through associated meanings, such as the philosophy of open source software. All this extra benefit comes from just a dash of shared meaning.
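The Open Access example can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names Tag, TagStore, apply and gather are hypothetical, and the inference rule (two tags are related when they share at least one meaning) is a deliberately crude stand-in for whatever inference the service would actually perform.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Tag:
    """A tag whose meanings are supplied by the people who use it."""
    name: str
    meanings: Set[str] = field(default_factory=set)


@dataclass
class TagStore:
    """Hypothetical in-memory store; not the service's real data model."""
    tags: Dict[str, Tag] = field(default_factory=dict)
    # artefact id -> names of the tags applied directly to it
    tagged: Dict[str, Set[str]] = field(default_factory=dict)

    def apply(self, artefact: str, name: str, meanings: Iterable[str] = ()) -> Tag:
        """Tag an artefact, creating the tag or adding meanings as needed."""
        tag = self.tags.setdefault(name, Tag(name))
        tag.meanings |= set(meanings)
        self.tagged.setdefault(artefact, set()).add(name)
        return tag

    def gather(self, name: str) -> Set[str]:
        """Artefacts tagged directly with `name`, plus artefacts carrying a
        tag that shares at least one meaning with it (assumed rule)."""
        target = self.tags[name]
        related = {
            t.name for t in self.tags.values()
            if t.name == name or t.meanings & target.meanings
        }
        return {a for a, names in self.tagged.items() if names & related}


store = TagStore()
store.apply("paper-1", "Open Access", {"Policy", "Related to Open Source"})
store.apply("paper-2", "Open Source Software", {"Related to Open Source"})
store.apply("paper-3", "Metadata", {"Cataloguing"})
print(sorted(store.gather("Open Access")))
```

Here paper-2 is gathered even though nobody tagged it Open Access, because its tag shares the meaning Related to Open Source; paper-3 shares no meaning and is left out.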

The architecture described below is, in effect, a test of our hypothesis that making it easy to add lightweight semantics to tags can improve their usefulness in the ways described above.

Approach: Architecture and evaluation

The architecture of the tag service needs to allow a 3rd party application, such as EPrints, to access the core functionality for creating and using meaningful semantic tags, while remaining behind the scenes so that users need not realise the service is being used. Tagging itself is a commonly used element in Web 2.0 applications; Web 2.0 is described in Tim O'Reilly's article "What is Web 2.0" [9]. Web 2.0 applications provide functions as services and avoid the need to refresh pages each time information changes. For this reason, the proposed architecture of the tagging service is directed towards use within Web 2.0 applications.

Figure 3 - Two-Layer Tag Service Communication

There are two distinct layers in the architecture: the tagging service itself and the 3rd party applications that use the service. The two layers need to communicate with each other to expose the tagging service functionality. The EPrints interface, a 3rd party application that will use the semantic tagging service, is web based and will need extra functionality integrated into its interface to communicate with the tagging service. We will prototype adding this functionality to the EPrints interface via pluggable JavaScript page components, which can be added to any web page (by importing the JavaScript component and creating the component on the page) to enhance the interface with the tagging service functionality. Each JavaScript component will provide a user interface enhancement that communicates directly with the tagging service functions using AJAX (Asynchronous JavaScript and XML), a commonly used technology in Web 2.0 applications that can communicate with Web Service APIs behind the scenes.

The architecture of the system uses two common Web Service communication methods: SOAP (Simple Object Access Protocol) and HTTP. The JavaScript widgets will communicate directly with the semantic tag service using HTTP. Each tagging service function will require certain parameters and will provide a return response. A SOAP interface will be created for future extension of the system by other 3rd party applications. 3rd party applications use the tag service to look up or create a tag's Uniform Resource Identifier (URI), which is stored locally within the 3rd party application to associate tags with application artefacts. Using the URI to identify tags within both layers creates semantic links between the tags and artefacts.
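The division of labour between the two layers can be sketched as follows. This is a minimal illustration, not the service's actual interface: the endpoint URL, the function name lookup_or_create, the query parameter names and the LocalTagIndex class are all assumptions, and the HTTP call is replaced by a stand-in function so that the sketch runs without a network.

```python
from typing import Callable, Dict, Set
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint; the source does not name one.
SERVICE_URL = "http://tags.example.org/service"


def build_request(function: str, **params: str) -> str:
    """Build the HTTP GET URL for one tagging service function."""
    query = urlencode({"function": function, **params})
    return f"{SERVICE_URL}?{query}"


class LocalTagIndex:
    """What a 3rd party application keeps locally: artefact id -> tag URIs."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # sends a request URL, returns the tag's URI
        self._uris: Dict[str, Set[str]] = {}

    def tag(self, artefact_id: str, tag_name: str) -> str:
        """Ask the service for the tag's URI and store only that URI."""
        uri = self._fetch(build_request("lookup_or_create", name=tag_name))
        self._uris.setdefault(artefact_id, set()).add(uri)
        return uri

    def artefacts_with(self, uri: str) -> Set[str]:
        """Local artefacts associated with the given tag URI."""
        return {a for a, uris in self._uris.items() if uri in uris}


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP call: derives a stable URI from the query."""
    name = parse_qs(urlparse(url).query)["name"][0]
    return "http://tags.example.org/tag/" + name.lower().replace(" ", "-")


index = LocalTagIndex(fake_fetch)
uri = index.tag("eprint-42", "Open Access")
index.tag("eprint-57", "Open Access")
print(uri, sorted(index.artefacts_with(uri)))
```

Because the application stores only the URI, two artefacts tagged with the same name resolve to the same identifier, which is what lets the link be followed from either layer.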

The architecture diagram, Figure 3, shows the four main function groups that the API will provide to support the above services:

Tag Creation/Editing. The tag creation/editing function will be used to create and edit all tag details, such as descriptions or tag links. The return value for this function will always be the tag's URI.

Tag Lookup. Tag lookup will either take a single keyword or sentence and return a list of matching tag details, or take a whole paragraph and return a list of tag matches based on parsing the paragraph. This function can also be passed a list of URIs to resolve into their descriptive tag names.

Tag Trackback. The tag trackback function will take the details of each tag that is applied to an artefact, to keep a record of how tags are used. This record is used by the trackback lookup function for advanced features, such as exploring artefacts that have been tagged across archives or 3rd party applications.

Tag Trackback Lookup. The tag trackback lookup function will return the details of how a tag has been used, based on the constraints passed (such as tag, usage type, location or time).
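The four function groups can be sketched as a single in-memory class to show their inputs and return values. Everything named here is an assumption made for illustration: the class and method names, the URI prefix, the fields of a trackback record, and the substring matching that stands in for real keyword and paragraph parsing. Time constraints are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trackback:
    """One record of a tag being applied to an artefact (assumed fields)."""
    tag_uri: str
    artefact: str
    location: str    # e.g. the archive or application that applied the tag
    usage_type: str


class TagService:
    BASE = "http://tags.example.org/tag/"  # hypothetical URI prefix

    def __init__(self) -> None:
        self._tags: Dict[str, Dict[str, str]] = {}  # URI -> tag details
        self._trackbacks: List[Trackback] = []

    def create_or_edit(self, name: str, description: str = "") -> str:
        """Create a tag or update its details; always returns the tag's URI."""
        uri = self.BASE + name.lower().replace(" ", "-")
        self._tags[uri] = {"name": name, "description": description}
        return uri

    def lookup(self, text: str) -> List[str]:
        """URIs of tags matching a keyword, sentence or paragraph.
        Crude rule: the text occurs in the tag name, or the name in the text."""
        lowered = text.lower()
        return sorted(
            uri for uri, details in self._tags.items()
            if lowered in details["name"].lower()
            or details["name"].lower() in lowered
        )

    def resolve(self, uris: List[str]) -> List[str]:
        """Resolve a list of URIs into descriptive tag names."""
        return [self._tags[uri]["name"] for uri in uris]

    def trackback(self, record: Trackback) -> None:
        """Record that a tag was applied to an artefact."""
        self._trackbacks.append(record)

    def trackback_lookup(self, **constraints: str) -> List[Trackback]:
        """Records whose fields equal every constraint passed."""
        return [
            tb for tb in self._trackbacks
            if all(getattr(tb, key) == value for key, value in constraints.items())
        ]


service = TagService()
oa = service.create_or_edit("Open Access", "Free online access to research")
service.trackback(Trackback(oa, "eprint-42", "archive-a", "keyword"))
service.trackback(Trackback(oa, "eprint-7", "archive-b", "keyword"))
print(service.lookup("A paper on open access policy in repositories."))
print([tb.artefact for tb in service.trackback_lookup(tag_uri=oa, location="archive-b")])
```

Keeping trackbacks as separate records, rather than as a field on the tag, is what would allow usage to be queried across archives without touching the tag details themselves.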