Semantic publishing

From Wikipedia, the free encyclopedia

Semantic publishing on the Web, or semantic web publishing, refers to publishing information as data objects in a semantic web language or as documents with explicit semantic markup. It is intended to let computers process the structure, and to some extent the meaning, of the published information, making information search and data integration more efficient.

Although semantic publishing is not specific to the Web, it has been driven by the rise of the semantic web, a web of data. For the semantic web to work and realize its potential, information must be published on the web in a semantic format. Thus, as the semantic web is further developed and adopted, semantic publishing is expected to become a main form of web publishing.

Semantic publishing is expected to change the face of web publishing. When this will happen depends on when killer applications emerge. Current technologies can already build web sites whose entire content is available in both HTML and a semantic format; examples are mindswap, UMBC ebiquity, and the web2express.org open lab. However, semantic web sites are not yet common. An early version of the news feed format, RSS 1.0, is based on RDF (a semantic web standard), although it has become less popular than RSS 2.0 and Atom. A newer effort from web2express.org tries to apply the RDF standard more broadly to various data feeds: its free online service, Ufeed, lets anyone create and provide RDF data resources and data feeds for products, news, events, jobs and studies.
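
Because an RSS 1.0 feed is an RDF/XML document, each item in it can be read as a set of statements about a resource. The following is a minimal Python sketch of that idea, using only the standard library. The feed content and the example.org URLs are made up for illustration, and the function name item_statements is hypothetical; only the RDF and RSS 1.0 namespace URIs are the real ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"

# A made-up RSS 1.0 feed; the example.org URLs are placeholders.
FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/news.rdf">
    <title>Example news</title>
    <link>http://example.org/</link>
    <description>Placeholder feed</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/news/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/news/1">
    <title>First story</title>
    <link>http://example.org/news/1</link>
  </item>
</rdf:RDF>
"""


def item_statements(feed_xml):
    """Return (subject, predicate, literal) tuples for each RSS 1.0 item."""
    root = ET.fromstring(feed_xml)
    statements = []
    for item in root.findall(f"{{{RSS_NS}}}item"):
        # rdf:about names the resource that the item describes.
        subject = item.get(f"{{{RDF_NS}}}about")
        for child in item:
            # ElementTree tags look like "{namespace}local"; joining the two
            # parts gives the predicate URI.
            namespace, local = child.tag[1:].split("}")
            value = (child.text or "").strip()
            statements.append((subject, namespace + local, value))
    return statements


for statement in item_statements(FEED):
    print(statement)
```

This sketch reads only the direct child elements of each item and treats their text as literal values; a full RDF parser would also handle the channel, the rdf:Seq list, and typed or nested values.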

Semantic publishing is also expected to transform scientific publishing. Tim Berners-Lee and James Hendler predicted in 2001 that the semantic web “will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine” [4]. Revisiting the semantic web in 2006, Berners-Lee and his colleagues believed the semantic web “could bring about a revolution in how, for example, scientific content is managed throughout its life cycle” [5]. One simple idea that may radically change scientific communication is for researchers to self-publish their experimental data directly on the web in a semantic format. In one scenario, a scientist could design and run an experiment and share the experiment information with the world in real time by publishing the data as a semantic object on the web. Semantic search engines would then put these data at everyone’s fingertips. The W3C interest group on health care and life sciences is exploring this idea of self-published experiments, for which a demo is available.

Two different approaches to semantic publishing

  • Publish information as data objects using semantic web languages such as RDF and OWL. An ontology is usually developed for a specific information domain and then used to formally represent the data in that domain. Semantic publishing of more general information, such as product information, news, and job openings, uses a so-called shallow ontology, as exemplified by the free Ufeed online tool. The W3C SWEO Linking Open Data Project maintains a list of data sources that follow this approach, as well as a list of semantic publishing tools.
  • Embed formal metadata in documents using new markup languages like RDFa and Microformats.
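
The two approaches can be contrasted with a small Python sketch that publishes the same record both ways: once as standalone RDF statements in N-Triples syntax, and once as an HTML fragment carrying RDFa attributes. The vocabulary URI, the product URI, the record itself, and the function names are all hypothetical and chosen only for illustration; this is not how any particular tool mentioned here works.

```python
from html import escape

# Hypothetical vocabulary and resource URIs, used only for illustration.
VOCAB = "http://example.org/vocab#"
PRODUCT_URI = "http://example.org/products/42"

product = {"name": "Espresso machine", "price": "199.00"}


def _ntriples_literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes, quotes and line breaks."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(subject: str, properties: dict[str, str]) -> str:
    """Approach 1: publish the record as a standalone RDF data object."""
    lines = [
        f"<{subject}> <{VOCAB}{key}> {_ntriples_literal(value)} ."
        for key, value in properties.items()
    ]
    return "\n".join(lines)


def to_rdfa(subject: str, properties: dict[str, str]) -> str:
    """Approach 2: embed the same statements in HTML using RDFa attributes."""
    spans = [
        f'  <span property="{escape(key)}">{escape(value)}</span>'
        for key, value in properties.items()
    ]
    return (
        f'<div vocab="{escape(VOCAB)}" resource="{escape(subject)}">\n'
        + "\n".join(spans)
        + "\n</div>"
    )


print(to_ntriples(PRODUCT_URI, product))
print(to_rdfa(PRODUCT_URI, product))
```

In the first form the data stands alone and can be fetched and merged by software without any page layout around it. In the second form the statements travel inside a human-readable page, so a single document serves both readers and RDFa-aware parsers.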

Examples of ontologies and vocabularies for publishing

Examples of free or open source tools and services

  • Semantic MediaWiki software: A single solution for semantic annotation that fits the needs of most Wikimedia projects and still meets the Wiki-specific requirements of usability and performance.
  • Swoogle: A search engine for ontologies and instance data on the Web.
  • Ufeed: A free online tool for publishing data resources and data feeds in RDF, including product information, news, events, jobs and studies.
  • BigBlogZoo: Regularly crawls 60,000 XML sources and re-aggregates the articles under a semantic URL. Articles are categorized using the DMOZ RDF classification schema.

See also

References

  1. W3C: W3C is developing semantic web infrastructures and standards through its many semantic web activities.
  2. Resource Description Framework (RDF): a language for representing information about resources in the World Wide Web.
  3. Web Ontology Language (OWL): OWL facilitates greater machine interoperability of Web content.
  4. Scientific publishing on the ‘semantic web’, by Tim Berners-Lee and James Hendler, Nature 410, pp. 1023–1024, 26 Apr 2001.
  5. The Semantic Web Revisited, by Nigel Shadbolt, Tim Berners-Lee and Wendy Hall, IEEE Intelligent Systems 21(3), pp. 96–101, May/June 2006.