Tim Wintle's Blog
Tim works at Team Rubber, where he uses Python, large computers, and some clever maths to look at the web in new ways. In his free time he codes various other bits of software, and web apps.
Sun, 08 Apr 2007
Why tagging can take us further away from the semantic web?
There is a lot of talk these days about the "Semantic Web", with some purists suggesting that we tag all data. I believe that, while tagging may be widespread now due to its relative ease of implementation, it is likely to hurt the long-term aims of the semantic web.
Firstly, let us define semantics. Here is the Wikipedia definition:
Semantics ... refers to the aspects of meaning that are expressed in a language, code, or other form of representation. Semantics is contrasted with two other aspects of meaningful expression, namely, syntax, the construction of complex signs from simpler signs, and pragmatics, the practical use of signs by agents or communities of interpretation in particular circumstances and contexts. ...semantics may also denote the theoretical study of meaning in systems of signs.
So, the idea is that every item on the web can be uniquely categorised by some series of symbols, which occur within an alphabet. In everyday linguistics, we would take the symbols to be words, and the alphabet to be all the words in the dictionary. Notice that this is separate from the order of the words and punctuation.
Regarding natural language, there are two pairs of possibilities:
- Language is fully capable of describing the entire concept of a document
- Language can only describe a subset of concepts
- The semantics of natural language (i.e. words used) are fully capable of describing the entire concept of a document
- Language can only describe all concepts when it includes the syntax and pragmatics
Now for some comments on tagging:
- Tags tend to be taken from a smaller alphabet than the words used in articles / web pages / full transcripts (in the case of video/audio). Basically, in the full text an author will probably have used more than one synonym, whereas when selecting tags people are more likely to choose the most commonly used word.
- Tagging removes punctuation. Technically this does not remove any of the semantics used in the text, but it is perfectly possible to create semantics describing a page that relate to its grammar and linguistics. Tagging misses this chance to effectively increase the alphabet size.
- Tagging only uses one occurrence of each tag - this removes the ability to make use of the density of a word. Imagine you are putting up some new shelves. You measure your wall to see how long you want them, but your tape measure only has two marks, 0 and 1. Your wall is nearer 1, so you go to Ikea to get your shelves (which are also marked 0 or 1), and just have to hope that they fit.
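The loss of word density in the last point can be made concrete with a small sketch. This is only an illustration under my own assumptions: the two page texts and the helper names `word_counts` and `as_tags` are hypothetical, and real tags are chosen by people rather than derived from the text like this.

```python
from collections import Counter
import re

def word_counts(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def as_tags(text):
    """Reduce the text to a tag-like set: each word is either present or absent."""
    return set(word_counts(text))

# Two hypothetical pages: one mostly about shelves, one mostly about walls.
page_a = "shelves shelves shelves wall"
page_b = "shelves wall wall wall"

# As tag sets the two pages look identical...
same_tags = as_tags(page_a) == as_tags(page_b)

# ...but the word counts still tell them apart.
shelves_in_a = word_counts(page_a)["shelves"]
shelves_in_b = word_counts(page_b)["shelves"]
```

Here the tag sets are equal, while the counts for "shelves" differ between the two pages. That difference is the tape measure with more than two marks.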
"But how can this harm the semantic web?" I hear you ask. Well, the more that tagging gets used, the more we change the distribution of these words in our overall semantics, making it harder for people to fairly extract semantic data in the future.
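One way to read this distribution shift, sketched under my own assumptions: if a future algorithm cannot tell tags apart from the body text, the tags get counted as ordinary words. The word lists and the helper `relative_frequencies` below are hypothetical.

```python
from collections import Counter

def relative_frequencies(words):
    """Return each word's share of the total word count."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Hypothetical page body, where the author used several near-synonyms.
body = ["shelf", "shelving", "shelf", "wall", "bracket", "rack"]
# Hypothetical tags: only the most common words were picked.
tags = ["shelf", "wall"]

before = relative_frequencies(body)         # body text only
after = relative_frequencies(body + tags)   # tags mixed in with the text
```

In this sketch the share of "shelf" rises once the tags are mixed in, and the share of the rarer "shelving" falls, even though the author wrote neither word any more or less often.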
In conclusion, if you are designing a site with tagging, that is all very well for usability, and for the semantic web at the stage we are at. All this tagging may, however, have a detrimental effect on the growth of the true semantic web, so please try to keep tags separate from the content, and make it clear that they are tags, as this will make things much easier for future algorithms and for the evolution of the web.