Every now and then I see posts like this one where someone says they are going to "Tag" the post with a tag and that this will be useful in some way. I never understand it. I don't think it works, and here is why.
Users make up the Tags
Users are stupid.
I include myself among those users. In the example at that page, Fred Wilson is saying that he will tag all posts about stocks with the tag "Stocks". This seems simple, but here is the problem. Someone else might be more granular than Fred and tag things as "biotech stocks" or "industrial stock", or they might use the singular and say "stock". So when I go to a site or run a search, I have absolutely no idea what tag people used. I fail, I get negative reinforcement, and I never go back to it.
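To make the failure concrete, here is a small sketch in Python. The post ids are made up for illustration; the tags are the ones from the example above. It is not how any real tagging site works, just the simplest version of "search by the tag I guessed".

```python
# Hypothetical posts and the free-form tags their authors chose.
posts = {
    "fred-post": ["Stocks"],
    "post-b": ["biotech stocks"],
    "post-c": ["industrial stock"],
    "post-d": ["stock"],
}


def find_by_tag(posts, query):
    """Return the ids of posts carrying exactly the queried tag."""
    return sorted(pid for pid, tags in posts.items() if query in tags)


def find_by_loose_match(posts, query):
    """Crude workaround: case-insensitive substring match on a naive singular stem."""
    stem = query.lower().rstrip("s")
    return sorted(
        pid for pid, tags in posts.items()
        if any(stem in tag.lower() for tag in tags)
    )


exact = find_by_tag(posts, "Stocks")
loose = find_by_loose_match(posts, "Stocks")
print(exact)  # only the post whose author guessed the same tag I did
print(loose)  # all four posts, but only because the matching got fuzzy
```

The exact search finds one post out of four. The loose version finds them all, but it only works here because every variant happens to contain "stock"; it does nothing for someone who tagged the same post "equities".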
You can see this on Flickr, where you can't just look at one stream and get every picture of something. You have to look at a couple of tags. If you're lucky you will see a post that also carries another tag, which allows you to wander on (a Rosetta Stone, if you will).
Why does this matter to me so much? Transgenic mice and my thesis. AND the notion of this Semantic Web thing that keeps getting batted about. You can see Scoble talk about it too sometimes.
MeSH headings are a bit related to tags, in that you can search Medline for things that fall under a MeSH heading. They seem to be more hidden these days than they used to be, but they are still there, lurking uselessly in the background. They are assigned by editors to put papers in categories, so that one could browse down a hierarchy and end up with only papers talking about, say, "Transgenic mice".
Now the problem. Papers are assigned to the hierarchy by people. They read and make judgements about where things should go, and the papers are assigned.
When transgenics first came out, people used all kinds of words for them (and the free text search engines were slow anyway). Then, to cap it off, the editors at the National Library of Medicine decided to put them in at least 5 different MeSH headings. So if you were doing transgenic work close to the time that transgenics were a new thing (that would date me a bit...), you had a problem finding all of the literature on the subject. Eventually, after a couple of years, they got it together and solved this problem, only to have it again when knockout mice came on the scene. By this time I was only doing full text searches anyway, so I didn't care as much. BUT the point is: tagging, when done by humans, is useless.
Humans have opinions about things (for the most part) and not all of them are enlightened enough to just fully agree with me. This means that when you label something, I may not agree with the label you put on it.
A further example of this came up a few years ago, when I was working with some people at GSK who were putting in a system to manually categorize every bit of paper they had at their site, put it in a computer, and then have this mass repository of stuff that would somehow magically produce drugs faster. They were just starting to run into the problem that if you gave the same bit of paper to a couple of people, they wouldn't all categorize it the same way. There would be subtle differences that would cause them to read it just a bit differently, and thus they would file it differently.
A similar problem creeps in when you try to file paper by company name (say, all of your legal files). How do you deal with University of Southern California vs. University of California at Los Angeles, plus all the other "University of..." names? You can use the full name of every university, spend a lot of time typing, and have very long file names. Or you can always shorten to U.S.C. or USC or U SC (but then what do you do with South Carolina?). Given this problem, each person will pick a system that to them makes all the sense in the world. Others will look at that system and be like "what were they thinking", and then they will make statements like the one I made at the beginning: "users are stupid". Outsiders would look at my filing system and (I think) would understand it. BUT they would be unlikely to have duplicated it without peeking, as the shortcuts I use are based on my background and my perception of the world. I have just had to teach this to the woman I hired, as we have to share the filing system back and forth, and she admits it makes sense. I have clear conventions etc., but she also said it was totally different from what she had done at her previous job. They had different conventions. Neither of us is "right", but it just goes to show that from the computer's point of view, users are stupid.
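The USC problem is easy to show in a few lines of Python. This is only a sketch: the "take the initials" rule is one hypothetical convention among many, and I am assuming the full name "University of South Carolina" for the South Carolina case.

```python
from collections import defaultdict

# Full names from the filing example, plus the assumed South Carolina entry.
names = [
    "University of Southern California",
    "University of California at Los Angeles",
    "University of South Carolina",
]


def initials(name):
    """One hypothetical convention: first letter of each capitalized word."""
    return "".join(word[0] for word in name.split() if word[0].isupper())


def collisions(names, shorten):
    """Group names by their shortened form and keep only the ambiguous ones."""
    groups = defaultdict(list)
    for name in names:
        groups[shorten(name)].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


clashes = collisions(names, initials)
print(clashes)  # two different universities end up filed under "USC"
```

Any other shortening rule can be passed in place of `initials`. The clashes move around, but some rule still has to be chosen, and the next person will probably choose a different one.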
I think they are going to build massive engines and interpreters and other confabulators to deal with this, and they may even do it, but I will be really surprised if they pull it off, because even if they do, anything that depends on the user doing something won't get done. I could tag these posts, but I don't. Why? Because I am lazy. I think most people are lazy, and they won't do this tagging stuff because they don't really care that much about it and they have better things to do with their lives.
So - In summary - Users are both stupid and lazy.