Automatic vs. Manual Tagging - Born to Tag? And to What End?
Dennis McDonald recently posted an entry on his experience with Reuters' automatic tagging tool called Calais. He concludes:
Despite the issues, I’m impressed and looking forward to tools like this making their way into more products and services. The addition of features such as learning, training, and authority lists will provide significant aids to both manual and automated use of such tools.
In response to a comment, he adds:
At this stage of my life and my career I have finally learned what I believe is a great truth: some people are born to tag, and some people are not.
I agree! I think there is a great role for category-focused folksonomies, but I clearly see the limits to how well enterprise users will tag content. Furthermore, technology can replace or simply augment the tagging that users do on their own.
All that said, there is a key distinction between tagging for categorization and tagging for other purposes, such as action, priority or content type.
We may be collectively lazy and uncoordinated when tagging for categorization - that's where automated tagging really can help, and generally outperform humans. In Traction TeamPage, we've implemented the FAST Search Module, which does an incredible job of entity extraction and interactive drill-down. It basically turns entities (like those mentioned by Dennis - company, person, location - as well as keywords, which are linguistically significant nouns or noun phrases) into groups and displays them as dynamic, permission-filtered "tag clouds." See our FAST Module page.
When it comes to tagging for action, priority or content type - the situation is different. This is a more sophisticated, but EXCEPTIONALLY valuable tagging strategy that can increase the value of your content by many orders of magnitude.
A prime example is writing "requirements" - tagging each blog or wiki page as Requirement, P1 or P2, To Do or Done, and Milestone Alpha or Milestone Beta. This approach allows you to take a page posted as feedback by a customer and funnel it into your project process. It also lets you slice and dice the content based on the task at hand (e.g. prioritizing pages and assigning them to different milestones).
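To make the slicing and dicing concrete, here is a minimal sketch in Python. It is not how TeamPage implements tagging; the Page class, the with_tags helper and the sample page titles are all hypothetical, made up purely to illustrate the idea using the tags named above.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """A blog or wiki page with manually applied tags (hypothetical model)."""
    title: str
    tags: set[str] = field(default_factory=set)


def with_tags(pages, *required):
    """Return the pages that carry every one of the required tags."""
    wanted = set(required)
    return [p for p in pages if wanted <= p.tags]


# Sample data, invented for illustration only.
pages = [
    Page("Export to PDF", {"Requirement", "P1", "To Do", "Milestone Alpha"}),
    Page("Single sign-on", {"Requirement", "P2", "To Do", "Milestone Beta"}),
    Page("Faster search", {"Requirement", "P1", "Done", "Milestone Alpha"}),
    Page("Customer feedback: mobile view"),  # posted by a customer, untagged
]

# Funnel the customer feedback page into the project process by tagging it.
pages[3].tags |= {"Requirement", "P2", "To Do", "Milestone Beta"}

# Slice the same content by the task at hand.
open_p1 = with_tags(pages, "Requirement", "P1", "To Do")
beta = with_tags(pages, "Milestone Beta")

print([p.title for p in open_p1])
print([p.title for p in beta])
```

The point of the sketch is that the page itself never moves or gets copied: the same feedback post shows up in the Milestone Beta view as soon as someone decides it belongs there, and that decision is the part a person has to make.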
This is the essence of content reuse, which has been talked about but never really implemented in "Document Management 1.0" and "Collaboration 1.0" models. Unlike tagging for category, this kind of tagging can't be automated. More on this whole topic in my slide set on Enterprise Tagging Strategies.