Thursday, March 10, 2011

In Summary

Building an effective content summarizer is in many ways the same task as building an effective topical analysis and filtering system for unstructured content.

An effective summarizer is a program that 'reads' a lengthy article and then provides a natural language summary (or even bullet points for now, we can handle that) of the 'pertinent' points of the article. Building one means determining the key topics and then extracting and re-phrasing those points in a shortened format. So the challenge is not only working out what the topics are, but which of them are the "key" topics.
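To make the 'determine the key topics, then extract' idea concrete, here is a minimal sketch of a frequency-based extractive summarizer. It is only an illustration, not the tool we actually use: it assumes that words repeated often in an article hint at its key topics, it only extracts sentences (no re-phrasing), and the names `STOPWORDS`, `split_sentences`, `tokenize` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use far larger ones).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "for", "on", "as", "with", "are", "was", "but", "this", "by",
}


def split_sentences(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # word counts stand in for 'topics'

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in keep]


article = ("Nanotechnology is changing medicine. "
           "Nanotechnology sensors detect disease early. "
           "The weather was nice.")
for line in summarize(article, max_sentences=2):
    print("-", line)
```

This gives bullet points rather than a re-phrased summary, and it has no notion of which topics matter to a particular reader, which is exactly where the hard part starts.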

Then we get into subjectivity. The author might seem to be the ultimate arbiter of what the key topics are, but different readers may get different value from the content, so ultimately the decision on what the key topics are depends on the needs of the audience. Say a reader is looking for information on nanotechnology: is that a key topic in the piece, or just a passing mention?
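One crude way to frame the 'key topic or passing mention' question is to measure how much of the article actually talks about the reader's topic. The sketch below is an assumption-laden illustration: it treats the share of sentences mentioning the term as a proxy for prominence, the 0.3 cut-off is an arbitrary value picked for the example, and `topic_prominence` and `classify_mention` are hypothetical names.

```python
import re


def topic_prominence(text, topic):
    """Return the fraction of sentences that mention `topic` (case-insensitive)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(topic) + r"\b", re.IGNORECASE)
    hits = sum(1 for s in sentences if pattern.search(s))
    return hits / len(sentences)


def classify_mention(text, topic, key_threshold=0.3):
    """Label `topic` as 'absent', 'passing mention' or 'key topic' for this text.

    key_threshold is an arbitrary illustrative cut-off, not a tuned value.
    """
    share = topic_prominence(text, topic)
    if share == 0:
        return "absent"
    return "key topic" if share >= key_threshold else "passing mention"


focused = ("Nanotechnology is reshaping drug delivery. "
           "New nanotechnology coatings extend battery life. "
           "Funding is growing.")
aside = ("The conference covered publishing. "
         "One speaker mentioned nanotechnology. "
         "Most talks were about search. "
         "Attendance was high. "
         "Lunch was good.")

print(classify_mention(focused, "nanotechnology"))
print(classify_mention(aside, "nanotechnology"))
```

A literal string match like this misses synonyms and related terms, and a single dense paragraph on a topic may matter more to one reader than ten scattered mentions do to another, so the subjectivity problem does not go away.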

These are the challenges. What's the key topic of this blog entry? Well, I'm not going to make it easy by putting any labels on it...

Monday, March 07, 2011

Positive Reinforcement

I spoke at the NFAIS 11 annual conference last week in Philadelphia. As I was speaking, my colleagues back in the office were tracking tweets from the conference audience describing what I was saying. They wanted to run a sentiment analysis on the tweets and then text me an update during my talk on how I was doing. Unfortunately the tweets were non-sentimental, just reportage. Wouldn't it be great to have a real-time sentiment analysis feedback system? I know that in the last presidential election they had the audience pressing buttons up and down to express their sentiment as the presidential candidates squared off in a live debate, but I'm talking about an audience bigger than a few hundred, and about unstructured feedback. Only a matter of time...

Anyway, I'll take it as a good thing that no one expressed negative sentiment about my talk. John Blossom wrote a very good summary - way better than anything that any existing automated content summarization tool we know about, or have made ourselves, could do - here.