A few weeks ago, I posted here on ‘drinking from the firehose’, looking at what we can learn from the huge tide of commentary that social media is creating for the first time.
For me, the most interesting aspect of this is that it gives journalists, PRs, or even more sophisticated intelligence gatherers, the chance to see things through the filter of the Internet’s hive mind. Journalists, PRs and researchers can learn from Hedge Funds and doctors here:
The interesting information can be found in changes and in the related stories that are spinning around a term that you are watching. A significant shift in the computer-tracked sentiment is the thing that tells the story. Hedge Funds have jumped on Twitter precisely because of this. So have doctors looking for flu epidemics.
In the wild, predators notice movement. The market of human interest is the same. We process newness and change.
To understand a developing situation, the firehose (with the right analysis) can tell us when a new dimension to a story emerges or when attitudes to an existing aspect of the story change noticeably. As predators know, we need to be looking at the space where change is happening. Large numbers are often less important than sharp variations.
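A minimal sketch of that last point, in Python. The function name, the three-day window and the doubling threshold are illustrative assumptions, and the figures are made up: the idea is simply to compare each day with its own recent baseline rather than with an absolute number.

```python
from statistics import mean

def sharp_changes(daily_counts, window=3, ratio=2.0):
    """Return the indexes of days whose count is at least `ratio` times
    the mean of the previous `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        baseline = mean(daily_counts[i - window:i])
        if baseline > 0 and daily_counts[i] >= ratio * baseline:
            flagged.append(i)
    return flagged

# Invented figures: a busy but steady term, and a quiet term that jumps.
busy_term = [5000, 5100, 4900, 5050, 5200]
quiet_term = [20, 25, 22, 90, 110]

print(sharp_changes(busy_term))   # [] : large numbers, but nothing has changed
print(sharp_changes(quiet_term))  # [3, 4] : small numbers, sharp variation
```

The busy term never gets flagged despite thousands of mentions a day, while the quiet one does as soon as it breaks from its own pattern.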
Case study: Noticing that a story is happening and what the key factors are
I’d now like to start to look at how we can dig into a story – in stages. I’m doing some work with some developers on a tool called Repknight, and they are aiming to create one-click ways of doing a lot of what follows, but this post is intended as a bit of a slo-mo walkthrough of what is possible.
Once we have all of that data in one place (providing we have the processing power – i.e. loads of servers), we can start to run all kinds of analysis over what we have. Sentiment analysis is one of the most common processes that we can use to sift this data.
Take a search term. For illustration purposes, I’ll use the recent ‘Dale Farm’ evictions as an example of a developing story. The term ‘dalefarm’ was a text-string that was being used by critics and supporters in mentions of the evictions on social media outlets.
Mining online comments with ‘dalefarm’ in them was, therefore, a useful way of keeping tabs on a developing situation.
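In code, the first step might look something like the sketch below. It assumes the comments have already been collected as (date, text) pairs; the sample comments are invented for illustration, and `daily_mentions` is a hypothetical helper rather than anything from Repknight.

```python
from collections import Counter
from datetime import date

def daily_mentions(comments, term):
    """Count, per day, the comments whose text contains `term`
    (case-insensitive substring match, so '#dalefarm' also counts)."""
    term = term.lower()
    counts = Counter()
    for day, text in comments:
        if term in text.lower():
            counts[day] += 1
    return dict(sorted(counts.items()))

# Invented sample data standing in for the real stream of comments.
comments = [
    (date(2011, 10, 18), "Bailiffs arriving at #dalefarm this morning"),
    (date(2011, 10, 18), "Watching the DaleFarm coverage on the news"),
    (date(2011, 10, 18), "Lovely weather in Essex today"),
    (date(2011, 10, 19), "Police still on site at #dalefarm"),
]

for day, count in daily_mentions(comments, "dalefarm").items():
    print(day.isoformat(), count)
```

A plain substring match is crude (it will miss ‘Dale Farm’ written as two words), but it is enough to produce the daily counts that the rest of the analysis builds on.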
So what could we find out?
Let’s start at the most obvious level: can we tell if something is happening at all? Here is a graph from Repknight showing mentions of this term across a wide range of social media platforms (Facebook, blogs, Twitter, YouTube, Flickr etc.).
Yes. We can safely say that on the 18th and 19th October, something was definitely happening. While Dale Farm may not be the best example of this (it was a ubiquitous news story that we all knew was going to happen), if you are monitoring a particular term (your organisation’s brand or name, a particular news issue, etc.), this can be useful. Journalists, in particular, are looking for relatively large, prolonged jumps (a ‘spike’ can often be a story that doesn’t have legs – a rumour that is quickly scotched).
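One way to tell a prolonged jump from a one-day spike is to group consecutive elevated days into runs and look at how long each run lasts. The sketch below assumes a known ‘normal’ daily level for the term; the function names, thresholds and figures are all illustrative.

```python
def elevated_runs(daily_counts, baseline, ratio=2.0):
    """Group consecutive days at or above ratio * baseline
    into (start_index, length_in_days) runs."""
    runs = []
    start = None
    for i, count in enumerate(daily_counts):
        if count >= ratio * baseline:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(daily_counts) - start))
    return runs

def classify(runs, min_days=2):
    """Label each run a 'spike' or a 'prolonged jump' by its length."""
    return [
        ("prolonged jump" if length >= min_days else "spike", start, length)
        for start, length in runs
    ]

# Invented figures: a one-day blip, then three busy days in a row.
counts = [10, 12, 60, 11, 9, 55, 70, 65, 12]
print(classify(elevated_runs(counts, baseline=10)))
# [('spike', 2, 1), ('prolonged jump', 5, 3)]
```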
Next, we need to drill into what was happening on those days during a short time-period around this term. We can look at a list of other words that also appear in comments with ‘dalefarm’ in them – like this one.
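A list like that can be approximated by counting the other words that turn up in comments containing the term. This is only a sketch: the stopword list is a tiny hand-picked assumption, the sample comments are invented, and a real tool would need far better tokenising.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "at", "to", "of", "and", "in", "is", "are", "on", "for", "by"}

def co_occurring_words(texts, term, top=5):
    """Return the `top` words that most often share a comment with `term`.
    Each word is counted once per comment."""
    term = term.lower()
    counts = Counter()
    for text in texts:
        words = set(re.findall(r"[a-z']+", text.lower()))
        if term in words:
            counts.update(words - STOPWORDS - {term})
    return counts.most_common(top)

# Invented sample comments.
texts = [
    "Police and bailiffs move in at #dalefarm",
    "Basildon council defends the #dalefarm eviction",
    "Police line forming by the gate #dalefarm",
    "Nothing to do with the eviction story",
]

print(co_occurring_words(texts, "dalefarm", top=1))  # [('police', 2)]
```

The last sample comment is ignored because it never mentions the term, which is what keeps the list focused on this story rather than on evictions in general.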
Here we see a map of the protagonists around this story – the police, Basildon Council, Richard Howitt (an MEP who got involved), bailiffs and, bizarrely, newts. Each player in the story attracts some positive and some negative comments on a range of different media platforms.
So, what use is this to us? Sentiment analysis in itself isn’t hugely accurate, as a quick play with this entry-level tool will show you. As a way of understanding individual short messages, it is deeply flawed.
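To make the idea of a per-protagonist breakdown concrete, here is a small sketch. It assumes each comment has already been given a sentiment label by some classifier (that step is not shown); the labels, sample comments and function name are all invented for illustration.

```python
def sentiment_by_entity(labelled_comments, entities):
    """Tally positive and negative comments that mention each entity.
    Comments with any other label (e.g. 'neutral') are ignored."""
    tallies = {entity: {"positive": 0, "negative": 0} for entity in entities}
    for text, label in labelled_comments:
        lowered = text.lower()
        for entity in entities:
            if entity.lower() in lowered and label in tallies[entity]:
                tallies[entity][label] += 1
    return tallies

# Invented comments with invented labels.
labelled = [
    ("Police handled #dalefarm calmly", "positive"),
    ("Heavy-handed police at #dalefarm", "negative"),
    ("Bailiffs were brutal #dalefarm", "negative"),
    ("Well done Basildon Council #dalefarm", "positive"),
    ("Police everywhere #dalefarm", "neutral"),
]

entities = ["police", "bailiffs", "Basildon Council"]
for entity, tally in sentiment_by_entity(labelled, entities).items():
    print(entity, tally)
```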
But automated sentiment analysis is getting better (and people are often bad at detecting sentiment accurately as well!). Changes here are what matters – as investment analysts and epidemic-tracking medics will tell you.
I noticed towards the end of last week that the sentiment around the ‘Occupy London Stock Exchange’ story (#OccupyLSX) was going from a fairly positive general response to quite a negative one. I’d already registered that there was a developing fuss around access to St Paul’s, but I’d tuned it out, thinking it was one of those aspects of the story that reporters were talking about to fill in time.
But drilling into that day’s sentiment, I found that ‘St Pauls’ had a strongly negative balance. This wasn’t going to go away. Throughout the day, this grew as the closure of the Cathedral was picked up by opponents as well as neutrals, and it became the aspect of the story that dominated the news bulletins.
This information is only one of the building blocks needed to cover a story. All of the information I’ve mentioned here so far is available within a couple of clicks. Repknight are allowing users to build up a store of data around particular subjects over a long period of time, so that significant changes can be identified properly.
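As a rough sketch of what ‘balance’ and a shift in it could mean in practice: take the positive and negative tallies for a term each day, turn them into a single net score, and flag the days where that score moves sharply. The formula, the threshold and the figures below are assumptions for illustration, not Repknight’s actual method or real #OccupyLSX data.

```python
def balance(positive, negative):
    """Net sentiment between -1 (all negative) and 1 (all positive);
    0.0 when there are no positive or negative comments at all."""
    total = positive + negative
    return (positive - negative) / total if total else 0.0

def balance_shifts(daily_tallies, min_shift=0.5):
    """Return (day, previous_balance, current_balance) for each day whose
    balance moved by at least `min_shift` compared with the day before."""
    shifts = []
    previous = None
    for day, (positive, negative) in daily_tallies:
        current = balance(positive, negative)
        if previous is not None and abs(current - previous) >= min_shift:
            shifts.append((day, previous, current))
        previous = current
    return shifts

# Invented tallies of (positive, negative) comments per day for one term.
tallies = [("Mon", (60, 40)), ("Tue", (55, 45)), ("Wed", (20, 80))]
print(balance_shifts(tallies))  # [('Wed', 0.1, -0.6)]
```

Individual labels may well be wrong, but if the errors are reasonably consistent from day to day, a swing of this size in the aggregate is still worth a closer look.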
In a subsequent post, I’m going to look at how we can dig into the negative messages, identify the communities that are talking among themselves, identify the influencers within those communities and the connectors between them. We can see who is making the running on a particular story – and even intervene with them to influence how a story unfolds.