An Apriori Algorithm Analysis Of Steve Jobs’ Tribute Messages

Yesterday, Neil Kodner wrote an interesting post in which he scraped and analysed the tribute messages for Steve Jobs on the Apple website. Among his findings: people talked about the Mac and iPhone the most, and compared Steve Jobs with great minds like Einstein, Ford and Edison. Neil also found that ‘rest in peace’ was the most used trigram across all the messages.

Seeing this made me think of applying the Apriori algorithm, which I recently implemented for my Web Text Mining class, to the tribute messages. Here is how Wikipedia describes the Apriori algorithm:

In computer science and data mining, Apriori is a classic algorithm for learning association rules. As is common in association rule mining, given a set of itemsets, the algorithm attempts to find subsets which are common to at least a minimum number C of the itemsets.

The way my group-mate Rene Dekker and I implemented it, the algorithm extracts association rules for words from a sentence or document (stopwords, punctuation and numbers are removed before analysis). So, I took the text file with tribute messages and applied the algorithm to see which word combinations are used frequently within a single tribute message. I’ll get into the algorithm in a later blog post, but here are the results for a minimum support level of 1% and a minimum confidence level of 85%.
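For readers who want a feel for the procedure before that later post, here is a minimal Python sketch of the general idea. It is not the implementation from the class project: the tiny stopword list, the regex tokenizer, the function names (`to_word_set`, `frequent_itemsets`, `single_consequent_rules`) and the four toy messages are all assumptions made up for illustration, and the thresholds are raised to 50% support so that the toy data produces something.

```python
import re
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not the list used for the post).
STOPWORDS = {"the", "a", "an", "and", "to", "of", "for", "you", "his", "in", "my", "our"}

def to_word_set(message):
    """Turn one message into a set of words, dropping stopwords, punctuation and numbers."""
    words = re.findall(r"[A-Za-z]+", message)  # alphabetic tokens only
    return frozenset(w for w in words if w.lower() not in STOPWORDS)

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: return {itemset: count} for all frequent itemsets."""
    min_count = min_support * len(transactions)
    counts = {}
    for t in transactions:
        for word in t:
            key = frozenset([word])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        # Join frequent (k-1)-itemsets into k-item candidates, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent

def single_consequent_rules(frequent, min_confidence):
    """Build rules 'antecedent -> one word' that reach the minimum confidence."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # Every subset of a frequent itemset is frequent, so this lookup succeeds.
            conf = count / frequent[antecedent]
            if conf >= min_confidence:
                rules.append((antecedent, consequent, conf))
    return rules

# Made-up toy messages, not taken from the real tribute data.
messages = [
    "My condolences to his family and friends.",
    "Thoughts and prayers for the family and friends of Steve",
    "Thank you Steve, rest in peace",
    "Our condolences to family, friends and Apple 2011",
]
transactions = [to_word_set(m) for m in messages]
frequent = frequent_itemsets(transactions, min_support=0.5)
found = single_consequent_rules(frequent, min_confidence=0.85)
for antecedent, consequent, conf in found:
    print(", ".join(sorted(antecedent)), "->", consequent)
```

The sketch keeps the original capitalisation of each word, and it only builds rules with a single word on the right-hand side, which matches the shape of the rules listed below.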

Interpreting the results

Jobs, friends, condolences -> family

This means that the four words ‘Jobs’, ‘friends’, ‘condolences’ and ‘family’ occur together (but not necessarily next to each other) in at least 1% of the tribute messages. Also, of the messages that contain ‘Jobs’, ‘friends’ and ‘condolences’, at least 85% also contain the word ‘family’.
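To make the two thresholds concrete, here is a small Python sketch of how support and confidence can be computed for a single rule. The helper names `support` and `confidence` and the four toy messages are assumptions for illustration only; the numbers are not from the real tribute data.

```python
def support(itemset, transactions):
    """Fraction of messages that contain every word in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the messages containing the antecedent, the fraction that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Made-up toy messages, already reduced to sets of words.
toy_messages = [
    {"condolences", "friends", "family"},
    {"friends", "family", "Steve"},
    {"friends", "Apple"},
    {"Thank", "Steve"},
]

# 'friends' and 'family' appear together in 2 of the 4 messages.
rule_support = support({"friends", "family"}, toy_messages)
# 3 messages contain 'friends', and 2 of those also contain 'family'.
rule_confidence = confidence({"friends"}, {"family"}, toy_messages)

print(rule_support)               # 0.5
print(round(rule_confidence, 2))  # 0.67
```

With the thresholds used in this post (1% support, 85% confidence), the toy rule friends -> family would pass the support test but fail the confidence test.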

friends, Steve -> family
Peace, Jobs -> Steve
Thank, Jobs, us -> Steve
world, friends, condolences -> family
Mr, world -> Jobs
Jobs, computers -> Steve
know, friends -> family
friends, many -> family
Mr, friends -> family
friends -> family
iPad, Jobs, Apple -> Steve
Jobs, created -> Steve
condolences, friends -> family
Mr, friends -> Jobs
friends, lost -> family
go, friends -> family
friends, Apple -> family
never, friends, Steve -> family
people, Jobs, world -> Steve
friends, like -> family
life, friends, Steve -> family
Jobs, friends, condolences -> family
friends, thoughts -> family
friends, always -> family
never, Mr -> Jobs
friends, Steves -> family
friends, Apple, condolences -> family
world, friends, Steve -> family
friends, man -> family
condolences, Steves -> family
Jobs, life, great -> Steve
prayers, friends -> family
Jobs, world, changed -> Steve
human, Jobs -> Steve
friends, Apple, Steve -> family
brought, Jobs -> Steve
friends, condolences, Steve -> family
friends, condolences -> family
friends, us -> family

I’ll elaborate more on the algorithm, and on the improvements in efficiency and usefulness we made, in a later blog post. Please stay tuned.