I joined Twitter in early 2008, 18 months after it was officially launched, but only started tweeting regularly after 2010. I have, however, never attempted to do any data mining on my tweets. I should probably say text mining instead, as Twitter is essentially a platform that captures words in sentences limited to 140 characters,[1] including web links.
One easy way to do a simple analysis is to generate a word cloud of tweets. A word cloud presents in graphical format the words used in tweets, ranked by frequency of use. The most-used words are displayed in larger fonts. The chart above depicts the overall word distribution of my tweets, including only those words that occurred at least 20 times over the years. Some of the priority topics I tried to cover over the years are displayed in the chart. Privacy, innovation, surveillance, gender, and data take the top spots.
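For readers who want to try this on their own tweets, here is a minimal sketch of the counting step behind a word cloud. It is not the tool I used for the charts. The sample tweets, the stopword list, and the function name `word_frequencies` are all made up for illustration, and stripping web links before counting is an assumption of this sketch.

```python
import re
from collections import Counter

# Hypothetical sample; a real run would load tweet texts from a Twitter archive.
tweets = [
    "Privacy and surveillance are two sides of the same coin #privacy",
    "New post on innovation and data http://example.com/post",
    "Data privacy matters for innovation",
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"and", "are", "the", "of", "on", "for", "new", "two", "same"}

def word_frequencies(texts, min_count=1):
    """Count words across all texts and keep those seen at least min_count times."""
    counts = Counter()
    for text in texts:
        # Lowercase and drop web links so URL fragments are not counted as words.
        text = re.sub(r"https?://\S+", " ", text.lower())
        # Keep alphabetic runs only; "#privacy" is therefore counted as "privacy".
        words = re.findall(r"[a-z]+", text)
        counts.update(w for w in words if w not in STOPWORDS)
    return {w: n for w, n in counts.items() if n >= min_count}

# The charts in this post use a threshold of 20; the toy sample uses 2.
freqs = word_frequencies(tweets, min_count=2)
for word, n in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
    print(word, n)
```

A dictionary like `freqs` is the usual input for drawing the cloud itself, where the font size of each word grows with its count.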
How about this year? Are there any emerging topics? Sure there are. The chart on the right shows the results. Innovation has lost its luster and blockchain has raced to the top. While gender and surveillance are still up there, privacy has gone down a notch. The other big winner in 2016 is artificial intelligence, while inequality seems to have remained at the same average level. I was surprised not to see robotics in the top tier, as I thought I had been following the topic very closely this year. But it has certainly gained some relative relevance.
The chart on the left shows the word cloud for 2015. This one is much closer to the overall tweet distribution, with gender displacing innovation. Sustainability takes one of the top spots here, probably thanks to the UN Sustainable Development Goals and the global climate change meetings.
One advantage of using hashtags is that they allow two or more words to be placed together. So, for example, Silicon Valley will show up as one entry in our word cloud, as is the case with artificialintelligence in my tweets.
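The hashtag effect is easy to see with a simple tokenizer. The sketch below is only an illustration with made-up sentences and a hypothetical `tokenize` helper. It assumes words are split on anything that is not a letter, digit, or underscore, which may differ from what a given word cloud tool does.

```python
import re

def tokenize(text):
    """Split lowercased text into word tokens; the leading '#' of a hashtag is dropped."""
    return re.findall(r"\w+", text.lower())

# Written as plain words, each multi-word term is split into separate entries.
plain = tokenize("Reading about artificial intelligence in Silicon Valley")

# Written as hashtags, each term survives as a single entry.
tagged = tokenize("Reading about #ArtificialIntelligence in #SiliconValley")

print(plain)
print(tagged)
```

In the first list, "silicon" and "valley" are counted separately. In the second, "siliconvalley" and "artificialintelligence" each count as one word.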
Anyway, it feels good to know, with some data, what one is really tweeting about.
Cheers, Raúl
Endnotes
[1] Some studies suggest that in English the average number of words in a sentence should be between 9 and 14 to increase readability. See http://www.onlinegrammar.com.au/how-many-words-are-too-many-in-a-sentence/. Note that hashtags are part of the analysis.