Showing posts with label Word clouds. Show all posts

Monday, July 18, 2016

Comparison of top 10 and bottom 11 Fortune 500 firms

Hello readers,

A text-analytics-based comparison was done to find the distinguishing characteristics of firms belonging to the following two clusters:

Cluster 1:
Rank 1: Walmart
Rank 2: Exxon Mobil
Rank 3: Apple
Rank 4: Berkshire Hathaway
Rank 5: McKesson
Rank 6: UnitedHealth Group
Rank 7: CVS Health
Rank 8: General Motors
Rank 9: Ford Motor
Rank 10: AT&T

Cluster 2:
Rank 490: Rockwell Collins
Rank 491: Lam Research
Rank 492: Fiserv
Rank 493: Spectra Energy
Rank 494: Navient
Rank 495: Big Lots
Rank 496: Telephone & Data Systems
Rank 497: First American Financial
Rank 498: NVR
Rank 499: Cincinnati Financial
Rank 500: Burlington Stores

Download report:
A detailed report containing observations, commentary, correlation plots and word clouds is available for download here: Report


Correlation plot for Ford Motor
The R programming language (along with the packages "tm", "Rgraphviz", "graph" and "slam") was used to prepare this analysis report. Citations for each are provided below.
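
For readers curious about what sits behind a correlation plot of this kind: it links terms whose counts rise and fall together across documents. The sketch below illustrates that idea in plain Python rather than R, so it is not the code used for this report. The function names, the 0.5 default threshold and the sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_counts(documents):
    """Return (sorted vocabulary, {term: [count in each document]})."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    return vocabulary, {term: [c[term] for c in counts] for term in vocabulary}


def pearson(xs, ys):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def associated_terms(documents, term, threshold=0.5):
    """Terms whose per-document counts correlate with `term` at or above `threshold`."""
    vocabulary, matrix = term_document_counts(documents)
    if term not in matrix:
        return {}
    target = matrix[term]
    result = {}
    for other in vocabulary:
        if other == term:
            continue
        r = pearson(target, matrix[other])
        if r >= threshold:
            result[other] = round(r, 2)
    return result


if __name__ == "__main__":
    # Hypothetical snippets standing in for sections of an annual report.
    sample = [
        "vehicle sales grew and vehicle demand rose",
        "vehicle sales fell",
        "pension costs rose",
        "pension plan costs",
    ]
    print(associated_terms(sample, "vehicle", threshold=0.8))
```

In a plot, each term returned by `associated_terms` would be drawn as a node joined to the chosen term, so a higher threshold gives a sparser graph.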

Do leave comments below. Thanks.



Citations:
  • R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
  • Kurt Hornik, David Meyer and Christian Buchta (2016). slam: Sparse Lightweight Arrays and Matrices. R package version 0.1-35. https://CRAN.R-project.org/package=slam
  • Kasper Daniel Hansen, Jeff Gentry, Li Long, Robert Gentleman, Seth Falcon, Florian Hahne and Deepayan Sarkar (2016). Rgraphviz: Provides plotting capabilities for R graph objects. R package version 2.16.0.
  • R. Gentleman, Elizabeth Whalen, W. Huber and S. Falcon (2016). graph: A package to handle graph data structures. R package version 1.50.0.
  • Ingo Feinerer and Kurt Hornik (2015). tm: Text Mining Package. R package version 0.6-2. https://CRAN.R-project.org/package=tm 
  • Ingo Feinerer, Kurt Hornik, and David Meyer (2008). Text Mining Infrastructure in R. Journal of Statistical Software 25(5): 1-54. URL: http://www.jstatsoft.org/v25/i05/.
  • The latest annual reports were collected from the individual websites of each of the 21 companies analyzed in the report.

Thursday, July 7, 2016

Chilcot Report: Text analytics (Wordcloud and correlation plots)

Hello readers,

The Chilcot Report has been published.
You can read the report here:
http://www.iraqinquiry.org.uk/the-report/

I took a shot at analyzing it by merging all 58 PDF files into a single PDF document.
I then converted that document into text and made a wordcloud.
The wordcloud is shown below:
More analysis to follow.
Thanks.
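
The text-to-wordcloud step described above comes down to counting words and scaling each word's size by its count. Below is a rough Python sketch of that counting and scaling, assuming the report has already been converted to plain text. It is not the R code used for this post, and the function names, the tiny stopword list and the size range are all illustrative assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real analyses use a much longer one).
STOPWORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "was",
    "that", "it", "for", "on", "with", "as", "by",
}


def word_frequencies(text, min_length=3):
    """Count lowercase alphabetic words, skipping stopwords and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(
        t for t in tokens if len(t) >= min_length and t not in STOPWORDS
    )


def cloud_sizes(frequencies, max_words=100, min_size=10, max_size=60):
    """Map the most frequent words to font sizes, scaled linearly by count."""
    top = frequencies.most_common(max_words)
    if not top:
        return {}
    highest = top[0][1]
    lowest = top[-1][1]
    span = highest - lowest
    sizes = {}
    for word, count in top:
        # If every word has the same count, draw them all at the largest size.
        scale = (count - lowest) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


if __name__ == "__main__":
    # Hypothetical sample text standing in for the converted report.
    sample = (
        "The inquiry heard evidence. The inquiry published the report. "
        "Evidence, evidence, evidence."
    )
    print(cloud_sizes(word_frequencies(sample)))
```

A drawing library would then place each word at its computed size; the counting step is the part that decides which words dominate the picture.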

UPDATE:

The correlation plot has been prepared. It can be viewed at the following link:
CHILCOT REPORT ANALYSIS

Tuesday, January 12, 2016

Pyaasa- Timeless Indian classic movie- word cloud of dialogues

Source: Subscene subtitles
Package: wordcloud
(click to enlarge)
Pyaasa movie-wordcloud

Devdas 2002- word cloud

Source: Subscene subtitles
R package: wordcloud
(click to enlarge)
Devdas (new)

Godfather 1, 2 & 3 - word cloud of dialogues

Source: Subscene subtitles
Package: wordcloud
(click on image to enlarge)
Godfather 1
(click on image to enlarge)
Godfather 2

(click on image to enlarge)
Godfather 3

Sunday, January 10, 2016

New Testament- Wordcloud

Source: Internet
R-package used: wordcloud

(click on image to enlarge)

What is Strategy by Porter- Wordcloud

R-package used: wordcloud

The text was extracted from a PDF file, so many of the words are garbled because of imperfect text extraction. Still, you can make sense of most of the terms by using your own judgement.

(Click on image to enlarge)

Little Miss Sunshine- Wordcloud of all the dialogues

Source: Little Miss Sunshine subtitles at 
http://www.opensubtitles.org/en/search/sublanguageid-eng/idmovie-20504
R-package used: wordcloud


(click on image to enlarge)


Alice's Adventures in Wonderland: Wordcloud

Data source: Project Gutenberg
R-package used: wordcloud

(click on image to enlarge)

Pride and Prejudice by Jane Austen: Wordcloud

Source of text: Project Gutenberg
R-packages used: wordcloud
(Click on image to enlarge)

War and Peace by Leo Tolstoy: Word cloud

Source: Project Gutenberg
R-package used: wordcloud
(Click on image to enlarge)

Great Expectations by Charles Dickens: A wordcloud


Data source: Project Gutenberg
R-packages used: wordcloud
(Click on image to enlarge)







Friday, January 8, 2016

Text analysis and word clouds of 3 memorable speeches


Winter of discontent, Margaret Thatcher

(Click on image to enlarge)
I have a dream, Martin Luther King (original, unedited speech. Contains n words.)

(Click on image to enlarge)
Tryst with Destiny, Jawaharlal Nehru

(Click on image to enlarge)
