Introduction to data analysis & visualization through excel

With the advent of cheap computation power (PC), the amount of data and related graphs/charts & info-graphics has seen a tremendous growth in last few decades.  While there are hundreds of specialized tools available in the market, excel remains the most widely used tool due to its mass reach.

Excel provides a great set of tools for data analysis and visualizations. With a bit of creativity & efforts it can be extended beyond the obvious uses to produce stunning results.

Given the barrage of options that excel provides in terms of formatting/graphs/charts etc, it is important to understand few basics of data analysis & visualizations.

Here are the 5 top most rules that one must keep in mind before starting any data analysis/visualization work:

1.       Keep User Focus: If the exercise is not just about data exploration (when you just analyze the data for your own understanding of the data set), usually you analyze data and create visualizations for an audience. The most important rule is to keep that audience in mind.

Things to avoid:

i.  Using a specific visualization because you want to showcase your mastery on it

ii.  Presenting incomplete pictures, not having information on most obvious follow-up questions

Actually the visualization has to address a problem. In the above given graph, perhaps if the context was to analyze lower than planned value growth, a comparison with planned growth will make more sense(and this individual growth rates will make little sense)

 

2.       Right tools for the problem: We come across several examples of really bad visualizations, examples of using a chart when none is required or of dumping a lot of text & numbers when a chart was really required. Also, using a wrong type of chart is a commonplace problem.

In the example given below to summarize (visualize) the information given in the table, there is perhaps no need to go for any visualization at all, as there are just 2 data points.

If the intention is to focus on the relationship between the data points(achievement of 18 Mn vs target of 97 Mn for the year), the below given second visualization works better due to its minimal design and better representation of information.

3.       Substance over style: Why do you go for visualization? Is it to depict an important relationship in the data in a concise & easy to understand format or just to impress others with some visual elements. Good design requires that a bulk of your focus should be on the first step(simple & meaningful representation of data). The result can be spruced up a bit for visual appeal, but too much focus on visual appeal or too many elements to create a visual appeal(like multiple colors, borders, gradients, shadows, 3D effects etc) will ruin the message most of the times.

 

4.        Minimize data distortion: Being in control of large data sets can often tempt us to distort the data in favour of what we want the audience to see (& not what the data talks about itself). This is where we kill the spirit of data analysis. Users will rather go through the data in its entirety if they suspect that they are being misled through these summarization/visualisation.

 

One example of such distortion is given below:

distorting information through visualization

Company XYZ is growing at a spectacular rate of 23%, much higher than any of its competitor. An overenthusiastic analyst comes up with the first picture to show this situation. Please note that the axis starts at 10% and hence the whole picture is distorted.

The legends don’t match with the length of the bars (hence what’s the use of this visualization?), and by just looking at the height of columns, you’ll get an impression that XYZ is growing at more than 6X rate of company B(while actually it is 23% vs 12%)

A neutral representation of the same is shown in this second figure.

5.       Understand & tell the story:

Last but the most important thing is to understand the story that is being presented through the visualization. Disjointed charts have little meaning and are just replacing data tables. What makes visualization work is an underlying story (like sales are not increasing much but profits are growing at a steady rate, or to say that human development & per capita GDP has a correlation across countries). Mundane, day-to-day reporting also provides enough chances to discover/weave a story around the numbers. 

 

If this excites you and you want to know more, please treat yourself with the work of Edward. R Tufte, the all time best book on the subject : The Visual Display of Quantitative Information

If you are just starting your excel journey, please go through these great articles to learn simple data visualtion techniques on excelladder.