Data Visualization Criticism…
Static Visualization Critique
In this blog, I will critique a static data visualization in detail.
The goal of the network graph shown in Figure 1 is to show how strongly the adjectives most frequently used by sommeliers correlate with one another.
This visualization shows the words and phrases used to describe red and white wines, including the words used to describe tannins, acidity, and other characteristics. These words are linked to show whether they are strongly or weakly related. The visualization also depicts the average wine rating associated with each term.
Oenophiles can use this visualization to find the best words to describe a wine and to discover the particular words other connoisseurs use. It can also help them determine whether two words are interchangeable when describing wine, and whether they are using a word correctly.
This visualization depicts three data variables: average rating, word count, and correlation. They are encoded by hue, area, and connectivity, respectively.
The visualization is simple and comprehensible. Anyone who takes a quick look at it can understand it without much time or effort. Labels are placed carefully in the network to avoid clutter. The data points are also arranged in organized clusters: as shown in Figure 2, the top left corner comprises adjectives pertaining to fruity flavors, while the data points in the bottom right, shown in Figure 3, are words related to earth and herbs. Data points that have no relationship to other words are thoughtfully placed at the boundary of the network.
The colors used in this visualization are hard to tell apart. It uses a two-color palette, which makes neighboring shades difficult to differentiate and demands considerable cognitive effort. Further, word count is encoded as circle area, so reading off an actual value becomes guesswork. With this encoding, we can only place each point within a range of values based on the legend. Apart from this, the weak links in the graph are too faint to recognize easily.
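To see why area makes exact values hard to read, here is a minimal sketch of how an area encoding typically works. The word counts and the maximum radius are made-up numbers for illustration, not values taken from the chart, and `radius_for_count` is a hypothetical helper.

```python
import math

def radius_for_count(count, max_count, max_radius=20.0):
    """Radius of a circle whose AREA is proportional to count."""
    return max_radius * math.sqrt(count / max_count)

# Hypothetical word counts, not taken from the original chart.
counts = [500, 1000, 2000, 4000]

for c in counts:
    r = radius_for_count(c, max_count=4000)
    print(f"count={c:>5}  radius={r:5.1f}")
```

Because the radius grows with the square root of the count, an eightfold difference in count (500 versus 4000) becomes less than a threefold difference in radius. This compression is one reason viewers can only estimate a range from the legend.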
To improve this visualization, the author could use a different color scheme so that the points are easier to distinguish. The color used for the weak links could be darker, since the current lighter shade merges with the background. Instead of using area for the number of occurrences, the author could have used different shapes to depict the four occurrence levels. The author could also have added a horizontal bar chart at the bottom showing the words most used by connoisseurs, which would tell us which phrases they prefer.
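One way to check whether a darker link color would help is to compute its contrast against the background. The sketch below uses the WCAG contrast ratio formula. The hex colors are my own assumptions (a white background, a light gray link, and a darker gray link), not colors sampled from the chart.

```python
def relative_luminance(hex_color):
    """WCAG relative luminance of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la = relative_luminance(color_a)
    lb = relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

# Assumed colors for illustration only.
background = "#ffffff"
light_link = "#dddddd"
dark_link = "#777777"

print(round(contrast_ratio(background, light_link), 2))
print(round(contrast_ratio(background, dark_link), 2))
```

The higher the ratio, the more clearly the link stands out from the background, so the darker gray should score noticeably higher than the lighter one.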
Despite these shortcomings, the network graph highlights the most used words and their synonyms, which was the main purpose of the visualization.
I like this particular visualization because when I came across this dataset myself, I was not able to convey the right information from it (shown in Figure 2). The author imported an extra dataset that lists the most used wine adjectives (shown in Figure 3). He then extracted these adjectives from the wine description column of the dataset and mapped them to the average wine ratings as well. Having dabbled with this dataset, I can say that this is the best visualization I have seen of the wine description column. The dataset was challenging, and I couldn't find a better way to visualize it. However, the author omitted other columns such as region and price. Including them would have helped readers understand which types of wine belong to a particular region.
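The processing described above can be sketched roughly as follows. This is only my guess at the general shape of the pipeline, not the author's actual code. The reviews and the adjective list are invented, and the phi coefficient is just one plausible correlation measure, since I don't know which one the author used.

```python
from collections import defaultdict
import math

# Hypothetical reviews as (description, rating). Not real data.
reviews = [
    ("ripe cherry and soft tannins", 90),
    ("crisp citrus with bright acidity", 88),
    ("ripe cherry, earthy finish", 92),
    ("earthy and herbal, firm tannins", 86),
]

# Hypothetical adjective list standing in for the extra dataset.
adjectives = {"ripe", "soft", "crisp", "bright", "earthy", "herbal", "firm"}

def tokens(text):
    """Set of lowercase words in a description, punctuation stripped."""
    return {w.strip(",.").lower() for w in text.split()}

def word_stats(reviews, vocabulary):
    """Return {word: (review count, average rating)} for vocabulary words."""
    ratings = defaultdict(list)
    for text, rating in reviews:
        for word in tokens(text) & vocabulary:
            ratings[word].append(rating)
    return {w: (len(r), sum(r) / len(r)) for w, r in ratings.items()}

def phi(reviews, a, b):
    """Phi coefficient between the presence of words a and b in reviews."""
    n11 = n10 = n01 = n00 = 0
    for text, _ in reviews:
        words = tokens(text)
        has_a, has_b = a in words, b in words
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

stats = word_stats(reviews, adjectives)
print(stats["ripe"])
print(round(phi(reviews, "ripe", "soft"), 3))
```

In a chart like this one, the count would drive the circle area, the average rating would drive the hue, and word pairs with a high enough correlation would be joined by a link.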
The visualization is by Dr. Torsten Sprenger (2019): https://github.com/spren9er/tidytuesday/blob/master/images/tidytuesday_201922_wine_ratings_most_frequent_words.png