Originally published 10 October 2006
This article is the third in a five-part series that features the winning solutions to our 2006 Data Visualization Competition. The third scenario in our competition asked participants to create a visual display that would enable real estate agents to monitor what’s going on in the housing market.
Here’s the scenario as it was described to participants:
As an analyst for a group of real estate agents, you want to create a visualization that will allow them to view several characteristics of house sales in a given month to help them better track and understand what's happening in the housing market. This group of agents deals with properties in five neighborhoods. You believe that they would gain meaningful insights if they could simultaneously examine several sales-related variables at once to make useful connections, so you want to display everything on a single page (but not necessarily in a single graph). You believe that each of the data items that appear below in the data section is significant.
I supplied participants with an Excel spreadsheet that reported individual house sales for the month divided into five neighborhoods, including the actual sales amount, the original asking price, and the number of days each house was on the market. Also, per neighborhood, the spreadsheet provided the median actual sales amount and the median original asking price for the month and for the same month of the previous year.
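For readers who want to see how such per-neighborhood summaries are derived, here is a minimal Python sketch. The prices, the prior-year medians, and the `summarize` helper are all hypothetical illustrations, not the competition data, and only two of the five neighborhoods are shown.

```python
from statistics import median

# Hypothetical (selling_price, asking_price) pairs; NOT the competition data.
sales_by_neighborhood = {
    "Badlands": [(210_000, 200_000), (205_000, 198_000), (220_000, 215_000)],
    "Filthy Richlands": [(900_000, 950_000), (1_100_000, 1_200_000)],
}

# Hypothetical prior-year medians, supplied directly rather than computed,
# just as the spreadsheet supplied them.
prior_year_median_selling = {"Badlands": 204_000, "Filthy Richlands": 1_050_000}


def summarize(sales):
    """Return the median selling and asking price for each neighborhood."""
    summary = {}
    for name, pairs in sales.items():
        selling = [s for s, _ in pairs]
        asking = [a for _, a in pairs]
        summary[name] = {
            "median_selling": median(selling),
            "median_asking": median(asking),
        }
    return summary


summary = summarize(sales_by_neighborhood)

# Year-over-year change in the median selling price, per neighborhood.
change = {
    name: stats["median_selling"] - prior_year_median_selling[name]
    for name, stats in summary.items()
}
```

The median is used rather than the mean because a single unusually expensive sale would otherwise pull the summary away from the typical house.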
While reviewing the many solutions that were submitted for this scenario, in addition to clarity of communication and ease of use, I was looking for a display that would allow people to see the following:
All of these characteristics displayed within eye span on a single screen or page.
The Winning Solution
Jock Mackinlay of Tableau Software submitted the winning solution for this scenario. He created it using Tableau 2.0, which has since been updated to the 2.1 release. Tableau Software is uniquely capable of displaying complex quantitative relationships, such as those featured in this solution.
Take a moment to look at Jock’s solution (Figure 1) to determine for yourself how well he succeeded in communicating this complex picture of the real estate market.
Figure 1: Jock Mackinlay of Tableau Software’s winning solution.
Let’s begin our examination of Jock’s solution by allowing him to describe it in his own words:
Real estate agents have a keen interest in the sales price of a property because that determines their commission. They are also interested in how long properties take to sell in a given neighborhood and the relationship between the selling price and asking price. These metrics normally correlate with the satisfaction of their customers. Agents want to get referrals and repeat business.
Our design visually represents the individual property sales for May 2006 as a small-multiple chart for the five neighborhoods. It supports the individual and comparative analysis of these neighborhoods. Each display compares a property’s days on market to its sales price with the data shared between the small-multiples. We designed it this way because that is the most important data for real estate agents. The design also allows us to incorporate the median prices for 2005 and 2006 for each neighborhood.
The marks are fundamentally Gantt bars. One end is the selling price and the other end is the asking price. The length is the variance. We also facilitate the reading of these marks by using color to encode positive and negative variance. The unconventional rendering of these Gantt bars emphasizes the selling price. The rendering also helps to distinguish marks that are adjacent.
Although we could have calculated and visualized aggregate statistics for this data, we have chosen to plot every property sale because real estate agents would find this very accessible. They would be able to find their individual sales in the visualization. They can scan the display to ascertain a general sense of the statistical distributions.
For example:
- Badlands has a tight cluster of property sales that sell quickly and with a significantly positive variance when visually compared to the other neighborhoods. The median marks show little change from 2005. This is a good neighborhood except that the selling price is low when compared to the other neighborhoods.
- Shady Ways has two or three clusters with the lowest cluster below Badlands. This clustering would have been lost with views that used aggregate statistics. The variance is negative and pretty large. After a slight gap, the properties sell pretty quickly. People asked for more this year and got more. This looks like a good neighborhood for a real estate agent, if you can avoid the low cluster.
- Melancholy Acres has a reasonably tight selling price but a wide range of days on the market. As you would expect, the longer the property is on the market, the more the selling price drops from the asking price. The median asking price went up from the previous year, but the median selling price is down slightly. This is a good neighborhood for agents, if you avoid the slow selling properties.
- Somnolent Community has many property sales, but they vary widely in amount and the days on market. The variance is pretty negative. The median data shows a flip with the previous year. In 2005, people got more than they asked. In 2006, they asked for that amount, but got what was asked in 2005. This looks like a difficult neighborhood. On the positive side, it has many sales.
- Filthy Richlands does not have very many sales, but it has a higher commission. The number of days on the market is very predictable. People asked for a lot more in 2006 and got less than in 2005.
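To make the mark encoding in Jock’s description concrete, here is a minimal Python sketch of how each sale could be turned into a bar running from asking price to selling price, with variance encoded by length and by color. This is an illustration under stated assumptions, not Tableau’s implementation: the `Sale` and `Mark` types, the `to_mark` and `small_multiples` helpers, the color names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    neighborhood: str
    days_on_market: int
    asking_price: float
    selling_price: float


@dataclass
class Mark:
    x: int           # horizontal position: days on market
    bottom: float    # lower end of the bar (the smaller of the two prices)
    height: float    # bar length: absolute variance
    variance: float  # selling price minus asking price
    color: str       # encodes the sign of the variance


def to_mark(sale, pos_color="blue", neg_color="orange"):
    """Build one bar: one end is the asking price, the other the selling price."""
    variance = sale.selling_price - sale.asking_price
    return Mark(
        x=sale.days_on_market,
        bottom=min(sale.asking_price, sale.selling_price),
        height=abs(variance),
        variance=variance,
        color=pos_color if variance >= 0 else neg_color,
    )


def small_multiples(sales):
    """Group marks by neighborhood, one panel per neighborhood."""
    panels = {}
    for sale in sales:
        panels.setdefault(sale.neighborhood, []).append(to_mark(sale))
    return panels


# Hypothetical sales; NOT the competition data.
example_sales = [
    Sale("Badlands", 12, 200_000, 215_000),    # sold above asking
    Sale("Shady Ways", 40, 350_000, 320_000),  # sold below asking
]
panels = small_multiples(example_sales)
```

Each panel’s marks could then be drawn on axes that share the same price and days-on-market scales, which is what makes the neighborhoods directly comparable.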
Jock designed an elegant solution. He found a way to use the visualization capabilities of Tableau 2.0 to create an innovative solution that follows the rules of visual perception, resulting in a display that is easy to read and comprehend, despite the many data elements it combines into a single presentation.
Notice how easy it is to see both summary (expressed as medians for each neighborhood) and detail information in the same display. Notice also how easily you can compare neighborhoods. Because real estate agents can see all of these critical variables together, they can make connections between them that might otherwise go undiscovered.
In this series of articles, I normally make a few suggestions for how the solution can be improved, but I’m at a total loss to find anything significant lacking in Jock’s solution. I’m determined, however, to add some value, so I’ll get really picky and point out that better colors than red and green could have been used to differentiate positive and negative variances between asking prices and actual sales amounts, because about 10% of males and 1% of females – those who are colorblind – would have difficulty discriminating between these colors. This is not a big deal, however, because the differing shapes of the positive variances (shaped like a “T”) and the negative variances (shaped like an upside-down “T”) are fairly easy to discriminate, even without the color differences.
Solutions that Fell Short
Let’s try to learn something from some of the solutions that failed to work in one way or another. The first appears to have been created with software from Tableau, just like Jock’s winning solution, but it doesn’t exhibit the same clear representations of the data. In fact, I find it very difficult to interpret. This underscores the fact that having good software doesn’t guarantee a good solution. You must still apply your skills to visually encode information in ways that clearly and accurately present it to the eyes of your readers.
Figure 2: This solution is difficult to interpret.
Let’s isolate a single graph (see Figure 3) in this series of 10 graphs and attempt to understand it. Here’s how it works:
Figure 3: A single graph from the complete solution that appears in Figure 2.
Here are some of the reasons this display doesn’t communicate effectively:
This was a creative attempt to meet the requirements for this display, but it doesn’t pass the test of clear and efficient communication.
The next example is not as difficult to understand as the last, but it also fails to communicate effectively, primarily because it breaks several rules of visual perception. Take a couple of minutes to review it on your own and list the problems that you find.
Figure 4: This solution breaks several rules of visual perception.
Here are a few of the problems that I found:
The final example is typical of many displays today in that it focuses on flashy visual effects, rather than clear and efficient communication. Here are some of its specific problems:
Figure 5: This solution is filled with visual effects, decoration, and textual clutter, which distract from the data.
I hope you’ve found these solutions and my critiques informative. Next month, I’ll feature the winning solution to scenario #4 of our 2006 Data Visualization Competition, which succeeds in designing a dashboard that can be used quite effectively by airline executives. If you’re interested in dashboard design, I believe you’ll find this next article enlightening.
SOURCE: Simple Displays of Complex Quantitative Relationships