This, the third article in Jim Lane's series on data modelling, looks more specifically at visualising data. There are countless textbooks available on the subject, but this is not a literature review. This article captures the thoughts and experiences that my colleagues and I have gathered over many years working in healthcare.
They say that a picture paints a thousand words. By the same token, a well-constructed visualisation can paint a thousand rows of data.
Spreadsheet outputs have not historically been the most beautiful things to look at, but that no longer needs to be the case. There are now many tools for making data look great whilst also increasing the impact it can have. My personal preference is Tableau, but it's worth noting that the functionality of Microsoft Excel has improved significantly. Sometimes all it takes is a little patience, creativity and a small dose of effort.
A good visualisation should maximise the cognitive ease of the reader – it should be easy on the eye, but more importantly it should articulate the message with clarity and concision
The success of a visualisation can be directly linked to the cognitive ease it affords the reader. Cognitive ease is a measure of how easy it is for us (our brains) to process information. It can alter how we feel about something and can determine the time and effort we might invest in trying to understand and rationalise it. That’s enough on psychology for now, however; I’d rather focus on the reality of my own experience and the experience of colleagues.
A well-constructed visualisation can help answer specific questions and shine a light on where to look deeper
As a developer of visualisations, it's really important to understand what end-users of the output are looking for. Having canvassed the views of colleagues and clients, I have compiled a short list of requirements which may well resonate with others. These are therefore key things that you might wish to bear in mind when developing and designing your own visualisation (viz).
- They need to be insightful, support problem solving and be action-oriented.
- They need to be intuitive and self-explanatory, allowing you to easily interpret the key takeaways. They should draw you to the right things.
- They need to be technically elegant and eye-catching, with visual hooks which draw people in.
- They need to be clearly marked, legible and annotated if possible. Minimal time should be spent trying to acclimatise to and decode the viz (e.g. trying to interpret the axis categories).
- They need to be purposeful and help to tell you something that isn't obvious from just looking at the underlying data.
- They should tell a story clearly and unambiguously. They need to indicate what is good/bad at a glance, helping to develop and test hypotheses by understanding cause and effect relationships.
It's all about storytelling
Albeit a snapshot, that's quite a lot to consider and, honestly, it won't always be possible to tick everything off this list each time. It is important, though, to understand which requirements matter most for the questions you're trying to answer. So what could you think about to support this?
- Make it easy on the eye. Use colours appropriately and think of the audience and how they will be viewing the viz. If it's on paper, then bright, fluorescent colours tend not to work. If it's on screen, think about the quality of the projection as this may affect the size of the font you use.
- Use clear labels and give your viz a title! Unless it is blatantly obvious or communicated through annotations, a graphic without a clearly marked axis will mean that the reader has to spend unnecessary time trying to interpret what they're looking at.
- Make it easy to understand. You might be able to produce the most amazing Sankey diagram ever seen, but if the intended reader can’t understand it, then it has no value. Appreciate the end-user, their technical competence and how much time they will have to understand the viz. With free rein, I recently developed a Sankey diagram to help a client understand how patients flowed through a system, but it was rejected because they didn't understand it.
- Be creative, but don’t reinvent the wheel. I'll admit to borrowing ideas from others. I take the view that if a certain viz has worked on me, it’ll likely work on others too.
- Make the visualisation relevant. Select the right type of graphic for your audience. Sometimes a simple bar chart is all that’s needed and is the most appropriate method. Also consider how the viz will be communicated: if the target audience is small and you're on hand to explain things, then you can possibly be more creative. If, however, you're presenting something to a much broader audience, keep things as simple as possible and test it out on a few people first.
- Annotate the graphic and provide relevant narrative which will help to explain how to interpret the viz. What may seem obvious to you may not be to others. Try to put yourself in the shoes of the end user, think about the questions they might ask when trying to interpret the viz and then build these into the narrative.
- Ensure it answers or supports the question at hand. I make no apologies for referencing this a number of times as ultimately this is how success will be measured.
These are not hard and fast rules; they're based on feedback I've had in the past and on the situation at hand. Perhaps the best way to articulate some of these points is through real examples. Below are a number of examples from work that I have previously undertaken. These outputs have been produced using Tableau, but that is just one of a myriad of tools available; others, such as QlikView, are equally capable of producing similar outputs.
The first example was used to help answer questions about variation in A&E performance throughout the hours of the day and days of the week. There is lots of information on the page, but this is intentional as it helps build a story of variation. We're often presented with averages at a very high level of aggregation (A&E sit-reps are calculated monthly, for example), but this masks considerable variation on an hourly and daily basis. If you need to understand when to schedule staff to reduce breaches, for example, then having average monthly data is of no use at all. The graphic is supported by a narrative to help the reader understand in more detail what is being shown. Hopefully it is visually appealing, and the use of colours and sizing of text is consistent.
It's worth remembering that an individual visualisation doesn't need to answer every question in detail, but it should at the very least shine a light on where you may need to look. In the example above, you may want to look at the acuity of patients or the workforce supply to determine if that is impacting on performance and/or admission rates at particular times of the day/week.
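To make the point about averages concrete, here is a minimal Python sketch of how a single monthly figure can hide variation by day of week and hour of day. The attendance records, the `breached` flag and the function names are all hypothetical, invented purely for illustration; this is not the data or the method behind the graphic described above.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical attendances: (arrival time, did the attendance breach?)
attendances = [
    (datetime(2018, 2, 5, 9, 15), False),   # Monday morning
    (datetime(2018, 2, 5, 9, 40), False),
    (datetime(2018, 2, 5, 22, 5), True),    # Monday late evening
    (datetime(2018, 2, 5, 22, 30), True),
    (datetime(2018, 2, 10, 9, 20), False),  # Saturday morning
    (datetime(2018, 2, 10, 22, 10), True),  # Saturday late evening
    (datetime(2018, 2, 10, 22, 45), False),
    (datetime(2018, 2, 12, 9, 5), False),   # Monday morning
]


def breach_rate(records):
    """Overall proportion of attendances that breached."""
    records = list(records)
    return sum(1 for _, breached in records if breached) / len(records)


def breach_rate_by_day_and_hour(records):
    """Proportion of breaches for each (day of week, hour of arrival) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [breaches, attendances]
    for arrived, breached in records:
        cell = (DAY_NAMES[arrived.weekday()], arrived.hour)
        counts[cell][0] += int(breached)
        counts[cell][1] += 1
    return {cell: b / n for cell, (b, n) in counts.items()}


monthly = breach_rate(attendances)
by_cell = breach_rate_by_day_and_hour(attendances)

print(f"Whole month: {monthly:.0%} breached")
for (day, hour), rate in sorted(by_cell.items()):
    print(f"{day} {hour:02d}:00  {rate:.0%} breached")
```

In this made-up data the month as a whole looks moderate, while the day and hour cells range from no breaches at all to every attendance breaching. That spread is the kind of variation a heatmap by hour and day would surface and a monthly average would not.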
Below is an example of how a fairly simple graphic can represent a number of different dimensions. The x and y axes represent individual values relating to risk. The third dimension is the use of colour to represent different categories. Size, the fourth dimension, is used to provide an indication of an adjusted 'score'. Combined, these latter two make it easy to determine whether particular cohorts fall within certain thresholds. The fifth element is the value shown within each bubble, which helps to identify a particular component.
The third example below uses a combination of run charts, bar charts and thematic colouring. This viz was designed to answer questions about the use of endoscopy lists (a list is a four-hour period during which endoscopies are undertaken) from the perspective of activity. The analysis addressed the problem from two perspectives. The first was to look at the overall average number of points per list on a weekly basis to see if this was changing. This suggested that the average was increasing over the period; however, the annotations provide further points to consider in terms of the linear trend. The second perspective was to look at the profile of the number of lists with 'x' points per list on a monthly basis (the switch from weekly to monthly was required due to data volumes). This showed that in Feb-18 there was a small change in the distribution, which warranted further investigation.
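The two perspectives described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, their values and the function names are hypothetical, weeks are taken to be ISO weeks, and "points" is treated as a simple number recorded against each list.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import mean

# Hypothetical endoscopy lists: (date of the list, points on that list)
endoscopy_lists = [
    (date(2018, 1, 8), 10),
    (date(2018, 1, 10), 12),
    (date(2018, 1, 16), 11),
    (date(2018, 2, 6), 12),
    (date(2018, 2, 7), 8),
    (date(2018, 2, 14), 12),
]


def weekly_average_points(records):
    """Perspective one: average points per list for each (ISO year, ISO week)."""
    by_week = defaultdict(list)
    for day, points in records:
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(points)
    return {week: mean(values) for week, values in sorted(by_week.items())}


def monthly_points_distribution(records):
    """Perspective two: for each (year, month), how many lists had 'x' points."""
    by_month = defaultdict(Counter)
    for day, points in records:
        by_month[(day.year, day.month)][points] += 1
    return {
        month: dict(sorted(counter.items()))
        for month, counter in sorted(by_month.items())
    }


weekly = weekly_average_points(endoscopy_lists)
monthly = monthly_points_distribution(endoscopy_lists)

for week, average in weekly.items():
    print(week, average)
for month, distribution in monthly.items():
    print(month, distribution)
```

The weekly averages would feed a run chart and the monthly distributions a set of bar charts. Keeping both views side by side matters because an average can move for very different reasons, and the distribution shows which lists are driving the change.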
The next example looks at bed census. The underlying question was about understanding how different cohorts of patients were occupying beds on a six-hourly basis over an extended period of time. Typically this information provides useful reference material and can help to identify which cohorts of patients may need to be targeted. By including a reference line showing the average within each cohort, we're able to quickly see that a significant number of beds are occupied by older patients (300 beds occupied by patients aged 65+) and long stay patients (>150 beds occupied by patients with an acute length of stay of more than 3 weeks). In both cases there is clearly variation across the period which could be investigated further by using different views on the data.
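For readers who have not built a bed census before, the following is a minimal Python sketch of one way to derive six-hourly occupancy by cohort, plus the average used for a reference line. Everything in it is an assumption made for illustration: the stays are invented, a bed is counted as occupied when the census time falls on or after admission and before discharge, and the only cohort split shown is age 65+ versus under 65.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical stays: (admitted, discharged, age at admission)
stays = [
    (datetime(2018, 2, 1, 3, 0), datetime(2018, 2, 1, 20, 0), 72),
    (datetime(2018, 2, 1, 7, 0), datetime(2018, 2, 2, 10, 0), 45),
    (datetime(2018, 2, 1, 13, 0), datetime(2018, 2, 2, 1, 0), 81),
]


def census_times(start, end, step_hours=6):
    """Yield census points from start to end inclusive, every step_hours."""
    current = start
    while current <= end:
        yield current
        current += timedelta(hours=step_hours)


def bed_census(records, start, end, step_hours=6):
    """Count occupied beds per cohort at each census point."""
    census = {}
    for moment in census_times(start, end, step_hours):
        counts = {"65+": 0, "under 65": 0}
        for admitted, discharged, age in records:
            if admitted <= moment < discharged:  # in a bed at the census time
                counts["65+" if age >= 65 else "under 65"] += 1
        census[moment] = counts
    return census


census = bed_census(
    stays, datetime(2018, 2, 1, 0, 0), datetime(2018, 2, 2, 0, 0)
)

# Average occupancy for one cohort, i.e. the value a reference line would show
average_65_plus = mean(counts["65+"] for counts in census.values())

for moment, counts in census.items():
    print(moment.strftime("%d %b %H:%M"), counts)
print("Average beds occupied by 65+:", average_65_plus)
```

A long stay cohort could be added in the same way by comparing the census time with the admission time, although the threshold and the definition of an acute stay would need to match whatever is used locally.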
The colour schemes have been used to make the visualisations more appealing. The selected colour scheme should also be used consistently across any supporting analysis.
In summary, there is no single 'right' way to visualise data, but there are certainly many wrong ways to do it. This article has provided a number of considerations that you may want to adopt when visualising data. The most important considerations from my perspective are the ease with which the viz can be interpreted and whether it draws people in, providing a clear and unambiguous storyline.