
Stephen Few: Information Dashboard Design

Readers:

I am in Portland, Oregon this week attending three data visualization workshops by industry expert Stephen Few. I am very excited to be sitting at the foot of the master for three days, soaking in all of this great dataviz information.

Today was the second workshop, Information Dashboard Design, which is based on Steve’s best-selling book (see photo below).

To avoid giving away too much of what Steve is teaching in the workshops, I have decided to discuss one of the dashboard exercises we did in class. The goal was to identify what we felt was wrong with the dashboard.

I will show you the dashboard first. Then, you can see our critique below.

You can find future workshops by Steve on his website, Perceptual Edge.

Best Regards,

Michael

Information Dashboard Design

 

Dashboard To Critique

CORDA Airlines Dashboard

Critique Key Points

  • Top left chart – Only the top left chart has anything to do with flight loading.
  • Top left chart – Are the flight numbers useful?
  • Two Expand/Print buttons – Need more clarity (a right-click option on the chart would be a better choice).
  • Top right chart – Poor use of pie charts; the size of each pie is meant to convey the largest sales channel. Small multiple bar charts would be better, with total sales as a fourth bar chart.
  • Redundant use of “February” – It appears in both the title and the charts.
  • Bottom left chart – Why does it contain a pie chart?
  • Bottom right chart – The map may be better as a bar chart (a geographical display could be useful if we had more information). The way the bubbles are currently expressed is not useful (use % cancellations instead), and the symbols may have a different meaning every day.
  • Bottom right chart – Is the CORDAir logo necessary?
  • Location of the drop-down – It is not clear whether it applies to the top left chart or to all charts.
  • Backgrounds – Heavy colors and gradients.
  • Instructions – These should be in a separate help document; users only need to learn them once.
  • Top left chart – A faint image in the background, supposed to look like a flight seating map. Do you really want to see this every day? It is a visual distraction.
  • IMPORTANT – Is there visual context offered with any of the graphs? No. This is critical.

————————————————————————————————-

Dashboard Example Source: Website of Corda Technologies Incorporated, which has since been acquired by Domo.

The Lean UX Manifesto – Principle Driven Design

Lean UX Manifesto

Who Wrote The Manifesto?

This was a joint effort between Anthony Viviano, Ajay Revels and Ha Phan.

Anthony and Ajay work at a large financial institution and are trying to apply Lean UX within their enterprise. Ha works at a startup called Porch.

Why a Manifesto?

So, why a manifesto? Anthony, Ajay and Ha were inspired by the Agile Manifesto. Anthony stated that it is simple and to the point. It’s not a list of rules, but a value statement that can be used to guide you through a project or an organizational change. It’s tempting to lay down rules, as if to say, “this list of methods is required to practice Lean UX. Check these boxes in your process and you can brand this a Lean UX project.” We don’t like rules. We prefer principles that drive the methods needed.

Lean UX applies well to uncertainty, but not everything is uncertain. You may know your customer, so you can breeze through customer development. Or, you may already have a design, so a design studio is not needed.

In addition to their anti-rules stance, there’s another reason why a manifesto makes sense. Anthony heard a few practitioners say that only a startup can apply this process in its purest form. While that might be true, enterprise entrepreneurs (a.k.a. intrapreneurs) shouldn’t be excluded from this great thinking. We can take advantage of it by doing what we can to customize it to our unique culture and structure.

Anthony, Ajay and Ha hope you allow these values to guide you through your Lean UX journey.

The Lean UX Manifesto

Anthony, Ajay and Ha are developing a way to create digital experiences that are valued by end users. Through this work, they hold the following in high regard:

  • Early customer validation over releasing products with unknown end-user value
  • Collaborative design over designing on an island
  • Solving user problems over designing the next “cool” feature
  • Measuring KPIs over undefined success metrics
  • Applying appropriate tools over following a rigid plan
  • Nimble design over heavy wireframes, comps or specs

As stated in the Agile Manifesto, “While there is value in the items on the right, we value the items on the left more.”

How The Lean UX Manifesto Works

Let’s take each of these in turn and see how we can follow the principles of lean UX.

Early Customer Validation Over Releasing Products With Unknown User Value

What if you worked at a company where usability testing just wasn’t done? Unfortunately, this is the sad state in which many of our fellow UX practitioners find themselves. How, then, do they follow the principles of lean UX?

With usability testing, we seek customer validation or early failure. Customer validation may be sought through other means as well. For example, does your company gather feedback from users? If that feedback is circulated, are you on the list of people who receive it?

Here are other sources of learning about customer needs:

  • Customer service representatives
    Their focus is on helping customers overcome experience issues. Try to speak to them regularly. They are likely documenting their calls, so see whether you can create some system for tagging experience issues that you can follow up on.
  • Sales representatives
    This is another group that is focused on the customer. They will understand what customer problems are out there to be solved. They’ll also know which features are important and which innovations customers want.
  • Website search data
    This is an invaluable source of customer desires. Search data can uncover website navigation problems and features or problems that customers are looking for.

Salespeople and customer service representatives are great sources of learning about customer needs. (Image: Renato Ganoza)

Collaborative Design Over Designing on an Island

Design should not be a solo exercise. Being a design team of one is no excuse. Anthony uses the design studio process and adopts the role of facilitator. Gather team members who own a piece of the project, and host a design studio workshop. Include at least the following people (adjusting to suit your unique organization):

  • Domain owner
    Your subject matter expert
  • Requirements Owner
    A business analyst or the person who gathers and writes the requirements
  • Data provider
    A data analyst on hand who is familiar with the analytics and can pull the info you need
  • Technology owner
    A developer, someone who understands the technology constraints and design patterns
  • Product or business owner
    A product manager or the person who owns this piece of business
  • Designer
    The UX or visual designer or person who owns the design and can facilitate the design studio
  • Researcher
    The usability analyst or UX researcher or person who owns customer development and persona creation

Solving User Problems Over Designing the Next Cool Feature

When you’re handed a requirements document, a thought-out solution, a feature, a brief or whatever artifact you receive to inform your work, begin by asking, “What problem are we trying to solve?” Ideally, you should clearly understand the customer’s problem. Design is problem-solving, so if you don’t know the problem, you can’t design a solution. If you do this enough, then the stakeholders will understand that you’re more than just a wireframe jockey. You’re a professional problem-solver with a system for creating solutions that make sense.

Measuring KPIs Over Undefined Success Metrics

You can’t measure success if you aren’t… er, measuring. Avoid vanity metrics. Anthony loves Dave McClure’s pirate metrics:

  • Acquisition
    Users come to the website from various channels.
  • Activation
    Users enjoy their first visit (a “happy” user experience).
  • Retention
    Users come back, visiting multiple times.
  • Referral
    Users like the product enough to refer others.
  • Revenue
    Users conduct some monetization behavior.

Applying Appropriate Tools Over Following a Rigid Plan

Lean UX should be a flexible process. As Anthony started to develop the process steps for one cycle, he found himself overwhelmed with the number of tools being recommended. Anthony’s advice, similar to what he had said when creating a minimum viable product, is to apply the minimum tools required to get you to “pivot” or “persevere.”

Here are a few tools that Anthony found useful (not an exhaustive list):

  • provisional personas, right sized for the effort;
  • persona map (which we learned from Menlo Innovations);
  • assumptions, with the riskiest identified;
  • design studio;
  • paper prototyping in early stages;
  • digital prototyping (HTML preferred) in later stages;
  • guerilla design assessment (a better name for usability testing);
  • co-location wherever possible.

The design studio method is popular for collaborative design. (Image: visualpun.ch)

Everything else should be applied as it makes sense. For example, if more customer development is needed, then take the time to interview as a team and to internalize customer needs. The lean startup world has no shortage of tools. Use only the ones that make sense to your project and that get you to a validated solution faster.

Nimble Design Over Heavy Wireframes, Comps or Specs

The goal is to release a product. Once it’s released, users won’t interact with the wireframes or requirements document as part of the product. They will interact with the product itself. So, try to spend less time on your design artifacts.

How can you lighten your wireframes?

  • Lighter annotations and more presentation
    Anthony found that if he took the time to present his unfinished wireframes to stakeholders, he would get valuable feedback sooner and save time.
  • Co-design
    If developers, quality assurance testers and business analysts are involved in the design, then they will share ownership and internalize the annotations. When this happens, you can pass off sketches as wireframes because team members will already understand the interactions.
  • Paper prototypes
    These serve a dual purpose. They get you to design validation (i.e. usability testing) sooner, but they also demonstrate the interactions. No need to write detailed wire annotations if the user can see the interactions firsthand.

It’s All About Principle-Driven Design

This all boils down to something Anthony calls principle-driven design. As stated, some lean UX is better than none, so applying these principles as best you can will get you to customer-validated, early-failure solutions more quickly. Rules are for practitioners who don’t really know the value of this process, while principles demand wisdom and maturity.

By allowing principles to drive you, you’ll find that you’re more nimble, reasonable and collaborative. Really, you’ll be overall better at getting to solutions. This will please your stakeholders and team members from other disciplines (development, visual design, business, etc.). To quote the late Stephen Covey:

“There are three constants in life: change, choice and principles.”

——————————————————

Sources:

[1] Anthony Viviano, Ajay Revels and Ha Phan, The Lean UX Manifesto, http://www.leanuxmanifesto.com/.

[2] Anthony Viviano, The Lean UX Manifesto: Principle-Driven Design, Smashing magazine, January 8, 2014, http://www.smashingmagazine.com/2014/01/08/lean-ux-manifesto-principle-driven-design/.

 

Fast Company: Is Flat Design Already Passé?

Source: John Brownlee, Is Flat Design Already Passé?, Fast Company, Co.DESIGN, April 11, 2014, http://www.fastcodesign.com/3028944/is-flat-design-already-passe.

Blog Note: A skeuomorph is a derivative object that retains ornamental design cues from structures that were necessary in the original. Examples include pottery embellished with imitation rivets reminiscent of similar pots made of metal, and a software calendar that imitates the appearance of binding on a paper desk calendar (see image to the right).

Over the last few years, we’ve seen an upheaval in the way computer interfaces are designed. Skeuomorphism is out, and flat is in. But have we gone too far? Perhaps we’ve taken the skeuomorphic purge as far as it can go, and it’s high time we usher in a new era of post-flat digital design.

John Brownlee

Ever since the original Macintosh redefined the way we interact with computers by creating a virtual desktop, computer interfaces have largely been skeuomorphic by mimicking the look of real-world objects. In the beginning, computer displays were limited in resolution and color, so the goal was to make computer interfaces look as realistic as possible. And since most computer users weren’t experienced yet, skeuomorphism became a valuable tool to help people understand digital interfaces.

But skeuomorphism didn’t make sense once photo-realistic computer displays became ubiquitous. Computers have no problem tricking us into thinking that we’re looking at something real so we don’t need to use tacky design tricks like fake stitching or Corinthian leather to fool us into thinking our displays are better than they are. In addition, most people have grown up in a world where digital interfaces are common. UI elements don’t have to look like real-world objects anymore for people to understand them.

This is why Jony Ive took over the design of Apple’s iOS and OS X operating systems and began a relentless purge of the numerous skeuomorphic design elements that his predecessor, skeuomaniac Scott Forstall, created. To quote Fast Company’s own John Pavlus, “skeuomorphism is a solution to a problem that iOS no longer has,” and that’s true of other operating systems and apps too. Google, Microsoft, Twitter, Facebook, Dropbox, even Samsung, they’re all embracing flat design, throwing out the textures and gradients that once defined their products, in favor of solid hues and typography-driven design.

This is, without a doubt, a good thing. Skeuomorphism led to some exceedingly one-dimensional designs, such as iOS 6’s execrable billiard-style Game Center design. But in an excellent post, Collective Ray designer Wells Riley argues that things are going too far.

Flat design is essentially as far away from skeuomorphism as you can get. Compare iOS 7’s bold colors, unadorned icons, transparent overlays, and typography-based design to its immediate predecessor, iOS 6. Where once every app on iOS had fake reflections, quasi-realistic textures, drop shadows, and pseudo-3-D elements, iOS 7 uses pure colors, no gradients, no shadows, and embraces the 2-D nature of a modern smartphone display. But while flat design has made iOS 7 look remarkably consistent, it has also removed a lot of personality from the operating system. It’s like the Gropius house, whereas the old iOS 6 was a circus funhouse. Maybe it needs to get a little bit of that sense of madness back.

Here’s how Riley defines elements of a post-flat interface:

• Hierarchy defined using size and composition along with color.

• Affordant buttons, forms, and interactive elements.

• Skeuomorphs to represent 1:1 analogs to real-life objects (the curl of an e-book page, for example) in the name of user delight.

• Strong emphasis on content, not ornamentation.

• Beautiful, readable typography.

 

Riley’s argument is that flat design has allowed digital designers to wipe the slate clean in terms of how they approach their work, but it has also hindered a sense of wonder and whimsy. Software should still strongly emphasize content, color, and typography over ornamentation, but why is, say, the curl of a page when you’re reading an e-book such a crime, when it so clearly gives users delight?

“Without strict visual requirements associated with flat design, post-flat offers designers tons of variety to explore new aesthetics—informed by the best qualities of skeuomorphic and flat design,” Riley writes. “Dust off your drop shadows and gradients, and introduce them to your flat color buttons and icons. Do your absolute best work without feeling restricted to a single aesthetic. Bring variety, creativity, and delight back to your interfaces.”

Maybe Riley has a point. Why should mad ol’ Scott Forstall be allowed to ruin skeuomorphism for everyone? With the lightest of brush strokes, skeuomorphism can be used to bring a sense of personality and joy back to our apps. For those of us growing listless in the wake of countless nearly identical “flat” app designs, he makes a good point. It is time for the pendulum to swing back from flat toward skeuomorphism, if only a little bit.

An Introduction to Data Blending – Part 4 (Data Blending Design Principles)

Readers:

In Part 3 of this series on data blending, we examined the benefits of blending data. We also reviewed an example of data blending that illustrated the possible outcomes of an election for the District 2 Supervisor of San Francisco.

Today, in Part 4 of this series, we will discuss data blending design principles and show another illustrative example of data blending using Tableau.

Again, much of Parts 1, 2, 3 and 4 are based on a research paper written by Kristi Morton from The University of Washington (and others) [1].

You can learn more about Ms. Morton’s research as well as other resources used to create this blog post by referring to the References at the end of the blog post.

Best Regards,

Michael

Data Blending Design Principles

In Part 3, we described the primary design principles upon which Tableau’s data blending feature was based. These principles were influenced by the application needs of Tableau’s end-users. In particular, we designed the blending system to integrate datasets on-the-fly, be responsive to change, and be driven by the visualization. Additionally, we assumed that the user may not know exactly what she is looking for initially, and needs a flexible, interactive system that can handle exploratory visual analysis.

Push Computation to Data and Minimize Data Movement

Tableau’s approach to data visualization allows users to leverage the power of a fast database system. Tableau’s VizQL algebra is a declarative language for succinctly describing visual representations of data and analytics operations on the data. Tableau compiles the VizQL declarative formalism representing a visual specification into SQL or MDX and pushes this computation close to the data, where the fast database system handles computationally intensive aggregation and filtering operations. In response, the database provides a relatively small result set for Tableau to render. This is an important factor in Tableau’s choice of post-aggregate data integration across disparate data sources – since the integrated result sets must represent a cognitively manageable amount of information, the data integration process operates on small amounts of aggregated, filtered data from each data source. This approach avoids the costly migration effort to collocate massive data sets in a single warehouse, and continues to leverage fast databases for performing expensive queries close to the data.
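As a minimal illustration of this principle, here is a hedged Python sketch in which an in-memory SQLite database stands in for the fast database system that Tableau would target with generated SQL or MDX. The table, columns, and values are invented for the example; the point is only that the aggregation runs close to the data and the client receives a small result set.

```python
import sqlite3

# An in-memory SQLite database stands in for the fast database system;
# the table and values are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("East", 250.0), ("West", 75.0)])

# The aggregation and filtering are pushed close to the data; the client
# receives only a small, already-aggregated result set to render.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('East', 350.0), ('West', 75.0)]
```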

Automate as Much as Possible, but Keep User in Loop

Tableau’s primary focus has been on ease of use, since most of Tableau’s end-users are not database experts but come from a variety of domains and disciplines: business analysts, journalists, scientists, students, etc. This led them to take a simple, pay-as-you-go integration approach in which the user invests minimal upfront effort or time to receive the benefits of the system. For example, the data blending system does not require the user to specify schemas for their data sets; rather, the system tries to infer this information as well as how to apply schema-matching techniques to blend them for a given visualization. Furthermore, the system provides a simple drag-and-drop interface for the user to specify the fields for a visualization, and if there are fields from multiple data sources in play at the same time, the blending system infers how to join them to satisfy the needs of the visualization.

If something goes wrong, for example if the schema matching cannot succeed, the blending system provides a simple interface for specifying data source relationships and how blending should proceed. Additionally, the system provides several techniques for managing the impact of dirty data on blending, which we discuss in more detail in Part 5 of this series.
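To make the post-aggregate idea concrete, here is a hypothetical Python sketch (not Tableau’s actual implementation): each source is aggregated independently, and only then are the small result sets joined on a field whose name matches in both sources. All names and numbers are invented.

```python
# Hypothetical sketch of post-aggregate blending: each source is
# aggregated on its own, then the small aggregated result sets are
# joined on a field common to both sources ("state" here).
primary = [("CA", 120), ("CA", 30), ("WA", 50)]    # (state, sales)
secondary = [("CA", 2), ("WA", 1), ("WA", 4)]      # (state, returns)

def aggregate(rows):
    """Sum values per key, like a GROUP BY pushed down to each source."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals

# Left join from the aggregated primary onto the aggregated secondary.
secondary_totals = aggregate(secondary)
blended = {state: (sales, secondary_totals.get(state))
           for state, sales in aggregate(primary).items()}
print(blended)  # {'CA': (150, 2), 'WA': (50, 5)}
```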

Another Example: Patient Falls Dashboard [3]

NOTE: The following example is from Jonathan Drummey via the Drawing with Numbers blog site. The example uses Tableau v7, but at the end of the instructions on how he creates this dashboard in Tableau v7, Mr. Drummey includes instructions on how the steps became simplified in Tableau v8. I have included a reference to this blog post on his site in the reference section of my blog entry. The “I”, “me” voice you read in this example is that of Mr. Drummey.

As part of improving patient safety, we track all patient falls in our healthcare system, along with the number of patient days – the total number of days of inpatient stays at the hospital. Every month we report to the state our “fall rate,” a metric of the number of falls with injury for certain units in the hospital per 1,000 patient days, i.e. days that patients are at the hospital. Our annualized target is to have less than 0.7 falls with injury per 1,000 patient days.

A goal for our internal dashboard is to show the last 13 months of fall rates as a line chart, with the most recent fall events as a bar chart, in a combined chart, along with a separate text table showing some details of each fall event. Here’s the desired chart, with mocked-up data:

 

[Chart: combined bar and line chart of fall events and fall rate, with mocked-up data]

On the surface, blending this data seems really straightforward. We generate a falls rate every month for every reporting unit, so use that as the primary, then blend in the falls as they happen. However, this has the following issues:

  • Sparse Data – As I’m writing this, it’s March 7th. We usually don’t get the denominator of the patient days for the prior month (February) for a few more days yet, so there won’t be any February row of measure data to use as the primary to get the February fall events to show on the dashboard. In addition, there still wouldn’t be any March data to get the March fall events. Sometimes when working with a blend, the solution is to flip our choices for the primary and secondary data source. However, that doesn’t work either, because a unit might go for months or years without a patient fall, so there wouldn’t be any fall events to blend the measure data into.
  • Falls With and Without Injury – In the bar chart, we don’t just want to show the number of patient falls, we want to break down the falls by whether or not they were falls with injury – the numerator for the fall rate metric – and all other falls. The goal of displaying that data is to help the user keep in mind that as important as it is to reduce the number of falls with injury, we also need to keep the overall number of falls down as well. No fall = no chance of fall with injury.
  • Unit Level of Detail – Because the blend needs to work at the per-unit level of detail as well as across all reporting units, that means (in version 7 at least) that the Unit needs to be in the view for the blend to work. But we want to display a single falls rate no matter how many units are selected.

Sparse Data

To deal with issue of sparse data, there are a few possible solutions:

  • Change the combined line and bar chart into separate charts. This would perhaps be the easiest, though it would require some messing about with filters, hidden reference lines, and continuous date axes to ensure that the two charts had similar axis ranges no matter what. However, that would miss out on the key capability of the combined chart to directly see how a fall contributes to the fall rate. In addition, there would be no reason to write this blog post. :)
  • Perform padding in the data source, either via a query/view or Custom SQL. In an earlier version of this project I’d built this, and maintaining a bunch of queries with Cartesian joins isn’t my favorite cup of tea.
  • Building a scaffold data source with all combinations of the month and unit and using the scaffold as the primary data source. While possible, this introduces maintenance issues when there’s a need for additional fields at a finer level of detail. For example, the falls measure actually has three separate fall rates – monthly, quarterly, and annual. These are generated as separate rows in our measures data and the particular duration is indicated by the Period field. So the scaffold source would have to include the Period field to get the data, but then that could be too much detail for the blended fall event data, and make for more complexity in the calculations to make sure the aggregations worked properly.
  • Do a tiny bit of padding in the query, then do the rest in Tableau via Show Missing Values, a.k.a. domain padding. As I’d noted in an earlier post on blending, domain padding occurs before data is blended, so we can pad out the measure data through the current date and then include all the falls. This is the technique I chose, because adding one padding row to the data is trivial and turning on Show Missing Values is a couple of mouse clicks. Here’s how I did that:

In my case, the primary data source is a Microsoft Access query that gets the falls measure results from a table that also holds results for hundreds of other metrics that we track. I created a second query with the same number of columns that returns Null for every field except the Measure Date, which has a value of 1/1/1900. Then a third query UNION’s those two queries together, and that’s what is used as the data source in Tableau.

Then, in Tableau, I added a calculated field called Date with the following formula:

//used for padding out display to today
IF [Measure Date] == #1/1/1900# THEN 
    TODAY() 
ELSE 
    [Measure Date] 
END

The measure results data contains a row per measure, reporting unit, and period. These are pre-calculated because the data is used in a variety of different outputs. Since in this dashboard we are combining the results across units, we can’t just use the rate; we need to go back to the original numerator and denominator. So, I also created a new field for the Calculated Rate:

SUM([Numerator])/SUM([Denominator])
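Why sum the numerators and denominators instead of averaging the per-unit rates? A quick Python sketch with made-up numbers shows the difference: the combined rate weights each unit by its patient days, while a naive average of per-unit rates would overweight the small unit.

```python
# Illustrative numbers only: two units with very different patient-day counts.
units = [
    {"numerator": 1, "denominator": 4000},  # falls with injury / patient days
    {"numerator": 3, "denominator": 1000},
]

# SUM([Numerator]) / SUM([Denominator]), scaled per 1,000 patient days.
combined_rate = (sum(u["numerator"] for u in units)
                 / sum(u["denominator"] for u in units)) * 1000

# Naive average of the per-unit rates, for comparison.
naive_average = sum(u["numerator"] / u["denominator"] * 1000
                    for u in units) / len(units)

print(round(combined_rate, 2))  # 0.8
print(round(naive_average, 2))  # 1.62
```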

Now it’s possible to start building the line chart view:

  1. Put the Month(Date) – the full month/year version as a discrete – on Columns, Calculated Rate on Rows, Period on the Color Shelf. This only shows the data that exists in the data source, including the empty value for the current month (March in this case):

 

[Screenshot: line chart showing only the months that exist in the data]

 

  2. Turn on Show Missing Values for Month(Date) to start domain padding. Now we can see the additional column(s) for the month(s) – February in this case, between January and the current month – that Tableau has added in:

 

[Screenshot: line chart with the missing month (February) padded in]

 

With a continuous (green pill) date, this particular set-up won’t work in version 8: Tableau’s domain padding is not triggered when the last value of the measure is Null. I’m hoping this is just an issue with the beta; I’ll revise this section with an update once I find out what’s going on.

Even though the measure data only has end-of-month dates, instead of using Exact Date for the month I used Month(Date) because of two combined factors: first, the default import of most date fields from MS Jet sources turns them into DateTime fields; second, Show Missing Values won’t work on an Exact Date for a DateTime field – you have to assign an aggregation to a DateTime (even Second will work). This is because domain padding at this level can create an immense number of new rows and cause Tableau to run out of memory, so Tableau keeps the option off unless you want it. Also note that you can turn on Show Missing Values for an Exact Date for a Date field.

  3. Now for some cleanup steps: for the purposes of this dashboard, filter Period to remove Monthly (we do quarterly reporting), but leave in Null because that’s needed for the domain padding.
  4. Right-click Null on the Color Legend and Hide it. Again, we don’t exclude this because that would cause the extra row for the domain padding to fail.
  5. Set up a relative date filter on the Date field for the last 13 months. This filter works just fine with the domain padding.

Filtering on Unit

Here’s a complicating factor: If we add a filter on Unit, there’s a Null listed here:

 

[Screenshot: Unit filter list including a Null entry]

I’d just want to see the list of units. But if we filter that Null out, then we lose the domain padding; the last date is now January 2013:

 

[Screenshot: with Null filtered out, the line chart ends at January 2013]

 

One solution here would be to alter the padding to add a padding row for every unit, instead of just one. However, since Tableau doesn’t let us simply hide elements in a filter, and there are more reporting units in our production data than we display on the dashboards, yet the all-unit rate needs to include all of the data, I chose to use a parameter filter instead. Setting this up included a parameter with All and each of the units, and a calculated field called “Chosen Unit Filter,” set to Filter on False, with the following formula:

[Choose Unit] == "All" OR [Choose Unit] == [Unit]

Falls With and Without Injury

In a fantasy world, to create the desired stacked bars I’d be able to drag the Number of Records from the secondary data source (i.e. the number of fall events) into the view, drop an Injury indicator onto the Color Shelf, and be done. However, that runs into the issue of having a finer level of detail in the secondary than in the primary, which I’ll walk through solutions for in the next section. In this case, since there are only two different numbers, the easy way is to generate two separate measures, then use Measure Names/Measure Values to create the stacked bars – Measure Values on Rows, and Measure Names on the Color Shelf. Here’s the basic calculation for Falls with Injury:

SUM(IF [Injury] != "None" THEN 1 ELSE 0 END)

We’re using a row-level calculated field to generate the measure, and a slightly different calc for Falls w/out Injury.
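In Python terms, the pair of row-level measures amounts to something like the following sketch; the injury field and its values are illustrative stand-ins for the real fall-event data.

```python
# Sketch of the two row-level measures; "injury" values are invented,
# with "None" meaning a fall without injury.
falls = [
    {"injury": "Minor"},
    {"injury": "None"},
    {"injury": "Major"},
    {"injury": "None"},
]

# Mirrors SUM(IF [Injury] != "None" THEN 1 ELSE 0 END) ...
falls_with_injury = sum(1 for f in falls if f["injury"] != "None")
# ... and the slightly different calc for Falls w/out Injury.
falls_without_injury = sum(1 for f in falls if f["injury"] == "None")

print(falls_with_injury, falls_without_injury)  # 2 2
```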

Unit Level of Detail

When we want to blend in Tableau at a finer level of detail and aggregate to a higher level, historically there have been three options:

  • Don’t use blending at all; instead, use a query to perform the “blend” outside of Tableau. When the data lives in totally different data sources this is more difficult, but not impossible: use one of the systems (or a third system) to create a federated data source, for example by adding your Oracle table as an ODBC connection to your Excel data and then querying that. In this case, we don’t have to do that.
  • Use Tableau’s Primary Groups feature to “push” the detail from the secondary into the primary data source. This is a really helpful feature; the one drawback is that it’s not dynamic, so any time there are new groupings in the secondary it would have to be re-run. Personally, I prefer automating as much as possible, so I tend not to use this technique.
  • Set up the view with the needed dimensions in the view – on the Level of Detail Shelf, for example – and then use table calculations to do the aggregation. This is how I’ve typically built this kind of view.

Tableau version 8 adds a fourth option:

  • Tell Tableau what fields to blend on, then bring in your measures from the secondary.
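As a concrete illustration of the first option (performing the “blend” with a query outside of Tableau), here’s a minimal sketch using Python’s built-in sqlite3 module. The table and column names are invented stand-ins for the census and fall-event sources, not the actual production schema:

```python
import sqlite3

# Hypothetical tables standing in for the two data sources: patient-day
# census at the month/unit level, and individual fall events (finer grained).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE census (month TEXT, unit TEXT, patient_days INTEGER);
CREATE TABLE falls  (month TEXT, unit TEXT, injury TEXT);
INSERT INTO census VALUES ('2012-04','2 East',900), ('2012-04','ICU',300);
INSERT INTO falls  VALUES ('2012-04','2 East','None'),
                          ('2012-04','2 East','Minor'),
                          ('2012-04','ICU','None');
""")

# Perform the "blend" in SQL: aggregate each source up to the month level
# first, then join the two aggregates, so nothing gets duplicated.
query = """
SELECT c.month,
       c.patient_days,
       IFNULL(f.fall_count, 0) AS fall_count
FROM   (SELECT month, SUM(patient_days) AS patient_days
        FROM census GROUP BY month) AS c
LEFT JOIN (SELECT month, COUNT(*) AS fall_count
           FROM falls GROUP BY month) AS f
       ON c.month = f.month
"""
for row in con.execute(query):
    print(row)  # ('2012-04', 1200, 3)
```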

I’ll walk through the table calculation technique, which works the same in version 7 and version 8, and then how to take advantage of v8’s new feature.

Using Table Calculations to Aggregate Blended Data

In order to blend in the falls data at the hospital unit level and make sure that we’re only showing falls for the selected unit(s), the Unit has to be in the view (on the Rows, Columns, or Pages Shelves, or on the Marks Card). Since we don’t actually need to display the Unit, the Level of Detail Shelf is where we’ll put that dimension. However, just adding it to the view leads to a bar for each unit; for example, for April 2012 one unit had one fall with injury and another had two, and two units each had two falls without injury.

 

Falls measures with a separate bar for each unit within each month

 

To control things like tooltips (along with performance in some cases), it’s a lot easier to have a single bar for each month/measure. To do that, we turn to a table calculation. Here’s the Falls w/Injury for v7 Blend calculated field, set up in the secondary data source:

IF FIRST()==0 THEN
	TOTAL([Falls w/Injury])
END

This table calculation has a Compute Using of Unit, so it partitions on the Month of Date. The IF FIRST()==0 part ensures that there is only one mark per partition. I’m using the TOTAL() aggregation here because it’s easier to set up and maintain. The alternative is to use WINDOW_SUM(), but in Tableau prior to version 8 there are some performance issues with it, so the calc would be:

IF FIRST()==0 THEN
	WINDOW_SUM(SUM([Falls w/Injury]), 0, IIF(FIRST()==0,LAST(),0))
END

The , 0, IIF(FIRST()==0,LAST(),0) part is necessary in version 7 to optimize performance; you can get rid of it in version 8.

You can also do a table calculation in the primary that accesses fields in the secondary; however, TOTAL() can’t be used across blended data sources, so you’d have to use the WINDOW_SUM() version.
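The mechanics of the IF FIRST()==0 THEN TOTAL(...) END pattern can be sketched in plain Python: partition the rows, compute one total per partition, and emit a value only for the partition’s first row. The month/unit data here is made up for illustration:

```python
from itertools import groupby

# Hypothetical (month, unit, falls_with_injury) rows, sorted by month.
rows = [("2012-04", "2 East", 1), ("2012-04", "ICU", 2),
        ("2012-05", "2 East", 0), ("2012-05", "ICU", 1)]

# Compute Using is Unit, so Unit is what the calc moves along and month is
# the partition. Mimic IF FIRST()==0 THEN TOTAL([Falls w/Injury]) END:
marks = []
for month, group in groupby(rows, key=lambda r: r[0]):
    group = list(group)
    total = sum(r[2] for r in group)             # TOTAL() across the partition
    for i, r in enumerate(group):
        marks.append(total if i == 0 else None)  # only FIRST()==0 gets a mark

print(marks)  # [3, None, 1, None] -> one visible mark per month
```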

With a second table calculation for the Falls w/out Injury, now the view can be built, starting with the line chart from above:

  1. Add Measure Names (from the Primary) to Filters Shelf, filter it for a couple of random measures.
  2. Put Measure Values on the Rows Shelf.
  3. Click on the Measure Values pill on Rows to set the Mark Type to Bar.
  4. Drag Measure Names onto the Color Shelf (for the Measure Values marks).
  5. Drag Unit onto the Level of Detail Shelf (for the Measure Values marks).
  6. Switch to the Secondary to put the two Falls for v7 Blend calcs onto the Measure Values Shelf.
  7. Set their Compute Usings to Unit.
  8. Remove the 2 measures chosen in step 1.
  9. Clean up the view – turn on dual axes, move the secondary axis marks to the back, change the axis tick marks to integers, set axis titles, etc.

This is pretty cool: we’re using domain padding to fill in for non-existent data, and the blend happens at one level of detail while aggregating to another, just for the second axis. Here’s the v7 workbook on Tableau Public:

Patient Falls Dashboard - Click on Image to go to Tableau Public


Tableau Version 8 Blending – Faster, Easier, Better

For version 8, Tableau made it possible to blend data without requiring the linking fields in the view. Here’s how I build the above v7 view in v8:

  1. Add Measure Names (from the Primary) to Filters Shelf, filter it for a couple of random measures.
  2. Put Measure Values on the Rows Shelf.
  3. Click on the Measure Values pill on Rows to set the Mark Type to Bar.
  4. Drag Measure Names onto the Color Shelf (for the Measure Values marks).
  5. Switch to the Secondary and click the chain link icon next to Unit to turn on blending on Unit.
  6. Drag the Falls w/Injury and Falls w/out Injury calcs onto the Measure Values Shelf.
  7. Remove the 2 measures chosen in step 1.
  8. Clean up the view – turn on dual axes, move the secondary axis marks to the back, change the axis tick marks to integers, set axis titles, etc.

The results will be the same as in v7.

Next: Tableau’s Data Blending Architecture

———————————————————————————-

References:

[1] Kristi Morton, Ross Bunker, Jock Mackinlay, Robert Morton, and Chris Stolte, Dynamic Workload Driven Data Integration in Tableau, University of Washington and Tableau Software, Seattle, Washington, March 2012, http://homes.cs.washington.edu/~kmorton/modi221-mortonA.pdf.

[2] Hans Rosling, Wealth & Health of Nations, Gapminder.org, http://www.gapminder.org/world/.

[3] Jonathan Drummey, Tableau Data Blending, Sparse Data, Multiple Levels of Granularity, and Improvements in Version 8, Drawing with Numbers, March 11, 2013, http://drawingwithnumbers.artisart.org/tableau-data-blending-sparse-data-multiple-levels-of-granularity-and-improvements-in-version-8/.

 

An Introduction to Data Blending – Part 3 (Benefits of Blending Data)

Readers:

In Part 2 of this series on data blending, we delved deeper into understanding what data blending is. We also examined how data blending is used in Hans Rosling’s well-known Gapminder application.

Today, in Part 3 of this series, we will dig even deeper by examining the benefits of blending data.

Again, much of Parts 1, 2, and 3 is based on a research paper written by Kristi Morton of the University of Washington (and others) [1].

You can learn more about Ms. Morton’s research as well as other resources used to create this blog post by referring to the References at the end of the blog post.

Best Regards,

Michael

Benefits of Blending Data

In this section, we will examine the advantages of using the data blending feature for integrating datasets. Additionally, we will review another illustrative example of data blending using Tableau.

Integrating Data Using Tableau

In Ms. Morton’s research, Tableau is described as having two ways of integrating data. First, in the case where the data sets are collocated (or can be collocated), Tableau formulates a query that joins them to produce a visualization. However, in the case where the data sets are not collocated (or cannot be collocated), Tableau federates queries to each data source and creates a dynamic, blended view that consists of the joined result sets of the queries. For the purpose of exploratory visual analytics, Ms. Morton et al. found that data blending is a complementary technology to the standard collocated approach, with the following benefits:

  • Resolves many data granularity problems
  • Resolves collocation problems
  • Adapts to needs of exploratory visual analytics

Figure 1 - Company Tables

Image: Kristi Morton, Ross Bunker, Jock Mackinlay, Robert Morton, and Chris Stolte, Dynamic Workload Driven Data Integration in Tableau. [1]

Resolves Data Granularity Problems

Oftentimes a user wants to combine data that may not be at the same granularity (i.e., they have different primary keys). For example, let’s say that an employee at company A wants to compare the yearly growth of sales to a competitor, company B. The dataset for company B (see Figure 1 above) contains detailed quarterly sales for B (quarter, year is the primary key), while company A’s dataset only includes yearly sales (year is the primary key). If the employee simply joins these two datasets on year, then each row from A will be duplicated for each quarter in B for a given year, resulting in an inaccurate overestimate of A’s yearly sales.

This duplication problem can be avoided if, for example, company B’s sales dataset is first aggregated to the level of year and then joined with company A’s dataset. In this case, data blending detects that the data sets are at different granularities by examining their primary keys, and notes that in order to join them the common field is year. To join them on year, an aggregation query is issued to company B’s dataset, which returns B’s sales aggregated up to the yearly level, as shown in Figure 1. This result is blended with company A’s dataset to produce the desired visualization of yearly sales for companies A and B.

The blending feature does all of this on the fly, without user intervention.
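Here’s a minimal sketch of that aggregate-then-join behavior using Python’s built-in sqlite3 module. The table names and numbers are invented for illustration, not taken from Ms. Morton’s paper:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Company A reports yearly sales; company B reports quarterly sales.
CREATE TABLE a_sales (year INTEGER, sales INTEGER);
CREATE TABLE b_sales (year INTEGER, quarter INTEGER, sales INTEGER);
INSERT INTO a_sales VALUES (2011, 400);
INSERT INTO b_sales VALUES (2011, 1, 100), (2011, 2, 110),
                           (2011, 3, 120), (2011, 4, 130);
""")

# Naive join on year: A's single yearly row is duplicated once per quarter,
# so summing A's sales afterwards overestimates it (4 x 400 = 1600).
naive = con.execute("""
    SELECT SUM(a.sales) FROM a_sales a JOIN b_sales b ON a.year = b.year
""").fetchone()[0]
print(naive)  # 1600

# Blending-style: aggregate B up to the year level first, then join,
# so each company's yearly sales appear exactly once.
blended = con.execute("""
    SELECT a.year, a.sales AS a_sales, b.sales AS b_sales
    FROM a_sales a
    JOIN (SELECT year, SUM(sales) AS sales FROM b_sales GROUP BY year) b
      ON a.year = b.year
""").fetchall()
print(blended)  # [(2011, 400, 460)]
```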

Resolves Collocation Problems

As mentioned in Part 1, maintaining a managed data repository is expensive and often untenable. In other cases, the data repository may have a rigid structure, as with cubes, to ensure performance, support security, or protect data quality. Furthermore, it is often unclear whether it is worth the effort of integrating an external data set that has uncertain value. The user may not know until she has started exploring the data whether it has enough value to justify spending the time to integrate and load it into her repository.

Thus, one of the paramount benefits of data blending is that it allows the user to quickly start exploring their data, and as they explore the integration happens automatically as a natural part of the analysis cycle.

An interesting final benefit of the blending approach is that it enables users to seamlessly integrate across different types of data (which usually exist in separate repositories) such as relational, cubes, text files, spreadsheets, etc.

Adapts to Needs of Exploratory Visual Analytics

A key benefit of data blending is its flexibility; it gives the user the freedom to view their blended data at different granularities and control how data is integrated on-the-fly. The blended views are dynamically created as the user is visually exploring the datasets. For example, the user can drill-down, roll-up, pivot, or filter any blended view as needed during her exploratory analysis. This feature is useful for data exploration and what-if analysis.

Another Illustrative Example of Data Blending

Figure 2 (below) illustrates the possible outcomes of an election for District 2 Supervisor of San Francisco. With this type of visualization, the user can select different election styles and see how their choice affects the outcome of the election.

What’s interesting from a blending standpoint is that this is an example of a many-to-one relationship between the primary and secondary datasets. This means that the fields being left-joined in from the secondary data sources match multiple rows from the primary dataset, which results in those values being duplicated. Any subsequent aggregation operations would then reflect this duplicate data, producing overestimates. The blending feature, however, prevents this scenario from occurring by performing all aggregation prior to duplicating data during the left-join.
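A small Python sketch shows why aggregating before the left-join matters in a many-to-one case. The ballots and spending figures below are hypothetical, not from the actual election data:

```python
# Primary: one row per ballot cast (hypothetical); secondary: one row per
# candidate with a made-up campaign-spending measure.
ballots = [{"candidate": "A"}, {"candidate": "A"}, {"candidate": "B"}]
spending = {"A": 5000, "B": 3000}

# Row-level left-join first: the secondary value is duplicated once per
# matching ballot, so summing spending afterwards overestimates it.
joined = [{"candidate": b["candidate"], "spend": spending[b["candidate"]]}
          for b in ballots]
naive_total = sum(r["spend"] for r in joined)
print(naive_total)  # 13000 (A's 5000 counted twice)

# Blending-style: aggregate both sides to the linking field (candidate)
# before the join, so each secondary value appears exactly once.
votes = {}
for b in ballots:
    votes[b["candidate"]] = votes.get(b["candidate"], 0) + 1
blended = {c: (votes[c], spending[c]) for c in votes}
print(blended)  # {'A': (2, 5000), 'B': (1, 3000)}
blended_total = sum(s for _, s in blended.values())
print(blended_total)  # 8000
```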

Figure 2 - San Francisco Election

Image: Kristi Morton, Ross Bunker, Jock Mackinlay, Robert Morton, and Chris Stolte, Dynamic Workload Driven Data Integration in Tableau. [1]

Next: Data Blending Design Principles

——————————————————————————————————–

References:

[1] Kristi Morton, Ross Bunker, Jock Mackinlay, Robert Morton, and Chris Stolte, Dynamic Workload Driven Data Integration in Tableau, University of Washington and Tableau Software, Seattle, Washington, March 2012, http://homes.cs.washington.edu/~kmorton/modi221-mortonA.pdf.

[2] Hans Rosling, Wealth & Health of Nations, Gapminder.org, http://www.gapminder.org/world/.