At some point, I got labeled as a “Tableau cube expert.” The funny thing is, when I first acquired this label I just barely knew how to spell OLAP. But over the years, I’ve gained some experience working with Tableau and cubes. And I get questions almost weekly about how Tableau works with cubes and for some Tableau tips and tricks for working with cubes. Here are my thoughts:
The Cube Paradigm
Tableau works with cubes by translating VizQL (the Visual Query Language Tableau generates based on where you drag and drop fields in the view to visually encode the data) into MDX and issuing that as a query to the cube, getting back the results, and then rendering the visualization.
When cubes are well designed and when you’ve got a developer or team of developers who are pretty quick to turn-around requests for cube enhancements, it can be a fairly enjoyable experience to work with Tableau against a cube. The key to understand, is that cubes are designed to “pre-aggregate” the data, meaning that all the aggregations are defined already (so if it is an average in the cube — that’s what you get!) and the values are stored for each intersection of dimension values.
What this means is that the cube is where most of the analytical logic must take place. When you are working with relational data sources, Tableau offers a wealth of options for implementing business rules, logic, and analysis: calculations using dimensions, creation of new dimensions, groups, bins, LOD, Top N Filters/Sets using an aggregation of the dimension itself, custom aggregations of any row level field, complex calculated filters, data source filters, view underlying row-level data, etc…, etc…, etc…
But those are not options with a cube. Why? Because cubes are designed to handle all of that themselves. In fact, most of it is pre-calculated and stored in aggregate form — so you can’t simply create new slices of data or new custom aggregations. That takes away a lot of complexity from Tableau (so give a business executive a connection to a well-designed cube with business rules baked in and a license of Tableau and she’s pretty safe in the analysis she does), but it also takes away a lot of flexibility.
There’s still a lot you can do with Tableau — statistical functions are mostly there (sans forecasting), table calcs (sans referencing dimension fields – but I can’t think of a reason why the table calc shouldn’t be able to reference a dimension – hence: this idea for which you can vote) And, of course, you can visualize what’s in the cube.
But if you find yourself needing to do analysis using fields not in the cube or want to implement complex business rules or logic – then be prepared to implement that in the cube. Or use another data source (after all, the cube was built from something!)
Tips and Tricks for Using Tableau with Cubes
At the same time, I have discovered a few tips, tricks, and workarounds that may help you should you find yourself using Tableau with OLAP:
- You cannot create an extract from a cube, but you can create an offline cube file (.cub or .cube) and use that as a data source and even distribute it as a packaged workbook (.twbx).
- You can write calculated members using MDX. This can help you group dimensional values, among other things.
- Make sure that in addition to having a date hierarchy in the cube, you also have a date dimension with the actual value of the date. That will give you ultimate flexibility in choosing how you visualize dates.
- If you can’t change the cube and are stuck with just the date hierarchy, then create a secondary source with the date parts and the full date and then blend to get the exact date value.
- For that matter, use blending to get values not available in the cube. Frustrated that you can’t alias dimension values when using a cube? Just create a secondary data source with the actual values and desired aliases then blend and use the alias. Sad that you can’t create ad-hoc or calculated groups? Create a secondary source with the groupings and blend!
- Use table calculations to fake calculations that you can’t do. For example:
If you wanted to get the difference between Profit for Technology and Office Supplies, you can’t write the calculation:
IF [Category] = “Technology” THEN [Profit] END - IF [Category] = “Office Supplies” THEN [Profit] END
But using the view above, you could write:
LOOKUP([Profit], Last()) – LOOKUP([Profit], First())
7. Try to convince your boss to use a nice star schema data warehouse instead.