Excel spreadsheets and flat text files can be incredible data sources for Tableau. They are ubiquitous, easy to use, easy to create, portable. But, if you use Excel or text files as Tableau data sources, then you may not be aware of some landmines that can undermine your analysis. Step on one of these and your credibility as a data analyst or data visualization expert might be blown to smithereens:
What makes this landmine particularly dangerous is that Tableau 9.3 changed the way Tableau handles joins on blank (NULL) values in Excel and text files. Tableau now follows the ANSI standard which states that NULL values will not match any value (even other NULLs). I’m glad to see Tableau following convention! (And I’m sure it was necessary to standardize for cross-database joins in Tableau 10.0) But, earlier versions of Tableau would allow a join to find matches when the two values were blank. So if you were used to that behavior, you might not even know you are in the minefield. And many Excel users just think “blank” and not “null”, again unaware of the danger that lurks just beneath the surface. Bottom line: Make sure fields you use in a join don’t have blanks. If necessary, change blank values in the source.
Real Bottom Line: I was wrong! Read more here and stay tuned to see what Tableau 10.0 has in store…
This landmine is relatively new as unions were just introduced in Tableau 9.3. (By the way, I’m presenting a Think Data Thursday webinar on Unions in Tableau this week and another webinar on new features in Tableau 9.3 next week!)
What happens with Income when you union these three files?
It turns out that it all depends on which file you use to start the union.
Start with Originals.txt and the total income is $58,502.23:
Start with Prequels.txt and the total income is just $11,500!
So what happened? In the first case, Tableau understood the data type to be a decimal value (because the first values in the first file were decimal) and so all values came through. In the second case, Tableau understood the value to be whole (integer) and any subsequent decimal values came through as NULL!
Bottom Line #1: To avoid this landmine, double check your data. You can simply change the data type (even in the data connection screen, using the icon – the hash sign for number – for the data type) to tell Tableau specifically how to read values. Setting it to Number (Decimal) fixes the problem:
Bottom Line #2: This conclusively settles the debate. Start with the originals, then the prequels (if you must), then the sequels.
This landmine can remain hidden and blow up when you least expect it. It’s very similar to the above issue, in that it occurs when most values are whole numbers, but you have a few decimals. In this case, it can occur with a single file data source because Tableau will sample the top n rows of data to determine the data type. If all it sees are whole numbers, decimals later on will come in as NULL (similar to above).
Bottom Line: Double check your values. Change data type manually if needed.
I’m certain there are other landmines out there. Which ones have you encountered? Share your war stories in the comments below!