• Unions in Tableau 9.3: Tips and Tricks

    I’ve been participating the beta testing of Tableau 9.3 and I’m loving the innovation the Tableau developers have been bringing to recent releases of Tableau – especially in the area of integration!  Unions are an incredible new feature in Tableau 9.3 that will make life much easier.  So, as we eagerly wait the release of the new version, here are a few tips and tricks for using unions in Tableau 9.3:

    I love unions in Tableau 9.3!

    Tips and Tricks for the Unions in Tableau 9.3

    1. Start with one file, then drag all others into connection: You could drag each file one-by-one into the connection editor, but start with one and then select the others and drag them all at once to create unions quickly:

    Unions Created in the Data Editor

    1. If needed, use the Customize Union option to see your union and add/remove files. The option is available on the drop down menu of the union in the connection editor. You can remove individual files and you can drag files from the left sidebar into the list and they will appear at the bottom:
    Customize Union

    You can give your unions names in the Title text box

     

    1. You can union a file to itself. Just drag the file same file into the connection editor multiple times. Why would you do this?  Well, there are times that you might need to “reshape” the data a little and it’s helpful to have two rows instead of the original single row.  For example, if you have an origin city and destination city in the same row, but want to draw a path between the two, you’re stuck. But union the file to itself, work a little calculation magic, and you now have a row for source and a row for destination and you can draw the line between them.  (Of course, for certain situations there are alternatives!)

     

    1. Make note of the Path and Table Name fields that Tableau automatically adds to your unions. These fields give you the filenames and table names and associate each row of data with the source file/table. As we’ll see, this can be quite useful.

     Path and Table Name fields for Unions in Tableau 9.3

     

    1. Use the Path and Table Name fields to your advantage! Here’s a scenario: you get a new data dump file each month. Since the file is named based on the month (yyyy-mm.csv), the guy who provides the files (we’ll call him “Fred”) has never felt it necessary to include the date in the contents of the file. But Fred never thought about the fact that when you union these together in a single source, you actually want to know the dates.  So, you, being quite clever, use the [Path] or [Table Name] field to work-around Fred’s lapse in judgment:
    Calculation to get Date in Union

    Here, we’ve used the DATEPARSE function to parse the Table Name field value as a date.

     

    1. Check the Data Integrity! (part 1) That is, make sure the files in your union contain what you think they contain. Start simple – check to see if each file has the number of records you expect.  Look for anything that falls outside the norm:
    File Check

    With a reference line or two, I can see that the only file outside of 2 standard deviations is the one for April 2015 – I probably need to check that one

     

     

    1. Check the Data Integrity! (part 2) The first check was at a very high level, now we’ll go deep and check the fields.
    Field Check

    This view allows me to check every field for every file, allowing me to see if there are fields that don’t exist within all files in the union.

    With a view like this, I can immediately tell what issues exist in the union:

    • We didn’t start tracking Returns until January 2015
    • The field “Product ID” appears to have changed names to “ID” in January 2015
    • The field “Cost” appears to have been split into two separate fields, “Cost (internal)” and “Cost (external)” in July 2015.
    • The last time the field “Margin” appeared was in March 2015.

    But how do you create the view to check your union?  I’m glad you asked…

     

    Creating the Field Level View to check the State of the Union

     Creating the above view for checking unions is not difficult, but requires a bit of setup.  Here’s how you can do it:

    • Start by creating a view with Path on Rows, Measure Names on Columns, and Measure Values on Size:

    Field Check - Step 1

    • Switch the aggregations for all Measure Values from SUM to COUNT. Do this because you want to know whether a field exists in a given file in the union – not the total value.  By the way, you can do this quickly by selecting all fields on the Measure Values shelf (click one, hold Shift, and click another to get everything in between) and changing at once using the right-click context menu:
    Change SUM to COUNT

    Click on one of the measure values, then hold Shift (or Option) while clicking another to select all fields in between.

    • Now, edit the Size (double click the Size legend or use the drop down menu on the legend to edit) and set the End value of the Range to 1. This way, if there is at least one value for the field in a file (NULLs aren’t counted), then we’ll get a large square versus a little tiny square.

    Set Size Range (Tableau Size Legend)

     

    • Now, you can see all the fields that were classified by Tableau as Measures. What about the dimensions?  Simply add them to the Measure Values shelf as COUNT aggregations.  You can accomplish this relatively quickly by dragging and dropping them while holding down the right mouse button (or Option key on Mac) to get a selection menu:

    Right Click Drag and Drop

    And now you can see the issues in the data!

     

    But how do you fix them?  Well, some are going to require a fix of the source.  You’ll have to ask Jason to produce files with Returns for 2014.  But others are a little easier.

     

    Bonus Tip #1: Use the Merge feature in the Data Source editor!

    Since I can see that Product ID changed names to ID in 2015 files,

    Field Changed Name

    then I might want to merge the two fields into one.  It’s as simple as selecting the two fields in the Data Source editor and using the drop down to select Merge.

    Using Tableau 9.3 to Merge Fields in Union

    The resulting field (named Coalesce by default) contains the value of either Product ID or ID (whichever was not null).  The field can be renamed and then used in any view:

    Merged Field in Tableau 9.3

    Bonus Tip #2: Use calculations to fix other union issues

    What about the case where Cost got split into 2 fields?  That’s not as simple as using Merge because in some cases there are 2 non-null values.

    Tableau Calculation Needed

    But we can use a simple calculation to get the Total. Notice, the use of the ZN() function (zero null – that is, replace any nulls with zero) to make sure the addition works.  And now you have a field you can use consistently throughout the data source:

    Tableau Calculation Applied

     

     

    And there you have it – 9 tips on using the Union feature in Tableau 9.3.  So what cool things will you accomplish with unions?  How will unions make your life easier?  Let me know in the comments below!

9 Responsesso far.

  1. Siraj Samsudeen says:

    Hi Joshua, excellent coverage of the new feature. It is a really wonderful idea to be able to see what changes are happening to your column sources – the view you have set up will be so useful in a number of real-life scenarios. I I really liked the productivity-enhancing tip to change the aggregation of multiple measures at once.

  2. Matt Lutton says:

    This is great, Joshua! Thanks for sharing your experiences, and posting here — this will help a lot of folks get started with these new features. Excellent post!!

  3. Julia says:

    thanks for this great post. Can I union a file and a table from my Database with new feature ?

    • Julia says:

      looks like i cannot 🙁

      • Joshua Milligan says:

        Julia,

        (Sorry for the delay in response). No, you are correct that it is not currently possible to union a file with a database table. Hopefully subsequent versions will continue to build out the integration features that really started to be built out with 8.0 and this kind of thing will work!

        In the meantime, you might be able to try another approach:
        1.) use a process to export the DB table as a file and then union.
        2.) Import the data from the file to the DB and use a Custom SQL to union the data there.

  4. Eduardo says:

    Excellent post Joshua!Data integrity tips were amazing. Thank you sir

  5. PrakashA says:

    Hi,

    I am using Tableau Desktop 10.1 version. I have more than one data set (excel workbooks) which I need to append in Tableau Data Source tab. All the excel workbooks have same sets of columns. I can place them one below the other in excel and load in to tableau. But my objective is to bring them one below the other in Tableau data source tab itself. I have successfully tried appending the data sets in tableau worksheets, but I wish to know if the same can be done in the Data Source tab. I have also tried using “New Union” functionality with the various available join types, but the data does not get appended as required. Is there any other way through which I can append the data one below the other? Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *