• Rank and Ranking in Tableau Prep

    One thing I’ve submitted  an idea for Tableau Prep (and would love if you voted for it!) is the ability to use Table Calculations, especially Index() and PreviousValue().  This would solve a lot of use cases, including adding row numbers, rank, sort, handling cumulative values, and more.  But to compute a rank or sort specifically, there is a work-around that will work in certain cases. Here a possibility for sorting and ranking in Tableau Prep:

    Ranking in Tableau Prep Race - First, Second, Third

    Ranking in Tableau

    Let’s use the familiar Superstore Sample data set which has Sales for Orders made by Customers in various States.  Let’s say I want to rank the customers for each state based on the amount of sales.  Using Tableau, I have the option to use the Rank() table calculation, and it isn’t too difficult to come up with something like this:

    Ranking in TableauThe difficulty with table calculations is 1) you have to set the right addressing and partitioning (not too bad in this case) and 2) you have already gone down several layers of complexity, so deeper analysis is going to be even more complex.  What if you could rank the source?

    Ranking in Tableau Prep

    Let’s take the same data set and see how we might achieve ranking in Tableau Prep.  If I want to rank by customer within state, here are the steps I might take:

    1. Create an aggregate of Sales by State and Customer:
    2. Add a Step after the aggregation that contains the aggregate rows:
      Aggregate Rows Step in Tableau Prep
    3. Join the Aggregate Rows to the previous aggregate step.  For all intents and purposes, I’m joining the aggregation to itself:
      Join to get Rank in Tableau prepYou’ll notice that I kept it as an inner join, matched on State (to ensure the ranking is done within the state) and then also included a condition that the Sales of the State & Customer must be less than or equal to the Sales of the Aggregate Rows step.  In other words, I want to match records where the state is the same and where a customer had higher (or the same) sales.  That means if Customer A had the highest sales, they would only match on themselves (and if we count the number of records, we’d get “1” – Rank #1 – ranking in Tableau Prep – but now we’re getting ahead of ourselves).
    4. Add an Aggregation to the Join that allows us to count the Number of Records per State and Customer (okay, we weren’t too far ahead of ourselves):
      Count the number of matches to get the rank in Tableau PrepThe Number of Records is the number of matches from the join which ultimately is the Rank of the customer within the State.  So, we’ll…
    5. Add a clean step to rename the Number of Records to Rank Within State:
    6. Finally, we can Join that Clean step to the Original set of records on Customer and State.  Our final output contains all the original records with the addition of the sales Rank Within State!

    And now, we have something that makes even complex analysis in Tableau so much easier!  Do I want to compare summer sales of my top 10 customers per state?  Easy!  Do I want to look at customers who are top in one state and bottom in another? Easy!  There’s no end to the analysis that suddenly became easy because I moved the complexity to the data prep (and Tableau Prep made that easy too! In fact you can download the flow here!)

    Post Tagged with , , , ,

7 Responsesso far.

  1. yoshihisa tabuchi says:

    It looks nice at first.

    It seems that calculation is not going well if there is the same numerical value.

    ex) examination result
    Nan Norden,100
    Rodger Nardone,100
    Preston Baranowski,80
    Shakira Menz,50

    Nan Norden’s rank will be 2.

    • Joshua Milligan says:

      Yes, that’s true – it is a limitation of this approach – it produces only a modified competition rank.

    • You can change the logic a little bit to make it work to get the ranking you want. I have the completed flow package as well as a screenshot, but I can’t attach it in the comment. Hence, here is a quick explanation of the change in my logic:

      Rather than using Sales <= Sales, I removed the equal condition so that the customer does not get his own sales. This means that the records would be dropped if we do inner join. Hence, I used LEFT outer join. Then, I counted the number of non-null customers on the right – that would give me a rank starting with 0. I added a 1 to get the rank I want. So, for your scenario, it would produce Rank 1 for Nan and Rodger and Rank 3 for Preston. I also created another dataset with unique values for the 4 customers and tested to ensure that the logic works.

      Hi Joshua, is there a way for me to mail the file or the output screenshot to you so that you can make it available here? Thank you again for inspiring me with your logic which got me started 🙂

  2. Hi Joshua,

    This is absolutely brilliant. When I was reading your article, after I have understood the problem statement and the dataset, I tried my best to find a solution before I read yours so that I can flex my brain and test my Prep knowledge. However, after thinking long and hard, I realized that I could not figure it out – now, I was very curious as to how you managed it! When you I saw that you used a non-equal join, I was just so blown away by 2 things:

    1. Your ingenuity in coming up with this solution – thank you for taking the time to share this
    2. How my brain never even considered a non-equi join as a possibility due to many years of thinking about joins as equals. Even though I knew this feature exists in Prep, somehow the thought never came to my mind. One more reminder that our knowledge and experience blinds us – leads me to more humility!

    So, thank you so much. I am actually trying to solve another problem – tagging the top customers based on their sales in Prep and this article really invigorated my mind. Thank you so much Joshua!

Leave a Reply

Your email address will not be published. Required fields are marked *