Rank and Ranking in Tableau Prep
One thing I’ve submitted an idea for Tableau Prep (and would love if you voted for it!) is the ability to use Table Calculations, especially Index() and PreviousValue(). This would solve a lot of use cases, including adding row numbers, rank, sort, handling cumulative values, and more. But to compute a rank or sort specifically, there is a work-around that will work in certain cases. Here a possibility for sorting and ranking in Tableau Prep:
Ranking in Tableau
Let’s use the familiar Superstore Sample data set which has Sales for Orders made by Customers in various States. Let’s say I want to rank the customers for each state based on the amount of sales. Using Tableau, I have the option to use the Rank() table calculation, and it isn’t too difficult to come up with something like this:
The difficulty with table calculations is 1) you have to set the right addressing and partitioning (not too bad in this case) and 2) you have already gone down several layers of complexity, so deeper analysis is going to be even more complex. What if you could rank the source?
Ranking in Tableau Prep
Let’s take the same data set and see how we might achieve ranking in Tableau Prep. If I want to rank by customer within state, here are the steps I might take:
- Create an aggregate of Sales by State and Customer:
- Add a Step after the aggregation that contains the aggregate rows:
- Join the Aggregate Rows to the previous aggregate step. For all intents and purposes, I’m joining the aggregation to itself:
You’ll notice that I kept it as an inner join, matched on State (to ensure the ranking is done within the state) and then also included a condition that the Sales of the State & Customer must be less than or equal to the Sales of the Aggregate Rows step. In other words, I want to match records where the state is the same and where a customer had higher (or the same) sales. That means if Customer A had the highest sales, they would only match on themselves (and if we count the number of records, we’d get “1” – Rank #1 – ranking in Tableau Prep – but now we’re getting ahead of ourselves).
- Add an Aggregation to the Join that allows us to count the Number of Records per State and Customer (okay, we weren’t too far ahead of ourselves):
The Number of Records is the number of matches from the join which ultimately is the Rank of the customer within the State. So, we’ll…
- Add a clean step to rename the Number of Records to Rank Within State:
- Finally, we can Join that Clean step to the Original set of records on Customer and State. Our final output contains all the original records with the addition of the sales Rank Within State!
And now, we have something that makes even complex analysis in Tableau so much easier! Do I want to compare summer sales of my top 10 customers per state? Easy! Do I want to look at customers who are top in one state and bottom in another? Easy! There’s no end to the analysis that suddenly became easy because I moved the complexity to the data prep (and Tableau Prep made that easy too! In fact you can download the flow here!)
Slick!!
Thanks!
It looks nice at first.
It seems that calculation is not going well if there is the same numerical value.
ex) examination result
Nan Norden,100
Rodger Nardone,100
Preston Baranowski,80
Shakira Menz,50
Nan Norden’s rank will be 2.
Yes, that’s true – it is a limitation of this approach – it produces only a modified competition rank.
Hi Joshua,
I found a way to simulate Tableau Desktop’s RANK_UNIQUE function while still keeping your same Flow. If you add a step before the initial aggregation and create a calculated field = [Sales]+ASCII([Customer Name])/100000 and add it to the aggregation to use in the first join instead of Sales it will generate unique values in alphabetical order for Customers with the exact same Sales, but different first initials. You could also use MID() to add characters in case two customers have the exact same sales and first initials, but even without additional characters this significantly reduces the likelihood of duplicate ranks.
Katie,
That’s a very clever approach! I love it!
This has been a great find. I’m really surprised Prep doesn’t have a built in ranking function, but this was a great work around for me.
Instead of sales amount I had to sort events by date. The join was
PreviousID=CurrentID & PrevDate <= CurrentDate
Putting row count gave me the rank. Joining back to data gave me ranked events.
It's a work around but hey, it worked. Kudos as well to Katie on the additional detail.
Thanks Rob! I’m guessing the functionality will come at some point, but for now I’ve used this work-around quite a bit. Glad it helped!
You can change the logic a little bit to make it work to get the ranking you want. I have the completed flow package as well as a screenshot, but I can’t attach it in the comment. Hence, here is a quick explanation of the change in my logic:
Rather than using Sales <= Sales, I removed the equal condition so that the customer does not get his own sales. This means that the records would be dropped if we do inner join. Hence, I used LEFT outer join. Then, I counted the number of non-null customers on the right – that would give me a rank starting with 0. I added a 1 to get the rank I want. So, for your scenario, it would produce Rank 1 for Nan and Rodger and Rank 3 for Preston. I also created another dataset with unique values for the 4 customers and tested to ensure that the logic works.
Hi Joshua, is there a way for me to mail the file or the output screenshot to you so that you can make it available here? Thank you again for inspiring me with your logic which got me started 🙂
Siraj,
Thank you so much for the kind words and for how you built on the solution! I’d love to take a look at screenshots or files. Send me a direct message on either Twitter (https://twitter.com/VizPainter) or LinkedIn (https://www.linkedin.com/in/joshua-milligan-vizpainter/) and I’ll be in touch!
Hi Joshua,
This is absolutely brilliant. When I was reading your article, after I have understood the problem statement and the dataset, I tried my best to find a solution before I read yours so that I can flex my brain and test my Prep knowledge. However, after thinking long and hard, I realized that I could not figure it out – now, I was very curious as to how you managed it! When you I saw that you used a non-equal join, I was just so blown away by 2 things:
1. Your ingenuity in coming up with this solution – thank you for taking the time to share this
2. How my brain never even considered a non-equi join as a possibility due to many years of thinking about joins as equals. Even though I knew this feature exists in Prep, somehow the thought never came to my mind. One more reminder that our knowledge and experience blinds us – leads me to more humility!
So, thank you so much. I am actually trying to solve another problem – tagging the top customers based on their sales in Prep and this article really invigorated my mind. Thank you so much Joshua!
[…] is an extension to the ranking trick in Tableau Prep. Lets see how we can get unique row ids for below […]
Hi Joshua – I think your suggestion here may be the right path to my challenge, I just can’t seem to wrap my head around your solution. I don’t get the concept of “aggregation” step well enough. I posted a question to Tableau Public forums about my challenge to prep the NOAA Hurricane data, which has two types of lines of data and I can’t seem to figure out what the join clause could be, besides a year. I know I need to create some sort of unique IDs, but I am lost. Could you please take a look at it for me and give me some hints? https://community.tableau.com/thread/293530
hello Joshua, I translated your blog and published to my WordPress to share with tableau fans in China, and thanks for your contributions.
I want to publish a book of tableau and would you like to be a co-author so I can translate more blogs from your to share with other people.
Even I want to translate your book next year.
look forward for your answer ,and my email is yupengw@126.com
Michael wu, China
Great work but I don’t think you need the other join to bring back in the Sales number. Just add Avg(Sales) to your second aggregation (when you are determining the rank) and it will bring in that sales value.
Carl,
I looked at that, but I don’t think that works as you’d like it to. But maybe I’m not following your suggestion fully.