Skip to main content

Market Basket Analysis Tableau – How To Find Products to Cross Sell Together

Say for instance you’re in the business of selling bookcases– but in the basket, what should be sold together to promote cross selling?

Lets show you how it can be done with Superstore data!

Step 1: Identify which orders contain the product, category etc which fit your criteria.


Step 2: Use this as an identifier against the whole order.

We can do this with an easy level of detail calculation! We are essentially isolating to each order OrderID (since its fixed to this in the calc) then checking against which is the max value. Since any string value is inherently larger than nothing (the null value), it will return the string and populate against each order ID.

Step 3: Place this new calculation on filters

This will bring back all orders where it contains bookcases. But it will also keep its accessory products!

From here its just about rearranging the view to find what other product categories are being sold with bookcases. Place a count distinct of order ID to columns with subcategory to rows.

The view below tells us that for the 224 orders that contain a bookcase sold, we also had 56 orders that sold at least 1 quantity of a binder.

But when I place profit onto color, this tells me that for the past bookcase orders, not only have 42 orders had a bookcase and a phone in the same basket. But I’ve also made a nice profit of $2,687 from customers buying these items together! This is probably important as we are not making a profit from Bookcases!

Interested in the workbook? Check it out on Tableau Public!

Thanks for reading!


Tableau 2017 – Conference Tips!

If you’re headed to the conference, here’s a couple of pointers to prepare/look out for. I would highly recommend being organised so you can be open to random opportunities that arise.


  • If you have food allergies – pack accordingly. There’s food everywhere but all pickup food stations
  • Comfy walking shoes. The conference is MASSIVE and you’ll be on the go from breakfast of 3AM for a week.
  • Buy Conference Swag ASAP. The store was clear out by Thursday. You can always order online but its just not the same!

Learning at Conference

  • Plan your sessions ahead of time (and create back up plans). I did my schedule in Excel so I could keep track of it all. Take into account the travel time between venues if you can.
  • Make your life easy and just let go of the FOMO. There are so many amazing events and sessions. It’s going to happen – so just be happy with the plan you’ve committed to. I remember feeling so much stress on my first day and Emma from TIL also shared this sentiment. Her advice? You gotta let it go!
  • Feel free to leave a session, if a session is really terrible/too full for you to absorb information. Its about making the most of it for you.
  • Check out the Vendor Expo. I think of it as a professional development session in itself! Ask lots of questions– they’ll be happy you did.
  • Take notes during sessions. I carried my laptop around.
  • Pick good speakers. Its also tough for those preparing the sessions – its a lot of work! Some might not have a had enough time, not great public speakers etc.. so that’s going to reflect in the end product.
  • Get good seats. Last year, people in the back probably had a hard time because many of the rooms were long and skinny with no screens for the back end of the crowd.


  • Create a WhatsApp group with your travel mates
    • With 15,000 people, you’ll need it to help navigate where’s the best sessions, events etc.
  • Go out and socialize. Tableau folks are some of the friendliest, quirky data analysts around. What a world of difference it makes to meet them in person!

That’s all I can think about for now. Any others I missed? Let me know in the comments!

See you in Vegas!


Information is Beautiful – G Research Lecture Series Review

David McCandeless is a titan in data visualisation – no doubt about it. And with The Science Museum’s immersive IMAX screen, we saw the vastness of a billion dollars, examined the medical veracity of “superfoods” and proved the most controversial question of all – what is the best dog breed? (All backed with data!)

With beautiful colours, composition and storytelling, McCandeless’ approach to data visualisation invites all audiences to engage with the world around us. Like a bakery window with glossy cakes, he draws users in with data, design and a story.

But how does this message fare in a room full of quants? He demonsrates data visualization’s power to engage with his final audience game, asking a series of questions of what is more popular on Google: Beer vs Wine, Cornflakes vs Toast, Youtube vs Sex (the answer may surprise you). 400 people immediately took to the 101 of data analysis: to think, take a guess, and to revise one’s answer. By demonstrating how data visualization can bring conversation, McCandless taught an audience of all disciplines how we can be inquisitive about our data- to move from information into knowledge.

Unsurprisingly, the topic of fake news was raised during the Q&A period – how can we spot fake news? As a journalist, ad man, and designer and developer, all these talents still suggest this 1 answer: its tough to spot and trusting your source is key. Data Visualisation is powerful – use it wisely.

Here are some of my pictures from the event:

How to Find the Number of Active Projects at Any Given Point in Time

This analytics problem is a rite of passage – we all eventually get asked it. And it usually goes to a tune like this:

“I have a list of projects with start and end dates. I want to know at any given time, how many projects are active.”

(Btw the TL;DR for this is actually this great video by my TIL colleague Andre)

In this case, we looking for how many orders are being prepared for shipment at any given date?

Here’s my viz.

Using Superstore, the data you’re given looks something like this.

And visually represented, we know that these projects all overlap on March 16 2016.

But how can we express this as a line chart? i.e. not at a given point in time, but how many over time?

For such a simple question, it actually has a rather complex (but easy to implement!) solution. It requires a product join.

The difficulty with this easy to ask question is that …how can Tableau plot data for a datapoint that effectively doesn’t exist? That is, when we look at the data, March 16 only occurs 3 times.

So as smart as Tableau is, it does not know that all the orders in the list are in preparation mode, just like the other 3. (Well Tableau CAN know, but lets put densification aside for another day).

Step 1: Prepare your Data

Given this train of thought, that means we need to give each entry (order ID/employee/project/whatever your flavor of granularity) every date we want to compare against.

And since we want to be able to compare against any date, we are going to multiply each entry 365 times for 365 dates.  This will pad out the underlying dates and give the data structure we need to make this work!

That means you’ll need a dataset with dates you want to see in 1 column and a dummy variable in another. I’ve very creatively called mine “One”.

Then you’ll have to add the dummy variable to your original dataset. In this case, I added a “1” running a straight column down the Superstore data.


Set it as an inner join in Tableau

Almost there!

Step 2: Create Your Date Filter

.. then put this filter on columns.

Step 3: Start to create your view!

Put date from your dummy dataset on columns and add the count distinct of your chosen dimension on rows.

In mine, I get something like this.


Other resources I’ve used to get to this method include this great video by my colleague Andre and of course, the great Bethany Lyon’s talk on Cross Database Joins at TC16.


Happy Vizzing!~