Search Not Just Numbers

Friday, 18 August 2017

Excel Tip: An introduction to SUMPRODUCT and why you should learn it

Let me start with an apology for leaving it so long without a post. It's been a busy time work-wise and, in addition, I've taken up running to finally lose the excess weight and run the Great North Run half marathon next month. After spending the last 20 years or so doing very little exercise while sitting in front of a computer screen, I decided I had to do something about it - and raise some money for a good cause along the way!

OK! Excuses over! I've got a very powerful but often misunderstood function to tell you about.

On the face of it, SUMPRODUCT does a very simple thing, it multiplies arrays together and sums the results. Unless you're a serious mathematician, you're probably thinking that this is not something you've ever had a desire to do! But bear with me.

I rated it one of the most useful and under-used functions in a recent survey of Excel experts.

First of all, here is a simple explanation of how it works:

=SUMPRODUCT({5,6,4},{7,4,12}) returns 107 because:


You can enter up to 255 arrays like this (as long as they are all the same size) and SUMPRODUCT will multiply them then sum the totals as above. That's basically what SUMPRODUCT does. Now let's look at why this is very useful.

First of all, the SUMPRODUCT function allows you to work with arrays without entering the function using Ctrl+Alt+Enter, which means that you don't run the risk of accidentally clicking into the cell and clicking enter, then wondering why your array formula has stopped working. If you don't know what an array formula is ignore this point, as you don't need to worry about it with SUMPRODUCT. If you do, then you'll know what I'm talking about!

More importantly, the arrays entered into SUMPRODUCT can be formulae that result in arrays. The formula above could have been entered as:

=SUMPRODUCT(A2:A4,C2:C4)

if cells A2, A3 and A4 contained 5,6 and 4 respectively and cells C2, C3 and C4 contained 7, 4 and 12.

But we could also (say) add 1 to each value in the first array before multiplying by entering:

=SUMPRODUCT(A2:A4+1,C2:C4)

resulting in 130  - as ((5+1) x 7) + ((6+1) x 4) + ((4+1) x 12)) = 130.

You're probably still saying, "So what?", but here's where it gets useful.

These formulae can use other Excel functions and, even more usefully, conditions that return arrays of Trues and Falses (converted into 1s and 0s).

Let's say that in the following data, we need to know the total quantity of product A sold in the North:



We could use SUMIFS as follows:

=SUMIFS(C2:C8,A2:A8,"North",B2:B8,"A") which returns 38.

Or we could use:

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"),C2:C8) to return the same answer.

This works as follows...

The formula (A2:A8="North") results in the array:

{TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE} because these are the 7 results of the formula (A2="North" is TRUE, A3="North" is FALSE, etc.)

By placing the double minus (--) in front of the formula, it forces this to return 1s and 0s instead of TRUEs and FALSES, i.e.:

{1,0,0,1,0,0,1}

Our SUMPRODUCT formula therefore arrives at the same answer as the SUMIFS because the three arrays we are multiplying are as follows:



By multiplying the two arrays (A2:A8="North") and (B2:B8="A") after converting them to 1s and 0s, the result will only be a 1 if both conditions are true (as if either returns 0, we will be multiplying the other by zero).

Incidentally, if we don't multiply this by the C2:C8 range we can use this as an alternative to COUNTIFS

i.e. 

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"))

is the same as:

=COUNTIFS(A2:A8,"North",B2:B8,"A")

But why would you want to use this instead of SUMIFS, or COUNTIFS?

Well, SUMPRODUCT is a lot more flexible.

Let's say we wanted to know total sales value, rather than quantity.

We can't do this with SUMIFS without adding a new column (being column C x column D) and applying the formula to that, however with SUMPRODUCT we can just multiply by column D right in the formula:

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"),C2:C8,D2:D8) 

or let's say we wanted total sales of product A for North and South:

=SUMPRODUCT(--(A2:A8="North")--(A2:A8="South"),--(B2:B8="A"),C2:C8,D2:D8) 

Here we have added the two arrays together (-- is a +) - which will return a 1 if  column A is EITHER North or South, as one of the columns will be 1 if this is true whereas if it is not, both columns will be 0.

These are still pretty simple uses of the function, but hopefully this is enough to illustrate that they can go far beyond the capabilities of SUMIFS and COUNTIFS.




If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".

6 comments:

  1. Great article again, Glen; well worth the wait
    If you're performing further maths on a logical value, you don't need the "--"
    So SUMPRODUCT((A2:A8="North"),(B2:B8="A"),C2:C8) would do the same job and look slightly less cumbersome

    jim

    ReplyDelete
    Replies
    1. Thanks, good point Jim. I was erring on the side of caution, as it will always work with the --, rather than having to learn when you can get away without using it. Also, trying to keep an introduction to the function as simple as I can :)

      Delete
    2. Good Point, I only discovered the need for -- when I couldn't get this to work for a simple count (on reflection, COUNTIF probably would have done the trick)
      I used to use this method frequently but then Tables and Pivots came out, and SUMIFS, COUNTIFS and AVERAGEIFS too
      But, as you say, sometimes only the flexibility of SUMPRODUCT will do

      jim

      Delete
  2. Downside is that SUMPRODUCT is slower in calculation than SUMIF and COUNTIF, most notable in large data sets

    ReplyDelete
  3. If you use named ranges then SUMPRODUCT becomes self-documenting.

    ReplyDelete
  4. Nice article SUMPRODUCT rules!

    SUMIF + COUNTIF + SUMIFS + COUNTIFS do not work on closed files - SUMPRODUCT does.

    SUMPRODUCT also handles leading zeroes in codes correctly - the others don't.

    The others can't use functions within their brackets SUMPRODUCT can and that is its real super power.

    eg this sums every second row
    =SUMPRODUCT((MOD(ROW($A$2:$A$9),2)=1)*($A$2:$A$9))

    I use the * between the brackets as this is easier to type and explain since when you multiply by true it acts as one and false acts as zero. (It is slightly slower to use * calculation wise)

    Regards

    Neale

    ReplyDelete