Counting Unique Values - why CSE?

XOR LX

Well-known Member
Joined
Jul 2, 2012
Messages
4,517
Hi all.

I was wondering if anyone could explain why, when using the standard FREQUENCY/MATCH construction for counting unique values within a range, in almost every single case is this done using a CSE formula, even though the non-CSE equivalent with SUMPRODUCT is a perfectly valid one?

For example, to calculate the number of unique entries in the range A1:A10 (I will assume that there are no blanks in this range, and also that the entries are not such that they merit any prior coercion, e.g. with "~"), we almost always see:

=SUM(IF(FREQUENCY(MATCH(A1:A10,A1:A10,0),ROW(A1:A10)-ROW(A1)+1),1))

with CSE.

But why?

Why not:

=SUMPRODUCT(N(FREQUENCY(MATCH(A1:A10,A1:A10,0),ROW(A1:A10)-ROW(A1)+1)>0))

?

(Or some other suitable coercer - double-unary, addition of zero, etc. - in place of my use of the N function, if you prefer.)

After all, in almost all other cases, the general consensus seems to be that we only resort to a CSE formula when absolutely necessary, i.e. if there does not exist a non-CSE set-up. But here that is clearly not the case.

And yet, as I said, practically every single example that I've seen (and that amounts to several hundred, I would imagine) uses the CSE construction, not the non-CSE one.

Is this simply an example of habit? Someone posted the CSE version one day (perhaps not realising that it was achievable using simply SUMPRODUCT) and people simply caught on, perhaps also not realising that it might be translatable into a non-CSE equivalent?

Or is there a more logical explanation? Is it the case, for example, that, for whatever reasons, the CSE version is actually preferable? Does it, for example, offer better performance in terms of calculation speed and/or efficiency?

I have not tested this hypothesis, but can this really be the case? If so, does it apply to all cases of CSE vs SUMPRODUCT? Or is there something in particular about the combination of FREQUENCY and MATCH which lends itself better to a CSE construction than to a non-CSE one? Again, I cannot imagine how this could be possible.

And if it's not particular to this construction, but rather true in general - that this CSE construction with IF(...,1) performs better than SUMPRODUCT - then shouldn't we be using it in all cases? So instead of:

=SUMPRODUCT(N(A1:A10="A"),N(B1:B10="B"))

we should all be using the CSE:

=SUM(IF(A1:A10="A",IF(B1:B10="B",1)))

Apologies for the rather esoteric post, but this has been on my mind for a while. I know of no other situation in which a CSE formula is preferred to such an extent over a non-CSE one, and I cannot for the life of me think why this might be the case.

Regards
 

Excel Facts

Add Bullets to Range
Select range. Press Ctrl+1. On Number tab, choose Custom. Type Alt+7 then space then @ sign (using 7 on numeric keypad)
Hello XOR LX,

Very interesting question ...!!!

It could be that the usage of N() or the double unary operator are considered as too esoteric ...

Hope Aladin will step into this topic to share his views ...

Cheers
 
Last edited:
Upvote 0
It could be that the usage of N() or the double unary operator are considered as too esoteric ...

But if that's the case then what are you suggesting? That we stop using SUMPRODUCT altogether and all use CSE formulas?

Regards
 
Upvote 0
As an observation, your first formula doesn't need CSE..
But I do recall an associate saying ....always use CSE with SUMPRODUCT...I don't but he always did....A learning glitch maybe ??
 
Upvote 0
But if that's the case then what are you suggesting? That we stop using SUMPRODUCT altogether and all use CSE formulas?

Regards

There is no suggestion ... whatsoever from my part ...

It would appear to me that we are all just bending to a habit ...

Cheers
 
Upvote 0
my comment exactly.....via the associate ...:LOL:
 
Upvote 0
Hello XOR LX,

Very interesting question ...!!!

It could be that the usage of N() or the double unary operator are considered as too esoteric ...

Hope Aladin will step into this topic to share his views ...

Cheers

Such are worded/discussed many times (often on this board). The efficieny scores run like this:

-- followed by +0 followed by *,..., N(). If an operator can be used instead of a function call, that would be the right thing to do.

Wisdom: Never substitute SUMPRODUCT for control+shift+enter.
 
Upvote 0
As an observation, your first formula doesn't need CSE..

Ah, yes! Stupid me, trying to create a simpler example.

And perhaps you've hit upon the answer there, since, if we include one or more conditional statements, e.g.:

=SUM(IF(FREQUENCY(IF(B1:B10="A",MATCH(A1:A10,A1:A10,0)),ROW(A1:A10)-ROW(A1)+1),1))

then the formula does now require CSE.

What's more, because of that conditional statement, it is not so easy to form the equivalent non-CSE, SUMPRODUCT construction, since e.g.:

=SUMPRODUCT(N(B1:B10="A"),N(FREQUENCY(MATCH(A1:A10,A1:A10,0),ROW(A1:A10)-ROW(A1)+1)>0))

will always return #VALUE! due to the two arrays being of a different dimension (the first array will consist of 10 elements, the second - by virtue of FREQUENCY always returning one additional element - 11).

And attempts to handle this error obviously require extra work, making it far less efficient than the array set-up.

So it turns out that the CSE version is necessary if we have one or more conditional statements to apply, but not necessary if we do not. But in that case I can fully understand how the habit has developed to use the CSE version even if there are no conditional statements.

Regards
 
Upvote 0
The efficieny scores run like this:

-- followed by +0 followed by *,..., N(). If an operator can be used instead of a function call, that would be the right thing to do.

Ah, many thanks. Perhaps I will stop using N from now on if it is less efficient than the use of an operator.

Regards
 
Upvote 0
So it turns out that the CSE version is necessary if we have one or more conditional statements to apply, but not necessary if we do not. But in that case I can fully understand how the habit has developed to use the CSE version even if there are no conditional statements.

Ignore that, since of course then it doesn't even require CSE!

Think I need more coffee. :)

Thread solved. Thanks to all who contributed.

Regards
 
Upvote 0

Forum statistics

Threads
1,213,497
Messages
6,113,999
Members
448,541
Latest member
iparraguirre89

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top