Working WIth Excel Arrays

Excel Array Formulas

Working WIth Excel Arrays

At work, I've been working on sifting through data to see if there is anything we should be aware of or could use. I've encountered some new kinds of problems on arranging the data, so I've been Googling a lot lately. These Googles have led me to an interesting page at Exceljet that shows some unusual examples of arrays.

The kinds of arrays that I am familiar with are those that handle multiple criterias. I've been using them before SUMIFS or COUNTIFS came into existence. The structure of the arrays that I've been doing looks like:

[ap_divider color="#CCCCCC" style="solid" thickness="3px" width="100%" mar_top="20px" mar_bot="20px"]

{=SUM(IF((range1 = criteria1)*(range2 = criteria2)*...*(rangeN=criteriaN), range ))}

where the ranges don't have to reference different cells but they do have to have the same dimensions. And as an important note, the brackets are NOT manually keyed in. The brackets are entered by doing CTRL+SHIFT+ENTER.

[ap_divider color="#CCCCCC" style="solid" thickness="3px" width="100%" mar_top="20px" mar_bot="20px"]

....The "Embed Document" doesn't work anymore. Once I figure out what is going on, I will update but I sure hate it when something works and then someone changes something and breaks it...

For the time being, I'm going to add images but what a PAIN! You can't see the formulas!

Excel formulas: LEN and SUMPRODUCT

 

SUMPRODUCT

Let's first look at SUMPRODUCT, which is on the right hand side of the image: cells K3:X20. The SUMPRODUCT that I'm most familiar with and that the books talk about is structured as: =SUMPRODUCT(array1, array2). This formula is similar to multiplying two columns of numbers and then adding up the results. So in the image, that formula would be similar to multiplying the numbers in column L and M (L*M) which column N does, and then adding up column N. Cell N10 is basically the sum of that column and cell N12 uses the SUMPRODUCT formula, giving you the same result (cell N13 shows the formula).

What I didn't know was that SUMPRODUCT could be used with one array which is basically the sum of the numbers. Normally, you don't really need to use this kind of SUMPRODUCT but there may be instances when doing so can give you the answer. Column V is this scenario.

The middle section, Q and R, just shows that the arrays must have the same dimensions.

Just keep in mind what SUMPRODUCT can do.

LEN

On the left hand side, I'm doing different permutations on the use of LEN formula. Column E is the normal use of the formula: LEN(E14).

Column F shows the situation where I entered the LEN formula as an array of the column of words: {=LEN($C$7:$C$14)}. Remember the brackets are created via CTRL+SHIFT+ENTER. In column F, I placed my cursor in cell F7, entered the formula, and then did CTRL+SHIFT+ENTER to enter the formula. I then copied that formula down the column. Note that this did not do anything meaningful.

Things start to get interesting in column G. In column G, I highlighted cells G7:G14, typed in the formula LEN(C7:C14) and entered it with CTRL+SHIFT+ENTER. What happened next was CTRL+SHIFT+ENTER added in the brackets { } and the formula LEN(C7:C14) showed up in cells G7:G14. Note that I did not use $C$7:$C$14. The results are the same as column E!

I was not aware that one can use array formulas with LEN.

But, wait, it gets even weirder. Exceljet makes array formulas without the brackets! Column I shows what they are doing. Cells I7:I14 contains the formula =LEN($C$7:$C$14). I keyed in the formula in cell I7 and then copied it down. Note that the results changes accordingly, even though the formula looks the same.

But there's more! I could type =LEN(C7:C14) and Excel would give that same freaky answer. In column I, I added the $ signs so I could copy it down. If I enter the formula without $, then I can't copy it down because the cell references would change. I would have to enter the formula in each individual cell individually. So if I enter =LEN(C7:C14) in cell J7, I would get 9. If I entered the same formula =LEN(C7:C14) in cell J9, I would get 13. And if I entered the same formula =LEN(C7:C14) in cell J22, I would get an error message. So this kind of formula requires a number in the same row.

Just keep this weirdness in mind.

Now Exceljet brings together the SUMPRODUCT and the unique array feature in one nifty formula that counts the number of unique texts. Not count the number of texts but count unique ones.

Here's the snapshot of the Excel:

Excel: Counting Number of Unique Texts

 

Counting the number of unique text

The items highlighted in yellow are those written in Exceljet style formula - arrays without the brackets.

The point of this section is to derive a formula to count the number of unique text in column C34:C43. Note there are duplicate names and we don't want to count duplicate names. The normal formula COUNTA(C34:C43) in cell D27 does not work because it gives 10 rather than 4. Even the COUNTIF in cell D28 does not work.

Column D34:D43 shows the result using the normal COUNTIF formula: =COUNTIF($C$34:$C$43,C34), where D34 will change from D34 to D43 as you go down the rows. It doesn't give you want yet.

Column E34:E43 does the same thing using the array approach: {=COUNTIF($C$34:$C$43,C34)}. Normally, I wouldn't use this approach when the normal way works.

Column F34:F43 now uses the double array but I enter the formula in cell F34 first and then copy it down: {=COUNTIF($C$34:$C$43,$C$34:$C$43)}. Note how this differs from column E. This approach does not work.

Column G34:G43 uses the same formula as F but I first highlight G34:G43, then enter the formula. When I hit CTRL+SHIFT+ENTER, all of the cells contain this same formula: {=COUNTIF(C34:C43,C34:C43)}. Now we get the same numbers as column D.

Then there's the Exceljet way found in column I34:I43: =COUNTIF($C$34:$C$43,$C$34:$C$43). This formula looks like the one found in F but they are without the brackets. Keep in mind that the result is an array.

Now, none of these options get you where you want to go. The trick to recognize is that you now have an array where you have multiples of the same number count - 3 threes, 2 twos, and if you had any, 4 fours, etc. So if you have 3 threes and you invert the 3s, you get 1/3, 1/3, 1/3, which if you add, comes up to 1. The inversion of 2 twos would become 1/2 and 1/2, which adds up to 1. Again the result is an array. You are just inverting an array.

Column K34:K43 shows that inversion: =1/COUNTIF($C$34:$C$43,$C$34:$C$43). All of those cells have the same formula.

Finally, if you do SUMPRODUCT on that array =SUMPRODUCT(1/COUNTIF($C$34:$C$43,$C$34:$C$43)) in a single cell, you will get the count of unique texts. You could have done this formula without the $, since this formula will be in a single cell and you don't have to copy it. Cell D26 is that formula without the $.

[ap_divider color="#CCCCCC" style="solid" thickness="3px" width="100%" mar_top="20px" mar_bot="20px"]

Now I have to admit that knowing when to use an array formula with brackets instead of normal formula and when to use an array formula without brackets will not come naturally to me. I will have to use this A LOT before I start having an instinct on how to use this. I'm just barely scratching the surface of this. Apparently you can use an array structure with formulas such as SMALL, LARGE, MATCH, but in order to see how they work, you will have to test the formulas at a small scale first to see what happens.

Similar Posts