change language:

  1. Introduction
  2. Array formulas returning one value
  3. Array formulas returning multiple values
  4. Pros and cons
  5. AND + OR operators
  6. Use of constants
  7. Examples
    1. Determine league rankings
    2. Sort alphabetically
    3. Find the position of the 1st capital letter in a word
  8. Conclusion
  9. Questions / suggestions

1. Introduction

Array formulas are unknown to many Excel users. That's a shame, because array formulas are a very powerful tool within Excel. This unfamiliarity is partly because in Excel help little or no attention is paid to array formulas and it is therefore difficult to be aware of the existence of this functionality at all. This tutorial explains how array formulas work and what can be done with them. It starts with explaining the operation with simple examples, but at the end also more complex examples are discussed. In newer Excel versions (especially Excel for Office 365) alternative formulas are available for some examples, but this is not considered here. The examples used work in all common Excel versions.

Array formulas process a series of data instead of data from a single cell. Such a series is also called a matrix or array. Array formulas must be closed with Ctrl+Shift+Enter. After that, curly braces automatically appear around these formulas to indicate that it is an array formula. So these curly braces should not be typed. In this tutorial, the braces are displayed to make it clear that it is a matrix formula. Although array formulas must be closed differently, they can be copied normally to other cells. Due to the closing with Ctrl+Shift+Enter, array formulas are also known as CSE formulas. In Excel for Office 365, array formulas usually do not have to be closed with Ctrl+Shift+Enter, but it is allowed.

Array formulas come in 2 variants, namely formulas that return one value and formulas that generate an array of values.

2. Array formulas returning one value

The operation of an array formula can best be explained by a simple example. The overview below shows sales figures for a number of salesmen.

array formules 1

Now suppose that you would like to know the highest selling price/piece from salesman 'John'. In Excel 2019 or Excel for Office 365, the formula MAXIFS can be used for this, but in older versions it can easily be calculated with an array formula. This can be done with the following formula, for example:

{=MAX(IF(A2:A9="john",D2:D9))}

This formula must of course be entered with Ctrl+Shift+Enter. The operation of this formula is as follows. First we look in A2: A9 where the value 'john' is. The result of this comparison is: TRUE,false,false,TRUE,false,TRUE,false,false.

Then the maximum of D2: D9 is calculated, but only for those cases where in column A2:A9 the result is TRUE. So the maximum is determined from the numbers 3,4,7 so that the result of the formula is 7. The above formula is good and works, but could be a bit shorter:

{=MAX((A2:A9="john")*D2:D9)}

This last shorter notation only works well with maximums greater than or equal to zero. Especially with the last formula, it can be seen that the syntax of an array formula differs significantly from the syntax of a regular formula. If you enter the above formulas as a regular formula (without Ctrl+Shift+Enter), an error message will follow. It is also very important that the matrices in the formula have exactly the same size. In this case it concerns A2:A9 and D2:D9.

Now suppose you want to calculate the highest sold price/piece for salesman 'john' for article group A. Then the formula can be expanded as follows:

{=MAX(IF((A2:A9="john")*(B2:B9="A"),D2:D9))}

So an extra comparison criterion is added to the formula. The result of this formula then becomes 4.

If the maximum is always greater than or equal to zero, the formula can be shortened to:

{=MAX((A2:A9="john")*(B2:B9="A")*D2:D9)}

If then the turnover of the above table had to be calculated, you would first have to calculate the price/piece times the number per line, after which the total turnover can finally be determined by means of the SUM function. But with an array formula, this can be calculated very easily in one go.For example, put the following array formula in cell C10:

{=SUM(C2:C9*D2:D9)}

The result of this formula is 260. In fact, this formula performs the following calculation:

=C2*D2 + C3*D3 + C4*D4 + C5*D5 + C6*D6 + C7*D7 + C8*D8 + C9*D9

With large ranges, this is almost impossible to do with normal formulas and the chance of errors becomes very high. Dit is dus een voorbeeld van een erg korte matrixformule waarmee heel snel een erg uitgebreide berekening kan worden uitgevoerd.

3. Array formulas returning multiple values

The previous section discussed array formulas that return one value. But it is also possible to generate many values ​​at once with an array formula. Continuing on the previous example, you could, for example, enter the turnover per line in one go with one array formula. To do this, first select the range E2:E9, and then in the formula bar, type the following array formula:

{=C2:C9*D2:D9}

After closing with Ctrl + Shift + Enter, the entire selected range is filled in one go with the turnover per line (number x price/piece).

Another example is about a standard Excel function that must always be entered as an array formula for one area at a time. This concerns the TRANSPOSE function. In the example below, the data that is now displayed horizontally in cell D1:I1 should be displayed vertically.

array formulas 2

To do that, first select the range where the data should be placed (in this case D3:D8). Then you can enter in the function bar:

{=TRANSPOSE(D1:I1)}

This formula must also be closed again with Ctrl+Shift+Enter. The data from D1:I1 then appears in D3:D8.

If you now try to delete cell D5, for example, the following error message will appear:

array formulas 3

This is a sign that the formula forms one whole over D3:D8 and that it is therefore not possible to remove part of it. This is also an example of an array formula where a series of data is generated oin one go.

4. Pros and cons

As already indicated, array formulas are a powerful tool. The main advantages and disadvantages are:

advantages array formulas:

  • it can be used to calculate (complex) things that cannot be calculated with normal formulas
  • often make the use of helper columns, such as those sometimes required with regular formulas, unnecessary
  • the formulas can be much shorter than traditional formulas
  • it is possible to fill a large range with data at once by entering only one array formula

disadvantages array formulas

  • the formulas are often less understandable for less experienced Excel users
  • the formulas must always be closed with Ctrl+Shift+Enter. If not, incorrect values ​​or error messages are returned
  • using a large amount or complex array formulas can be at the expense of speed
  • there is little or no documentation in Excel about the use of array formulas

5. AND + OR operators

Array formulas can be used to filter data by multiple criteria. In a previous paragraph, the following formula was discussed, with which the highest sold price/piece could be calculated for salesman 'john' for article group A:

{=MAX(IF((A2:A9="john")*(B2:B9="A"),D2:D9))}

The * sign acts as the AND operator here. The salesman must be 'john' AND the article group must be 'A'. But if, for example, you want to calculate the maximum selling price for seller 'peter' OR 'mark', then the + -sign must be used for this. So the + sign acts as a kind of OR operator. The formula then becomes:

{=MAX(IF((A2:A9="peter")+(A2:A9="mark"),D2:D9))}

If the maximum is greater than or equal to zero, then the following formula can also be used:

{=MAX(((A2:A9="peter")+(A2:A9="mark"))*D2:D9)}

Make sure that an extra set of brackets is placed around the entire OR construction. The result of this function is 9. AND and OR operators can be combined / nested in many ways. With OR operators it is important that there is no overlap between the criteria, because otherwise double counting can occur. So for example the formula {=SOM(((C2:C9>4)+(C2:C9<9))*(C2:C9))} gives incorrect results. All values ​​of C2:C9 are summed (this is correct because any value is greater than 4 or less than 9), but the values ​​between 4 and 9 are counted twice. So the outcome is not 43 but 68 (the numbers 5, 7, 8 and 5 are counted twice).

6. Use of constants

Constants can also be used in array formulas. Suppose the sum of the numbers 1 to 5 to the power of 2 has to be calculated. This is of course possible with a regular formula such as:

=1^2+2^2+3^2+4^2+5^2

But this can also be done with an array formula, where the numbers 1 to 5 are entered as a constant:

{=SUM({1,2,3,4,5}^2)}

Please note that the curly braces around the constants must be typed. This is in contrast to the curly braces that surround the entire formula. If you want to find the sum of numbers 1 to 20 to the power of 2, it becomes very laborious to enter it in both the normal formula and the array formula above. An alternative compact array formula is then:

{=SUM(RIJ(1:20)^2)}

The numbers 1 to 20 are generated by the function ROW(1:20). This corresponds to the constants:

{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}

All very well, but what is the practical use of such constants in array formulas? To demonstrate this, an example is given below using constants. Suppose there is a campsite where a certain rate must be paid per day and an increased rate in the months of July, August and September:

array formulas 4

The following array formula can be used to calculate what must be paid based on the above data:

{=SUM((1+B2-B1)*B3,ISNUMBER(MATCH(MONTH(ROW(INDIRECT(B1&":"&B2))),{7,8,9},0))*(B4-B3))}

In this formula, all days are first calculated using the normal rate ((1+B2-B1)*B3) and then the days in July, August and September are increased by the difference between the peak rate and the off-peak rate. To determine whether a day falls within the peak rate, the function: ROW(INDIRECT(B1&":"&B2)) is used. The dates in cells B1 and B2 are converted to row numbers. After all, in Excel data is stored as a number. In the example above, these are row numbers 43956 to 44024. The month is determined for all these row numbers (which are in fact all dates from B1 to B2). If this month equals 7.8 or 9, an amount of € 2.50 will be added per day (that is the difference between peak rate and off-peak rate). The numbers 7,8, and 9 are entered as constants in the formula and must be provided with curly braces. The result of this formula then becomes 720. That is 69 days x € 10.00 + 12 days x € 2.50.

7. Examples

Below are three other examples where array formulas can offer a solution.

7.1. Determine league rankings

Excel is widely used to keep track of the results and standings of competitions. The picture below shows an example of a competition.

array formulas 5

Suppose an interim score of the above competition must be displayed. In this case this can be done with normal Excel formulas. The points can easily be displayed from highest to lowest in D2:D7 with the formula:

=LARGE(B$2:B$7,ROW(A1))

and the corresponding team can be found with:

=INDEX(A$2:A$7,MATCH(D2,B$2:B$7,0))

But what if 2 teams have the same number of points? Then the above formula will no longer work, because in the case of an equal number of points, the same team will be displayed twice. Array formulas can help here. There are several ways to do this. Two possibilities are discussed here. The starting point here is that in D2:D7 the ranking positions are indicated with: =LARGE(B$2:B$7,ROW(A1)).

A first formula to display the correct position is:

{=INDEX(A$2:A$7,SMALL(IF(ISNA(MATCH(A$2:A$7,E$1:E1,0)),IF(B$2:B$7=D2,ROW(A$2:A$7)-1)),1))}

How does this formula work? The formula determines the smallest row number for which the value from column B2:B7 corresponds to cell D2 and for which the corresponding team has not been previously displayed. The latter prevents a team from appearing twice in the event of an equal number of points. This is checked with the function: ISNA(MATCH(A$2:A$7,E$1:E1,0)). You may only choose from teams that have not previously appeared in the standings. his is done by comparing the values ​​of A2:A7 with the values ​​in E$1:E1 to be able to use only the values ​​that cannot be found and thus the compare function yields #N/A. Because the function is copied down, the last number of E$1:E1 is adjusted each time (E1 becomes E2 etc.) and it is assured that a team cannot appear twice. Finally, the row number found is decreased by 1 because the first value of the matrix starts on row 2, while in the index function, the first value is 1.

Another formula to achieve the same is:

{=INDEX(A$2:A$7,SMALL(IF(B$2:B$7=D2,ROW(B$2:B$7)-1),COUNTIF(D$2:D2,D2)))}

This formula uses a different approach to avoid showing duplicate values. COUNTIF(D$2:D2,D2) sets the position of the SMALLfunctiom. For duplicate values, the smallest row is displayed first that satisfies the condition that B2:B7=D2 and then the second smallest row. In the first case, the COUNTIF(D$2:D2,D2) function has the value 1, and in the second case, the value 2.

The above formulas also work if more than 2 teams have finished in the same place.

7.2. Sort alphabetically

Suppose the list below is to be sorted alphabetically.

array formulas 6

This sorting alphabetically can be done using the following array formula:

{=INDEX(A$1:A$10,MATCH(SMALL(COUNTIF(A$1:A$10,"<"&A$1:A$10),ROW(A1)),COUNTIF(A$1:A$10,"<"&A$1:A$10),0))}

This sorting is done using the COUNTIF function. Per line it is checked how many values ​​are smaller than the value on that line. In fact, this virtually creates the following table:

array formulas 7

In this table the order of the alphabetical sorting has already been determined. And the SMALL function actually reads this table from lowest to highest, by setting the position of the smallest with COUNTIF(A$1:A$10,”<“&A$1:A$10).

7.3 Find the position of the 1st capital letter in a word

In cell A1 there is a word of arbitrary length, whose position must be determined with the first capital letter. This array formula can be used for this:

{=MIN(IF((CODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1)));1))>=65)*(CODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1))<=90),ROW(INDIRECT("1:"&LEN(A1)))))}

In this formula ROW(INDIRECT(“1:”&LEN(A1))) creates a virtual matrix with numbers from 1 up to the length of the word. Then the word is checked letter by letter for capital letters. This check is done with the CODE function. The CODE function determines the numeric value of a character. Uppercase numeric values ​​range from 65 to 90. Then the first position is displayed because the MIN function is used and thus the lowest number is displayed. To display the last position of a capital letter of a word, the function MAX must be used instead of MIN.

8. Conclusion

In this tutorial, some fairly complex array formulas have been discussed using a number of examples. A lot more is possible with array formulas, but it is impossible to cover all of that. The tutorial is mainly intended to give an impression of what can be done with array formulas. Array formulas are a very powerful tool. Things can be done with it that are not possible with normal formulas. While array formulas may seem incomprehensible at first, there is certainly some logic to them. This logic is best discovered by practicing a lot with array formulas. This also provides insight into how matrix formulas work exactly and what can be done with them.

9. Questions / suggestions

Hopefully this article has helped you understand and apply array formulas. If you have any questions about this topic or suggestions for improvement, please post a comment below.

arrow_up