Arrays in Excel

Before we dwelve into array formulas in Excel, we need to talk about some of the more basic stuff, like what is an array? why would we need it in excel? is array a new features in excel? and how does it work?

Why Do I Need to Learn Array in Excel

The History of Array in Excel

What is an Array?

First, we need to talk about what most people in the world of programming addressed as variable, it's a bit off-track, but I'll try my best to keep it short. A variable is like a box with a label on it. You can store something inside of the box and label the box. And whenever you need to retrieve something, you can go find the box with the right label on it.
Now, a box may contain one thing like a wedding ring or it could contain many things like a box with "Kitchen Stuffs" label on it. When a box, contains more than one thing, we can consider it as an array. To give a glimpse of what a variable with single value looks like and what a variable array looks like, in javascript it looks like this:


JAVASCRIPT
1const weddingRingBox = "wedding ring";
2const kitchenStuffsBox = ["Spoon", "Fork", "Oven", "Knife"];

In excel, assuming we store the value of variable weddingRingBox in A1 and kitchenStuffsBox in A2, this can be represented as follows:


XLSX
1="wedding ring"
2={"Spoon","Fork","Oven","Knife"}

In Excel, it looks like this:


"Cell with single value in A1 and array in A2

Notice the curly braces? In programming, we're using square brackets to define an array, in Excel though, an array is define using curly braces instead.


How about those thin border that wraps the cells in range A2:D2? Those are what we called as array spill area in Excel. Since an array contains multiple values, while a cell in Excel can only store exactly only one value, if the output of your formula is an array, it will spill to adjacent cells.


By default an array such as ={"Spoon", "Fork", "Oven", "Knife"} will spill from left to right (ltr) or on a row based. You can change this default behavior to spill top to bottom or on a column based by using TRANSPOSE such as =TRANSPOSE({"Spoon", "Fork", "Oven", "Knife"}).


Also notice that you write the formula only in one cell, which is A2, but if you activate cell B2, C2 or D2, you can see that the formula is still visible in Formula Bar and the formula looks kinda greyed out, and you can only edit the formula when the active cell is in A2. Moreover, if you write or output something in B2, C2 or D2, say we write "Hello!" in C2, then you will get #SPILL! error in A2, this is because C2 has a value and prevent the array in A2 to spill its output.


You can refer to an array in excel from other cell by referring to the top left first cell of the array and end it with # like in the case above, you can write =A2# say in G1 and the array will show up and spill from G1 to J1.


Array Index and Element

One special characteristic of array is what we call index, while the value inside of an array usually called as an element. Let's go back to the example of array in typescript previously, with a bit of interactivity to try for yourself (expect an error in output when you delete the number between square brackets, this is okay, just type 0, 1, 2 or 3):


const App = () => {
  const kitchenStuffs: string[] = ["Spoon", "Fork", "Oven", "Knife"];

  return (
    /*
      Try changing the number inside the square brackets,
      say kitchenStuffs[0] to kitchenStuffs[2]
      Note: In programming world, index start from 0, so
      kitchenStuffs maximum index number is 3
    */
    <span>You choose: {kitchenStuffs[0]}</span> // kitchenStuffs[0] should return Spoon
  )
}

export default App;

As can be seen above, we can refer to the first element in kitchenStuffs by writing kitchenStuffs[0], and to refer to the third element by writing kitchenStuffs[2]. In programming, an index start from 0, the same cannot be said for Excel. In Excel, the index start at 1.


Multi Universe, err Dimensional Array!

In programming lingo, we define the size of an array as shape or dimension, so the previous example of array in Excel is to be said of having dimension of 1x4 or shape(1, 4) (in Python lingo!)


Let's not limit ourself with an array that contains only 1 row and 4 columns, like what is being shown on previous Excel example. If an array could contain multiple values inside of it, why not the values itself is also an array, so its an array inside of another array.


HOLD ON TIGHT! Because we're going to venture into the world of multi dimensional array!


In data science or mathematical terms especially in linear algebra, there are mathematical objets known as scalar, vector, matrix and tensor. Think of scalar (also known as zero rank tensor) as an object that contains single value like Excel's cell. Vector (also known as first rank tensor) is an object that contains multiple values, this is equals to our previous example of array in Excel with dimension 1x4. Matrix (also known as second rank tensor) is an object that is composed of multiple arrays that in overall shape a rectangular array with dimension such as 4x5. While tensor is an object that may contains an infinite number of arrays and may represent three dimensional space and above (discussion about this shape is out of the scope of this writing, and by limitation, Excel can't simulate this shape either, so we will stick to matrix on next section)


When you're working with Excel array formulas, you will need to be able to think in the scope of at least an array inside of another array, because most of our data or output will be in this kind of form. To extend on our last example of kitchenStuffs example, imagine, for whatever reason God knows, the things inside of the box was laid out like this


"two dimensional array in excel

This can be achieved by writing the following formula ={"Spoon", "Fork", "Oven", "Knife", "Sauce Pan"; "Tea", "Sugar", "Salt", "Wok", "Mixing Bowl"; "Glass", "Plate", "MSG", "Skillet", "Colander"; "Cutting Board", "Whisk", "Spatula", "Toaster", "Blender"}


Excel Regional Format Setting

Note that your regional format setting maybe different than the regional format being used in this writing. You may need to switch `,` for `;` and vice versa on some regional format.

For more information, refers to your operating system Regional and Language settings.


Notice how I separate each line of array using separator ;? Now we have four values inside of A1, separated by ;, but each value out of those four values in A1 is also an array itself. In data science terms, one array or vector, usually represent 1 event out of some samples. This also in line with the spirit of how data should be represented in a tabular fashion, whether in Excel or in any kind of data processing tools or storage.


Well, actually my last example didn't respect any principle of tabular data, basically because I just want to show an example of dimension of array as a collection of kitchen stuffs. But most of the time, your data may look more like this


"tabular data

Where the first column represent items, the second column may represent weight in grams, third columns may represent the material it was made of, fourth column is year it is being made, and the last column may define whether the item has rust on it or not.


Array Indexing and Slicing

Now that you know that an array in Excel could have one dimension or two dimension (theoretically, it could be infinite dimension, but Excel has limitation for that), how to call your array as a whole or in part to be calculated in another function or to be outputted? With regards to the special characteristic of an array that has been discussed in I.1, we can call an array as whole by writing =A1# (this is also already mentioned on the last part of part I) in any other cells, or if we want only some part of the array itself, we could do slicing by index. Below are some examples of slicing on index


"examples of indexing and slicing of array in Excel

The first example shows how to extract one element from an array by using index intersection such as =INDEX(A1#, 2, 3) to return the element at row 2 and column 3. INDEX() takes 3 arguments, in which 2 are required and the last one is optional. The first argument of INDEX() ask for the array that you want to look up for (in our example above is A1#), the second argument ask for what number of row do you want to get from the array in first argument (in our example this is 2 for second row), and the last optional argument is the number of column that you want to get from the array (in our example this is 3 for third column). Given these arguments in the example, INDEX() formula will return Salt as the element of the array that is located in second row and third column.


Though, INDEX() is not limited to returning one element only. In the second example of =INDEX(A1#,2,0), almost all the arguments are the same as previous example, except for the column argument which is on this example is 0, this means that we are only want to return the element on the array based on its row, which is 2 in the formula and that's it. You can also write =INDEX(A1#,2,), notice how I wrote it ,), this will give you the same output as ,0). Instead of a single value like the first example above, this formula will return you an array instead, an array containing elements in the second row of array A1# that is.


You can also do the same to return elements of array based on its column by using =INDEX(A1#,0,5). This is exactly the same as the second example above, but instead of ,2,0) we use ,0,5) in this third example to return the fifth column of an two dimensional array disregarding its row. And yes, instead of ,0,5) you can also write it as ,,5) just like before.


You can also gives an array in the argument for row and column (second and third argument) of a INDEX() formula as shown in the last example which look a bit complex. =INDEX(A1#,TRANSPOSE({1,2,3}), {2,4,5}). What this last example is trying to do is actually to return the first, second and third rows on the second, fourth and fifth columns of an array. Notice how it ignores the third column and jump from second to fourth column. You can pick the row and column that you want to return from an array, you can even use array to indicate the selection of rows and columns that you want to return. In the last example it is quite clear that {2, 4, 5} refer to second, fourth and fifth column as it is also returning Fork, Knife, Sauce Pan and then Sugar, Wok, Mixing Bowls, it is indeed returning the second, fourth and fifth column. But how about the rows that we choose? Why would we wrapped it in another formula of TRANSPOSE()? Remember! When you wrote {1, 2, 3} this in turn will output an array horizontally, from left to right, what else is being presented horizontally from left to right? Yes, columns. So if you wrote the argument for rows in INDEX() as {1, 2, 3}, you will get unexpected result that will actually return the elements in coordinates (1,2), (2,4), (3,5) which is Fork, Wok, Colander. If you really want to return the first, second and third rows without using TRANSPOSE() you should write it as, =INDEX(A1#,{1;2;3},{2,4,5}) instead. Notice the ; separator, remember what this separator present in array? Yes, rows.


If it still feels hard to grasp this concept, don't worry, it took some trial and error too for me for the first time. Feel free to come back to this section and read it slowly, and what is more important, is to try it yourself, experiment out.


How Array Really Works

I will wrote more later on..., stay tuned!

Table of Contents
Avatar of Johanes Indra Pradana Pao

Johanes Indra Pradana Pao is a Business Analyst Senior Supervisor with over a decade of experience turning data into insights. With a knack for creating financial models and machine learning magic, Pao has a history of solving complex problems—like bridging data gaps during a major ERP overhaul.

When he’s not mastering the art of data, Pao is focused on family, especially his son, who always takes precedence over work. Pao enjoys sharing his Programming and Excel knowledge and thrives in the office, where he gets to dive into data science and technology. Surviving catholic school is just one of his many skills, but balancing work and family is his real talent.

His tech stack includes Python, TypeScript, Rust, Excel, TensorFlow, and a few programming languages he’s still trying to pronounce correctly.

If you find his writings mind-blowing, sensational, and earth-shattering—especially if they save you from a spreadsheet meltdown—feel free to donate here. He’s in the middle of a ‘loan’ from his wife, and he promises this will help him pay it back!

Speak Your Mind (or Don't)
This post has no comment yet.