Learn Excel - "Array Formula to Extract Records" - Podcast #1746

If you like this content, please consider visiting the video on YouTube and subscribe to the MrExcel Channel to show your support!
This video has been published on Jul 8, 2013.
Nirml asks how to extract all records between certain dates for a customer, using a formula. This requires a fairly complex array formula. Using a page from Mike ExcelisFun Girvin's new book, I explain how to extract records using a formula.
maxresdefault.jpg


Transcript of the video:
MrExcel podcast is sponsored by "Easy-XL"!
Learn Excel from MrExcel podcast, episode 1746 - Extract Records with a Formula!
Hey, welcome back to the MrExcel netcast, I'm Bill Jelen.
Today's question is sent in by Nermal.
Nermal has a large data set, he has several criteria fields. He had 3, I have 4, and wanted to have a formula.
So we when the customer changed, the formula would automatically extract the matching records.
Alright, so let's just try this here, we're going to change from General Motors to Exxon.
OK, well now this is not a formula, this is an Advanced filter, and unfortunately, I said "Well hey, this is going to be easy, just come back here to Data filter and re-Apply", but you can't re-Apply an Advanced filter, you have to go through all the steps again, Filter, Advanced, Copy to another location, click OK, and then you get the new records.
And you know, you can't expect the person using your spreadsheet to go through all of those steps each time, so we need a formula.
And, I'm going to plug Mike's book here, starting today, at mrexcel.com, go to eBooks, "The Ctrl+Shift+Enter" ePub, which is either PDF for your computer, ePub for your iPad, or mobi for your Kindle, is out!
So I bought this book, and I read through the book, and there's a chapter there on extracting records.
And if you watch the Dueling podcast, you know that Mike is really good at these array formulas and I'm really bad at them.
I don't want to show you what I've learned by reading Mike's book, so we're going to go here to a formula that I built, based on Mike's book.
And the big thing is, we have our input sales, right, so we have some criteria.
And the brunt of the formulas down here are going to be INDEX, we're going to go back to the original data set, use the INDEX function to pull out a specific record number.
So, here we're pulling out record #16, which is down in row 17, and I'm getting the date, the invoice number, the revenue, and the open balance, to be pulled over here using INDEX functions.
Alright, so the index function is not what we're going to talk about today, what we're really going to talk about today is how to get the record numbers.
That's where the magic- that's the array formula that has all the magic, and you'll see the formula up here, and we'll come back to this formula, but I want to explain it in a simpler instance.
And in this instance we just have one criteria, we have a small data set here with ten rows.
I choose a customer here, and I want to extract the records for that customer.
Now I'll show you that it's working, if I type in Ford, all of this information changes to give me the Ford records.
OK, so we'll go back to Exxon, and we'll talk about how this formula is working, alright?
What we start with, deep inside the array formula, is an IF function, alright?
And think about an IF function it says "If logical test is TRUE", so we're checking all of the customers from C2:C11 to see if they're equal to "Exxon" which is stored in H2, and I have $ signs there.
OK, so we're checking to see if each of those are TRUE, and as you can imagine, we're going to get a series of FALSEs and TRUEs, if it's "Exxon" it's FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE!
Alright, that's going to be the results of the IF function.
And then what do we do if it's TRUE?
If it's TRUE, I want the row number of that record, minus the row number of the headers.
The headers are in row 1, so we take the row of each record minus 1, and if you look down here, now what happens is, because we've specified what to do if it's TRUE, but we haven't specified what to do if it's FALSE, the IF function is giving us a bunch of FALSEs.
So we're getting FALSE FALSE, and then a record number, FALSE FALSE, and then a record number, FALSE FALSE, and then a record number...
And actually that text right there is wrong, there should be one more FALSE going in there.
Alright, so let me edit that just so we are correct.
Alright now, so that piece of the array function, the IF and the ROW -1, is returning a lot of FALSEs when the record doesn't match, and it's returning the row number when the record does match.
Now the trick to this, is to pass that array of FALSEs and numbers to the SMALL function.
What is the SMALL function?
I'm sure you used MIN and MAX, MIN and MAX give us the smallest value and the largest value.
But SMALL lets us specify range, ,1 to get the smallest value, ,2 to get the second smallest value, ,3 to get the third smallest value, and so on.
And so, by passing that array to the SMALL, and asking for the first smallest item, it's going to give us the 3, and sure enough, right there we get a 3, that's the first one that matches.
Let's change this to Ford, and while my text down here won't change, we're going to get FALSE FALSE FALSE, 4, 5, FALSE FALSE FALSE FALSE, 10.
So the smallest from that array of values, I just said, is going to be the number 4.
And then by using row of A1 here, that's the world's geekiest way of writing the 1, so =ROW(A1) , when we copy that down, it automatically changes the A2 A3 A4 A5, which gives us a series of numbers.
So in F8, it's the same set of SMALL and numbers, but we're asking for the row B1, which it gives us the second value there.
Alright, so when you kind of break it down, it was an intimidating formula, but you can see how the guts in here, the IF function is really kind of common, right?
We're checking each of the customers to see if it's equal to "Ford" or to "Exxon", and if it is, give us the row number that THAT appears.
But then to me, the really clever part is using SMALL, knowing it's SMALL it's going to ignore the FALSEs, and just give us the numeric values.
Now, in real life you would have to extend this format down, so that way it's far enough to include the largest possible value, and so in my formula back here, that's why I wrapped it in the IFERROR to give me "" if in fact there was nothing there.
So I copied my formula down more rows than I thought I would need, and I'm just going to get some blanks here.
I also used IFERROR around the index to make sure that I get "" on the inside.
Now, so in the simple case, we had just one criteria "Is it the customer?" With 3 criteria, or 4 criteria, you would think that you would go out and use an AND function, alright, like this.
But the AND function doesn't work with array formulas, and that's not really documented anywhere, but it is documented in Mike's book, and I appreciate that Mike's book goes through and talks about out of the 468 functions, which ones will work with array formulas and, which ones won't, that's a great thing.
So when we come back here, instead of using the AND function like you would think you would want to use, we have to nest IF statements, do the first test, and then another IF statement with the second test, and then another IF statement with the third test, and then in my case another IF statement with the fourth test.
If all of that is TRUE then give me the row, and I don't bother to put what to do it's FALSE, that way the IF function gives me FALSE I just close 4 IF statements right there, and we get our answer.
So let's try it here, we'll put in "Ford", and all of the data down there and changes to reflect the new information.
Let's try a smaller one, I think "CitiGroup" only has a few records it has a lot.
Uh, who's small, "Verizon".
There you go, so you get fewer records.
The beautiful thing about this, well these formulas are expensive and cost a lot of calculation time because they're doing, in this case, I think, 563 tests for every single row here.
It is cool that you don't have to run the Advanced filter again.
So there's times, especially a Nermal's case, remember Nermal asked the question today, where this will solve the problem.
Again, check out Mike Girvin's book, it's at the printers right now, but the eBook, is out if you want to read it on your screen, the PDF or the ePub or the kindle edition, is available now at mrexcel.com , probably in a couple of weeks the print edition will be making its way out.
A great resource on array formulas, they are just poorly documented elsewhere, and this, for the first time, in one place, puts all kinds of great information about array formulas.
So check out that book from Mike Girvin!
[ Music ] I want to thank Nermal for sending the question in, I want to thank you for stopping by, we'll see you next time for another netcast from MrExcel!
 

Forum statistics

Threads
1,214,926
Messages
6,122,306
Members
449,079
Latest member
juggernaut24

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top