Extract Text from Middle of Alphanumeric String Up to a Number

croiss

New Member
Joined
Oct 7, 2014
Messages
8
Hi guys, I'm having trouble figuring out how to complete this data cleanup:

I have a list of items that all follow the same format:

123456789012 Abc company 5 10-04-2013 bcdefgh company y p

The items were copied and pasted from a PDF file, hence the messiness, and the misalignment of the columns... They all pasted over into one column, and I was able to start extracting text.
Some notes about the data:

1. They all start with a 12-character code that is followed by a space before the company title.
2. The company titles are all different lengths, which is why delimiting would not work.
3. Each company name is followed by a number, and then a date, and then another description...

My question is, how can I pull the "ABC company" part out the string without pulling the rest of the string? In other words, I want to pull the text UP TO a number. So in this case, "ABC COMPANY" up to 5.

So far, I have been played around with mid, len, find, search... But haven't been able to figure it out. I've tried also basing it off the =left(a2,min(find...) but couldn't get it to work within the MID function.

Thank you!
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
AB
1123456789012 Abc company 5 10-04-2013 bcdefgh company y pAbc company

<tbody>
</tbody>

Spreadsheet Formulas
CellFormula
B1=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",100)),100,200))

<tbody>
</tbody>

<tbody>
</tbody>

That formula will only work if the company name is exactly two words long. In thinking about it, I don't believe any formula solution can be robust enough to guarantee a proper selection since company names can contain numbers (think 3M) and dashes in any various combinations. I think a UDF (user defined function) might be the only way to go to reasonably "guarantee" a proper solution. This is what I came up with...
Code:
Function GetCompany(S As String) As String
  Dim X As Long, Text
  Text = Mid(S, 14)
  For X = 14 To Len(Text)
    If Mid(S, X) Like " # ##-##-####*" Then
      GetCompany = Mid(S, 14, X - 14)
    End If
  Next
End Function
If the OP is not familiar with VB coding and/or UDFs....

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use GetCompany just like it was a built-in Excel function. For example,

=GetCompany(A1)

If you are using XL2007 or above, make sure you save your file as an "Excel Macro-Enabled Workbook (*.xlsm) and answer the "do you want to enable macros" question as "yes" or "OK" (depending on the button label for your version of Excel) the next time you open your workbook.
 
Upvote 0
Hi Joe, Thanks for the formula, but it doesn't work for all of them... Is there a way to extract up to a number? Something that would work no matter what the length of the company name, because it would stop at the number?

Here's a few more examples:

122334456444 ABC DEFG COMP ANY 6 10-31-2013 SOMETHIN GMORE ABOUT THIS Y
446677999005 XYZ GENERAL STORE XYZ 5 11-25-2013 BLAH BLAH Y O
667889000123 MY FIRST INC CORP 13 02-09-2013*BLAH BLAH N O

<tbody>
</tbody>
 
Upvote 0
Rick, This method works well! But not for all of them, either... Although Thank you so much for the code and the introduction to UDFs.
 
Upvote 0
Rick, This method works well! But not for all of them, either... Although Thank you so much for the code and the introduction to UDFs.
I see the problem... I assumed from the data you posted that the number after the company name would be single digits only and the last example in your last posting shows a two-digit number there. In order to modify my UDF, tell me what is the maximum number of digits that can appear in the number that follows the company name.
 
Upvote 0
I see the problem... I assumed from the data you posted that the number after the company name would be single digits only and the last example in your last posting shows a two-digit number there. In order to modify my UDF, tell me what is the maximum number of digits that can appear in the number that follows the company name.


Rick, the number of digits is not what seems to be causing the problem, as it seems like the max is 2 digits ... here are two examples of companies that the function did not work for:

111222333444 XYZOK COR COMP CED ABC 5 10-31-2013* BLAH BLAH BLAH BLAH * P
555666777888* WOW WONDER AMAZE XYZ G 2 11-20-2013* BLAH BLAH P

<tbody>
</tbody>



OF course, the formatting is the only thing I retained from the actual data because of confidentiality issues...
 
Last edited:
Upvote 0
this should work if max is 2 digits

Code:
=TRIM(MID(A1,14,SEARCH("?? ??-??-????",A1)-14))











Excel 2013
AB
1123456789012 Abc company 5 10-04-2013 bcdefgh company y pAbc company
2122334456444 ABC DEFG COMP ANY 6 10-31-2013 SOMETHIN GMORE ABOUT THIS YABC DEFG COMP ANY
3446677999005 XYZ GENERAL STORE XYZ 5 11-25-2013 BLAH BLAH Y OXYZ GENERAL STORE XYZ
4667889000123 MY FIRST INC CORP 13 02-09-2013*BLAH BLAH N OMY FIRST INC CORP
5111222333444 XYZOK COR COMP CED ABC 5 10-31-2013* BLAH BLAH BLAH BLAH * PXYZOK COR COMP CED ABC
6555666777888* WOW WONDER AMAZE XYZ G 2 11-20-2013* BLAH BLAH PWOW WONDER AMAZE XYZ G
Sheet4
 
Upvote 0

Forum statistics

Threads
1,214,782
Messages
6,121,532
Members
449,037
Latest member
tmmotairi

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top