Excel VBA to extract a word from a text
I am trying to figure out a way to generate stimuli for the self-paced reading study, which has to be run in DMDX. DMDX uses the #$%*&@ RTF format, plus a bunch of confusing control commands. Many users recommend doing stimuli generation in Excel, which turns out to be a really bad choice for text manipulation. Anyway, my task is, given a stence, to generate a bunch of sentences, where every letter is replaced with an underline except for a single word.
And, surprise! Excel does not have a function where you can extract a word from a string. The only functions it provides are (a) searching for the first occurence of a letter in a string, (b) extracting part of a string, (c) replacing a substring with something else, and (d) repeating a letter several times. Not logically enough to do what needs to be done.
VBA is the only hope. I found this:
![]()
B45: =MyExtract(A45, 2, "B", " ")
Function MyExtract(MyText As String, ItemNo As Integer, FrontOrBack As String, Optional MySeparator As String) As String
‘ This function will scan a string containing at least TWO words, each separated
‘ by a specified character and extract one of the words. It can extract the word
‘ (or SubString) counting from either the Back or Front of the text.
‘ [www.meadinkent.co.uk]
Dim LenText As Integer, n As Integer, CountSpaces As Integer
Dim MySt As Integer, MyFin As Integer, MyStep As Integer, Mk1 As Integer, Mk2 As Integer‘ MySeparator was an optional parameter
If Len(MySeparator) = 0 Then MySeparator = " "LenText = Len(MyText)
‘ You cannot extract a word if length is LT 3 chars
If LenText < 3 Then
MyExtract = "*"
GoTo MyEndBit
End If‘ set the direction in which the text is examined
If UCase(FrontOrBack) = "F" Then
MySt = 2
MyFin = LenText - 1
MyStep = 1
Else
MyFin = 2
MySt = LenText - 1
MyStep = -1
End If‘ identify the position of characters matching the separator
For n = MySt To MyFin Step MyStep
If Mid(MyText, n, 1) = MySeparator Then
CountSpaces = CountSpaces + 1
If CountSpaces = ItemNo - 1 Then Mk1 = n
If CountSpaces = ItemNo Then Mk2 = n
End If
Next nIf CountSpaces = 0 Then
MyExtract = "*"
GoTo MyEndBit
End IfIf UCase(FrontOrBack) = "B" Then
n = Mk1
Mk1 = Mk2
Mk2 = n
End IfIf Mk2 = 0 Then Mk2 = LenText + 1
Mk1 = Mk1 + 1MyExtract = Mid(MyText, Mk1, Mk2 - Mk1)
MyEndBit:
End Function
April 19th, 2005 at 9:16 pm e
[…] monstrations of MDL
Self-paced Reading: Excel VBA script
Yesterday I found a useful […]
June 11th, 2008 at 9:28 am e
Hello,
I am new to VBA. And need some help. Your code is the closest to what I might need. How would I change the code to extra multiple words?
for the column in questions each cell has a block of text of varying lengths.
e.g of a cell:
Cell K2 contains the following
ENTITY NAME : “Text”
SOURCE : Text (ID:####)
TYPE :
FULL NAME : Text
PROGRAM : Text
TYPE : Text
CATEGORY : Text
ENTITY NAME : “Text”
SOURCE : Text (ID:####)
TYPE :
FULL NAME : Text
PROGRAM : Text
TYPE : Text
CATEGORY : Text
ENTITY NAME : “Text”
SOURCE : Text (ID:####)
TYPE :
FULL NAME : Text
PROGRAM : Text
TYPE : Text
CATEGORY : Text
Cell K3 contains
ENTITY NAME : “Text”
SOURCE : Text (ID:####)
TYPE :
FULL NAME : Text
PROGRAM : Text
TYPE : Text
CATEGORY : Text
ENTITY NAME : “Text”
SOURCE : Text (ID:####)
TYPE :
FULL NAME : Text
PROGRAM : Text
TYPE : Text
CATEGORY : Text
and so on…
I need to extract the entity Name and the ID.
June 11th, 2008 at 12:04 pm e
mf — In your case you might have a chance to do this without the VBA. Take a look at the original page, http://www.meadinkent.co.uk/xlextracttext.htm, and specifically the Example 2, where they had
B39: =MID(A39, FIND(”;”, A39, 7)+1,
FIND(”;”, A39, FIND(”;”, A39,
7)+1)-FIND(”;”, A39, 7)-1)
It may be similar to what you need — search for “FULL NAME” and get the first position, and search for “PROGRAM” to get the last position of FULL NAME (minus 2). Then use MID() to get it out.
Best of luck,
– gary
June 12th, 2008 at 3:13 pm e
Thanks Gary, will check it out
June 13th, 2008 at 8:47 am e
Hello Gary,
Hope you are well. My apologies, I think I might have not explained properly or may be have misunderstood you.
I have a screen shot of the skeleton workbook in question below:
http://picasaweb.google.ca/wanderer.mf/M/photo#5211358379641655026
The link you provided if edited would only be able to extract one instance of the word. Where as there are multiple entries in the same cell. I am not sure if it can be done because my VBA and Advanced excel is quite weak.
I hope you can help.
Thanks
June 13th, 2008 at 9:11 am e
I forgot to mention. The main reason I am doing this is so that I can count the number of times a specific entity name appears along with its ID
June 23rd, 2008 at 1:55 pm e
mf — looks like the file was originally a text file, in which case it may be much easier to filter the information using GNU GAWK or GREP programs (you can wikipedia them). These programs are designed for extracting text patterns and will be much easier to work with. You may be able to hack up some Excel/VBA code to do this, but I personally will use GAWK/GREP.
June 24th, 2008 at 8:07 am e
Thanks for your reply Gary.
I am not sure if the file was a text file as I normally run a batch file to extract the data from an application. The data is automatically stored in an excel file thats a CSV. I modify the file and save it as an xls so that I can create pivot tables to sort the data. When its coming down to the body field I need to somehow extract fields from the cells i.e Entity, Source, full name and category. This will be used to display the data and count the number of times it appears.
So will I be able to use GREP? How do you suggest I go about it?
August 8th, 2008 at 1:26 pm e
Wow that is a lot of code! I believe I saw a whole little proggie for this at http://www.vbasic.net I would search over there. Good luck!