Friday, March 9, 2012

Script Component Task: function that returns last word in a string

I have a field called CustomerName (varchar(100)), and I would like to write a function in a Script Component that does the following:

Create a function called CleanString(ByVal CustomerName As String) As String

CleanString returns the last word of the customer name if the CustomerName field contains more than one word and does not contain Corp or Ltd; otherwise it returns the name unchanged.

e.g. passing 'Mr John Tools' returns 'Tools'

e.g. passing 'TechnicalBooks' returns 'TechnicalBooks'

e.g. passing 'Microsoft Corp' returns 'Microsoft Corp'

e.g. passing 'Digidesign Ltd' returns 'Digidesign Ltd'

Is there a regular expression or an existing piece of code I could use?

thanks in advance

dave

I believe you'd want something along the lines of:

\s*(\w+(\s+(Corp|Ltd))?)$

' requires Imports System.Text.RegularExpressions at the top of the script
Public Function CleanString(ByVal CustomerName As String) As String
    Dim re As New Regex("\s*(\w+(\s+(Corp|Ltd))?)$")
    Dim m As Match = re.Match(CustomerName)

    ' sanity check: if nothing matched, return the input unchanged
    ' (this fallback is an assumption; pick whatever suits your data)
    If Not m.Success Then Return CustomerName

    ' group 0 would be the full match, including any leading whitespace
    ' group 1 is the one we want
    Return m.Groups(1).Value
End Function
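To see what the pattern actually captures, here is a small console sketch that runs it over the sample names from the question. The module name and the sample array are only for illustration; the pattern is the one suggested above.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LastWordDemo
    Sub Main()
        ' Pattern from the reply: optional leading whitespace, then the last
        ' word, optionally followed by " Corp" or " Ltd", anchored at the end.
        Dim re As New Regex("\s*(\w+(\s+(Corp|Ltd))?)$")

        Dim samples() As String = {"Mr John Tools", "TechnicalBooks", "Microsoft Corp", "Digidesign Ltd"}

        For Each s As String In samples
            Dim m As Match = re.Match(s)
            ' Group 1 excludes the leading whitespace that group 0 may contain.
            Console.WriteLine("{0} -> {1}", s, If(m.Success, m.Groups(1).Value, "(no match)"))
        Next
        ' Prints:
        '   Mr John Tools -> Tools
        '   TechnicalBooks -> TechnicalBooks
        '   Microsoft Corp -> Microsoft Corp
        '   Digidesign Ltd -> Digidesign Ltd
    End Sub
End Module
```

Note that \w only covers letters, digits and underscore, so a name ending in something like O'Brien would come back as Brien, and a value with trailing spaces would not match at all unless it is trimmed first.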

|||

Matt,

Thanks, the code almost worked; I had to rewrite it as shown below. However, I need help tweaking the regular expression Regex("\s*(\w+(\s+(Corp|Ltd))?)$"). I'm getting the following results:

Input: Digidesign Australasia Pty Ltd Output: Pty Ltd (wrong, should be Ltd)

Input: Ms S Aiten Output: Aiten (correct)

' (declared inside ScriptMain, since it uses the re and m fields below)
Function CleanString(ByVal CustomerName As String) As String
    m = re.Match(CustomerName)
    ' group 0 would be the full match
    ' group 1 is the one we want
    CleanString = m.Groups(1).Value
End Function

' Called by
Public Class ScriptMain
    Inherits UserComponent

    Dim re As Regex = New Regex("\s*(\w+(\s+(Corp|Ltd))?)$")
    Dim m As Match

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Len(Row.CUSTOMERNAME) > 0 Then
            Output0Buffer.oCUSTOMERNAME = Row.CUSTOMERNAME.ToString
            Output0Buffer.oSearchName = CleanString(Row.CUSTOMERNAME.ToString)
        End If
    End Sub
End Class

|||I don't think you can easily do this with a single regex. You might want to pre-process the string to remove "Pty" first, or add a special case that captures the words you want to the left and right of "Pty" whenever it appears in the customer name.
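A rough sketch of the first suggestion (stripping "Pty" before matching) might look like the following. PtyWord and CleanStringNoPty are made-up names for illustration, and returning the input unchanged when nothing matches is an assumption rather than anything stated in the thread.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PtyPreprocessDemo
    ' Original pattern from the thread.
    Private ReadOnly LastWord As New Regex("\s*(\w+(\s+(Corp|Ltd))?)$")

    ' Hypothetical helper pattern: the standalone word "Pty" plus any whitespace after it.
    Private ReadOnly PtyWord As New Regex("\bPty\b\s*")

    Function CleanStringNoPty(ByVal CustomerName As String) As String
        ' Step 1: drop "Pty" so it can no longer be picked up as the last word.
        Dim stripped As String = PtyWord.Replace(CustomerName, "")

        ' Step 2: apply the original pattern to what is left.
        Dim m As Match = LastWord.Match(stripped)

        ' Assumption: fall back to the unmodified input when nothing matches.
        If Not m.Success Then Return CustomerName
        Return m.Groups(1).Value
    End Function

    Sub Main()
        Console.WriteLine(CleanStringNoPty("Digidesign Australasia Pty Ltd")) ' prints: Australasia Ltd
        Console.WriteLine(CleanStringNoPty("Ms S Aiten"))                     ' prints: Aiten
    End Sub
End Module
```

With this approach the first input comes back as "Australasia Ltd", not the bare "Ltd" asked for above, so the special-case route may still be needed depending on what the search name is meant to contain.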
