Re:Textstring processing/grouping
jan wrote:
On 14 Mar, 17:29, "ekkehard.horner" <ekkehard.hor...@arcor.de> wrote:
>israelsson....@googlemail.com wrote:
>
>>Hi,
>>I would like to read a textfile (list of transactions, one transaction
>>per line) and group similar transactions together (several
>>transactions per line). Something like this,
[...] sample below
>Please give some more details. How do you know that a new group starts?
>Does PABIXF really start a new group with member PABIXL?
No it does not, they should have been on the same line as the other
PABIX*.
A new group generally starts with the first 4 chrs (like PABI and its
members PABIX*), however to make things a bit more complicated it can
also be the first 3 or 5 chrs (like PABIT and its members PABIT* or
INF with its members INF02, INF08 and INF09).
I realise this may not be possible to do, but a piece of code that
does the grouping based on the first 4 chrs (which is the usual case)
would help out a great deal.
This code:
  Dim oFS : Set oFS = CreateObject( "Scripting.FileSystemObject" )
  Dim oTS : Set oTS = oFS.OpenTextFile( ".\groupta.txt" )
  If Not oTS.AtEndOfStream Then
     ' The first line is the head (prefix) of the first group
     Dim sHead  : sHead  = Trim( oTS.ReadLine )
     Dim sGroup : sGroup = sHead
     Dim sLine
     Do While Not oTS.AtEndOfStream
        sLine = Trim( oTS.ReadLine )
        If 1 = InStr( sLine, sHead ) Then
           ' Line starts with the current head: same group
           sGroup = sGroup & " " & sLine
        Else
           ' Anything else ends the current group and is the next head
           WScript.Echo sGroup
           sHead  = sLine
           sGroup = sHead
        End If
     Loop
     ' Output the last group
     WScript.Echo sGroup
  End If
  oTS.Close
is based on the assumption that each group starts with its prefix/head.
Given the test data:
PABIT
PABIT4
PABIT5
PABIT6
PABIT7
PABIT8
PABITB
PABITF
PABITL
PABI
PABIX1
PABIX2
PABIX3
PABIX7
PABIX8
PABIXB
PABIXC
PABIXD
PABIXE
PABIXF
PABIXL
INF
INF02
INF08
OTHER
GOTCHA
GOTCHA01
GOTCHA02
NO
the output is:
=== groupTA: group transactions ===============================================
PABIT PABIT4 PABIT5 PABIT6 PABIT7 PABIT8 PABITB PABITF PABITL
PABI PABIX1 PABIX2 PABIX3 PABIX7 PABIX8 PABIXB PABIXC PABIXD PABIXE PABIXF PABIXL
INF INF02 INF08
OTHER
GOTCHA GOTCHA01 GOTCHA02
NO
=== groupTA: 0 done (00:00:00) ================================================
A sequence of headers like
HEAD
HEAD01
HEADER
HEADER01
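will cause problems, however: HEADER itself starts with HEAD, so HEADER
and HEADER01 are appended to the HEAD group instead of starting a group
of their own.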
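To see that effect without touching a file, the same prefix test can be wrapped in a function and fed an array. This is only a sketch: GroupByHead is a made-up name, and it returns the groups as an array of strings instead of echoing them as it goes.

```vbscript
Option Explicit

' GroupByHead is a hypothetical helper, not part of the script above.
' It applies the same rule: a line that starts with the current head
' joins the current group, anything else becomes the next head.
Function GroupByHead( aLines )
  Dim sOut  : sOut  = ""
  Dim sHead : sHead = ""
  Dim sLine
  For Each sLine In aLines
    sLine = Trim( sLine )
    If "" <> sHead And 1 = InStr( sLine, sHead ) Then
      sOut = sOut & " " & sLine               ' still in the current group
    Else
      If "" <> sOut Then sOut = sOut & vbLf   ' close the previous group
      sHead = sLine
      sOut  = sOut & sLine
    End If
  Next
  GroupByHead = Split( sOut, vbLf )           ' one array element per group
End Function

Dim sGroup
For Each sGroup In GroupByHead( Array( "HEAD", "HEAD01", "HEADER", "HEADER01" ) )
  WScript.Echo sGroup
Next
```

Run with cscript, this prints the single line "HEAD HEAD01 HEADER HEADER01", i.e. one group where two were wanted.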