Real Software Forums

The forum for Real Studio and other Real Software products.

All times are UTC - 5 hours




 Post subject: Convert a COMMA delimited file to a TAB delimited file
Posted: Thu May 10, 2012 3:30 pm
...and HONOR double quotes at the same time!

This code determines whether the file is comma- or tab-delimited. If it is comma-delimited, it converts it to tab-delimited; otherwise it leaves the file alone.

You need to supply INP_F and OUT_F as FolderItems.

Dim t As TextInputStream
Dim i As Integer
Dim j As Integer
Dim tb As Integer
Dim cm As Integer
Dim s As String
Dim f As String
Dim list(-1) As String
Dim temp(-1) As String
tb=0
cm=0
t=TextInputStream.Open(inp_f)
s=t.ReadAll
t.Close
s=ReplaceLineEndings(s,EndOfLine.UNIX)
list=Split(s,EndOfLine.UNIX)
// remove blank lines and determine delimiter
For i=list.Ubound DownTo 0
  s=Trim(list(i))
  If s="" Then
    list.Remove i
  Else
    tb=tb+CountFields(list(i),ChrB(9))-1
    cm=cm+CountFields(list(i),",")-1
  End If
Next i
//
// If commas outnumber TABS then it must be a comma delimited file
//
If cm>tb Then ' file is COMMA delimited! change it to TAB (watch out for ")
  For i=0 To list.Ubound
    s=list(i)
    If InStr(s,ChrB(34))=0 Then ' no " so do it fast
      s=ReplaceAll(s,",",ChrB(9))
    Else
      temp=Split(s,",")
      j=0
      While j<=temp.Ubound
        // an odd number of " means a quoted value was split at one of
        // its own commas, so glue the next piece back on
        While j<temp.Ubound And (CountFields(temp(j),ChrB(34))-1) Mod 2=1
          temp(j)=temp(j)+","+temp(j+1)
          temp.Remove j+1
        Wend
        // drop the enclosing quotes (ignoring spaces around them)
        f=Trim(temp(j))
        If Len(f)>=2 And Left(f,1)=ChrB(34) And Right(f,1)=ChrB(34) Then
          temp(j)=Mid(f,2,Len(f)-2)
        End If
        j=j+1
      Wend
      s=Join(temp,ChrB(9))
      s=ReplaceAll(s,ChrB(34)+ChrB(34),"'") ' a doubled quote becomes '
    End If
    list(i)=s
  Next i
End If
//
// Write the File back out
//
Dim xxx As TextOutputStream
xxx=TextOutputStream.Create(out_f)
s=Join(list,EndOfLine.UNIX)
xxx.Write s
xxx.Close
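
The code above expects inp_f and out_f to exist already. One possible way to obtain them, sketched here as an assumption rather than as part of the posted code, is to ask the user with the standard file dialogs. The empty filter strings and the default file name "converted.txt" are placeholders, and the snippet assumes it runs inside a method that returns no value.

```xojo
// Hypothetical setup: ask the user for the source and destination files.
Dim inp_f As FolderItem = GetOpenFolderItem("")
If inp_f = Nil Then Return ' user cancelled the open dialog

Dim out_f As FolderItem = GetSaveFolderItem("", "converted.txt")
If out_f = Nil Then Return ' user cancelled the save dialog

// ...the conversion code shown above would follow here...
```

Checking for Nil matters because both dialog functions return Nil when the user cancels.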

_________________
Dave Sisemore
iMac I7[2012], OSX Mountain Lion 10.8.3 RB2012r2.1
Note : I am not interested in any solutions that involve custom Plug-ins of any kind


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Fri May 11, 2012 10:07 am
Why not just use ReplaceAll?

_________________
Mac OS X 10.3-10.8
Windows 2000 (I know it sucks)
Windows Server 2007

You want a bunch of new classes and web styles? realstudiodevspot.com (search there for Web Styles Plugin)
Folderitem is too hard? File Bin Class
I hate cows.


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Fri May 11, 2012 10:12 am
Because THIS is a valid comma-delimited string:


1234 , "Jones, Jim", Fred, " 1,2,3,4,5 "

A ReplaceAll would be wrong in this case, as it would result in

1234 -> "Jones -> Jim" -> Fred -> " 1 -> 2 -> 3 -> 4 -> 5 "

where the correct output would be

1234 -> Jones,Jim -> Fred -> 1,2,3,4,5


And if you look closely, it DOES use a simple ReplaceAll if there are no DOUBLE QUOTES in the string.

_________________
Dave Sisemore

 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Fri May 11, 2012 10:35 am
Oops, sorry.

 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Fri May 11, 2012 10:49 am
Don't feel bad. The whole point of these forums is to share knowledge and learn. :)

_________________
Windows: Win7 64bit sp1, Vista 32bit sp2, WinXP 32bit SP3
Linux: RH EL6
Mac: Died in 2011 and took 2 months to notice.

RealStudio: 2012r2


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Wed May 30, 2012 7:49 am
DaveS wrote:
Because THIS is a valid comma-delimited string:


1234 , "Jones, Jim", Fred, " 1,2,3,4,5 "

A ReplaceAll would be wrong in this case, as it would result in

1234 -> "Jones -> Jim" -> Fred -> " 1 -> 2 -> 3 -> 4 -> 5 "

where the correct output would be

1234 -> Jones,Jim -> Fred -> 1,2,3,4,5


And if you look closely, it DOES use a simple ReplaceAll if there are no DOUBLE QUOTES in the string.



One comment.

This would be a valid record as well:

1234,"1234",\"1234,"12,34",1234\"

Contents translate to:
1234
1234
"1234
12,34
1234"

When you find a quote, a double quote, or a comma, you have to backpedal one position to see if it's escaped. If it is, then it's a plain character and neither a delimiter nor an enclosure. Likewise, a backslash is an escape character and should be translated as a literal backslash only if doubled.

Obviously, this only applies if you want to support escaping at all, escape with a backslash, and escape only certain characters; otherwise, take the backslash literally.

_________________
----
http://eduo.info/
http://gallery.eduo.info/
http://twitter.com/eduo/


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Wed May 30, 2012 8:14 am
eduo wrote:
When you find a quote, a double quote, or a comma, you have to backpedal one position to see if it's escaped. If it is, then it's a plain character and neither a delimiter nor an enclosure. Likewise, a backslash is an escape character and should be translated as a literal backslash only if doubled.

Obviously, this only applies if you want to support escaping at all, escape with a backslash, and escape only certain characters; otherwise, take the backslash literally.

Backpedalling would be insufficient. Suppose you had this string:

something,else\\,entirely

This is three values, the middle of which is "else\", but if you backpedal, your code would think it was two values, the second being "else\,entirely".

A better solution is to split all the characters into an array, then evaluate them each in order, skipping the ones that are appropriate to skip. You could even account for EndOfLine chars between quotes that way.
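
To illustrate why looking back a single position is not enough: a character is escaped only when an odd number of consecutive backslashes sits directly in front of it. The following is a minimal sketch under that assumption; IsEscaped is a hypothetical helper name, not something from the code in this thread.

```xojo
Function IsEscaped(s As String, position As Integer) As Boolean
  // position is 1-based, as for Mid.
  // Count the consecutive backslashes immediately before position.
  Dim count As Integer
  Dim p As Integer = position - 1
  While p >= 1 And Mid(s, p, 1) = "\"
    count = count + 1
    p = p - 1
  Wend
  // An odd count means the last backslash escapes the character at
  // position; an even count means the backslashes escape each other.
  Return (count Mod 2) = 1
End Function
```

In something,else\\,entirely the second comma is character 17 and is preceded by two backslashes, so IsEscaped returns False for it and the comma stays a delimiter. A check that only looks at character 16 would see a backslash and get it wrong. Scanning forward through the characters, as suggested above, avoids the counting altogether.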

_________________
Kem Tekinay
MacTechnologies Consulting
http://www.mactechnologies.com/

Need to develop, test, and refine regular expressions? Try RegExRX.


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Wed May 30, 2012 8:31 am
A common convention is to use two double quotes in a row to indicate a literal double quote... -OR- to use \"

However... it is also common that \" is always the sequence that escapes a double quote, and that \\ escapes a literal \,
with the \\ taking precedence over \"

So "test\",test" becomes the single value test",test
and "test\\",test" becomes the two values test\ and test (followed by a stray quote)


neither situation is covered by the code I posted.
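
As a rough illustration of the first convention only (doubled double quotes, no backslash escapes), a field that has already been separated from its neighbours could be cleaned up as sketched below. UnquoteField is a hypothetical helper name and is not part of the code posted earlier.

```xojo
Function UnquoteField(field As String) As String
  // Hypothetical helper: strips the enclosing double quotes and turns
  // each doubled quote ("") inside the field into one literal quote.
  Dim q As String = Chr(34)
  Dim result As String = field
  If Len(result) >= 2 And Left(result, 1) = q And Right(result, 1) = q Then
    result = Mid(result, 2, Len(result) - 2)
    result = ReplaceAll(result, q + q, q)
  End If
  Return result
End Function
```

Fields that are not enclosed in quotes are returned unchanged. The backslash convention would need a character-by-character scan instead of a ReplaceAll.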


NOTE the use of the word "common". There ARE NO PUBLISHED "STANDARDS" for CSV... just guidelines, and it is up to each implementation to decide how or if it will handle certain situations.

To avoid these situations..... start with a TAB DELIMITED FILE

_________________
Dave Sisemore

 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Wed May 30, 2012 8:43 am
DaveS wrote:
To avoid these situations..... start with a TAB DELIMITED FILE

Best advice of the day. :-)

_________________
Kem Tekinay

 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Wed May 30, 2012 10:51 am
Here is another way to approach this. It doesn't have the cool feature of figuring out whether the string should be converted at all, but this preserves the encoding of the original string and should be pretty fast. Note that I use StrComp because it is faster than "=".

Function CSVToTab(s As String) As String
  // Converts a comma-delimited string to tab-delimited.
  // Assumes that "\" is an escape character and quotes should
  // be ignored unless escaped.
  // Values between quotes are taken in their entirety.

  dim enc as TextEncoding = s.Encoding

  dim tab as string = enc.Chr( 9 )
  dim quote as string = """"
  quote = quote.ConvertEncoding( enc )
  dim comma as string = ","
  comma = comma.ConvertEncoding( enc )
  dim backslash as string = "\"
  backslash = backslash.ConvertEncoding( enc )

  dim chars() as string = s.Split( "" )
  dim newChars() as string

  dim inQuote as boolean
  dim lastCharIndex as integer = chars.Ubound
  dim i as integer
  while i <= lastCharIndex
    dim thisChar as string = chars( i )
    dim nextChar as string
    if i < lastCharIndex then nextChar = chars( i + 1 )

    select case true
    case StrComp( thisChar, quote, 0 ) = 0
      inQuote = not inQuote
      i = i + 1

    case not inQuote and StrComp( thisChar, comma, 0 ) = 0
      newChars.Append tab
      i = i + 1

    case StrComp( thisChar, backslash, 0 ) = 0
      newChars.Append nextChar
      i = i + 2

    else
      newChars.Append thisChar
      i = i + 1

    end select

  wend

  dim r as string = join( newChars, "" ).ConvertEncoding( enc )
  return r

End Function
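
A short usage sketch for the function above, with made-up sample data. It assumes the input string has a defined encoding, which is the case for a string literal; for a string whose Encoding is Nil, enc.Chr would fail, so text read from a file should have its encoding defined first.

```xojo
// Sample record: three values, the second one containing a comma.
Dim line As String = "1234,""Jones, Jim"",Fred"
Dim converted As String = CSVToTab(line)

// Split on the tab to look at the individual values.
Dim values() As String = Split(converted, Chr(9))
// values(0) is 1234, values(1) is Jones, Jim and values(2) is Fred;
// the enclosing quotes are dropped and the inner comma is kept.
```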

_________________
Kem Tekinay

 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Tue Mar 12, 2013 7:24 am
I like your code a lot, as it is fast and I need to process a lot of big files from all across Europe. Because not all Windows localisations use the same delimiter for CSV files coming from Excel, I adapted the function a little so that I can pass the appropriate delimiter from outside.

Thanks for sharing this function; it works like a champ.

Function CSVToTab(s As String, delimiter As String) As String
  // Converts a separator-delimited string to tab-delimited.
  // Assumes that "\" is an escape character and quotes should
  // be ignored unless escaped.
  // Values between quotes are taken in their entirety.
  dim enc as TextEncoding = s.Encoding
  dim tab as string = enc.Chr( 9 )
  dim quote as string = """"
  quote = quote.ConvertEncoding( enc )
  dim Separator as string = delimiter
  Separator = Separator.ConvertEncoding( enc )
  dim backslash as string = "\"
  backslash = backslash.ConvertEncoding( enc )
  dim chars() as string = s.Split( "" )
  dim newChars() as string
  dim inQuote as boolean
  dim lastCharIndex as integer = chars.Ubound
  dim i as integer
  while i <= lastCharIndex
    dim thisChar as string = chars( i )
    dim nextChar as string
    if i < lastCharIndex then nextChar = chars( i + 1 )
    select case true
    case StrComp( thisChar, quote, 0 ) = 0
      inQuote = not inQuote
      i = i + 1
    case not inQuote and StrComp( thisChar, Separator, 0 ) = 0
      newChars.Append tab
      i = i + 1
    case StrComp( thisChar, backslash, 0 ) = 0
      newChars.Append nextChar
      i = i + 2
    else
      newChars.Append thisChar
      i = i + 1
    end select
  wend
  dim r as string = join( newChars, "" ).ConvertEncoding( enc )
  return r
End Function
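
For whole files, one way to apply the two-parameter version is line by line, as in the sketch below. The names inp_f, out_f, src and dst, the semicolon delimiter, and the UTF-8 encoding are all assumptions for illustration; Excel exports on Windows are often in a Windows code page instead.

```xojo
// inp_f and out_f are FolderItems supplied by the caller (assumption).
Dim src As TextInputStream = TextInputStream.Open(inp_f)
src.Encoding = Encodings.UTF8 ' assumption: adjust to the real file encoding
Dim dst As TextOutputStream = TextOutputStream.Create(out_f)

While Not src.EOF
  // Convert one record at a time, using ";" as the incoming delimiter.
  dst.WriteLine CSVToTab(src.ReadLine, ";")
Wend

src.Close
dst.Close
```

Setting the stream's encoding explicitly means each line reaches CSVToTab with a defined encoding. Because the function compares one character at a time, the delimiter has to be a single character, and reading line by line will split any quoted value that contains a line break.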


 Post subject: Re: Convert a COMMA delimited file to a TAB delimited file
Posted: Tue Mar 12, 2013 10:11 am
http://great-white-software.com/CSVParser.zip

_________________
Norman Palardy (Real Software)

