Real Software Forums

The forum for Real Studio and other Real Software products.

All times are UTC - 5 hours




 Post subject: RTF to HTML: Two Questions
Posted: Thu Jul 07, 2011 3:38 am

Joined: Fri Sep 30, 2005 1:53 pm
Posts: 914
Location: Philadelphia, PA
I wrote the following Windows program to convert simple RTF to HTML:

Dim Result As Boolean
Dim HTMLType As New FileType
Dim RTFType As New FileType
Dim File1 As FolderItem
Dim Count, I As Integer
Dim AllPiecesOfPage, PieceOfPage As String
Dim Piece, Temp, WholeWebPage As String
Dim SR As StyleRun

' Ask for an RTF file and load it into the TextArea.
RTFType.Name = "RTF Files"
RTFType.Extensions = "rtf"
File1 = GetOpenFolderItem ( RTFType )
If File1 <> Nil Then
  Result = TextArea1.Open ( File1 )
End If

Count = TextArea1.StyledText.StyleRunCount
AllPiecesOfPage = ""

' Convert each style run into an HTML fragment.
For I = 0 To Count - 1

  SR = TextArea1.StyledText.StyleRun ( I )
  PieceOfPage = SR.Text

  If SR.Bold = True Then
    PieceOfPage = "<b>" + PieceOfPage + "</b>"
  End If

  If SR.Italic = True Then
    PieceOfPage = "<i>" + PieceOfPage + "</i>"
  End If

  If SR.Underline = True Then
    PieceOfPage = "<u>" + PieceOfPage + "</u>"
  End If

  ' Two line endings in a row become a paragraph break, a single one a line break.
  PieceOfPage = ReplaceAll ( PieceOfPage, Chr ( 13 ) + Chr ( 13 ), "<p>" )

  PieceOfPage = ReplaceAll ( PieceOfPage, Chr ( 13 ), "<br />" )

  ' Wrap the run in a span that carries its font, size, and color.
  Piece = "<span style=" + Chr ( 34 ) + "font-family: " + SR.Font _
    + "; font-size: " + Str ( Floor ( .5 * SR.Size ) ) + "pt; color: " + Mid ( Str ( SR.TextColor ), 3 ) _
    + ";" + Chr ( 34 ) + ">" + PieceOfPage + "</span>"

  AllPiecesOfPage = AllPiecesOfPage + Piece

Next I

' The style attribute needs its closing quote ("" inside a string literal is one quote character).
WholeWebPage = "<html>" + EndOfLine + "<body style=""font-size: 30pt;"">"
WholeWebPage = WholeWebPage + AllPiecesOfPage + EndOfLine
WholeWebPage = WholeWebPage + "</body>" + EndOfLine + "</html>" + EndOfLine

' Show the generated HTML in the TextArea at a readable size.
TextArea1.Text = WholeWebPage
TextArea1.SelStart = 0
TextArea1.SelLength = Len ( TextArea1.Text )
TextArea1.SelTextSize = 30
TextArea1.SelLength = 0

' Ask where to save the HTML and write it out.
HTMLType.Name = "HTML files"
HTMLType.Extensions = ".html; .htm"
File1 = GetSaveFolderItem ( HTMLType, "" )
If File1 <> Nil Then
  Result = TextArea1.Save ( File1, False )
End If

It's not pretty, but it seems to work.

Question #1: Should I be using "span" or "div" or something more complicated than either? (Both "span" and "div" appear to work, but that doesn't necessarily mean that either works all the time.)

Question #2: Do you notice any (other?) non-obvious errors in the code (or at least non-obvious to me)?

Thanks in advance for any advice.

Barry Traver

P.S. I'm posting this in the Windows area because I have an idea that the lines containing "13" may not work on the Mac and/or with Linux.


 Post subject: Re: RTF to HTML: Two Questions
Posted: Thu Jul 07, 2011 3:54 am

Joined: Fri Jan 06, 2006 3:21 pm
Posts: 12388
Location: Portland, OR USA
You should use <span> instead of <div>. <div> is a block oriented tag, and is a superset of <p>. <span> is in-line, within a <p> and a <div>.
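
As a rough illustration (the sample text and the variable name are made up, not taken from the original program), two styled runs wrapped in <span> stay on the same line inside one paragraph, whereas wrapping each run in <div> would start a new block for each:

```realbasic
' Hypothetical sample: two style runs inside a single paragraph.
Dim Html As String
Html = "<p>" _
  + "<span style=""font-weight: bold;"">A bold run</span>" _
  + "<span style=""font-style: italic;""> followed by an italic run</span>" _
  + "</p>"
' Rendered in a browser, both runs should appear on one line.
```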

TextArea uses chr(13) on all platforms, regardless of each platform's EndOfLine.
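
A minimal sketch of what that means for the conversion code, using a made-up sample string rather than real TextArea content: if the text might come from somewhere other than a TextArea, normalizing it to Chr(13) first with ReplaceLineEndings lets the same ReplaceAll calls be used on Windows, Mac, and Linux.

```realbasic
' Hypothetical sample text containing three different kinds of line ending.
Dim Source, Html As String
Source = "One" + Chr(13) + Chr(10) + "Two" + Chr(10) + "Three" + Chr(13) + "Four"

' Convert every line ending to a single Chr(13), the form a TextArea uses.
Source = ReplaceLineEndings(Source, Chr(13))

' One replacement now covers every line break in the text.
Html = ReplaceAll(Source, Chr(13), "<br />")
MsgBox Html
```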


 Post subject: Re: RTF to HTML: Two Questions
Posted: Thu Jul 07, 2011 4:24 am

Joined: Fri Sep 30, 2005 1:53 pm
Posts: 914
Location: Philadelphia, PA
timhare wrote:
You should use <span> instead of <div>. <div> is a block oriented tag, and is a superset of <p>. <span> is in-line, within a <p> and a <div>.

Thanks for the comment.

timhare wrote:
TextArea uses chr(13) on all platforms, regardless of each platform's EndOfLine.

Thanks again.

If I understand what you're saying, then Windows seems not to be the only platform where the platform's normal EndOfLine differs from the EndOfLine used in a TextArea. Interesting!

Another question....

Assume the following code:

Dim S1 As String
S1 = TextArea1.Text

Do I remember correctly that on Windows the end-of-line in TextArea1 is one character, i.e., Chr ( 13 ), but that it is automatically converted to the normal two-character Windows end-of-line, i.e., Chr ( 13 ) + Chr ( 10 ), in S1 in the preceding code snippet?

Barry Traver


 Post subject: Re: RTF to HTML: Two Questions
Posted: Thu Jul 07, 2011 11:31 am

Joined: Fri Jan 06, 2006 3:21 pm
Posts: 12388
Location: Portland, OR USA
No, the end of line character remains a single chr(13). There is no automatic conversion for that simple assignment. You would have to code it as

s1 = ReplaceLineEndings(Textarea1.Text, EndOfLine)

Tim


 Post subject: Re: RTF to HTML: Two Questions
Posted: Fri Jul 08, 2011 2:31 pm

Joined: Fri Sep 30, 2005 1:53 pm
Posts: 914
Location: Philadelphia, PA
Tim,

You're absolutely correct. I had it backwards. Let's see if I can get it right this time.

What does take place is not that S1 automatically adds Chr (10) but that TextArea1.Text automatically loses Chr (10). (I'm talking about Windows, of course.)

Example #1:
S1 = Chr ( 65 ) + Chr ( 13 ) + Chr ( 10 )
MsgBox Str ( Len ( S1 ) )
TextArea1.Text = S1
MsgBox Str ( Len ( TextArea1.Text ) )

Example #2:
TextArea1.Text = Chr ( 65 ) + Chr ( 13 ) + Chr ( 10 )
S1 = TextArea1.Text
MsgBox Str ( Len ( S1 ) )
TextArea1.Text = S1
MsgBox Str ( Len ( TextArea1.Text ) )
S1 = TextArea1.Text
MsgBox Str ( Len ( S1 ) )

Also, ReplaceLineEndings can be a bit tricky at times.

No surprises with this:

S1 = Chr ( 65 ) + Chr ( 13 )
S1 = ReplaceLineEndings(S1, EndOfLine.Windows)
MsgBox Str ( Len ( S1 ) )

But try this:

TextArea1.Text = Chr ( 65 ) + Chr ( 13 )
TextArea1.Text = ReplaceLineEndings(TextArea1.Text, EndOfLine.Windows)
MsgBox Str ( Len ( TextArea1.Text ) )

Barry Traver

P.S. I think I've got it right this time, but please let me know if I didn't.


 Post subject: Re: RTF to HTML: Two Questions
Posted: Fri Jul 08, 2011 2:50 pm

Joined: Fri Jan 06, 2006 3:21 pm
Posts: 12388
Location: Portland, OR USA
What's surprising about that example? And the "tricky" part is not ReplaceLineEndings, but assigning a string to TextArea1.Text. ReplaceLineEndings returns a string with a trailing chr(10). TextArea1 dutifully removes it.


 Post subject: Re: RTF to HTML: Two Questions
Posted: Fri Jul 08, 2011 8:51 pm

Joined: Fri Sep 30, 2005 1:53 pm
Posts: 914
Location: Philadelphia, PA
timhare wrote:
What's surprising about that example? And the "tricky" part is not ReplaceLineEndings, but assigning a string to TextArea1.Text. ReplaceLineEndings returns a string with a trailing chr(10). TextArea1 dutifully removes it.

What's surprising to me as a non-professional is that TextArea1.Text is _exactly_ the same both before and after the following line:

TextArea1.Text = ReplaceLineEndings(TextArea1.Text, EndOfLine.Windows)

That is, one might expect from the code that a one-character line ending is being replaced by a two-character line ending throughout the text. Instead, TextArea1.Text ends up unchanged. (EndOfLine.Windows is two characters long, whereas the line ending the text started out with is only one.)

Question: does that line of code even make sense? Is it even valid? After all, Chr ( 13 ) is supposedly being replaced by Chr ( 13 ) + Chr ( 10 ), and that doesn't seem to happen, right?

Barry Traver


 Post subject: Re: RTF to HTML: Two Questions
Posted: Sat Jul 09, 2011 2:24 am

Joined: Fri Jan 06, 2006 3:21 pm
Posts: 12388
Location: Portland, OR USA
Yes, it does happen. ReplaceLineEndings puts in a 2-character end of line. TextArea1 immediately takes it right back out. As you noted previously, when you assign a string to TextArea.Text, it replaces the normal line endings with chr(13), but when you take the string back out, the line endings remain a single chr(13). Try to imagine the intermediate results:

dim s as string
TextArea1.Text = chr(65) + chr(13)
s = ReplaceLineEndings(TextArea1.Text, EndOfLine.Windows)
msgbox str(len(s))
TextArea1.Text = s
msgbox str(len(TextArea1.Text))

The first msgbox will be "3", the second will be "2".
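
One practical consequence, sketched here under assumptions: if the two-character line endings need to survive, the converted string can be kept in a variable and written to disk directly instead of being assigned back to TextArea1.Text. The default file name "page.html" is made up, and TextOutputStream.Create is assumed to be available in the version of Real Studio in use.

```realbasic
Dim s As String
Dim f As FolderItem
Dim t As TextOutputStream

' Keep the converted text in a String, which keeps its Chr(10) characters.
s = ReplaceLineEndings(TextArea1.Text, EndOfLine.Windows)

' "page.html" is a hypothetical default name for the save dialog.
f = GetSaveFolderItem("", "page.html")
If f <> Nil Then
  ' Write the string itself, not the TextArea contents.
  t = TextOutputStream.Create(f)
  t.Write(s)
  t.Close
End If
```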

