Real Software Forums
http://forums.realsoftware.com/

Different ways of loading an URL into a variable/ a string
http://forums.realsoftware.com/viewtopic.php?f=2&t=47138

Author:  Eclipse [ Sun Mar 03, 2013 7:48 am ]
Post subject:  Different ways of loading an URL into a variable/ a string

I have noticed that there are several ways of loading a URL into a string and then processing the content with either RegEx or InStr.

There is an efficient command in Unix/Linux called wget that seems far more successful than the RB Get command.

This code doesn't work on any URL. Are there other ways to load URLs? I've looked in the Reference Guide and there are plenty to choose from.

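Dim http As New HTTPSocket
http.Yield = True // let the rest of the application run while waiting

// Get with a timeout (in seconds) waits for the page and returns its content
TextField1.Text = http.Get("http://www.workingURL.com/", 30)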

Author:  ktekinay [ Sun Mar 03, 2013 9:50 am ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

This exact code works perfectly here unless the URL returns a redirect (a "Location" header). There is no native command to follow the redirect, but you can put this into a module somewhere:
Function GetRedirectAddress(Extends h As HTTPSocket, url As String, timeout As Integer, maximumIterations As Integer = kDefaultMaximumIterations) As String
  // Gets the redirect address for a url.
  // Will give up after maximumIterations iterations.
  // Put a 0 (or less) in there for infinite.
  // kDefaultMaximumIterations must be an Integer constant that you define yourself.

  if url = "" then return url

  dim isFinite as boolean = true
  if maximumIterations < 1 then
    isFinite = false
  end if

  do
    dim headers as InternetHeaders = h.GetHeaders( url, timeout )
    if headers is nil then
      // No response at all: report failure with an empty string
      url = ""
      exit
    elseif headers.Value( "Location" ) <> "" then
      // Redirected: follow the new address on the next pass
      url = headers.Value( "Location" )
    else
      // No redirect: url is the final address
      exit
    end if
    if isFinite then
      maximumIterations = maximumIterations - 1
    end if
  loop until isFinite and maximumIterations = 0 // Will never end if maximumIterations < 1 to start

  return url

End Function

Then it's:
url = h.GetRedirectAddress( url, 30 )
s = h.Get( url, 30 )

Author:  Eclipse [ Mon Mar 04, 2013 8:14 am ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

http://www.syscare.se/files/realbasic-error.png

I keep getting this Real Studio error when running this code!
The code as you wrote it won't compile, so I tried to fix it!

I have RS 2012 v1, and after next pay-day I will renew!
Maybe you have a later version?

Author:  ktekinay [ Mon Mar 04, 2013 8:33 am ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

I don't know why that would be a fatal error, but you can replace "kDefaultMaximumIterations" with simply "5" (or some other number of your choosing). That controls how many redirects the code will follow before it gives up.

Author:  Eclipse [ Mon Mar 04, 2013 7:12 pm ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

Thank you for taking the time to write!

However, I get all kinds of errors when trying to compile. I need to examine your code really carefully, and then maybe I can rewrite it.
It's really educational when copy/paste doesn't work!

I like challenges! I don't complain! I'll look into this tomorrow!!
Thank you very much!

Author:  timhare [ Mon Mar 04, 2013 7:33 pm ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

In the Parameters section you have

maximumIterations as Integer = kDefaultMaximumIterations as Integer

There is one "as Integer" too many in there; the default value does not take a type of its own. That might be the problem. Also, any method using Extends must be defined in a Module.

Author:  msssltd [ Thu Mar 07, 2013 10:21 am ]
Post subject:  Re: Different ways of loading an URL into a variable/ a stri

Eclipse wrote:
There is an efficient command in Unix/Linux called wget that seems far more successful than the RB get -command.

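wget creates an HTTP GET request and downloads the HTML page source. When run with options such as -r (recursive) or -p (page requisites), it also parses the page and downloads the resources linked within it, using further HTTP GET requests. In that mode wget is close to a web browser in its own right.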

HTTPSocket.Get creates a single HTTP GET request, which downloads the page source. If you want to download the resources linked within the page, you will need to parse the HTML page source and issue further HTTPSocket.Get requests yourself.
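
The parse step described above could be sketched in REALbasic roughly as follows. This is a minimal sketch, not a full HTML parser: ExtractLinks is a hypothetical helper name, the pattern only catches double-quoted src and href attributes, and http://www.example.com/ is a placeholder URL. Relative links would still need to be resolved against the page address before they are passed to Get, so this sketch simply skips them.

```realbasic
// Hypothetical helper (put it in a module or window):
// returns the values of double-quoted src/href attributes found in source.
Function ExtractLinks(source As String) As String()
  Dim links() As String
  Dim rx As New RegEx
  rx.SearchPattern = "(?:src|href)\s*=\s*""([^""]+)"""

  Dim match As RegExMatch = rx.Search(source)
  While match <> Nil
    links.Append match.SubExpressionString(1)
    match = rx.Search // no argument: continue after the previous match
  Wend

  Return links
End Function

// Possible use, for example in a button's Action event:
Dim http As New HTTPSocket
http.Yield = True
Dim page As String = http.Get("http://www.example.com/", 30)

Dim link As String
For Each link In ExtractLinks(page)
  If InStr(link, "http://") = 1 Then // absolute links only in this sketch
    Dim resource As String = http.Get(link, 30)
    // process the downloaded resource here
  End If
Next
```

A regular expression is used here only because the thread mentions RegEx and InStr; for anything beyond simple pages, a proper HTML parser would be more reliable.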

All times are UTC - 5 hours