Real Software Forums

The forum for Real Studio and other Real Software products.
[ REAL Software Website | Board Index ]
It is currently Wed Oct 18, 2017 8:12 am
xojo

All times are UTC - 5 hours




Post new topic Reply to topic  [ 10 posts ] 
Author Message
 Post subject: How Query Rows Work
PostPosted: Mon Apr 15, 2013 8:32 am 
Offline
User avatar

Joined: Fri Nov 10, 2006 4:10 pm
Posts: 1855
Location: Michigan, USA
Please note that I'm personally working with PostgreSQL 9.1.

I'm very curious how, exactly, queries are handled by Real Studio. When I perform a query, the database returns multiple records, and Real Studio presents this to me as a RecordSet object. From here I'm able to view one row at a time and "move" between rows. This is what confuses me. What happens when I MoveNext? Are the rows in memory somewhere, or is RS retrieving the query rows from the database one at a time?

What's confusing to me is that I don't see this concept of "moving" anywhere else when performing queries. Why aren't I simply returned an array of RecordSets?

Appreciate the insight! :)

_________________
Windows 8 x64
Windows XP Pro SP3
Ubuntu 11.04 via Virtual Box
RS Enterprise 2012r1.1

Programming Tutorials & Free Projects: http://www.JasonTheAdams.com
"Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Mon Apr 15, 2013 9:00 am 
Offline
User avatar

Joined: Sun Aug 05, 2007 10:46 am
Posts: 4931
Location: San Diego, CA
I'm not positive..... but based on my experience with other databases in other environments,

A RecordSet is not TRULY an "array" of records.... but it is what would be called a "database cursor" in any other context.

This "cursor" points to the currently valid record and allows the user to perform certain actions with it.. But that is the only "record" the "recordset" is dealing with (discounting any internal caching mechanisms).

When you issue a MOVENEXT, the cursor simply increments to the next qualifying record (or EOF), the same with MOVEPREV (assuming your database supports bi-directional recordsets [SQLITE for example does not])

_________________
Dave Sisemore
iMac I7[2012], OSX Mountain Lion 10.8.3 RB2012r2.1
Note : I am not interested in any solutions that involve custom Plug-ins of any kind


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Mon Apr 15, 2013 9:12 am 
Offline
User avatar

Joined: Fri Nov 10, 2006 4:10 pm
Posts: 1855
Location: Michigan, USA
Thank you, Dave, that is very interesting. I just read up on Cursors in PostgreSQL. I can definitely appreciate this as the way to go as, if queries went straight to memory, returns of millions (or more) rows could bring systems to their knees.

What's causing me to think about this is that I have a situation where I'm looping through active records of a View, but I also need a single column from a function. The problem is that if I perform the function on each loop iteration the total time of the loop will be the initial query + (sub-query * iterations). So as the number of rows grow, the loop will consume take more time linearly. Not good. I'm trying to think of a creative way of working the two queries together.

Open to suggestions. :)

_________________
Windows 8 x64
Windows XP Pro SP3
Ubuntu 11.04 via Virtual Box
RS Enterprise 2012r1.1

Programming Tutorials & Free Projects: http://www.JasonTheAdams.com
"Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Mon Apr 15, 2013 9:21 am 
Offline
User avatar

Joined: Sun Aug 05, 2007 10:46 am
Posts: 4931
Location: San Diego, CA
Depending on your situation of course.

Create an IN-MEMORY database
Copy the table (only desired field(s)) into that "CREATE table db.temptable AS Select x,y,z FROM db2.realtable"
Then use the in-memory on. Still same mechanism, but in memory will be much faster

Just an idea... only way to find out is try it.

_________________
Dave Sisemore
iMac I7[2012], OSX Mountain Lion 10.8.3 RB2012r2.1
Note : I am not interested in any solutions that involve custom Plug-ins of any kind


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Mon Apr 15, 2013 11:36 am 
Offline
User avatar

Joined: Fri Nov 10, 2006 4:10 pm
Posts: 1855
Location: Michigan, USA
Thanks for the idea, Dave. I ended up realizing that I could very effectively re-write the function into multiple, much faster views, which allows me to more effectively query across multiple rows without a linear consequence. It's funny how such different (often better) ideas occur after walking away from something for months. ;)

_________________
Windows 8 x64
Windows XP Pro SP3
Ubuntu 11.04 via Virtual Box
RS Enterprise 2012r1.1

Programming Tutorials & Free Projects: http://www.JasonTheAdams.com
"Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Wed Apr 17, 2013 4:30 am 
Offline
User avatar

Joined: Fri Oct 28, 2005 7:05 am
Posts: 565
Location: Emsworth, UK
Jason_Adams wrote:
I'm very curious how, exactly, queries are handled by Real Studio.
The trite answer is, they aren't!

Real Studio interacts with the native database driver (API) or, if you are using ODBC, it interacts with the ODBC API, which interacts with the native driver. So there is one or more layers of abstraction between RS and the database; whatever or wherever the actual data is.

Quote:
When I perform a query, the database returns multiple records, and Real Studio presents this to me as a RecordSet object.
What actually happened was, RS sent the query string to the driver, the driver sent it to the database and the database returned 'a page' of data; which is no more than the driver is able to cache in memory. Exactly how many rows and fields 'a page' might be, depends on the query, the database and the driver - Which would seem a good reason for the abstraction.

Quote:
From here I'm able to view one row at a time and "move" between rows. This is what confuses me. What happens when I MoveNext? Are the rows in memory somewhere, or is RS retrieving the query rows from the database one at a time?
It doesn't matter! What matters is that when you call MoveNext, the driver returns the next row to your application, quickly and efficiently. A driver for a desktop database will behave very differently to a driver for a Client/Server database.

Take SQLite's in memory database, retrieving a row at a time would be quick and efficient, as the database is ultimately residing in the same RAM as the driver and your application. On the other hand, transferring a row at a time from a Client/Server database across a network, would be very slow and extremely inefficient - Each row needing to be wrapped in about 60 bytes of network header and even more bytes of TCP ACK packets.

Let's go a little further with the Client/Sever example. Let's say your query returns 200MB of data to your client, sat on the end of a 100Mbps fast ethernet connection. Do you want to wait _a_minimum_ of 15 seconds before receiving the first row? Because 15 seconds is how long it would take to transfer 200MB at 100Mbps, in absolutely ideal conditions - in real life you do well to see 20Mbps sustained transfer over fast ethernet. So the server sends the first page, which might only be 64KB, as soon as the first page is available. Your application then gets the first row as soon as the first page is received by the driver. With very large queries, the first row is returned to your application, before the server has even finished assembling all the rows to be returned. Further pages are assembled and held in the Server's cache, with the client driver requesting the next page whenever a MoveNext call brings the record pointer near to the bottom of the current page.

Quote:
What's confusing to me is that I don't see this concept of "moving" anywhere else when performing queries.
It is really very common. If you were programming ODBC or native drivers using C, you would most likely still be calling a __MoveNext() function, as declared in their APIs.

Quote:
Why aren't I simply returned an array of RecordSets?


A little thinking exercise. Why is it, do you think, that few client/server drivers return a RecordCount? The answer is in my preceding paragraph.

_________________
Yes it's me in the avatar


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Wed Apr 17, 2013 9:33 am 
Offline
User avatar

Joined: Fri Nov 10, 2006 4:10 pm
Posts: 1855
Location: Michigan, USA
Thanks msssltd! As always, you stretch my brain. :)

This all makes a lot of sense now, and I'm especially impressed by the idea of sending the data before it's even finished accumulating all rows. Someone was thinking!

I do have one question. As I said, I'm using PostgreSQL, and on a Linux server. When I'm on the server and EXPLAINing queries I plan on incorporating, I'm curious how accurate the times I receive actually are. The queries I run take up to (but not often) 2 seconds over 100 rows, but this isn't due to the size, but the calculations each query performs (a lot of grouping, aggregates, and math). If I'm understanding you correctly, it may actually start sending rows before the query is finished. Does this mean it may say it takes 2 seconds to perform the query, but I may actually have the initial rows in 100-500ms?

Trying to work through the implications in my head. Thanks! :)

_________________
Windows 8 x64
Windows XP Pro SP3
Ubuntu 11.04 via Virtual Box
RS Enterprise 2012r1.1

Programming Tutorials & Free Projects: http://www.JasonTheAdams.com
"Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Wed Apr 17, 2013 9:33 am 
Offline
User avatar

Joined: Fri Nov 10, 2006 4:10 pm
Posts: 1855
Location: Michigan, USA
Thanks msssltd! As always, you stretch my brain. :)

This all makes a lot of sense now, and I'm especially impressed by the idea of sending the data before it's even finished accumulating all rows. Someone was thinking!

I do have one question. As I said, I'm using PostgreSQL, and on a Linux server. When I'm on the server and EXPLAINing queries I plan on incorporating, I'm curious how accurate the times I receive actually are. The queries I run take up to (but not often) 2 seconds over 100 rows, but this isn't due to the size, but the calculations each query performs (a lot of grouping, aggregates, and math). If I'm understanding you correctly, it may actually start sending rows before the query is finished. Does this mean it may say it takes 2 seconds to perform the query, but I may actually have the initial rows in 100-500ms?

Trying to work through the implications in my head. Thanks! :)

_________________
Windows 8 x64
Windows XP Pro SP3
Ubuntu 11.04 via Virtual Box
RS Enterprise 2012r1.1

Programming Tutorials & Free Projects: http://www.JasonTheAdams.com
"Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Wed Apr 17, 2013 9:40 am 
Offline
User avatar

Joined: Sun Aug 05, 2007 10:46 am
Posts: 4931
Location: San Diego, CA
Quote:
but I may actually have the initial rows in 100-500ms?


No..... Inserts/Updates etc are cached by the database until a COMMIT (explict or implied) occurs. So for all intents and purposes your data is not "available" until the query is complete and commited

This allows for rollbacks to occur.

some database engines (such as Oracle) do allow the "next" query to access data updated or inserted by a previous query where there was no COMMIT in-between. These database engines are smart enough to manage both the original table, and cache as one data unit, Allowing the results of multiple queries to be rolled back should there be a failure anywhere along the line.

_________________
Dave Sisemore
iMac I7[2012], OSX Mountain Lion 10.8.3 RB2012r2.1
Note : I am not interested in any solutions that involve custom Plug-ins of any kind


Top
 Profile  
Reply with quote  
 Post subject: Re: How Query Rows Work
PostPosted: Thu Apr 18, 2013 3:18 am 
Offline
User avatar

Joined: Fri Oct 28, 2005 7:05 am
Posts: 565
Location: Emsworth, UK
Jason_Adams wrote:
If I'm understanding you correctly, it may actually start sending rows before the query is finished. Does this mean it may say it takes 2 seconds to perform the query, but I may actually have the initial rows in 100-500ms?

I am going to disagree with Dave, because as far as I can see he's talking about transaction isolation, which is all about when changes to data become permanent, rather than when data can be returned.

The answer I like is, "It depends."

As a general rule, the first row should be returned as soon as it can be returned, although optimisation and efficiency considerations may cause (some) deferral.

Take a simple, unordered query
SELECT * FROM someTable;
Move the disk head to the start of someTable, read the first block of data and the server has done all it needs to do, to be able to return the first row. If someTable is occupying hundreds of blocks, and the client is on the other end of a network, the server might as well write the first row(s) into the network transmit buffer, while it waits for the disk head to continue moving and reading more of the blocks which contain someTable.

Adjust the query to be an ordered list
SELECT * FROM someTable ORDER BY someField DESC;
Move the disk head to the start of someTable, read the first block. Can the server return the first row? No it can't. All rows of someTable will have to be read (scanned), before any row can be placed in order and the first row returned

Introduce a calculated field
Select A, B, (A + B) as Total FROM someTable ORDER BY Total DESC;
Now we have a choice. Is it quicker to scan the entire table, have the CPU do the maths for every row, scan the results, place the first row in order and return it. Or, read the first block of some table, do the maths for each row in the first block, read the second block etc.

The answer is, "It depends." It depends how much and what data is in the table, how fast the disk head can move to it, how long the CPU takes to do the calculations, how much RAM is available - If there isn't enough RAM to hold the table, the disk heads going to have to move (even) more, as the interim results need to be written to a temporary file.

Adding grouping and joining clauses, using indexes, the number of possibly 'best' outcomes rises. The more complex the query becomes, the more scope there is to optimise how the results are arrived at and the more scope there is, for different database engines to behave differently.

To try to predict the most efficient way to scan the tables and assemble the results of a particular query, we need some sort of 'query plan' . Which is roughly what EXPLAIN does in Postgres. EXPLAIN performs a dry run on the query and tells you how the engine is likely to process it and what each step will cost in time and resource. Using EXPLAIN ANALYZE with Postgres, changes the dry run to a live run, which may change the results dramatically.

I should point out I have a lot more practical experience as a dba with MS SQL than with Postgres. From what I recall, the timing information presented by a query plan, is an estimation of completion of the query at the server. It does not include transmission (to the client) or any deferral made at the client.

To reiterate the original answer. In most cases (I can think of) the first row should be returned as soon as it can be returned; which may be some time before the last row can be returned. RS has a quirky threading model, which presents scope for RS to defer availability of the first row and I have never tested that.

So the other answer I like is, "Try it and see."

_________________
Yes it's me in the avatar


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 10 posts ] 

All times are UTC - 5 hours


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group