Ask a Question related to Coldfusion - Advanced Techniques, Design and Development.
-
DCS Admin #1
Large Query Options
Ok,
So we've been racking our brains for a viable (and scalable) solution to this
problem. We've played with a couple ideas, but I'd like to pose the problem to
the community and see if anyone has a solution that they know works:
Scenario:
We've got customers who log into our web portal to view orders and order
history. In many cases, the data set returned could contain thousands or tens
of thousands of rows. For display purposes, they can choose how many records
are displayed per page and then using the 1 of N paging functionality, traverse
between pages.
Unfortunately, we're forced to reload the query with every page. This lends
itself ot some serious page-load performance issues.
We've toyed with the following ideas and would love your feedback on
alternative approaches:
1.) Load the recordset into Session ... With the number of users we have
using the system concurrently, this is a major memory overload and isn't
optimal.
2.) Load the recordset into Cookie ... Too risky. Keeping the data set on
the client file system opens it up for all kinds of security and manipulation
problems.
3.) Create a "Query" master table that a stored procedure would build with
the id's of the records to be displayed... so 1 of N would return rows 1
through N from the table only... Problem with this is sorting is invalidated
because it will now only apply to the data in 1 through N... otherwise every
time a sort is performed, it would have to requery.
4.) Create a "Query" master table that is a replica of the source table but
only contains the records for the user to view. This would overcome the
problem in #3, but would now be a maintenance nightmare to keep the query and
source tables in synch.
Bottom line, I need to be able to return a recordset of a large quantity of
data from a table that has exponentially greater amounts of data and then allow
the user to traverse through N pages without a major performance issue through
each subsequent page view.
Any ideas?
DCS Admin Guest
-
query of quries and large data types
I am running a query of queries on a query that runs agains an oracle database and returns a clob column. When I try to access the clob column in... -
How To Query Large Data/Records ?
Hi All, Here is my Problem - In my search form I have to query 50 million records in a single transaction and to display the results . - Without... -
View Menu selection options (Large icons, Small icons, List, Details or Thumbnails)
In the View Menu option of the File Save or File Open menus (4th icon to the right of the Save In or Look In drop-down boxes) Acrobat 5.0 gives you... -
Tuning format options for large EXT3 partition
On Mon, 22 Sep 2003 16:27:57 -0500, Mauriat staggered into the Black Sun and said: mke2fs -j -m 0 -N $NUMBER /dev/whatever (or) mke2fs -j -m 0... -
large number of records in query (orcale 8i)
Hi all, At the moment in our server ( written in java) we cache all records of a query from database (Oracle8i) and we manipulate them as per... -
kenji776 #2
Re: Large Query Options
Although I am not as advanced as you, if it was me, I would go for the session
loading way. It seems as though it would be the most secure, and probobly the
fastest way. Don't go for it on my advice alone, seeing as I'm just a 17 year
old kid with only about a year of CF experiance. But from what I know, I would
say it's the clear way to go. Good luck!
kenji776 Guest
-
DCS Admin #3
Re: Large Query Options
Kenji,
Thank you for the response. My initial inclination was to go with Session as
well and only reload the data set when they access page 1.
However, the downside is that Session variables are loaded into Server memory.
If you have 200+ concurrent users logged in each with a Session variable
containing 10,000+ records in a RecordSet object, you could potentially be
looking at a LARGE amount of shared memory usage.
Granted, it's a problem that can be solved by just throwing money at hardware
(versus people and time thrown at re-coding) but I want to make sure we've
explore all options first.
Thanks again!
DCS Admin Guest
-
Stressed_Simon #4
Re: Large Query Options
Is there a reason why you are not caching this query?
Stressed_Simon Guest
-
DCS Admin #5
Re: Large Query Options
I never fully understood exactly how the query caching worked... How do I cache
it? How do I force a reload of that cache?
If the function call automatically queries the data, when the query is cached,
how does it know not to re-query the database?
This sounds like what I want, but not sure how to practically implement it.
DCS Admin Guest
-
Stressed_Simon #6
Re: Large Query Options
OK when you do a select query you name it right?
ie <cfquery name="qMyQuery"........
Well when you add the following lines:-
<cfquery name="qMyQuery" cachedwithin="#CreateTimeSpan(0, 0, 30, 0)#">
That will now store the results of that query for 30 minutes. The four
sections in the CreateTimeSpan() function are as follows:-
CreateTimeSpan(Days, Hours, Minutes, Seconds)
So now I assume you are calling the same query on each page, or hopefully have
it stored in a CFC. This means that the first journey to the database will be
the big one and then for the next 30 minutes it will use the same results.
To clear cache you can set the CreateTimeSpan() function to a variable, and
then use cfif to pass CreateTimeSpan(0, 0, 0, 0) which will clear the query.
I hope that helps.
Stressed_Simon Guest
-
kenji776 #7
Re: Large Query Options
Yeah, go with Simon, he knows whats up. Good though, don't know why I didn't think about caching the query.. duh!
kenji776 Guest
-
JMGibson3 #8
Re: Large Query Options
Query Caching would impose a memory hit (not to mention flushing any other
cached queries) just as bad as Session variables. It also sounds like these
queries will be different for each user, so each user's flavor would burn up
another chunk of system cache (memory).
Are the queries always sorted, even if dynamic, for example SortKey = "col1" &
"col2" & "col3"?
If so you could run something like a TOP N type query with a WHERE #SortKey# >
'#LastKeyOnPage# ORDER BY #SortKey# DESC. If your DB doesn't support TOP N,
you could at least use the key as a starting point, and loop through a
MIN(#SortKey#)
WHERE #SortKey# > '#LoopKey# ORDER BY #SortKey# ASC, N times, starting with
LoopKey = LastKeyOnPage, changing it to last retrieved key. You'd then run N
single record queries. DB load will be worse in the second case which may not
matter too much if you have good indexes. Either way, at least you'll only get
N records back. Also to implement this you'll need to combine CF Formatting on
the values along with DB Formatting of the columns to ensure string values of
constant length (numbers=000001, dates = 20050505, etc.) for the WHERE and
ORDER BY's.
JMGibson3 Guest
-
The ScareCrow #9
Re: Large Query Options
JMGibson3,
You don't indicate what db your using. But I created a stored procedure for
sql server 2000.
The cf page passes an integer that indicates the starting (,000) and an
integer that indicates how many records to return.
So, lets assume you want to return, 1000 records at a time
Thus the result set would be
1 to 1000
1001 to 2000
ect....
You can then manage the amount of records being returned to the cf page.
Ken
The ScareCrow Guest
-
baz #10
Re: Large Query Options
I've had experience dealing with this issue myself, and judging by the
options you've outlined, you have quite a good understanding of whats
available and the pitfalls for each technique.
Forget the session idea completely - this is the least scalable, least
efficient and most buggy solution - hitting your database time after
time is much better and faster. Plus cflocking sucks. And when u
overload the session other important session variables might be kicked
out.
Basically it comes down to 2 things - functionality and speed.
If you want to search, sort and filter the entire recordset on the
server side (there are javascripts and flash techniques for doing the
same on the client side without hitting the server) your best choices
are either to hit the DB each time or use VERITY SEARCH (made for large
recordsets) One thing to consider if you decide to hit the DB each time
is that most people spend most of their time at the beginning part of a
recordset (like the first results page on google - most people don't
get to page 10+) So what you could do is return a query up to that
point. So for example, if there are 1000 records and the user is
viewing 1-25, you could just query the DB for the first 25. If the user
is looking at 26-50, now you would have to return the first 50.
Basically you would only return the entire recordset set if the user is
viewing records 976-1000 - this way you can maintain the sorting and
order.
If you don't mind limiting functionality you can be a lot faster and
your options are limitless. For example in one case, our recordsets
grew so large that we changed the page so that a user HAD to give a
search term and only the first 50 results got displayed.
Finally, depending on the type of data you are using, you can employ
grouping and aggregation. So for example instead of just dumping a list
of all orders, you could add up all orders per country and show a list
of 10 countries with the total orders for each. Then when the user
clicks on a country, it shows a bunch of cities with the total orders
for each. Then finally when you click a city will it will show the
individual orders. This way you will never need to return thousands of
orders.
Baz
baz Guest
-
Lupus 23 #11
Re: Large Query Options
Why not just use a query to limit the query to only the records that are
displayed on that particular page.
Depending on what database you use, the code will be a bit different.
The ScareCrow is on the right track, but you don't need to use a store proc to
do this.
In MS-SQL clever usage of the TOP command can get you what you need.
My SQL has a LIMIT command that will be useful.
Oracle has yet another command, but essentiall they all work the same.
Here is a rough example of how this might work in MS-SQL server or access.
SELECT TOP 5 field FROM (
SELECT TOP 10 field
FROM table_name
WHERE field = whatever
ORDER field desc
) order by field asc
This example would select records 5 to 10.
The 5 is how many records per page. The 10 is the starting point, and will be
changed based on what 'page' you are on.
Looking at the query you will see the principle. We first select records we
need in the subquery to the end of the record set ordered descending. Then we
reorder that list ascending, and select off the top X number we want on the
page.
Follow?
No stored procs, no pain, and you relieve your query size problems without
subjecting your server to caching the entire recordset.
Lupus 23 Guest
-
DCS Admin #12
Re: Large Query Options
Everyone,
Thank you for your feedback so far.
Lupus I definately think your suggestion is heading in the direction I want to
go. One problem I forsee with that is the potential for lost records... If a
new order comes in while I'm "viewing" records 10 - 15, based on the sort
criteria, there is a chance that order will push the row numbers forwards or
backwards by 1, thus allowing for a single record to get lost between pages.
How would you handle that scenario?
DCS Admin Guest
-
DCS Admin #13
Re: Large Query Options
Also, we're using Oracle 9i as the backend database
DCS Admin Guest
-
Lupus 23 #14
Re: Large Query Options
DCS,
That is a risk that you may have to take, but even if you were pulling back
the entire recordset you would still have this problem.
If you code it such that it works based on number of records, rather than by
dealing with it in pages, then the new record should just slip into the results.
Let see.. the Oracle command that is simular to TOP is
select field
from table
where rownum <= 10
ROWNUM is the key here. You can see how you could reconstruct the idea of my
TOP based query above using rownum.
Something like this:
SELECT field
FROM (
SELECT field
FROM table_name
WHERE field = whatever
AND rownum <= 10
ORDER BY field desc
)
WHERE rownum <= 5
ORDER BY field asc
Lupus 23 Guest



Reply With Quote

