Ask a Question related to Informix, Design and Development.
-
Thomas J. Girsch #1
UPDATE STATISTICS
Imagine a table tab1 with a bazillion rows and five columns. There is one
index on the table, on column A.
If I do:
UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
.... it runs very quickly. But if I just do:
UPDATE STATISTICS LOW FOR TABLE tab1;
.... it takes hours, maybe days. So what exactly does the latter command do,
that the former command does NOT do?
Thomas J. Girsch Guest
-
update statistics slowing system
Should doing an update statistics on a table in one database affect the preformance of users in a totally separate database. Does it lock some... -
Update Statistics High Error
When I try to execute update statistics high, I get the following error : Database selected. 564: Cannot sort rows. 179: ISAM error: no... -
Update statistics suggestions --Urgent
Hi I have a database with around 80 to 90 tables , and cron job is running for UPDATE STATS "update statistics high " runs every night on... -
Tetra and Update Statistics
Paul Watson wrote: YER JOKING! Not even low? If there are indexes, then it *surely* needs to know at least how many frikken rows there are in... -
Error message on update statistics high
On Wed, 30 Jul 2003 15:01:59 -0400, Francisco Roldan wrote: Or set PDQPRIORITY > 2 to sort in memory and set PSORT_NPROCS to invoke the parallel... -
Mark Denham #2
Re: UPDATE STATISTICS
Updates statistics for each column in the table. If your version is older
than 7.31UD2 or 9.30 [not certain for 9] I think you will find that the
table is read multiple times, once for each col in the table.
This would explain the astronomical increase in the runtime.
Mark
----- Original Message -----
From: "Thomas J. Girsch" <tgirsch@worldnet.att.net>
To: <informix-list@iiug.org>
Sent: Thursday, August 21, 2003 21:12
Subject: UPDATE STATISTICS
do,> Imagine a table tab1 with a bazillion rows and five columns. There is one
> index on the table, on column A.
>
> If I do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
>
> ... it runs very quickly. But if I just do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1;
>
> ... it takes hours, maybe days. So what exactly does the latter commandsending to informix-list> that the former command does NOT do?
>
>
Mark Denham Guest
-
Olivier CHRISTOL #3
Re: UPDATE STATISTICS
"Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message news:<O5e1b.97166$X43.68994@clmboh1-nws5.columbus.rr.com>...
Dear All,> Imagine a table tab1 with a bazillion rows and five columns. There is one
> index on the table, on column A.
>
> If I do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
>
> ... it runs very quickly. But if I just do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1;
>
> ... it takes hours, maybe days. So what exactly does the latter command do,
> that the former command does NOT do?
I experience the same problem. The update statistics low takes about 1
sec when I do the update on all columns separetly and five hours when
I do the update statistics low on the table .
The update statistics low for a column does not update indexes whereas
the low for the table do.
Is anyone here has an idea on how you can improve time of an update
statistics low for my_table in IDS 7.31UC5 and how you can limit the
impact on users.
Thanks in advance for any helpfull remarks
Olivier
Olivier CHRISTOL Guest
-
Rob Wilson #4
Re: UPDATE STATISTICS
Olivier CHRISTOL wrote:
It is my understanding (but I have never tested) that if you have a> The update statistics low for a column does not update indexes whereas
> the low for the table do.
>
single-column index and you update stats low on that column, the stats
for that index are updated. It is my further understanding that for
multi-column indexes if you update stats low and list all the columns in
that index, the stats will be updated.
So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
one on C2, C3:
update statistics low for table T(C1);
update statistics low for table T(C2, C3);
would be the proper method to update stats low.
I have further found (through experience and testing) that following the
recommendations in the Guide to Performance for your particular flavor
of Informix tends to both perform well when running and also cause good
performance for other queries. Yes, there are exceptions. Always test
first.
Rob Wilson Guest
-
Francisco Roldan #5
RE: UPDATE STATISTICS
Following your example :
So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
one on C2, C3:
update statistics low for table T(C1);
update statistics low for table T(C2, C3);
would be the proper method to update stats low.
Update statistics low for table T(C1,C2,C3)
Would not update stats on any index?
regards
-----Mensaje original-----
De: Rob Wilson [mailto:rob_wilson@ameritech.net]
Enviado el: Viernes, 22 de Agosto de 2003 10:11 a.m.
Para: [email]informix-list@iiug.org[/email]
Asunto: Re: UPDATE STATISTICS
Olivier CHRISTOL wrote:
It is my understanding (but I have never tested) that if you have a> The update statistics low for a column does not update indexes whereas
> the low for the table do.
>
single-column index and you update stats low on that column, the stats
for that index are updated. It is my further understanding that for
multi-column indexes if you update stats low and list all the columns in
that index, the stats will be updated.
So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
one on C2, C3:
update statistics low for table T(C1);
update statistics low for table T(C2, C3);
would be the proper method to update stats low.
I have further found (through experience and testing) that following the
recommendations in the Guide to Performance for your particular flavor
of Informix tends to both perform well when running and also cause good
performance for other queries. Yes, there are exceptions. Always test
first.
sending to informix-list
Francisco Roldan Guest
-
Rajib Sarkar #6
Re: UPDATE STATISTICS
When you do a UPDATE STATS is broken into 2 phases: 1) update the
information based on indices 2) update the distribution information.
For update stats low only the first phase is run...
In this phase the following is done in sequence:
1) The partition page is read to update the nrows and npused columns of the
table (for a fragmented table, each fragment's nrows and npused is updated
using the partition page)
2) Update column statistics of the table if there are columns with indexes
and the user has not specifically requested "with distributions only".
Inside 2) the engine loops through each index and checks if the keys in the
index are part of the column list specified by the user if yes, it'll scan
the index to gather the index statistics if not, it'll skip this scan.
That's why when you specify the update stats of a column with a index it
takes longer than when a non-indexed column is specified, since step 2) is
skipped for this.
So, if you don't specify any columns in the update stats it'll essentially
scan through all the indexes in order to get the statistics, and if the
table is pretty volatile i.e. lots of updates/delete operations then while
scanning it'll put the 'marked for deleted' keys to the B-Tree cleaner
thread which can slow up the things a bit too.
The update stats has been improved a lot with parallelization and also
better usage of the memory / heap allocated for the process ... if you want
the white paper on the improved update stats, let me know I'll send it over
to you in a seperate email.
HTH
Thanx much,
Rajib Sarkar
Advisory Software Engineer (RAS)
IBM Data Management Group
Ph : (602)-217-2100
Fax: (602)-217-2100
T/L : 667-2100
As long as you derive inner help and comfort from anything, keep it --
Mahatma Gandhi
olivier.christol@rf
o.atmel.com To: [email]informix-list@iiug.org[/email]
(Olivier CHRISTOL) cc:
Sent by: Subject: Re: UPDATE STATISTICS
owner-informix-list
@iiug.org
08/22/2003 05:41 AM
Please respond to
olivier.christol
"Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message
news:<O5e1b.97166$X43.68994@clmboh1-nws5.columbus.rr.com>...one> Imagine a table tab1 with a bazillion rows and five columns. There isdo,> index on the table, on column A.
>
> If I do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
>
> ... it runs very quickly. But if I just do:
>
> UPDATE STATISTICS LOW FOR TABLE tab1;
>
> ... it takes hours, maybe days. So what exactly does the latter commandDear All,> that the former command does NOT do?
I experience the same problem. The update statistics low takes about 1
sec when I do the update on all columns separetly and five hours when
I do the update statistics low on the table .
The update statistics low for a column does not update indexes whereas
the low for the table do.
Is anyone here has an idea on how you can improve time of an update
statistics low for my_table in IDS 7.31UC5 and how you can limit the
impact on users.
Thanks in advance for any helpfull remarks
Olivier
sending to informix-list
Rajib Sarkar Guest
-
Thomas J. Girsch #7
Re: UPDATE STATISTICS
"Olivier CHRISTOL" <olivier.christol@rfo.atmel.com> wrote in message > Dear
All,What I've found is that if you do an UPDATE STATISTICS LOW for each _index_,>
> I experience the same problem. The update statistics low takes about 1
> sec when I do the update on all columns separetly and five hours when
> I do the update statistics low on the table .
> The update statistics low for a column does not update indexes whereas
> the low for the table do.
>
> Is anyone here has an idea on how you can improve time of an update
> statistics low for my_table in IDS 7.31UC5 and how you can limit the
> impact on users.
>
> Thanks in advance for any helpfull remarks
>
> Olivier
that's all you need to do. Unless someone can explain otherwise. Example,
you've got a table with columns a, b, c, d, and e, and you've got a
composite index on a,b,c; a composite index on b, c, a; and a singleton
index on column b; you could do this:
UPDATE STATISTICS LOW FOR TABLE tab1(a,b,c);
UPDATE STATISTICS LOW FOR TABLE tab1(b,c,a);
UPDATE STATISTICS LOW FOR TABLE tab1(b);
This would update all the sysindexes rows appropriately. Any one of the
above would update nrows in systables appropriately. And running all three
of these combined would take considerably less time than just running UPDATE
STATISTICS LOW FOR TABLE tab1;
So my ultimate question becomes, why would you ever run UPDATE STATISTICS
LOW FOR TABLE tab1?
Thomas J. Girsch Guest
-
Thomas J. Girsch #8
Re: UPDATE STATISTICS
Thanks, Art. Already have dostats, as well as a home grown updstats
program. We've found that in our case, our application runs much better
WITHOUT distributions. LOW only stats are best for us.
But my ultimate question has not been answered to my satisfaction. Imagine
the following table:
CREATE TABLE tab1 (
a INTEGER,
b CHAR(5),
c DATE,
d CHAR(30)
);
CREATE UNIQUE INDEX ix_tab1_00 ON tab1(a,b);
CREATE INDEX ix_tab1_01 ON tab1(b);
CREATE INDEX ix_tab1_01 ON tab1(c);
If I run:
UPDATE STATISTICS LOW FOR TABLE tab1(a,b);
UPDATE STATISTICS LOW FOR TABLE tab1(b);
UPDATE STATISTICS LOW FOR TABLE tab1(c);
.... it updates the sysindexes records, as well as the nrows in systables.
If I run:
UPDATE STATISTICS LOW FOR TABLE tab1;
.... it also does these things. But it takes worlds longer, even in IDS 9.3.
So my assumption, possibly incorrect, is that the latter command must be
doing something _else_, something that the previous three UPDATE STATISTICS
commands didn't do. Updating some other part of the system catalog,
perhaps? If yes, then what exactly is it doing in addition to the above,
and what benefit can I expect to see from that? (Or what penalty for NOT
doing that?) If no, then why does the latter command take SO much longer?
Wouldn't that then be *gasp* a bug?
"Art S. Kagel" <kagel@bloomberg.net> wrote in message
news:pan.2003.08.22.14.14.03.774652.10594@bloomber g.net...> On Thu, 21 Aug 2003 21:12:14 -0400, Thomas J. Girsch wrote:
>
> If you operate on the one column that is indexed then it only collects
> info on the one column and updates just that one matching index's
> sysindex(sysindices) record. If you update stats on the entire table it
> has to collect data for all of the columns. That requires multiple
> sorts.
>
> Get my dostats utility. It performs the minimal update stats statements
> as recommended in the Performance Guide for you using optimized
> statements.
>
> Dostats is part of the package utils2_ak available from the IIUG Software
> Repository.
>
> Art S. Kagel
>> > Imagine a table tab1 with a bazillion rows and five columns. There is
> > one index on the table, on column A.
> >
> > If I do:
> >
> > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
> >
> > ... it runs very quickly. But if I just do:
> >
> > UPDATE STATISTICS LOW FOR TABLE tab1;
> >
> > ... it takes hours, maybe days. So what exactly does the latter command
> > do, that the former command does NOT do?
Thomas J. Girsch Guest
-
Mark Denham #9
Re: UPDATE STATISTICS
I have found the same thing on numerous occasions with Peoplesoft. With the
odd exception, distributions have just served to soak up CPU time.
Mark
----- Original Message -----
From: "Thomas J. Girsch" <tgirsch@worldnet.att.net>
To: <informix-list@iiug.org>
Sent: Friday, August 22, 2003 19:10
Subject: Re: UPDATE STATISTICS
Imagine> Thanks, Art. Already have dostats, as well as a home grown updstats
> program. We've found that in our case, our application runs much better
> WITHOUT distributions. LOW only stats are best for us.
>
> But my ultimate question has not been answered to my satisfaction.9.3.> the following table:
>
> CREATE TABLE tab1 (
> a INTEGER,
> b CHAR(5),
> c DATE,
> d CHAR(30)
> );
>
> CREATE UNIQUE INDEX ix_tab1_00 ON tab1(a,b);
> CREATE INDEX ix_tab1_01 ON tab1(b);
> CREATE INDEX ix_tab1_01 ON tab1(c);
>
> If I run:
>
> UPDATE STATISTICS LOW FOR TABLE tab1(a,b);
> UPDATE STATISTICS LOW FOR TABLE tab1(b);
> UPDATE STATISTICS LOW FOR TABLE tab1(c);
>
> ... it updates the sysindexes records, as well as the nrows in systables.
>
> If I run:
>
> UPDATE STATISTICS LOW FOR TABLE tab1;
>
> ... it also does these things. But it takes worlds longer, even in IDSSTATISTICS> So my assumption, possibly incorrect, is that the latter command must be
> doing something _else_, something that the previous three UPDATESoftware> commands didn't do. Updating some other part of the system catalog,
> perhaps? If yes, then what exactly is it doing in addition to the above,
> and what benefit can I expect to see from that? (Or what penalty for NOT
> doing that?) If no, then why does the latter command take SO much longer?
> Wouldn't that then be *gasp* a bug?
>
> "Art S. Kagel" <kagel@bloomberg.net> wrote in message
> news:pan.2003.08.22.14.14.03.774652.10594@bloomber g.net...> > On Thu, 21 Aug 2003 21:12:14 -0400, Thomas J. Girsch wrote:
> >
> > If you operate on the one column that is indexed then it only collects
> > info on the one column and updates just that one matching index's
> > sysindex(sysindices) record. If you update stats on the entire table it
> > has to collect data for all of the columns. That requires multiple
> > sorts.
> >
> > Get my dostats utility. It performs the minimal update stats statements
> > as recommended in the Performance Guide for you using optimized
> > statements.
> >
> > Dostats is part of the package utils2_ak available from the IIUGcommand> > Repository.
> >
> > Art S. Kagel
> >> > > Imagine a table tab1 with a bazillion rows and five columns. There is
> > > one index on the table, on column A.
> > >
> > > If I do:
> > >
> > > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
> > >
> > > ... it runs very quickly. But if I just do:
> > >
> > > UPDATE STATISTICS LOW FOR TABLE tab1;
> > >
> > > ... it takes hours, maybe days. So what exactly does the lattersending to informix-list>> > > do, that the former command does NOT do?
>
Mark Denham Guest
-
David Williams #10
Re: UPDATE STATISTICS
"Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message
news:7qx1b.37$Ce2.30@clmboh1-nws5.columbus.rr.com...Imagine> Thanks, Art. Already have dostats, as well as a home grown updstats
> program. We've found that in our case, our application runs much better
> WITHOUT distributions. LOW only stats are best for us.
>
> But my ultimate question has not been answered to my satisfaction.
Log a call with Informix tech support...
David Williams Guest
-
Jean Sagi #11
Re: UPDATE STATISTICS
Rajib Sarkar wrote:--> The update stats has been improved a lot with parallelization and also
> better usage of the memory / heap allocated for the process ... if you want
> the white paper on the improved update stats, let me know I'll send it over
> to you in a seperate email.
>
> HTH
It certainly would be a good read.
If you can I want it.
Atte,
Jesús Antonio Santos Giraldo
[email]jeansagi@myrealbox.com[/email]
[email]jeansagi@netscape.net[/email]
sending to informix-list
Jean Sagi Guest



Reply With Quote

