Ask a Question related to Informix, Design and Development.

  1. #1

    Default UPDATE STATISTICS

    Imagine a table tab1 with a bazillion rows and five columns. There is one
    index on the table, on column A.

    If I do:

    UPDATE STATISTICS LOW FOR TABLE tab1(col_a);

    .... it runs very quickly. But if I just do:

    UPDATE STATISTICS LOW FOR TABLE tab1;

    .... it takes hours, maybe days. So what exactly does the latter command do,
    that the former command does NOT do?


    Thomas J. Girsch Guest

  2. Similar Questions and Discussions

    1. update statistics slowing system
      Should doing an update statistics on a table in one database affect the preformance of users in a totally separate database. Does it lock some...
    2. Update Statistics High Error
      When I try to execute update statistics high, I get the following error : Database selected. 564: Cannot sort rows. 179: ISAM error: no...
    3. Update statistics suggestions --Urgent
      Hi I have a database with around 80 to 90 tables , and cron job is running for UPDATE STATS "update statistics high " runs every night on...
    4. Tetra and Update Statistics
      Paul Watson wrote: YER JOKING! Not even low? If there are indexes, then it *surely* needs to know at least how many frikken rows there are in...
    5. Error message on update statistics high
      On Wed, 30 Jul 2003 15:01:59 -0400, Francisco Roldan wrote: Or set PDQPRIORITY > 2 to sort in memory and set PSORT_NPROCS to invoke the parallel...
  3. #2

    Default Re: UPDATE STATISTICS


    Updates statistics for each column in the table. If your version is older
    than 7.31UD2 or 9.30 [not certain for 9] I think you will find that the
    table is read multiple times, once for each col in the table.

    This would explain the astronomical increase in the runtime.

    Mark

    ----- Original Message -----
    From: "Thomas J. Girsch" <tgirsch@worldnet.att.net>
    To: <informix-list@iiug.org>
    Sent: Thursday, August 21, 2003 21:12
    Subject: UPDATE STATISTICS

    > Imagine a table tab1 with a bazillion rows and five columns. There is one
    > index on the table, on column A.
    >
    > If I do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
    >
    > ... it runs very quickly. But if I just do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1;
    >
    > ... it takes hours, maybe days. So what exactly does the latter command
    do,
    > that the former command does NOT do?
    >
    >
    sending to informix-list
    Mark Denham Guest

  4. #3

    Default Re: UPDATE STATISTICS

    "Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message news:<O5e1b.97166$X43.68994@clmboh1-nws5.columbus.rr.com>...
    > Imagine a table tab1 with a bazillion rows and five columns. There is one
    > index on the table, on column A.
    >
    > If I do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
    >
    > ... it runs very quickly. But if I just do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1;
    >
    > ... it takes hours, maybe days. So what exactly does the latter command do,
    > that the former command does NOT do?
    Dear All,

    I experience the same problem. The update statistics low takes about 1
    sec when I do the update on all columns separetly and five hours when
    I do the update statistics low on the table .
    The update statistics low for a column does not update indexes whereas
    the low for the table do.

    Is anyone here has an idea on how you can improve time of an update
    statistics low for my_table in IDS 7.31UC5 and how you can limit the
    impact on users.

    Thanks in advance for any helpfull remarks

    Olivier
    Olivier CHRISTOL Guest

  5. #4

    Default Re: UPDATE STATISTICS

    Olivier CHRISTOL wrote:

    > The update statistics low for a column does not update indexes whereas
    > the low for the table do.
    >
    It is my understanding (but I have never tested) that if you have a
    single-column index and you update stats low on that column, the stats
    for that index are updated. It is my further understanding that for
    multi-column indexes if you update stats low and list all the columns in
    that index, the stats will be updated.

    So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
    one on C2, C3:

    update statistics low for table T(C1);
    update statistics low for table T(C2, C3);

    would be the proper method to update stats low.

    I have further found (through experience and testing) that following the
    recommendations in the Guide to Performance for your particular flavor
    of Informix tends to both perform well when running and also cause good
    performance for other queries. Yes, there are exceptions. Always test
    first.

    Rob Wilson Guest

  6. #5

    Default RE: UPDATE STATISTICS


    Following your example :
    So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
    one on C2, C3:

    update statistics low for table T(C1);
    update statistics low for table T(C2, C3);

    would be the proper method to update stats low.

    Update statistics low for table T(C1,C2,C3)
    Would not update stats on any index?


    regards



    -----Mensaje original-----
    De: Rob Wilson [mailto:rob_wilson@ameritech.net]
    Enviado el: Viernes, 22 de Agosto de 2003 10:11 a.m.
    Para: [email]informix-list@iiug.org[/email]
    Asunto: Re: UPDATE STATISTICS


    Olivier CHRISTOL wrote:

    > The update statistics low for a column does not update indexes whereas
    > the low for the table do.
    >
    It is my understanding (but I have never tested) that if you have a
    single-column index and you update stats low on that column, the stats
    for that index are updated. It is my further understanding that for
    multi-column indexes if you update stats low and list all the columns in
    that index, the stats will be updated.

    So, table T has columns C1, C2, C3, C4 and two indexes: one on C1 and
    one on C2, C3:

    update statistics low for table T(C1);
    update statistics low for table T(C2, C3);

    would be the proper method to update stats low.

    I have further found (through experience and testing) that following the
    recommendations in the Guide to Performance for your particular flavor
    of Informix tends to both perform well when running and also cause good
    performance for other queries. Yes, there are exceptions. Always test
    first.
    sending to informix-list
    Francisco Roldan Guest

  7. #6

    Default Re: UPDATE STATISTICS






    When you do a UPDATE STATS is broken into 2 phases: 1) update the
    information based on indices 2) update the distribution information.

    For update stats low only the first phase is run...

    In this phase the following is done in sequence:

    1) The partition page is read to update the nrows and npused columns of the
    table (for a fragmented table, each fragment's nrows and npused is updated
    using the partition page)
    2) Update column statistics of the table if there are columns with indexes
    and the user has not specifically requested "with distributions only".

    Inside 2) the engine loops through each index and checks if the keys in the
    index are part of the column list specified by the user if yes, it'll scan
    the index to gather the index statistics if not, it'll skip this scan.

    That's why when you specify the update stats of a column with a index it
    takes longer than when a non-indexed column is specified, since step 2) is
    skipped for this.

    So, if you don't specify any columns in the update stats it'll essentially
    scan through all the indexes in order to get the statistics, and if the
    table is pretty volatile i.e. lots of updates/delete operations then while
    scanning it'll put the 'marked for deleted' keys to the B-Tree cleaner
    thread which can slow up the things a bit too.

    The update stats has been improved a lot with parallelization and also
    better usage of the memory / heap allocated for the process ... if you want
    the white paper on the improved update stats, let me know I'll send it over
    to you in a seperate email.

    HTH




    Thanx much,

    Rajib Sarkar
    Advisory Software Engineer (RAS)
    IBM Data Management Group
    Ph : (602)-217-2100
    Fax: (602)-217-2100
    T/L : 667-2100

    As long as you derive inner help and comfort from anything, keep it --
    Mahatma Gandhi



    olivier.christol@rf
    o.atmel.com To: [email]informix-list@iiug.org[/email]
    (Olivier CHRISTOL) cc:
    Sent by: Subject: Re: UPDATE STATISTICS
    owner-informix-list
    @iiug.org


    08/22/2003 05:41 AM
    Please respond to
    olivier.christol





    "Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message
    news:<O5e1b.97166$X43.68994@clmboh1-nws5.columbus.rr.com>...
    > Imagine a table tab1 with a bazillion rows and five columns. There is
    one
    > index on the table, on column A.
    >
    > If I do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
    >
    > ... it runs very quickly. But if I just do:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1;
    >
    > ... it takes hours, maybe days. So what exactly does the latter command
    do,
    > that the former command does NOT do?
    Dear All,

    I experience the same problem. The update statistics low takes about 1
    sec when I do the update on all columns separetly and five hours when
    I do the update statistics low on the table .
    The update statistics low for a column does not update indexes whereas
    the low for the table do.

    Is anyone here has an idea on how you can improve time of an update
    statistics low for my_table in IDS 7.31UC5 and how you can limit the
    impact on users.

    Thanks in advance for any helpfull remarks

    Olivier


    sending to informix-list
    Rajib Sarkar Guest

  8. #7

    Default Re: UPDATE STATISTICS


    "Olivier CHRISTOL" <olivier.christol@rfo.atmel.com> wrote in message > Dear
    All,
    >
    > I experience the same problem. The update statistics low takes about 1
    > sec when I do the update on all columns separetly and five hours when
    > I do the update statistics low on the table .
    > The update statistics low for a column does not update indexes whereas
    > the low for the table do.
    >
    > Is anyone here has an idea on how you can improve time of an update
    > statistics low for my_table in IDS 7.31UC5 and how you can limit the
    > impact on users.
    >
    > Thanks in advance for any helpfull remarks
    >
    > Olivier
    What I've found is that if you do an UPDATE STATISTICS LOW for each _index_,
    that's all you need to do. Unless someone can explain otherwise. Example,
    you've got a table with columns a, b, c, d, and e, and you've got a
    composite index on a,b,c; a composite index on b, c, a; and a singleton
    index on column b; you could do this:

    UPDATE STATISTICS LOW FOR TABLE tab1(a,b,c);
    UPDATE STATISTICS LOW FOR TABLE tab1(b,c,a);
    UPDATE STATISTICS LOW FOR TABLE tab1(b);

    This would update all the sysindexes rows appropriately. Any one of the
    above would update nrows in systables appropriately. And running all three
    of these combined would take considerably less time than just running UPDATE
    STATISTICS LOW FOR TABLE tab1;

    So my ultimate question becomes, why would you ever run UPDATE STATISTICS
    LOW FOR TABLE tab1?


    Thomas J. Girsch Guest

  9. #8

    Default Re: UPDATE STATISTICS

    Thanks, Art. Already have dostats, as well as a home grown updstats
    program. We've found that in our case, our application runs much better
    WITHOUT distributions. LOW only stats are best for us.

    But my ultimate question has not been answered to my satisfaction. Imagine
    the following table:

    CREATE TABLE tab1 (
    a INTEGER,
    b CHAR(5),
    c DATE,
    d CHAR(30)
    );

    CREATE UNIQUE INDEX ix_tab1_00 ON tab1(a,b);
    CREATE INDEX ix_tab1_01 ON tab1(b);
    CREATE INDEX ix_tab1_01 ON tab1(c);

    If I run:

    UPDATE STATISTICS LOW FOR TABLE tab1(a,b);
    UPDATE STATISTICS LOW FOR TABLE tab1(b);
    UPDATE STATISTICS LOW FOR TABLE tab1(c);

    .... it updates the sysindexes records, as well as the nrows in systables.

    If I run:

    UPDATE STATISTICS LOW FOR TABLE tab1;

    .... it also does these things. But it takes worlds longer, even in IDS 9.3.
    So my assumption, possibly incorrect, is that the latter command must be
    doing something _else_, something that the previous three UPDATE STATISTICS
    commands didn't do. Updating some other part of the system catalog,
    perhaps? If yes, then what exactly is it doing in addition to the above,
    and what benefit can I expect to see from that? (Or what penalty for NOT
    doing that?) If no, then why does the latter command take SO much longer?
    Wouldn't that then be *gasp* a bug?

    "Art S. Kagel" <kagel@bloomberg.net> wrote in message
    news:pan.2003.08.22.14.14.03.774652.10594@bloomber g.net...
    > On Thu, 21 Aug 2003 21:12:14 -0400, Thomas J. Girsch wrote:
    >
    > If you operate on the one column that is indexed then it only collects
    > info on the one column and updates just that one matching index's
    > sysindex(sysindices) record. If you update stats on the entire table it
    > has to collect data for all of the columns. That requires multiple
    > sorts.
    >
    > Get my dostats utility. It performs the minimal update stats statements
    > as recommended in the Performance Guide for you using optimized
    > statements.
    >
    > Dostats is part of the package utils2_ak available from the IIUG Software
    > Repository.
    >
    > Art S. Kagel
    >
    > > Imagine a table tab1 with a bazillion rows and five columns. There is
    > > one index on the table, on column A.
    > >
    > > If I do:
    > >
    > > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
    > >
    > > ... it runs very quickly. But if I just do:
    > >
    > > UPDATE STATISTICS LOW FOR TABLE tab1;
    > >
    > > ... it takes hours, maybe days. So what exactly does the latter command
    > > do, that the former command does NOT do?

    Thomas J. Girsch Guest

  10. #9

    Default Re: UPDATE STATISTICS


    I have found the same thing on numerous occasions with Peoplesoft. With the
    odd exception, distributions have just served to soak up CPU time.

    Mark

    ----- Original Message -----
    From: "Thomas J. Girsch" <tgirsch@worldnet.att.net>
    To: <informix-list@iiug.org>
    Sent: Friday, August 22, 2003 19:10
    Subject: Re: UPDATE STATISTICS

    > Thanks, Art. Already have dostats, as well as a home grown updstats
    > program. We've found that in our case, our application runs much better
    > WITHOUT distributions. LOW only stats are best for us.
    >
    > But my ultimate question has not been answered to my satisfaction.
    Imagine
    > the following table:
    >
    > CREATE TABLE tab1 (
    > a INTEGER,
    > b CHAR(5),
    > c DATE,
    > d CHAR(30)
    > );
    >
    > CREATE UNIQUE INDEX ix_tab1_00 ON tab1(a,b);
    > CREATE INDEX ix_tab1_01 ON tab1(b);
    > CREATE INDEX ix_tab1_01 ON tab1(c);
    >
    > If I run:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1(a,b);
    > UPDATE STATISTICS LOW FOR TABLE tab1(b);
    > UPDATE STATISTICS LOW FOR TABLE tab1(c);
    >
    > ... it updates the sysindexes records, as well as the nrows in systables.
    >
    > If I run:
    >
    > UPDATE STATISTICS LOW FOR TABLE tab1;
    >
    > ... it also does these things. But it takes worlds longer, even in IDS
    9.3.
    > So my assumption, possibly incorrect, is that the latter command must be
    > doing something _else_, something that the previous three UPDATE
    STATISTICS
    > commands didn't do. Updating some other part of the system catalog,
    > perhaps? If yes, then what exactly is it doing in addition to the above,
    > and what benefit can I expect to see from that? (Or what penalty for NOT
    > doing that?) If no, then why does the latter command take SO much longer?
    > Wouldn't that then be *gasp* a bug?
    >
    > "Art S. Kagel" <kagel@bloomberg.net> wrote in message
    > news:pan.2003.08.22.14.14.03.774652.10594@bloomber g.net...
    > > On Thu, 21 Aug 2003 21:12:14 -0400, Thomas J. Girsch wrote:
    > >
    > > If you operate on the one column that is indexed then it only collects
    > > info on the one column and updates just that one matching index's
    > > sysindex(sysindices) record. If you update stats on the entire table it
    > > has to collect data for all of the columns. That requires multiple
    > > sorts.
    > >
    > > Get my dostats utility. It performs the minimal update stats statements
    > > as recommended in the Performance Guide for you using optimized
    > > statements.
    > >
    > > Dostats is part of the package utils2_ak available from the IIUG
    Software
    > > Repository.
    > >
    > > Art S. Kagel
    > >
    > > > Imagine a table tab1 with a bazillion rows and five columns. There is
    > > > one index on the table, on column A.
    > > >
    > > > If I do:
    > > >
    > > > UPDATE STATISTICS LOW FOR TABLE tab1(col_a);
    > > >
    > > > ... it runs very quickly. But if I just do:
    > > >
    > > > UPDATE STATISTICS LOW FOR TABLE tab1;
    > > >
    > > > ... it takes hours, maybe days. So what exactly does the latter
    command
    > > > do, that the former command does NOT do?
    >
    >
    sending to informix-list
    Mark Denham Guest

  11. #10

    Default Re: UPDATE STATISTICS


    "Thomas J. Girsch" <tgirsch@worldnet.att.net> wrote in message
    news:7qx1b.37$Ce2.30@clmboh1-nws5.columbus.rr.com...
    > Thanks, Art. Already have dostats, as well as a home grown updstats
    > program. We've found that in our case, our application runs much better
    > WITHOUT distributions. LOW only stats are best for us.
    >
    > But my ultimate question has not been answered to my satisfaction.
    Imagine

    Log a call with Informix tech support...



    David Williams Guest

  12. #11

    Default Re: UPDATE STATISTICS


    Rajib Sarkar wrote:
    > The update stats has been improved a lot with parallelization and also
    > better usage of the memory / heap allocated for the process ... if you want
    > the white paper on the improved update stats, let me know I'll send it over
    > to you in a seperate email.
    >
    > HTH
    --
    It certainly would be a good read.

    If you can I want it.


    Atte,


    Jesús Antonio Santos Giraldo
    [email]jeansagi@myrealbox.com[/email]
    [email]jeansagi@netscape.net[/email]


    sending to informix-list
    Jean Sagi Guest

Posting Permissions

  • You may not post new threads
  • You may post replies
  • You may not post attachments
  • You may not edit your posts

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139