Professional Web Applications Themes

How to optimise many updates on 1Mio rows? - Microsoft SQL / MS SQL Server

I need to perform many update queries on the same table sporting 1 Mio rows. It is not possible to do all updates in one query, I definitely need several queries. On the other hand, in each update query, I'd rather update only about 20,000 rows at a time to avoid ing the log file. The "top 20000" or "set rowcount 20000" methods don't seem to help the optimiser, probably because the update queries are joined, like in: UPDATE Orders SET OrderType = "unknown" FROM Orders LEFT JOIN OrderTypes ON Orders.OrderType = OrderTypes.OrderType WHERE OrderTypes.OrderType IS NULL I have an ...

  1. #1

    Default How to optimise many updates on 1Mio rows?


    I need to perform many update queries on the same table
    sporting 1 Mio rows.
    It is not possible to do all updates in one query, I
    definitely need several queries.

    On the other hand, in each update query, I'd rather update
    only about 20,000 rows at a time to avoid ing the log
    file.

    The "top 20000" or "set rowcount 20000" methods don't seem
    to help the optimiser, probably because the update queries
    are joined, like in:

    UPDATE Orders
    SET OrderType = "unknown"
    FROM Orders
    LEFT JOIN OrderTypes
    ON Orders.OrderType = OrderTypes.OrderType
    WHERE OrderTypes.OrderType IS NULL

    I have an int column on the table (an order number), but
    it is not a key, and there is no guarantee that the order
    numbers are in sequence.
    There could be gaps of more than 20,000 in the order
    numbers.

    How would you go about splitting the work in "slices" of
    20,000 rows?
    It would be nice if the "slicing" could be done once at
    the beginning, and reused in all the following update
    queries.

    Thanks in advance.

    George Guest

  2. #2

    Default Re: How to optimise many updates on 1Mio rows?


    Perhaps a more realistic example might help as the join is not required
    the way you expressed it ?
    Could you run the job more frequently so that less would be updated or
    do you always need to update many rows each time regardless - i.e. like
    updating a value by 10% under certain conditions ?

    *** Sent via Developersdex [url]http://www.developersdex.com[/url] ***
    Don't just participate in USENET...get rewarded for it!
    malcolm k Guest

  3. #3

    Default Re: How to optimise many updates on 1Mio rows?


    Why are you saying that the join isn't necessary?

    Yes, I need to update all the rows each time. It's part of
    a data import.
    >-----Original Message-----
    >
    >Perhaps a more realistic example might help as the join
    is not required
    >the way you expressed it ?
    >Could you run the job more frequently so that less would
    be updated or
    >do you always need to update many rows each time
    regardless - i.e. like
    >updating a value by 10% under certain conditions ?
    >
    >*** Sent via Developersdex [url]http://www.developersdex.com[/url]
    ***
    >Don't just participate in USENET...get rewarded for it!
    >.
    >
    George Guest

  4. #4

    Default Re: How to optimise many updates on 1Mio rows?

    George wrote:
    > Why are you saying that the join isn't necessary?
    >
    Let's look at it again:
    UPDATE Orders
    SET OrderType = "unknown"
    FROM Orders
    LEFT JOIN OrderTypes
    ON Orders.OrderType = OrderTypes.OrderType
    WHERE OrderTypes.OrderType IS NULL

    You are joining by OrderType. So, if OrderTypes.OrderType is null, then
    Orders.OrderType must also be Null, correct? So why bring OrderTypes into
    the picture? This can only slow things down, especially since you are using
    a Left Join, which is also misguided.*

    Just update all the rows in Orders where Order.OrderType is Null:

    UPDATE Orders
    SET OrderType = "unknown"
    WHERE OrderType IS NULL

    These will be the same records that would be updated by your original query.

    Why are you leaving all those Nulls in the OrderType column in the
    OrderTypes table?

    HTH,
    Bob Barrows
    *You're goal is to update records in Orders where the OrderType is equal to
    the OrderType column in OrderTypes, isn't it? Why make your query go to the
    extra work of assembling ALL the records in Orders? If you need a join
    (which you don't), use an Inner Join.


    Bob Barrows Guest

  5. #5

    Default Re: How to optimise many updates on 1Mio rows?


    My goal is to replace the order type by "unknown" when the
    order type is not one of the types listed in the
    OrderTypes table.
    I find those invalid types by doing a left join. When
    OrderTypes.OrderType it means that there is no match for
    the type in Orders with any type in OrderTypes

    >-----Original Message-----
    >George wrote:
    >> Why are you saying that the join isn't necessary?
    >>
    >Let's look at it again:
    >UPDATE Orders
    >SET OrderType = "unknown"
    >FROM Orders
    >LEFT JOIN OrderTypes
    >ON Orders.OrderType = OrderTypes.OrderType
    >WHERE OrderTypes.OrderType IS NULL
    >
    >You are joining by OrderType. So, if OrderTypes.OrderType
    is null, then
    >Orders.OrderType must also be Null, correct? So why bring
    OrderTypes into
    >the picture? This can only slow things down, especially
    since you are using
    >a Left Join, which is also misguided.*
    >
    >Just update all the rows in Orders where Order.OrderType
    is Null:
    >
    >UPDATE Orders
    >SET OrderType = "unknown"
    >WHERE OrderType IS NULL
    >
    >These will be the same records that would be updated by
    your original query.
    >
    >Why are you leaving all those Nulls in the OrderType
    column in the
    >OrderTypes table?
    >
    >HTH,
    >Bob Barrows
    >*You're goal is to update records in Orders where the
    OrderType is equal to
    >the OrderType column in OrderTypes, isn't it? Why make
    your query go to the
    >extra work of assembling ALL the records in Orders? If
    you need a join
    >(which you don't), use an Inner Join.
    >
    >
    >.
    >
    George Guest

  6. #6

    Default Re: How to optimise many updates on 1Mio rows?

    Ohh - my mistake! You are trying to set bogus OrderTypes in Orders to
    'unknown' when they don't exist in OrderTypes! <blush>

    OK, instead of a Left Join, you may get better performance from a NOT EXISTS
    query:

    SET ROWCOUNT 20000
    WHILE ROWCOUNT > 0
    BEGIN
    SET
    UPDATE Orders
    SET OrderType='unknown'
    FROM Orders
    WHERE NOT EXISTS (
    Select ot.OrderType
    FROM OrderTypes ot
    WHERE ot.OrderType = Orders.OrderType)
    --truncate the transaction log
    END

    Bob Barrows

    George wrote:
    > Why are you saying that the join isn't necessary?
    >
    > Yes, I need to update all the rows each time. It's part of
    > a data import.
    >
    >> -----Original Message-----
    >>
    >> Perhaps a more realistic example might help as the join is not
    >> required the way you expressed it ?
    >> Could you run the job more frequently so that less would be updated
    >> or do you always need to update many rows each time regardless -
    >> i.e. like updating a value by 10% under certain conditions ?
    >>
    >> *** Sent via Developersdex [url]http://www.developersdex.com[/url] ***
    >> Don't just participate in USENET...get rewarded for it!
    >> .

    Bob Barrows Guest

  7. #7

    Default Re: How to optimise many updates on 1Mio rows?

    > Ohh - my mistake! You are trying to set bogus OrderTypes in Orders to
    > 'unknown' when they don't exist in OrderTypes! <blush>
    Maybe we should have a CHECK CONSTRAINT (CHECK OrderType IN (SELECT
    OrderType FROM OrderTypes) so that bogus data doesn't get in there in the
    first place...



    Aaron Bertrand - MVP Guest

  8. #8

    Default Re: How to optimise many updates on 1Mio rows?

    Aaron Bertrand - MVP wrote:
    >> Ohh - my mistake! You are trying to set bogus OrderTypes in Orders to
    >> 'unknown' when they don't exist in OrderTypes! <blush>
    >
    > Maybe we should have a CHECK CONSTRAINT (CHECK OrderType IN (SELECT
    > OrderType FROM OrderTypes) so that bogus data doesn't get in there in
    > the first place...
    I think he said it's a data import. I'm assuming this is a staging table
    where he validates the data before putting it into his actual tables ...

    Bob


    Bob Barrows Guest

  9. #9

    Default Re: How to optimise many updates on 1Mio rows?


    spot on!
    >-----Original Message-----
    >Aaron Bertrand - MVP wrote:
    >>> Ohh - my mistake! You are trying to set bogus
    OrderTypes in Orders to
    >>> 'unknown' when they don't exist in OrderTypes! <blush>
    >>
    >> Maybe we should have a CHECK CONSTRAINT (CHECK
    OrderType IN (SELECT
    >> OrderType FROM OrderTypes) so that bogus data doesn't
    get in there in
    >> the first place...
    >
    >I think he said it's a data import. I'm assuming this is
    a staging table
    >where he validates the data before putting it into his
    actual tables ...
    >
    >Bob
    >
    >
    >.
    >
    George Guest

Similar Threads

  1. How to optimise query?
    By YogeshM in forum Coldfusion Database Access
    Replies: 1
    Last Post: October 27th, 10:42 AM
  2. How to best optimise gradients with text?
    By Francesca NY webforumsuser@macromedia.com in forum Macromedia Fireworks
    Replies: 1
    Last Post: October 6th, 02:33 PM
  3. Optimise HD?
    By Peter KERR in forum Mac Applications & Software
    Replies: 5
    Last Post: July 7th, 03:40 PM
  4. Best practice to optimise big updates (repost, rewritten with example)
    By Bob Barrows in forum Microsoft SQL / MS SQL Server
    Replies: 4
    Last Post: July 2nd, 08:49 PM
  5. Select rows where other related rows don't exist
    By Dmand in forum Microsoft SQL / MS SQL Server
    Replies: 2
    Last Post: July 2nd, 06:23 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139