Professional Web Applications Themes

Weird performance bottle-neck with large tables. - MySQL

I am running into something really weird with MySQL. Bare in mind that the following is not a real-world application but a test environment that I have set up which is meant to only imitate another real-world application. So, please don't respond with recommendations of how to do this same thing differently. :) I have created 10 tables each with 1 million to 10 million rows. Each table has 1 million more than the previous. Each table contains two 64-bit integers which together form the primary key and a data field which is of type TEXT. I then have a ...

  1. #1

    Default Weird performance bottle-neck with large tables.

    I am running into something really weird with MySQL. Bare in mind that
    the following is not a real-world application but a test environment
    that I have set up which is meant to only imitate another real-world
    application. So, please don't respond with recommendations of how to
    do this same thing differently. :)

    I have created 10 tables each with 1 million to 10 million rows. Each
    table has 1 million more than the previous. Each table contains two
    64-bit integers which together form the primary key and a data field
    which is of type TEXT. I then have a program that spawns 50
    connections to the first table. Each thread then sends an equal
    distribution of INSERTs, SELECTs and UPDATEs. I then monitor the
    average query time of each type of query. After finishing with the
    first table, I then move to the next table and so on until all tables
    have been tested.

    On my machine, I am seeing query times in the range of .6 ms/query.
    There is little change in the values until I reach the larger tables.
    Suddenly, at a point which changes each time, the query time jumps up
    in the range of 30 ms/query. Sometimes this occurs at the 7 million
    table, sometimes higher or lower. There are times when it never
    happens at all and the .6 ms/query timing continues even in the 10
    million row table. It never happens on the smaller tables though.

    I then setup a test case where once it popped, it re-ran the tests on
    the smaller tables. Once the query time has jumped to the 20-30 ms
    point, queries on the small tables also take 20-30 ms. So, it has to
    be something that is happening in mysql, not just the given table.
    Once I let it run for a while, eventually the query times drop down low
    again.

    When I run "SHOW PROCESSLIST" while the test is running, I see only a
    couple of queries running/waiting. Once the timing has popped, the
    process list shows lots of queries, most of which are waiting.
    Generally the waiting reason is either "Locked" or "System Lock".

    I am running 3.23.51 on Linux 2.4.32. The tables are MyISAM. There is
    plenty of memory. In fact, during the test my swap is completely
    unused. So it can't be a swapping issue. At least, it can't be a
    system swapping issue. It could very well be a swapping issue inside
    of mysql.

    If anyone could provide any light on this situation, I would greatly
    appreciate it. Thank you.

    jinxidoru@byu.net Guest

  2. #2

    Default Re: Weird performance bottle-neck with large tables.

    net wrote:
     


    That's a very old version of MySQL. It might not happen in 4.0, 4.1 or 5.0.

     


    That's old too, I haven't used a 2.4 kernel for a while.

     


    But do all the indexes of all the tables fit in the key buffer?

    I suggest you first check this, raise it if necessary, then if the problem
    persists, first upgrade MySQL, then your kernel.



    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Guest

  3. #3

    Default Re: Weird performance bottle-neck with large tables.

     

    Yeah, I'm in the process of trying the same test on 5.0
     

    I agree, but because of certain project requirements, this is not
    changeable.
     

    I've tried increasing the key buffer, and it had no noticeable effect.
    Also, if that were the case then I probably wouldn't see the timing
    increase on the smaller tables when I returned to them.

    jinxidoru Guest

  4. #4

    Default Re: Weird performance bottle-neck with large tables.

    jinxidoru wrote:
     
    >
    > Yeah, I'm in the process of trying the same test on 5.0

    >
    > I agree, but because of certain project requirements, this is not
    > changeable.

    >
    > I've tried increasing the key buffer, and it had no noticeable effect.
    > Also, if that were the case then I probably wouldn't see the timing
    > increase on the smaller tables when I returned to them.[/ref]


    Yes you would as the key buffer would contain the indexes of the larger
    tables as the smaller tables' indexes would have been pushed out of the
    buffer.


    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Guest

  5. #5

    Default Re: Weird performance bottle-neck with large tables.

    > Yes you would as the key buffer would contain the indexes of the larger 

    But, once a couple queries have been run on the smaller tables,
    wouldn't the index for the smaller tables be put back into the cache.
    I could see this being a problem when we're talking about just a few
    individual queries, but my test case involves thousands of queries.

    jinxidoru Guest

  6. #6

    Default Re: Weird performance bottle-neck with large tables.

    jinxidoru wrote:
     
    >
    > But, once a couple queries have been run on the smaller tables,
    > wouldn't the index for the smaller tables be put back into the cache.
    > I could see this being a problem when we're talking about just a few
    > individual queries, but my test case involves thousands of queries.[/ref]


    As I understand it, only the index blocks that are required for the queries
    you execute would get cached in the key buffer, so if you keep querying
    different parts of the table you might not get any buffer hits.

    With MySQL 5.0 you can use 'LOAD INDEX INTO
    CACHE' (http://dev.mysql.com/doc/refman/5.0/en/load-index.html) to make
    sure all the index is cached beforehand, but of course you still need a key
    buffer big enough to fit them all in.


    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Guest

  7. #7

    Default Re: Weird performance bottle-neck with large tables.

    net wrote:
     

    So this is kind of "Performance vs. Data Set Size" test. How do you
    access the rows? Random access, uniform distribution? This will not
    simulate real workload then. Real data has hotspots and generally a
    nonuniform access pattern.
     

    Looks like some cache gets too small. Since you are using MyISAM, there
    is only the key_buffer inside MySQL. If you access the data files as
    well (that is: fields not in indexes), those accesses will be cached by
    the operating system. The fluctuation could be explained by the random
    access pattern of your test. But linear progression is not very useful.
    You should better increase your data set sizes in geometric
    progression. Common are powers of two.
     

    No. It's just how caches work. The key_buffer is not instantly flushed
    and refilled, once you go back to a smaller table. Same goes for caches
    outside MySQL. Some caches may also be dirty and need to get flushed
    first.
     

    Right. As soon as the caches are "hot" again, the performance
    increases.
     

    Locking comes from your operation mix. MyISAM tables can be read or
    written mutually exclusive. So locking happens all the time. Though
    you cannot see it unless there is a backlog of queries in the server.
    As long as the server can answer your questions faster than your
    testing environment asks new ones, everything will be fine.

    If you want to understand the phenomenon, you have to closely monitor
    both MySQL and your operating system. I.e. the efficiency of the
    key_buffer can be estimated by looking at the key_read_requests and
    key_reads counters from SHOW STATUS.

    With BLOBs in MyISAM tables you will experience table fragmentation.
    Some UPDATEs will create new records, leaving a hole where the old
    record was. Eventually some holes will be filled later, but overall
    the table size will increase and make your caches less usefull and
    performance drop.


    XL
    --
    Axel Schwenke, Senior Software Developer, MySQL AB

    Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
    MySQL User Forums: http://forums.mysql.com/
    Axel Guest

Similar Threads

  1. Poor performance with large datasets
    By Luther Miller in forum ASP.NET Web Services
    Replies: 4
    Last Post: June 20th, 06:00 AM
  2. Split large tables
    By Daniel Onnen in forum MySQL
    Replies: 1
    Last Post: March 8th, 08:10 PM
  3. Performance with linked tables
    By Eddie Simons in forum ASP Database
    Replies: 1
    Last Post: September 25th, 10:46 AM
  4. Joining Tables from different DBs => Do I have to care about performance?
    By Matti in forum Microsoft SQL / MS SQL Server
    Replies: 2
    Last Post: July 9th, 01:53 PM
  5. need a cheap JOIN on 3 large tables under Win32
    By JT in forum PERL Miscellaneous
    Replies: 11
    Last Post: July 1st, 11:24 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139