Professional Web Applications Themes

joins in mysql - MySQL

Hello, I've read numerous articles on the web advocating avoiding joins in MySQL whenever possible for performance reasons. I was curious as to just how slow a simple equi-join would be, so I ran some tests using the benchmark code from this site: [url]http://www.developerfusion.co.uk/show/2058/[/url] I made 2 tables and filled each with 3 million records and then performed some joins and got these results with MySQL 4.1.12: CREATE TABLE table1 ( id int(10) unsigned NOT NULL auto_increment, field1 char(50) default NULL, field2 char(50) default NULL, PRIMARY KEY (id) ) CREATE TABLE table2 ( id int(10) unsigned NOT NULL auto_increment, field1 ...

  1. #1

    Default joins in mysql

    Hello,

    I've read numerous articles on the web advocating avoiding joins in
    MySQL whenever possible for performance reasons. I was curious as to
    just how slow a simple equi-join would be, so I ran some tests using the
    benchmark code from this site:

    [url]http://www.developerfusion.co.uk/show/2058/[/url]

    I made 2 tables and filled each with 3 million records and then
    performed some joins and got these results with MySQL 4.1.12:

    CREATE TABLE table1 (
    id int(10) unsigned NOT NULL auto_increment,
    field1 char(50) default NULL,
    field2 char(50) default NULL,
    PRIMARY KEY (id)
    )

    CREATE TABLE table2 (
    id int(10) unsigned NOT NULL auto_increment,
    field1 char(50) default NULL,
    field2 char(50) default NULL,
    PRIMARY KEY (id)
    )

    ******************************

    SELECT *
    FROM table1 a, table2 b
    WHERE a.id = b.id AND a.id = 1

    OUTPUT:
    num rows: 1
    This page was created in 0.00327396392822 seconds

    ******************************

    SELECT *
    FROM table1 a, table2 b
    WHERE a.id = b.id AND a.id >= 1 AND a.id <= 100

    OUTPUT:
    num rows: 100
    This page was created in 0.00424695014954 seconds

    ******************************

    SELECT *
    FROM table1 a, table2 b
    WHERE a.id = b.id AND a.id >= 1 AND a.id <= 10000

    OUTPUT:
    num rows: 10000
    This page was created in 0.116240024567 seconds

    ******************************

    SELECT *
    FROM table1 a, table2 b
    WHERE a.id = b.id AND a.id >= 1 AND a.id <= 100000

    OUTPUT:
    num rows: 100000
    This page was created in 1.99258494377 seconds

    ******************************

    SELECT *
    FROM table1 a, table2 b
    WHERE a.id = b.id AND a.id >= 1 AND a.id <= 1000000

    OUTPUT:
    num rows: 1000000
    This page was created in 19.5196549892 seconds

    ******************************

    That being said, is there any real merit to the idea of avoiding joins
    whenever possible? 3 million rows per table seems like a fairly decent
    size to me, and returning 10,000 rows from that join takes just a bit
    over 1/10 of a second. Could someone please let me know if I am
    oversimplifying all of this and am missing something important, because
    otherwise these numbers seem very reasonable to me. Thanks in advance.
    Marcus Guest

  2. #2

    Default Re: joins in mysql

    Marcus wrote:
    > I've read numerous articles on the web advocating avoiding joins in
    > MySQL whenever possible for performance reasons.
    That's nonsense.

    By the same reasoning, we should avoid "while" loops in conventional
    programming languages -- again, for performance reasons. I mean, they
    _must_ be bad for performance! They run the same lines of code over and
    over again! ;-)

    It's true that we should avoid _unnecessary_ joins in SQL, just as we
    should avoid unnecessary code in other languages. There are smart ways
    to write code, and there are naive ways to write code. You know, like
    moving loop-invariant code outside the loop, etc.

    Likewise, it's smart to use join expressions that can be optimized and
    that can use indexes, and it makes joins even between huge tables return
    results very quickly.

    The articles you read must be making a more subtle point. No one
    writing about SQL should make such baseless, general statements.

    Regards,
    Bill K.
    Bill Karwin Guest

  3. #3

    Default Re: joins in mysql

    Marcus wrote:
    > Hello,
    >
    > I've read numerous articles on the web advocating avoiding joins in
    > MySQL whenever possible for performance reasons. I was curious as to
    > just how slow a simple equi-join would be, so I ran some tests using the
    > benchmark code from this site:
    >
    > [url]http://www.developerfusion.co.uk/show/2058/[/url]
    >
    First of all, don't believe everything you read on the internet! There
    are a lot of people who have no real knowledge of what's going on
    espousing all kinds of things.

    To answer your question, this goes back to basics of DB theory and
    normalization. There are 5 levels of normalization (in theory).

    As you go to higher normalization levels, you get higher diversity and
    less duplication of data. But at the same time it means you will
    typically have more tables, requiring more joins. This means that
    performance will often times be slower, requiring more CPU cycles (of
    course, there are exceptions to this - but those are typically rare).

    For instance, in first normal form you may have everything in one table,
    requiring no joins. However, you would also have a massive duplication
    of data. Take a bank, for instance. If a customer has a savings
    account and a checking account, the customer's name, address, etc. would
    have to be saved in each account's record.

    Going to second normal form, you would have two tables - a customer
    table, an account table. The account table would contain the customer's
    id, requiring a single join.

    But what if you have a joint account (two customers)? This doesn't work
    as well. Yes, you could stay in second normal form and have multiple
    entries in the account table for customer ids, but this has it's limits.
    (There are other ways also, but they each have their advantages and
    disadvantages)

    Going to third normal form adds a third table containing a customer id
    and an account id, linking the other two tables. This would require two
    joins to get customer information for an account (or vice versa), making
    it slower but more flexible.

    The bottom line is - proper database design is not always easy and has
    to consider the design of the entire database as well as the access
    required by the program(s) using it. You can't say joins by themselves
    are good or bad. They need to be taken in the context of the entire system.

    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    [email]jstucklexattglobal.net[/email]
    ==================
    Jerry Stuckle Guest

Similar Threads

  1. Replies: 4
    Last Post: July 30th, 02:02 AM
  2. JOINs instead of AND
    By Jim Michaels in forum MySQL
    Replies: 1
    Last Post: March 11th, 02:26 AM
  3. SQL help on joins
    By zCrow in forum Dreamweaver AppDev
    Replies: 3
    Last Post: May 10th, 09:17 PM
  4. multiple inner joins
    By Anderson11983 in forum Coldfusion Database Access
    Replies: 3
    Last Post: May 4th, 08:06 AM
  5. PHP and MySQL Table Joins
    By Ralph Freshour in forum PHP Development
    Replies: 3
    Last Post: September 1st, 09:54 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139