Professional Web Applications Themes

database design and query problem - MySQL

Hi, I'm looking at creating a database which will contain over 1.2 million telephone numbers, however after putting them all in one table the time it takes to retrieve particular numbers takes quite a while (approx 5.2 secs per request), so i tried splitting up the data into 8 tables (named by prefix of the number), this has worked out much quicker as only the tables where the user has a number is being searched. if anyone has any suggestions on a better design please let me know. This leads me onto the main problem, how can I select from ...

  1. #1

    Default database design and query problem

    Hi,

    I'm looking at creating a database which will contain over 1.2 million
    telephone numbers, however after putting them all in one table the time
    it takes to retrieve particular numbers takes quite a while (approx 5.2
    secs per request), so i tried splitting up the data into 8 tables (named
    by prefix of the number), this has worked out much quicker as only the
    tables where the user has a number is being searched. if anyone has any
    suggestions on a better design please let me know.

    This leads me onto the main problem, how can I select from multiple
    tables but get all the results in 1 query. I am looking to do this so I
    can let MySQL process the LIMIT and the OFFSET rather than let PHP do so.

    I've tried this but it doesnt seem to work

    SELECT
    *
    FROM
    `0870`,
    `0800`,
    `0702`
    WHERE
    `0870`.`owner` = '0'
    LIMIT
    30


    Many thanks
    Adnan
    Adnan Barakat Guest

  2. #2

    Default Re: database design and query problem

    Adnan Barakat wrote:
    > Hi,
    >
    > I'm looking at creating a database which will contain over 1.2 million
    > telephone numbers, however after putting them all in one table the time
    > it takes to retrieve particular numbers takes quite a while (approx 5.2
    > secs per request), so i tried splitting up the data into 8 tables (named
    > by prefix of the number), this has worked out much quicker as only the
    > tables where the user has a number is being searched. if anyone has any
    > suggestions on a better design please let me know.
    >
    > This leads me onto the main problem, how can I select from multiple
    > tables but get all the results in 1 query. I am looking to do this so I
    > can let MySQL process the LIMIT and the OFFSET rather than let PHP do so.
    >
    > I've tried this but it doesnt seem to work
    >
    > SELECT
    > *
    > FROM
    > `0870`,
    > `0800`,
    > `0702`
    > WHERE
    > `0870`.`owner` = '0'
    > LIMIT
    > 30
    >
    >
    > Many thanks
    > Adnan
    I have a similar table with over 5 million records and the answer
    is always immediate.

    Is your table indexed?

    Can you post the structure of your table?

    SHOW CREATE TABLE your_table_name\G

    ciao
    gmax

    --
    _ _ _ _
    (_|| | |(_|>< The Data Charmer
    _|
    [url]http://datacharmer.blogspot.com/[/url]
    Giuseppe Maxia Guest

  3. #3

    Default Re: database design and query problem

    thanks for your quick response gmax,

    as you can see the table isn't indexed, which I take it isn't a good
    idea, how would I go about indexing it (as in which columns would i need
    to other than `uid`)

    +----------+------------------------------------------+
    | Table | Create Table |
    +----------+------------------------------------------+
    | numbers | CREATE TABLE `numbers` (
    `uid` varchar(11) NOT NULL default '',
    `band` smallint(2) NOT NULL default '0',
    `owner` int(11) NOT NULL default '0',
    `status` smallint(2) NOT NULL default '0',
    `dest` varchar(250) NOT NULL default '',
    PRIMARY KEY (`uid`)
    ) TYPE=MyISAM |
    +----------+------------------------------------------+


    many thanks

    Adnan




    Giuseppe Maxia wrote:
    > Adnan Barakat wrote:
    >
    >> Hi,
    >>
    >> I'm looking at creating a database which will contain over 1.2 million
    >> telephone numbers, however after putting them all in one table the
    >> time it takes to retrieve particular numbers takes quite a while
    >> (approx 5.2 secs per request), so i tried splitting up the data into 8
    >> tables (named by prefix of the number), this has worked out much
    >> quicker as only the tables where the user has a number is being
    >> searched. if anyone has any suggestions on a better design please let
    >> me know.
    >>
    >> This leads me onto the main problem, how can I select from multiple
    >> tables but get all the results in 1 query. I am looking to do this so
    >> I can let MySQL process the LIMIT and the OFFSET rather than let PHP
    >> do so.
    >>
    >> I've tried this but it doesnt seem to work
    >>
    >> SELECT
    >> *
    >> FROM
    >> `0870`,
    >> `0800`,
    >> `0702`
    >> WHERE
    >> `0870`.`owner` = '0'
    >> LIMIT
    >> 30
    >>
    >>
    >> Many thanks
    >> Adnan
    >
    >
    > I have a similar table with over 5 million records and the answer
    > is always immediate.
    >
    > Is your table indexed?
    >
    > Can you post the structure of your table?
    >
    > SHOW CREATE TABLE your_table_name\G
    >
    > ciao
    > gmax
    >
    Adnan Barakat Guest

  4. #4

    Default Re: database design and query problem

    Adnan Barakat wrote:
    > thanks for your quick response gmax,
    >
    > as you can see the table isn't indexed, which I take it isn't a good
    > idea, how would I go about indexing it (as in which columns would i need
    > to other than `uid`)
    >
    > +----------+------------------------------------------+
    > | Table | Create Table |
    > +----------+------------------------------------------+
    > | numbers | CREATE TABLE `numbers` (
    > `uid` varchar(11) NOT NULL default '',
    > `band` smallint(2) NOT NULL default '0',
    > `owner` int(11) NOT NULL default '0',
    > `status` smallint(2) NOT NULL default '0',
    > `dest` varchar(250) NOT NULL default '',
    > PRIMARY KEY (`uid`)
    > ) TYPE=MyISAM |
    > +----------+------------------------------------------+
    >
    >
    > many thanks
    >
    > Adnan
    >
    I can't see form the names the meaning of each field.
    Assuming that your phone number is in `owner`, then you should do

    ALTER TABLE phones ADD UNIQUE KEY (owner);

    After that, if you run a query with
    WHERE owner = somenumber

    it will return a result immediately.

    Depending on your most frequent queries, you could also add
    a compound index. For example, if you search for status, band, and dest,
    then you could add

    ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);

    However, it depends on what you are asking for, and how often you are issuing such queries.

    Try adding a KEY on the phone number field (if numbers are supposed to be without
    duplicates, add a UNIQUE KEY) and see if the situation improves.

    ciao
    gmax

    P.S. You should answer AFTER the quoted message, not before.

    --
    _ _ _ _
    (_|| | |(_|>< The Data Charmer
    _|
    [url]http://datacharmer.blogspot.com/[/url]
    Giuseppe Maxia Guest

  5. #5

    Default Re: database design and query problem

    Please keep responding before the quote, So I won't have to scroll all
    the way down to see the answer.

    In fact, the splitting of the tables is a bit what an index does, but in
    a far better fashion. By default, a textual index is 4 characters long,
    meaning that the 4 first characters can be used in a binary search. If
    all numbers start with the same 4 characters, this does not make much
    sense. You can therefore supply a key length for the index as well. So
    if, say, 80% of your phone numbers start with '0870', the index would
    not be very effective and you could gain a lot more speed by setting the
    index length to a larger value. Supposing that 'uid' contains the phone
    number (it really pays off to choose descriptive names), you could try
    to enlarge the key to 6 characters with:

    ALTER TABLE phones DROP PRIMARY KEY;
    ALTER TABLE phones ADD PRIMARY KEY(uid(6));

    (I could not find a modification command for an index)

    Best regards

    Giuseppe Maxia wrote:
    > Adnan Barakat wrote:
    >
    >> thanks for your quick response gmax,
    >>
    >> as you can see the table isn't indexed, which I take it isn't a good
    >> idea, how would I go about indexing it (as in which columns would i
    >> need to other than `uid`)
    >>
    >> +----------+------------------------------------------+
    >> | Table | Create Table |
    >> +----------+------------------------------------------+
    >> | numbers | CREATE TABLE `numbers` (
    >> `uid` varchar(11) NOT NULL default '',
    >> `band` smallint(2) NOT NULL default '0',
    >> `owner` int(11) NOT NULL default '0',
    >> `status` smallint(2) NOT NULL default '0',
    >> `dest` varchar(250) NOT NULL default '',
    >> PRIMARY KEY (`uid`)
    >> ) TYPE=MyISAM |
    >> +----------+------------------------------------------+
    >>
    >>
    >> many thanks
    >>
    >> Adnan
    >>
    >
    > I can't see form the names the meaning of each field.
    > Assuming that your phone number is in `owner`, then you should do
    >
    > ALTER TABLE phones ADD UNIQUE KEY (owner);
    >
    > After that, if you run a query with
    > WHERE owner = somenumber
    >
    > it will return a result immediately.
    >
    > Depending on your most frequent queries, you could also add
    > a compound index. For example, if you search for status, band, and dest,
    > then you could add
    >
    > ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);
    >
    > However, it depends on what you are asking for, and how often you are
    > issuing such queries.
    >
    > Try adding a KEY on the phone number field (if numbers are supposed to
    > be without
    > duplicates, add a UNIQUE KEY) and see if the situation improves.
    >
    > ciao
    > gmax
    >
    > P.S. You should answer AFTER the quoted message, not before.
    >
    Dikkie Dik Guest

  6. #6

    Default Re: database design and query problem

    Dikkie Dik wrote:
    > Please keep responding before the quote, So I won't have to scroll all
    > the way down to see the answer.
    >
    Please don't top post. The standards for this group are to respond
    inline (like this) or after the quoted message.
    > In fact, the splitting of the tables is a bit what an index does, but in
    > a far better fashion. By default, a textual index is 4 characters long,
    > meaning that the 4 first characters can be used in a binary search. If
    > all numbers start with the same 4 characters, this does not make much
    > sense. You can therefore supply a key length for the index as well. So
    > if, say, 80% of your phone numbers start with '0870', the index would
    > not be very effective and you could gain a lot more speed by setting the
    > index length to a larger value. Supposing that 'uid' contains the phone
    > number (it really pays off to choose descriptive names), you could try
    > to enlarge the key to 6 characters with:
    >
    Actually, an index does not split the table at all - any more than the
    index to a book splits the book. An index only contains the key and a
    pointer to the record.

    As for the first 4 characters of the text search - where did you find
    this information? I haven't found anything like this in the MySQL doc,
    and my experience has been that the entire key is in the table.

    Additionally, the key would not be able to enforce uniqueness if it
    didn't contain the entire key.
    > ALTER TABLE phones DROP PRIMARY KEY;
    > ALTER TABLE phones ADD PRIMARY KEY(uid(6));
    >
    > (I could not find a modification command for an index)
    >
    Yes, but the doc indicates this can be used to SHORTEN the amount of
    space used by the index. There is no place I can find where it states
    only the first 4 characters of a text field are used for searches.
    > Best regards
    >

    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    [email]jstucklexattglobal.net[/email]
    ==================
    Jerry Stuckle Guest

  7. #7

    Default Re: database design and query problem

    Dikkie Dik wrote:
    > Please keep responding before the quote, So I won't have to scroll all
    > the way down to see the answer.
    >
    In newsgroups it is customary to post AFTER the quote.

    To quote an old joke:

    A: Top posting!
    Q: What is the most irritating thing on Usenet?

    [url]http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.4[/url]
    > In fact, the splitting of the tables is a bit what an index does, but in
    > a far better fashion.
    ???
    Splitting a table is against normalization, and it should be done only if
    there are really compelling reasons for it. (Like creating a data warehouse)
    It might be more efficient, but it goes against relational principles,
    thus making queries much more difficult.
    > By default, a textual index is 4 characters long,
    I can't find any proof of this. Care to show where did you get that info?
    A far as I know, a textual index is as long as the column to which it refers,
    unless a limit is explicitly given.
    > meaning that the 4 first characters can be used in a binary search. If
    > all numbers start with the same 4 characters, this does not make much
    > sense. You can therefore supply a key length for the index as well. So
    > if, say, 80% of your phone numbers start with '0870', the index would
    > not be very effective and you could gain a lot more speed by setting the
    > index length to a larger value.
    If you need to find an exact telephone number, 4 characters are not enough. However,
    if your phone numbers have an area code, it may be a good idea to split the
    column into two, and to create a compound index.
    > Supposing that 'uid' contains the phone
    > number (it really pays off to choose descriptive names), you could try
    > to enlarge the key to 6 characters with:
    >
    > ALTER TABLE phones DROP PRIMARY KEY;
    > ALTER TABLE phones ADD PRIMARY KEY(uid(6));
    >
    > (I could not find a modification command for an index)
    >
    All in one command:
    ALTER TABLE phones DROP KEY old_index, ADD KEY new_index.

    [SNIP]

    ciao
    gmax

    --
    _ _ _ _
    (_|| | |(_|>< The Data Charmer
    _|
    [url]http://datacharmer.blogspot.com/[/url]
    Giuseppe Maxia Guest

  8. #8

    Default Re: database design and query problem

    Giuseppe Maxia wrote:
    > Adnan Barakat wrote:
    >
    >> thanks for your quick response gmax,
    >>
    >> as you can see the table isn't indexed, which I take it isn't a good
    >> idea, how would I go about indexing it (as in which columns would i
    >> need to other than `uid`)
    >>
    >> +----------+------------------------------------------+
    >> | Table | Create Table |
    >> +----------+------------------------------------------+
    >> | numbers | CREATE TABLE `numbers` (
    >> `uid` varchar(11) NOT NULL default '',
    >> `band` smallint(2) NOT NULL default '0',
    >> `owner` int(11) NOT NULL default '0',
    >> `status` smallint(2) NOT NULL default '0',
    >> `dest` varchar(250) NOT NULL default '',
    >> PRIMARY KEY (`uid`)
    >> ) TYPE=MyISAM |
    >> +----------+------------------------------------------+
    >>
    >>
    >> many thanks
    >>
    >> Adnan
    >>
    >
    > I can't see form the names the meaning of each field.
    > Assuming that your phone number is in `owner`, then you should do
    >
    > ALTER TABLE phones ADD UNIQUE KEY (owner);
    >
    > After that, if you run a query with
    > WHERE owner = somenumber
    >
    > it will return a result immediately.
    >
    > Depending on your most frequent queries, you could also add
    > a compound index. For example, if you search for status, band, and dest,
    > then you could add
    >
    > ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);
    >
    > However, it depends on what you are asking for, and how often you are
    > issuing such queries.
    >
    > Try adding a KEY on the phone number field (if numbers are supposed to
    > be without
    > duplicates, add a UNIQUE KEY) and see if the situation improves.
    >
    > ciao
    > gmax
    >
    > P.S. You should answer AFTER the quoted message, not before.
    >
    hi gmax,

    `uid` is the actual telephone number as there cant be any duplicates (it
    is varchar because the number starts with a zero so int didn't work). I
    will change the name of the column to `number` to make it more obvious!
    `owner` is the uid of the user who owns the number.
    `band` is the type of number (a number which relates to the value of the
    number).
    `status` is the type of service provided to that number.
    `dest` is the destination of the number.

    I have altered the table and the time has been reduced a quite a bit.

    I ran ALTER TABLE numbers ADD UNIQUE KEY (uid);

    many thanks
    Adnan
    Adnan Barakat Guest

  9. #9

    Default Re: database design and query problem

    While there are a number of good reasons to split data into separate
    tables, the reason you mention above isn't one of them. Splitting a
    table to speed indexes might be useful if you plan to query a member
    table directly (so query a subset - such as the numbers that were
    entered in a particular quarter of the year - which are all stored in a
    member table), but not if you are always accessing the parent.

    Indexes (by default) are built on the entire field - if the field is 4
    characters long, then the index will index all 4 characters. If the
    field is 32 characters, the index will index all 32 characters.

    MySQL uses a B-TREE method for creating the index. I don't think
    you'll see much of a performance difference if you split the area code
    into a separate field and use a compound index. The only time that
    would be useful is if you sometimes search for a phone number *without*
    an area code - in which case you could order the compound index with
    the 'number' and then the 'area code'. Of course, that means that the
    'area code' indexed fields wouldn't be accessible directly - but the
    cardinality of an area code is probably low anyways...

    It is sometimes useful to perform partial indexing (in MySQL this is
    done by specifying the length to index), if the cardinality of the
    partially indexed field is close enough to the cardinality of the
    entire field, or if you have other business requirements (such as
    always perform matches of a subpart of the field) that require this.
    You can count the cardinality of a partial index using a select unique
    combined with a substring statement.

    I think the previous answer is the optimal one, use a UNIQUE index on
    the whole field, or create a PRIMARY KEY index (if nulls are allowed
    you'll have to use UNIQUE).

    P.S. - Posting through Google Groups here , which I don't think even
    includes the original message in the response ...

    --
    Chander Ganesan
    Open Technology Group, Inc.
    Phone: 877-258-8987/919-463-0999
    [url]http://www.otg-nc.com[/url]
    Expert MySQL training

    Chander Ganesan Guest

  10. #10

    Default Re: database design and query problem

    Chander Ganesan wrote:
    > While there are a number of good reasons to split data into separate
    > tables, the reason you mention above isn't one of them. Splitting a
    > table to speed indexes might be useful if you plan to query a member
    > table directly (so query a subset - such as the numbers that were
    > entered in a particular quarter of the year - which are all stored in a
    > member table), but not if you are always accessing the parent.
    >
    > Indexes (by default) are built on the entire field - if the field is 4
    > characters long, then the index will index all 4 characters. If the
    > field is 32 characters, the index will index all 32 characters.
    >
    > MySQL uses a B-TREE method for creating the index. I don't think
    > you'll see much of a performance difference if you split the area code
    > into a separate field and use a compound index. The only time that
    > would be useful is if you sometimes search for a phone number *without*
    > an area code - in which case you could order the compound index with
    > the 'number' and then the 'area code'. Of course, that means that the
    > 'area code' indexed fields wouldn't be accessible directly - but the
    > cardinality of an area code is probably low anyways...
    >
    > It is sometimes useful to perform partial indexing (in MySQL this is
    > done by specifying the length to index), if the cardinality of the
    > partially indexed field is close enough to the cardinality of the
    > entire field, or if you have other business requirements (such as
    > always perform matches of a subpart of the field) that require this.
    > You can count the cardinality of a partial index using a select unique
    > combined with a substring statement.
    >
    > I think the previous answer is the optimal one, use a UNIQUE index on
    > the whole field, or create a PRIMARY KEY index (if nulls are allowed
    > you'll have to use UNIQUE).
    >
    > P.S. - Posting through Google Groups here , which I don't think even
    > includes the original message in the response ...
    >
    > --
    > Chander Ganesan
    > Open Technology Group, Inc.
    > Phone: 877-258-8987/919-463-0999
    > [url]http://www.otg-nc.com[/url]
    > Expert MySQL training
    >
    Hi Chander,

    Thank you very much for your detailed response, it has cleared things up
    for me :D

    Adnan
    Adnan Barakat Guest

Similar Threads

  1. Design query: Space Before or After?
    By Greg_Gaspard@adobeforums.com in forum Adobe Indesign Macintosh
    Replies: 13
    Last Post: October 2nd, 03:39 AM
  2. Newbie 'structure' design query
    By Kenny in forum MySQL
    Replies: 12
    Last Post: August 18th, 04:25 PM
  3. db query/design problem
    By kabbi~thkek in forum Coldfusion Database Access
    Replies: 0
    Last Post: September 30th, 11:50 AM
  4. Database Design
    By Mohan in forum Oracle Server
    Replies: 10
    Last Post: July 15th, 10:05 PM
  5. Database Design Problem
    By Aaron in forum Microsoft SQL / MS SQL Server
    Replies: 2
    Last Post: July 8th, 12:36 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139