Professional Web Applications Themes

Multiple tables for same data - how to sort - MySQL

Hi We are developing an application that stores logging data in a mysql database. Due to some facts, we decided to store the loggingdata in a seperate table for each day. ie we have table Log_20060101, log_20060102 etc. They all have ofcourse the same structure.\ This is the structure: Id, integer Time, char(26) Source, char(10) Destination, char(10) Data, char(20) The ID field is unique for all dates, as it is calculated from the current time. The front-end application, that displays the data, intially shows the data, sorted by Time, descending. Thus it starts with the latest entry and wandes through ...

  1. #1

    Default Multiple tables for same data - how to sort

    Hi

    We are developing an application that stores logging data in a mysql
    database.
    Due to some facts, we decided to store the loggingdata in a seperate table
    for each day. ie we have table Log_20060101, log_20060102 etc. They all have
    ofcourse the same structure.\
    This is the structure:
    Id, integer
    Time, char(26)
    Source, char(10)
    Destination, char(10)
    Data, char(20)

    The ID field is unique for all dates, as it is calculated from the current
    time.

    The front-end application, that displays the data, intially shows the data,
    sorted by Time, descending. Thus it starts with the latest entry and wandes
    through today's table until the screen is full, and if the screen is not
    full yet, it goes to yesterday's table etc.

    The Source and Destination fields are spread randomly through all the
    tables, so if the user chooses to sort the display based on Source or
    Destination, the application basically has to browse throug *all* the tables
    and then start with the lowest (or highest, whatever direction the sort is
    in) Source field. So it could well be that the first Source is in table
    Log_20060101 and the next in Log_20060606 and then again in Log_20060101
    etc.
    (As in
    SELECT * FROM <All tables>
    ORDER BY Source
    LIMIT 40;
    )

    Even more troublesome is it when the user slides the scrollbar-slider down,
    so the app has to find n fields, starting from the m-th.
    (As in
    SELECT * FROM <All tables>
    ORDER BY Source
    LIMIT 40
    OFFSET 500000;
    )

    The datasize is dependent on the user's specific situation, but typically
    30.000 records per day.
    That would be > 10.000.000 per year

    Can anyone point me in the right direction as to how this search can be done
    the fastest way.

    Kind regards, Hans


    Hans Guest

  2. #2

    Default Re: Multiple tables for same data - how to sort

    >We are developing an application that stores logging data in a mysql
    >database.
    >Due to some facts, we decided to store the loggingdata in a seperate table
    >for each day. ie we have table Log_20060101, log_20060102 etc. They all have
    >ofcourse the same structure.\
    This is almost always a mistake. Your questions illustrate why.
    You're usually better putting it all in one table, with the variable
    table name (or the variable portions of it) put in an added column
    if that info isn't there already.

    In your example, do you even have ANY queries that will be restricted
    to a single table (known in advance BEFORE you start counting
    how many entries you retrieved)?
    >This is the structure:
    >Id, integer
    >Time, char(26)
    >Source, char(10)
    >Destination, char(10)
    >Data, char(20)
    >
    >The ID field is unique for all dates, as it is calculated from the current
    >time.
    >
    >The front-end application, that displays the data, intially shows the data,
    >sorted by Time, descending. Thus it starts with the latest entry and wandes
    >through today's table until the screen is full, and if the screen is not
    >full yet, it goes to yesterday's table etc.
    >
    >The Source and Destination fields are spread randomly through all the
    >tables, so if the user chooses to sort the display based on Source or
    >Destination, the application basically has to browse throug *all* the tables
    >and then start with the lowest (or highest, whatever direction the sort is
    >in) Source field. So it could well be that the first Source is in table
    >Log_20060101 and the next in Log_20060606 and then again in Log_20060101
    >etc.
    >(As in
    >SELECT * FROM <All tables>
    >ORDER BY Source
    >LIMIT 40;
    >)
    >
    >Even more troublesome is it when the user slides the scrollbar-slider down,
    >so the app has to find n fields, starting from the m-th.
    >(As in
    >SELECT * FROM <All tables>
    >ORDER BY Source
    >LIMIT 40
    >OFFSET 500000;
    >)
    >
    >The datasize is dependent on the user's specific situation, but typically
    >30.000 records per day.
    >That would be > 10.000.000 per year
    >
    >Can anyone point me in the right direction as to how this search can be done
    >the fastest way.
    Use a single table. Have a date-time field and index it.

    Gordon L. Burditt
    Gordon Burditt Guest

  3. #3

    Default Re: Multiple tables for same data - how to sort


    "Gordon Burditt" <gordonb.lihe9burditt.org> wrote in message
    news:128b4j9a5tmjp38corp.supernews.com...
    > >We are developing an application that stores logging data in a mysql
    >>database.
    >>Due to some facts, we decided to store the loggingdata in a seperate table
    >>for each day. ie we have table Log_20060101, log_20060102 etc. They all
    >>have
    >>ofcourse the same structure.\
    >
    > This is almost always a mistake. Your questions illustrate why.
    > You're usually better putting it all in one table, with the variable
    > table name (or the variable portions of it) put in an added column
    > if that info isn't there already.
    We did that, we indexed the Date and Time fields, and also the Source and
    Destination fields. It appeared that, when the table had like 10.000.000
    rows and the user
    wanted something like
    SELECT *
    FROM Table
    LIMIT 40
    OFFSET 5000000

    (as a result of repositioning the scrollbar slider), this takes too long to
    have an
    acceptable response time (over 30 seconds), this is simply unacceptable.
    Besides that, when the user applies a filter to show only records where
    Source = 'SomeSource', then to find the number of matched records (to set
    the scrollbar's parameters) takes too long and also the query to find the
    records just takes too long to be convenient for the user.
    (Obviously we have the proper indexes applied and tried various combinations
    of indexes; there's no gain possible here imho)

    That's why we chose this approach. While the application starts, all tables
    are queried with

    SELECT Source, Destination, COUNT(*) FROM <Table>
    GROUP BY Source, Destination

    to find all possible combinations and their respective counts, and store
    this in an std::vector, so that at least the counts (for the Scrollbar
    parameters) are found with an almost unnoticable delay.

    Most of the functionalty is alread implemented, and it is working like a
    charm, must faster than with all data in one table.

    An additional advantage is the support for our backup/archiving system, we
    choose ( a long time ago, and that's not gonna change) to create one
    backup-file every day at midnight, with all changes of the last 24 hours.
    With the one-table-per-day solution, we just tarball the table-files instead
    of using mysqldump.

    The only problem is to find the records when the user chooses to sort on the
    'Source' or 'Destination' field.
    It is not for sure yet that this feature will be implemented, it will only
    be implemented if any of you guys help me find a proper solution, with an
    acceptable response time.
    >
    > In your example, do you even have ANY queries that will be restricted
    > to a single table (known in advance BEFORE you start counting
    > how many entries you retrieved)?
    Yes, there are. Normally, the application 'follows' today's table, (as in
    tail -f ) with a selectable refresh timeout (30 secs up to 1 hour, or
    disabled)
    >
    >>This is the structure:
    >>Id, integer
    >>Time, char(26)
    >>Source, char(10)
    >>Destination, char(10)
    >>Data, char(20)
    >>
    >>The ID field is unique for all dates, as it is calculated from the current
    >>time.
    >>
    >>The front-end application, that displays the data, intially shows the
    >>data,
    >>sorted by Time, descending. Thus it starts with the latest entry and
    >>wandes
    >>through today's table until the screen is full, and if the screen is not
    >>full yet, it goes to yesterday's table etc.
    >>
    >>The Source and Destination fields are spread randomly through all the
    >>tables, so if the user chooses to sort the display based on Source or
    >>Destination, the application basically has to browse throug *all* the
    >>tables
    >>and then start with the lowest (or highest, whatever direction the sort is
    >>in) Source field. So it could well be that the first Source is in table
    >>Log_20060101 and the next in Log_20060606 and then again in Log_20060101
    >>etc.
    >>(As in
    >>SELECT * FROM <All tables>
    >>ORDER BY Source
    >>LIMIT 40;
    >>)
    >>
    >>Even more troublesome is it when the user slides the scrollbar-slider
    >>down,
    >>so the app has to find n fields, starting from the m-th.
    >>(As in
    >>SELECT * FROM <All tables>
    >>ORDER BY Source
    >>LIMIT 40
    >>OFFSET 500000;
    >>)
    >>
    >>The datasize is dependent on the user's specific situation, but typically
    >>30.000 records per day.
    >>That would be > 10.000.000 per year
    >>
    >>Can anyone point me in the right direction as to how this search can be
    >>done
    >>the fastest way.
    >
    > Use a single table. Have a date-time field and index it.
    been there, did that, got the t-shirt, but this is not the proper solution.
    Thanks for the advise though.
    >
    > Gordon L. Burditt


    Hans Guest

Similar Threads

  1. update multiple records in multiple tables from one form
    By Anj01 in forum Coldfusion Database Access
    Replies: 1
    Last Post: May 17th, 05:54 AM
  2. DB design...multiple tables for ancillary data or just one?
    By darrel in forum Dreamweaver AppDev
    Replies: 0
    Last Post: March 21st, 05:24 PM
  3. Can one <form> submit data to multiple tables?
    By jip in forum Dreamweaver AppDev
    Replies: 3
    Last Post: March 3rd, 09:11 PM
  4. Replies: 1
    Last Post: September 9th, 07:16 AM
  5. Creating tables and merging them for multiple data
    By Agin in forum Microsoft SQL / MS SQL Server
    Replies: 1
    Last Post: July 2nd, 03:39 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139