
96 Billion records on MySQL/Linux - MySQL


  1. #1

    Default 96 Billion records on MySQL/Linux

    Hello all,

    I have rather a large database I have to create, each record would be
    around 230 bytes but I need to hold 4 months of data online for
    queries. Each day I may receive around 80 million records.

    Can Linux running on a P4 with 1GB+ RAM and a good RAID system handle
    such a task?

    I will have a few indexes (2 or 3) and will be doing queries based on
    them (date, phone number).

    What kind of results can I expect? Can this work? Where are the
    pitfalls?

    Thanks in advance....Q

    please reply to the thread and/or to my email at Q<at>STARLINX.COM

    Q Guest

  2. #2

    Default Re: 96 Billion records on MySQL/Linux

    Correction: 9.6 billion, not 96 billion.

    Q wrote: 

    Q Guest

  3. #3

    Default Re: 96 Billion records on MySQL/Linux

    Q wrote: 

    Reading approximately 1 million records from a mysqldump took me _some_
    minutes today, so you might need 100 to 1,600 minutes for your 80 million
    records. But: the -e option of mysqldump creates rather fast SQL
    statements with very little overhead. Your input may be more complicated,
    and you may lose time on other tasks as well.
    Why not create test data, e.g. import the US telephone book?
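
    A minimal sketch of what a fast bulk import could look like, assuming a
    MyISAM table named cdr and one CSV file per day (the table, column and
    file names are made up for illustration); disabling the keys during the
    load and rebuilding them afterwards is usually much faster than updating
    the index row by row:

        -- hypothetical table and file names, for illustration only
        ALTER TABLE cdr DISABLE KEYS;

        LOAD DATA INFILE '/data/cdr_2006_01_15.csv'
            INTO TABLE cdr
            FIELDS TERMINATED BY ','
            LINES TERMINATED BY '\n';

        ALTER TABLE cdr ENABLE KEYS;  -- rebuilds the non-unique indexes in one pass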
     

    I don't believe a P4 could also handle queries while importing this
    large amount of data.

    Get (much) better equipment.

    Werner
    werner Guest

  4. #4

    Default Re: 96 Billion records on MySQL/Linux

    "Q" <com> wrote:
    ^^^
    I'd like to know whom I'm talking to. Please add a name here.
     

    Then let's do the basic math:

    230 bytes * 4 * 30 * 80.000.000 = 2.208.000.000.000 byte =~ 2TB
     

    2TB disk space will be manageable. I suggest two external disk boxes
    with fast SCSI or FC links to the host. Then RAID-10 with submirrors
    on different boxes.
     

    DATETIME is 8 bytes, a phone number will be a CHAR(16). The record
    pointer is 6 bytes. That's 30 bytes per row. A combined index
    (date, phone) will be 30 bytes * 4 * 30 * 80.000.000 =~ 290GB big.

    This is much more than your available RAM. Don't expect this to be
    fast. If we assume that index compression works well, you may do
    with half the size. Still too big.
     

    It will work. It will be quite slow, especially for tasks that
    require access to the whole index (e.g. data import).

    Pitfalls: make sure you use constant row length. Use as few
    indexes as possible. Buy RAM!
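
    A minimal sketch of a table along those lines (fixed-width columns only,
    no VARCHAR/TEXT/BLOB, and a single combined index); the table name, the
    extra columns and the index name are assumptions for illustration:

        -- hypothetical schema; only fixed-width types so MyISAM keeps a
        -- constant row length
        CREATE TABLE cdr (
            call_date  DATETIME NOT NULL,           -- 8 bytes
            phone      CHAR(16) NOT NULL,           -- fixed 16 bytes
            switch_id  SMALLINT UNSIGNED NOT NULL,
            duration   INT UNSIGNED NOT NULL,
            INDEX idx_date_phone (call_date, phone)
        ) ENGINE=MyISAM ROW_FORMAT=FIXED;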

    Options: your data seems to be partitionable along the time axis.
    You could try MERGE tables across one-month tables. Or you wait
    for 5.1 and use partitions.
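
    For the MERGE route, a rough sketch assuming one MyISAM table per month
    with the structure sketched above (all names are illustrative):

        -- identical per-month tables
        CREATE TABLE cdr_2006_01 LIKE cdr;
        CREATE TABLE cdr_2006_02 LIKE cdr;

        -- a MERGE table that presents the monthly tables as one
        CREATE TABLE cdr_all (
            call_date  DATETIME NOT NULL,
            phone      CHAR(16) NOT NULL,
            switch_id  SMALLINT UNSIGNED NOT NULL,
            duration   INT UNSIGNED NOT NULL,
            INDEX idx_date_phone (call_date, phone)
        ) ENGINE=MERGE UNION=(cdr_2006_01, cdr_2006_02) INSERT_METHOD=LAST;

    Expiring a month then only means changing the UNION list (and dropping
    the old table) instead of deleting 2.4 billion rows.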


    HTH, XL
    --
    Axel Schwenke, Senior Software Developer, MySQL AB

    Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
    MySQL User Forums: http://forums.mysql.com/
    Axel Guest

  5. #5

    Default Re: 96 Billion records on MySQL/Linux

    Hello,

    My Name is Q Fard.

    Thanks for the reply.

    Actually the data really is that large: 80 million records per day, and
    the customer wants to keep it online for 4 months. These are phone records.

    I am running on SuSE Linux and I believe I can upgrade to 8GB of RAM...
    I am sure it will be slow, but can I get it to search a date + phone
    combination within a reasonable time?

    I can also put the 10 million records from each phone company switch
    into its own database.

    Would any of this help?

    Thanks again...Q




    Axel Schwenke wrote: 

    Q Guest

  6. #6

    Default Re: 96 Billion records on MySQL/Linux

    "Q" <com> wrote:
     

    80 million rows per day. That's what you said. That's what adds up to
    2TB of total row data and roughly 290GB for a combined (timestamp, number)
    index.
     

    It depends on what you call reasonable time. Finding one row via the
    index would take only a few seeks. A MyISAM index page is 4K. With a 75%
    fill factor and 30 bytes per entry you get a fanout of about 100. To find
    one out of 9.600.000.000 rows you have to read log_100(9.6E9) =~ 5 index
    pages and the data record itself. That will be 6 disk seeks =~ 60ms. If
    some of the pages are already in memory, it will be even faster. That's
    the reason why memory is so important: with enough memory you can hold
    index pages in memory and will get fast results.
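
    As a concrete illustration (table, column and index names as assumed in
    the sketch earlier in the thread), this is the kind of point lookup such
    an index serves; prefix it with EXPLAIN to verify the index is used:

        -- single point lookup served by the combined (call_date, phone) index
        SELECT *
          FROM cdr
         WHERE call_date = '2006-01-15 08:30:00'
           AND phone = '15551234567';

        EXPLAIN SELECT * FROM cdr
         WHERE call_date = '2006-01-15 08:30:00' AND phone = '15551234567';

    One caveat: if the typical query is an exact phone number over a date
    range, an index ordered (phone, call_date) would let MySQL narrow on both
    columns, whereas with (call_date, phone) a range on the date keeps the
    phone column from narrowing the scan further.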

    Sufficient memory is even more important for importing data. Refreshing
    (or creating) an index needs to access many (or all) index pages.
     

    Sure. The typical strategy to handle big datasets is partitioning.
    If you don't do cross-company queries you can safely put each company's
    data into its own database. Preferably on its own hardware.

    Again: MySQL 5.1 will come with "real" partitioning [1]. At the moment
    you can use the MERGE [2] feature to access multiple MyISAM tables as
    a whole. Both will clearly help with bundling resources.
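
    For reference, a rough sketch of what the 5.1 syntax looks like (monthly
    RANGE partitions; the table, column and partition names are illustrative):

        CREATE TABLE cdr (
            call_date  DATETIME NOT NULL,
            phone      CHAR(16) NOT NULL,
            switch_id  SMALLINT UNSIGNED NOT NULL,
            duration   INT UNSIGNED NOT NULL,
            INDEX idx_date_phone (call_date, phone)
        ) ENGINE=MyISAM
        PARTITION BY RANGE (TO_DAYS(call_date)) (
            PARTITION p2006_01 VALUES LESS THAN (TO_DAYS('2006-02-01')),
            PARTITION p2006_02 VALUES LESS THAN (TO_DAYS('2006-03-01')),
            PARTITION p2006_03 VALUES LESS THAN (TO_DAYS('2006-04-01')),
            PARTITION pmax     VALUES LESS THAN MAXVALUE
        );

    Expiring old data then becomes ALTER TABLE cdr DROP PARTITION p2006_01
    instead of a huge DELETE.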

    PS: please do not top post [3].


    [1] http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
    [2] http://dev.mysql.com/doc/refman/5.0/en/merge-storage-engine.html
    [3] http://www.cs.tut.fi/~jkorpela/usenet/brox.html


    XL
    --
    Axel Schwenke, Senior Software Developer, MySQL AB

    Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
    MySQL User Forums: http://forums.mysql.com/
    Axel Guest
