DBI problem : How can I load quickly one huge table with DBI ??.

Ask a Question related to PERL Miscellaneous, Design and Development.

  1. #1

    Default Re: DBI problem : How can I load quickly one huge table with DBI??.

    Vincent Le-Texier <texier@ebi.ac.uk> writes:
    > If the query is :
    >
    > my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
    > my $s_trans = $dbh->prepare($str_trans);
    >
    > Foreach entries I want to load into the transcript table , I do :
    > $s_trans->bind_param(1,$id);
    > $s_trans->bind_param(2,$embl_acc);
    >
    > and $s_trans->execute;
    OK. At least you're not preparing it as well as executing it every time :)
    > I would like to known, if there are objects and methods with perl DBI to
    > load by block of statements (execute one statement every 3000 entries for
    > example) instead of each entry.
    Two thoughts:

    a) do you have any indexes or primary keys on the table while you're
    inserting all this stuff? You don't want the hold-up of maintaining the
    index every time, so only create such things after all the data's in place;

    b) sure you can batch things up, with transactions. Turn off auto_commit
    behaviour e.g. like:

    | $dbh=DBI->connect("dbi:[stuff]", "", "", {AutoCommit => 0})

    and then in your main loop, maintain a counter of number of rows and every
    few hundred, do a commit. Fill in the blanks in the following:

    | $rowcount=0;
    | $sth=$dbh->prepare (some_insert_statement);
    |
    | while (looping_over_input_values) {
    | $rc=$sth->bind_param(1, something);
    | $rc=$sth->bind_param(2, somethingelse);
    | $rc=$sth->execute;
    |
    | if(!($rowcount%100)) {
    | $handle->commit;
    | }
    | $rowcount++;
    | }
    | $handle->commit; # don't forget this after they're all in

    HTH,

    ~Tim
    --
    Product Development Consultant
    OpenLink Software
    Tel: +44 (0) 20 8681 7701
    Web: <http://www.openlinksw.com>
    Universal Data Access & Data Integration Technology Providers
    Tim Haynes Guest

  2. Similar Questions and Discussions

    1. Huge problem.
      My problem is that every time I go to a website with a video or try to watch a movie trailer my browser closes. I've been trying to locate a problem...
    2. loading only few rows from huge table
      Hi, I have a table with 100K rows. One of columns is a timestamp and indicates when this row inserted. What will the the best way of getting 10...
    3. Could not load type VTFixup Table from assembly Invalid token in v-table fix-up table.
      We are getting this error after clearing the web.config of database infomation - even after using the wizard to re-enter the information. I could...
    4. How put an image in a table quickly ?
      Hi, I need to put an image (bmp or jpg) into a SQL Server database, from a stored procedure called from a C++ program. In my c++ program I use...
    5. How to deal with huge table?
      I have a user table which is to store user's demographics. Every time a user signs up, one row will be created. When this table grows (say, 50M...
  3. #2

    Default DBI problem : How can I load quickly one huge table with DBI ??.

    Hi all,

    I'm using DBI from perl to load Oracle databse (dbi:Oracle).

    my program works but very slow.

    The fact is :

    If the query is :

    my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
    my $s_trans = $dbh->prepare($str_trans);

    Foreach entries I want to load into the transcript table , I do :
    $s_trans->bind_param(1,$id);
    $s_trans->bind_param(2,$embl_acc);

    and $s_trans->execute;


    that means, for each entry I've executed the statement $str_trans.

    THIS IS VERY SLOW (if you have for example 1 million entries to load).


    I would like to known, if there are objects and methods with perl DBI
    to load by block of statements (execute one statement every 3000 entries
    for example) instead of each entry.

    Others ideas are also welcome.

    Thanks for your advices,

    Vincent.


    Vincent Le-Texier Guest

  4. #3

    Default Re: DBI problem : How can I load quickly one huge table with DBI??.

    Vincent Le-Texier wrote:

    Hello Vincent,
    > Hi all,
    >
    > I'm using DBI from perl to load Oracle databse (dbi:Oracle).
    >
    > my program works but very slow.
    >
    > The fact is :
    >
    > If the query is :
    >
    > my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
    > my $s_trans = $dbh->prepare($str_trans);
    >
    > Foreach entries I want to load into the transcript table , I do :
    > $s_trans->bind_param(1,$id);
    > $s_trans->bind_param(2,$embl_acc);
    >
    you don't have do to that.

    $s_trans->execute ($id,$embl_acc)

    works as well.
    > and $s_trans->execute;
    >
    >
    > that means, for each entry I've executed the statement $str_trans.
    >
    > THIS IS VERY SLOW (if you have for example 1 million entries to load).
    >
    >
    > I would like to known, if there are objects and methods with perl DBI
    > to load by block of statements (execute one statement every 3000 entries
    > for example) instead of each entry.
    You should set
    $dbi->{AutoCommit} = 0;

    so that statements are only committed if you explicitly do this. Of
    course, you'll have to add a
    $dbi->commit()
    then, either at the end of your code (which is probably not a good idea
    for 1 million entries) or you commit every 10,000 rows or so.

    However, I once wrote the whole NCBI Unigene data (including accession
    numbers) into two tables and, surprisingly, it was actually quite fast.

    >
    > Others ideas are also welcome.
    I think, SQL Loader might be faster than Perl. But I never used it. :-)
    >
    > Thanks for your advices,
    >
    > Vincent.
    >
    Cheers,
    Dominik

    Dominik Seelow Guest

  5. #4

    Default Re: DBI problem : How can I load quickly one huge table with DBI??.

    Since this is Oracle, you have a couple of options:

    1. Use SQL*Loader. You will **never, ever** be able to write anything
    that will match the speed. Because of the volume of data, this willbe
    your best option. make sure you are using the direct method (1).
    2. Use direct inserts with the "/*+ append */" hint (1)
    3. Rewrite this using the Oracle::OCI module. I am not sure, but you
    may be able to use array processing with this module. If you can use
    array processing, it will be orders of magnitude faster than single inserts.

    (1) Note: The direct method will leave your indexes in an invalid
    state. When the load is completed, they will need to be rebuilt. Also,
    any triggers you have on these tables will not fire with this method.

    In addition to Tim's suggestions, you should be connecting using the
    BEQUEATH SQL*Net connection protocol. Network traffic could be an issue
    here.

    Last, you should look into using the Oracle events system (talk with
    your DBA, or look at the Oracle docs). This will tell you what is
    slowing you code down. No matter what you are doing, if you are doing
    Oracle development, you need to know how to use this tool.

    Tim Haynes wrote:
    > Vincent Le-Texier <texier@ebi.ac.uk> writes:
    >
    >
    >>If the query is :
    >>
    >>my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
    >>my $s_trans = $dbh->prepare($str_trans);
    >>
    >>Foreach entries I want to load into the transcript table , I do :
    >>$s_trans->bind_param(1,$id);
    >>$s_trans->bind_param(2,$embl_acc);
    >>
    >>and $s_trans->execute;
    >
    >
    > OK. At least you're not preparing it as well as executing it every time :)
    >
    >
    >>I would like to known, if there are objects and methods with perl DBI to
    >>load by block of statements (execute one statement every 3000 entries for
    >>example) instead of each entry.
    >
    >
    > Two thoughts:
    >
    > a) do you have any indexes or primary keys on the table while you're
    > inserting all this stuff? You don't want the hold-up of maintaining the
    > index every time, so only create such things after all the data's in place;
    >
    > b) sure you can batch things up, with transactions. Turn off auto_commit
    > behaviour e.g. like:
    >
    > | $dbh=DBI->connect("dbi:[stuff]", "", "", {AutoCommit => 0})
    >
    > and then in your main loop, maintain a counter of number of rows and every
    > few hundred, do a commit. Fill in the blanks in the following:
    >
    > | $rowcount=0;
    > | $sth=$dbh->prepare (some_insert_statement);
    > |
    > | while (looping_over_input_values) {
    > | $rc=$sth->bind_param(1, something);
    > | $rc=$sth->bind_param(2, somethingelse);
    > | $rc=$sth->execute;
    > |
    > | if(!($rowcount%100)) {
    > | $handle->commit;
    > | }
    > | $rowcount++;
    > | }
    > | $handle->commit; # don't forget this after they're all in
    >
    > HTH,
    >
    > ~Tim

    --
    Ron Reidy
    Oracle DBA

    Ron Reidy Guest

Posting Permissions

  • You may not post new threads
  • You may post replies
  • You may not post attachments
  • You may not edit your posts

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139