Professional Web Applications Themes

blobs or files using NDB cluster - MySQL

Hi, In develloping a *large* high-availability website that will be storing lots of both images/video and text, what would be the best way to go about storing the images and video? I can see at least two options: a) store the images as BLOBs in the database and let the server app retrieve them and serve them to the client. b) store the images as individual files, and put in only http links to the files into the database, so the client browser will fetch them, this would leave millions of files in the filesystem though (performance?). Both will be ...

  1. #1

    Default blobs or files using NDB cluster

    Hi,

    In develloping a *large* high-availability website that will be
    storing lots of both images/video and text, what would be the best way
    to go about storing the images and video?
    I can see at least two options:
    a) store the images as BLOBs in the database and let the server app
    retrieve them and serve them to the client.
    b) store the images as individual files, and put in only http links to
    the files into the database, so the client browser will fetch them,
    this would leave millions of files in the filesystem though
    (performance?).

    Both will be used in a NDB environment, and is meant to be scalable to
    be storing extreme amounts of images (TBs) but will start out smaller.
    I am looking mainly for high speed in serving the images and
    scalability of the whole system.

    How does yahoo, google, youtube, myspace etc etc do it??

    A secondary question: how do you store more data than can fit in a
    single node of an NDB cluster? i.e. if each node has a 1 TB drive, but
    you want to store 3 TBs?

    Thanks!
    Regards,
    Tobias

    nsa.usa@gmail.com Guest

  2. #2

    Default Re: blobs or files using NDB cluster

    com wrote: 

    Tobias,

    Some people will tell you to never store this information in a database.
    But before you believe them, I'd suggest you find out if they've even
    tried.

    I've been doing it for over 20 years (starting with doents using DB2
    on IBM mainframes back in the 80's). It works quite well; databases are
    well suited for quick retrieval of such information. Databases are
    quite efficient at selecting from hundreds of thousands (or even
    millions) of rows. Can you imagine having that many files in one directory?

    Other advantages include ensuring data consistency (if you're using
    INNODB database) - you can ensure you can't delete the blob while it's
    still being referenced. You can't do that if they're stored in files.
    Backup is much easier. You can move the database to a different server
    without changing anything except the host name and not have to worry
    about NFS, etc.

    There are a lot of advantages to doing it this way. And people who
    claim retrieval is too slow either haven't tried it or don't know how to
    tune their system.


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    net
    ==================
    Jerry Guest

  3. #3

    Default Re: blobs or files using NDB cluster

    On Apr 22, 10:44 pm, Jerry Stuckle <net> wrote: 





    >
    > Tobias,
    >
    > Some people will tell you to never store this information in a database.
    > But before you believe them, I'd suggest you find out if they've even
    > tried.
    >
    > I've been doing it for over 20 years (starting with doents using DB2
    > on IBM mainframes back in the 80's). It works quite well; databases are
    > well suited for quick retrieval of such information. Databases are
    > quite efficient at selecting from hundreds of thousands (or even
    > millions) of rows. Can you imagine having that many files in one directory?
    >
    > Other advantages include ensuring data consistency (if you're using
    > INNODB database) - you can ensure you can't delete the blob while it's
    > still being referenced. You can't do that if they're stored in files.
    > Backup is much easier. You can move the database to a different server
    > without changing anything except the host name and not have to worry
    > about NFS, etc.
    >
    > There are a lot of advantages to doing it this way. And people who
    > claim retrieval is too slow either haven't tried it or don't know how to
    > tune their system.
    >
    > --
    > ==================
    > Remove the "x" from my email address
    > Jerry Stuckle
    > JDS Computer Training Corp.
    > net
    > ==================[/ref]


    MySQL has (had?) a problem that caused the blob data to be loaded into
    memory even if the field was not being accessed.

    My solution is to have a simple table dedicated to the blobs (primary
    key and blob) and have the meta data in a seperate table with a
    foreign key to its blob data.

    You might also want to consider that you might need to resize/
    thumbnail your images.

    Tigger

    Tigger Guest

  4. #4

    Default Re: blobs or files using NDB cluster

    Tigger wrote: 
    >> Tobias,
    >>
    >> Some people will tell you to never store this information in a database.
    >> But before you believe them, I'd suggest you find out if they've even
    >> tried.
    >>
    >> I've been doing it for over 20 years (starting with doents using DB2
    >> on IBM mainframes back in the 80's). It works quite well; databases are
    >> well suited for quick retrieval of such information. Databases are
    >> quite efficient at selecting from hundreds of thousands (or even
    >> millions) of rows. Can you imagine having that many files in one directory?
    >>
    >> Other advantages include ensuring data consistency (if you're using
    >> INNODB database) - you can ensure you can't delete the blob while it's
    >> still being referenced. You can't do that if they're stored in files.
    >> Backup is much easier. You can move the database to a different server
    >> without changing anything except the host name and not have to worry
    >> about NFS, etc.
    >>
    >> There are a lot of advantages to doing it this way. And people who
    >> claim retrieval is too slow either haven't tried it or don't know how to
    >> tune their system.
    >>
    >> --
    >> ==================
    >> Remove the "x" from my email address
    >> Jerry Stuckle
    >> JDS Computer Training Corp.
    >> net
    >> ==================[/ref]
    >
    >
    > MySQL has (had?) a problem that caused the blob data to be loaded into
    > memory even if the field was not being accessed.
    >
    > My solution is to have a simple table dedicated to the blobs (primary
    > key and blob) and have the meta data in a seperate table with a
    > foreign key to its blob data.
    >
    > You might also want to consider that you might need to resize/
    > thumbnail your images.
    >
    > Tigger
    >[/ref]

    Tigger,

    Good point, and one I should have mentioned.

    Yes, you are correct. But it's not just MySQL. Every RDBM I've seen
    reads the entire row into memory before selecting the columns to be
    returned to the application. And even if they didn't, they would have
    to read data from the disk in the block size being used by the file system.

    Putting the blob in a separate table significantly increases speed when
    not accessing the BLOB, while slowing the system almost unnoticeably
    when accessing the BLOB.

    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    net
    ==================
    Jerry Guest

  5. #5

    Default Re: blobs or files using NDB cluster

    On Apr 22, 3:44 pm, Jerry Stuckle <net> wrote:
     

    Hi Jerry,

    Thanks for the info much appreciated. I like the idea of storing in
    the DB as it just seems neater, and millions of files in a FS needs
    some special considerations probably :-)
    I think I will try to do some testing with storing in the db and I'll
    post the results when I'm done.
    Regarding putting the images in its own table I would definently do
    this, I stumbled on 'vertical partitioning' and this seems to be it
    and it seems to be very necessary.
    Regarding my secondary question, if anybody's interrested I allready
    found the answer and its: horizontal partitioning, so a table can be
    split with some of the rows on a different FS.

    Regards,
    Tobias

    nsa.usa@gmail.com Guest

  6. #6

    Default Re: blobs or files using NDB cluster

    On Apr 23, 10:15 am, Tigger <tv> wrote:
     

    Hi Tigger,

    Thanks. I think it still has this 'problem' but I don't think its
    considered a problem as such as the way to do it properly apparently
    is how you describe. See also my answer to Jerry.

    What do you mean with the resizing? This makes me wonder if there is a
    performance penalty if the images have different filesizes? Should I
    put f.ex. all 1024x768 images in one table and all 640x480 images in a
    different table and so on? would this make any difference or no
    difference? hmm...

    Cheers,
    Tobias

    nsa.usa@gmail.com Guest

  7. #7

    Default Re: blobs or files using NDB cluster

    On Apr 24, 2:42 am, com wrote: 



    >
    > Hi Tigger,
    >
    > Thanks. I think it still has this 'problem' but I don't think its
    > considered a problem as such as the way to do it properly apparently
    > is how you describe. See also my answer to Jerry.
    >
    > What do you mean with the resizing? This makes me wonder if there is a
    > performance penalty if the images have different filesizes? Should I
    > put f.ex. all 1024x768 images in one table and all 640x480 images in a
    > different table and so on? would this make any difference or no
    > difference? hmm...
    >
    > Cheers,
    > Tobias[/ref]

    I've just implemented an image provider that uses images stored in a
    database. One issue we've always had which I tried to address with
    this implementation was that the client had to process and upload
    multiple versions of an images to cover each use. e.g. thumbnails,
    icons and other sizes. And then the designer would change the sizes on
    us!!!

    In this implementation I let them upload original unprocessed image
    and the system creates the variations for them. I did this by creating
    a data structure that allowed me to define "image styles" and to store
    versions of the image related to each style. A style would define
    dimensions and sizing rules.

    Not sure on the use of different tables for different sizes. If you
    have fixed, pre-defined sizes it might be worth doing that anyhow to
    reduce the number of rows on each table.

    How are you planning on feeding the images to the website?

    I'm using custom 404 redirect handling (IIS & asp) which streams data
    from a back end HttpHandler (.Net). This cannot be fast but I'm
    restricted to the user<->asp<->.Net<->database architecture!

    I could skip the asp part but its realy important that these images
    have urls from the correct domain and are friendly to search engines
    and administrators.

    Tigger Guest

Similar Threads

  1. Cluster nodes discover peers outside cluster domain
    By TRS-80 in forum Coldfusion Server Administration
    Replies: 2
    Last Post: August 13th, 09:33 PM
  2. Shared access to files in multiserver cluster?
    By Seajay in forum Coldfusion Server Administration
    Replies: 6
    Last Post: May 1st, 10:01 PM
  3. BLOBs in Oracle
    By pdrich in forum Coldfusion Database Access
    Replies: 1
    Last Post: February 22nd, 08:30 PM
  4. OLE DB and BLOBs
    By Robert Embleton in forum Informix
    Replies: 0
    Last Post: September 4th, 07:21 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139