interfilesystem copies: large du diffs - Linux / Unix Administration

  1. #1

    Default interfilesystem copies: large du diffs

    I recently rsync'd around 2.8TB between a RHEL server (JFS filesystem) and a
    NetApp system. Did a 'du -sk' against each to verify the transfers:

    2894932960 KB   source total
    2751664496 KB   destination total

    That's a 140GB discrepancy. Subsequent verbose rsyncs have turned up
    nothing that was not originally transferred.

    I often note similar behaviour with smaller transfers between servers
    with similar OS/fs combos, and have always seen it to some extent with
    transfers between systems of any type. It's just that the usual
    discrepancies in this case are magnified greatly by the sheer volume of
    data. Needless to say, 140GB going missing would be a bit of a problem,
    and it's not much fun picking through 2.8TB for MIA data.

    Can anyone shed some light on why this happens?

    tia

    orgone Guest

  2. #2

    Default Re: interfilesystem copies: large du diffs

    orgone wrote: 

    "df" uses actual blocks allocated. "du" takes the
    file size and concludes that all blocks are allocated.
     

    My best guess is the NetApp somehow handles sparsely allocated
    files differently, so that "du" sees the blocks actually
    allocated, not just the file size implied by the address of the
    last byte.
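
    A quick way to see the sparse-file effect in isolation (a sketch; the
    path is hypothetical and needs a filesystem with sparse-file support):

    # Write one byte at offset 1GB-1, creating a ~1GB sparse file:
    dd if=/dev/zero of=/tmp/sparse.test bs=1 count=1 seek=1073741823
    # ls reports the ~1GB apparent size; du reports only the blocks
    # actually allocated (a few KB):
    ls -l /tmp/sparse.test
    du -k /tmp/sparse.test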

    Alternate theory that is far less likely: On your source tree
    you have a history of making hundreds of thousands of files
    and then deleting nearly all of them, leaving a lot of very
    large directories. On your target tree the directories are
    much smaller.

    Yet another alternate theory: smaller block/fragment/extent
    size on the target. So on the source any file has a fairly
    large minimum block count, but on the target smaller files
    take fewer blocks. You would need very many small files to
    account for a ~5% difference, but a few hundred thousand files
    under 512 bytes could contribute to this.
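
    A rough way to test this theory against the source tree (a sketch; the
    path is hypothetical and a GNU or POSIX find is assumed):

    # Count regular files smaller than 512 bytes; roughly speaking, it would
    # take millions of such files, not thousands, for allocation rounding
    # alone to explain a gap of ~140GB:
    find /path/to/source -type f -size -512c | wc -l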

    Doug Guest

  3. #3

    Default Re: interfilesystem copies: large du diffs

    In article <googlegroups.com>,
    orgone <com> wrote: 

    First, this is only a 5% difference. I could easily imagine the
    difference being much larger:

    The du command (and the underlying st_blocks field in the result of
    the stat() system call) reports the amount of space used. But:
    - A filesystem uses space not only for the data component (the bytes
    that are stored in the files), but also for overhead: directories,
    per-file overhead like inodes and indirect blocks, and more, often
    referred to as metadata. How efficiently this overhead is stored
    varies considerably by filesystem. And whether this overhead is
    reported as part of the answer from du also varies. In some extreme
    cases (filesystems that separate their data and metadata physically)
    this overhead is not reported at all. The ratio of metadata to data
    varies considerably by file system type and by file/directory size,
    but for many small files 5% is not out of line.
    - The amount of space allocated to a file typically has some
    granularity, which is often 4KB or 16KB (historically, it has ranged
    from 128 bytes for the CP/M filesystem to 256KB for some filesystems
    used in high-performance computing). This means the size of the
    file is rounded up to this granularity, which can make a huge
    difference if your files are typically small. Say your files are all
    2KB, and you store them on a file system with a 512B allocation
    granularity and on another one with 16KB allocation granularity:
    you'll get a result from du that differs by a factor of 8!
    - Are any of your files sparse? I think every commercial filesystem
    that's in mass production today supports sparse files, but exactly
    how can vary widely. What is the granularity of holes in the file?
    What is the metadata overhead for holes (in extent-based filesystems
    this can make a significant difference if implemented carelessly)?
    Also, it is quite possible (maybe even likely) that your rsync
    copying turned sparse files into contiguous files. Given that your
    total space usage shrank instead of increased, this doesn't seem
    likely to be the main effect here. (See the sketch after this list
    for a way to compare apparent size against allocated size.)
    - On the netapp, did you have snapshots turned on? If yes, does the
    result from du include the snapshot space?
    - It isn't even completely clear what the result from du is supposed
    to be. The real disk usage? The size of the file rounded to
    kilobytes? Here is a suggestion to stir the pot: Assume you have a
    1MB file stored on a RAID-1 (mirrored) disk array. I think du
    should report the space usage as 2MB, because you are actually
    storing two copies of the file (you are using 2MB worth of disks).
    If you now migrate the file to a compressing filesystem that is not
    mirrored, du should report the space usage as 415KB, if that's how
    much disk space it really uses. No filesystem today would report
    those values; they would all report something pretty close to 1MB.
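
    One way to separate "bytes stored in the files" from "blocks allocated"
    on each side (a sketch; GNU coreutils du is assumed, since
    --apparent-size is a GNU extension, and the paths are hypothetical):

    # Allocated blocks, as du normally reports them:
    du -sk /path/to/tree
    # Apparent (logical) file sizes, ignoring holes and allocation rounding:
    du -sk --apparent-size /path/to/tree
    # If the apparent-size totals match between source and target but the
    # allocated totals differ, the data is intact and only the allocation
    # (sparseness, block size, metadata accounting) differs.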

    For you, this is my suggestion: instead of looking only at the total,
    make a complete list of the disk usage for each file. An easy way to
    do this from the command line: make two listings of space usage, one
    each for source and destination, merge the lists, and look at the
    differences. Here is a quick attempt at a script which does this
    (just typed in, so you may have to debug it a little; it also assumes
    you don't have spaces in file names, and if you do you'll have to do
    a lot of quoting and null-terminating):
    cd $SOURCE
    find . -type f | xargs du -k | sort -k 2 > /tmp/source.du
    cd $TARGET
    find . -type f | xargs du -k | sort -k 2 > /tmp/target.du
    cd /tmp
    # Join on the filename field; output lines are "name source-KB target-KB".
    join -j 2 source.du target.du > both.du
    # Print each name with (target - source) and sort by that difference.
    awk '{print $1, $3 - $2}' < both.du | sort -k 2n > diff.du
    In the end, you'll have a listing of the difference in space usage in
    diff.du, sorted (I hope, I can never remember whether the -n switch to
    sort works correctly for negative numbers). Then pick a few examples
    of files that have large differences, or see whether you can make out
    a trend (maybe most files have a small difference). Then spot-check a
    few files, to make sure they were copied correctly.
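
    As a concrete spot check (a sketch; the file path is hypothetical and it
    assumes both trees are visible from the same host, e.g. with the NetApp
    volume mounted over NFS):

    # Byte-for-byte comparison, then allocated size on each side:
    cmp $SOURCE/some/suspect/file $TARGET/some/suspect/file
    du -k $SOURCE/some/suspect/file $TARGET/some/suspect/file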

    You can also use "join -j 2 -v 1 source.du target.du" to find files
    that were not copied, and the same with "-v 2" to find files that
    showed up in the copy uninvited.
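
    Concretely, with the same sorted listings as above (the output file
    names are just examples):

    join -j 2 -v 1 source.du target.du > missing-on-target.du
    join -j 2 -v 2 source.du target.du > extra-on-target.du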

    Now changing gears: speaking as a file system implementor (and somewhat
    of an expert), I wish that the du command and the underlying
    information returned by the stat() system call would go away. On one
    hand, they are just too crude and don't begin to describe the
    complexity of space usage in a modern (complex) filesystem. On the
    other hand, they don't give the answers that a system
    administrator (or an automated administration tool) really needs. As
    we saw above, for a 1GB file, the correct answer for space usage might
    be any of the following (all the numbers are made up):
    - 1GB worth of bytes
    - 1GB is the file size, but it is sparse, so it only uses 876MB.
    - 1GB worth of bytes on the data disk, plus 7.4MB of metadata on the
    metadata disk.
    - 2GB worth of bytes, because of RAID 1.
    - 437MB worth of bytes, because of compression.
    - 0.456GB on datadisk_123, 1.234GB on datadisk_456, and 2.345GB on
    datadisk_789, plus 7.4MB on metadisk_abc and 3.7MB on metadisk_def.
    - 5.678GB on disk, because of RAID 1, asynchronous remote copy (still
    0.3GB worth of copying to be done, currently held in NVRAM), and
    fourteen snapshot copies, all slightly different, not to mention
    that the remote copy is compressed, and this figure includes
    metadata overhead on the metadata disks.
    - 4.567GB on expensive SCSI disks (at $3/GB plus $0.50/year/GB), and
    1.234GB on cheap SATA disks (at $1/GB plus $0.25/year/GB).
    As you see, returning one number is woefully inadequate.

    We need to ask ourselves: What is the purpose of the space usage
    information? It is not to verify that the file system has correctly
    stored the data (for that it is too crude), it is to enable
    administrating the file system, so it needs to give the information a
    system administrator might care about.

    If I had my way (fortunately, nobody ever listens to me), I would
    remove the du command and completely remove all notions of space usage
    from the user-mode application API, and put all space usage
    information into a file system management interface. There, questions
    like the following need to be answered:

    - How much space is user fred using (or files used by the wombat
    project, or files stored on storage device foobar)?
    - Has fred's usage increased recently?
    - How expensive is the storage used by fred? Original purchase, lease
    payments, yearly provisioning and administration cost?
    - Are the wombat project's requirements for data availability being
    met, or could I improve them by allocating more space to it and
    storing more redundant copies of their data?
    - If I move the wombat project to the netapp, and then use the free
    space on the cluster filesystem to put fred's files on, would that
    save me money or increase speed or availability?
    - Is the netapp still a cost-effective device, given that we just
    started using the fancy new foobar device from Irish Baloney
    Machines with the new cluster filesystem from Hockey-Puckered?

    (If it isn't clear, all mentions of the word "netapp" and oblique
    references to large computer companies are meant as humor, and are
    intended to neither praise nor denigrate my current, former or future
    employers).

    --
    Ralph Becker-Szendy Guest

  4. #4

    Default Re: interfilesystem copies: large du diffs

    On 24 Aug 2005 02:08:46 -0700, orgone said something similar to:
    : I recently rsync'd around 2.8TB between a RHEL server (JFS filesystem) and a
    : NetApp system. Did a 'du -sk' against each to verify the transfers:
    :
    : 2894932960 KB   source total
    : 2751664496 KB   destination total
    :
    : That's a 140GB discrepancy. Subsequent verbose rsyncs have turned up
    : nothing that was not originally transferred.

    What are the native block sizes of the two filesystems? If you've got
    a large enough number of files and directories there, a smaller block size
    on the destination could account for the discrepancy in terms of less unused
    space at the end of the last block of each file.

    Another thing that I've seen cause discrepancies like this on occasion is
    when the source directories once had many more files in them than they
    currently do. Once more blocks have been allocated to a directory, they
    don't get deallocated when the number of files drops.
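
    Both are easy to check (a sketch; GNU stat and find are assumed, and the
    mount points are hypothetical):

    # "Block size:" in the output is the filesystem's allocation unit:
    stat -f /path/to/source/mount
    stat -f /path/to/target/mount
    # Directories that have grown well past a single block on the source,
    # possible leftovers from trees that once held many more files:
    find /path/to/source -type d -size +64k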

    Mike Guest

  5. #5

    Default Re: interfilesystem copies: large du diffs

    orgone wrote: 

    Rsync has a "-c" option that compares files by checksum; I imagine that
    would give you some reassurance that the transfer occurred correctly.
    There is also the "-v" verbose option, as you noted.

    To be certain I'd consider checksumming all the files on each system
    (e.g. something like find mydirectory -exec sum {} \; > sysname.sums)
    and use diff to compare the results. If really paranoid I'd use md5sum
    instead of sum. I imagine this will take considerable time on 2.8TB so
    I'd try it on small subsets first :-)
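
    A minimal sketch of that approach (paths and host names are hypothetical;
    GNU md5sum is assumed, and the listings are sorted by name so they can
    be diffed directly):

    # Run on each system against its copy of the tree:
    cd /path/to/tree
    find . -type f -exec md5sum {} + | sort -k 2 > /tmp/$(hostname).md5
    # Then bring the two .md5 files onto one machine and compare:
    diff /tmp/hostA.md5 /tmp/hostB.md5
    # A checksum dry run with rsync should also flag differing files
    # without copying anything:
    rsync -avnc /path/to/source/ target-host:/path/to/target/
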
    Ian Guest
