altivec, no apparent speedup - Mac Programming

  1. #1

    altivec, no apparent speedup

    I'm unsure why I'm getting no decent speed up. I'm just toying around
    with it right now, not working on an application. I'm iterating over
    vec_add millions of times. It isn't any faster than just doing the math
    myself. Here's the code, followed by the output:

    CODE:

    clock_t startTime;
    double elapsedTime; //seconds; a double so dividing by CLOCKS_PER_SEC keeps the fraction

    //Two iteration variables, since a signed 32-bit int can't go over about 2 billion
    int numIters = 1000000000, numIters2 = 10;

    //Make two vectors
    vector unsigned long v1 = (vector unsigned long) (1, 2, 3, 4);
    vector unsigned long v2 = (vector unsigned long) (1, 1, 1, 1);

    printf("%vld\n", v1);
    printf("%vld\n", v2);

    startTime = clock();
    //Add the vectors a couple of times :-)
    for (int j = 0; j < numIters2; j++)
        for (int i = 0; i < numIters; i++)
            v1 = vec_add(v1, v2);

    elapsedTime = clock() - startTime;
    elapsedTime /= CLOCKS_PER_SEC;

    printf("%vld\n", v1);
    cout << "Elapsed time: " << elapsedTime << endl << endl;

    //Make four variables, identical to the original vector
    unsigned int q = 1, w = 2, e = 3, r = 4;
    cout << q << "," << w << "," << e << "," << r << endl;

    startTime = clock();

    //Add 'em up a little bit
    for (int j = 0; j < numIters2; j++)
        for (int i = 0; i < numIters; i++)
        {
            q += 1;
            w += 1;
            e += 1;
            r += 1;
        }

    elapsedTime = clock() - startTime;
    elapsedTime /= CLOCKS_PER_SEC;

    cout << q << "," << w << "," << e << "," << r << endl;
    cout << "Elapsed time: " << elapsedTime << endl;

    OUTPUT:

    Altivec found
    Hello World 1234
    1 2 3 4
    1 1 1 1
    1410065409 1410065410 1410065411 1410065412
    Elapsed time: 29.5735

    1,2,3,4
    1410065409,1410065410,1410065411,1410065412
    Elapsed time: 33.9299

    Why is there virtually no speed up? This can't have anything to do with
    16-byte aligning the vector because I declared it directly from constants.

    Thanks.

    __________________________________________________ ______________________
    Keith Wiley unm.edu
    http://www.unm.edu/~keithw http://www.mp3.com/KeithWiley

    "Yet mark his perfect self-contentment, and hence learn his lesson,
    that to be self-contented is to be vile and ignorant, and that to
    aspire is better than to be blindly and impotently happy."
    -- Edwin A. Abbott, Flatland
    __________________________________________________ ______________________
    Keith Guest

  2. #2

    Re: altivec, no apparent speedup

    In article <cs.unm.edu>,
    Keith Wiley <unm.edu> wrote:
     

    Merely calling AltiVec APIs doesn't magically make your code faster.
    AltiVec allows you to process large chunks of data at once. To take
    full advantage of it, you need to keep feeding large chunks of data to
    the CPU.
     

    This code has to wait for the completion of every vec_add before the
    next one can even start. Each iteration of the loop both depends on v1
    and sets v1. Remove that bottleneck and things should be faster.

    Oh, and use Apple's CHUD tools to be told exactly what's wrong with your
    code. They'll pinpoint pipeline stalls and other such things.

    -Eric

    --
    Eric Albert edu
    http://rescomp.stanford.edu/~ejalbert/
    Eric Guest
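    The dependency described in reply #2 can be sketched in portable C++.
    This is only an illustration under stated assumptions: Vec4 and add()
    are hypothetical stand-ins for a 4-lane AltiVec vector and vec_add, so
    the example builds without AltiVec, and sumDependent/sumIndependent are
    made-up names. The first loop is one long chain where every add waits on
    the previous result; the second keeps four independent accumulators and
    combines them at the end. Whether the second form is actually faster
    depends on the CPU and compiler, so it should be profiled rather than
    assumed.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 4-lane vector (e.g. an AltiVec vector of 32-bit ints).
struct Vec4 {
    std::uint32_t lane[4];
};

// Hypothetical stand-in for vec_add: lane-wise add with unsigned wraparound.
inline Vec4 add(Vec4 a, Vec4 b) {
    Vec4 r;
    for (int k = 0; k < 4; ++k) r.lane[k] = a.lane[k] + b.lane[k];
    return r;
}

// One long chain: each add reads the value the previous add produced,
// so no two adds can be in flight at the same time.
Vec4 sumDependent(Vec4 start, Vec4 step, std::uint32_t iters) {
    Vec4 v = start;
    for (std::uint32_t i = 0; i < iters; ++i) v = add(v, step);
    return v;
}

// Four independent chains: the adds inside one loop iteration do not
// depend on each other, which leaves a pipelined unit free to overlap them.
Vec4 sumIndependent(Vec4 start, Vec4 step, std::uint32_t iters) {
    Vec4 zero = {{0, 0, 0, 0}};
    Vec4 a = start, b = zero, c = zero, d = zero;
    std::uint32_t i = 0;
    for (; i + 4 <= iters; i += 4) {
        a = add(a, step);
        b = add(b, step);
        c = add(c, step);
        d = add(d, step);
    }
    for (; i < iters; ++i) a = add(a, step);  // leftover iterations
    return add(add(a, b), add(c, d));         // combine the partial sums
}
```

    Both functions return the same lanes for the same inputs; only the
    shape of the dependency chain differs. With real AltiVec code the same
    restructuring would apply to the vec_add loop in post #1.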

  3. #3

    Re: altivec, no apparent speedup

    Thanks. I'll have to look into that.

    Keith Guest

  4. #4

    Re: altivec, no apparent speedup

    In article <cs.unm.edu>,
    Keith Wiley <unm.edu> wrote:
     

    Shark is the one you want to look at most... It's great!
    Sean Guest
