  1. #1

    Regex headache

    I am having a regex nightmare and can't see the wood for the trees.
    I want to extract data from an HTML file. I have been using the file()
    function, which gets the HTML all right; I am just falling down with the
    regular expression.

    e.g.: <span class="something">Some Text</span><span class="something">Some More text</span>
    I want to extract the information and write it to an array. The above should
    produce:
    $data[0]="Some Text", $data[1]="Some More text"

    I am such a noddy when it comes to regex. Can anyone help with a code
    snippet?
    Thanks

    Regards
    Richard Grove
    http://www.shopmaker.co.uk - Ecommerce Shop Systems



    Richard Grove - ®ed Eye Media Guest

  3. #2

    Re: Regex headache

    I did something like this before. Here is some of it, modified:

    <?php
    $data = array();
    $quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';

    // Get each <span ...>...</span> element within $quote.
    preg_match_all('(<span.*?>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
    foreach ($all_span[0] as $span_match)
    {
        echo $span_match."\n";
        // $span_match = <span ...>...</span>

        // Get the data between the span tags, including the surrounding > and <.
        preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
        foreach ($all_data[0] as $data_match)
        {
            // Strip the leading > and the trailing < before storing the text.
            echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
            array_push($data, substr($data_match, 1, strlen($data_match) - 2));
        }
    }

    print_r($data);
    ?>

    Now $data should have all the stuff you need.
    -JI
    Jamie Isaacs Guest
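
    A more compact variant of the two-pass approach above, offered as a sketch
    rather than as the poster's method: a single preg_match_all() call with a
    capture group collects the text directly. The helper name
    extract_span_text() is hypothetical, and the pattern assumes the spans are
    not nested.

```php
<?php
// Hypothetical helper: returns the text of every <span> found in $html.
// Assumes spans are not nested inside one another.
function extract_span_text($html)
{
    // [^>]* skips the tag's attributes; (.*?) lazily captures up to the
    // next </span>. The s modifier lets . match newlines inside a span.
    preg_match_all('#<span[^>]*>(.*?)</span>#s', $html, $matches);
    return $matches[1]; // group 1 holds the captured text of each span
}

$quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';
$data = extract_span_text($quote);
print_r($data);
```

    With the sample string, $data[0] is "Some Text" and $data[1] is
    "Some More text". The capture group removes the need for the second
    preg_match_all() call and the substr() trimming.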

  4. #3

    Re: Regex headache


    Many thanks, we are on the right road now.
    I changed it to this, but it doesn't work:
    preg_match_all('(<span class="bodybold">*</span>)',$lines[$a],$all_span, PREG_PATTERN_ORDER);

    I would like to get the data from between <span class="bodybold">data</span>

    Any ideas?



    Richard Grove - ®ed Eye Media Guest
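
    One likely reason the pattern above finds nothing: in a regular expression,
    ">*" means "zero or more > characters", not "any text", so nothing in the
    pattern can match the word between the tags. A minimal sketch of a fix,
    assuming $lines holds the output of file(); here it is a hand-written
    stand-in array, and the sample lines are invented for illustration.

```php
<?php
// Stand-in for $lines = file('page.html'); the file name is hypothetical.
$lines = array(
    '<p>intro</p>' . "\n",
    '<span class="bodybold">data</span>' . "\n",
    '<span class="other">skip me</span><span class="bodybold">more data</span>' . "\n",
);

// The pattern from the post: ">*" only repeats the > character,
// so it returns 0 (no match) on a span that contains text.
$broken = preg_match('(<span class="bodybold">*</span>)', $lines[1]);

$data = array();
foreach ($lines as $line) {
    // (.*?) captures any characters, lazily, up to the closing tag.
    if (preg_match_all('#<span class="bodybold">(.*?)</span>#', $line, $m)) {
        $data = array_merge($data, $m[1]);
    }
}
print_r($data);
```

    Here $broken is 0, while $data ends up holding "data" and "more data".
    Matching line by line means a span that is split across two lines of the
    file will be missed.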

  5. #4

    Re: Regex headache

    To match just the ones with bodybold in the span tag, try this:

    <?php
    $data = array();
    $quote = '<span class="bodybold">Some Text</span><span class="something">Some More text</span>';

    // Get each <span ...bodybold...>...</span> within $quote.
    // [^>]* keeps the search for "bodybold" inside a single opening tag,
    // so a match cannot start at an earlier span without that class.
    preg_match_all('(<span[^>]*bodybold[^>]*>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
    foreach ($all_span[0] as $span_match)
    {
        echo $span_match."\n";
        // $span_match = <span ...>...</span>
        // Get the data between the span tags, including the surrounding > and <.
        preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
        foreach ($all_data[0] as $data_match)
        {
            // Strip the leading > and the trailing < before storing the text.
            echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
            array_push($data, substr($data_match, 1, strlen($data_match) - 2));
        }
    }

    print_r($data);
    ?>
    Jamie Isaacs Guest
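
    As an alternative to regular expressions (a different technique from the
    one in this thread), PHP's DOM extension can parse the HTML and select
    spans by class. This is only a sketch: it assumes the DOM extension is
    enabled, the helper name spans_by_class() is hypothetical, and $class is
    assumed to contain no double quotes.

```php
<?php
// Hypothetical helper: text of every <span> whose class attribute
// is exactly $class. Requires PHP's DOM extension.
function spans_by_class($html, $class)
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = array();
    // Select span elements by an exact match on the class attribute.
    foreach ($xpath->query('//span[@class="' . $class . '"]') as $node) {
        $result[] = $node->textContent;
    }
    return $result;
}

$quote = '<span class="bodybold">Some Text</span><span class="something">Some More text</span>';
$data = spans_by_class($quote, 'bodybold');
print_r($data);
```

    For the sample string, $data holds the single entry "Some Text". A parser
    copes with nested tags and attributes in any order, which the regex
    versions do not.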

  6. #5

    Re: Regex headache



    Many thanks, Jamie.
    I'll give it a spin.

    Regards
    Richard Grove
    http://www.shopmaker.co.uk - Ecommerce Shop Systems


    Richard Grove - ®ed Eye Media Guest
