" > > one can use Regexp::Common (well, if it's not allowed, it's still > possible with Regexp::Common): (untested): > > use Regexp::Common; > > s/($RE{delimited}{-delim => q {'"}})|$RE{comment}{Portia}/$1||""/ge; > > Abigail[/quote] Here's what I've come up with using a trial and error process: $TheCode = preg_replace('#^([^"\'\/]*)//[^"\']*[\n\r]$#mU', '$1', $TheCode); Applying this RegEx to the following text, =========================================================================== // 1a. Some comments that should be trashed // 2a. Some comments that should be trashed // 1b. Some comments that should be trashed // 2b. Some comments that should be trashed // 1c. Some comments that should be trashed // 2c. Some comments that should be trashed 3. Code goes here; 5. 6. $SomeVar = '// These comments should be left alone!'; 7. $SomeVar = ' // These comments should be left alone!'; 8. /* 9. Some more comments that will remain */ 10. "// 11. These comments should also be left alone " " // 12. These comments should also be left alone " " // 13. These comments should also be left alone " # 14. Hash/Pound sign comments should remain for now 15. 16. More Code goes here 17. { 18. This code should stay // 19. This comment should go 20. } 21. "This is a \"string\" that ends here 22. --->" " 23. This is a \"string\" that ends here --->" // 24. Some comments that should be trashed // 25. Some comments that should be trashed // 26. Some comments that should be trashed // 27. Some comments that should be trashed // 28. Some comments that should be trashed // 29. Some comments that should be trashed =========================================================================== Produces: =========================================================================== 3. Code goes here; 5. 6. $SomeVar = '// These comments should be left alone!'; 7. $SomeVar = ' // These comments should be left alone!'; 8. /* 9. Some more comments that will remain */ 10. "// 11. 
These comments should also be left alone " " // 12. These comments should also be left alone " " // 13. These comments should also be left alone " # 14. Hash/Pound sign comments should remain for now 15. 16. More Code goes here 17. { 18. This code should stay 20. } 21. "This is a \"string\" that ends here 22. --->" " 23. This is a \"string\" that ends here --->" =========================================================================== ======================================================================== // Here is the breakdown of how this works: // ' // Opening quote to contain the RegEx String // # // Opening RegEx delimiter. Using # because // // is used in the match // ^ // Start at the beginning of the string // ( // Begin capture section 1 // [^"\'\/] // Any character except these 3: " ' / // * // Zero or more of the previous // ) // Close capture section 1 // // // Must have the double slashes // on the line // [^"\'] // Any character except these 2: " ' // * // Zero or more of the previous // [\n\r] // End of line character // $ // End of string marker // # // Closing RegEx delimiter // m // Multi-line string modifier // U // Ungreedy modifier // ' // Closing quote for RegEx string ============================================================================= This appears to strip out all // comments that are NOT contained in quote marks, whether single or double. However, it doesn't remove the line completely if there are only blank spaces remaining after the RegEx operation. Oh, well. I guess these can be removed with a subsequent RegEx. Any comments or suggestions? 
-- Start Here to Find It Fast!™ -> [url]http://www.US-Webmasters.com/best-start-page/[/url] $8.77 Domain Names -> [url]http://domains.us-webmasters.com/[/url] [allowsmilie] => 1 [showsignature] => 0 [ipaddress] => [iconid] => 0 [visible] => 1 [attach] => 0 [infraction] => 0 [reportthreadid] => 0 [isusenetpost] => 1 [msgid] => <40A5B018.6932@US-Webmasters.com> [ref] => <40A4579F.7D4@US-Webmasters.com> [htmlstate] => on_nl2br [postusername] => W. D. [ip] => NewsGroups@US-W [isdeleted] => 0 [usergroupid] => [membergroupids] => [displaygroupid] => [password] => [passworddate] => [email] => [styleid] => [parentemail] => [homepage] => [icq] => [aim] => [yahoo] => [msn] => [skype] => [showvbcode] => [showbirthday] => [usertitle] => [customtitle] => [joindate] => [daysprune] => [lastvisit] => [lastactivity] => [lastpost] => [lastpostid] => [posts] => [reputation] => [reputationlevelid] => [timezoneoffset] => [pmpopup] => [avatarid] => [avatarrevision] => [profilepicrevision] => [sigpicrevision] => [options] => [akvbghsfs_optionsfield] => [birthday] => [birthday_search] => [maxposts] => [startofweek] => [referrerid] => [languageid] => [emailstamp] => [threadedmode] => [autosubscribe] => [pmtotal] => [pmunread] => [salt] => [ipoints] => [infractions] => [warnings] => [infractiongroupids] => [infractiongroupid] => [adminoptions] => [profilevisits] => [friendcount] => [friendreqcount] => [vmunreadcount] => [vmmoderatedcount] => [socgroupinvitecount] => [socgroupreqcount] => [pcunreadcount] => [pcmoderatedcount] => [gmmoderatedcount] => [assetposthash] => [fbuserid] => [fbjoindate] => [fbname] => [logintype] => [fbaccesstoken] => [newrepcount] => [vbseo_likes_in] => [vbseo_likes_out] => [vbseo_likes_unread] => [temp] => [field1] => [field2] => [field3] => [field4] => [field5] => [subfolders] => [pmfolders] => [buddylist] => [ignorelist] => [signature] => [searchprefs] => [rank] => [icontitle] => [iconpath] => [avatarpath] => [hascustomavatar] => 0 [avatardateline] => [avwidth] 
=> [avheight] => [edit_userid] => [edit_username] => [edit_dateline] => [edit_reason] => [hashistory] => [pagetext_html] => [hasimages] => [signatureparsed] => [sighasimages] => [sigpic] => [sigpicdateline] => [sigpicwidth] => [sigpicheight] => [postcount] => 3 [islastshown] => [isfirstshown] => [attachments] => [allattachments] => ) --> RegEx to delete // comments NOT in quotes: ( ' ) OR (")??? - PHP Development

RegEx to delete // comments NOT in quotes: ( ' ) OR (")??? - PHP Development

  1. #1

    Re: RegEx to delete // comments NOT in quotes: ( ' ) OR (")???

    Thanks, Bill!

    I am going to have to chew on this a while so I understand it
    better--it seems fairly complicated.

    Bill wrote:
    >
    > W. D. wrote:
    >>Hi Folks,
    >>
    >>I am about to ship myself to a mental hospital! Can't figure
    >>out a regular expression to strip out comments that begin
    >>with double slashes "//" but are not contained in quotation
    >>marks, either single (') or double (").
    >>
    >>Here is a bunch of sample text that I've been working on:
    >>===================== Code Start ==========================
    >>// Some comments that should be trashed
    >>
    >>Code goes here;
    >>
    >>$SomeVar = '// These comments should be left alone!';
    >>
    >>/* Some more comments that will remain */
    >>
    >>" // These comments should also be left alone "
    >>
    >># Hash/Pound sign comments should remain for now
    >>
    >>More Code goes here
    >>
    >> { This code should stay // This comment should go
    >> }
    >>=========================== Code End ===========================
    >>
    >>This is one of my PHP regexs that DOESN'T work:
    >>
    >> $TheCode = preg_replace('#([^\n\r\'\"]|.*)//.*[\n\r]#', '',
    >>$TheCode);
    >>
    >>Does anyone know how to create a RegEx to do this????!
    >>
    >>Thank you for saving my sanity if you can help!!!!!
    >>
    >
    > Derived from perlfaq on regex:
    >
    > #!/usr/bin/perl
    >
    > my $lines = <<ENDTEXT;
    > // Some comments that should be trashed
    > Code goes here;
    > \$SomeVar = '// These comments should be left alone!';
    > /* Some more comments that will remain */
    > \" // These comments should also be left alone \"
    > # Hash/Pound sign comments should remain for now
    > More Code goes here
    > { This code should stay // This comment should go
    > }
    > ENDTEXT
    >
    > print $lines, "\n\n\n";
    >
    > # this needs whitespace reformatting, but you're using PHP :{
    > $lines =~
    > s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;
    > print $lines, "\n";
    --
    Start Here to Find It Fast!™ ->
    [url]http://www.US-Webmasters.com/best-start-page/[/url]
    $8.77 Domain Names -> [url]http://domains.us-webmasters.com/[/url]
    W. D. Guest

  2. #2

    Re: RegEx to delete // comments NOT in quotes: ( ' ) OR (")???

    Hi Bill, et al.,

    Am I interpreting this correctly?

    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;

    RegExPattern = '
    #           # Opening delimiter
    /           # A forward slash
    \*          # An escaped *
    [^*]        # Any character (except: *)
    *           # Zero or more of the previous character
    \*          # An escaped *
    +           # One or more of the previous
    (           # Begin a logical grouping
    [^/*]       # Any character (except: / * )
    [^*]        # Any character (except: * )
    *           # Zero or more of the previous
    \*          # An escaped *
    +           # One or more of the previous
    )           # End logical grouping
    *           # Zero or more of the previous
    /           # A slash
    |           # OR
    //          # 2 forward slashes
    [^\n]       # Can't be a newline
    *           # Zero or more of the previous
    |           # OR
    (           # Begin logical grouping
    "           # A double quote mark
    (           # Begin a nested logical grouping
    \\          # An escaped backslash
    .           # Any character (except newline \n)
    |           # OR
    [^"\\]      # Any character (except " or \ )
    )           # End logical grouping
    *           # Zero or more of the previous
    "           # A quote mark
    |           # OR
    '           # A single quote mark
    (           # Begin logical grouping
    \\          # An escaped backslash
    .           # Any character (except newline \n)
    |           # OR
    [^'\\]      # Any character except ' or \
    )           # End logical grouping
    *           # Zero or more of the previous
    '           # A single quote mark
    |           # OR
    .           # Any character (except newline \n)
    [^/"'\\]    # Any character (except these: / " ' \ )
    *           # Zero or more of the previous
    )           # End logical grouping
    #           # Closing delimiter
    $2          # ?
    #gsx        # Modifiers? Global? String?
    ';
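
    On the two lines marked with question marks: $2 is the replacement
    text, that is, whatever the second pair of parentheses captured (a
    quoted string or a run of ordinary text). It is empty when one of the
    comment alternatives matched instead. The g modifier replaces every
    match rather than only the first, and s lets "." match a newline too.
    Below is a minimal sketch of the same substitution in PHP. It assumes
    preg_replace (which already replaces all matches, so no g is needed)
    and a nowdoc for the pattern so the backslashes stay exactly as
    written. The variable names and the shortened sample are made up for
    illustration. Traced by hand, this pattern also deletes /* */ blocks,
    which the original question wanted to keep.

```php
<?php
// Bill's Perl pattern carried over to PHP. This is a sketch, not a vetted port.
// The nowdoc passes every backslash to PCRE untouched.
$pattern = <<<'RE'
#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#s
RE;

// Shortened, made-up version of the sample text from the thread.
$code = <<<'SRC'
// Some comments that should be trashed
$SomeVar = '// These comments should be left alone!';
/* Some more comments that will remain */
{ This code should stay // This comment should go
}
SRC;

// '$2' puts back the text captured by group 2 (strings and ordinary code).
// When a comment alternative matched, group 2 is unset and nothing is put back.
$stripped = preg_replace($pattern, '$2', $code);

echo $stripped, "\n";
```

    In this sketch the quoted string survives because the string
    alternatives sit inside group 2, while both kinds of comment match
    outside it and are replaced by nothing.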



    Bill wrote:
    >
    > W. D. wrote:
    > > ===================== Code Start ==========================
    > > // Some comments that should be trashed
    > >
    > > Code goes here;
    > >
    > > $SomeVar = '// These comments should be left alone!';
    > >
    > > /* Some more comments that will remain */
    > >
    > > " // These comments should also be left alone "
    > >
    > > # Hash/Pound sign comments should remain for now
    > >
    > > More Code goes here
    > >
    > > { This code should stay // This comment should go
    > > }
    > > =========================== Code End ===========================
    > >
    > > Does anyone know how to create a RegEx to do this????!
    > >
    >
    > Derived from perlfaq on regex:
    >
    > #!/usr/bin/perl
    >
    > my $lines = <<ENDTEXT;
    > // Some comments that should be trashed
    > Code goes here;
    > \$SomeVar = '// These comments should be left alone!';
    > /* Some more comments that will remain */
    > \" // These comments should also be left alone \"
    > # Hash/Pound sign comments should remain for now
    > More Code goes here
    > { This code should stay // This comment should go
    > }
    > ENDTEXT
    >
    > print $lines, "\n\n\n";
    >
    > # this needs whitespace reformatting, but you're using PHP :{
    > $lines =~
    > s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;
    > print $lines, "\n";
    W. D. Guest

  3. #3

    Re: RegEx to delete // comments NOT in quotes: ( ' ) OR (")???

    Thanks, Abigail for your reply.

    New RegEx below...

    Abigail wrote:
    >
    > W. D. (NewsGroups@US-Webmasters.com) wrote on MMMCMIX September MCMXCIII
    > in <URL:news:40A4579F.7D4@US-Webmasters.com>:
    > }} Hi Folks,
    > }}
    > }} I am about to ship myself to a mental hospital! Can't figure
    > }} out a regular expression to strip out comments that begin
    > }} with double slashes "//" but are not contained in quotation
    > }} marks, either single (') or double (").
    >
    > Assuming that quoted strings can contain backslashed quotes, as in:
    >
    > "This is a \"string\" that ends here --->"
    >
    > one can use Regexp::Common (well, if it's not allowed, it's still
    > possible with Regexp::Common): (untested):
    >
    > use Regexp::Common;
    >
    > s/($RE{delimited}{-delim => q {'"}})|$RE{comment}{Portia}/$1||""/ge;
    >
    > Abigail
    Here's what I've come up with using a trial-and-error process:

    $TheCode = preg_replace('#^([^"\'\/]*)//[^"\']*[\n\r]$#mU', '$1',
    $TheCode);

    Applying this RegEx to the following text,
    ===========================================================================


    // 1a. Some comments that should be trashed
    // 2a. Some comments that should be trashed
    // 1b. Some comments that should be trashed
    // 2b. Some comments that should be trashed
    // 1c. Some comments that should be trashed
    // 2c. Some comments that should be trashed





    3.
    Code goes here;
    5.
    6. $SomeVar = '// These comments should be left alone!';
    7. $SomeVar = ' // These comments should be left alone!';
    8.
    /* 9. Some more comments that will remain */
    10.
    "// 11. These comments should also be left alone "
    " // 12. These comments should also be left alone "
    " // 13. These comments should also be left alone "
    # 14. Hash/Pound sign comments should remain for now
    15.
    16. More Code goes here
    17.
    { 18. This code should stay // 19. This comment should go
    20.
    } 21.
    "This is a \"string\" that ends here 22. --->"
    " 23. This is a \"string\" that ends here --->"
    // 24. Some comments that should be trashed
    // 25. Some comments that should be trashed
    // 26. Some comments that should be trashed
    // 27. Some comments that should be trashed
    // 28. Some comments that should be trashed
    // 29. Some comments that should be trashed

    ===========================================================================

    Produces:
    ===========================================================================












    3.
    Code goes here;
    5.
    6. $SomeVar = '// These comments should be left alone!';
    7. $SomeVar = ' // These comments should be left alone!';
    8.
    /* 9. Some more comments that will remain */
    10.
    "// 11. These comments should also be left alone "
    " // 12. These comments should also be left alone "
    " // 13. These comments should also be left alone "
    # 14. Hash/Pound sign comments should remain for now
    15.
    16. More Code goes here
    17.
    { 18. This code should stay
    20.
    } 21.
    "This is a \"string\" that ends here 22. --->"
    " 23. This is a \"string\" that ends here --->"








    ===========================================================================

    ========================================================================
    // Here is the breakdown of how this works:
    // '          // Opening quote to contain the RegEx String
    // #          // Opening RegEx delimiter. Using # because
    //            // // is used in the match
    // ^          // Start at the beginning of the string
    // (          // Begin capture section 1
    // [^"\'\/]   // Any character except these 3: " ' /
    // *          // Zero or more of the previous
    // )          // Close capture section 1
    // //         // Must have the double slashes
    //            // on the line
    // [^"\']     // Any character except these 2: " '
    // *          // Zero or more of the previous
    // [\n\r]     // End of line character
    // $          // End of string marker
    // #          // Closing RegEx delimiter
    // m          // Multi-line string modifier
    // U          // Ungreedy modifier
    // '          // Closing quote for RegEx string
    =============================================================================

    This appears to strip out all // comments that are NOT contained in
    quote marks, whether single or double.

    However, it doesn't remove the line completely if there are only
    blank spaces remaining after the RegEx operation. Oh, well. I guess
    these can be removed with a subsequent RegEx.
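
    One possible follow-up pass is sketched here, under the assumption
    that any line holding only spaces or tabs can be dropped wholesale.
    Note that it also drops lines that were already blank before the
    first pass, which may or may not be wanted. The sample string and
    variable names are made up.

```php
<?php
// Made-up leftover from a comment-stripping pass: one line of spaces
// and one empty line sit between the two code lines.
$code = "Code goes here;\r\n   \r\n\r\n{ This code should stay \r\n}\r\n";

// With /m, ^ anchors at the start of every line.
// [ \t]* allows optional spaces or tabs; \R matches \r\n, \n or \r.
$cleaned = preg_replace('/^[ \t]*\R/m', '', $code);

echo $cleaned;
```

    Trailing spaces on lines that still hold code are left alone, since
    the pattern only matches when nothing but whitespace precedes the
    line break.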

    Any comments or suggestions?

    W. D. Guest

  4. #4

    Re: RegEx to delete // comments NOT in quotes: ( ' ) OR (")???

    W. D. (NewsGroups@US-Webmasters.com) wrote on MMMCMX September MCMXCIII
    in <URL:news:40A5B018.6932@US-Webmasters.com>:
    --
    -- ========================================================================
    -- // Here is the breakdown of how this works:
    -- // '          // Opening quote to contain the RegEx String
    -- // #          // Opening RegEx delimiter. Using # because
    -- //            // // is used in the match
    -- // ^          // Start at the beginning of the string
    -- // (          // Begin capture section 1
    -- // [^"\'\/]   // Any character except these 3: " ' /
    -- // *          // Zero or more of the previous
    -- // )          // Close capture section 1
    -- // //         // Must have the double slashes
    -- //            // on the line
    -- // [^"\']     // Any character except these 2: " '
    -- // *          // Zero or more of the previous
    -- // [\n\r]     // End of line character
    -- // $          // End of string marker
    -- // #          // Closing RegEx delimiter
    -- // m          // Multi-line string modifier
    -- // U          // Ungreedy modifier
    -- // '          // Closing quote for RegEx string
    -- =============================================================================

    I've no idea which version of Perl you are using, but I've never heard
    of an 'ungreedy' modifier. Anyway, your description suggests that you
    don't strip out comments containing quotes. That is, you leave

    // This is a "comment"

    as is.
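
    For context, U is a PCRE modifier accepted by PHP's preg functions
    (it inverts greediness), not a Perl one. The point about quotes inside
    comments can be checked with a small PHP sketch. It runs W. D.'s
    pattern unchanged next to a hypothetical capture-and-keep pattern in
    the spirit of the Regexp::Common suggestion. The second pattern, the
    variable names, and the sample are assumptions made for illustration,
    and the sample uses CRLF line endings. With bare \n line endings the
    [\n\r]$ tail of W. D.'s pattern may behave differently.

```php
<?php
// W. D.'s pattern from this thread, unchanged.
$wd = '#^([^"\'\/]*)//[^"\']*[\n\r]$#mU';

// Hypothetical alternative: group 1 captures quoted strings and /* */ blocks,
// which are put back; a bare // comment matches outside the group and is dropped.
$keep = <<<'RE'
~("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|/\*.*?\*/)|//[^\r\n]*~s
RE;

// Made-up sample with CRLF line endings (an assumption).
$code = "code(); // drop me\r\n"
      . "\$a = '// keep me';\r\n"
      . "// This is a \"comment\"\r\n"
      . "/* block stays */\r\n";

$byWd   = preg_replace($wd, '$1', $code);
$byKeep = preg_replace($keep, '$1', $code);

echo "--- W. D.'s pattern ---\n", $byWd;
echo "--- capture-and-keep ---\n", $byKeep;
```

    In this sample W. D.'s pattern removes the plain trailing comment but
    leaves the comment containing quotes in place, while the second
    pattern drops both and still keeps the quoted string and the block
    comment. The second pattern does not tidy up the blank lines it
    leaves behind.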


    Abigail
    --
    perl -e '* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
    / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
    % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % %;
    BEGIN {% % = ($ _ = " " => print "Just Another Perl Hacker\n")}'
    Abigail Guest
