BUG in encoding package requires spaces around « and »


  1. #1

    BUG in encoding package requires spaces around « and »

    A bug in the 'encoding' module seems to require spaces around the right
    and left double-angle-brackets. This only is needed when a variable is
    being interpolated within a double-quoted string. Here's a demonstration
    program:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use encoding 'iso-8859-1';

    print "«Hi there»\n";
    print "«Hello there again»\n";

    our $string = 'Something fun';

    print "«$string»"; # BUG prevents compilation
    print "« $string »"; # spaces are needed around string to compile

    __END__

    This prints (i18n-file.pl is the name of my script):

    Global symbol "%_END__" requires explicit package name at ./i18n-file.pl
    line 11.
    Execution of ./i18n-file.pl aborted due to compilation errors.

    shell returned 255

    ------------------------------
    Comment out the first print «$string», and everything works.


    Mumia W. Guest

  3. #2

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > A bug in the 'encoding' module seems to require spaces around the right
    > and left double-angle-brackets. This only is needed when a variable is
    > being interpolated within a double-quoted string. [...]
    A workaround is to use curly braces around the variable name:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use encoding 'iso-8859-1';

    print "«Hi there»\n";
    print "«Hello there again»\n";

    our $string = 'Something fun';

    # print "«$string»\n"; # BUG prevents compilation
    print "«${string}»\n"; # Put 'string' in braces to avoid bug.

    Mumia W. Guest

  4. #3

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > Mumia W. wrote:
    > > A bug in the 'encoding' module seems to require spaces around the right
    > > and left double-angle-brackets. This only is needed when a variable is
    > > being interpolated within a double-quoted string. [...]
    >
    > A workaround is to use curly braces around the variable name:
    > [...]
    > print "«${string}»\n"; # Put 'string' in braces to avoid bug.
    Another workaround: print "«$string\»";

    My guess is that Perl considers » to be part of the scalar's name
    somehow (though « and » are part of ISO-8859-1). I think you're right
    that this is a bug in the encoding module.

    But the problem seems to occur only in the character at the right side
    of $string (»), not in the one at the left side («). (though print
    "$string"; doesn't work either)

    --
    Bart

    Bart Van der Donck Guest

  5. #4

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > needed when a variable is being interpolated within a double-quoted string.
    And, FWIW, this bug also affects strings quoted in qq{} style (which is
    what I would expect, of course, but I did test it).

    --
    David Filmer (http://DavidFilmer.com)

    usenet@DavidFilmer.com Guest

  6. #5

    Re: BUG in encoding package requires spaces around « and »

    Bart Van der Donck wrote:
    > Mumia W. wrote:
    >
    >> Mumia W. wrote:
    >>> A bug in the 'encoding' module seems to require spaces around the right
    >>> and left double-angle-brackets. This only is needed when a variable is
    >>> being interpolated within a double-quoted string. [...]
    >> A workaround is to use curly braces around the variable name:
    >> [...]
    >> print "«${string}»\n"; # Put 'string' in braces to avoid bug.
    >
    > Another workaround: print "«$string\»";
    >
    > My guess is that Perl considers » to be part of the scalar's name
    > somehow (though « and » are part of ISO-8859-1). I think you're right
    > that this is a bug in the encoding module.
    >
    > But the problem seems to occur only in the character at the right side
    > of $string (»), not in the one at the left side («). (though print
    > "$string"; doesn't work either)
    >
    Thanks for the backslash idea. The 'encoding' parser seems to be partial
    towards us-ascii. I don't know what the semantic difference is supposed
    to be between the vertical bar (|) and the broken bar (¦), but the
    encoding module treats them very differently:

    1 #!/usr/bin/perl
    2 use strict;
    3 use warnings;
    4 use encoding 'iso-8859-1';
    5
    6 local $\ = "\n";
    7 our $string = 'Something fun';
    8 print "My string is $string|"; # | == \x{7C} (us-ascii, vert. bar)
    9 print "Broken: $string¦"; # ¦ == \x{A6} (8859-1, broken bar)
    10
    11 __END__
    12
    13 The encoding module doesn't seem to like characters
    14 above 127. Either put a backslash before the ¦ on line
    15 nine, or comment out line 4, and the program runs.

    Mumia W. Guest

  7. #6

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > Thanks for the backslash idea. The 'encoding' parser seems to be partial
    > towards us-ascii. I don't know what the semantic difference is supposed
    > to be between the vertical bar (|) and the broken bar (¦), but the
    > encoding module treats them very differently:
    >
    > 1 #!/usr/bin/perl
    > 2 use strict;
    > 3 use warnings;
    > 4 use encoding 'iso-8859-1';
    > 5
    > 6 local $\ = "\n";
    > 7 our $string = 'Something fun';
    > 8 print "My string is $string|"; # | == \x{7C} (us-ascii, vert. bar)
    > 9 print "Broken: $string¦"; # ¦ == \x{A6} (8859-1, broken bar)
    > 10
    > 11 __END__
    > 12
    > 13 The encoding module doesn't seem to like characters
    > 14 above 127. Either put a backslash before the ¦ on line
    > 15 nine, or comment out line 4, and the program runs.
    You're right, it appears that anything above 127 triggers the error
    message.

    print "Broken: $string";
    print "Broken: $string";
    print "Broken: $string";
    print "Broken: $string";
    print "Broken: $string";

    Or, as in your example:

    | (124) is okay (below 127)
    ¦ (166) is not okay (above 127)

    128 is just half of 256 (= the number of available characters in
    ISO-8859-1). The range 0-127 can be covered by the bits of a 7-bit
    binary number, hence that set is sometimes referred to as 7-bit ASCII.
    ISO-8859-1 is an 8-bit character set though, so I'd say this shouldn't
    normally happen. I tested on different OSes and as CGI because I was
    not sure whether it might be a shell issue. But that should not be the
    case here.
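
    For what it's worth, the 7-bit boundary can be checked with a few
    lines of Perl. This is only a sketch: is_seven_bit is a name made up
    for illustration, and the two characters are written as \x escapes so
    the snippet does not depend on how the file is saved or on the
    'encoding' pragma:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): true if the character's
# code point fits in 7 bits, i.e. lies in the range 0-127.
sub is_seven_bit {
    my ($char) = @_;
    return ord($char) < 128;
}

# \x{7C} is the vertical bar, \x{A6} is the broken bar in ISO-8859-1.
for my $char ("\x{7C}", "\x{A6}") {
    printf "%d is %s\n", ord($char),
        is_seven_bit($char) ? 'okay (7-bit)' : 'not okay (above 127)';
}
```

    This only classifies the characters; it says nothing about why the
    parser trips over the second kind.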

    Note that the following encoding gives exactly the same results:

    use encoding 'ascii';

    (Which would be explainable, because ASCII covers 0 to 127 only)

    But other tests showed that the following charsets seem to have
    the same issue:

    use encoding 'iso-8859-16';
    use encoding 'utf-8';
    use encoding 'utf8';
    use encoding 'windows-1251';

    So the problem is not limited to ISO-8859-1.

    I'm not sure where to go from here. I would conclude at this point that
    the 'encoding'-module only works for characters up to 127 that are put
    next to a variable's name.

    I hope this can be of some help.

    --
    Bart

    Bart Van der Donck Guest

  8. #7

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > A bug in the 'encoding' module seems to require spaces around the
    > right and left double-angle-brackets. This only is needed when a
    > variable is being interpolated within a double-quoted string. Here's
    > a demonstration program:
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    no utf8 ;
    > use encoding 'iso-8859-1';
    >
    > print "«Hi there»\n";
    > print "«Hello there again»\n";
    >
    > our $string = 'Something fun';
    >
    > print "«$string»"; # BUG prevents compilation
    > print "« $string »"; # spaces are needed around string to compile
    >
    > __END__
    >
    > This prints (i18n-file.pl is the name of my script):
    >
    > Global symbol "%_END__" requires explicit package name at
    > ./i18n-file.pl line 11.
    > Execution of ./i18n-file.pl aborted due to compilation errors.
    >
    > shell returned 255
    >
    > ------------------------------
    > Comment out the first print «$string», and everything works.
    Insert "no utf8;" before the "use encoding ..." line.

    --
    Affijn, Ruud

    "Gewoon is een tijger."


    Dr.Ruud Guest

  9. #8

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    > A bug in the 'encoding' module seems to require spaces around the right
    > and left double-angle-brackets. This only is needed when a variable is
    > being interpolated within a double-quoted string. Here's a demonstration
    > program:
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    > use encoding 'iso-8859-1';
    >
    > print "«Hi there»\n";
    > print "«Hello there again»\n";
    >
    > our $string = 'Something fun';
    >
    > print "«$string»"; # BUG prevents compilation
    > print "« $string »"; # spaces are needed around string to compile
    >
    > __END__
    >
    > This prints (i18n-file.pl is the name of my script):
    >
    > Global symbol "%_END__" requires explicit package name at ./i18n-file.pl
    > line 11.
    > Execution of ./i18n-file.pl aborted due to compilation errors.
    >
    > shell returned 255
    >
    > ------------------------------
    > Comment out the first print «$string», and everything works.
    >
    >
    If you look in encoding.pm, it appears to me that unless you're using
    the filter option, all it does is do some sanity checks on the encoding
    name and then set ${^ENCODING} to the given encoding name. This, and the
    findings in the adjacent threads (that "«${string}»" works) make it
    sound like the Perl parser is mis-handling the end of the interpolated
    variable name.

    So:

    Are you using the latest Perl? I believe this is 5.8.8.

    Are you using the latest Encode? I believe this is 2.17, or at least
    that is the latest on the CPAN mirror I use, as of the time I write this.

    If the answer to both is true, you might want to consider reporting
    this. I'm not sure how I would go about this, but the Encode
    documentation suggests maybe joining and posting to the Perl Unicode
    Mailing List.

    Tom Wyant
    harryfmudd [AT] comcast [DOT] net Guest

  10. #9

    Re: BUG in encoding package requires spaces around « and »

    Dr.Ruud wrote:
    > Mumia W. wrote:
    >
    >> A bug in the 'encoding' module seems to require spaces around the
    >> right and left double-angle-brackets. This only is needed when a
    >> variable is being interpolated within a double-quoted string. Here's
    >> a demonstration program:
    >>
    >> #!/usr/bin/perl
    >> use strict;
    >> use warnings;
    > no utf8 ;
    >> use encoding 'iso-8859-1';
    >>
    >> print "«Hi there»\n";
    >> print "«Hello there again»\n";
    >>
    >> our $string = 'Something fun';
    >>
    >> print "«$string»"; # BUG prevents compilation
    >> print "« $string »"; # spaces are needed around string to compile
    >>
    >> __END__
    >>
    >> This prints (i18n-file.pl is the name of my script):
    >>
    >> Global symbol "%_END__" requires explicit package name at
    >> ./i18n-file.pl line 11.
    >> Execution of ./i18n-file.pl aborted due to compilation errors.
    >>
    >> shell returned 255
    >>
    >> ------------------------------
    >> Comment out the first print «$string», and everything works.
    >
    > Insert "no utf8;" before the "use encoding ..." line.
    >
    It works!

    And I think I see why (from man utf8):
    > Note that if you have bytes with the eighth bit on in your script (for
    > example embedded Latin-1 in your string literals), "use utf8" will be
    > unhappy since the bytes are most probably not well-formed UTF-8. If
    > you want to have such bytes and use utf8, you can disable utf8 until
    > the end the block (or file, if at top level) by "no utf8;".
    Thanks for the utf8 idea. So it seems that we have a lot of ways to
    solve this problem: (1) put a space between the variable and the special
    character, (2) put the variable name in curly braces, (3) put a
    backslash before the special character, (4) specify 'no utf8', and (5)
    go ahead and convert the file to utf8 and 'use utf8':

    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;
    use encoding 'utf-8';

    local $\ = "\n";
    our $string = 'Something fun';
    print "Reg: $string®";
    print "B-Bar: $string¦";
    print "Quoted: «$string»";
    print "Yen: $string¥";
    print "Euro: $string€";

    our $exoネtic = 'ネニ * њ シßぬ ヌ に *';
    print "exoネtic = $exoネtic";

    __END__


    It seems that utf8 extends the core perl parser in some interesting ways.
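
    The first three workarounds can be put side by side in one short
    script. This is only a sketch: quoted_variants is a made-up helper
    name, the file is assumed to be saved as UTF-8 with 'use utf8' in
    effect, and 'use encoding' is deliberately left out so the script does
    not depend on that pragma's behaviour:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: build the quoted string using workarounds (1)-(3).
sub quoted_variants {
    my ($string) = @_;
    return (
        "« $string »",    # (1) spaces between the variable and « »
        "«${string}»",    # (2) variable name in curly braces
        "«$string\»",     # (3) backslash before the character that follows
    );
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for quoted_variants('Something fun');
```

    Under 'use utf8' the plain "«$string»" form compiles as well, as the
    script above shows, so this only illustrates the syntax of each
    workaround, not the bug itself.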


    Mumia W. Guest

  11. #10

    Re: BUG in encoding package requires spaces around « and »

    Mumia W. wrote:
    >
    >
    > Thanks for the utf8 idea. So it seems that we have a lot of ways to
    > solve this problem: (1) put a space between the variable and the special
    > character, (2) put the variable name in curly braces, (3) put a
    > backslash before the special character, (4) specify 'no utf8', and (5)
    > go ahead and convert the file to utf8 and 'use utf8':
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    > use utf8;
    > use encoding 'utf-8';
    >
    > local $\ = "\n";
    > our $string = 'Something fun';
    > print "Reg: $string®";
    > print "B-Bar: $string¦";
    > print "Quoted: «$string»";
    > print "Yen: $string¥";
    > print "Euro: $string€";
    >
    > our $exoネtic = 'ネニ * њ シßぬ ヌ に *';
    > print "exoネtic = $exoネtic";
    >
    > __END__
    >
    >
    > It seems that utf8 extends the core perl parser in some interesting ways.
    >
    >
    And non-obvious. It looks now like the behaviour is a feature (i.e. is
    documented). But it sure didn't pop out on my first pass through the
    documentation. Thanks.

    Tom Wyant
    harryfmudd [AT] comcast [DOT] net Guest
