Cocoa: simple HTML text to Unicode


  1. #1

    Cocoa: simple HTML text to Unicode

    I have an NSString. 99% of the time it contains just text.
    Sometimes one or more characters of the string will be encoded
    in HTML, like the following:

    Time for Questions &amp; Answers.

    What's the simplest method I can use to turn this into normal
    text (I assume Unicode) ? There will never be any image or
    link information in the string, just encoded text.


    Simon Slavin Guest

  3. #2

    Re: Cocoa: simple HTML text to Unicode

    In article <BB8179049668744EF6@10.0.1.2>,
    slavins@hearsay.demon.co.uk (Simon Slavin) wrote:
    > I have an NSString. 99% of the time it contains just text.
    > Sometimes one or more characters of the string will be encoded
    > in HTML, like the following:
    >
    > Time for Questions &amp; Answers.
    >
    > What's the simplest method I can use to turn this into normal
    > text (I assume Unicode) ? There will never be any image or
    > link information in the string, just encoded text.
    One likely approach is as follows, though I have not tried it:

    1. Convert the NSString to NSData and load it into an
    NSAttributedString, using NSAttributedString's
    -initWithHTML:documentAttributes: initializer (it takes NSData,
    not NSString).

    2. Follow that by converting the attributed string back to an
    un-attributed one with -string.
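
The two steps above might look roughly like the sketch below. It is untested here and rests on a few assumptions: ARC is enabled, AppKit is linked (the HTML initializer is an AppKit addition on the Mac), and the call is made on the main thread. `PlainTextFromHTML` is a made-up helper name, not a Cocoa API.

```objc
#import <Foundation/Foundation.h>
#import <AppKit/AppKit.h>   // NSAttributedString's HTML initializers live in AppKit

// Hypothetical helper: run the string through the HTML importer and
// return only the characters, dropping all attributes.
static NSString *PlainTextFromHTML(NSString *html)
{
    // The initializer wants NSData. UTF-16 data produced by
    // -dataUsingEncoding: starts with a byte-order mark.
    NSData *data = [html dataUsingEncoding:NSUnicodeStringEncoding];
    if (data == nil) {
        return nil;
    }

    // Step 1: parse the data as HTML. Returns nil if the data cannot be read.
    NSAttributedString *attributed =
        [[NSAttributedString alloc] initWithHTML:data documentAttributes:NULL];

    // Step 2: -string gives back the un-attributed text.
    return [attributed string];
}
```

Because this goes through a full HTML importer, it may also normalize whitespace or line breaks, so it is worth checking the output against strings that contain no entities at all.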

    --
    Tom "Tom" Harrington
    Macaroni, Automated System Maintenance for Mac OS X.
    Version 1.4: Best cleanup yet, gets files other tools miss.
    See http://www.atomicbird.com/
    Tom Harrington Guest

  4. #3

    Re: Cocoa: simple HTML text to Unicode

    <slavins@hearsay.demon.co.uk> wrote:
    > I have an NSString. 99% of the time it contains just text. Sometimes one
    > or more characters of the string will be encoded in HTML, like the
    > following:
    >
    > Time for Questions &amp; Answers.
    >
    > What's the simplest method I can use to turn this into normal text (I
    > assume Unicode) ? There will never be any image or link information in
    > the string, just encoded text.
    Search the string for HTML-encoded character entities and then convert
    them. :-) Numeric entities are assumed to be Unicode. Details:
    <http://makeashorterlink.com/?M10E233D5>

    Look at NSScanner for some clues about how you might accomplish this.
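
A minimal sketch of the NSScanner approach, assuming ARC and a modern compiler with Objective-C literals. The helper names `DecodeEntityName` and `DecodeEntities` are hypothetical, and the named-entity table is deliberately tiny; a real one would cover the full HTML entity list. Anything that does not parse as a known entity is copied through unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turn the text between '&' and ';' into a string,
// or return nil if it is not a recognised entity.
static NSString *DecodeEntityName(NSString *name, NSDictionary *named)
{
    if ([name hasPrefix:@"#"] && [name length] > 1) {
        unichar second = [name characterAtIndex:1];
        BOOL hex = (second == 'x' || second == 'X');
        NSScanner *num =
            [NSScanner scannerWithString:[name substringFromIndex:(hex ? 2 : 1)]];
        [num setCharactersToBeSkipped:nil];
        unsigned int hexValue = 0;
        int decValue = 0;
        BOOL ok = hex ? [num scanHexInt:&hexValue] : [num scanInt:&decValue];
        if (!ok || ![num isAtEnd] || decValue < 0) {
            return nil;
        }
        uint32_t code = hex ? hexValue : (uint32_t)decValue;
        // Reject NUL, surrogates, and values beyond the Unicode range.
        if (code == 0 || code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) {
            return nil;
        }
        if (code < 0x10000) {
            unichar c = (unichar)code;
            return [NSString stringWithCharacters:&c length:1];
        }
        // Code points above U+FFFF need a UTF-16 surrogate pair.
        code -= 0x10000;
        unichar pair[2] = { (unichar)(0xD800 + (code >> 10)),
                            (unichar)(0xDC00 + (code & 0x3FF)) };
        return [NSString stringWithCharacters:pair length:2];
    }
    return named[name];
}

// Hypothetical helper: scan for "&...;" runs and replace the ones we know.
static NSString *DecodeEntities(NSString *input)
{
    // Illustrative subset only.
    NSDictionary *named = @{ @"amp": @"&", @"lt": @"<", @"gt": @">",
                             @"quot": @"\"", @"apos": @"'", @"nbsp": @"\u00A0" };
    NSMutableString *result = [NSMutableString string];
    NSScanner *scanner = [NSScanner scannerWithString:input];
    [scanner setCharactersToBeSkipped:nil];   // keep whitespace intact

    while (![scanner isAtEnd]) {
        NSString *chunk = nil;
        // Copy everything up to the next ampersand.
        if ([scanner scanUpToString:@"&" intoString:&chunk]) {
            [result appendString:chunk];
        }
        if (![scanner scanString:@"&" intoString:NULL]) {
            break;   // no ampersand left
        }
        NSUInteger afterAmp = [scanner scanLocation];
        NSString *name = nil;
        NSString *decoded = nil;
        if ([scanner scanUpToString:@";" intoString:&name] &&
            [scanner scanString:@";" intoString:NULL]) {
            decoded = DecodeEntityName(name, named);
        }
        if (decoded != nil) {
            [result appendString:decoded];
        } else {
            // Not an entity: keep the '&' and resume right after it.
            [result appendString:@"&"];
            [scanner setScanLocation:afterAmp];
        }
    }
    return result;
}
```

Called as `DecodeEntities(@"Time for Questions &amp; Answers.")`, this avoids pulling in an HTML parser, at the cost of maintaining the entity table yourself.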
    Paul Mitchum Guest

  5. #4

    Re: Cocoa: simple HTML text to Unicode

    In article <BB8179049668744EF6@10.0.1.2>,
    slavins@hearsay.demon.co.uk (Simon Slavin) wrote:
    > What's the simplest method I can use to turn this into normal
    > text (I assume Unicode) ? There will never be any image or
    > link information in the string, just encoded text.
    My dirty little trick for this (used in STXML) was to note that Apple
    provides a perfectly fine entity decoder for XML to handle property
    lists, although they don't provide a detached interface to it. So I do
    something like:

    decodedString = [[NSString stringWithFormat: @"<string>%@</string>",
    encodedString] propertyList];
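
For comparison, here is a sketch of the same trick going through NSPropertyListSerialization with a complete XML plist wrapper, so that a parse failure can be detected. This is not the STXML code, just an illustration assuming ARC; `DecodeViaPropertyList` is a hypothetical name. As far as I know the plist parser only understands the five predefined XML entities and numeric references, so HTML-only names such as `&nbsp;`, or a stray `<` in the text, would make the parse fail. In that case this version returns the input unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: borrow the XML property-list parser as an
// entity decoder. Falls back to the original string on any failure.
static NSString *DecodeViaPropertyList(NSString *encoded)
{
    // Wrap the text in a minimal XML property list.
    NSString *xml = [NSString stringWithFormat:
        @"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        @"<plist version=\"1.0\"><string>%@</string></plist>", encoded];
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) {
        return encoded;
    }

    NSError *error = nil;
    id plist = [NSPropertyListSerialization propertyListWithData:data
                                                         options:NSPropertyListImmutable
                                                          format:NULL
                                                           error:&error];

    // A successful parse of a <string> element yields an NSString.
    if ([plist isKindOfClass:[NSString class]]) {
        return plist;
    }
    return encoded;   // parse failed; leave the text as it was
}
```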
    Doc O'Leary Guest
