Cyrillic characters / CF?

  1. #1

    Cyrillic characters / CF?

    I'm trying to develop a web page in Russian.

    The files are UTF-8 encoded text files. With an .htm or .html extension, the
    Cyrillic characters render fine in a browser.

    When I change the extension to .cfm, all the characters in the page turn to
    empty boxes. There is no database support on these pages, just text, HTML,
    and <cfinclude> tags...

    I can change a file back to .htm and the Cyrillic shows; back to .cfm and the
    characters turn to boxes. I can do this all day. I'm totally stumped. It does
    not appear to be a reported CF issue in the Knowledge Base, but I wanted to see
    if anybody else can replicate this problem. I'm using MX 6.0.

    r-con Guest

  3. #2

    Re: Cyrillic characters / CF?

    first off, i think you should be using mx 6.1 plus all the hotfixes, etc. next,
    can i see the cf code? can i see the content?

    unicode chars rendered as boxes usually mean the font is unable to render that
    char, and question marks mean garbaged data.


    PaulH Guest

  4. #3

    Re: Cyrillic characters / CF?

    Interesting. I have some Russian HTML pages and I will eventually create some
    CFM ones, so I'd like to know too.

    I did a little test and I saved my existing HTML page as a CFM page. The
    cyrillic characters look to be showing up fine in the CFM page.

    You can see the HTML page at:
    http://ga.water.usgs.gov/edu/watercyclerussianhi.html
    The CFM page is at: http://ga2.er.usgs.gov/edu/watercyclerussianhi.cfm

    Note that I converted the Cyrillic characters to numeric codes. I got the Word
    file from the Russian translator and used "Save as Web Page", which converts
    each character to its code. So, my Russian text looks like this in the HTML/CFM
    page:
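    <h1 title="&#1044;&#1080;&#1072;&#1075;&#1088;&#1072;&#1084;&#1084;&#1072;
    &#1074;&#1086;&#1076;&#1085;&#1086;&#1075;&#1086;
    &#1094;&#1080;&#1082;&#1083;&#1072; - The Water Cycle, in Russian">The water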
    cycle, Russian</h1></td><td style="text-align: right; vertical-align: top;">

    R-Con, do you use codes like this or just cut-and-pasted characters?
    Please look at the above CFM page and let me know if you don't see Cyrillic
    characters. It also may depend on settings in the browser, but I thought UTF-8
    was supposed to make the characters appear properly in any browser...
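
    The numeric codes mentioned above are numeric character references (NCRs).
    As a rough illustration only, here is a minimal Python sketch of the round
    trip, assuming Word emits decimal references; the function names to_ncr and
    from_ncr are made up for this example and are not part of Word or CF.

```python
import html


def to_ncr(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference such as &#1044;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_ncr(markup: str) -> str:
    """Turn character references back into real characters.

    Note: html.unescape also decodes named entities such as &amp;.
    """
    return html.unescape(markup)


if __name__ == "__main__":
    title = "Диаграмма водного цикла"
    encoded = to_ncr(title)
    print(encoded)                     # pure ASCII, so the file encoding no longer matters
    print(from_ncr(encoded) == title)  # True
```

    Because the encoded form is plain ASCII, it survives being saved or served
    under almost any charset, which may be why such pages display the same as
    .html and .cfm. The cost is markup that is hard to read and edit.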

    Howard Perlman
    US Geological Survey

    Howard Perlman Guest

  5. #4

    Re: Cyrillic characters / CF?

    Oops, I am using CF 7
    Howard Perlman Guest

  6. #5

    Re: Cyrillic characters / CF?

    Thanks for your input.

    First off, agreed - I should be using MX 6.1. Unfortunately that's not always
    how it works in government! With that said, we are gearing up for 7.

    I'm starting to think this may be an issue with the conversion from MS Word to
    text characters or the unix box, it's just strange the CFM extension would make
    such a difference. Using HomeSite code view, the cyrillic characters do not
    look like cyrillic characters, they look like

    ?????????-????????? ???? ???

    which, till today, I assumed were somehow the UTF-8 representations of
    Cyrillic. If I paste Cyrillic directly into HomeSite (in a UTF-8 encoded file),
    the characters do not display in the browser correctly, and they change upon
    saving and reopening.

    My process is to copy the cyrillic from MS Word to a Dreamweaver file that has
    a 1251 character set specified (doesn't seem to matter though), save, and post
    this to the remote server. Then I open the file and copy the formatted
    HTML/cyrillic into the template using HomeSite. Due to the templates and the
    way we connect to the remote server (RDS), Dreamweaver is not a viable tool for
    maintenance of this type (another long story).

    I'm going to try the save as HTML in MS Word; however, I don't know if I can
    deal with all the MS Word styles/XML garbage. With that said, I do see the
    cyrillic characters correctly in your page, Howard.

    I appreciate the assistance and I'll be back when I learn more.

    r-con Guest

  7. #6

    Re: Cyrillic characters / CF?

    Hey,

    I had many problems with foreign fonts and I'd try to stay with the Unicode
    numerics when Word put those out in the "Save as HTML". Of course, it is heck
    trying to cut and paste into the HTML file and knowing where you are.

    I would often get garbled characters if something didn't work. But, what I am
    doing worked with MX6.1, and now works with 7. I don't think it is CF at all.
    And I just noticed in my HTML I used <meta http-equiv="Content-Type"
    content="text/html; charset=iso-8859-1" />, when I thought I switched to UTF-8.
    For Macedonian it really made a difference which one I used, but I don't think
    so for Russian.

    Now, yes, Word 2000 makes a huge mess of an HTML file. Almost unusable. We
    have Word 2003 and if you do, notice when you say "Save as HTML" you can now
    choose "Filtered HTML", rather than just HTML. This creates a much smaller HTML
    file with hardly any of the mess that the older Word creates. That helped a
    whole lot.

    But, if you don't have the newer Word, Dreamweaver can solve this for you. I
    don't really use Dreamweaver, but I have used it just to clean up HTML Word
    files! There is a pull-down command that says something like "Clean up Word
    files" and it does clean out all the mess from Word. So, try that method, too.

    Howard P.

    Howard Perlman Guest

  8. #7

    Re: Cyrillic characters / CF?

    i'm not sure homesite can handle utf-8 encoded files, cfstudio never could.
    next if you're using dw w/a codepage instead of unicode, that could be the
    problem. if you can't use unicode w/ dw then try notepad and save as utf-8. and
    no you don't want to use html entities or NCR like howard's examples.
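
    The "save as utf-8" step amounts to decoding the codepage bytes and encoding
    them again as UTF-8. A minimal Python sketch, assuming the source text is in
    Windows-1251 (the codepage mentioned earlier in the thread); the helper name
    reencode_to_utf8 is hypothetical:

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1251") -> bytes:
    """Decode bytes from a legacy codepage and re-encode them as UTF-8.

    Raises UnicodeDecodeError if a byte is not defined in the source codepage.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    legacy = "Привет".encode("cp1251")   # one byte per Cyrillic letter
    converted = reencode_to_utf8(legacy)  # two bytes per Cyrillic letter
    print(len(legacy), len(converted))
    print(converted.decode("utf-8"))
```

    The conversion only works when the source encoding is known. Guessing the
    wrong codepage produces different, equally wrong characters rather than an
    error in most cases.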

    have a look at http://www.sustainablegis.com/unicode/.

    PaulH Guest

  9. #8

    Re: Cyrillic characters / CF?

    Gosh, this stuff has been driving me nuts. I have long pages that people sent
    me (Microsoft Word) in things like Farsi, Arabic, Greek, Macedonian, Georgian,
    Vietnamese.. And Chinese is coming in.

    I thought that the safest way was to use the numeric Unicode values and that
    they would display in any modern browser. I did notice it makes a difference
    which CHARSET I use in the META tag. Sometimes UTF-8 would actually mess up the
    text (Czech); in fact, it would mess up the Unicode characters (the numeric
    ones). Using 8859-1 (I think) corrected that.

    Of course, I don't know what other people are seeing. No one complains to me
    so I am assuming they see it properly (Czech did bring it up, luckily).

    But I keep reading it is the best to use UTF-8 as the CHARSET (for all
    pages?). I assumed that would be safe for everyone for every language.
    I did look at the http://www.sustainablegis.com/unicode/ page and I see it
    correctly in MSIE, Firefox, and Netscape 7, on PC. Then again, I think there
    are settings in the browser and I don't know about that.

    So, I'm not sure what to do....

    Howard Perlman Guest

  10. #9

    Re: Cyrillic characters / CF?

    howard

    html entities/NCR aren't a good choice. they don't search well, cause problems
    in editing and in the db, aren't human readable, etc. and you never want to use
    iso-8859-1 encoding w/unicode data. no, the meta tag charset has no effect as
    cf ignores it, though including it is a good idea for web-crawlers and as an
    artifact of the original encoding intent. if using utf-8 encoding
    "messed" something up then that text wasn't utf-8 in the first place.

    yes, utf-8 is the best choice for cf based pages. for one thing, mx defaults
    to it. use utf-8 for all your encoding. make sure your database, etc. handles
    it. see if http://coldfusion.sys-con.com/read/44480.htm makes any sense.

    guys, this isn't voodoo. just keep your encodings straight and you shouldn't
    have any problems.
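
    The claim that text which utf-8 "messed up" was never utf-8 can be checked
    mechanically, since non-ASCII bytes from a single-byte codepage rarely form
    valid utf-8 sequences. A minimal Python sketch; the helper name is_valid_utf8
    and the Czech sample word are illustrative assumptions, not from the thread:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    sample = "Čeština"
    print(is_valid_utf8(sample.encode("utf-8")))       # True
    print(is_valid_utf8(sample.encode("iso-8859-2")))  # False
```

    Pure ASCII bytes pass this check under any of these encodings, so the test
    only says something about files that contain non-ASCII characters.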



    PaulH Guest

  11. #10

    Re: Cyrillic characters / CF?

    Well... gosh.

    Now, all the pages I worked on are all HTML, not CFM. I didn't know it made a
    difference.

    So, do you think there is one CHARSET in the META tag that I should use for
    all my pages, no matter what language?

    And when I cut and pasted out of Word a lot, I'd get a lot of garbled
    characters (boxes, etc) when I pasted the text into HTML and viewed it.
    Macedonian was the first one I tried that caused problems. I just happened to
    paste in the Unicode numerics and it worked (yes, VERY hard to work with, and
    VERY hard to edit). But, I didn't know there was or is a different way???

    I thought the advantage of the Unicode numerics was that the user would not
    need the other language font installed..

    Gives me a headache..

    Thanks

    Howard Perlman Guest

  12. #11

    Re: Cyrillic characters / CF?

    http://www.stephencollins.org/archives/2005/03/a_gift_to_the_c.html

    HTH
    cf_menace Guest

  13. #12

    Re: Cyrillic characters / CF?

    "So, do you think there is one CHARSET in the META tag that I should use for
    all my pages, no matter what language?"

    if you know me, then you know how i'll answer that ;-) just use unicode is
    pretty much all you have to remember.

    pasting from word into what? on what OS? what encoding in word? what you're
    doing contains a lot of variability as far as encoding goes. boxes mean the
    browser can't render that char (ie not the right font). what's the "etc." bit?
    if it's question marks then the data is garbaged. if it's mojibake then the
    encoding is wrong. while it might sound silly, notepad is a decent editor for
    these sorts of things. it lets you control the encoding and 100% doesn't add
    anything. stop off there first, save off the text as utf-8 and then
    proceed.
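
    Two of the symptoms described above can be reproduced in a few lines. This
    is a minimal Python sketch with hypothetical helper names; it assumes the
    wrong encoding in play is iso-8859-1, which is only one of many possible
    mismatches. Boxes are a font problem on the viewer's machine and cannot be
    shown in code.

```python
def make_mojibake(text: str) -> str:
    """Correct UTF-8 bytes read back with the wrong encoding (ISO-8859-1)."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair_mojibake(garbled: str) -> str:
    """Undo make_mojibake: the bytes were intact, only the label was wrong."""
    return garbled.encode("iso-8859-1").decode("utf-8")


def force_into_latin1(text: str) -> str:
    """Squeeze text into ISO-8859-1; unsupported characters become '?'."""
    return text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")


if __name__ == "__main__":
    word = "Привет"
    print(repair_mojibake(make_mojibake(word)) == word)  # True: recoverable
    print(force_into_latin1(word))                       # ??????: data is gone
```

    Mojibake can often be reversed because the original bytes are still there.
    Question marks cannot, because each character was already replaced before
    the text was stored or sent.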

    no, no matter what's used the user will need a font that has those glyphs. we
    do a lot of i18n work and all our w/s have arial unicode ms on them (yes it's
    huge but it has pretty much all the world languages in it--not all but pretty
    close to all). but as long as the font is unicode based and is the correct font
    for that language you shouldn't have any problems.

    btw if you're producing the same page/site in different languages using html,
    i think it's time to investigate using cf for that.

    PaulH Guest
