r-con #1
Cyrillic characters / CF?
I'm trying to develop a web page in russian.
The files are UTF-8 encoded text files. With an .htm or .html extension, the
cyrillic characters show up fine rendered in a browser.
When I change the extension to .cfm, all the characters in the page turn to
empty boxes. There is no database support on these pages, just text and html
and <cfinclude> tags...
I can change a file back to .htm, and the cyrillic shows, back to .cfm and the
characters box. I can do this all day. I'm totally stumped. It does not appear
to be a reported CF issue in the Knowledge Base, but I wanted to see if anybody
else can replicate this problem. I'm using MX 6.0.
r-con Guest
-
PaulH #2
Re: Cyrillic characters / CF?
first off, i think you should be using mx 6.1 plus all the hotfixes, etc. next,
can i see the cf code? can i see the content?
unicode chars rendered as boxes usually mean the font is unable to render that
char. and question marks mean garbaged data.
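
A minimal Python sketch of how the "question marks" case can arise, assuming the text was pushed through an encoding that has no Cyrillic letters. The sample word and the choice of ISO-8859-1 are illustrative assumptions, not details taken from r-con's setup.

```python
# Hypothetical sample text; any Cyrillic string behaves the same way.
text = "Привет"

# ISO-8859-1 has no Cyrillic letters, so a lossy encode substitutes "?"
# for every character it cannot represent.
lossy = text.encode("iso-8859-1", errors="replace")
print(lossy)  # b'??????'

# Decoding the result gives literal question marks; the original letters
# are gone and cannot be recovered from these bytes.
recovered = lossy.decode("iso-8859-1")
print(recovered)
```

Boxes are a different failure: the character data is intact but the font has no glyph for it, so nothing in the bytes needs repairing.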
PaulH Guest
-
Howard Perlman #3
Re: Cyrillic characters / CF?
Interesting. I have some Russian HTML pages and I will eventually create some
CFM ones, so I'd like to know too.
I did a little test and I saved my existing HTML page as a CFM page. The
cyrillic characters look to be showing up fine in the CFM page.
You can see the HTML page at:
http://ga.water.usgs.gov/edu/watercyclerussianhi.html
The CFM page is at: http://ga2.er.usgs.gov/edu/watercyclerussianhi.cfm
Now, I converted the Cyrillic characters to numeric codes. I got the Word
file from the Russian translator and used "Save as Web page", which converts
the characters to the codes. So, my Russian text looks like this in the HTML/CFM
page:
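<h1 title="&#1044;&#1080;&#1072;&#1075;&#1088;&#1072;&#1084;&#1084;&#1072;
&#1074;&#1086;&#1076;&#1085;&#1086;&#1075;&#1086;
&#1094;&#1080;&#1082;&#1083;&#1072; - The Water Cycle, in Russian">The water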
cycle, Russian</h1></td><td style="text-align: right; vertical-align: top;">
R-Con, do you use codes like this, or just cut-and-pasted characters?
Please look at the above CFM page and let me know if you don't see Cyrillic
characters. It also may depend on settings in the browser, but I thought UTF-8
was supposed to make the characters appear properly in any browser...
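
For anyone unsure what the numeric codes look like next to the raw characters, here is a small Python sketch of the round trip. It assumes decimal numeric character references of the kind Word's "Save as Web page" is described as producing; the sample word is taken from the heading above.

```python
import html

word = "Диаграмма"

# Encode to ASCII, replacing each non-ASCII character with a decimal
# numeric character reference (NCR) such as &#1044;.
ncr = word.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ncr)

# A browser performs the reverse step when it renders the page.
decoded = html.unescape(ncr)
print(decoded == word)  # True
```

The NCR form is plain ASCII, which is why it survives whatever charset the page declares, at the cost of being unreadable in an editor.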
Howard Perlman
US Geological Survey
Howard Perlman Guest
-
-
r-con #5
Re: Cyrillic characters / CF?
Thanks for your input.
First off, agreed - I should be using MX 6.1. Unfortunately that's not always
how it works in government! With that said, we are gearing up for 7.
I'm starting to think this may be an issue with the conversion from MS Word to
text characters or the unix box, it's just strange the CFM extension would make
such a difference. Using HomeSite code view, the cyrillic characters do not
look like cyrillic characters, they look like
?????????-????????? ???? ???
which, till today, I assumed were somehow the UTF-8 representations of
Cyrillic. If I paste Cyrillic directly into HomeSite (in a UTF-8 encoded file),
the characters do not display in the browser correctly, and they change upon
saving and reopening.
My process is to copy the cyrillic from MS Word to a Dreamweaver file that has
a 1251 character set specified (doesn't seem to matter though), save, and post
this to the remote server. Then I open the file and copy the formatted
HTML/cyrillic into the template using HomeSite. Due to the templates and the
way we connect to the remote server (RDS), Dreamweaver is not a viable tool for
maintenance of this type (another long story).
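
A rough Python sketch of converting text saved in the 1251 code page to UTF-8, which is the kind of conversion the workflow above seems to need somewhere along the way. The sample text is hypothetical, and writing a byte order mark (BOM) is an assumption: some tools use it to recognize UTF-8, but nothing in this thread confirms that the server here requires it.

```python
import codecs

# Bytes as an editor set to Windows-1251 would save them (sample text is hypothetical).
legacy_bytes = "Привет".encode("cp1251")

# Decode with the encoding the bytes were actually written in...
text = legacy_bytes.decode("cp1251")

# ...then re-encode as UTF-8. "utf-8-sig" prepends the BOM (an assumption, see above).
utf8_with_bom = text.encode("utf-8-sig")

print(utf8_with_bom.startswith(codecs.BOM_UTF8))  # True
print(len(legacy_bytes), len(utf8_with_bom))      # 6 15
```

The key point is the explicit decode step: the conversion only works if the source encoding is named correctly.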
I'm going to try the save as HTML in MS Word; however, I don't know if I can
deal with all the MS Word styles/XML garbage. With that said, I do see the
cyrillic characters correctly in your page, Howard.
I appreciate the assistance and I'll be back when I learn more.
r-con Guest
-
Howard Perlman #6
Re: Cyrillic characters / CF?
Hey,
I had many problems with foreign fonts and I'd try to stay with the Unicode
numerics when Word put those out in the "Save as HTML". Of course, it is heck
trying to cut and paste into the HTML file and knowing where you are.
I would often get garbled characters if something didn't work. But, what I am
doing worked with MX6.1, and now works with 7. I don't think it is CF at all.
And I just noticed in my HTML I used <meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1" />, when I thought I switched to UTF-8.
For Macedonian it really made a difference which one I used, but I don't think
so for Russian.
Now, yes, Word 2000 makes a huge mess of an HTML file. Almost unusable. We
have Word 2003 and if you do, notice when you say "Save as HTML" you can now
choose "Filtered HTML", rather than just HTML. This creates a much smaller HTML
file with hardly any of the mess that the older Word creates. That helped a
whole lot.
But, if you don't have the newer Word, Dreamweaver can solve this for you. I
don't really use Dreamweaver, but I have used it just to clean up HTML Word
files! There is a pull-down command that says something like "Clean up Word
files" and it does clean out all the mess from Word. So, try that method, too.
Howard P.
Howard Perlman Guest
-
PaulH #7
Re: Cyrillic characters / CF?
i'm not sure homesite can handle utf-8 encoded files, cfstudio never could.
next if you're using dw w/a codepage instead of unicode, that could be the
problem. if you can't use unicode w/ dw then try notepad and save as utf-8. and
no you don't want to use html entities or NCR like howard's examples.
have a look at http://www.sustainablegis.com/unicode/.
PaulH Guest
-
Howard Perlman #8
Re: Cyrillic characters / CF?
Gosh, this stuff has been driving me nuts. I have long pages that people sent
me (Microsoft Word) in things like Farsi, Arabic, Greek, Macedonian, Georgian,
Vietnamese.. And Chinese is coming in.
I thought that the safest way was to use the numeric Unicode values and that
they would display in any modern browser. I did notice it makes a difference
which CHARSET I use in the META tag. Sometimes UTF-8 would actually mess up the
text (Czech); in fact, it would mess up the Unicode characters (the numeric
ones). Using 8859-1 (I think) corrected that.
Of course, I don't know what other people are seeing. No one complains to me
so I am assuming they see it properly (Czech did bring it up, luckily).
But I keep reading it is the best to use UTF-8 as the CHARSET (for all
pages?). I assumed that would be safe for everyone for every language.
I did look at the http://www.sustainablegis.com/unicode/ page and I see it
correctly in MSIE, Firefox, and Netscape 7, on PC. Then again, I think there
are settings in the browser and I don't know about that.
So, I'm not sure what to do....
Howard Perlman Guest
-
PaulH #9
Re: Cyrillic characters / CF?
howard
html entities/NCR aren't a good choice. they don't search well, cause problems
in editing and in the db, aren't human readable, etc. and you never want to use
iso-8859-1 encoding w/unicode data. no, the meta tag charset has no effect as
cf ignores it, though including it is a good idea for web-crawlers and as a
record of the original encoding intent. if using utf-8 encoding
"messed" something up then that text wasn't utf-8 in the first place.
yes, utf-8 is the best choice for cf based pages. for one thing, mx defaults
to it. use utf-8 for all your encoding. make sure your database, etc. handles
it. see if http://coldfusion.sys-con.com/read/44480.htm makes any sense.
guys, this isn't voodoo. just keep your encodings straight and you shouldn't
have any problems.
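
One way to check the "that text wasn't utf-8 in the first place" point is a strict decode. This is a Python sketch, not anything from CF itself; the helper name `is_valid_utf8` and the sample text are made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict by default: raises on malformed sequences
        return True
    except UnicodeDecodeError:
        return False


# The same word saved two ways.
as_utf8 = "Привет".encode("utf-8")
as_cp1251 = "Привет".encode("cp1251")

print(is_valid_utf8(as_utf8))    # True
print(is_valid_utf8(as_cp1251))  # False
```

A passing check does not prove the file is UTF-8 (pure ASCII passes too), but a failing one shows that labelling it UTF-8 will garble it.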
PaulH Guest
-
Howard Perlman #10
Re: Cyrillic characters / CF?
Well... gosh.
Now, all the pages I worked on are all HTML, not CFM. I didn't know it made a
difference.
So, do you think there is one CHARSET for the META tag that I should use for
all my pages, no matter what language?
And when I cut and pasted out of Word a lot, I'd get a lot of garbled
characters (boxes, etc) when I pasted the text into HTML and viewed it.
Macedonian was the first one I tried that caused problems. I just happened to
paste in the Unicode numerics and it worked (yes, VERY hard to work with, and
VERY hard to edit). But, I didn't know there was or is a different way???
I thought the advantage of the Unicode numerics was that the user would not
need the other language font installed..
Gives me a headache..
Thanks
Howard Perlman Guest
-
cf_menace #11
Re: Cyrillic characters / CF?
http://www.stephencollins.org/archives/2005/03/a_gift_to_the_c.html
HTH
cf_menace Guest
-
PaulH #12
Re: Cyrillic characters / CF?
"So, do you think there is one set CHARSET set META tag that I should use for
all my pages, no matter what language?"
if you know me, then you know how i'll answer that ;-) "just use unicode" is
pretty much all you have to remember.
pasting from word into what? on what OS? what encoding in word? what you're
doing contains a lot of variability as far as encoding goes. boxes means the
browser can't render that char (ie not the right font). what's the "etc." bit?
if it's question marks then the data is garbaged. if it's mojibake then the
encoding is wrong. while it might sound silly, notepad is a decent editor for
these sorts of things. it lets you control the encoding and 100% doesn't add
anything. you stop off there first, save off the text as utf-8 and then
proceed.
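
To make the mojibake case concrete, here is a Python sketch of UTF-8 bytes being read with the wrong encoding. Choosing ISO-8859-1 as the wrong encoding is an assumption for illustration, and the sample text is hypothetical.

```python
text = "Привет"
utf8_bytes = text.encode("utf-8")

# Reading UTF-8 bytes as ISO-8859-1 turns every two-byte Cyrillic letter
# into two unrelated Latin-1 characters: that is mojibake.
mojibake = utf8_bytes.decode("iso-8859-1")
print(len(text), len(mojibake))  # 6 12

# Unlike the question-mark case, no data was lost, so reversing the
# wrong step restores the original text.
repaired = mojibake.encode("iso-8859-1").decode("utf-8")
print(repaired == text)  # True
```

This is why the distinction above matters: mojibake can usually be undone by fixing the encoding, while question marks mean the characters were already destroyed.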
no, no matter what's used the user will need a font that has those glyphs. we
do a lot of i18n work and all our w/s have arial unicode ms on them (yes it's
huge but it has pretty much all the world languages in it--not all but pretty
close to all). but as long as the font is unicode based and is the correct font
for that language you shouldn't have any problems.
btw if you're producing the same page/site in different languages using html,
i think it's time to investigate using cf for that.
PaulH Guest




