Ask a Question related to PostgreSQL / PGSQL, Design and Development.

  1. #1

    Default Re: Lower case

    Hi,

    On Wed, 2005-01-26 at 12:01 +0000, Vladimir S. Petukhov wrote:
    > Hi!
    >
    > Sorry for my English..
    >
    > I want to do case-insensitivity search, like this
    > ... WHERE lower (column_name) LIKE lower (%value%);
    > This work fine for English..
    > But i need search for Russian words, lower() operator does not work with
    > Russian (non-English) chars, but ORDER works fine...
    > ???????????????
    > What's wrong?
    lower(), upper() and case insensitive search highly depend on
    the correct locale. The locale is currently set on initdb
    time, that is, when your whole cluster gets created.

    If you dont want to recreate your cluster and your db,
    you probably need to write a function to lower()
    according to your datas locale.

    Regards
    Tino
    --
    Tino Wildenhain <tino@wildenhain.de>


    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

    Tino Wildenhain Guest

  2. Similar Questions and Discussions

    1. SOME of my lower case letters won't type!help
      I'm having this problem. Some of my lower case letters won't type in Dreamweaver. I even erased the site. But my new one I can't type lower case...
    2. OT--Copperplate & Lower Case
      Years ago, I stumbled across a Copperplate font that actually had a lower case character set with fine serifs and all. Can't remember where I saw it....
    3. #26354 [Opn->Bgs]: <Address> forced to lower case
      ID: 26354 Updated by: elmicha@php.net Reported By: spam at tkz dot net -Status: Open +Status: Bogus...
    4. #26354 [NEW]: <Address> forced to lower case
      From: spam at tkz dot net Operating system: linux & windows xp PHP version: 4.3.3 PHP Bug Type: Output Control Bug...
    5. lost lower case type
      nancy, no but don't be embarrased to ask. What I get I a Big capital A and then instead of lower case I get small uppercase. I have been all over...
  3. #2

    Default Re: Lower case

    On Wed, 26 Jan 2005 12:01:49 +0000, Vladimir S. Petukhov
    <vladimir@sycore.org> wrote:
    > I want to do case-insensitivity search, like this
    > ... WHERE lower (column_name) LIKE lower (%value%);
    > This work fine for English..
    > But i need search for Russian words, lower() operator does not work with
    > Russian (non-English) chars, but ORDER works fine...
    > ???????????????
    > What's wrong?
    Welllllll... Have you made an initdb with apropriate locale setting?
    Try:
    pg_controldata /var/lib/postgresql/data
    (or wherever your db is)
    You should see lines like:
    LC_COLLATE: C
    LC_CTYPE: C

    If you are using Unicode these should be ru_RU.UTF-8, if not then
    ru_RU.KOI8-R or something. If you see 'C', or 'en_EN' or similar
    then you won't have Russian lower/upper support (and ORDER BY
    was just a "luck" :)).

    What you can do:
    pg_dump the database, initdb --locale=ru_RU; pg_restore the database.

    Ohhh, and since you're at it, there is one thing which makes me
    wonder and if you don't mind, please try it. :) This will require
    plperlu language, so 'createlang plperlu' and that you use
    unicode encoding in your database.

    Try if this function:
    CREATE OR REPLACE FUNCTION lc(text) RETURNS text AS $$
    utf8::decode($_[0]);
    return lc($_[0]);
    $$ LANGUAGE plperlu STABLE;

    Used as select lc(your_text_column) from your_table;
    works for Russian alphabet. :) I'm just cuuurious! :)

    Regards,
    Dawid

    ---------------------------(end of broadcast)---------------------------
    TIP 8: explain analyze is your friend

    Dawid Kuroczko Guest

  4. #3

    Default Re: Lower case

    Tino Wildenhain wrote:
    > Hi,
    >
    > On Wed, 2005-01-26 at 12:01 +0000, Vladimir S. Petukhov wrote:
    >
    >>Hi!
    >>
    >>Sorry for my English..
    >>
    >>I want to do case-insensitivity search, like this
    >>... WHERE lower (column_name) LIKE lower (%value%);
    >>This work fine for English..
    >>But i need search for Russian words, lower() operator does not work with
    >>Russian (non-English) chars, but ORDER works fine...
    >>???????????????
    >>What's wrong?
    what about ILIKE operator?

    miso

    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [email]majordomo@postgresql.org[/email] so that your
    message can get through to the mailing list cleanly

    Michal Hlavac Guest

  5. #4

    Default Lower case

    Hi!

    Sorry for my English..

    I want to do case-insensitivity search, like this
    .... WHERE lower (column_name) LIKE lower (%value%);
    This work fine for English..
    But i need search for Russian words, lower() operator does not work with
    Russian (non-English) chars, but ORDER works fine...
    ???????????????
    What's wrong?

    --
    It is better to travel hopefully than to fly Continental.

    Now playing: 16 - ?????????.mp3
    AutoGenerated by fortune & xmms...

    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [email]majordomo@postgresql.org[/email] so that your
    message can get through to the mailing list cleanly

    Vladimir S. Petukhov Guest

  6. #5

    Default Re: Lower case

    Vladimir,

    there is pgsql-ru-general mailing list for russian speaking people

    Oleg
    On Wed, 26 Jan 2005, Vladimir S. Petukhov wrote:
    > Hi!
    >
    > Sorry for my English..
    >
    > I want to do case-insensitivity search, like this
    > ... WHERE lower (column_name) LIKE lower (%value%);
    > This work fine for English..
    > But i need search for Russian words, lower() operator does not work with
    > Russian (non-English) chars, but ORDER works fine...
    > ???????????????
    > What's wrong?
    >
    >
    Regards,
    Oleg
    __________________________________________________ ___________
    Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
    Sternberg Astronomical Institute, Moscow University (Russia)
    Internet: [email]oleg@sai.msu.su[/email], [url]http://www.sai.msu.su/~megera/[/url]
    phone: +007(095)939-16-83, +007(095)939-23-83

    ---------------------------(end of broadcast)---------------------------
    TIP 7: don't forget to increase your free space map settings

    Oleg Bartunov Guest

  7. #6

    Default Re: Lower case

    "Vladimir S. Petukhov" <vladimir@sycore.org> writes:
    > pg_controldata /var/pgsql/data
    > ...
    > LC_COLLATE: ru_RU
    > LC_CTYPE: ru_RU
    > bash-2.05b# psql -l
    > List of databases
    > Name | Owner | Encoding
    > -----------+----------+----------
    > testdb | postgres | UNICODE
    > And LIKE, ILIKE, ~ do not recognize upper/lower case..
    What character encoding is implied by those LC_ settings on your machine?
    If it's different from the database encoding (here utf8) these things
    won't actually work right.

    Also, before PG 8.0 upper/lower simply don't work on multibyte characters,
    which means you'd have to use a single-byte encoding to make it work
    (I think koi8 is single-byte but not sure).

    regards, tom lane

    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [email]majordomo@postgresql.org[/email] so that your
    message can get through to the mailing list cleanly

    Tom Lane Guest

  8. #7

    Default Re: Lower case

    pg_controldata /var/pgsql/data
    ....
    LC_COLLATE: ru_RU
    LC_CTYPE: ru_RU

    bash-2.05b# psql -l
    List of databases
    Name | Owner | Encoding
    -----------+----------+----------
    testdb | postgres | UNICODE

    And LIKE, ILIKE, ~ do not recognize upper/lower case..
    SELECT ... ORDER BY do something like that (in English Alphabet):
    a
    AAAA
    aaaaa
    Tast12
    tes
    test
    Test12
    test12

    ?:(

    On Wednesday 26 January 2005 10:15, Dawid Kuroczko wrote:
    > On Wed, 26 Jan 2005 12:01:49 +0000, Vladimir S. Petukhov
    >
    > <vladimir@sycore.org> wrote:
    > > I want to do case-insensitivity search, like this
    > > ... WHERE lower (column_name) LIKE lower (%value%);
    > > This work fine for English..
    > > But i need search for Russian words, lower() operator does not work with
    > > Russian (non-English) chars, but ORDER works fine...
    > > ???????????????
    > > What's wrong?
    >
    > Welllllll... Have you made an initdb with apropriate locale setting?
    > Try:
    > pg_controldata /var/lib/postgresql/data
    > (or wherever your db is)
    > You should see lines like:
    > LC_COLLATE: C
    > LC_CTYPE: C
    >
    > If you are using Unicode these should be ru_RU.UTF-8, if not then
    > ru_RU.KOI8-R or something. If you see 'C', or 'en_EN' or similar
    > then you won't have Russian lower/upper support (and ORDER BY
    > was just a "luck" :)).
    >
    > What you can do:
    > pg_dump the database, initdb --locale=ru_RU; pg_restore the database.
    >
    > Ohhh, and since you're at it, there is one thing which makes me
    > wonder and if you don't mind, please try it. :) This will require
    > plperlu language, so 'createlang plperlu' and that you use
    > unicode encoding in your database.
    >
    > Try if this function:
    > CREATE OR REPLACE FUNCTION lc(text) RETURNS text AS $$
    > utf8::decode($_[0]);
    > return lc($_[0]);
    > $$ LANGUAGE plperlu STABLE;
    >
    > Used as select lc(your_text_column) from your_table;
    > works for Russian alphabet. :) I'm just cuuurious! :)
    >
    > Regards,
    > Dawid
    --
    Скунсу и не надо быть красивым. Его и так все уважают.

    Now playing:
    AutoGenerated by fortune & xmms...

    ---------------------------(end of broadcast)---------------------------
    TIP 1: subscribe and unsubscribe commands go to [email]majordomo@postgresql.org[/email]

    Vladimir S. Petukhov Guest

  9. #8

    Default Re: Lower case

    On Wednesday 26 January 2005 20:01, you wrote:
    > "Vladimir S. Petukhov" <vladimir@sycore.org> writes:
    > > pg_controldata /var/pgsql/data
    > > ...
    > > LC_COLLATE: ru_RU
    > > LC_CTYPE: ru_RU
    > >
    > > bash-2.05b# psql -l
    > > List of databases
    > > Name | Owner | Encoding
    > > -----------+----------+----------
    > > testdb | postgres | UNICODE
    > >
    > > And LIKE, ILIKE, ~ do not recognize upper/lower case..
    >
    > What character encoding is implied by those LC_ settings on your machine?
    > If it's different from the database encoding (here utf8) these things
    > won't actually work right.
    LANG=ru_RU.koi8r
    LC_ALL=ru_RU.koi8r
    But how it act on lower/upper cases? Client use utf-8 encoding...
    >
    > Also, before PG 8.0 upper/lower simply don't work on multibyte characters,
    > which means you'd have to use a single-byte encoding to make it work
    > (I think koi8 is single-byte but not sure).
    >
    > regards, tom lane
    >
    > ---------------------------(end of broadcast)---------------------------
    > TIP 3: if posting/reading through Usenet, please send an appropriate
    > subscribe-nomail command to [email]majordomo@postgresql.org[/email] so that your
    > message can get through to the mailing list cleanly
    --
    You feel a whole lot more like you do now than you did when you used to.

    Now playing:
    AutoGenerated by fortune & xmms...

    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

    Vladimir S. Petukhov Guest

  10. #9

    Default Re: Lower case

    On Thu, 27 Jan 2005 00:16:14 +0000, Vladimir S. Petukhov
    <vladimir@sycore.org> wrote:
    > > > LC_COLLATE: ru_RU
    > > > LC_CTYPE: ru_RU
    > > > Name | Owner | Encoding
    > > > -----------+----------+----------
    > > > testdb | postgres | UNICODE
    > > > And LIKE, ILIKE, ~ do not recognize upper/lower case..
    > >
    > > What character encoding is implied by those LC_ settings on your machine?
    > > If it's different from the database encoding (here utf8) these things
    > > won't actually work right.
    > LANG=ru_RU.koi8r
    > LC_ALL=ru_RU.koi8r
    > But how it act on lower/upper cases? Client use utf-8 encoding...
    The client uses utf-8 encoding, so does server. Texts are stored
    using UTF-8. However when you call a lower() function from
    PostgreSQL it does more or less following:
    -- it retrieves text row from database. This text is in UTF-8 encoding.
    -- it calls strxfrm function upon this text.
    -- strxfrm function sees that current locale is ru_RU.koi8r
    -- strxfrm then takes utf-8 encoded text and treats it as koi8r
    -- strxfrm "skips over" characters it does not recognize (utf-8 chars)
    -- strxfrm returns transformed text
    -- PostgreSQL takes the resulting text, believing it is still in utf-8.
    In other words, probably only latin characters were subject to lower()
    functions, any "unknown" Russian UTF-8 characters were at best
    skipped.

    Please note that PostgreSQL does not do implicit utf8->koi8r->utf8
    conversion while calling function lower(). AFAIK it does not even
    know (or care) if current locale setting ("ru_RU") is for different
    encoding than current database's. It is DB Admin's duty to make
    sure cluster locale (done in initdb) is compatible with database
    encoding (done in createdb).

    Regards,
    Dawid

    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

    Dawid Kuroczko Guest

Posting Permissions

  • You may not post new threads
  • You may post replies
  • You may not post attachments
  • You may not edit your posts

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139