Professional Web Applications Themes

Interesting behaviour of : LIKE v + '%' - Microsoft SQL / MS SQL Server

Removed by Administrator...

  1. Moderated Post

    Default Re: Interesting behaviour of : LIKE v + '%'

    Removed by Administrator
    Andrew J. Kelly Guest
    Moderated Post

  2. Moderated Post

    Default Re: Interesting behaviour of : LIKE v + '%'

    Removed by Administrator
    Gavin John Fowler Guest
    Moderated Post

  3. #3

    Default Re: Interesting behaviour of : LIKE v + '%'

    Andrew,
    I'm afraid I must take back my prev post.
    I can te this behaviour on other servers - all of them SQL 2000
    boxes - the worst offender being ver .194 but also happens on .534 -
    again with the same value involving '9' - other values put in the list
    of inserts in my script do not cause the problem.

    Also I notice that SQL Server 7 editions (which do not show the
    problem) have a slightly different query plan.

    If I am right here, then this is quite an issue Andrew ....
    GJF
    Gavin John Fowler Guest

  4. #4

    Default Re: Interesting behaviour of : LIKE v + '%'

    OK - just for the record - its a Collation issue.
    SQL_Latin1_General_Cp1_CI_AS seems a much faster environment tha
    Latin1_General_CI_AS for the query ...
    Cannot account for difference in SPs level-performance on various SQL
    2K boxes - but not nearly as significant as the differences due to
    collation.
    SHould have thought of this first !
    Gavin John Fowler Guest

  5. #5

    Default Re: Interesting behaviour of : LIKE v + '%' _ Collations

    What are you using to measure the response time? I don't see any
    differences. On my machine they all execute immediately, almost untraceable
    regardless of the sort order or value. Can you provide a very streamlined
    example of what yousee as a fast query and one as a slow query with
    everything needed to ensure there is no mistake on how it is set up. If
    that is repeatable on yours then I will pass it by the guys at MS even if I
    can't get it to do the same thing. By the way, you can call MS PSS directly
    if you like and report it. If it turns out to be a bug they will refund
    your money. But the nice thing about calling is you get to work directly
    with someone to try and reproduce the issue.


    [url]http://support.microsoft.com/default.aspx?scid=fh;EN-US;sql[/url] SQL Support
    [url]http://www.mssqlserver.com/faq/general-pss.asp[/url] MS PSS
    --

    Andrew J. Kelly
    SQL Server MVP


    "Gavin John Fowler" <Gavin_Fowlerhotmail.com> wrote in message
    news:5aa05620.0307070353.15535927posting.google.c om...
    > Yep - that's it.
    > If you have time Andrew - try running my script and setting the
    > various Collations - SQL_Latin1_General_Cp1_CI_AS and
    > Latin1_General_CI_AS
    > on the columns (Name, C2 of my table T) .
    > You'll see that the first one (Sort Order = 52) is very fast for all
    > values of data but Sort Order = 72 is appalling for certain values .
    > Don't understand quite why this should be so (I mean the variability
    > according to data values within a collation) But it is.
    >
    > Anyway. Sorted Mate.
    >
    > Thanks for the interest.

    Andrew J. Kelly Guest

  6. #6

    Default Re: Interesting behaviour of : LIKE v + '%'

    OK, there are two cases, server default collation to this
    (1) of Latin1_General_CI_AS and (2) of SQL_Latin1_General_CP1_CI_AS


    My claim is
    (A) That a server with default collation(2) (sort order = 52) in my
    tests ALWAYS has better and more predictable performance that a
    default of (1).

    and

    (B) On a server with default collation of (1) then the performance for
    some parameter values is appalling. Unless the column collation seq
    for a queried table is changed to that of (2) -
    SQL_Latin1_General_CP1_CI_AS
    - in which case a query against those columns returns quiickly.

    Here is a script: (Note for VERY poor performance use a server with
    default collation of Latin1_General_CI_AS )

    Create either the first or second version of the table and then fill
    with the data using the WHILE loop.
    Test using the parameterized queries below.

    if exists (select * from dbo.sysobjects where id =
    object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[A]
    GO

    CREATE TABLE [dbo].[A] (
    [NAME] [varchar] (30) COLLATE Latin1_General_CI_AS NULL ,
    [C2] [varchar] (8) COLLATE Latin1_General_CI_AS NULL ,
    [C3] [int] NULL
    ) ON [PRIMARY]
    GO



    if exists (select * from dbo.sysobjects where id =
    object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[A]
    GO

    CREATE TABLE [dbo].[A] (
    [NAME] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [C2] [varchar] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [C3] [int] NULL
    ) ON [PRIMARY]
    GO

    -- fill it with data

    declare
    i int,
    j int,
    k int,
    code varchar(8),
    asc varchar(8)

    set k = 1
    set j = 1
    set i = 1

    while i < 250
    begin
    set asc = ''
    set code = '00'

    set k = convert(int, ceiling(6*rand()))
    select code = code + te(k, convert(int, ceiling(9*rand())))

    select j = convert(int, floor(15*rand()))
    set j = 15 + j
    select asc = 65 + convert(int, ceiling(26*rand()))
    select asc = te(char(asc), j)

    insert A (NAME, C2, C3)
    values(asc, code, i)

    set i = i + 1
    end

    insert A (NAME, C2, C3)
    values('abc', '009', 1)

    insert A (NAME, C2, C3)
    values('abc', '0099', 1)

    insert A (NAME, C2, C3)
    values('abc', '0090', 1)

    insert A (NAME, C2, C3)
    values('abc', '009990', 1)

    insert A (NAME, C2, C3)
    values('abc', '009999', 1)

    insert A (NAME, C2, C3)
    values('abc', '009099', 1)

    insert A (NAME, C2, C3)
    values('abc', '009', 1)

    insert A (NAME, C2, C3)
    values('abc', '0099999', 1)

    -- run this - you could use set statistics io on if you wish
    -- The 'slow query is one with param = '009'
    - The fast one uses '005' in this case

    set statistics time on

    declare
    v1 varchar(8)
    set v1 = '005'

    select top 1 C2, C3 from A
    where C2 like v1 + '%'
    order by len(C2) desc

    set v1 = '009'
    select top 1 C2, C3 from A
    where C2 like v1 + '%'
    order by len(C2) desc

    On a server with default collation = SQL_Latin1_General_CP1_CI_AS
    '005' = approx 1 ms - for default col collation
    '009' = approx 1 ms - for default col collation
    '005' = approx 1 ms - for for col collation = Latin1_General_CI_AS
    '009' = approx 25 ms - for for col collation = Latin1_General_CI_AS



    On a server with default collation = Latin1_General_CI_AS
    '005' = approx 1 ms - for default col collation
    '009' = approx 120 ms - for default col collation
    '005' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    '009' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS


    I see a large difference between the slow (ie '009' or in fact
    ANYTHING with 9 as first non-zero digit ) and '005' (or ANY other
    string) IFF the column collation sequence = Latin1_General_CI_AS

    I am further saying that this is particularly marked when the server
    collation default = Latin1_General_CI_AS

    What is going on here ?


    Andrew, I really appreciate your taking the time to follow me on this
    - its not something that is immediately apparent, I grant you .. but
    is important to me. You cannot imagine how ....


    Gavin
    Gavin John Fowler Guest

  7. #7

    Default Re: Interesting behaviour of : LIKE v + '%'

    OK, I will see if someone on the private ng can make some sense of this.
    But you still didn't answer as to how you were determining the response
    time. What was the tool you used to get these measurements?

    --

    Andrew J. Kelly
    SQL Server MVP


    "Gavin John Fowler" <Gavin_Fowlerhotmail.com> wrote in message
    news:5aa05620.0307070703.6b962691posting.google.c om...
    > OK, there are two cases, server default collation to this
    > (1) of Latin1_General_CI_AS and (2) of SQL_Latin1_General_CP1_CI_AS
    >
    >
    > My claim is
    > (A) That a server with default collation(2) (sort order = 52) in my
    > tests ALWAYS has better and more predictable performance that a
    > default of (1).
    >
    > and
    >
    > (B) On a server with default collation of (1) then the performance for
    > some parameter values is appalling. Unless the column collation seq
    > for a queried table is changed to that of (2) -
    > SQL_Latin1_General_CP1_CI_AS
    > - in which case a query against those columns returns quiickly.
    >
    > Here is a script: (Note for VERY poor performance use a server with
    > default collation of Latin1_General_CI_AS )
    >
    > Create either the first or second version of the table and then fill
    > with the data using the WHILE loop.
    > Test using the parameterized queries below.
    >
    > if exists (select * from dbo.sysobjects where id =
    > object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    > drop table [dbo].[A]
    > GO
    >
    > CREATE TABLE [dbo].[A] (
    > [NAME] [varchar] (30) COLLATE Latin1_General_CI_AS NULL ,
    > [C2] [varchar] (8) COLLATE Latin1_General_CI_AS NULL ,
    > [C3] [int] NULL
    > ) ON [PRIMARY]
    > GO
    >
    >
    >
    > if exists (select * from dbo.sysobjects where id =
    > object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    > drop table [dbo].[A]
    > GO
    >
    > CREATE TABLE [dbo].[A] (
    > [NAME] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    > [C2] [varchar] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    > [C3] [int] NULL
    > ) ON [PRIMARY]
    > GO
    >
    > -- fill it with data
    >
    > declare
    > i int,
    > j int,
    > k int,
    > code varchar(8),
    > asc varchar(8)
    >
    > set k = 1
    > set j = 1
    > set i = 1
    >
    > while i < 250
    > begin
    > set asc = ''
    > set code = '00'
    >
    > set k = convert(int, ceiling(6*rand()))
    > select code = code + te(k, convert(int, ceiling(9*rand())))
    >
    > select j = convert(int, floor(15*rand()))
    > set j = 15 + j
    > select asc = 65 + convert(int, ceiling(26*rand()))
    > select asc = te(char(asc), j)
    >
    > insert A (NAME, C2, C3)
    > values(asc, code, i)
    >
    > set i = i + 1
    > end
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0099', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0090', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009990', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009999', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009099', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0099999', 1)
    >
    > -- run this - you could use set statistics io on if you wish
    > -- The 'slow query is one with param = '009'
    > - The fast one uses '005' in this case
    >
    > set statistics time on
    >
    > declare
    > v1 varchar(8)
    > set v1 = '005'
    >
    > select top 1 C2, C3 from A
    > where C2 like v1 + '%'
    > order by len(C2) desc
    >
    > set v1 = '009'
    > select top 1 C2, C3 from A
    > where C2 like v1 + '%'
    > order by len(C2) desc
    >
    > On a server with default collation = SQL_Latin1_General_CP1_CI_AS
    > '005' = approx 1 ms - for default col collation
    > '009' = approx 1 ms - for default col collation
    > '005' = approx 1 ms - for for col collation = Latin1_General_CI_AS
    > '009' = approx 25 ms - for for col collation = Latin1_General_CI_AS
    >
    >
    >
    > On a server with default collation = Latin1_General_CI_AS
    > '005' = approx 1 ms - for default col collation
    > '009' = approx 120 ms - for default col collation
    > '005' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    > '009' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    >
    >
    > I see a large difference between the slow (ie '009' or in fact
    > ANYTHING with 9 as first non-zero digit ) and '005' (or ANY other
    > string) IFF the column collation sequence = Latin1_General_CI_AS
    >
    > I am further saying that this is particularly marked when the server
    > collation default = Latin1_General_CI_AS
    >
    > What is going on here ?
    >
    >
    > Andrew, I really appreciate your taking the time to follow me on this
    > - its not something that is immediately apparent, I grant you .. but
    > is important to me. You cannot imagine how ....
    >
    >
    > Gavin

    Andrew J. Kelly Guest

  8. #8

    Default Re: Interesting behaviour of : LIKE v + '%'

    Ran some additional SQL statements and I do not get the same behavior as
    you previously posted.
    First: To eliminate the data values as a variable, created 2 tables and
    then populated with the same data based on your example data generation
    SQL. See the last script.

    Second: Ran the provided SQL batch with adjustments for now having 2
    tables and see the same cpu time difference you indicated.

    Third: Changed the sequence of the SQL statements within the batch and
    noticed that occasionally , the 3rd statement in the batch would have
    the 20 ms CPU times but on other runs, cpu times would be identical (0
    ms) for all.

    Could this be a bug in how "statistics time" reports rather than a
    problem with the execution of the SQL ?

    Running SQL Server 2000 - 8.00.760

    -- Batch
    declare T_CI_AS varchar(8)
    , CP1_CI_AS varchar(8)
    set T_CI_AS = '005'
    select top 1 T_CI_AS , C2, C3 from T_CI_AS
    where C2 like T_CI_AS + '%'
    order by len(C2) desc

    set T_CI_AS = '009'
    select top 1 T_CI_AS , C2, C3 from T_CI_AS
    where C2 like T_CI_AS + '%'
    order by len(C2) desc

    set CP1_CI_AS = '005'
    select top 1 CP1_CI_AS , C2, C3 from T_CP1_CI_AS
    where C2 like CP1_CI_AS + '%'
    order by len(C2) desc

    set CP1_CI_AS = '009'
    select top 1 CP1_CI_AS , C2, C3 from T_CP1_CI_AS
    where C2 like CP1_CI_AS + '%'
    order by len(C2) desc
    go
    -- Table creation and data population
    if exists (select * from dbo.sysobjects where id =
    object_id(N'[dbo].[T_CI_AS]') and OBJECTPROPERTY(id, N'IsUserTable') =
    1)
    drop table [dbo].[T_CI_AS]
    GO
    CREATE TABLE [dbo].[T_CI_AS] (
    [NAME] [varchar] (30) COLLATE Latin1_General_CI_AS NULL ,
    [C2] [varchar] (8) COLLATE Latin1_General_CI_AS NULL ,
    [C3] [int] NULL
    ) ON [PRIMARY]
    GO
    insert into T_CI_AS select * from A
    go
    if exists (select * from dbo.sysobjects where id =
    object_id(N'[dbo].[T_CP1_CI_AS]') and OBJECTPROPERTY(id, N'IsUserTable')
    = 1)
    drop table [dbo].T_CP1_CI_AS
    GO
    CREATE TABLE [dbo].T_CP1_CI_AS (
    [NAME] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [C2] [varchar] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [C3] [int] NULL
    ) ON [PRIMARY]
    GO
    insert into T_CP1_CI_AS select * from A
    go


    Carl Federl
    Please post DDL (create table) with datatypes, primary and foreign keys.

    *** Sent via Developersdex [url]http://www.developersdex.com[/url] ***
    Don't just participate in USENET...get rewarded for it!
    Carl Federl Guest

  9. #9

    Default Re: Interesting behaviour of : LIKE v + '%'

    Hi Gavin,

    I ran your script on SQL-Server 2000 SP3 (8.00.760) but could not
    reproduce your results. The performance was very close. So I increased
    the number of rows to 250000. The results that I got were:

    '005' query
    -----------
    db Latin1_General_CI_AS, table Latin1_General_CI_AS, 2002 ms
    db Latin1_General_CI_AS, table SQL_Latin1_General_CP1_CI_AS, 1732 ms
    db SQL_Latin1_General_CP1_CI_AS, table Latin1_General_CI_AS, 2003 ms
    db SQL_Latin1_General_CP1_CI_AS, table SQL_Latin1_General_CP1_CI_AS,
    1732 ms

    '009' query
    -----------
    db Latin1_General_CI_AS, table Latin1_General_CI_AS, 1713 ms
    db Latin1_General_CI_AS, table SQL_Latin1_General_CP1_CI_AS, 1392 ms
    db SQL_Latin1_General_CP1_CI_AS, table Latin1_General_CI_AS, 1742 ms
    db SQL_Latin1_General_CP1_CI_AS, table SQL_Latin1_General_CP1_CI_AS,
    1402 ms

    The only conclusion I can draw based on this (non-representative) test
    is that SQL_Latin1_General_CP1_CI_AS collation is slightly faster than
    Latin1_General_CI_AS. The database collation does not seem to matter.

    Hope this helps,
    Gert-Jan


    Gavin John Fowler wrote:
    >
    > OK, there are two cases, server default collation to this
    > (1) of Latin1_General_CI_AS and (2) of SQL_Latin1_General_CP1_CI_AS
    >
    > My claim is
    > (A) That a server with default collation(2) (sort order = 52) in my
    > tests ALWAYS has better and more predictable performance that a
    > default of (1).
    >
    > and
    >
    > (B) On a server with default collation of (1) then the performance for
    > some parameter values is appalling. Unless the column collation seq
    > for a queried table is changed to that of (2) -
    > SQL_Latin1_General_CP1_CI_AS
    > - in which case a query against those columns returns quiickly.
    >
    > Here is a script: (Note for VERY poor performance use a server with
    > default collation of Latin1_General_CI_AS )
    >
    > Create either the first or second version of the table and then fill
    > with the data using the WHILE loop.
    > Test using the parameterized queries below.
    >
    > if exists (select * from dbo.sysobjects where id =
    > object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    > drop table [dbo].[A]
    > GO
    >
    > CREATE TABLE [dbo].[A] (
    > [NAME] [varchar] (30) COLLATE Latin1_General_CI_AS NULL ,
    > [C2] [varchar] (8) COLLATE Latin1_General_CI_AS NULL ,
    > [C3] [int] NULL
    > ) ON [PRIMARY]
    > GO
    >
    > if exists (select * from dbo.sysobjects where id =
    > object_id(N'[dbo].[A]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    > drop table [dbo].[A]
    > GO
    >
    > CREATE TABLE [dbo].[A] (
    > [NAME] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    > [C2] [varchar] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    > [C3] [int] NULL
    > ) ON [PRIMARY]
    > GO
    >
    > -- fill it with data
    >
    > declare
    > i int,
    > j int,
    > k int,
    > code varchar(8),
    > asc varchar(8)
    >
    > set k = 1
    > set j = 1
    > set i = 1
    >
    > while i < 250
    > begin
    > set asc = ''
    > set code = '00'
    >
    > set k = convert(int, ceiling(6*rand()))
    > select code = code + te(k, convert(int, ceiling(9*rand())))
    >
    > select j = convert(int, floor(15*rand()))
    > set j = 15 + j
    > select asc = 65 + convert(int, ceiling(26*rand()))
    > select asc = te(char(asc), j)
    >
    > insert A (NAME, C2, C3)
    > values(asc, code, i)
    >
    > set i = i + 1
    > end
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0099', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0090', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009990', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009999', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009099', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '009', 1)
    >
    > insert A (NAME, C2, C3)
    > values('abc', '0099999', 1)
    >
    > -- run this - you could use set statistics io on if you wish
    > -- The 'slow query is one with param = '009'
    > - The fast one uses '005' in this case
    >
    > set statistics time on
    >
    > declare
    > v1 varchar(8)
    > set v1 = '005'
    >
    > select top 1 C2, C3 from A
    > where C2 like v1 + '%'
    > order by len(C2) desc
    >
    > set v1 = '009'
    > select top 1 C2, C3 from A
    > where C2 like v1 + '%'
    > order by len(C2) desc
    >
    > On a server with default collation = SQL_Latin1_General_CP1_CI_AS
    > '005' = approx 1 ms - for default col collation
    > '009' = approx 1 ms - for default col collation
    > '005' = approx 1 ms - for for col collation = Latin1_General_CI_AS
    > '009' = approx 25 ms - for for col collation = Latin1_General_CI_AS
    >
    > On a server with default collation = Latin1_General_CI_AS
    > '005' = approx 1 ms - for default col collation
    > '009' = approx 120 ms - for default col collation
    > '005' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    > '009' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    >
    > I see a large difference between the slow (ie '009' or in fact
    > ANYTHING with 9 as first non-zero digit ) and '005' (or ANY other
    > string) IFF the column collation sequence = Latin1_General_CI_AS
    >
    > I am further saying that this is particularly marked when the server
    > collation default = Latin1_General_CI_AS
    >
    > What is going on here ?
    >
    > Andrew, I really appreciate your taking the time to follow me on this
    > - its not something that is immediately apparent, I grant you .. but
    > is important to me. You cannot imagine how ....
    >
    > Gavin
    Gert-Jan Strik Guest

  10. #10

    Default Re: Interesting behaviour of : LIKE v + '%'

    OK - many thanks to all - especially Andrew for taking the time to get
    this started.

    I have, of course, measured times with current_timstamp (I understood
    it was accurate to 1 ms ?) an got identical results.

    Also, I too considered a bug in - God knows - something in the way
    times are being reported.

    This SQL is normally part of a much larger stmt tucked away in a proc
    - took a fair bit of work to isolate the offender - but I stand by
    what I see and measure (in 2 differenmt ways : set statistics time on
    (Andrew) and select current_timestamp fore and aft. In both ways I
    get same results)

    As to the sample being small - it is, but you must understand that it
    is the kind of thing which is called fro a C app via ODBC repeatedly
    (MANY, many times per second with different parameter sets - ie the
    '005' or '009' values) on small tables (a few hundre or thousands of
    rows) - so your scale up (Gert) is not an accurate representation.
    (This is my fault as I have not been able to go into too much detail
    here.)

    I see what I see and am well capable of measuring it :) Believe me
    .....

    I have run it on different servers (diffrent major & minor versions)
    with different default collations - and differing results

    Other thoughts were that it is a db restored from 6.5 or 7 that may be
    the issue - hence I created tables in tmpdb end even on entirely new
    devices ... still the same.

    Fragmentation ? Again, no.

    I exepect millisecond responses in this case - and certainly do not
    expect initial condition sensitivity of this order.

    I am puzzled - but seem to be in a minority of one ... ah well, at
    least I can work around it by altering the coll seq ..

    Again, many thanks. I too must leave this before I get sidetracked
    any more in this fascinatng issue ..

    Gavin




    Erland Sommarskog <sommaralgonet.se> wrote in message news:<Xns93B2B7ED37F2Yazorman127.0.0.1>...
    > Gavin John Fowler (Gavin_Fowlerhotmail.com) writes:
    > > On a server with default collation = SQL_Latin1_General_CP1_CI_AS
    > > '005' = approx 1 ms - for default col collation
    > > '009' = approx 1 ms - for default col collation
    > > '005' = approx 1 ms - for for col collation = Latin1_General_CI_AS
    > > '009' = approx 25 ms - for for col collation = Latin1_General_CI_AS
    > >
    > >
    > >
    > > On a server with default collation = Latin1_General_CI_AS
    > > '005' = approx 1 ms - for default col collation
    > > '009' = approx 120 ms - for default col collation
    > > '005' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    > > '009' = approx 1 ms - for col collation = SQL_Latin1_General_CP1_CI_AS
    >
    > These timings are too small to be conclusive. I don't know about
    > SET STATISTICS TIME, but regular datetime has a resolution of
    > 3.33 ms. When using getdate() for performance tests, I usually
    > regard anything below 50 ms as white noise. So I would suggest that
    > you add at least 100 times more data to your testtable, maybe
    > 1000 times more.
    >
    > And, by the way, getdate() is probably more accurate than SET STATISTICS
    > TIME:
    >
    > SELECT start = getdate()
    > -- Test to do goes here
    > SELECT tookms = datediff(ms, start, getdate())
    > PRINT 'It took ' + ltrim(str(tookms)) + ' ms.'
    Gavin John Fowler Guest

  11. #11

    Default Re: Interesting behaviour of : LIKE v + '%'

    In addition to minding what the collation of the variable (and/or)
    expression
    used in the LIKE (and/or vs the collation of the column being liked),
    you need to mind, "For the current arranagement of the data (
    in the table and any indexes) vs
    the actual query ran (including the value(s) it is working with),
    did the query happen to get to stop early real-soon or real-late?"
    To make that even more fun, include/exclude the ORDER BY too.

    Bye,
    Delbert Glass


    Delbert Glass Guest

  12. #12

    Default Re: Interesting behaviour of : LIKE v + '%'

    Gavin John Fowler (Gavin_Fowlerhotmail.com) writes:
    > As to the sample being small - it is, but you must understand that it
    > is the kind of thing which is called fro a C app via ODBC repeatedly
    > (MANY, many times per second with different parameter sets - ie the
    > '005' or '009' values) on small tables (a few hundre or thousands of
    > rows) - so your scale up (Gert) is not an accurate representation.
    > (This is my fault as I have not been able to go into too much detail
    > here.)
    I should add to my previous post that what is important is to get timings
    that are at least some 100 ms. This can be achieved by beefing up the data
    size, but also by running the query you want to test more than once
    between two readings of getdate().

    --
    Erland Sommarskog, SQL Server MVP, [email]sommaralgonet.se[/email]

    Books Online for SQL Server SP3 at
    [url]http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp[/url]
    Erland Sommarskog Guest

  13. #13

    Default Re: Interesting behaviour of : LIKE v + '%'

    >Don't understand 'real-soon' or 'real-late; I'm afraid .. :)

    Until you do, I am afraid you will not be able to
    fully understand what you are seeing :-(
    This is only one component of what is going on.

    Let's dream a simple example.

    Say we have a PhoneBook table.
    To keep things real simple, let's also say it is not index.

    We then observe:
    SELECT TOP 1 LastName FROM PhoneBook WHERE LastName LIKE 'A%' -- 100000 ms
    SELECT TOP 1 LastName FROM PhoneBook WHERE LastName LIKE 'N%' -- 0 ms
    SELECT TOP 1 LastName FROM PhoneBook WHERE LastName LIKE 'Z%' -- 100000 ms

    This is easily understood by realizing:
    The queries do a table scan and can stop early.

    and by further realizing:
    One of the first rows encountered by the table scan
    happens to begins with an N;
    thus, the second query gets to stop real-soon.
    Similarly, the table scan happens to scan many
    rows before encountering a last name that starts
    with A. Likewise for Z. Ie the query did not get
    to stop real-soon.
    In fact, the query might not get to stop earlier at all.
    For example, say, the phonebook has no last name starting with A.
    (In which case, the alert reader will realize the phone book
    as a small portion of last names starting with Z.)

    -----

    Having/not having an ORDER BY
    (including what specifically the order is to be)
    can have an impact on how much work a query
    must do before it knows it can skip doing further work
    for various task it must to for fullfill the request.
    There are also additional effect(s) on what work
    and how much work is done to fullfill the request.

    Study these:

    Including an ORDER BY will
    ( ) always, (x) sometimes, ( ) never
    make a query slower.

    Excluding an ORDER BY will
    ( ) always, (x) sometimes, ( ) never
    make a query faster.

    until you come to realize what they say.
    TIP: switch slower and faster.

    -----

    For all queries, the computer does
    an uncertain amount a work fulfilling the request.
    This needs to be taken into consideration
    whenever comparing one timing measurement to another
    regardless of how precise and realiable the measurement --
    the measurement is a measurement of the work done
    fullfilling a particular run of the query
    not a measurement of the query.

    -----

    The COLLATION of each side of LIKE
    needs to be taken into consideration.

    There is a good chance that the default collation
    has an effect because it use being used
    because you have not specifically specified a collation
    for each side of LIKE.
    See BOL for info about using COLLATION with DECLARE and CAST.

    However there is a chance that even when
    you do specifically controll the collation of each side
    of LIKE the default collation still has an effect
    so don't drop the default collation ball to quickly.

    -----

    Just because you did not create an index yourself,
    does not mean an index (or other "powerfull-tool")
    is not being used to fullfill your request.

    -----

    Mixing and matching all the above together
    gives the possible bug a lot of foliage/surprises
    to hind behind and distract you.

    Realize, you are like the phone book queries giving above.
    If you happen across the bug, you get to stop real-soon.
    If not, you get to do a lot of work
    and might even end up empty handed.

    Bye,
    Delbert Glass



    Delbert Glass Guest

  14. #14

    Default Re: Interesting behaviour of : LIKE v + '%'

    "Delbert Glass" <delbertnoincoming.com> wrote in message news:<OCJUwjkRDHA.2188TK2MSFTNGP10.phx.gbl>...
    > >Don't understand 'real-soon' or 'real-late; I'm afraid .. :)
    >
    > Until you do, I am afraid you will not be able to
    > fully understand what you are seeing :-(
    > This is only one component of what is going on.
    >
    > ----- SNipped Only for space and clarity ----------

    I thank you for this and understand what you say here (BTW did not
    understand the terminolgy 'real-soon/late', I get the principle)

    I have considered very carefully what you say and believe that the
    only issue I cannot counter is the one of index creation by a
    'powerful-tool' (deep inside SQL Server itself ?) as I do not have
    knowledge of its internals.
    > Realize, you are like the phone book queries giving above.
    > If you happen across the bug, you get to stop real-soon.
    > If not, you get to do a lot of work
    > and might even end up empty handed.
    Now this is _so_ true !
    Happens I did stumble across it, found out a way round it and moved on
    to other things
    >
    > Bye,
    > Delbert Glass
    Many thanks,

    G
    Gavin John Fowler Guest

  15. #15

    Default Re: Interesting behaviour of : LIKE v + '%'

    "Delbert Glass" <delbertnoincoming.com> wrote in message news:<OCJUwjkRDHA.2188TK2MSFTNGP10.phx.gbl>...
    >
    > The COLLATION of each side of LIKE
    > needs to be taken into consideration.
    >
    > There is a good chance that the default collation
    > has an effect because it use being used
    > because you have not specifically specified a collation
    > for each side of LIKE.
    > See BOL for info about using COLLATION with DECLARE and CAST.
    >
    > However there is a chance that even when
    > you do specifically controll the collation of each side
    > of LIKE the default collation still has an effect
    > so don't drop the default collation ball to quickly.
    >
    Did follow this up to no avail I'm afraid.
    Explicit CASTing to same COLL type on both sides of LIKE has no effect
    on the outcome of my query. If I cast to CP1 - it works a treat -
    Always !

    However .. I can live with it

    > -----
    >
    > Bye,
    > Delbert Glass
    Gavin John Fowler Guest

  16. #16

    Default Re: Interesting behaviour of : LIKE v + '%'

    [posted and mailed]

    Here is a followup to your issue.

    We have discussed this in our internal MVP forum, and we have now confirmed
    that there is an issue with 009% and Windows collations. However, we do not
    yet whether this is a bug or just the way things are. (Technically it is not
    a bug, since you get the correct result. But it could be a bug-lookalike
    that causes the delay.)

    What we have found is that amount of data in the table does not have any
    significance. The behaviour is repeatable with empty tables. Neither does
    the database collation has anything to do with it. As for the server
    collation, we have not fully verified whether it matters or not, but I run
    with Finnish_Swedish_CS_AS and Steve Kass who also looked into the case,
    usually runs with SQL_Latin1_General_CP1_CI_AS.

    Here is a repro. The slower machine you run it on, the better. :-)

    create table T (
    c1 varchar(100) collate Latin1_General_CI_AS,
    c2 varchar(100) collate Latin1_General_CS_AS,
    c3 varchar(100) collate Latin1_General_BIN,
    c4 varchar(100) collate Finnish_Swedish_CI_AS,
    c5 varchar(100) collate SQL_Latin1_General_CP1_CI_AS,
    c6 varchar(100) collate SQL_SwedishPhone_Pref_CP1_CI_AS
    )

    create nonclustered index T_nci1 on T(c1)
    create nonclustered index T_nci2 on T(c2)
    create nonclustered index T_nci3 on T(c3)
    create nonclustered index T_nci4 on T(c4)
    create nonclustered index T_nci5 on T(c5)
    create nonclustered index T_nci6 on T(c6)
    go

    create proc P1 p varchar(10) as select c1 from T where c1 LIKE p+'%'
    go
    create proc P2 p varchar(10) as select c2 from T where c2 LIKE p+'%'
    go
    create proc P3 p varchar(10) as select c3 from T where c3 LIKE p+'%'
    go
    create proc P4 p varchar(10) as select c4 from T where c4 LIKE p+'%'
    go
    create proc P5 p varchar(10) as select c5 from T where c5 LIKE p+'%'
    go
    create proc P6 p varchar(10) as select c6 from T where c6 LIKE p+'%'
    go
    set nocount on
    go
    -- Run this batch several times to see pattern. The SQL Collations
    -- usually takes "no time", same for the Windows binary collations. The
    -- non-binary collations very rarely takes 0 ms. Change 9 to 5 and all
    -- collations take "no time".
    SELECT getdate()
    exec P1 '000EN9'
    select getdate()
    exec P2 '000EN9'
    select getdate()
    exec P3 '000EN9'
    select getdate()
    exec P4 '000EN9'
    select getdate()
    exec P5 '000EN9'
    select getdate()
    exec P6 '000EN9'
    select getdate()
    go
    drop table T
    drop proc P1, P2, P3, P4, P5, P6
    go


    --
    Erland Sommarskog, SQL Server MVP, [email]sommaralgonet.se[/email]

    Books Online for SQL Server SP3 at
    [url]http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp[/url]
    Erland Sommarskog Guest

  17. #17

    Default Re: Interesting behaviour of : LIKE v + '%'

    Perhaps you should take a video camera to the next meeting.


    In addition to going something like:
    For any non-binary Windows collation ....

    It goes something like this:

    Depending on the collation of
    the pattern for LIKE (not PATINDEX)

    a)
    whenever '9' is allowed by inclusion but not exclusion
    in any range prior to the first wildcard (% or _)
    with a special exclusion of the case of when
    the only such '9' allowed is allowed by a range
    that only allows '9' as the first character
    of a match to the pattern.

    or (including and)

    b)
    whenever '9' is one of the valid character(s)
    as the last character in the match
    prior to the first wildcard (% or _) in the pattern
    with a special exclusion of the case of when
    that '9' is allowed by exclusion.

    some "extra resources" are used.


    Of course that might need to be tweaked
    to account for different behaviors
    depending on what portion of the character set
    is being included/excluded by a range.

    (Or additional triggering further down the pattern, etc.)

    Realize: You can make the useage of the "extra resources"
    occur at run-time or at p and compile time.

    Wild guess: Part of an easter egg?

    I wonder: Would such an easter egg care
    if it ran during p and compile time vs run-time?

    Ponder: If an easter egg runs during compile time,
    does it have more security rights, etc then if it
    runs at run-time? Even it does not,
    would it be more likely to crash the database server?

    Note: There is no need to use a table.

    Just to give everyone a kick start,
    here is a query that qualifies twice
    with the useage occuring at p and compile time
    and is suggestive of patterns to use.

    select 666
    where
    '' collate Latin1_General_CI_AS
    like
    '[9,1]_' collate Latin1_General_CI_AS

    Bye,
    Delbert Glass


    Delbert Glass Guest

  18. #18

    Default Re: Interesting behaviour of : LIKE v + '%'

    Thanks.


    "Delbert Glass" <delbertnoincoming.com> wrote in message news:<ehEnyx0RDHA.1920TK2MSFTNGP11.phx.gbl>...
    > Perhaps you should take a video camera to the next meeting.
    >
    >
    > In addition to going something like:
    > For any non-binary Windows collation ....
    >
    > It goes something like this:
    >
    > Depending on the collation of
    > the pattern for LIKE (not PATINDEX)
    >
    > a)
    > whenever '9' is allowed by inclusion but not exclusion
    > in any range prior to the first wildcard (% or _)
    > with a special exclusion of the case of when
    > the only such '9' allowed is allowed by a range
    > that only allows '9' as the first character
    > of a match to the pattern.
    >
    > or (including and)
    >
    > b)
    > whenever '9' is one of the valid character(s)
    > as the last character in the match
    > prior to the first wildcard (% or _) in the pattern
    > with a special exclusion of the case of when
    > that '9' is allowed by exclusion.
    >
    > some "extra resources" are used.
    >
    >
    > Of course that might need to be tweaked
    > to account for different behaviors
    > depending on what portion of the character set
    > is being included/excluded by a range.
    >
    > (Or additional triggering further down the pattern, etc.)
    >
    > Realize: You can make the useage of the "extra resources"
    > occur at run-time or at p and compile time.
    >
    > Wild guess: Part of an easter egg?
    >
    > I wonder: Would such an easter egg care
    > if it ran during p and compile time vs run-time?
    >
    > Ponder: If an easter egg runs during compile time,
    > does it have more security rights, etc then if it
    > runs at run-time? Even it does not,
    > would it be more likely to crash the database server?
    >
    > Note: There is no need to use a table.
    >
    > Just to give everyone a kick start,
    > here is a query that qualifies twice
    > with the useage occuring at p and compile time
    > and is suggestive of patterns to use.
    >
    > select 666
    > where
    > '' collate Latin1_General_CI_AS
    > like
    > '[9,1]_' collate Latin1_General_CI_AS
    >
    > Bye,
    > Delbert Glass
    Gavin John Fowler Guest

  19. #19

    Default Re: Interesting behaviour of : LIKE v + '%'

    Delbert Glass (delbertnoincoming.com) writes:
    > I wonder: Would such an easter egg care
    > if it ran during p and compile time vs run-time?
    It is not that the 9 is completely unique. If you run the my repro
    with the 9 replace by Z, you see a similar result. Except that
    Finnish_Swedish_CI_AS runs in no time. If you instead put in in
    place of 9/Z, the only one that takes time is Finnish_Swedish_CI_AS.
    ( is the last letter of the Swedish and Finnish alphabets.)

    So the problem has to do with that the character in question being at
    the end of the range. Apparently the Windows collations have separate
    ranges for digits and charactets.
    > select 666
    > where
    > '' collate Latin1_General_CI_AS
    > like
    > '[9,1]_' collate Latin1_General_CI_AS
    I don't know what you intended with this, but when I ran this with SELECT
    getdate() before and after, I always got the same time.

    --
    Erland Sommarskog, SQL Server MVP, [email]sommaralgonet.se[/email]

    Books Online for SQL Server SP3 at
    [url]http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp[/url]
    Erland Sommarskog Guest

  20. #20

    Default Re: Interesting behaviour of : LIKE v + '%'

    True, but realize that's
    because it's using the same extra time.

    Say, keep clearing out your proc cache
    Also switch the 9 back and forth to 8.

    For me the 9 takes about 50ms
    the the 8 takes about 0ms.

    Instead of clearing the proc cache,
    you can change the 1 to a 2, 3, 4, 5, 6, 7, 8.
    since that will also require creating a new plan
    the first time for each pattern
    and avoids switching the pattern to something
    that does not use the extra time.

    A question araises:
    Are the extra resources being used when
    checking for a match or when processing the
    to pattern to create a pattern matching state machine.
    Not wanting to hog all the fun, I'll let someoneelse
    test which (might be both).

    Too roughly something along the lines of the following
    could be happing.

    if not bAlreadyUsedExtraResources then
    if ] before first first (% _ or end_of_pattern) then
    if SpecailCollation and This[]AllowsSpecialCharacter then
    UseExtraResources
    bAlreadyUsedExtraResources = true
    end
    end
    end
    if not bAlreadyUsedExtraResources then
    if first (% _ or end_of_pattern)
    if SpecailCollation and PreviousStateAllowsSpecialCharacter then
    UseExtraResources
    bAlreadyUsedExtraResources = true
    end
    end
    end

    where depending on which collation,
    special characters are 9, z, Z, etc
    or sometimes no special characters.

    The above is obviously too rough in numerious ways.
    For example, it makes no attempt to account
    for all of the cases I previously mentioned.
    Like:

    select 666
    where
    '' collate Latin1_General_CI_AS
    like
    '[9,^r]_' collate Latin1_General_CI_AS

    taking about 0ms for me where as '[9,1_]' takes about 50ms.


    Of course it might just be a bug, memory allocation, or something
    that happens to occur under such conditions.
    Perhaps a memory allocation during an exception due to
    accessing past the end of ranges
    like 0 thru 9, a thru z, A thru Z, etc.
    But then why not after the fist wildchar?

    -----

    Here is an interesting example.

    select 666
    where
    '' collate Latin1_General_CI_AS
    like
    '[9,1]_' collate Latin1_General_CI_AS
    and
    '' collate Latin1_General_CI_AS
    like
    '[9,1]_' collate Latin1_General_CI_AS

    For me it takes about 100ms.
    Recall including the condition only once
    takes about 50ms.

    -----

    For me replacing the 9 and the collation in:

    select * from test
    where
    '' collate Latin1_General_CI_AS
    like
    '[9,1_]' collate Latin1_General_CI_AS

    gives me the following behavior:

    Latin1_General_CI_AS
    9, z, Z, use extra resources.
    does not.

    Finnish_Swedish_CI_AS
    9, use extra resources.
    z Z do not.

    -----

    Isn't it the vendors turn?

    Bye,
    Delbert Glass

    "Erland Sommarskog" <se> wrote in message
    news:0.0.1... 
    >
    > It is not that the 9 is completely unique. If you run the my repro
    > with the 9 replace by Z, you see a similar result. Except that
    > Finnish_Swedish_CI_AS runs in no time. If you instead put in in
    > place of 9/Z, the only one that takes time is Finnish_Swedish_CI_AS.
    > ( is the last letter of the Swedish and Finnish alphabets.)
    >
    > So the problem has to do with that the character in question being at
    > the end of the range. Apparently the Windows collations have separate
    > ranges for digits and charactets.

    >
    > I don't know what you intended with this, but when I ran this with SELECT
    > getdate() before and after, I always got the same time.
    >
    > --
    > Erland Sommarskog, SQL Server MVP, se
    >
    > Books Online for SQL Server SP3 at
    > http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp[/ref]


    Delbert Guest

Similar Threads

  1. interesting preloader
    By morphC2004 in forum Macromedia Flash
    Replies: 0
    Last Post: February 12th, 10:52 PM
  2. [PHP] Very interesting.....
    By Jay Blanchard in forum PHP Development
    Replies: 3
    Last Post: September 5th, 07:09 PM
  3. Very interesting.....
    By Scott Fletcher in forum PHP Development
    Replies: 0
    Last Post: September 5th, 06:21 PM
  4. interesting
    By Ursula Baranowski in forum Photography
    Replies: 0
    Last Post: July 10th, 05:57 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139