Professional Web Applications Themes

Favourite topic: primary keys (natural vs surrogate/artificial) - Microsoft SQL / MS SQL Server

Another issue arises under my favourite topic of discussion! What solutions do people have for creating a "natural" key that is both meaningful, and unique. Consider a situation where people create "things" that will be stored as unique entities, but where the user themselves have not predefined a natural, unique identifier. In other words, it is left up do the database to create an identifier (heck, every "natural" key must have been created by someone or something at some point, right?) Well, a natural key might be defined at least partly by who is submitting it and when. As a ...

  1. #1

    Default Favourite topic: primary keys (natural vs surrogate/artificial)

    Another issue arises under my favourite topic of discussion!

    What solutions do people have for creating a "natural" key that is both
    meaningful, and unique.
    Consider a situation where people create "things" that will be stored as
    unique entities, but where the user themselves have not predefined a
    natural, unique identifier. In other words, it is left up do the database to
    create an identifier (heck, every "natural" key must have been created by
    someone or something at some point, right?)

    Well, a natural key might be defined at least partly by who is submitting it
    and when. As a quick example, a table created in 2003 might be keyed as
    TBL03x

    The idea is to have as much of the key as possible be meaningful, the x
    is there merely to guarantee uniqueness.
    Is there any way to efficiently and safely implement creating the unique
    (x) part of the key?
    By "efficient", I mean without causing serialized inserts. Is an instead of
    insert trigger the best solution (using either a temp table with an identity
    and using that, or using the max+1 method (which I would rather avoid...
    like the plague)? Moreover, is there any _standard_ solution? (Are IOI
    triggers sql server specific?)
    By "safe", I mean that relying on serial inserts is surely bad. OK, we could
    have a stored proc to insert records one at a time, acting as a middle tier
    key generator. But what happens when Mr dbo comes along in a year's time and
    performs a basic insert statement against the table from some other table?
    Or more generally, for some reason in the future a set based insert
    operation is required?

    As an almost total tangent, I have seen a few posts which recommend using a
    datetime key... eg this from JC...

    CREATE TABLE Foobar
    (trans_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    PRIMARY KEY,
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL,
    ...);

    Surely this fails Joe's own test regarding set based inserts as every record
    in the inserted set will get the same timestamp.


    Donald Halloran Guest

  2. #2

    Default Re: Favourite topic: primary keys (natural vs surrogate/artificial)

    Yes this topic causes wars<g> but here is my take on your question. If you
    have to work that hard to find a natural key then it usually is not the best
    thing for this. Going to extremes with natural keys can cause the compound
    key to be very large. The larger it gets the more bersome it is to work
    with. Why propagate a 30,40 or more character key to many child tables when
    you can just use a 4 byte INT instead? That large value is made up of many
    smaller columns that are typically of no use logically in the children
    tables. It gets even worse if you want to cluster the Key whether it be the
    Primary on the Foreign ones.

    --

    Andrew J. Kelly
    SQL Server MVP


    "Donald Halloran" <archonATTquantumfireDOTTcom> wrote in message
    news:3f0418ffduster.adelaide.on.net...
    > Another issue arises under my favourite topic of discussion!
    >
    > What solutions do people have for creating a "natural" key that is both
    > meaningful, and unique.
    > Consider a situation where people create "things" that will be stored as
    > unique entities, but where the user themselves have not predefined a
    > natural, unique identifier. In other words, it is left up do the database
    to
    > create an identifier (heck, every "natural" key must have been created by
    > someone or something at some point, right?)
    >
    > Well, a natural key might be defined at least partly by who is submitting
    it
    > and when. As a quick example, a table created in 2003 might be keyed as
    > TBL03x
    >
    > The idea is to have as much of the key as possible be meaningful, the x
    > is there merely to guarantee uniqueness.
    > Is there any way to efficiently and safely implement creating the unique
    > (x) part of the key?
    > By "efficient", I mean without causing serialized inserts. Is an instead
    of
    > insert trigger the best solution (using either a temp table with an
    identity
    > and using that, or using the max+1 method (which I would rather avoid...
    > like the plague)? Moreover, is there any _standard_ solution? (Are IOI
    > triggers sql server specific?)
    > By "safe", I mean that relying on serial inserts is surely bad. OK, we
    could
    > have a stored proc to insert records one at a time, acting as a middle
    tier
    > key generator. But what happens when Mr dbo comes along in a year's time
    and
    > performs a basic insert statement against the table from some other table?
    > Or more generally, for some reason in the future a set based insert
    > operation is required?
    >
    > As an almost total tangent, I have seen a few posts which recommend using
    a
    > datetime key... eg this from JC...
    >
    > CREATE TABLE Foobar
    > (trans_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    > PRIMARY KEY,
    > product_id INTEGER NOT NULL,
    > quantity INTEGER NOT NULL,
    > ...);
    >
    > Surely this fails Joe's own test regarding set based inserts as every
    record
    > in the inserted set will get the same timestamp.
    >
    >

    Andrew J. Kelly Guest

  3. #3

    Default Re: Favourite topic: primary keys (natural vs surrogate/artificial)

    Donald,

    Some questions to consider when determining a primary key:

    1) Why does a primary key have to be meaningful?
    2) Why would you want a 30, 40 byte or more character field to be a primary
    key when a 4 byte int will do?
    3) Why would you want a field that can have its value changed to be a
    primary key? Do you realize what sort of concurrency issues this could
    raise when the value of the field is changed, and potentially dozens of
    foreign key tables need to be updated?
    4) Do you really want foreign key tables to consist of their own 30,40 byte
    (or more) primary key and another 30, 40 byte foreign key field, when two 4
    byte int fields will suffice?
    5) Have you considered the hassles of creating foreign key tables, when the
    master table has a multi-segmented primary key?

    I am in the middle of managing and co-developing my second very large
    financial application (currently 280 tables). Every single table has a 4
    byte Identity field as its primary key. I cannot express how easy this
    makes my life. It's easy to refer to individual records. The values of the
    P.K's don't change so I don't need to worry about concurreny issues, and
    naming is consistent (every P.K. field is named "RowID"). This makes
    creating foreign key relationships a breeze. And, performance is great.

    J.R.
    Largo SQL Tools
    The Finest Collection of SQL Tools Available
    [url]http://www.largosqltools.com[/url]

    "Donald Halloran" <archonATTquantumfireDOTTcom> wrote in message
    news:3f0418ffduster.adelaide.on.net...
    > Another issue arises under my favourite topic of discussion!
    >
    > What solutions do people have for creating a "natural" key that is both
    > meaningful, and unique.
    > Consider a situation where people create "things" that will be stored as
    > unique entities, but where the user themselves have not predefined a
    > natural, unique identifier. In other words, it is left up do the database
    to
    > create an identifier (heck, every "natural" key must have been created by
    > someone or something at some point, right?)
    >
    > Well, a natural key might be defined at least partly by who is submitting
    it
    > and when. As a quick example, a table created in 2003 might be keyed as
    > TBL03x
    >
    > The idea is to have as much of the key as possible be meaningful, the x
    > is there merely to guarantee uniqueness.
    > Is there any way to efficiently and safely implement creating the unique
    > (x) part of the key?
    > By "efficient", I mean without causing serialized inserts. Is an instead
    of
    > insert trigger the best solution (using either a temp table with an
    identity
    > and using that, or using the max+1 method (which I would rather avoid...
    > like the plague)? Moreover, is there any _standard_ solution? (Are IOI
    > triggers sql server specific?)
    > By "safe", I mean that relying on serial inserts is surely bad. OK, we
    could
    > have a stored proc to insert records one at a time, acting as a middle
    tier
    > key generator. But what happens when Mr dbo comes along in a year's time
    and
    > performs a basic insert statement against the table from some other table?
    > Or more generally, for some reason in the future a set based insert
    > operation is required?
    >
    > As an almost total tangent, I have seen a few posts which recommend using
    a
    > datetime key... eg this from JC...
    >
    > CREATE TABLE Foobar
    > (trans_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    > PRIMARY KEY,
    > product_id INTEGER NOT NULL,
    > quantity INTEGER NOT NULL,
    > ...);
    >
    > Surely this fails Joe's own test regarding set based inserts as every
    record
    > in the inserted set will get the same timestamp.
    >
    >

    Amos Soma Guest

  4. #4

    Default Re: Favourite topic: primary keys (natural vs surrogate/artificial)

    "Amos Soma" <amos_j_somanospam.com> wrote in message news:<OuFaGzWQDHA.2036TK2MSFTNGP10.phx.gbl>...
    > Donald,
    >
    > Some questions to consider when determining a primary key:
    >
    > 1) Why does a primary key have to be meaningful?
    > 2) Why would you want a 30, 40 byte or more character field to be a primary
    > key when a 4 byte int will do?
    > 3) Why would you want a field that can have its value changed to be a
    > primary key? Do you realize what sort of concurrency issues this could
    > raise when the value of the field is changed, and potentially dozens of
    > foreign key tables need to be updated?
    > 4) Do you really want foreign key tables to consist of their own 30,40 byte
    > (or more) primary key and another 30, 40 byte foreign key field, when two 4
    > byte int fields will suffice?
    > 5) Have you considered the hassles of creating foreign key tables, when the
    > master table has a multi-segmented primary key?
    I'm afraid I haven't really made my question clear.
    What I am trying to do is come up with a unique identifier that the
    user will use, directly, to retrieve information related to a
    submission.
    As another f'rinstance....
    The marketing department (whose code is, say, MKT, and everyone in the
    department knows that) creates a contract. They create it in 2003. It
    is the 57'th contract they have created. They want to store it in a
    database. They also want to be able to retrieve it quickly based on
    the contract number.
    I could let SQL server generate an identity based ID for the contract.
    It might be something like 122437.
    Alternatively, I could create a more meaningful key, like MKT-03-057.
    Or, for brevity, MKT03057. It's only 8 bytes long, and is much easier
    for marketing to remember than 122437. On the flipside, you can also
    easily look at the key and see that it was the 57th submission by
    marketing in 2003.
    The problem is getting the database to create that "natural" ID for
    the contract. The MKT part is easy (you know who is submitting it),
    and the 03 part is easy, (datepart(year, getdate())), but how about
    the "counter" part? It doesnt have to be a counter, we could, eg, use
    3 letter words instead. That would be even easier to remember.
    MKT03CAT. But either way, it is up to the database to come up with the
    part that makes it unique.
    Archon Guest

  5. #5

    Default Re: Favourite topic: primary keys (natural vs surrogate/artificial)

    >
    > 1) Why does a primary key have to be meaningful?
    >
    It doesn't have to be. But what really made RDBMS stand out from the
    early database systems was the fact that rows where identifed by value
    and not pointers.
    >
    > 2) Why would you want a 30, 40 byte or more character field to be a
    > primary key when a 4 byte int will do?
    >
    See above and because it can reduce the number of joins involving
    Key columns
    >
    > 3) Why would you want a field that can have its value changed to be a
    > primary key? Do you realize what sort of concurrency issues
    > this could
    > raise when the value of the field is changed, and potentially
    > dozens of
    > foreign key tables need to be updated?
    >
    A changing primary key is usually a bad choice but it depends on
    frequency. Personally, not being able to update a column worries me.
    SQL2K's DRI seems to handle that fairly well.
    >
    > 4) Do you really want foreign key tables to consist of their own
    > 30,40 byte
    > (or more) primary key and another 30, 40 byte foreign key field, when
    > two 4 byte int fields will suffice?
    >
    Yes I do, if it is actually a irreducible key. With surrogates I have
    to "chase" the ID field all the way to the "parent".. A lot like
    pointers really....
    >
    > 5) Have you considered the hassles of creating foreign key tables,
    > when the master table has a multi-segmented primary key?
    >
    Design is the most important part of a good database project. If I have
    to spend a few extra hours so be it.

    --
    Posted via [url]http://dbforums.com[/url]
    byrmol Guest

Similar Threads

  1. Replies: 1
    Last Post: April 7th, 02:49 PM
  2. Preserving Primary Keys in DTS
    By dj shane in forum Coldfusion Database Access
    Replies: 3
    Last Post: September 27th, 08:57 PM
  3. Tablespaces and primary keys
    By Wes in forum PostgreSQL / PGSQL
    Replies: 1
    Last Post: January 25th, 04:34 AM
  4. Deleting Primary Keys
    By Don Bryant in forum Informix
    Replies: 4
    Last Post: September 22nd, 08:26 AM
  5. Primary Keys
    By John Simmons in forum PHP Development
    Replies: 6
    Last Post: August 17th, 10:13 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139