Professional Web Applications Themes

Does a normalized design lead to complex queries? - MySQL

Suppose we have to design a database for a recruitment agency. There will be a table "candidates" with fields "candidateid","last name","first name" ; the languages mastered by a candidate as well as the skills are separated into different tables. Eg. The skills are stored in the table "skills" with fields "skillid", "skillname","candidateid" where skillid is the primary autoincrement key. The language table is defined in a similar way. A candidate having two skills will lead to two records in the table "skills"; similarly a candidate mastering 3 languages will lead to 3 records in the table "languages". Retrieving candidates with ...

  1. #1

    Default Does a normalized design lead to complex queries?

    Suppose we have to design a database for a recruitment agency. There
    will be a table "candidates" with fields "candidateid","last
    name","first name" ; the languages mastered by a candidate as
    well as the skills are separated into different tables. Eg. The skills
    are stored in the table "skills" with fields "skillid",
    "skillname","candidateid" where skillid is the primary
    autoincrement key. The language table is defined in a similar way. A
    candidate having two skills will lead to two records in the table
    "skills"; similarly a candidate mastering 3 languages will lead to
    3 records in the table "languages".
    Retrieving candidates with skills with skillname="php" and
    skillname="asp" as well as languagename="EN" and
    languagename="FR" can be done as follows:
    SELECT DISTINCT c.name FROM candidates c, skills s, languages l WHERE
    l.candidateid=c.candidateid AND s.candidateid=c.candidateid AND
    (s.skillname='php' OR s.skillname='asp') AND (l.languagename='FR' OR
    l.languagename='EN')
    GROUP BY c.name HAVING (count(DISTINCT s.skillid)=2 AND count(DISTINCT
    l.languageid)=2)
    This already rather complex query (for a non-expert as myself) is
    probably the disadvantage of a normalized design.
    The problem that bothers me most is the following: what happens if I
    want to built a search tool to find candidates using a flexible search
    interface; in this interface the user could type "meta queries"
    such as: find me the candidates that (master php AND (asp OR net)) AND
    (that master French OR (Dutch AND German))
    This query looks quite simple but how to translate this to one query
    for the normalized database?
    Do I first have to do the following conversion:
    find me the candidates that (master (php AND asp) OR (php AND net)) AND
    (that master French OR (Dutch AND German))
    becoming
    find me the candidates that (master (php AND asp)) AND (that master
    French)
    OR
    find me the candidates that (master (php AND asp)) AND (that master
    Dutch AND German)
    OR
    find me the candidates that (master (php AND net)) AND (that master
    French)
    OR
    find me the candidates that (master (php AND net)) AND (that master
    Dutch AND German)
    Next I assume the three ORs have to be combined via UNION statements?
    This seems quite complicated so I was wondering if there is no easier
    way to create an SQL query starting from the "meta query" above?

    phlype.johnson@gmail.com Guest

  2. #2

    Default Does a normalized design lead to complex queries?

    Suppose we have to design a database for a recruitment agency. There
    will be a table "candidates" with fields "candidateid","last
    name","first name" ; the languages mastered by a candidate as
    well as the skills are separated into different tables. Eg. The skills
    are stored in the table "skills" with fields "skillid",
    "skillname","candidateid" where skillid is the primary
    autoincrement key. The language table is defined in a similar way. A
    candidate having two skills will lead to two records in the table
    "skills"; similarly a candidate mastering 3 languages will lead to
    3 records in the table "languages".
    Retrieving candidates with skills with skillname="php" and
    skillname="asp" as well as languagename="EN" and
    languagename="FR" can be done as follows:
    SELECT DISTINCT c.name FROM candidates c, skills s, languages l WHERE
    l.candidateid=c.candidateid AND s.candidateid=c.candidateid AND
    (s.skillname='php' OR s.skillname='asp') AND (l.languagename='FR' OR
    l.languagename='EN')
    GROUP BY c.name HAVING (count(DISTINCT s.skillid)=2 AND count(DISTINCT
    l.languageid)=2)
    This already rather complex query (for a non-expert as myself) is
    probably the disadvantage of a normalized design.
    The problem that bothers me most is the following: what happens if I
    want to built a search tool to find candidates using a flexible search
    interface; in this interface the user could type "meta queries"
    such as: find me the candidates that (master php AND (asp OR net)) AND
    (that master French OR (Dutch AND German))
    This query looks quite simple but how to translate this to one query
    for the normalized database?
    Do I first have to do the following conversion:
    find me the candidates that (master (php AND asp) OR (php AND net)) AND
    (that master French OR (Dutch AND German))
    becoming
    find me the candidates that (master (php AND asp)) AND (that master
    French)
    OR
    find me the candidates that (master (php AND asp)) AND (that master
    Dutch AND German)
    OR
    find me the candidates that (master (php AND net)) AND (that master
    French)
    OR
    find me the candidates that (master (php AND net)) AND (that master
    Dutch AND German)
    Next I assume the three ORs have to be combined via UNION statements?
    This seems quite complicated so I was wondering if there is no easier
    way to create an SQL query starting from the "meta query" above?

    phlype.johnson@gmail.com Guest

  3. #3

    Default Re: Does a normalized design lead to complex queries?

    Conceptually, not only your queries, but your entire application design
    would be _easier_ with normalized tables. You are creating a monster the
    other way.

    Needed Tables:
    candidates table
    skills table
    candidate_skills table that joins the other two to achieve the many to many
    relationship

    You do not need a separate language table, as it fundamentally just another
    type of skill. If you must, add another table that houses "skill types" and
    add that foreign key to the skills table to delineate the type which each
    skillset belongs to. How far you normalize is really the issue, but at
    least a basic normalization is certainly in order.

    Each table should have a non-data related identity column, and the joins to
    other tables should be via those ID columns, not data. Queries on such
    would be no more complex than what you are currently trying to conjur up.
    Do it right and save yourself loads of time later, especially when new
    skills become available, especially when relating skills to individuals.
    Then you will also avoid issues like trying to perform queries on variations
    of entries that are technically the same, but spelled differently like ENG
    vs. english, VB vs. visual basic vs. MSVB vs.... etcetera.

    tblCandidate
    CandidateID,PreferredName,_OtherContactFieldsEtc_
    1,John,Someplace somwhere some phone some email etcetera
    2,Sue,----
    3,Sam,----

    tblSkill
    SkillID,SkillCode,Skill,SkillType
    1,VBDN,MS Visual Basic DotNet,Programming Language
    2,CSDN,MS C# DotNet,Programming Language
    3,ENGL,English,Spoken Language
    4,GRMN,German,Spoken Language
    5,PRLD,Project Lead,Positions
    6,ELPR,Entry Level Programmer,Positions
    7,PRMN,Project Manager,Positions
    8,SPAN,Spanish,Spoken Language
    9,ASSO,Associates,School
    10,BACH,Bachelors,School
    11,MSTR,Masters,School
    12,PHDS,PhD,School
    13,TAPI,TAPI,Specialties
    14,NADM,Network Admin,Positions
    15,DBAD,DBA,Positions
    16,DBAN,Data yst,Positions
    17,ASPS,ASP,Programming Language
    18,JSPS,JSP,Programming Language
    19,PHPS,PHP,Programming Language
    20,RDHT,Red Hat,O/S
    21,WNXP,MS WinXP,O/S
    22,SV03,MS Server 2003,O/S
    23,DEBN,Debian,O/S
    24,SUSE,SUSE,O/S
    25,MNDK,Mandrake,O/S
    26,SQLS,MS SQL Server,Database
    27,MYSQ,MySQL,Database
    28,ORCL,Oracle,Database
    29,ACCS,MS Access (JET),Database
    30,FOXP,Fox,Database
    31,FRNC,French,Spoken Language
    32,DNET,Dot Net (General),Programming Language
    33,DTCH,Dutch,Spoken Language

    tblCandidateSkill
    CandidateSkillID,CandidateID,SkillID
    1,1,2
    2,1,4
    3,1,20
    4,1,9
    5,2,8
    6,2,3
    7,2,6
    8,2,19
    9,3,10
    10,3,16
    11,3,28
    12,3,26
    13,3,11
    14,3,17
    15,3,33
    16,3,4
    17,3,19

    So, given the sample data above, and using this query (this is just *one*
    way to do this)

    SELECT tblCandidate.*
    FROM (((((((tblCandidate LEFT JOIN tblCandidateSkill ON
    tblCandidate.CandidateID = tblCandidateSkill.CandidateID) LEFT JOIN tblSkill
    ON tblCandidateSkill.SkillID = tblSkill.SkillID) LEFT JOIN tblCandidateSkill
    AS tblCandidateSkill_1 ON tblCandidate.CandidateID =
    tblCandidateSkill_1.CandidateID) LEFT JOIN tblSkill AS tblSkill_1 ON
    tblCandidateSkill_1.SkillID = tblSkill_1.SkillID) LEFT JOIN
    (tblCandidateSkill AS tblCandidateSkill_2 LEFT JOIN tblSkill AS tblSkill_2
    ON tblCandidateSkill_2.SkillID = tblSkill_2.SkillID) ON
    tblCandidate.CandidateID = tblCandidateSkill_2.CandidateID) LEFT JOIN
    (tblCandidateSkill AS tblCandidateSkill_3 LEFT JOIN tblSkill AS tblSkill_3
    ON tblCandidateSkill_3.SkillID = tblSkill_3.SkillID) ON
    tblCandidate.CandidateID = tblCandidateSkill_3.CandidateID) LEFT JOIN
    tblCandidateSkill AS tblCandidateSkill_4 ON tblCandidate.CandidateID =
    tblCandidateSkill_4.CandidateID) LEFT JOIN tblSkill AS tblSkill_4 ON
    tblCandidateSkill_4.SkillID = tblSkill_4.SkillID
    WHERE (((tblSkill.SkillCode)="MSTR") AND ((tblSkill_1.SkillCode)="PHPS") AND
    ((tblSkill_2.SkillCode)="ASPS") AND ((tblSkill_3.SkillCode)="FRNC")) OR
    (((tblSkill.SkillCode)="MSTR") AND ((tblSkill_1.SkillCode)="PHPS") AND
    ((tblSkill_2.SkillCode)="DNET") AND ((tblSkill_3.SkillCode)="FRNC")) OR
    (((tblSkill.SkillCode)="MSTR") AND ((tblSkill_1.SkillCode)="PHPS") AND
    ((tblSkill_2.SkillCode)="ASPS") AND ((tblSkill_3.SkillCode)="DTCH") AND
    ((tblSkill_4.SkillCode)="GRMN")) OR (((tblSkill.SkillCode)="MSTR") AND
    ((tblSkill_1.SkillCode)="PHPS") AND ((tblSkill_2.SkillCode)="DNET") AND
    ((tblSkill_3.SkillCode)="DTCH") AND ((tblSkill_4.SkillCode)="GRMN"))
    ORDER BY tblCandidate.PreferredName;

    You would get (CSV formatted):
    CandidateID,PreferredName,_OtherContactFieldsEtc_
    3,Sam,----

    Per the example data, Sam is one highly qualified individual <grin>.

    Your query interface would further benefit by having the user select via
    checkbox or somesuch, all desired skills, and instead of using SkillCode
    values, use the SkillID of each.

    There are, however, much better ways of doing this kind of search, more
    technically advanced SQL and/or a Stored Procedure that would be optimized
    to not be so hard on resources.

    A GUI could also handle each criteria individually... etcetra... etcetra...

    Please forgive any typos that may or may not exist in the lengthy reply
    above...

    HTH.

    ~ Duane Phillips.

    <com> wrote in message
    news:googlegroups.com... 



    Duane Guest

Similar Threads

  1. How do I get a rotation from a normalized vector
    By The Minds Eye in forum Macromedia Director 3D
    Replies: 3
    Last Post: October 4th, 10:14 AM
  2. Problem when heavy reporting and complex queries
    By aduka7 in forum Coldfusion Database Access
    Replies: 0
    Last Post: April 5th, 07:34 AM
  3. Need advice on getting data out of normalized database
    By Dragonhunter in forum ASP Database
    Replies: 2
    Last Post: September 24th, 01:10 PM
  4. Second Normalized Form..
    By Sender in forum Microsoft SQL / MS SQL Server
    Replies: 1
    Last Post: July 8th, 03:15 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139