Professional Web Applications Themes

Count unique data in column - MySQL

Hi, I have an optimization question. Is there a way to count instances of each unique entry in a column (not unique entries)? Let me explain using the column count. count 3 3 4 5 2 3 2 Question: Is there a way to return an associative array of total instances of unique entries? Desired Result: {2}->2, {3}->3, {4}->1, {5}->1 The current script requests all data (SELECT *) and sorts them in a script. I want to offset the count to the SQL server (good idea?). Kev...

  1. #1

    Default Count unique data in column

    Hi,

    I have an optimization question. Is there a way to count instances of
    each unique entry in a column (not unique entries)? Let me explain
    using the column count.

    count
    3
    3
    4
    5
    2
    3
    2

    Question:
    Is there a way to return an associative array of total instances of
    unique entries?
    Desired Result: {2}->2, {3}->3, {4}->1, {5}->1

    The current script requests all data (SELECT *) and sorts them in a
    script. I want to offset the count to the SQL server (good idea?).

    Kev

    kev Guest

  2. #2

    Default Re: Count unique data in column

    kev wrote:
    > Is there a way to return an associative array of total instances of
    > unique entries?
    > Desired Result: {2}->2, {3}->3, {4}->1, {5}->1
    SELECT `count`, COUNT(`count`) FROM tablename GROUP BY `count`

    This is made a little bit confusing by the fact that your column is
    named `count`.

    Regards,
    Bill K.
    Bill Karwin Guest

  3. #3

    Default Re: Count unique data in column

    Bill Karwin wrote:
    > kev wrote:
    > > Is there a way to return an associative array of total instances of
    > > unique entries?
    > > Desired Result: {2}->2, {3}->3, {4}->1, {5}->1
    >
    > SELECT `count`, COUNT(`count`) FROM tablename GROUP BY `count`
    >
    > This is made a little bit confusing by the fact that your column is
    > named `count`.
    >
    > Regards,
    > Bill K.
    Query Result:

    2 - 2
    3 - 3
    4 - 1
    5 - 1

    Simple! Thanks Bill.

    kev Guest

  4. #4

    Default Re: Count unique data in column

    I have a concern about my request on top of that since we are on the
    topic of optimization.

    What is a good way to approach SQL queries?

    Should I connect to the server on every query? (looks like a bad idea
    because you will increase server bandwidth assuming not localhost)

    Example:
    Start session

    On every query: Connect => Query => Disconnect

    End session

    Should I use pconnect (OR connect and save connection handler) and
    disconnect after the session has ended? How does the server handle the
    event when users are on a pconnect and the browser is closed? (would
    pconnect disconnect once session is over?) I do not want to leave
    multiple connections open to timeout.

    Example:
    Start session => Connect

    On every query: Query

    End session = Disconnect

    Feedback would be great. Just want to establish a standard. I am
    currently using the first method but I am wondering if it should worry
    about scalability. Thanks in advance

    Kev

    kev Guest

  5. #5

    Default Re: Count unique data in column

    kev wrote:
    > Should I connect to the server on every query?
    No, I would not recommend this. Connecting costs a large amount of
    overhead. It has to find the host, establish a socket, authenticate
    your username and password, and set up other resources. This is
    actually pretty expensive to do, and you should avoid doing it for every
    query.
    > Should I use pconnect (OR connect and save connection handler) and
    > disconnect after the session has ended?
    I assume you mean mysql_pconnect(), part of the PHP mysql package.
    [url]http://www.php.net/function.mysql-pconnect[/url]

    Yes, this is a good thing to do for greater scalability. This reduces
    the number of times that PHP needs to create a new connection to MySQL
    The web server (Apache) that runs PHP manages the persisent connections
    and lets them be re-used by later PHP requests.

    However, if you use PHP in CGI mode, starting a new process for each PHP
    request, the persistent connections won't last anyway.
    > How does the server handle the
    > event when users are on a pconnect and the browser is closed? (would
    > pconnect disconnect once session is over?)
    According to the manual, mysql_close() does not close connections opened
    by mysql_pconnect(). The persistent database connections will be used
    by later requests for different web sessions.

    Also note that subsequent PHP requests are _not_ likely to use the same
    database connection. So if you create connection-based things like
    temporary tables or user variables, use them only during the same PHP
    request, and then you should drop that table or set that variable to
    NULL, or else some other web session will get to see it.
    > I do not want to leave multiple connections open to timeout.
    You can force MySQL to time out long connections using the
    "interactive-timeout" option in the my.cnf (my.ini) configuration file.
    The default for this timeout is 28800 seconds. See
    [url]http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html[/url]

    Regards,
    Bill K.
    Bill Karwin Guest

  6. #6

    Default Re: Count unique data in column

    kev wrote:
    > I have a concern about my request on top of that since we are on the
    > topic of optimization.
    >
    > What is a good way to approach SQL queries?
    >
    > Should I connect to the server on every query? (looks like a bad idea
    > because you will increase server bandwidth assuming not localhost)
    >
    > Example:
    > Start session
    >
    > On every query: Connect => Query => Disconnect
    >
    > End session
    >
    > Should I use pconnect (OR connect and save connection handler) and
    > disconnect after the session has ended? How does the server handle the
    > event when users are on a pconnect and the browser is closed? (would
    > pconnect disconnect once session is over?) I do not want to leave
    > multiple connections open to timeout.
    >
    > Example:
    > Start session => Connect
    >
    > On every query: Query
    >
    > End session = Disconnect
    >
    > Feedback would be great. Just want to establish a standard. I am
    > currently using the first method but I am wondering if it should worry
    > about scalability. Thanks in advance
    >
    > Kev
    >
    Kev,

    I don't generally use pconnect() - it keeps connections open even if no one is
    using the website (or php code). It might be find if you're running a site with
    100K hits/hr. But none of mine do that.

    When I need MySQL, I just connect at the start of the page and close at the end
    of the page. Nice, clean, no muss, no fuss.

    Additionally, if you're using shared hosting, persistent connections may be
    disabled.


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    [email]jstucklexattglobal.net[/email]
    ==================
    Jerry Stuckle Guest

  7. #7

    Default Re: Count unique data in column

    Bill Karwin <billkarwin.com> wrote:
    > kev wrote:
    >> Should I connect to the server on every query?
    >
    > No, I would not recommend this. Connecting costs a large amount of
    > overhead. It has to find the host, establish a socket, authenticate
    > your username and password, and set up other resources. This is
    > actually pretty expensive to do, and you should avoid doing it for every
    > query.
    I agree that one should not connect for every single query.
    OTOH MySQL connects are pretty cheap, probably cheaper than the
    underlying TCP connection setup in the network stack.
    >> Should I use pconnect (OR connect and save connection handler) and
    >> disconnect after the session has ended?
    >
    > I assume you mean mysql_pconnect(), part of the PHP mysql package.
    > [url]http://www.php.net/function.mysql-pconnect[/url]
    >
    > Yes, this is a good thing to do for greater scalability.
    No, it is not. Contrary, using mysql_pconnect() is considered harmful,
    as it tends to hog the MySQL server with idle connections. A very
    detailed (but German) discussion of the topic is here:

    <http://kris.koehntopp.de/artikel/webtune/>

    In a nutshell:

    a) many connections:

    This problem arises if you use different connections (that is <host,
    user, password> tupels) to connect to the database. If your scripts
    (*all* scripts running in one Apache instance) use N different
    connections, PHP will use up to N * max_clients connections to the
    database - independent from the real webserver load.

    Contrary, if each of your scripts uses only M different connections
    at a time (in most cases M=1) and refrains from using persistent
    connections, you will have at most M * max_clients open connections
    to the database. Plus, you will only reach that limit if your web-
    server is fully loaded.

    Idle connections will eat memory and file descriptors. Not good.


    b) MySQL sessions have state

    There are lots of objects in MySQL with session scope: locks, user
    variables, temporary tables. Currently there is no way to reset a
    session, so the best way to destroy those objects is to close the
    connection. Additionally there is a hidden attribute - the selected
    database (mysql_select_db(), USE <db>). If you re-use a MySQL
    connection handle in PHP, you cannot know the selected database.
    Therefore many database abstraction layers for PHP call
    mysql_select_db() *mandatory* right before each mysql_query().

    Forgotten locks may render your database useless. Forgotten temp.
    tables will eat memory and/or disk space. User variables may spread
    sensitive information between web applications. Very bad.


    Conclusion: mysql_pconnect() solves a nonexistent problem. Also it
    creates new problems. You should not use it.


    XL
    --
    Axel Schwenke, Senior Software Developer, MySQL AB

    Online User Manual: [url]http://dev.mysql.com/doc/refman/5.0/en/[/url]
    MySQL User Forums: [url]http://forums.mysql.com/[/url]
    Axel Schwenke Guest

  8. #8

    Default Re: Count unique data in column

    Awesome! Thanks for the information, Axel.

    Regards,
    Bill K.
    Bill Karwin Guest

  9. #9

    Default Re: Count unique data in column


    Axel Schwenke wrote:
    > Bill Karwin <billkarwin.com> wrote:
    > > kev wrote:
    > >> Should I connect to the server on every query?
    > >
    > > No, I would not recommend this. Connecting costs a large amount of
    > > overhead. It has to find the host, establish a socket, authenticate
    > > your username and password, and set up other resources. This is
    > > actually pretty expensive to do, and you should avoid doing it for every
    > > query.
    >
    > I agree that one should not connect for every single query.
    > OTOH MySQL connects are pretty cheap, probably cheaper than the
    > underlying TCP connection setup in the network stack.
    >
    > >> Should I use pconnect (OR connect and save connection handler) and
    > >> disconnect after the session has ended?
    > >
    > > I assume you mean mysql_pconnect(), part of the PHP mysql package.
    > > [url]http://www.php.net/function.mysql-pconnect[/url]
    > >
    > > Yes, this is a good thing to do for greater scalability.
    >
    > No, it is not. Contrary, using mysql_pconnect() is considered harmful,
    > as it tends to hog the MySQL server with idle connections. A very
    > detailed (but German) discussion of the topic is here:
    >
    > <http://kris.koehntopp.de/artikel/webtune/>
    >
    > In a nutshell:
    >
    > a) many connections:
    >
    > This problem arises if you use different connections (that is <host,
    > user, password> tupels) to connect to the database. If your scripts
    > (*all* scripts running in one Apache instance) use N different
    > connections, PHP will use up to N * max_clients connections to the
    > database - independent from the real webserver load.
    >
    > Contrary, if each of your scripts uses only M different connections
    > at a time (in most cases M=1) and refrains from using persistent
    > connections, you will have at most M * max_clients open connections
    > to the database. Plus, you will only reach that limit if your web-
    > server is fully loaded.
    >
    > Idle connections will eat memory and file descriptors. Not good.
    >
    >
    > b) MySQL sessions have state
    >
    > There are lots of objects in MySQL with session scope: locks, user
    > variables, temporary tables. Currently there is no way to reset a
    > session, so the best way to destroy those objects is to close the
    > connection. Additionally there is a hidden attribute - the selected
    > database (mysql_select_db(), USE <db>). If you re-use a MySQL
    > connection handle in PHP, you cannot know the selected database.
    > Therefore many database abstraction layers for PHP call
    > mysql_select_db() *mandatory* right before each mysql_query().
    >
    > Forgotten locks may render your database useless. Forgotten temp.
    > tables will eat memory and/or disk space. User variables may spread
    > sensitive information between web applications. Very bad.
    >
    >
    > Conclusion: mysql_pconnect() solves a nonexistent problem. Also it
    > creates new problems. You should not use it.
    >
    >
    > XL
    > --
    > Axel Schwenke, Senior Software Developer, MySQL AB
    >
    > Online User Manual: [url]http://dev.mysql.com/doc/refman/5.0/en/[/url]
    > MySQL User Forums: [url]http://forums.mysql.com/[/url]
    Good to know. Thanks for the info. Now I have to figure out how to pass
    the SQL connection handle across pages/the session.

    kev Guest

  10. #10

    Default Re: Count unique data in column

    kev wrote:
    > Axel Schwenke wrote:
    >
    >>Bill Karwin <billkarwin.com> wrote:
    >>
    >>>kev wrote:
    >>>
    >>>>Should I connect to the server on every query?
    >>>
    >>>No, I would not recommend this. Connecting costs a large amount of
    >>>overhead. It has to find the host, establish a socket, authenticate
    >>>your username and password, and set up other resources. This is
    >>>actually pretty expensive to do, and you should avoid doing it for every
    >>>query.
    >>
    >>I agree that one should not connect for every single query.
    >>OTOH MySQL connects are pretty cheap, probably cheaper than the
    >>underlying TCP connection setup in the network stack.
    >>
    >>
    >>>>Should I use pconnect (OR connect and save connection handler) and
    >>>>disconnect after the session has ended?
    >>>
    >>>I assume you mean mysql_pconnect(), part of the PHP mysql package.
    >>>[url]http://www.php.net/function.mysql-pconnect[/url]
    >>>
    >>>Yes, this is a good thing to do for greater scalability.
    >>
    >>No, it is not. Contrary, using mysql_pconnect() is considered harmful,
    >>as it tends to hog the MySQL server with idle connections. A very
    >>detailed (but German) discussion of the topic is here:
    >>
    >><http://kris.koehntopp.de/artikel/webtune/>
    >>
    >>In a nutshell:
    >>
    >>a) many connections:
    >>
    >>This problem arises if you use different connections (that is <host,
    >>user, password> tupels) to connect to the database. If your scripts
    >>(*all* scripts running in one Apache instance) use N different
    >>connections, PHP will use up to N * max_clients connections to the
    >>database - independent from the real webserver load.
    >>
    >>Contrary, if each of your scripts uses only M different connections
    >>at a time (in most cases M=1) and refrains from using persistent
    >>connections, you will have at most M * max_clients open connections
    >>to the database. Plus, you will only reach that limit if your web-
    >>server is fully loaded.
    >>
    >>Idle connections will eat memory and file descriptors. Not good.
    >>
    >>
    >>b) MySQL sessions have state
    >>
    >>There are lots of objects in MySQL with session scope: locks, user
    >>variables, temporary tables. Currently there is no way to reset a
    >>session, so the best way to destroy those objects is to close the
    >>connection. Additionally there is a hidden attribute - the selected
    >>database (mysql_select_db(), USE <db>). If you re-use a MySQL
    >>connection handle in PHP, you cannot know the selected database.
    >>Therefore many database abstraction layers for PHP call
    >>mysql_select_db() *mandatory* right before each mysql_query().
    >>
    >>Forgotten locks may render your database useless. Forgotten temp.
    >>tables will eat memory and/or disk space. User variables may spread
    >>sensitive information between web applications. Very bad.
    >>
    >>
    >>Conclusion: mysql_pconnect() solves a nonexistent problem. Also it
    >>creates new problems. You should not use it.
    >>
    >>
    >>XL
    >>--
    >>Axel Schwenke, Senior Software Developer, MySQL AB
    >>
    >>Online User Manual: [url]http://dev.mysql.com/doc/refman/5.0/en/[/url]
    >>MySQL User Forums: [url]http://forums.mysql.com/[/url]
    >
    >
    > Good to know. Thanks for the info. Now I have to figure out how to pass
    > the SQL connection handle across pages/the session.
    >
    Kev,

    You can't pass connections across pages - the connection will automatically
    close at the end of the PHP script if you don't close it yourself (it's a good
    idea to close it, though).

    But that's OK - you really don't want to keep the connection open while the user
    goes to lunch. Just connect before your first MySQL and close the connection
    after the last one (or at the end of the page).


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    [email]jstucklexattglobal.net[/email]
    ==================
    Jerry Stuckle Guest

  11. #11

    Default Re: Count unique data in column

    Jerry Stuckle wrote:
    > kev wrote:
    > > Axel Schwenke wrote:
    > >
    > >>Bill Karwin <billkarwin.com> wrote:
    > >>
    > >>>kev wrote:
    > >>>
    > >>>>Should I connect to the server on every query?
    > >>>
    > >>>No, I would not recommend this. Connecting costs a large amount of
    > >>>overhead. It has to find the host, establish a socket, authenticate
    > >>>your username and password, and set up other resources. This is
    > >>>actually pretty expensive to do, and you should avoid doing it for every
    > >>>query.
    > >>
    > >>I agree that one should not connect for every single query.
    > >>OTOH MySQL connects are pretty cheap, probably cheaper than the
    > >>underlying TCP connection setup in the network stack.
    > >>
    > >>
    > >>>>Should I use pconnect (OR connect and save connection handler) and
    > >>>>disconnect after the session has ended?
    > >>>
    > >>>I assume you mean mysql_pconnect(), part of the PHP mysql package.
    > >>>[url]http://www.php.net/function.mysql-pconnect[/url]
    > >>>
    > >>>Yes, this is a good thing to do for greater scalability.
    > >>
    > >>No, it is not. Contrary, using mysql_pconnect() is considered harmful,
    > >>as it tends to hog the MySQL server with idle connections. A very
    > >>detailed (but German) discussion of the topic is here:
    > >>
    > >><http://kris.koehntopp.de/artikel/webtune/>
    > >>
    > >>In a nutshell:
    > >>
    > >>a) many connections:
    > >>
    > >>This problem arises if you use different connections (that is <host,
    > >>user, password> tupels) to connect to the database. If your scripts
    > >>(*all* scripts running in one Apache instance) use N different
    > >>connections, PHP will use up to N * max_clients connections to the
    > >>database - independent from the real webserver load.
    > >>
    > >>Contrary, if each of your scripts uses only M different connections
    > >>at a time (in most cases M=1) and refrains from using persistent
    > >>connections, you will have at most M * max_clients open connections
    > >>to the database. Plus, you will only reach that limit if your web-
    > >>server is fully loaded.
    > >>
    > >>Idle connections will eat memory and file descriptors. Not good.
    > >>
    > >>
    > >>b) MySQL sessions have state
    > >>
    > >>There are lots of objects in MySQL with session scope: locks, user
    > >>variables, temporary tables. Currently there is no way to reset a
    > >>session, so the best way to destroy those objects is to close the
    > >>connection. Additionally there is a hidden attribute - the selected
    > >>database (mysql_select_db(), USE <db>). If you re-use a MySQL
    > >>connection handle in PHP, you cannot know the selected database.
    > >>Therefore many database abstraction layers for PHP call
    > >>mysql_select_db() *mandatory* right before each mysql_query().
    > >>
    > >>Forgotten locks may render your database useless. Forgotten temp.
    > >>tables will eat memory and/or disk space. User variables may spread
    > >>sensitive information between web applications. Very bad.
    > >>
    > >>
    > >>Conclusion: mysql_pconnect() solves a nonexistent problem. Also it
    > >>creates new problems. You should not use it.
    > >>
    > >>
    > >>XL
    > >>--
    > >>Axel Schwenke, Senior Software Developer, MySQL AB
    > >>
    > >>Online User Manual: [url]http://dev.mysql.com/doc/refman/5.0/en/[/url]
    > >>MySQL User Forums: [url]http://forums.mysql.com/[/url]
    > >
    > >
    > > Good to know. Thanks for the info. Now I have to figure out how to pass
    > > the SQL connection handle across pages/the session.
    > >
    >
    > Kev,
    >
    > You can't pass connections across pages - the connection will automatically
    > close at the end of the PHP script if you don't close it yourself (it's a good
    > idea to close it, though).
    >
    > But that's OK - you really don't want to keep the connection open while the user
    > goes to lunch. Just connect before your first MySQL and close the connection
    > after the last one (or at the end of the page).
    >
    >
    > --
    > ==================
    > Remove the "x" from my email address
    > Jerry Stuckle
    > JDS Computer Training Corp.
    > [email]jstucklexattglobal.net[/email]
    > ==================
    I see. So its connection per page request. :) Thanks again.

    Kevin

    kev Guest

Similar Threads

  1. Adding UNIQUE constraint on NULL column
    By Dave Smith in forum PostgreSQL / PGSQL
    Replies: 6
    Last Post: January 13th, 09:39 PM
  2. Select after insert to the unique column
    By Bruno Wolff III in forum PostgreSQL / PGSQL
    Replies: 3
    Last Post: December 13th, 04:44 PM
  3. column count
    By Reggie in forum ASP.NET Data Grid Control
    Replies: 5
    Last Post: November 7th, 08:23 AM
  4. Access (ROWID) Unique ID from Datagrid Column
    By Mona Syed in forum ASP.NET Data Grid Control
    Replies: 0
    Last Post: June 1st, 02:52 PM
  5. counting number of unique entries under a column
    By mr_burns in forum PHP Development
    Replies: 2
    Last Post: October 9th, 07:29 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139