# Minimum Distance Theory Question... - Microsoft SQL / MS SQL Server

I have a solution in mind, but I wanted to hear what the local SQL-Server programming gurus have to say about my problem --- I've seen the problem stated in various flavors, but I am trying to drop all of this into a stored procedure, and my general knowledge of T-SQL vs the other programming languages I know (VB, Java, Perl, PHP, etc.) is starting to confuse me in terms of what I can get away with vs. what I can't. I currently have a table, tbl_ZipCodes, with standard columns of City, State, Latitude, Longitude (sic!) My problem is to ...

1. ## Minimum Distance Theory Question...

I have a solution in mind, but I wanted to hear what the local
SQL-Server programming gurus have to say about my problem --- I've
seen the problem stated in various flavors, but I am trying to drop
all of this into a stored procedure, and my general knowledge of T-SQL
vs the other programming languages I know (VB, Java, Perl, PHP, etc.)
is starting to confuse me in terms of what I can get away with vs.
what I can't.

I currently have a table, tbl_ZipCodes, with standard columns of City,
State, Latitude, Longitude (sic!)

My problem is to create the following:

user inputs a city and state
I return the zip codes within that city and state AND the zip codes of
the three (this could be any number in theory) nearest cities.

I currently have a function that given lat1, lat2, lon1, and lon2 that
will return the distance between two points:

float a;
float result;

a = long1-long2;
if (a < 0)
a = -a;

if (a > M_PI)
a = 2. * M_PI - a;
result = acos(sin(lat2)*sin(lat1)+cos(lat2)*cos(lat1)*cos(a ));
return (result);

So my thought was to somehow coopt this function into T-SQL to return
a recordset that might have a schema of:

City | State | Distance | Zip Code

This recordset would need to be generated under certain assumptions.
In this case, some cities can have multiple zip codes... in my current
problem set there is no way to know for certain where the "origin" of
my distance search would be, so I would define a "center point" by
taking the average latitudes and longtitudes of all of the latitudes
and longtitudes for a zip code in a specific city (recognizing that
the "accurate" position of my latitude and longtitude would need to be
based on a spherical coordinate system and not just simply doing
sum(lat)/#, sum(lon)/# --- although this is probably close enough for
my purposes.

The only way that I can see dealing with the problem is applying some
sort of row by row operation for distance or simply defining the
distance within the context of the query... I guess no matter what,
the math is somewhat expensive, but I imagine performing this as a
"set" operation vs. using a cursor and doing a row by row operation is
better, right?

Oh, and regarding the recordset I generate above, I would then look at
doing a SELECT on the above recordset. Since I want the first "X"
cities nearest to my city and state parameter in addition to the
zips associated with said cities, I keep wondering if there is a way
to perform this operation without a ton of helper procs that get
iteratively called vs. a way of doing this as a SQL operation.

So my humble request is, in T-SQL land, is my methodology the correct
one in terms of thinking about how to code the solution? Examples of
success/failure would be most appreciated.

I'll update as to whether or not my method works, but if anyone sees a
better way, I'd be most obliged.

Thanks,

Vincent Polite
[email]trinetgrinch[/email]
Vincent Polite Guest

2. ## Re: Minimum Distance Theory Question...

>> I currently have a table, ZipCodes, with standard columns of City,
State, Latitude, Longitude ... user inputs a city and state I return
the zip codes within that city and state AND the zip codes of the (n)
nearest cities. <<

Notice that you do not need the exact distance, so the spherical trig
is over kill. You can pretend the world is flat and still get the
answer. Let me assume that we have (x, y) co-ordinates.

SELECT my_city, my_state, Destinations.city, Destinations.state,
FROM (SELECT *
FROM Zipcodes
WHERE city = my_city
AND state = my_state) AS Origins (x,y)
CROSS JOIN
(SELECT *
FROM Zipcodes
WHERE city <> my_city
OR state <> my_state) AS Destinations (x,y)
CROSS JOIN
Sequence AS S1
WHERE SQRT (SQR(Origins.x - Destinations.x) + SQR(Origins.y -
Destinations.y))
<= S1.seq
AND S1.seq <= <<limit>>
GROUP BY my_city, my_state, Destinations.city, Destinations.state;

The Sequence table is a standard programming trick; it is a list of
integers from 1 to (n). The Origins table is where we start, all the
zip codes in my city. The Destinaitons table is where we finish and
radius is how far away they are. The constant <<limit>> is some upper
limit on distance.

You can now prune this down with a "GROUP BY radius HAVING COUNT(*) =
--CELKO-- Guest

3. ## Re: Minimum Distance Theory Question...

I swear, reading intelligent SQL to me is so frustrating. When I see
cool solutions, I think to myself, DUH...

That's a pretty nice solution. As advertised, here is my crap. I
ended up using a user-defined function, and didn't bother with finding
the exact center of my zip codes as much as taking an average latitude
and longtitude.

I think you're right about the spherical coordinate transformation
possibly being overkill, but I just didn't want to frustrate my boys
near the North Pole. :)

I'd appreciate the commentary, as my T-SQL, especially as it relates
to the intelligent use of user defined functions clearly needs help.

Thanks for your attempted solution, btw. I definitely like the idea
better than what I've done.

Vincent

/*
Author - Vincent Polite
Created on - 07/08/2003
Description - Stored Procedure to return Zip nearest to specific City
and State
*/
CREATE PROCEDURE [DBO].[usp_GetDistanceFromCity]
str_city VARCHAR,
str_state VARCHAR(2)
AS
declare iReturn int
declare Distance float
declare lat1 float
declare lon1 float

SELECT lat1 = AVG(Latitude),lon1=AVG(Longitude)
FROM dbo.tbl_zip_codes
WHERE (State = str_state) AND (City = str_city)
GROUP BY State, City

SELECT ZipCode, City, State, Latitude, Longitude, lat1 as lat1,
lon1 as lon1, dbo.CalculateDistance(lat1, lon1, Latitude,
Longitude) as Distance from tbl_zip_codes
WHERE City='Oakland' and State = 'CA'
ORDER by Distance

IF ERROR=0
RETURN 1
ELSE
RETURN 0
GO
[email]71062.1056compuserve.com[/email] (--CELKO--) wrote in message news:<c0d87ec0.0307081451.64f25dfposting.google.c om>...
> >> I currently have a table, ZipCodes, with standard columns of City,
> State, Latitude, Longitude ... user inputs a city and state I return
> the zip codes within that city and state AND the zip codes of the (n)
> nearest cities. <<
>
> Notice that you do not need the exact distance, so the spherical trig
> is over kill. You can pretend the world is flat and still get the
> answer. Let me assume that we have (x, y) co-ordinates.
>
> SELECT my_city, my_state, Destinations.city, Destinations.state,
> FROM (SELECT *
> FROM Zipcodes
> WHERE city = my_city
> AND state = my_state) AS Origins (x,y)
> CROSS JOIN
> (SELECT *
> FROM Zipcodes
> WHERE city <> my_city
> OR state <> my_state) AS Destinations (x,y)
> CROSS JOIN
> Sequence AS S1
> WHERE SQRT (SQR(Origins.x - Destinations.x) + SQR(Origins.y -
> Destinations.y))
> <= S1.seq
> AND S1.seq <= <<limit>>
> GROUP BY my_city, my_state, Destinations.city, Destinations.state;
>
> The Sequence table is a standard programming trick; it is a list of
> integers from 1 to (n). The Origins table is where we start, all the
> zip codes in my city. The Destinaitons table is where we finish and
> radius is how far away they are. The constant <<limit>> is some upper
> limit on distance.
>
> You can now prune this down with a "GROUP BY radius HAVING COUNT(*) =
> ??" or by adjusting <<limit>>.
Vincent Polite Guest

4. ## one more thing...

Did I mention that the stored proc I posted doesn't quite work? I
seem to be losing the references to my lat1 and lon1 variables, as
my distance function always seems to evaluate to zero.

Any ideas would be helpful, as much as I admire CELKO's solution, I am
curious to know if there's anything obvious that would indicate why my
approach doesn't work.

Best,

Vincent

[email]71062.1056compuserve.com[/email] (--CELKO--) wrote in message news:<c0d87ec0.0307081451.64f25dfposting.google.c om>...
> >> I currently have a table, ZipCodes, with standard columns of City,
> State, Latitude, Longitude ... user inputs a city and state I return
> the zip codes within that city and state AND the zip codes of the (n)
> nearest cities. <<
>
> Notice that you do not need the exact distance, so the spherical trig
> is over kill. You can pretend the world is flat and still get the
> answer. Let me assume that we have (x, y) co-ordinates.
>
> SELECT my_city, my_state, Destinations.city, Destinations.state,
> FROM (SELECT *
> FROM Zipcodes
> WHERE city = my_city
> AND state = my_state) AS Origins (x,y)
> CROSS JOIN
> (SELECT *
> FROM Zipcodes
> WHERE city <> my_city
> OR state <> my_state) AS Destinations (x,y)
> CROSS JOIN
> Sequence AS S1
> WHERE SQRT (SQR(Origins.x - Destinations.x) + SQR(Origins.y -
> Destinations.y))
> <= S1.seq
> AND S1.seq <= <<limit>>
> GROUP BY my_city, my_state, Destinations.city, Destinations.state;
>
> The Sequence table is a standard programming trick; it is a list of
> integers from 1 to (n). The Origins table is where we start, all the
> zip codes in my city. The Destinaitons table is where we finish and
> radius is how far away they are. The constant <<limit>> is some upper
> limit on distance.
>
> You can now prune this down with a "GROUP BY radius HAVING COUNT(*) =
> ??" or by adjusting <<limit>>.
Vincent Polite Guest

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•