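I have a Django/PostgreSQL application that shows which users are closest to a given user. It uses the PostGIS 2.0 KNN (K-nearest-neighbor) <-> operator in the ORDER BY clause to list users, nearest first. In my initial data set I found two search results that are out of order (all distances are measured from Los Angeles, CA):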
Member, City, State, Distance (miles)
user1, North Las Vegas, NV, 239
user2, Phoenix, AZ, 365
user3, Provo, UT, 568
user4, Twin Falls, ID, 630
user5, Albuquerque, NM, 673
user6, Portland, OR, 828
user7, Bozeman, MT, 896
user8, Seattle, WA, 962
user9, Boulder, CO, 834 <- Out of order!
user10, Laramie, WY, 862 <- Out of order!
user11, Naperville, IL, 1756
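The member names are simply the username column from the User class in Django's contrib.auth.models. The UserAccount class, which holds the geometry information, is defined as follows: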
from django.contrib.auth.models import User
from django.contrib.gis.db import models  # GeoDjango models (PointField, GeoManager)

class UserAccount(models.Model):
    user = models.OneToOneField(User, primary_key=True, unique=True)
    address_line_1 = models.CharField(max_length=30)
    address_line_2 = models.CharField(max_length=30, blank=True)
    city = models.CharField(max_length=30)
    region = models.CharField(max_length=30, blank=True)
    postal_code = models.CharField(max_length=10, blank=True)
    country = models.ForeignKey('Country')
    measurement_sys = models.CharField(max_length=5)  # US or Metric
    # User's home (default) and current longitude and latitude
    home_lon = models.FloatField(default=0.0)
    home_lat = models.FloatField(default=0.0)
    current_lon = models.FloatField(default=0.0)
    current_lat = models.FloatField(default=0.0)
    # GeoDjango-specific fields
    home_point = models.PointField(srid=4326)
    current_point = models.PointField(srid=4326)
    objects = models.GeoManager()
Here is the query in my Django view:
def members(request, template):
    """View all members of the website."""
    uid = request.session['uid']  # PK from User table
    # Get the current user's lon/lat and measurement system
    try:
        ua = UserAccount.objects.get(user_id=uid)
        lon = ua.current_lon
        lat = ua.current_lat
        measurement_sys = ua.measurement_sys
    except UserAccount.DoesNotExist as e:
        return HttpResponseRedirect(reverse('unable-to-display-members'))
    # Define the proximity query.
    if measurement_sys == 'US':
        multiplier = 0.000621371  # Convert to miles
    else:
        multiplier = 0.001  # Convert to kilometers
    query = "SELECT \
            ua.user_id, \
            au.username, \
            ua.city, \
            ua.region, \
            ST_Distance( \
                ua.current_point::geography, \
                ST_GeographyFromText( \
                    'SRID=4326;POINT(" \
                    + str(lon) \
                    + " " \
                    + str(lat) + \
                    ")' \
                ) \
            )*" + str(multiplier) + " AS distance \
        FROM \
            user_account ua \
        INNER JOIN \
            auth_user au \
        ON (ua.user_id = au.id) \
        WHERE ua.user_id != %s \
        ORDER BY \
            ua.current_point::geometry \
            <-> \
            'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geometry \
        LIMIT 250;"
    # Run the proximity query
    raw_queryset = UserAccount.objects.raw(query, [uid])
    # Paginate results
    user_list = [user for user in raw_queryset]
    list_size = len(list(user_list))
    paginator = Paginator(user_list, 10, 4)
    paginator._count = list_size
    page = request.GET.get('page')
    try:
        users = paginator.page(page)
    except PageNotAnInteger:
        users = paginator.page(1)
    except EmptyPage:
        users = paginator.page(paginator.num_pages)
    return render(request, template, {'users': users})
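Am I doing something wrong in my query? Does the KNN operator sometimes "hiccup" and return a few results out of order? I ask because when I removed the two out-of-order records from my table and then added extra records for users with addresses farther away (i.e., in IL, LA, MI, NC, PA, NY, and ME), all of the results came back in the correct order.
By the way, my input is located here.
Thanks!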
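Best answer
Updated answer:
PostGIS has had two approximate solutions for kNN (nearest-neighbor) search since September 2011:
- With the <-> operator, you get the nearest neighbors using the centers of the bounding boxes to compute the inter-object distances.
- With the <#> operator, you get the nearest neighbors using the bounding boxes themselves to compute the inter-object distances.
Your problem is that both are approximations, so neither is perfect. If you want the best 250 results, you can use either operator to retrieve, say, the best 1000 results, then sort those same results by ST_Distance and apply LIMIT 250, which gives you the best 250 out of roughly 1000 candidates.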
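To make the mismatch and the fix concrete, here is a minimal pure-Python sketch, not a PostGIS implementation. It assumes approximate city coordinates that are not taken from the original post. Planar distance in degrees stands in for what <-> compares on SRID 4326 point geometries (for a point, the bounding-box center is the point itself), and the haversine formula on a sphere stands in for the geography distance returned by ST_Distance. The helper names `planar_degrees`, `haversine_km`, and `nearest` are hypothetical.

```python
import math

# Approximate (lon, lat) values; illustrative assumptions, not data from the post.
LOS_ANGELES = (-118.2437, 34.0522)
CITIES = {
    "Bozeman": (-111.0429, 45.6770),
    "Seattle": (-122.3321, 47.6062),
    "Boulder": (-105.2705, 40.0150),
    "Laramie": (-105.5911, 41.3114),
}


def planar_degrees(a, b):
    """Euclidean distance in degrees; a stand-in for the <-> ordering key."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance on a sphere; a stand-in for geography ST_Distance."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def nearest(origin, points, limit, candidates):
    """Over-fetch by the cheap key, then re-sort the shortlist by the exact key."""
    shortlist = sorted(points, key=lambda n: planar_degrees(origin, points[n]))
    shortlist = shortlist[:candidates]
    return sorted(shortlist, key=lambda n: haversine_km(origin, points[n]))[:limit]


planar_order = sorted(CITIES, key=lambda n: planar_degrees(LOS_ANGELES, CITIES[n]))
exact_order = nearest(LOS_ANGELES, CITIES, limit=4, candidates=4)

print(planar_order)  # ['Bozeman', 'Seattle', 'Boulder', 'Laramie']
print(exact_order)   # ['Boulder', 'Laramie', 'Bozeman', 'Seattle']
```

With these assumed coordinates, the planar ordering puts Seattle ahead of Boulder and Laramie, the same pattern as in the question's table, and re-sorting the shortlist by the great-circle distance restores the expected order. The shortlist has to be large enough to contain every true top result, which is why the suggestion is to fetch about 1000 candidates for a final 250.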
Example:
SELECT * FROM
    (SELECT *, ST_Distance(current_point::geography, 'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geography) AS st_dist
     FROM user_account
     ORDER BY current_point::geometry <-> 'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geometry
     LIMIT 1000) AS s
ORDER BY st_dist LIMIT 250;
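The example above splices str(lon) and str(lat) into the SQL text. As a hedged alternative, the same two-stage query can be built with %s placeholders so the values travel as parameters. This is a sketch only: the function name `build_nearest_members_query` and the `candidates`/`limit` defaults are assumptions, the table and column names are taken from the question, and the point is built with the PostGIS functions ST_MakePoint and ST_SetSRID instead of a text literal.

```python
def build_nearest_members_query(uid, lon, lat, multiplier,
                                candidates=1000, limit=250):
    """Return (sql, params) for a two-stage nearest-member search.

    Stage 1 (inner query): the <-> operator picks `candidates` rows cheaply.
    Stage 2 (outer query): those rows are re-sorted by the geography distance
    and trimmed to `limit`.
    """
    sql = """
        SELECT * FROM (
            SELECT
                ua.user_id,
                au.username,
                ua.city,
                ua.region,
                ST_Distance(
                    ua.current_point::geography,
                    ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography
                ) * %s AS distance
            FROM user_account ua
            INNER JOIN auth_user au ON (ua.user_id = au.id)
            WHERE ua.user_id != %s
            ORDER BY ua.current_point::geometry
                     <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326)
            LIMIT %s
        ) AS shortlist
        ORDER BY distance
        LIMIT %s
    """
    # Parameters are listed in the order the placeholders appear in the SQL.
    params = [lon, lat, multiplier, uid, lon, lat, candidates, limit]
    return sql, params


sql, params = build_nearest_members_query(
    uid=1, lon=-118.2437, lat=34.0522, multiplier=0.000621371)
```

In the question's view this would be used as UserAccount.objects.raw(sql, params) in place of the concatenated string; user_id is still selected, so the raw queryset keeps its primary key.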
Regarding "django - PostGIS nearest-neighbor search results out of order?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23941425/