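How to determine the optimal DBSCAN epsilon value in meters by plotting a KNN elbow

Problem description

Before running DBSCAN, I need to find the optimal epsilon value. All of my points are geographic coordinates, and I need the epsilon value in meters before converting it to radians to apply DBSCAN with the haversine metric.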

from sklearn.neighbors import NearestNeighbors
neigh = NearestNeighbors(n_neighbors=4)
nbrs = neigh.fit(firms[['y','x']])
distances,indices = nbrs.kneighbors(firms[['y','x']])

Then:

import numpy as np
import matplotlib.pyplot as plt
# Plotting K-distance Graph
distances = np.sort(distances,axis=0)
distances = distances[:,1]
plt.figure(figsize=(20,10))
plt.plot(distances)
plt.title('K-distance Graph',fontsize=20)
plt.xlabel('Data Points sorted by distance',fontsize=14)
plt.ylabel('Epsilon',fontsize=14)
plt.show()

The plot output looks like this, but I need the epsilon value in meters.

[Image: K-distance graph]

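Solution

I hope a few observations help clarify things:

a) You are already using the right method to find the optimal epsilon value, and from the plot you get eps = 0.005.

b) If your points are geographic coordinates, you do not need an epsilon value in meters that you then convert to radians. You can convert the geographic coordinates directly to radians, compute haversine distances (which are returned in radians), and multiply by 6371000/1000 to get the result in kilometers, as shown below: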
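
A note on units for that value: the neighbors in the question were computed on raw latitude/longitude columns with the default Euclidean metric, so 0.005 is presumably in degrees, not meters. As a rough sanity check only, the sketch below converts an angle in degrees to an arc length in meters along a great circle, using the same Earth radius (6371000 m) as the rest of this answer. The function name is made up for illustration. Because a degree of longitude gets shorter away from the equator, treat the result as an upper bound rather than an exact epsilon.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (same value used in this answer)


def degrees_to_meters(angle_deg: float) -> float:
    """Arc length in meters subtended by angle_deg along a great circle."""
    return math.radians(angle_deg) * EARTH_RADIUS_M


# eps read off the K-distance graph in the question (assumed to be in degrees)
eps_m = degrees_to_meters(0.005)
print(f"{eps_m:.0f} m")
```

So 0.005 degrees corresponds to roughly 556 m of arc at most.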

from sklearn.metrics.pairwise import haversine_distances
from math import radians
bsas = [-34.83333,-58.5166646]
paris = [49.0083899664,2.53844117956]
bsas_in_radians = [radians(_) for _ in bsas]
paris_in_radians = [radians(_) for _ in paris]
result = haversine_distances([bsas_in_radians,paris_in_radians])
result * 6371000/1000  # multiply by Earth radius to get kilometers

The snippet above comes from the scikit-learn documentation:

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.haversine_distances.html
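
Putting the two points together, here is a minimal sketch of how the K-distance curve could be built directly in meters, and how an epsilon chosen in meters could then be passed to DBSCAN. It is not from the original answer. It assumes the input is an array of [latitude, longitude] pairs in degrees, and the helper names and sample points are made up for illustration. Both NearestNeighbors and DBSCAN accept metric="haversine" with the ball tree algorithm and expect coordinates in radians.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def k_distances_in_meters(lat_lon_deg, k=4):
    """Sorted distance (in meters) from each point to the farthest of its k nearest neighbors."""
    coords_rad = np.radians(np.asarray(lat_lon_deg, dtype=float))
    nn = NearestNeighbors(n_neighbors=k, metric="haversine", algorithm="ball_tree")
    nn.fit(coords_rad)
    # Querying the training data returns each point itself as its first
    # neighbor (distance 0), so the last column is the k-th neighbor
    # counting the point itself, like DBSCAN's min_samples does.
    distances, _ = nn.kneighbors(coords_rad)
    return np.sort(distances[:, -1]) * EARTH_RADIUS_M  # radians -> meters


def dbscan_with_eps_in_meters(lat_lon_deg, eps_m, min_samples=4):
    """Run DBSCAN with the haversine metric, taking eps in meters."""
    coords_rad = np.radians(np.asarray(lat_lon_deg, dtype=float))
    model = DBSCAN(
        eps=eps_m / EARTH_RADIUS_M,  # meters -> radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords_rad)


# Made-up sample: two tight groups (near Buenos Aires and Paris) plus one outlier
points = [
    [-34.8333, -58.5167], [-34.8335, -58.5169], [-34.8331, -58.5165], [-34.8334, -58.5163],
    [49.0084, 2.5384], [49.0086, 2.5386], [49.0082, 2.5382], [49.0085, 2.5380],
    [0.0, 0.0],
]

k_dist_m = k_distances_in_meters(points, k=4)
labels = dbscan_with_eps_in_meters(points, eps_m=200, min_samples=4)
print(k_dist_m)
print(labels)
```

Plotting k_dist_m with plt.plot, as in the question, would give the same elbow graph with a y-axis in meters, so the epsilon read off the elbow can be passed straight to eps_m. For the question's data, the input would be firms[['y','x']], assuming y holds latitude and x holds longitude.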
