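1. What is H3?
H3 partitions the Earth's surface into identifiable cells: it encodes a latitude/longitude pair as the index of a cell in a hexagonal grid.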
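To make the idea of a "grid index" a little more concrete, the sketch below unpacks a few fields from an H3 cell index with plain bit operations. It assumes the 64-bit layout described in the H3 documentation (4 mode bits at offset 59, 4 resolution bits at offset 52, 7 base-cell bits at offset 45). The helper name `describe_h3_index` is made up for illustration, and the sample index is one commonly seen in H3 examples. In real code the h3 library exposes this information directly.

```python
def describe_h3_index(h3_hex):
    """Unpack mode, resolution and base cell from an H3 index given as a hex string.

    Assumes the documented 64-bit H3 layout; this is an illustrative helper,
    not part of the h3 library.
    """
    value = int(h3_hex, 16)
    return {
        'mode': (value >> 59) & 0xF,         # 1 means "cell"
        'resolution': (value >> 52) & 0xF,   # grid level, 0 (coarse) to 15 (fine)
        'base_cell': (value >> 45) & 0x7F,   # which resolution-0 cell it descends from
    }

info = describe_h3_index('8928308280fffff')
print(info)
```

The remaining low bits hold one 3-bit digit per resolution level, which is how a single integer can identify a cell at any level of the grid.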
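2. Why use H3?
2.1 Shortcomings of GeoHash
- Cell shapes differ from one precision level to the next, and the change in cell size between levels is sometimes small and sometimes large
- Cells at different latitudes can differ considerably in ground area
- The 8 neighboring cells are not all at the same distance from the center cell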
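The first two points can be checked with a little arithmetic. The sketch below is an approximation on a spherical Earth, not a GeoHash implementation. It relies on the fact that each GeoHash character carries 5 bits, interleaved between longitude and latitude starting with longitude. The function names and the radius constant are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (spherical approximation)

def geohash_cell_degrees(precision):
    """Return (lng_span, lat_span) in degrees of a geohash cell at this precision."""
    total_bits = 5 * precision          # each base-32 character carries 5 bits
    lng_bits = (total_bits + 1) // 2    # bits alternate, starting with longitude
    lat_bits = total_bits // 2
    return 360.0 / 2 ** lng_bits, 180.0 / 2 ** lat_bits

def geohash_cell_meters(precision, latitude_deg):
    """Approximate (width, height) in meters of a cell centered at latitude_deg."""
    lng_span, lat_span = geohash_cell_degrees(precision)
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    # East-west extent shrinks with the cosine of the latitude
    width = lng_span * meters_per_degree * math.cos(math.radians(latitude_deg))
    height = lat_span * meters_per_degree
    return width, height

for precision in (5, 6):
    for latitude in (0, 60):
        w, h = geohash_cell_meters(precision, latitude)
        print(f'precision={precision} lat={latitude}: {w:,.0f} m x {h:,.0f} m')
```

Odd precisions give cells that are square in degrees, even precisions give 2:1 rectangles, and at 60 degrees latitude a cell is only half as wide on the ground as at the equator. The third point follows from the square-like layout: edge neighbors sit at one cell width from the center while diagonal neighbors are farther away.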
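2.2 How H3's mapping works, in brief
The interior angles of a regular polygon with $x$ sides sum to $\theta = (x-2) \times 180^\circ$, and the angles meeting at any vertex of a tiling must sum to $360^\circ$. Equating the share of $360^\circ$ taken by each polygon with a single interior angle gives $\frac{360}{y} = \frac{(x-2) \times 180}{x}$, from which every valid combination of $y$ (the number of regular polygons meeting at a vertex) and $x$ (the number of sides) can be worked out.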
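The equation above can be solved by brute force. Rearranging it gives y = 2x / (x - 2), so the sketch below simply tries each side count and keeps the ones where y comes out as a whole number. The function name and the upper bound of 12 sides are arbitrary choices for illustration; for x greater than 6 the value of y falls strictly between 2 and 3, so no further solutions exist.

```python
def regular_tilings(max_sides=12):
    """Return (x, y) pairs where y regular x-gons fit exactly around a vertex."""
    solutions = []
    for x in range(3, max_sides + 1):
        # 360 / y = (x - 2) * 180 / x  rearranges to  y = 2x / (x - 2)
        if (2 * x) % (x - 2) == 0:
            solutions.append((x, 2 * x // (x - 2)))
    return solutions

print(regular_tilings())
```

Only the triangle, the square and the hexagon survive the check.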
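Of the regular polygons that can tile a plane this way (triangle, square and hexagon), the hexagon has the most sides and is the closest to a circle, so in theory it is the best choice in some scenarios. H3 simply drops the traditional map projection and tiles hexagons (plus a handful of pentagons) directly over the globe, using a multi-level grid.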
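To get a feel for the multi-level grid, the sketch below estimates how many cells exist at a given resolution and how large an average cell is. It uses the cell-count formula from the H3 documentation (2 + 120 × 7^r, pentagons included) and assumes a spherical Earth of radius 6371.0072 km. The function names are illustrative, and the areas are averages rather than the exact area of any particular cell.

```python
import math

AUTHALIC_RADIUS_KM = 6371.0072  # assumed sphere radius for the area estimate

def h3_cell_count(resolution):
    """Total number of H3 cells at a resolution (0 to 15), pentagons included."""
    return 2 + 120 * 7 ** resolution

def h3_average_cell_area_km2(resolution):
    """Earth's surface area shared evenly among all cells at this resolution."""
    earth_area_km2 = 4 * math.pi * AUTHALIC_RADIUS_KM ** 2
    return earth_area_km2 / h3_cell_count(resolution)

for res in (0, 7, 9):
    print(res, h3_cell_count(res), round(h3_average_cell_area_km2(res), 4))
```

Each step down the hierarchy multiplies the number of cells by roughly 7. Resolution 7, the level used in the script later in this post, works out to cells of a few square kilometers on average.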
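3. What are the main applications of H3?
- Optimizing ride pricing and dispatch (dynamic pricing)
- Visualizing and mining map-based spatial data
- Analyzing and optimizing an entire market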
4. Uber H3 in practice: clustering UK traffic accident points
Run the script below in a notebook.
import numpy as np
import pandas as pd
import folium
from h3 import h3
from sklearn.cluster import DBSCAN
from folium.plugins import HeatMap
def creat_map(cluster):
    map_fig = folium.Map(zoom_start=12)

    def color_choose(cnt):
        # Map an accident count to a shade of red
        # (not used by the fixed fill color below)
        color_list = ['#FFC1C1', '#EEB4B4', '#FF6A6A', '#EE6363', '#CD5555', '#8B3A3A']
        if cnt <= 14:
            return color_list[0]
        elif cnt <= 17:
            return color_list[1]
        elif cnt <= 21:
            return color_list[2]
        elif cnt <= 25:
            return color_list[3]
        elif cnt <= 30:
            return color_list[4]
        else:
            return color_list[5]

    # Draw one polygon per H3 cell
    for item in cluster.values():
        points = item['geom']
        ac_cnt = item['count']
        tooltip = f'{ac_cnt} accidents'
        map_fig.add_child(folium.vector_layers.Polygon(
            locations=points, tooltip=tooltip, fill=True,
            fill_color='#ff0000', fill_opacity=0.4, weight=2, opacity=0.7))

    # Fit the map to the bounding box of the clustered points (reads the global df)
    max_lat = df.Latitude.max()
    min_lat = df.Latitude.min()
    max_lon = df.Longitude.max()
    min_lon = df.Longitude.min()
    map_fig.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
    return map_fig

file = './dftRoadSafety_Accidents_2016.csv'
# Read these ID columns as strings (np.string_ no longer exists in NumPy 2.0)
column_types = {'Accident_Index': str, 'LSOA_of_Accident_Location': str}
uk_acc = pd.read_csv(file, dtype=column_types)

# Convert latitude/longitude to H3 indexes
H3_LEVEL = 7  # H3 resolution used for every point

def lat_lng_2_h3(row):
    return h3.geo_to_h3(row['Latitude'], row['Longitude'], H3_LEVEL)
uk_acc['h3'] = uk_acc.apply(lat_lng_2_h3, axis=1)
# DBSCAN clustering
## Degrees -> radians: 1 degree = np.pi / 180 radians
uk_acc['rad_lng'] = np.radians(uk_acc['Longitude'].values)
uk_acc['rad_lat'] = np.radians(uk_acc['Latitude'].values)
eps_in_meter = 50.0
EARTH_R = 6370996.8  # Earth radius in meters
dbscan = DBSCAN(eps=eps_in_meter/EARTH_R, min_samples=10, metric='haversine')
uk_acc = uk_acc.loc[~uk_acc['rad_lat'].isna(), :].reset_index(drop=True)
uk_acc['cluster'] = dbscan.fit_predict(uk_acc[['rad_lat', 'rad_lng']])
df = uk_acc[(uk_acc.cluster != -1)].reset_index(drop=True).copy()
uk_acc['cluster'].value_counts()

# Plot the aggregated data
clusters = dict()
for idx, row in df.iterrows():
    key = row['h3']
    if key in clusters:
        clusters[key]['count'] += 1
    else:
        clusters[key] = {'count': 1, 'geom': h3.h3_to_geo_boundary(h=key)}
relevat_clusters = {k: v for (k, v) in clusters.items() if v['count'] >= 10}
creat_map(relevat_clusters)
# Heat map
from folium.plugins import HeatMap
map_hooray = folium.Map(location=df.loc[0, ['Latitude', 'Longitude']].tolist(), zoom_start=14)
HeatMap(df[['Latitude', 'Longitude']]).add_to(map_hooray)
for idx in range(df.shape[0]):
    folium.Marker(df.loc[idx, ['Latitude', 'Longitude']].tolist(),
                  tooltip=df.loc[idx, 'cluster'].tolist()).add_to(map_hooray)
map_hooray
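One step in the script that is easy to miss is the unit of `eps`. With `metric='haversine'`, scikit-learn's DBSCAN works on [latitude, longitude] pairs in radians and measures distance as an angle, which is why the script divides 50 meters by the Earth's radius. The sketch below reproduces that conversion with the standard library only. The helper names and the two sample points are hypothetical and chosen purely for illustration.

```python
import math

EARTH_R = 6370996.8  # Earth radius in meters, same value as in the script

def haversine_radians(lat1, lng1, lat2, lng2):
    """Great-circle angle in radians between two points given in radians."""
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2
    return 2 * math.asin(math.sqrt(a))

def meters_to_eps(meters, radius=EARTH_R):
    """Convert a ground distance in meters to an angular eps in radians."""
    return meters / radius

# Two hypothetical points in central London, roughly 30 m apart north-south
p1 = (math.radians(51.50000), math.radians(-0.12000))
p2 = (math.radians(51.50027), math.radians(-0.12000))

angle = haversine_radians(*p1, *p2)
eps = meters_to_eps(50.0)
print(f'distance = {angle * EARTH_R:.1f} m, within eps: {angle <= eps}')
```

Multiplying the angle by the radius brings it back to meters, so the two points count as neighbors under a 50 m `eps`.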
Reference:
https://www.biaodianfu.com/uber-h3.html
- Some of the scripts from the referenced link have been modified