id: wiki-2026-0508-spatial-data-analysis
title: Spatial Data Analysis
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Geospatial-Analysis, GIS-Analysis
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: geospatial, gis, spatial, analysis, statistics
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: geopandas-shapely-pysal

Spatial Data Analysis

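In one line

"Location matters" is the gist of Tobler's First Law: everything is related to everything else, but near things are more related than distant things. Spatial data analysis measures and models this spatial autocorrelation. The field traces back to Snow's 1854 cholera map and, as of 2026, is central to epidemiology, urban planning, climate science, and autonomous driving.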

Core concepts

Data types

  • Vector: point (city), line (road), polygon (district) — GeoJSON, Shapefile, GeoParquet.
  • Raster: gridded (satellite imagery, DEM, climate) — GeoTIFF, Zarr, COG (Cloud-Optimized GeoTIFF).
  • Network: routable graphs (road, transit) — OSMnx, pgRouting.
  • Trajectory: time-stamped points — MovingPandas.

Operations

  • Spatial join: match each point to the polygon that contains it.
  • Buffer: the region within distance d of a geometry.
  • Overlay: intersection, union, difference.
  • Reprojection: converting between coordinate reference systems (CRS) such as WGS84, UTM, and Web Mercator.
  • Aggregation: statistics per pixel or per zone.

Statistics

  • Moran's I: global spatial autocorrelation, a direct measure of Tobler's law.
  • Getis-Ord G*: local hotspots; finds where clusters are located.
  • Variogram / Kriging: spatial interpolation (geostatistics).
  • Geographically Weighted Regression (GWR): spatially varying coefficients.

💻 Patterns

1. GeoPandas — vector load + filter

import geopandas as gpd
gdf = gpd.read_file("districts.geojson").to_crs("EPSG:3857")  # Web Mercator
# na=False keeps rows with a missing name from breaking the boolean mask
seoul = gdf[gdf["name"].str.contains("Seoul", na=False)]
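
To make the reprojection step less of a black box, here is a minimal sketch of the forward Web Mercator (EPSG:3857) formula that `to_crs` applies to WGS84 coordinates. The function name and the sample coordinate are illustrative assumptions, not part of any library; in practice the conversion should be left to GeoPandas / pyproj.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius used by EPSG:3857

def wgs84_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Project a lon/lat pair in degrees to Web Mercator x/y in meters (hypothetical helper)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

# Approximate coordinates of central Seoul, used only as sample input
x, y = wgs84_to_web_mercator(126.978, 37.5665)
print(round(x), round(y))
```

The y term grows without bound toward the poles, which is why Web Mercator is cut off near 85 degrees latitude and why it distorts distances and areas away from the equator.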

2. Spatial join — points in polygons

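import pandas as pd
df = pd.read_csv("incidents.csv")  # assumes "lon"/"lat" columns in WGS84
points = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326").to_crs(gdf.crs)
joined = gpd.sjoin(points, gdf, how="left", predicate="within")
counts = joined.groupby("district").size()  # assumes gdf has a "district" column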
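
What `sjoin` with `predicate="within"` followed by a group-by computes can be sketched in plain Python with a ray-casting point-in-polygon test. This is only an illustration of the idea: the zone names, coordinates, and helper functions are hypothetical, and real code should rely on GeoPandas, which uses a spatial index instead of the nested loop shown here.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices without a repeated closing vertex."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points_per_zone(points, zones):
    """Assign each point to the first zone containing it and count per zone."""
    counts = {name: 0 for name in zones}
    for x, y in points:
        for name, ring in zones.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break
    return counts

# Two hypothetical unit-square zones side by side
zones = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
points = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (3.0, 3.0)]  # last point falls outside both
counts = count_points_per_zone(points, zones)
print(counts)
```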

3. Buffer + overlay

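roads = gpd.read_file("roads.shp")
roads = roads.to_crs(roads.estimate_utm_crs())  # buffer distances need a CRS in meters
buffer_500m = gpd.GeoDataFrame(geometry=roads.buffer(500), crs=roads.crs)  # overlay needs GeoDataFrames
flood = gpd.read_file("flood.geojson").to_crs(roads.crs)  # both layers must share one CRS
risk = gpd.overlay(buffer_500m, flood, how="intersection")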

4. Moran's I (PySAL)

from libpysal.weights import Queen
from esda.moran import Moran
w = Queen.from_dataframe(gdf)
moran = Moran(gdf["income"], w)
print(moran.I, moran.p_sim)  # autocorrelation + permutation p-value
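
As a rough check on what `moran.I` reports, the statistic can be computed by hand: I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The sketch below assumes binary contiguity weights on a hypothetical row of six cells; PySAL's `Moran` row-standardizes weights by default, so its values can differ from this simplified version.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a dict {(i, j): w_ij} of spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom

def chain_weights(n):
    """Binary contiguity for n cells in a row: each cell neighbors the adjacent cells."""
    w = {}
    for i in range(n - 1):
        w[(i, i + 1)] = 1.0
        w[(i + 1, i)] = 1.0
    return w

w_chain = chain_weights(6)
clustered = [1, 1, 1, 9, 9, 9]      # similar values sit next to each other
alternating = [1, 9, 1, 9, 1, 9]    # neighbors are always dissimilar

print(round(morans_i(clustered, w_chain), 3))    # positive: spatial clustering
print(round(morans_i(alternating, w_chain), 3))  # negative: checkerboard-like dispersion
```

Values near zero would indicate no spatial structure; the permutation p-value from PySAL tells you whether the observed I is distinguishable from that.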

5. Local hotspot (Getis-Ord G*)

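from esda.getisord import G_Local
# star=True includes each location in its own neighborhood, which is what makes this G* rather than G
g = G_Local(gdf["crime"], w, transform="R", star=True)
gdf["z"] = g.Zs
# z > 2.58 → hotspot at the 99% confidence level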
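
For intuition about the z-scores, here is a minimal sketch of the analytical Gi* formula from Ord and Getis, assuming binary weights and a hypothetical row of ten cells. It is not how `esda` is implemented (which also offers permutation-based inference); the function name and data are illustrative only.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights; each cell counts as its own neighbor."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        idx = set(neighbors[i]) | {i}   # the "star": include the focal cell itself
        w_sum = len(idx)                # sum of binary weights (equals the sum of squared weights)
        local_sum = sum(values[j] for j in idx)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores

# Hypothetical data: seven low cells followed by a cluster of three high cells
values = [1, 1, 1, 1, 1, 1, 1, 9, 9, 9]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(values)] for i in range(len(values))}
z = getis_ord_gi_star(values, neighbors)
hottest = z.index(max(z))
print(hottest, [round(v, 2) for v in z])
```

The highest score lands in the middle of the high-value cluster, because that cell and both of its neighbors are all high; cells at the low end get negative scores (cold spots).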

6. Raster — Rasterio + xarray

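import rioxarray
da = rioxarray.open_rasterio("landsat.tif", masked=True)
# NDVI = (NIR - red) / (NIR + red); Landsat 5/7 numbering: band 4 = NIR, band 3 = red
# (Landsat 8/9 use band 5 for NIR and band 4 for red)
ndvi = (da.sel(band=4) - da.sel(band=3)) / (da.sel(band=4) + da.sel(band=3))
ndvi.rio.to_raster("ndvi.tif")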
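
The band arithmetic above hides one edge case: pixels where NIR + red is zero. A minimal per-pixel sketch, using hypothetical reflectance values and helper names, shows the formula and one way to handle that case. With masked xarray data such pixels typically end up as NaN or infinity instead of raising an error.

```python
def ndvi_pixel(nir, red):
    """NDVI for a single pixel; returns None when both bands are zero (no signal)."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

def ndvi_grid(nir_rows, red_rows):
    """Apply ndvi_pixel element-wise to two equally shaped nested lists."""
    return [
        [ndvi_pixel(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_rows, red_rows)
    ]

# Hypothetical 2x2 reflectance tiles
nir = [[0.5, 0.2], [0.0, 0.1]]
red = [[0.1, 0.2], [0.0, 0.3]]
result = ndvi_grid(nir, red)
print(result)
```

Dense vegetation gives values well above zero, bare surfaces sit near zero, and water tends to be negative.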

7. Kriging interpolation

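from pykrige.ok import OrdinaryKriging
# x, y, z: 1-D arrays of sample coordinates and measured values
# gridx, gridy: 1-D arrays of the grid coordinates to interpolate onto
ok = OrdinaryKriging(x, y, z, variogram_model="spherical")
grid_z, _ = ok.execute("grid", gridx, gridy)  # second return value is the kriging variance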
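
Kriging fits a variogram model (here "spherical") to the empirical semivariogram of the samples, gamma(h) = 1 / (2 N(h)) * sum of (z_i - z_j)^2 over the N(h) pairs separated by roughly distance h. The following is a minimal sketch of that empirical step only, with hypothetical sample data and a simple fixed-width binning scheme; PyKrige performs its own binning and model fitting internally.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {lag_bin_center: gamma} computed from all pairwise squared differences."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])   # separation distance of the pair
            b = int(h // bin_width)               # index of the lag bin
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {(b + 0.5) * bin_width: sums[b] / (2 * counts[b]) for b in sorted(sums)}

# Hypothetical samples along a line with a steady trend in the measured value
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [0.0, 1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(points, values, bin_width=1.0)
print(gamma)
```

Semivariance that rises with lag distance is the signature of spatial dependence: nearby samples are more alike than distant ones, which is exactly what kriging exploits when weighting neighbors.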

8. STAC + COG (cloud-native, 2026)

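import pystac_client
import stackstac
catalog = pystac_client.Client.open("https://earth-search.aws.element84.com/v1")
# bbox = [min_lon, min_lat, max_lon, max_lat], defined by the caller
items = catalog.search(collections=["sentinel-2-l2a"], bbox=bbox).item_collection()
# Earth Search v1 keys assets by common name: "red" is B04, "nir" is B08
stack = stackstac.stack(items, assets=["red", "nir"])  # lazy, Dask-backed xarray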

9. H3 hexagonal indexing (Uber)

import h3
# h3 v4 API: the resolution is the third positional argument (named res, not resolution)
hexes = [h3.latlng_to_cell(lat, lng, 9) for lat, lng in coords]
# Aggregate by hex ID to get zone-based statistics

Decision criteria

| Situation | Approach |
|---|---|
| Vector ops | GeoPandas / Shapely |
| Raster ops | Rasterio / rioxarray / xarray |
| Cloud-scale (TB+) | STAC + COG + Dask |
| Hotspot detection | Getis-Ord G* |
| Continuous interpolation | Kriging |
| Discrete zoning / aggregation | H3 / S2 cells |
| Routing | OSMnx / pgRouting |
| Visualization | Folium, Kepler.gl, Deck.gl |

Default: load with GeoPandas in EPSG:4326 → reproject to a projected CRS (UTM/3857) before distance-based operations → ESDA (PySAL) for statistics.

🔗 Graph

🤖 LLM usage

When to use: disambiguating place names during geocoding, writing narrative descriptions of spatial patterns, interpreting OSM tags. When not to use: numerical kriging or projection; use a dedicated geospatial library for those.

Anti-patterns

  • Mixing CRS without conversion: combining meters and degrees produces silent errors.
  • Web Mercator for area calculations: distortion grows at high latitudes; use an equal-area projection (Mollweide, Equal Earth) instead.
  • Ignoring spatial autocorrelation in regression: this violates the OLS independence assumption; use GWR or a spatial lag model.
  • Rasterizing then re-vectorizing: loses precision; stay in vector form whenever vector operations are possible.
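
The first two anti-patterns can be quantified with a little trigonometry. The sketch below assumes a spherical Earth (mean radius 6371008.8 m) and uses hypothetical helper names: Web Mercator stretches lengths by 1 / cos(latitude), so small areas are inflated by 1 / cos²(latitude), and one degree of longitude shrinks with cos(latitude), which is why degree values cannot be treated as a fixed number of meters.

```python
import math

def web_mercator_area_inflation(lat_deg):
    """Factor by which Web Mercator inflates small areas at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

def meters_per_degree_lon(lat_deg, radius_m=6371008.8):
    """Approximate ground length of one degree of longitude on a spherical Earth."""
    return math.radians(1.0) * radius_m * math.cos(math.radians(lat_deg))

for lat in (0, 37.5, 60):
    print(lat, round(web_mercator_area_inflation(lat), 2), round(meters_per_degree_lon(lat)))
```

At 60 degrees latitude the area factor is about 4, so an area measured in EPSG:3857 there is roughly four times too large, while one degree of longitude covers only about half the ground distance it does at the equator.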

🧪 Verification / duplicates

  • Verified (PySAL docs; Geocomputation with Python, Lovelace et al.; USGS standards).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: vector/raster/STAC + ESDA patterns. |