Skip to main content

GeoParquet

Modern columnar storage format combining Apache Parquet efficiency with geospatial capabilities.

Driver Metadata

PropertyValue
Short NameGeoParquet
Long NameGeoParquet (Parquet + GeoArrow)
Supported Sincev0.3.0
Status✅ Stable
File Extension.parquet, .geoparquet

Driver Capabilities

  • Read Support: Full
  • Write Support: Full
  • Info Support: Yes
  • Geometry Format: WKB (Well-Known Binary) with GeoArrow types
  • Geometry Types: All standard types with WKB encoding
  • Metadata: CRS, bounding boxes, encoding information

Format Description

GeoParquet combines Apache Parquet's columnar storage with WKB-encoded geometries and GeoArrow types for geospatial data.

Reading Data

Get Information

geoetl-cli info data.parquet --driver GeoParquet

Output:

Dataset: data.parquet
Driver: GeoParquet

=== Geometry Columns ===
+----------+------------------+-----+
| Column | Extension | CRS |
+----------+------------------+-----+
| geometry | geoarrow.wkb | N/A |
+----------+------------------+-----+

=== Fields ===
+------------+--------+----------+
| Field | Type | Nullable |
+------------+--------+----------+
| name | String | Yes |
| population | Int64 | Yes |
+------------+--------+----------+

Convert from GeoParquet

# To GeoJSON
geoetl-cli convert \
--input data.parquet \
--output data.geojson \
--input-driver GeoParquet \
--output-driver GeoJSON
# To CSV
geoetl-cli convert \
--input data.parquet \
--output data.csv \
--input-driver GeoParquet \
--output-driver CSV

Writing Data

Convert to GeoParquet

# From GeoJSON
geoetl-cli convert \
--input data.geojson \
--output data.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet
# From CSV
geoetl-cli convert \
--input data.csv \
--output data.parquet \
--input-driver CSV \
--output-driver GeoParquet \
--geometry-column geometry

Write Options

Currently, GeoETL writes GeoParquet with optimal defaults. Future versions may support:

  • Compression codec selection (SNAPPY, GZIP, ZSTD)
  • Row group size
  • Column encoding
  • Metadata customization

Examples

Convert GeoJSON to GeoParquet

geoetl-cli convert \
-i data.geojson \
-o data.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet

Convert CSV to GeoParquet

geoetl-cli convert \
-i data.csv \
-o data.parquet \
--input-driver CSV \
--output-driver GeoParquet \
--geometry-column geometry

Convert GeoParquet to GeoJSON

geoetl-cli convert \
-i data.parquet \
-o data.geojson \
--input-driver GeoParquet \
--output-driver GeoJSON

Troubleshooting

CSV Export with bbox Columns

Problem: Error when exporting GeoParquet (created by ogr2ogr) to CSV

Cause: ogr2ogr GeoParquet files may have bbox struct columns that CSV cannot represent

Solution: Roundtrip via GeoJSON

# This may fail
geoetl-cli convert -i ogr_created.parquet -o output.csv \
--input-driver GeoParquet --output-driver CSV

# Use roundtrip instead
geoetl-cli convert -i ogr_created.parquet -o temp.geojson \
--input-driver GeoParquet --output-driver GeoJSON

geoetl-cli convert -i temp.geojson -o output.csv \
--input-driver GeoJSON --output-driver CSV

Large File Performance

Problem: Slow processing or high memory usage

Solution: GeoParquet uses streaming architecture - ensure sufficient disk I/O:

# Use verbose mode to monitor progress
geoetl-cli -v convert \
-i large.geojson \
-o large.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet

Format Specification

See Also