Skip to main content

geoetl-cli convert

Convert geospatial data between different formats.

Synopsis

geoetl-cli convert [OPTIONS] \
--input <PATH> \
--output <PATH> \
--input-driver <DRIVER> \
--output-driver <DRIVER>

Description

The convert command transforms geospatial data from one format to another. It supports converting between GeoJSON, CSV (with WKT geometries), and GeoParquet formats.

GeoETL uses a streaming architecture for constant memory usage regardless of dataset size, making it suitable for large-scale data conversion.

Required Options

OptionShortTypeDescription
--input-ipathInput file path
--output-opathOutput file path
--input-driverstringInput format driver (see supported drivers)
--output-driverstringOutput format driver (see supported drivers)

Optional Options

Data Processing Options

OptionTypeDescription
--sqlstringSQL query to apply to input dataset (table name is inferred from input filename)
--table-namestringCustom table name for SQL queries (overrides the inferred name)
--batch-sizenumberRows per batch for processing (default: 8192)
--read-partitionsnumberNumber of partitions for reading (default: 1)
--write-partitionsnumberNumber of partitions for writing (default: 1)

CSV-Specific Options

OptionTypeRequired ForDescription
--geometry-columnstringCSV inputName of column containing WKT geometries
--geometry-typestringCSV input (optional)Geometry type hint (Point, LineString, Polygon, etc.)

Global Options

OptionShortDescription
--help-hPrint help information
--verbose-vEnable verbose output

Examples

Example 1: GeoJSON to CSV

geoetl-cli convert \
--input cities.geojson \
--output cities.csv \
--input-driver GeoJSON \
--output-driver CSV

Example 2: CSV to GeoJSON

geoetl-cli convert \
--input data.csv \
--output data.geojson \
--input-driver CSV \
--output-driver GeoJSON \
--geometry-column geometry
REQUIRED

The --geometry-column parameter is required for CSV input to specify which column contains WKT geometries.

Example 3: GeoJSON to GeoParquet

geoetl-cli convert \
--input large_dataset.geojson \
--output large_dataset.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet

Example 4: Short Form

geoetl-cli convert \
-i input.geojson \
-o output.csv \
--input-driver GeoJSON \
--output-driver CSV

Example 5: Verbose Output

geoetl-cli -v convert \
-i large_file.geojson \
-o large_file.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet

Output:

INFO geoetl_cli: Converting large_file.geojson to large_file.parquet
INFO geoetl_core::operations: Starting conversion:
INFO geoetl_core::operations: Input: large_file.geojson (Driver: GeoJSON)
INFO geoetl_core::operations: Output: large_file.parquet (Driver: GeoParquet)
INFO geoetl_core::operations: Read 1000 record batch(es)
INFO geoetl_core::operations: Total rows: 1000000
INFO geoetl_core::operations: Conversion completed successfully

Example 6: Custom Geometry Column

geoetl-cli convert \
-i database_export.csv \
-o output.geojson \
--input-driver CSV \
--output-driver GeoJSON \
--geometry-column wkt_geom

Use case: When CSV uses custom column name like wkt_geom, geom, or the_geom.

Example 7: Filter Data with SQL Query

geoetl-cli convert \
-i cities.csv \
-o big_cities.csv \
--input-driver CSV \
--output-driver CSV \
--geometry-column wkt \
--sql "SELECT * FROM cities WHERE population > 1000000"

Use case: Export only records matching specific criteria (e.g., large cities, specific regions).

Note: The table name cities is automatically inferred from the input filename cities.csv.

Example 8: Select Specific Columns

geoetl-cli convert \
-i full_dataset.geojson \
-o simplified.geojson \
--input-driver GeoJSON \
--output-driver GeoJSON \
--sql "SELECT name, population, geometry FROM full_dataset"

Use case: Reduce file size by excluding unnecessary columns.

Note: Table name full_dataset matches the input filename.

Example 9: Aggregate Data with SQL

geoetl-cli convert \
-i parcels.csv \
-o zone_summary.csv \
--input-driver CSV \
--output-driver CSV \
--geometry-column wkt \
--sql "SELECT zone_type, COUNT(*) as count, SUM(area) as total_area FROM parcels GROUP BY zone_type"

Use case: Generate summary statistics from spatial data.

Example 10: Sort and Limit Results

geoetl-cli convert \
-i buildings.geojson \
-o top_10_tallest.geojson \
--input-driver GeoJSON \
--output-driver GeoJSON \
--sql "SELECT * FROM buildings ORDER BY height DESC LIMIT 10"

Use case: Extract top N records based on a criterion.

Example 11: Custom Table Name

geoetl-cli convert \
-i very_long_complex_filename_2024_v2.csv \
-o output.csv \
--input-driver CSV \
--output-driver CSV \
--geometry-column wkt \
--table-name data \
--sql "SELECT * FROM data WHERE category = 'residential'"

Use case: Simplify SQL queries when dealing with complex filenames.

Note: --table-name data overrides the inferred table name very_long_complex_filename_2024_v2.

Common Workflows

Workflow 1: Optimize for Cloud Storage

# Convert to GeoParquet for compression
geoetl-cli convert \
-i source.geojson \
-o optimized.parquet \
--input-driver GeoJSON \
--output-driver GeoParquet

Workflow 2: Filter and Transform Pipeline

# Extract specific regions, select columns, and convert format
geoetl-cli convert \
-i global_cities.csv \
-o north_america_cities.geojson \
--input-driver CSV \
--output-driver GeoJSON \
--geometry-column wkt \
--sql "SELECT name, country, population, wkt FROM global_cities WHERE country IN ('USA', 'Canada', 'Mexico') ORDER BY population DESC"

Workflow 3: Data Reduction for Visualization

# Simplify large datasets for web maps
geoetl-cli convert \
-i all_buildings.geojson \
-o significant_buildings.geojson \
--input-driver GeoJSON \
--output-driver GeoJSON \
--sql "SELECT name, type, geometry FROM all_buildings WHERE floor_count > 5 OR landmark = true"

Driver Compatibility

Supported Conversions

From ↓ / To →GeoJSONCSVGeoParquet
GeoJSON
CSV
GeoParquet

All format combinations are supported.

Error Messages

Error: Driver not found

Message: Input driver 'geojson' not found

Cause: Driver name is case-sensitive or misspelled

Solution: Use exact driver name from geoetl-cli drivers:

# ❌ Wrong
--input-driver geojson

# ✅ Correct
--input-driver GeoJSON

Error: File not found

Message: No such file or directory

Cause: Input file doesn't exist or path is incorrect

Solution:

# Check file exists
ls -l input.geojson

# Use absolute path
geoetl-cli convert -i /full/path/to/input.geojson ...

Error: Missing geometry column

Message: Missing required option: geometry-column

Cause: CSV input requires --geometry-column parameter

Solution:

geoetl-cli convert \
-i data.csv \
-o data.geojson \
--input-driver CSV \
--output-driver GeoJSON \
--geometry-column geometry # Add this

Error: Permission denied

Message: Permission denied

Cause: No write permission for output directory

Solution:

# Check permissions
ls -ld output/

# Use different directory
geoetl-cli convert -i input.geojson -o /tmp/output.csv ...

Debugging

Enable Verbose Logging

geoetl-cli -v convert ...

Enable Debug Logging

RUST_LOG=debug geoetl-cli convert ...

Test with Small Sample

# Extract first 100 features
head -n 100 large.geojson > sample.geojson

# Test conversion
geoetl-cli convert -i sample.geojson -o test.csv \
--input-driver GeoJSON --output-driver CSV

See Also

Commands:

  • info - Display dataset information
  • drivers - List available drivers

Tutorials:

Drivers:

Reference:


Need help? Run geoetl-cli convert --help