MongoDB and geodata part 1 – from Shapefile to MongoDB 3.2

MongoDB (3.2) is a kind of database hipster at the moment, with improving support for spatial data. So it was time for me to discover some of its features concerning spatial data. As a GIS user, my first goal was to get a larger, simple (point) dataset into MongoDB. Part 1 covers this topic; part 2 will cover some spatial operations within MongoDB. I also want to run some performance comparisons between PostgreSQL/PostGIS and MongoDB for geodata.

[Figure: Geodata in QGIS and MongoDB]

After finishing my personal fight with MongoDB in Docker and mongoimport, it was time to get some geodata from the desktop GIS into MongoDB. As a sample dataset I used the address points of Tirol (Open Government Data). MongoDB works with JSON, so I decided to give GeoJSON a try. And YES, MongoDB supports GeoJSON and has a nice import tool called “mongoimport”, but it does not accept the GeoJSON file as written by a desktop GIS or ogr2ogr. My workflow:

  1. Transform the geodata to EPSG:4326 (GCS WGS84) and convert it to GeoJSON format
  2. Clean up the GeoJSON for mongoimport
  3. Use mongoimport to import the modified GeoJSON
  4. Create spatial index (2dsphere) – we need it for spatial operations in the database
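
Step 1 matters because a 2dsphere index only accepts longitudes between -180 and 180 and latitudes between -90 and 90, so coordinates left in a projected system (metres) would be rejected. A minimal Python sketch of a sanity check you could run on the features before importing; the function names are my own illustration, not part of any library:

```python
def lon_lat_in_range(coordinates):
    """Return True if a GeoJSON position looks like WGS84 degrees.

    GeoJSON stores positions as [longitude, latitude].
    """
    lon, lat = coordinates[0], coordinates[1]
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0


def count_out_of_range(features):
    """Count Point features whose coordinates cannot be WGS84 lon/lat."""
    return sum(
        1
        for feature in features
        if feature["geometry"]["type"] == "Point"
        and not lon_lat_in_range(feature["geometry"]["coordinates"])
    )
```

If `count_out_of_range` returns anything other than 0 for your features, the data was probably not reprojected to EPSG:4326.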

In detail…

  • Transform the geodata to EPSG:4326 (GCS WGS84)
  • Convert the geodata to GeoJSON (e.g. with ArcGIS or QGIS), or with ogr2ogr:
[flo@localhost adressentirol]$ ogr2ogr -f GeoJSON adrtirol.json adr_epsg4326.shp
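  • The “original” GeoJSON files did not work with mongoimport: it expects a stream of separate JSON documents, whereas a GeoJSON file is one single FeatureCollection object wrapping all features. You can find some hints on the web, but they differ. In my case I had to do the following GeoJSON cleanup:
    1) Remove the top part of the file up to the point where the feature objects start (the FeatureCollection header including the opening bracket of "features")

[Screenshot: original GeoJSON header to remove]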

    2) Remove the comma (“,”) at the end of every line (search/replace, e.g. with nano, Notepad++, …). This takes some time with a file of more than 170,000 lines 😉

    3) Remove the last two closing brackets (“]” and “}”) at the bottom of the file

[Screenshot: original GeoJSON footer to remove]

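The final “GeoJSON” (one feature object per line, without the FeatureCollection wrapper) looked like this:

[Screenshot: cleaned-up GeoJSON for mongoimport]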
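
I did the three cleanup steps by hand in an editor, but they can also be scripted. The following is only a sketch in Python, under the assumption that the file is a standard FeatureCollection small enough to load into memory; the function names and the output file name are made up for illustration:

```python
import json


def features_as_lines(geojson_text):
    """Return each feature of a FeatureCollection as one compact JSON string."""
    collection = json.loads(geojson_text)
    # Dropping the wrapper and serialising feature by feature removes the
    # header, the trailing commas and the closing brackets in one go.
    return [json.dumps(feature, ensure_ascii=False)
            for feature in collection["features"]]


def convert_file(src_path, dst_path):
    """Write one feature per line to dst_path; return the number of features."""
    with open(src_path, encoding="utf-8") as src:
        lines = features_as_lines(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(lines) + "\n")
    return len(lines)
```

A call such as `convert_file("adrtirol.json", "adrtirol_mongo.json")` would then produce a file with one feature document per line, which is the layout the manual cleanup aims for.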

  • Now mongoimport worked like a charm to import the modified GeoJSON into MongoDB. Start your mongo shell and create (or switch to) a database with “use databasename”. The collection is named via the --collection option of mongoimport.
flo@XXX:/var/mongodata$ sudo mongoimport --drop --host 127.0.0.1:27017 --db geodatatest --collection adrtirol < /var/mongodata/adrtirol.json 
[sudo] password for flo: 
connected to: 127.0.0.1:27017
dropping: geodatatest.adrtirol
                        38000   12666/second
                        79100   13183/second
                        120000  13333/second
                        161100  13425/second
imported 175259 objects
  • The last step was to create a 2dsphere spatial index on the imported data within the mongo shell (replace collectionname with your collection, here adrtirol):
db.collectionname.createIndex({ "geometry" : "2dsphere" })
  • Want to check your indexes?
db.collectionname.getIndexes()
  • And now, let’s have a look at our geodata in MongoDB 🙂
db.collectionname.find()

[Screenshot: point geodata documents in MongoDB]

Part 2 is going to focus on some spatial queries with MongoDB. All information given here relates to MongoDB 3.2. MongoDB is evolving quickly and adds new features with every release (e.g. db.collection.ensureIndex() has been deprecated since 3.0 and replaced by createIndex()).