Friday, November 9, 2012

Convert Shapefile to Raster with GDAL and Python

Shapefile, i.e. vector data, often needs to be converted to raster data in order to do further analysis. At the moment I see the following methods to do this:

1) Joel Lawhead has a few python scripts using his shapefile libary, another one using the PIL library, the description available at this page.
2) On google I find reference to gdal.RasterizeLayer, but I have not found more about this yet in the documentation, some info in this post.
3) Use gdal_rasterize from command line or in a batch script
4) Use gdal_rasterize but call it from within a python script.

The following script uses the ability of python to call commands as if typed at the command line. There seem to be a few ways to do this, I follow the one referred to in lecture 7 at the previously mentioned link

Using gdal_rasterize is described here and from command line I would call for one of the polygons:

gdal_rasterize -a ICE_TYPE -where "ICE_TYPE='Open Water'" -burn 2 -l ice20120608_3575 -tr 1000 -1000 C:\...\ice20120608_3575.shp  C:\...\ice20120608_3575.tif

Calling this command from within Python it becomes:

import os
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Water\'\" -burn 2 -l ' + shapefileshortname +' -tr 1000 -1000 ' +  shapefile + ' ' + outraster)

The task consists in taking a shapefile containing polygons in one layer having different sea ice types as attributes/fields and converting this to a raster map. In addition, Spitsbergen is to be masked out (and later I need to add Greenland as well for masking out).

The resulting script:

# Importing needed modules 
import os 
# The shapefile to be rasterized: 
shapefile = 'C:\Users\max\Documents\Icecharts\Data\IceChart_2012\ice20120608_3575.shp' 
print 'Rasterize ' + shapefile 
 #get path and filename seperately 
(shapefilefilepath, shapefilename) = os.path.split(shapefile)              
#get file name without extension 
(shapefileshortname, extension) = os.path.splitext(shapefilename)          
# The land area to be masked out, also being a shapefile to be rasterized 
SvalbardCoast = 'C:\Users\max\Documents\PythonProjects\s100-landp_3575.shp' 
# The raster file to be created and receive the rasterized shapefile 
outrastername = shapefileshortname + '.tif' 
outraster = 'C:\\Users\\max\\Documents\\Icecharts\\Data\\IceChart_2012\\' + outrastername 
# Rasterize first Ice Type and at same time create file -- call gdal_rasterize commandline 
print '\n Open Water' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Water\'\" -burn 2 -l ' + shapefileshortname +' -tr 1000 -1000 ' +  shapefile + ' ' + outraster) 
# Rasterize the other Ice types, adding them to the already created file 
print '\nVery Open Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Very Open Drift Ice\'\" -b 1 -burn 3 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
print '\n Open Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Drift Ice\'\" -b 1 -burn 4 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
print '\n Close Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Close Drift Ice\'\" -b 1 -burn 5 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
print '\n Very Close Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Very Close Drift Ice\'\" -b 1 -burn 6 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
print '\n Fast Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Fast Ice\'\" -b 1 -burn 1 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
# Rasterize Spitsbergen land area on top 
print '\n SvalbardRaster' 
os.system('gdal_rasterize  -b 1 -burn 8 -l s100-landp_3575 '  +  SvalbardCoast + ' ' + outraster) 
print "\n Done"

Here is the image of the shapefile to be rasterized, each polygon has an attribute "ICE_TYPE":

the Spitsbergen land area shapefile is this one:

And the final raster looks like this -- not as colourful having values 1 to 8 and no colour table assigned:


  1. can you please explain
    how you masked out Spitsbergen shapefile from final raster file using gdal or python

  2. if you look to the left to the image, you can see that Spitsbergen is just a layer, on top of the shapefile, so it is not masked.

    But if you convert the Spitsbergen shapefile to a raster, you can use this command to burn in the mask, assuming in this case that the Spitsbergen mask has a value of 251. This command sets the icechart to 251 where the Spitsbergen mask is 251, and leaves the ice mask value where the Spitsbergen mask is not 251:

    landmask = gdal.Open(landmask_raster, gdalconst.GA_ReadOnly)
    landraster = landmask.ReadAsArray()
    icechart = numpy.where( (landraster == 251), 251, icechart)

  3. Hi Max,

    I've tried the code above but am getting no out put. Any suggestions?

    C:\Users\anythingmapping\Documents\code\Anythingmapping_WP\lcdb_taupo>gdal_rasterize -a EditAuthor -where 'EditAuthor="Terralink"' -l lcdb lcdb.shp C:\Users\anythingmapping\Documents\code\Anythingmapping_WP\geoTiff.tif
    ERROR 10: Pointer 'hDS' is NULL in 'GDALGetProjectionRef'.

    Warning 1: The input vector layer has a SRS, but the output raster dataset SRS is unknown.
    Ensure output raster dataset has the same SRS, otherwise results might be incorrect.
    0...10...20...30...40...50...60...70...80...90...100 - done.

    1. It looks like your tiff file already exists and has no projection information, or a faulty one? Could that be?

    2. It doesn't currently exist. I've tried renaming the export but I get the same result. How do I set the output SRS?

    3. The error message could indicate a faulty projection i formation in the input shapefile... Just a guess just now...