Friday, November 9, 2012

Convert Shapefile to Raster with GDAL and Python

Shapefile, i.e. vector data, often needs to be converted to raster data in order to do further analysis. At the moment I see the following methods to do this:

1) Joel Lawhead has a few python scripts using his shapefile libary, another one using the PIL library, the description available at this page.
2) On google I find reference to gdal.RasterizeLayer, but I have not found more about this yet in the documentation, some info in this post.
3) Use gdal_rasterize from command line or in a batch script
4) Use gdal_rasterize but call it from within a python script.

The following script uses the ability of python to call commands as if typed at the command line. There seem to be a few ways to do this, I follow the one referred to in lecture 7 at the previously mentioned link

Using gdal_rasterize is described here and from command line I would call for one of the polygons:

gdal_rasterize -a ICE_TYPE -where "ICE_TYPE='Open Water'" -burn 2 -l ice20120608_3575 -tr 1000 -1000 C:\...\ice20120608_3575.shp  C:\...\ice20120608_3575.tif


Calling this command from within Python it becomes:

import os
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Water\'\" -burn 2 -l ' + shapefileshortname +' -tr 1000 -1000 ' +  shapefile + ' ' + outraster)


The task consists in taking a shapefile containing polygons in one layer having different sea ice types as attributes/fields and converting this to a raster map. In addition, Spitsbergen is to be masked out (and later I need to add Greenland as well for masking out).

The resulting script:




# Importing needed modules 
import os 
 
# The shapefile to be rasterized: 
shapefile = 'C:\Users\max\Documents\Icecharts\Data\IceChart_2012\ice20120608_3575.shp' 
print 'Rasterize ' + shapefile 
 #get path and filename seperately 
(shapefilefilepath, shapefilename) = os.path.split(shapefile)              
#get file name without extension 
(shapefileshortname, extension) = os.path.splitext(shapefilename)          
 
# The land area to be masked out, also being a shapefile to be rasterized 
SvalbardCoast = 'C:\Users\max\Documents\PythonProjects\s100-landp_3575.shp' 
 
# The raster file to be created and receive the rasterized shapefile 
outrastername = shapefileshortname + '.tif' 
outraster = 'C:\\Users\\max\\Documents\\Icecharts\\Data\\IceChart_2012\\' + outrastername 
 
# Rasterize first Ice Type and at same time create file -- call gdal_rasterize commandline 
print '\n Open Water' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Water\'\" -burn 2 -l ' + shapefileshortname +' -tr 1000 -1000 ' +  shapefile + ' ' + outraster) 
 
# Rasterize the other Ice types, adding them to the already created file 
print '\nVery Open Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Very Open Drift Ice\'\" -b 1 -burn 3 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
 
print '\n Open Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Open Drift Ice\'\" -b 1 -burn 4 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
 
print '\n Close Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Close Drift Ice\'\" -b 1 -burn 5 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
 
print '\n Very Close Drift Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Very Close Drift Ice\'\" -b 1 -burn 6 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
 
print '\n Fast Ice' 
os.system('gdal_rasterize -a ICE_TYPE -where \"ICE_TYPE=\'Fast Ice\'\" -b 1 -burn 1 -l ' + shapefileshortname +' ' +  shapefile + ' ' + outraster) 
 
# Rasterize Spitsbergen land area on top 
print '\n SvalbardRaster' 
os.system('gdal_rasterize  -b 1 -burn 8 -l s100-landp_3575 '  +  SvalbardCoast + ' ' + outraster) 
 
print "\n Done"



Here is the image of the shapefile to be rasterized, each polygon has an attribute "ICE_TYPE":

the Spitsbergen land area shapefile is this one:

And the final raster looks like this -- not as colourful having values 1 to 8 and no colour table assigned:

Monday, October 29, 2012

Reprojecting a shapefile with GDAL/OGR and Python

I needed to reproject shapefiles -- unprojected lat/long WGS84 -- and once again the page Geoprocessing with Python using Open Source GIS was of great help, so the credit goes to this page, which is very worthwhile studying as an introduction to using gdal/ogr and Python.

The code below is an adjustment of homework 2b in the page above for my case.

I tried to add explanations to the original code so that it's easy to follow by just reading it -- hope I'm right....



# -*- coding: utf-8 -*-  
# Import Modules 
import ogr, osr, os, sys 
 
# set file names 
infile = 'C:\Users\max\Documents\Icecharts\Data\IceChart_2012\ice20120608.shp' 
outfile = 'C:\Users\max\Documents\Icecharts\Data\IceChart_2012\ice20120608_epsg3575.shp' 
 
#get path and filename seperately 
(outfilepath, outfilename) = os.path.split(outfile) 
#get file name without extension            
(outfileshortname, extension) = os.path.splitext(outfilename)   #get file name without extension 
 
# Spatial Reference of the input file 
# Access the Spatial Reference and assign the input projection 
inSpatialRef = osr.SpatialReference() 
inSpatialRef.ImportFromEPSG(4326)         # unprojected WGS84 
 
# Spatial Reference of the output file 
# Access the Spatial Reference and assign the output projection 
# UTM 33N which we use for all Spitsbergen does not work since area too far  
# outside UTM 33 
outSpatialRef = osr.SpatialReference() 
 outSpatialRef.ImportFromEPSG(3575) # EPSG:3575: WGS 84 / North Pole LAEA Europe 
  
# create Coordinate Transformation 
coordTransform = osr.CoordinateTransformation(inSpatialRef, outSpatialRef) 
 
# Open the input shapefile and get the layer 
driver = ogr.GetDriverByName('ESRI Shapefile') 
indataset = driver.Open(infile, 0) 
if indataset is None: 
    print ' Could not open file' 
    sys.exit(1) 
inlayer = indataset.GetLayer() 
 
# Create the output shapefile but check first if file exists 
if os.path.exists(outfile): 
    driver.DeleteDataSource(outfile) 
 
outdataset = driver.CreateDataSource(outfile) 
 
if outfile is None: 
    print ' Could not create file' 
    sys.exit(1) 
outlayer = outdataset.CreateLayer(outfileshortname, geom_type=ogr.wkbPolygon) 
 
# Get the FieldDefn for attributes ICE_TYPE and ID and add to output shapefile 
# (which i know are in my file) 
feature = inlayer.GetFeature(0) 
fieldDefn1 = feature.GetFieldDefnRef('ICE_TYPE') 
fieldDefn2 = feature.GetFieldDefnRef('ID') 
 
outlayer.CreateField(fieldDefn1) 
outlayer.CreateField(fieldDefn2) 
 
# get the FeatureDefn for the output shapefile 
featureDefn = outlayer.GetLayerDefn() 
 
#Loop through input features and write to output file 
infeature = inlayer.GetNextFeature() 
while infeature: 
  
    #get the input geometry 
    geometry = infeature.GetGeometryRef() 
  
    #reproject the geometry, each one has to be projected seperately 
    geometry.Transform(coordTransform) 
  
    #create a new output feature 
    outfeature = ogr.Feature(featureDefn) 
  
    #set the geometry and attribute 
    outfeature.SetGeometry(geometry) 
    outfeature.SetField('ICE_TYPE', infeature.GetField('ICE_TYPE')) 
    outfeature.SetField('ID', infeature.GetField('ID')) 
  
    #add the feature to the output shapefile 
    outlayer.CreateFeature(outfeature) 
  
    #destroy the features and get the next input features 
    outfeature.Destroy 
    infeature.Destroy 
    infeature = inlayer.GetNextFeature() 
  
#close the shapefiles 
indataset.Destroy() 
outdataset.Destroy() 
 
#create the prj projection file 
outSpatialRef.MorphToESRI() 
file = open(outfilepath + '\\'+ outfileshortname + '.prj', 'w') 
file.write(outSpatialRef.ExportToWkt()) 
file.close() 
 
#end






The unprojected file:

The projected shape file:

Friday, October 26, 2012

Accessing vertices from a polygon with OGR/gdal and Python

It took me a bit to figure out how to access the single vertices of a polygon using gdal/ogr, but I found the missing key towards the answer finally on this page

The shapefile contains one or more layers. These layers again can contain one or more features, each having attributed and a geometry, which can be an polygon, for example. However, this geometry, being a polygon, contains another geometry, a "Linear Ring", and it is this geometry deep down I have to access to get individual points.

It's seen in the gml notation, where the polygon contains a Linear Ring:

     <gml:Polygon>
         <gml:outerBoundaryIs>
                 <gml:LinearRing>
                         <gml:posList>0,0 100,0 100,100 0,100 0,0</gml:posList>
                 </gml:LinearRing>
        </gml:outerBoundaryIs>
     </gml:Polygon>


So how to dig down to this Linear Ring and thus the individual vertices?

How to open shapefiles is nicely described in lecture one on this very helpful page  :

     import ogr, sys, os
     fn = 'C:/Users/max/Documents/GlacierMasks2000s_CryoClimMøte.shp'
     driver = ogr.GetDriverByName('ESRI Shapefile')
     dataSource = driver.Open(fn, 0)


I can check how many layers my file contains:

     >>> dataSource.GetLayerCount()
     1
     >>>


and since it is only one layer, I proceed to access this layer. Being one layer only, it has the index "0":

     >>> layer = dataSource.GetLayer(0)

I can check how many features this layer contains:

     >>> layer.GetFeatureCount()
     1560
     >>>


For this example I only want to get one of the features, also checking some info about it. The item list is the attribute table entry for this feature:

        >>> feature = layer.GetFeature(44)
    >>> feature.GetFieldCount()
    21
    >>> feature.items()
    {'Comment': None, 'Y_cent': 8869558.8657, 'SatelliteI': 'SPI 08-027    Svalbard', 'IDENT': 22210.0, 'OBJECTID_1': 45, 'NAME': 'Idunbreen', 'OBJECTID': 46, 'NumLines': 2, 'Shape_Area': 333164842.804, 'X_cent': 592561.435311, 'SOURCE2': 'SPIRIT-SPOT DEM, ASTER DEM (GDEM), NPI DEM, Unpubl SPRI Interferometric velocities', 'SOURCE': 'SPOT5HRS-IPY_SPIRIT_PROJECT (Korona et. al.,2009)', 'YEAR_': 2008, 'AvgLength': 24053.902741, 'DDMM': 1408, 'MaxLength': 25786.323167, 'New_additi': 0, 'Shape_Le_1': 87176.4155756, 'TIDEWATER': 1, 'ANALYST': 'Rickard Pettersson, Uppsala University, Sweden', 'Shape_Leng': 87176.415576}


But I want to access the geometry in this feature, finding out if it is a point, a polygon or...

       >>> geometry = feature.GetGeometryRef()
    >>> geometry.GetGeometryCount()
    1
    >>> geometry.GetGeometryName()
    'POLYGON'
    >>>


At this point I thought, I already could access the vertices of the polygon, but this does not work:

      >>> geometry.GetPointCount()
   0
   >>> geometry.GetX()
   0.0
   ERROR 6: Incompatible geometry for operation
   >>> 


I have to dig one level deeper and access the geometry contained within the polygon:

      >>> ring = geometry.GetGeometryRef(0)
   >>> ring.GetGeometryName()
   'LINEARRING'
   >>> ring.GetPointCount()
   578
   >>> ring.GetPoint(500)
   (599109.9332999997, 8882494.9067, 0.0)
   >>> 


The last line is the latitude, longitude and elevation information for point number 500.

Finally a code collecting all of this and looping through the points and writing them into a text file:


import ogr, os, sys


# Open Shapefile

fn = 'C:/Users/max/Documents/GlacierMasks2000s_CryoClimMøte.shp'
driver = ogr.GetDriverByName('ESRI Shapefile')
dataSource = driver.Open(fn, 0)

# Get Layer
print 'The file contains ', dataSource.GetLayerCount(), ' Layers'
layer = dataSource.GetLayer(0)

# Get Features
print layer.GetName(), ' contains ', layer.GetFeatureCount(), ' features'
feature = layer.GetFeature(44)

# Get Geometry
geometry = feature.GetGeometryRef()
print ' Feature contains the Geometry', geometry.GetGeometryName()
print ' It contains', geometry.GetGeometryCount(), geometry.GetGeometryName()

# Get Geometry inside Geometry
ring = geometry.GetGeometryRef(0)
print geometry.GetGeometryName(), ' contains the Geometry', ring.GetGeometryName()
print ' It contains', ring.GetPointCount(), ' points in a ', ring.GetGeometryName()

# Write points in Vectors and Textfile
pointsX = []; pointsY = []; pointsZ = []
numpoints = ring.GetPointCount()

file = open(r'C:\Users\max\Documents\PythonProjects\text.txt','w')

for p in range(numpoints):
        lon, lat, z = ring.GetPoint(p)
        pointsX.append(lon)
        pointsY.append(lat)
        pointsZ.append(z)
        x = [str(lon), str(lat), str(z), '\n']
        file.writelines(x)
  
print ' the first 5 points ', pointsX[0:10], pointsY[0:10] , pointsZ[0:10]      

dataSource = None
file.close()

#end



Matlab comparison

In Matlab I can read the same shapefile with

    >> shape = shaperead('C:/Users/max/Documents/GlacierMasks2000s_CryoClimMøte.shp')

   shape =

   1560x1 struct array with fields:
    Geometry
     BoundingBox
    X
    Y
    OBJECTID_1
    OBJECTID
    NAME
    Comment
    IDENT
    New_additi
    YEAR_
    DDMM
    ANALYST
    SOURCE
    SOURCE2
    SatelliteI
    Shape_Leng
    X_cent
    Y_cent
    TIDEWATER
    MaxLength
    AvgLength
    NumLines
    Shape_Le_1
    Shape_Area



To access the same polygon as in the example above, Matlab uses the following command. But be aware, Python counts the indices from "0" and Matlab from "1". THe "feature.item()" command in Python above shows the OBJECTID of 46 for feature 44, which in Matlab is feature 45: >> shape(45).OBJECTID

ans =

    46

>> length( shape(45).X )

ans =

   579 


The X and Y in the struct array corresponds to the vertices called pointX and pointY in the Python code above. The attribute fields are accessed in Python via feature.item and others, whil in Matlab with shape(featurenumber).OBJECTID and so on.


Wednesday, October 3, 2012

Creating a Subset of Raster Data with Python / gdal

Continuing to play with Python and gdal...  Reading a GeoTIFF, subsetting it and saving as ENVI file.

First importing the libraries and setting working directory:

   >>> import gdal
   >>> from gdalconst import *
   >>> import os
   >>> os.chdir('C:\Users\max\Documents\PythonProjects')


Opening the file:

   >>> filename = 'ERS1PRI_19920430o04133tr481fr1989_AppOrb_Calib_Spk_SarsimTC_LinDB.tif'
   >>> dataset = gdal.Open(filename, GA_ReadOnly)


I get the band from the opened file. The ReadAsArray with the offset in x and y direction and the x dimensions and y dimensions of the subset:

    >>> band = dataset.GetRasterBand(1)
  >>> subset = band.ReadAsArray(3500, 3500, 50, 50)


The subset can be inspected:

   >>> subset
  array([[-11.49956703, -10.51208401,  -9.58573532, ..., -11.87286568,
        -12.77604961, -15.28282642],
       [-13.6348238 , -14.13398838, -13.61361694, ..., -13.09877014,
        -13.3733263 , -15.26138687],
       [-14.01767826, -12.75454712, -12.39109707, ..., -13.30418301,
        -14.24573517, -15.0796051 ],
       ...,
       [ -9.57748413, -11.63577938, -14.19717503, ...,   3.46974254,
          0.29983696,   0.51451778],
       [-11.11393261, -12.06205177, -13.13371372, ...,   2.4157486 ,
          3.08015943,   0.57344818],
       [-11.68845177, -11.69376755, -11.36115456, ...,   1.95159626,
          1.82419062,   3.39742208]], dtype=float32)
 



Now I register the ENVI driver and Create the ENVI file using the dimensions of the subset:

    >>> driver = gdal.GetDriverByName('ENVI')
  >>> driver.Register()
  >>> outDataset = driver.Create('ERS1PRI_19920430_ENVI_subset', 50, 50, 1, gdal.GDT_Float32)


Finally I get the band of the created ENVI file and write the subset array into the file. After that I can close the files.
   >>> outBand = outDataset.GetRasterBand(1)
   >>> outBand.WriteArray(subset, 0, 0)
   >>> band = None

   >>> dataset = None
   >>> outDataset = None
   >>> subset = None


Closing files is not a "close" command but simply setting the variables to "None"

[I seem to have problems with WriteArray not always writing the file correctly -- unsure yet when and why, the one above works]

Sunday, September 30, 2012

Writing Raster Data with Python / gdal

So now I have a SAR image in a an array called "data" as described in the previous post I want to save it, let's say, in an ERDAS img data format? This is basically converting the file format:


 I first get and register the appropriate driver for the output file:

        >>> driver = gdal.GetDriverByName('HFA')
    >>> driver.Register()


I want to use the same size from the input file which is in "dataset" as in the previous post  

    >>> cols = dataset.RasterXSize
    >>> rows = dataset.RasterYSize
    >>> bands = dataset.RasterCount
    >>> datatype = band.DataType

Then Creating the output file:

    >>> outDataset = driver.Create('ERS1PRI_19920430.pix', cols, rows, bands, datatype)


I want my outputfile to have the same georeferencing and projection information as the input file (the "0" shows the execution went fine):

        >>> geoTransform = dataset.GetGeoTransform()
    >>> outDataset.SetGeoTransform(geoTransform )
    0
    >>> proj = dataset.GetProjection()
    >>> outDataset.SetProjection(proj)
    0
    >>>




That has to be assigned before writing the data to the output band, otherwise it's not in the file!

Before writing the data, I have to get the band from the newly created file:

     >>> outBand = outDataset.GetRasterBand(1)


Now I can write the input image to the new output image:

    >>> outBand.WriteArray(data, 0, 0)


clean variables and close files
      >>> dataset = None
   >>> band = None
   >>> outBand = None
   >>> outDataset = None


[still to find out: got this to work for PCI format, with ENVI hdr the image but not the geoinformation was transferred and Erdas img did not work -- file too big?]

Friday, September 28, 2012

Reading Raster Data with Python and gdal

I am trying to learn Python for Geoprocessing. Here are some very basic notes on "playing" with Python/gdal/ogr. The online documentation is, I must say, rather confusing, so I try it step by step at the command line as below.

I follow the very useful lecture notes at http://www.gis.usu.edu/~chrisg/python/2009/ and the tutorial at http://www.gdal.org/gdal_tutorial.html

Importing both gdal and gdalconst; just "import gdalconst" does not work (and I don't understand why right now...):

    >>> import gdal
    >>> from gdalconst import *


Defining the filename (assuming it's located in the current working directory of python -- use os.getced and os.chdir):

    >>> filename = 'ERS1PRI_19920430o04133tr481fr1989_AppOrb_Calib_Spk_SarsimTC_LinDB.tif'

Now the file can be opened, the driver only needs to be imported for write-access if I understand correctly:

    >>> dataset = gdal.Open(filename, GA_ReadOnly)

Typing "dataset" shows the pointer to the opened file:

    >>> dataset
    <osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x0000000003550EA0> >
>>>


Various information can be retrieved from the opened file:

    >>> cols = dataset.RasterXSize
    >>> rows = dataset.RasterYSize
    >>> bands = dataset.RasterCount
    >>> driver = dataset.GetDriver().LongName

    >>> cols
    7257
    >>> rows
    7226
    >>> bands
    1
    >>>driver

    'GeoTIFF'
    >>>

Geoinformation can be retrieved with GetGeoTransform().

>>> geotransform = dataset.GetGeoTransform()

The variable "geotransform" now contains a list with Geoinformation:

    >>> geotransform
    (368745.92379062285, 20.0, 0.0, 8828671.611738198, 0.0, -20.0) 


The answer to what these values mean are found in the documentation:

    adfGeoTransform[0] /* top left x */
    adfGeoTransform[1] /* w-e pixel resolution */
    adfGeoTransform[2] /* rotation, 0 if image is "north up" */
    adfGeoTransform[3] /* top left y */
    adfGeoTransform[4] /* rotation, 0 if image is "north up" */
    adfGeoTransform[5] /* n-s pixel resolution */


and one can retrieve a single value from this list for example with

    >>> originX = geotransform[0]
    >>> originY = geotransform[3]
    >>> pixelWidth = geotransform[1]
    >>> pixelHeight = geotransform[5]
    >>> originX
    368745.92379062285
    >>> originY
    8828671.611738198
    >>> pixelWidth
    20.0
    >>> pixelHeight
    -20.0 


But how to get the individual data values in the file?

Get the band and read the first line:

    >>> band = dataset.GetRasterBand(1)

    >>> bandtype = gdal.GetDataTypeName(band.DataType)
    >>> bandtype
    'Float32'
    >>> scanline = band.ReadRaster( 0, 0, band.XSize, 1,band.XSize, 1, band.DataType)




Since I was not sure what the "ReadRaster" parameters meant, google led me to this useful page:

The ReadRaster() call has the arguments: def ReadRaster(self, xoff, yoff, xsize, ysize, buf_xsize = None, buf_ysize = None, buf_type = None, band_list = None ): The xoff, yoff, xsize, ysize parameter define the rectangle on the raster file to read. The buf_xsize, buf_ysize values are the size of the resulting buffer. So you might say "0,0,512,512,100,100" to read a 512x512 block at the top left of the image into a 100x100 buffer (downsampling the image).
which I found here

Typing "scanline" gives me long lines of this:

    >>> scanline
    '`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`B\xa2\x8d`.........


Note that the returned scanline is of type string, and contains xsize*4 bytes of raw binary floating point data. To convert this into readable values use struct.unpack and instead you get long lines of float numbers:

    >>> import struct
    >>> value = struct.unpack('f' * band.XSize, scanline)
    >>> value
    (-1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, -1.0000000031710769e-30, ....


Now I can get individual values, but all are in "one line" and not in an array:

     >>> value[8]
    -1.0000000031710769e-30


I rather read the whole file into an array:

    >>>data = band.ReadAsArray(0, 0, cols, rows)
    >>> value = data[3500,4000]
    >>> value
    -8.30476 


Using the numpy library I can define the datatype in the array:
   >>> import numpy
   >>> data = band.ReadAsArray(0, 0, dataset.RasterXSize, dataset.RasterYSize).astype(numpy.float)
   >>> value = data[3500,4000]
   >>> value
   -8.3047599792480469
   >>>


One has  to be careful not confusing column and rows! Matrix is value = data[row, column], and it starts with '0', so the value -8.30476 is located at y=row=3501 and x=column=4001.

To be continued....

Sunday, March 11, 2012

Removing Extreme Values and Outliers with NEST

Values outside of the processed SAR image should be NAN, i.e. "no value" and usually are, but at some point in my processing chain -- I think when converting the calibration to dB -- the NAN value is converted to a value of "-1E-30" For further processing, I wish to get these values to NAN again. Especially when mosaicing, the values of overlapping areas of several SAR images should not be influences by this invalid outlier



To change band values I choose Utilities>Band Math and get the following windows:


The Band Math uses a C-code If-expression which in this case says: If the image value is between -1E-30 and -2E-30, then replace it by NaN, otherwise keep the original value.

The following is not very clear to me -- You have to deselect "virtual" and the result is written into a new channel (save your file afterwards).

It would seem that you then simply could delete the original channel and retain the new band, but that does not seem to work, somehow the new band is still referenced to the original one. You have to choose Utilities>Spatial Subset from View to do the job. Here I go straight to the "Band Subset" tab and deselect the original band:


If I know choose "ok", you get a new product with the one corrected channel, which you then can save to disc.

This is of course too tedious if you have dozens of images, so to do this in batch mode, I create the processing line in "Graph Builder"



Then I save this graph as an XML file. You edit this XML-file so that the input and output file names have a place holder "$file" and "$target". Look in 1-Read that it reads "<file>$file</file>" and in 2-Write that it reads "<file>$target</file>".


Now you create a RemoveOutliers.bat file containing


for /r C:\Users\max\location_of_files\ %%X in (*.dim) do (gpt   C:\Users\max\location_of_XMLfile\ RemoveOutliers.xml -Pfile="%%X"  -Tfile=" C:\Users\max\location_of_new_files\%%~nX.dim")

What happens here?
  1. The for-command goes through the directory containing your files to find files named "*.dim" and passes the file name to "%%X".
  2. For each of these input files "-Pfile="%%X", the NEST command "gpt" applies the Graph Builder production chain saved in "RemoveOutliers.xml" 
  3. The output is saved in the parameter -Tfile, which here is written "%%~nX.dim", the same filename but in a new directory (but you also can give it a new name like   "%%~nX_NaN.dim" if you wish)
You navigate the DOS window (type "cmd" at Windows Start> "search programs and files" to open it) to the directory containing RemoveOutliers.bat, then type "RemoveOutliers.bat" and all scenes in the specified folder will be processed.

Curiously, and conveniently, if you run Band Math using "gpt" from commandline, the new file only contains the new band, so no reason for manually deleting the original band as in the NEST GUI version above is needed.

The Result (what is dark outside the scene is now NaN)





Saturday, March 10, 2012

Terrain Correction of SAR Images -- Part 4

A short comment only on the "Range Doppler Terrain Correction." As described in Part 1, the SARSIM algorithm takes the DEM and using orbit parameters of the satellite creates a simulated SAR image from this DEM. The simulated and the real SAR image, which will look very similiar, are coregistered. Through this simulation, the displacement for each location in the original landscape, the DEM, is known, so if the simulated SAR image is transformed back into the original DEM -- and the coregistered SAR image along with it -- the pixels of the SAR image will receive their real, geographical location.

The Range Doppler Algorithm does not simulate a SAR image to coregister this and the original SAR image, but calculates displacement based on orbit parameters and a DEM. The Range Doppler algorithm is much faster in processing scenes. When comparing scenes with both SARSIM and Range Doppler methods, I find no difference in the final product. However, the Range Doppler method does not work for quite a few of my scenes. If I understood ESA correctly, this is due to not accurate enough data in the SAR metadata, such that calculations of the displacement is incorrect. This appears to be special with data from the Arctic regions

I haven't therefore used this one that much, but in other areas of the world it may be worth using the Range Doppler Terrain Correction.

Choose Geometry>Terrain Correction>Range Doppler Terrain Correction  (in Graph Builder choose "Terrain Correction"). The settings are as follows


Terrain Correction of SAR Images -- Part 3

The most convenient way to process large quantities of SAR data is using the methods through command line. With the "gpt" command as described in the Nest help pages or in the SNAP help pages you can process single scenes from command line, but here is a way to process large quantities of scenes from command line. With the DOS "for" command you recursively search through your directories for scenes to be processed and hand these scenes each to the gpt command.

Here is how to do it:

First you create the processing chain with the graph builder as described in part 2 and save it as an XML file. Especially in the beginning, you may want to keep it to simpler processing chains not containing all tasks at once. In our case, let's take only the SARSIM Terrain Correction:


You set the values in Graph Builder and save it, lets say as "SARSIM_TC.xml". You still should check and edit the XML file for the parameters you need (map projection, resolution, etc) and you will have to modify the saved XML file at twoplaces for batch command line use as follows:

Make sure that the filenames in the XML file have $file  as placeholders as in the following examples:


 <node id="1-Read">
    <operator>Read</operator>
    <sources/>
    <parameters class="com.bc.ceres.binding.dom.Xpp3DomElement">
      <file>$file</file>
    </parameters>
  </node>
Now you create a SARSIM_TC.bat file containing


for /r C:\Users\max\location_of_files\ %%X in (*.dim) do (gpt   C:\Users\max\location_of_XMLfile\ SARSIM_TC.xml -Pfile="%%X"  -t " C:\Users\max\location_of_files\%%~nX_SarSimTC.dim")

What happens here?


  1. The for-command goes through the directory containing your files to find files named "*.dim" and passes the file name to "%%X".
  2. For each of these input files "-Pfile="%%X", the NEST command "gpt" applies the Graph Builder production chain saved in "SARSIMTC_dB.xml" 
  3. The output is saved in the parameter -Tfile, which here is written "%%~nX_SarSimTC.dim", taking the filename and between original name and filetype adding "_SarSimTC" to indicate this is having been processed with SarSim. You may choose different naming, but I find this convenient.
You navigate the DOS window (type "cmd" at Windows Start> "search programs and files" to open it) to the directory containing SARSIM_TC.bat, then type "SARSIM_TC.bat" and all scenes in the specified folder will be processed.

The results will be the same as shown in Part 1



Terrain Correction of SAR Images -- Part 2

Rather than clicking through each applied method in the menue, a production chain can be implemented with the "Graph Builder". Choose "Graphs>Graph Builder" and by right clicking in the graph space you can add methods and connect them with arrows in the order of running through these.

You can save the whole graph as an xml-file for later use, and this is also needed for batch command line processing. The individual tabs in the Graph Builder are just the same as described in part 1, only be aware that for the SARSIM Terrain Correction three individual parts have to be chosen (below the numbers 6 through 8). The "SARSIM-Terrain-Correction" you can choose is only part of the similarly named SARSIM Terrain Correction in Part 1!


Terrain Correction of SAR Images -- Part 1

A characteristic of side-looking SAR image is the so-called foreshortening and layover, a reflected signal from a mountaintop reaches the sensor earlier or at the same time as the signal at the foot of the mountain. This results in the typical look of mountains that seem to have "fallen over" towards the sensor:


In the original image to the left, a pixel is basically displaced depending on its elevation above sea-level, so it is important to remove this layover as seen in the image above to the right. The freely available SNAP SAR Toolbox (previous version known as the  NEST SAR Toolbox ) is in many ways a great tool for satellite image processing and makes it very easy terrain-correct SAR images in a fully automatic process.

The algorithm takes the DEM and using orbit parameters of the satellite creates a simulated SAR image from this DEM. The simulated and the real SAR image, which will look very similiar, are coregistered. Through this simulation, the displacement for each location in the original landscape, the DEM, is known, so if the simulated SAR image is transformed back into the original DEM -- and the coregistered SAR image along with it -- the pixels of the SAR image will receive their real, geographical location. (It's actually quite easy in principle, but not sure this description is clear...)

Below is the original ESA SAR image as loaded into SNAP (or NEST) displaying the typical layover: 


Before the terrain correction, I apply the newest orbit file (Utilities>Apply Orbit), calibrate the values (SAR Tools>Radiometric Correction>Calibrate; but not in dB since the terrain correction needs linear values!) and the run a speckle filter, median 3x3 (SAR Tools>Speckle Filtering>Single Image)

Now to the actual terrain correction. Choose Geometry>Terrain Correction>SAR Simulation Terrain Correction. In the first tab 1-Read you choose the product to be corrected: 


The second tab defines the output. Unfortunately the default output filename in this case is only "SarSimTC.dim", I follow the SNAP (NEST) naming convention where all methods applied are contained in the filename, such as "ORIGINALNAME_AppOrb_Calib_Spk_SarSimTC.dim" but this has to be typed manually:


In the "3-SAR simulation" tab, one can choose various DEM such as GETASSE and SRTM, but in my case I choose an "External DEM" and specify the file path. I set the "no data value" to 9999.0, otherwise all ocean surface will be NAN. There is something unusual in this tab -- if you do not highlight any of the source bands, the first one only will be processed, in this case "Amplitude_VV". If other bands also should be processed, both source bands (in this case Amplitude_VV and Amplitude_VH) must be highlighted by choosing and clicking them!


The "4-GCP Selection" tab I leave the given values: 


Finally, the "5-SARSIM-Terrain-Correction" tab. For my purpose, I choose 20m resolution for the output image, the map projection of Svalbard/Spitsbergen WGS1984 / UTM 33N and prefer nearest neighbour interpolation:

Now I can choose "process" and this particular run takes 6.5 minutes on a Windows 7 64-bit computer for one source band. For two source bands it is much, much longer (80 minutes in this run!) and may not work if the computer has too little RAM, so the best and fast way is to process individual source bands separately. (The Range Doppler algorithm for removing layover which I will discuss in a later  is faster but does not work for some scenes at high latitudes(?)).

 I choose "Utilities>Dataset Conversion>Linear to dB" to get desibel values and get this final result. The data fits perfectly to cartographic shapefiles of the coastlines and to the other geolocations.


compared again to the original SAR image the difference is easily visible!


A problem in the current version: If you -- after having processed a particular scene -- choose a different scene under "1-Read" as input having differently named source bands, the source band list under "3-SAR Simulation" does not update -- you have to close the whole window and start all over -- part 2 and following describe how to process a large number of scenes.

The next postings will discuss how to run all this from command line and do batch processing.

Welcome

This blog is meant as a place for my own notes on Remote Sensing and GIS tasks. Maybe some others find these helpful in their own work or can come with advice or point out errors in case they find any.

You may find here a link to my current projects.