Tuesday, March 8, 2016

Landsat QGIS tutorial -- Import and Band Ratio

This was written as a rough work flow for a colleague working on a project:

1) Create a personal account at http://earthexplorer.usgs.gov/ , mark your area of interest and dates. Do not search too large a date range, since only the first 100 results will be shown. Then choose Landsat-8 under Datasets>Landsat Archive:



2) Under Additional Criteria you can choose a cloud cover threshold, but it seems to me that it is best to search for all scenes. A scene may have 80% cloud cover, but your glacier may still be cloud-free on that image. There are very few cloud-free images of Svalbard!

3) In the result list, choose the scenes you want to download. Either add them to the bulk download, or use "download options" to choose a single scene. Get the GeoTIFF file:


4) The Landsat files come as one TIFF file per channel, but for displaying band combinations you need to merge these into one file:

5) Extract the Landsat zip-file and then find the merge dialog in QGIS: 


6) Choose all bands of the Landsat image to be merged. Be sure to have "layer stack" chosen!

7) To display an RGB image, you have to check which band is red, green or blue. This varies between Landsat sensors; consult this page to find out. For Landsat 8, RGB is channels 4, 3, 2.

Open the properties dialog (right-click the Landsat layer and choose "Properties") to view various band combinations. In this example for Landsat-8 it is RGB, i.e. the regular visual view, assigning bands 4, 3, 2 to r, g, b:
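Outside the QGIS dialog, the 4, 3, 2 assignment can also be illustrated in a few lines of numpy. The band arrays below are random placeholders standing in for the Landsat 8 bands you would read from the merged stack:

```python
import numpy as np

# Placeholder arrays standing in for Landsat 8 bands from the merged stack
band4 = np.random.rand(100, 100)  # band 4 = red
band3 = np.random.rand(100, 100)  # band 3 = green
band2 = np.random.rand(100, 100)  # band 2 = blue

# Stack into a (rows, cols, 3) array that plt.imshow() displays as RGB
rgb = np.dstack([band4, band3, band2])
```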


8) Band Ratio images: For discerning surfaces, ratios between various bands are often used. A well-known one is the NDVI index.
For glaciers, GLIMS suggests a few band ratio approaches, like "(TM3 / TM5) > 2.6". Be aware that band numbers do not correspond to the same wavelength bands on different sensors, not even between different Landsat sensors! Check which actual wavelength bands your formula operates with.

To create a band ratio image, use the QGIS raster calculator. This example calculates an output raster with (band4 / band5) > 2.6, but you could do any other operation you like.
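The same (band4 / band5) > 2.6 operation the raster calculator performs can be sketched in numpy. The two band arrays below are made-up placeholder values, not real Landsat data; in practice you would read them from the band TIFFs:

```python
import numpy as np

# Made-up placeholder band arrays; in practice read from the Landsat TIFFs
band4 = np.array([[12.0, 3.0],
                  [9.0, 1.0]])
band5 = np.array([[4.0, 2.0],
                  [3.0, 2.0]])

# GLIMS-style ratio threshold: classify as glacier where the ratio exceeds 2.6
ratio = band4 / band5
glacier_mask = ratio > 2.6
```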

Reshaping Polygons in QGIS

We have a shapefile giving the glacier outlines on Svalbard; now, after a few years, we want to update these outlines since the glaciers have changed in extent:


For reshaping this polygon, activate the Advanced Digitizing toolbar in QGIS:


Then select the shapefile layer, choose "Toggle Editing" to make the shapefile layer editable and choose the "Reshape Features" button:

Now start digitizing: first click outside the polygon, crossing the old glacier front, then digitize the new glacier front, and at the end cross the old front line again, as indicated here. Below you see this for a retreating glacier; similarly, for an advancing glacier you cross the old front line before digitizing the new front. The principle is nicely explained in this blog post. You can zoom into the image and use the arrow keys to move the image while digitizing:

Start Digitizing:

End of Digitizing: 

For an advancing glacier you start and end by crossing the polygon like this; again, look at this blog post to understand the principle:



When you now right click, the polygon is adjusted to the new front line:

Now select the polygon you just edited, right-click the layer and choose "Open Attribute Table". In the table, choose to show only the selected polygon, where you can update the relevant columns (year, date, analyst, source2 and sensor):






 

Wednesday, February 24, 2016

Creating a Texture Image with a GLCM Co-Occurrence Matrix using Scikit-Image and Python

Many clustering methods, like the previously mentioned k-means, cluster all the image values without considering any possible relation or patterns between neighbouring pixels. Texture is a measure that considers this spatial relation: an area with similar absolute values may visually look very different, showing a different texture pattern (Source: Wikipedia).





The scikit-image package for Python allows you to calculate the Grey Level Co-Occurrence Matrix (GLCM) and its texture measures as described on this page. However, this description, just as in the previous k-means example, leaves out the process of creating the texture image itself.

On this page you find a very good tutorial explaining GLCM and texture. A texture image is created by moving a window over your raster, replacing the centre pixel covered by this window with the calculated texture measure of the moving window, as you see in this image taken from Mryka Hall-Beyer's excellent tutorial:




A 7x7 window for any pixel would be defined by

glcm_window = sarraster[i-3: i+4, j-3 : j+4]


And the GCLM for a one-pixel offset [1] to the right [0] for this window as documented here would then be

glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )
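To see what greycomatrix actually counts, here is a hand-rolled version for a tiny made-up 4x4 array with grey levels 0 to 3. It reproduces the one-pixel-to-the-right offset together with the symmetric = True and normed = True options:

```python
import numpy as np

# Tiny made-up example image with grey levels 0..3
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])

levels = 4
glcm = np.zeros((levels, levels))
# Count pairs of neighbouring grey levels, offset one pixel to the right
for i in range(img.shape[0]):
    for j in range(img.shape[1] - 1):
        glcm[img[i, j], img[i, j + 1]] += 1
# symmetric = True: count each pair in both directions
glcm = glcm + glcm.T
# normed = True: turn counts into probabilities
glcm = glcm / glcm.sum()
```

Each entry glcm[a, b] is then the probability of grey level a occurring directly left of grey level b (or vice versa, because of the symmetric option).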


Now you have to calculate this for every single pixel. The only way I can come up with for this specific calculation is the slow process of looping through each pixel. Since some software implementations (like the SNAP SAR Toolbox) make this much faster, there must be some other way, so maybe a reader of this post has an idea. (I asked this question also at http://stackoverflow.com/questions/35551249/implementing-glcm-texture-feature-with-scikit-image-and-python .)

The following is only practical for small raster images, named sarraster in this case:


for i in range(sarraster.shape[0] ):
    print i,
    for j in range(sarraster.shape[1] ):
        
        # window needs to fit completely in image
        if i < 3 or j < 3:
            continue
        if i > (contrastraster.shape[0] - 4) or j > (contrastraster.shape[1] - 4):
            continue
        
        #Define size of moving window
        glcm_window = sarraster[i-3: i+4, j-3 : j+4]
        #Calculate GLCM and textures
        glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )

        # Calculate texture and write into raster where moving window is centered
        contrastraster[i,j]      = greycoprops(glcm, 'contrast')
        dissimilarityraster[i,j] = greycoprops(glcm, 'dissimilarity')
        homogeneityraster[i,j]   = greycoprops(glcm, 'homogeneity')
        energyraster[i,j]        = greycoprops(glcm, 'energy')
        correlationraster[i,j]   = greycoprops(glcm, 'correlation')
        ASMraster[i,j]           = greycoprops(glcm, 'ASM')
        glcm = None
        glcm_window = None
        


The following is the original image with various texture images. This is just a first test example, so no clear patterns are seen in this image:



The complete script for creating the image above. For a 1100 x 1400 raster this script takes about an hour while in the SNAP SAR Toolbox such calculations take < 1 minute, so I wonder about alternative implementations:


# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import gdal, gdalconst
import numpy as np
from skimage.feature import greycomatrix, greycoprops

#Read SAR image into Numpy Array
filename = "//mnt//glaciology//SARimage.jpg"
sarfile = gdal.Open(filename, gdalconst.GA_ReadOnly)
sarraster = sarfile.ReadAsArray()


#Create rasters to receive texture and define filenames
contrastraster = np.copy(sarraster)
contrastraster[:] = 0

dissimilarityraster = np.copy(sarraster)
dissimilarityraster[:] = 0

homogeneityraster = np.copy(sarraster)
homogeneityraster[:] = 0

energyraster = np.copy(sarraster)
energyraster[:] = 0

correlationraster = np.copy(sarraster)
correlationraster[:] = 0

ASMraster = np.copy(sarraster)
ASMraster[:] = 0

# Create figure to receive results
fig = plt.figure()
fig.suptitle('GLCM Textures')

# In first subplot add original SAR image
ax = plt.subplot(241)
plt.axis('off')
ax.set_title('Original Image')
plt.imshow(sarraster, cmap = 'gray')


for i in range(sarraster.shape[0] ):
    print i,
    for j in range(sarraster.shape[1] ):
        
        # window needs to fit completely in image
        if i < 3 or j < 3:
            continue
        if i > (contrastraster.shape[0] - 4) or j > (contrastraster.shape[1] - 4):
            continue
        
        # Define size of moving window
        glcm_window = sarraster[i-3: i+4, j-3 : j+4]
        # Calculate GLCM and textures
        glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )

        # Calculate texture and write into raster where moving window is centered
        contrastraster[i,j]      = greycoprops(glcm, 'contrast')
        dissimilarityraster[i,j] = greycoprops(glcm, 'dissimilarity')
        homogeneityraster[i,j]   = greycoprops(glcm, 'homogeneity')
        energyraster[i,j]        = greycoprops(glcm, 'energy')
        correlationraster[i,j]   = greycoprops(glcm, 'correlation')
        ASMraster[i,j]           = greycoprops(glcm, 'ASM')
        glcm = None
        glcm_window = None
        
        


texturelist = {1: 'contrast', 2: 'dissimilarity', 3: 'homogeneity', 4: 'energy', 5: 'correlation', 6: 'ASM'}
for key in texturelist:
    # original image sits in subplot 1, so textures go into positions 2-7
    ax = plt.subplot(2, 4, key + 1)
    plt.axis('off')
    ax.set_title(texturelist[key])
    plt.imshow(eval(texturelist[key] + "raster"), cmap = 'gray')

plt.show()

Tuesday, February 23, 2016

K-Means Clustering of a Satellite Image using Scipy

Scipy can be used nicely to apply k-means classification to a satellite image. Most blog entries I find (such as this here) describe how to apply k-means to x/y data, but not how to get an image classified into these k-means clusters.

In a first step you load the satellite image into a raster, then flatten this raster to convert the row x column raster into a one-dimensional array:

>>> flatsarraster = sarraster.flatten()
>>> sarraster.shape
(1486, 1109)
>>> flatsarraster.shape
(1647974,)
>>> sarraster
array([[-17.93579102, -17.26957893, -17.6971817 , ...,  -3.9816401 ,
         -4.78030491,  -5.49047613],
       [-15.8222208 , -18.36615753, -17.06410408, ...,  -5.16635323,
         -4.9837966 ,  -5.192698  ],
       [-13.59493637, -15.24811172, -13.07635021, ...,  -3.1499095 ,
         -3.1499095 ,  -5.5106535 ],
       ..., 
       [-15.59587097, -15.20659828, -14.20381069, ...,  -6.38638067,
         -6.44060993,  -3.72703171],
       [-15.73380089, -15.21032333, -14.21171951, ...,  -5.8094368 ,
         -6.5339241 ,  -4.01910543],
       [-15.65506649, -14.98183632, -14.98183632, ...,  -6.26982498,
         -5.8094368 ,  -4.90749454]], dtype=float32)
>>> flatsarraster
array([-17.93579102, -17.26957893, -17.6971817 , ...,  -6.26982498,
        -5.8094368 ,  -4.90749454], dtype=float32)

The command centroids, variance = kmeans(flatsarraster, [number of clusters]) then uses the flattened raster and calculates the centroids of the wanted number of clusters, in this case 8 clusters:
>>> from scipy.cluster.vq import *
>>> centroids, variance = kmeans(flatsarraster, 8)
>>> centroids
array([  0.44888815, -11.17466545, -20.26683044,  -5.95262289,
       -13.68685532,  -8.66477108,  -3.06361866, -16.28946304], dtype=float32)

Now the command code, distance = vq(flatsarraster, centroids) gives a one-dimensional array "code" of the same length as the flattened raster, where each position contains not the value of the original image but the number of the cluster it belongs to.

>>> flatsarraster
array([-17.93579102, -17.26957893, -17.6971817 , ...,  -6.26982498,
        -5.8094368 ,  -4.90749454], dtype=float32)
>>> code
array([7, 7, 7, ..., 3, 3, 3])
>>> 

So in a third step we reshape this array "code" into the row x column raster and get an image showing the clusters.

codeim = code.reshape(sarraster.shape[0], sarraster.shape[1])
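The flatten, classify and reshape round trip can be seen on a made-up miniature raster. The nearest-centroid assignment below is a plain-numpy stand-in for what scipy's vq does, using invented values and precomputed centroids:

```python
import numpy as np

# Made-up 2x3 "image" and two precomputed cluster centroids
raster = np.array([[0.1, 0.9, 0.2],
                   [0.8, 0.15, 0.85]])
centroids = np.array([0.1, 0.9])

flat = raster.flatten()
# Assign each pixel to its nearest centroid (what vq returns as "code")
code = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
# Reshape back to image dimensions to get the cluster image
codeim = code.reshape(raster.shape[0], raster.shape[1])
```

Each pixel of codeim now holds a cluster index instead of the original image value, which is exactly what gets displayed with imshow.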

Here is the code in total, in this case creating a loop to classify 2 to 8 clusters:

# -*- coding: utf-8 -*-


import gdal, gdalconst
from scipy.cluster.vq import *
import matplotlib.pyplot as plt

# Read SAR image into Numpy Array
filename = "//mnt//glaciology//processedSARimages//SARimage.tif"
sarfile = gdal.Open(filename, gdalconst.GA_ReadOnly)
sarraster = sarfile.ReadAsArray()

# Flatten image to get line of values
flatsarraster = sarraster.flatten()

# Create figure to receive results
fig = plt.figure()
fig.suptitle('K-Means Classification')

# In first subplot add original SAR image
ax = plt.subplot(241)
plt.axis('off')
ax.set_title('Original Image')
plt.imshow(sarraster, cmap = 'gray')

# In remaining subplots add k-means classified images
for i in range(7):
    print "Calculate k-means with ", i+2, " clusters."
    
    #This scipy code classifies k-mean, code has same length as flattened
    #SAR raster and defines which class the SAR value corresponds to
    centroids, variance = kmeans(flatsarraster, i+2)
    code, distance = vq(flatsarraster, centroids)
    
    #Since code contains the classified values, reshape into SAR dimensions
    codeim = code.reshape(sarraster.shape[0], sarraster.shape[1])
    
    #Plot the subplot with (i+2)th k-means
    ax = plt.subplot(2,4,i+2)
    plt.axis('off')
    xlabel = str(i+2) + ' clusters'
    ax.set_title(xlabel)
    plt.imshow(codeim)
    
plt.show()

The resulting images, each color indicating a cluster: