Wednesday, February 24, 2016

Creating a Texture Image with a GLCM Co-Occurence Matrix using Scikit-Image and Python

Many clustering methods like the previously mentioned k-means cluster all the image values without considering any possible relation or patterns between neighbouring pixels. Texture is a measure considering this spatial relation --- an area with similar absolute values may visually be very different showing a different texture pattern (Source: Wikipedia)





The sci-kit image package for Python allows to calculate the Grey Level Co-Occurence Matrix (GCLM) and its texture measures as described on this page. However, this description, just as in the previous k-means example, leaves out the process of creating the texture image itself.

On this page you find a very good tutorial explaining GCLM and texture. A texture image is created by moving a window over you raster replacing the centre pixel covered by this window with the calculated texture measure of the moving window, as you see in this image taken from Myrka Hall-Beyers excellent tutorial:




A 7x7 window for any pixel would be defined by

glcm_window = sarraster[i-3: i+4, j-3 : j+4]


And the GCLM for a one-pixel offset [1] to the right [0] for this window as documented here would then be

glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )


Now you have to calculate this for every single pixel. The only way for making this specific calculation I can come up with is the slow process of looping through each pixel. Since some software implementations (like the SNAP SAR Toolbox) make this much faster, there must be some other way, so maybe a reader of this post has an idea. (I asked this question also at http://stackoverflow.com/questions/35551249/implementing-glcm-texture-feature-with-scikit-image-and-python ).

The following is only practical for small raster images, named sarraster in this case:


for i in range(sarraster.shape[0] ):
    print i,
    for j in range(sarraster.shape[1] ):
        
        #windows needs to fit completely in image
        if i <3 or j <3:        
            continue
        if i > (contrastraster.shape[0] - 4) or j > (contrastraster.shape[0] - 4):
            continue
        
        #Define size of moving window
        glcm_window = sarraster[i-3: i+4, j-3 : j+4]
        #Calculate GLCM and textures
        glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )

        # Calculate texture and write into raster where moving window is centered
        contrastraster[i,j]      = greycoprops(glcm, 'contrast')
        dissimilarityraster[i,j] = greycoprops(glcm, 'dissimilarity')
        homogeneityraster[i,j]   = greycoprops(glcm, 'homogeneity')
        energyraster[i,j]        = greycoprops(glcm, 'energy')
        correlationraster[i,j]   = greycoprops(glcm, 'correlation')
        ASMraster[i,j]           = greycoprops(glcm, 'ASM')
        glcm = None
        glcm_window = None
        


The following is the original image with various texture images. This is just a first test example, so no clear patterns are seen in this image:



The complete script for creating the image above. For a 1100 x 1400 raster this script takes about an hour while in the SNAP SAR Toolbox such calculations take < 1 minute, so I wonder about alternative implementations:


# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import gdal, gdalconst
import numpy as np
from skimage.feature import greycomatrix, greycoprops

#Read SAR image into Numpy Array
filename = "//mnt//glaciology//SARimage.jpg"
sarfile = gdal.Open(filename, gdalconst.GA_ReadOnly)
sarraster = sarfile.ReadAsArray()


#Create rasters to receive texture and define filenames
contrastraster = np.copy(sarraster)
contrastraster[:] = 0

dissimilarityraster = np.copy(sarraster)
dissimilarityraster[:] = 0

homogeneityraster = np.copy(sarraster)
homogeneityraster[:] = 0

energyraster = np.copy(sarraster)
energyraster[:] = 0

correlationraster = np.copy(sarraster)
correlationraster[:] = 0

ASMraster = np.copy(sarraster)
ASMraster[:] = 0

# Create figure to receive results
fig = plt.figure()
fig.suptitle('GLCM Textures')

# In first subplot add original SAR image
ax = plt.subplot(241)
plt.axis('off')
ax.set_title('Original Image')
plt.imshow(sarraster, cmap = 'gray')


for i in range(sarraster.shape[0] ):
    print i,
    for j in range(sarraster.shape[1] ):
        
        # windows needs to fit completely in image
        if i > (contrastraster.shape[0] - 4) or j > (contrastraster.shape[0] - 4):
            continue
        
        # Define size of moving window
        glcm_window = sarraster[i-3: i+4, j-3 : j+4]
        # Calculate GLCM and textures
        glcm = greycomatrix(glcm_window, [1], [0],  symmetric = True, normed = True )

        # Calculate texture and write into raster where moving window is centered
        contrastraster[i,j]      = greycoprops(glcm, 'contrast')
        dissimilarityraster[i,j] = greycoprops(glcm, 'dissimilarity')
        homogeneityraster[i,j]   = greycoprops(glcm, 'homogeneity')
        energyraster[i,j]        = greycoprops(glcm, 'energy')
        correlationraster[i,j]   = greycoprops(glcm, 'correlation')
        ASMraster[i,j]           = greycoprops(glcm, 'ASM')
        glcm = None
        glcm_window = None
        
        


texturelist = {1: 'contrast', 2: 'dissimilarity', 3: ' homogeneity', 4: 'energy', 5: 'correlation', 6: 'ASM'}
for key in texturelist:
    ax = plt.subplot(2,3,key)
    plt.axis('off')
    ax.set_title(texturelist[key])
    plt.imshow(eval(texturelist[key] + "raster"), cmap = 'gray')

plt.show()

Tuesday, February 23, 2016

K-Means Clustering of a Satellite Images using Scipy

Scipy can be used nicely to apply k-means classification on a satellite image. Most blog entries I find (such as this here) describe how to apply k-means to x/y data but not how to get a image classified into these k-means clusters.

In a first step you load the satellite image into a raster, but then flatten this raster to convert the row x column raster into a one-dimensional array

>>> flatsarraster = sarraster.flatten()
>>> sarraster.shape
(1486, 1109)
>>> flatsarraster.shape
(1647974,)
>>> sarraster
array([[-17.93579102, -17.26957893, -17.6971817 , ...,  -3.9816401 ,
         -4.78030491,  -5.49047613],
       [-15.8222208 , -18.36615753, -17.06410408, ...,  -5.16635323,
         -4.9837966 ,  -5.192698  ],
       [-13.59493637, -15.24811172, -13.07635021, ...,  -3.1499095 ,
         -3.1499095 ,  -5.5106535 ],
       ..., 
       [-15.59587097, -15.20659828, -14.20381069, ...,  -6.38638067,
         -6.44060993,  -3.72703171],
       [-15.73380089, -15.21032333, -14.21171951, ...,  -5.8094368 ,
         -6.5339241 ,  -4.01910543],
       [-15.65506649, -14.98183632, -14.98183632, ...,  -6.26982498,
         -5.8094368 ,  -4.90749454]], dtype=float32)
>>> flatsarraster
array([-17.93579102, -17.26957893, -17.6971817 , ...,  -6.26982498,
        -5.8094368 ,  -4.90749454], dtype=float32)

The command centroids, variance = kmeans(flatsarraster, [number of cluster]) then uses the flattened raster and calculates the centroids of the wanted numbers of clusters, in this case 8 clusters:
>>> from scipy.cluster.vq import *
>>> centroids, variance = kmeans(flatsarraster, 8)
>>> centroids
array([  0.44888815, -11.17466545, -20.26683044,  -5.95262289,
       -13.68685532,  -8.66477108,  -3.06361866, -16.28946304], dtype=float32)

 Now the command code, distance = vq(flatsarraster, centroids) gives a one dimensional array "code" of the same length as the flattened raster, so each position does not contain the number of the original image, but the number of the cluster it belongs to.

>>> flatsarraster
array([-17.93579102, -17.26957893, -17.6971817 , ...,  -6.26982498,
        -5.8094368 ,  -4.90749454], dtype=float32)
>>> code
array([7, 7, 7, ..., 3, 3, 3])
>>> 

So in a third step we reshape this array "code" into the "row x column" raster and get a image showing the clusters.

codeim = code.reshape(sarraster.shape[0], sarraster.shape[1])

Here the code in total, in this case creating a loop to classify 2 to 8 clusters:

# -*- coding: utf-8 -*-


import gdal, gdalconst
from scipy.cluster.vq import *
import matplotlib.pyplot as plt

# Read SAR image into Numpy Array
filename = "//mnt//glaciology//processedSARimages//SARimage.tif"
sarfile = gdal.Open(filename, gdalconst.GA_ReadOnly)
sarraster = sarfile.ReadAsArray()

# Flatten image to get line of values
flatsarraster = sarraster.flatten()

# Create figure to receive results
fig = plt.figure()
fig.suptitle('K-Means Classification')

# In first subplot add original SAR image
ax = plt.subplot(241)
plt.axis('off')
ax.set_title('Original Image')
plt.imshow(sarraster, cmap = 'gray')

# In remaining subplots add k-means classified images
for i in range(7):
    print "Calculate k-means with ", i+2, " cluster."
    
    #This scipy code classifies k-mean, code has same length as flattened
    #SAR raster and defines which class the SAR value corresponds to
    centroids, variance = kmeans(flatsarraster, i+2)
    code, distance = vq(flatsarraster, centroids)
    
    #Since code contains the classified values, reshape into SAR dimensions
    codeim = code.reshape(sarraster.shape[0], sarraster.shape[1])
    
    #Plot the subplot with (i+2)th k-means
    ax = plt.subplot(2,4,i+2)
    plt.axis('off')
    xlabel = str(i+2) , ' clusters'
    ax.set_title(xlabel)
    plt.imshow(codeim)
    
plt.show()

The resulting images, each color indicating a cluster: