Image Segmentation Stepwise Implementation
Implementing a foundation model for image segmentation requires a methodical process with several crucial components. A step-by-step walkthrough of the implementation follows:
Step 1: Import necessary libraries
import pandas as pd
import numpy as np
import os
import tensorflow as tf
import cv2
import matplotlib.pyplot as plt
import glob
from tqdm import tqdm
from skimage.io import imread, imshow
from skimage.transform import resize
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Conv2D, Conv2DTranspose,
                                     BatchNormalization, Activation,
                                     MaxPooling2D, UpSampling2D,
                                     Concatenate, Dropout, Lambda)
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras import backend as K
from tensorflow.keras.preprocessing.image import ImageDataGenerator
Step 2: Download and Extract Mask Images
%%capture
# Download and Extract Mask Image Dataset
!wget https://dataverse.harvard.edu/api/access/datafile/3838943 -O mask_imgs.zip
!mkdir masks
!unzip mask_imgs.zip -d /content/masks
# Download and Extract Image Dataset Part 1
!wget https://dataverse.harvard.edu/api/access/datafile/3172585 -O imgs_1.zip
!mkdir imgs
!unzip imgs_1.zip -d /content/imgs
# Download and Extract Image Dataset Part 2
!wget https://dataverse.harvard.edu/api/access/datafile/3172584 -O imgs_2.zip
!unzip imgs_2.zip -d /content/imgs
Step 3: Load Mask Images and Image Dataset
Load Mask Images
mask_img_list = os.listdir(
    '/content/masks/HAM10000_segmentations_lesion_tschandl')
df_mask_images = pd.DataFrame(mask_img_list, columns=['image_id'])
print('Mask Image Dataset Size: ', df_mask_images.size)
print(df_mask_images.sample(5))
Output:
Mask Image Dataset Size: 10015
image_id
856 ISIC_0026955_segmentation.png
8481 ISIC_0029327_segmentation.png
3663 ISIC_0031816_segmentation.png
5760 ISIC_0029516_segmentation.png
141 ISIC_0030870_segmentation.png
Load Image Dataset
img_list = os.listdir('/content/imgs/')
df_images = pd.DataFrame(img_list, columns=['image_id'])
print('Image Dataset Size: ', df_images.size)
print(df_images.sample(5))
Output:
Image Dataset Size: 5000
image_id
658 ISIC_0027306.jpg
3046 ISIC_0027958.jpg
153 ISIC_0026017.jpg
2846 ISIC_0027431.jpg
2056 ISIC_0026607.jpg
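Before preprocessing, it is worth checking that every image has a corresponding mask. A minimal sketch, relying on the file-naming convention visible in the samples above (masks end in _segmentation.png, images in .jpg):
# Compare the ISIC ids present in the two folders
mask_ids = {n.replace('_segmentation.png', '') for n in mask_img_list}
img_ids = {n.replace('.jpg', '') for n in img_list}
print('Images without a mask:', len(img_ids - mask_ids))
print('Masks without an image:', len(mask_ids - img_ids))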
Step 4: Preprocessing
Load and Resize Images and Masks
img_bad = []
mask_bad = []
not_found_imgs = []
start_val = 24306
i = start_val + 1
size = start_val + 1000
while i <= size:
    # Zero-pad the ISIC id to seven digits (e.g. 24307 -> 0024307)
    num = str(i).zfill(7)
    mask_path = f'/content/masks/HAM10000_segmentations_lesion_tschandl/ISIC_{num}_segmentation.png'
    mask = cv2.imread(mask_path)  # returns None if the file is missing
    if mask is not None:
        mask = cv2.resize(mask, (128, 128))
        mask_bad.append(mask)
    else:
        not_found_imgs.append(i)
    img_path = f'/content/imgs/ISIC_{num}.jpg'
    img = cv2.imread(img_path)
    if img is not None:
        img = cv2.resize(img, (128, 128))
        img_bad.append(img)
    i += 1
mask_bad = np.array(mask_bad)
img_bad = np.array(img_bad)
print('Loaded Images:', len(img_bad))
print('Images Not Found:', not_found_imgs)
Output:
Loaded Images: 1000
Images Not Found: []
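At this point the arrays are still uint8 with values in 0-255, and the model below is trained on them as-is. A common optional step, shown here as a sketch only (it is not part of the original pipeline), is to scale inputs to [0, 1] and binarize the masks:
# Optional preprocessing sketch (assumption, not used by the run below):
img_scaled = img_bad.astype('float32') / 255.0                  # pixels in [0, 1]
mask_binary = (mask_bad[:, :, :, :1] > 127).astype('float32')   # {0, 1} masks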
Display Image and Mask
def RGBimshow(img):
    # Convert OpenCV's BGR channel order to RGB for matplotlib
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)

i = 135
plt.rcParams['figure.figsize'] = [10, 5]
plt.figure(1)
plt.subplot(1, 2, 1)
RGBimshow(img_bad[i])
plt.figure(2)
plt.subplot(1, 2, 1)
plt.rcParams['figure.figsize'] = [10, 5]
plt.imshow(mask_bad[i])
Output:
Step 5: Model Building
- Model Architecture: Combines the U-Net topology with residual connections and attention mechanisms.
- Encoder block: Each of the four encoder stages pairs a residual convolutional block with max pooling for downsampling.
- Decoder block: Each of the four decoder stages combines upsampling, an attention block, and a residual convolutional block.
- Final output: A 1x1 convolution followed by batch normalization and a sigmoid activation (three layers) produces the segmentation map. The Attention Residual U-Net model is then trained on the training and validation data. A compact sketch of the resulting shape flow follows this list.
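For orientation before the code, here is the shape flow for a 128x128x3 input; these sizes follow from the pool size of 2 in the encoder and the upsampling size of 2 in the decoder used below.
# Shape flow of Attention Res-UNet for a 128x128x3 input (comment sketch):
#   Encoder: 128x128x64 -> 64x64x128 -> 32x32x256 -> 16x16x512
#   Bridge:  8x8x1024
#   Decoder: 16x16x512 -> 32x32x256 -> 64x64x128 -> 128x128x64
#   Head:    1x1 conv + batch norm + sigmoid -> 128x128x1 mask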
Define Helper Functions
# Repeats the elements of a tensor along the channel axis
def repeat_elem(tensor, rep):
    # Repeats the elements of a tensor along axis 3 by a factor of rep.
    # E.g. a (None, 256, 256, 3) tensor becomes (None, 256, 256, 6) for rep=2.
    return Lambda(lambda x, repnum: K.repeat_elements(x, repnum, axis=3),
                  arguments={'repnum': rep})(tensor)
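A quick illustration of what repeat_elem does, using a small hypothetical tensor:
# The channel dimension is multiplied by the repetition factor
t = Input((4, 4, 2))
print(K.int_shape(repeat_elem(t, 3)))  # prints (None, 4, 4, 6)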
# Residual convolutional block with optional batch normalization and dropout
def res_conv_block(x, filter_size, size, dropout, batch_norm=False):
    # First convolutional layer
    conv = Conv2D(size, (filter_size, filter_size), padding='same')(x)
    if batch_norm:
        conv = BatchNormalization(axis=3)(conv)
    conv = Activation('relu')(conv)
    # Second convolutional layer
    conv = Conv2D(size, (filter_size, filter_size), padding='same')(conv)
    if batch_norm:
        conv = BatchNormalization(axis=3)(conv)
    if dropout > 0:
        conv = Dropout(dropout)(conv)
    # Shortcut connection from the input
    shortcut = Conv2D(size, kernel_size=(1, 1), padding='same')(x)
    if batch_norm:
        shortcut = BatchNormalization(axis=3)(shortcut)
    # Add the shortcut and the conv output
    res_path = tf.keras.layers.add([shortcut, conv])
    res_path = Activation('relu')(res_path)
    return res_path

# Creates a gating signal with optional batch normalization
def gating_signal(input, out_size, batch_norm=False):
    x = Conv2D(out_size, (1, 1), padding='same')(input)
    if batch_norm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x
Attention block
# Attention block for enhancing feature maps from the encoder using gating signal from the decoder
def attention_block(x, gating, inter_shape):
    shape_x = K.int_shape(x)
    shape_g = K.int_shape(gating)
    # Downsample the input tensor x
    theta_x = Conv2D(inter_shape, (2, 2), strides=(2, 2), padding='same')(x)
    shape_theta_x = K.int_shape(theta_x)
    # Apply a 1x1 convolution to the gating signal
    phi_g = Conv2D(inter_shape, (1, 1), padding='same')(gating)
    upsample_g = Conv2DTranspose(inter_shape, (3, 3),
                                 strides=(shape_theta_x[1] // shape_g[1],
                                          shape_theta_x[2] // shape_g[2]),
                                 padding='same')(phi_g)
    # Add the downsampled input tensor and the upsampled gating signal
    concat_xg = tf.keras.layers.add([upsample_g, theta_x])
    act_xg = Activation('relu')(concat_xg)
    # Apply a 1x1 convolution followed by a sigmoid activation
    psi = Conv2D(1, (1, 1), padding='same')(act_xg)
    sigmoid_xg = Activation('sigmoid')(psi)
    shape_sigmoid = K.int_shape(sigmoid_xg)
    # Upsample the attention map back to the size of x
    upsample_psi = UpSampling2D(size=(shape_x[1] // shape_sigmoid[1],
                                      shape_x[2] // shape_sigmoid[2]))(sigmoid_xg)
    # Repeat the attention map along the channel dimension
    upsample_psi = repeat_elem(upsample_psi, shape_x[3])
    # Multiply the input tensor by the attention map
    y = tf.keras.layers.multiply([upsample_psi, x])
    # Apply a 1x1 convolution and batch normalization
    result = Conv2D(shape_x[3], (1, 1), padding='same')(y)
    result_bn = BatchNormalization()(result)
    return result_bn
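As a quick sanity check, the attention block should preserve the spatial size and channel count of the encoder feature map x. A minimal sketch with hypothetical shapes matching one decoder stage of the model below:
x_in = Input((32, 32, 256))   # hypothetical encoder feature map
g_in = Input((16, 16, 512))   # hypothetical decoder feature map
att_out = attention_block(x_in, gating_signal(g_in, 256), 256)
print(K.int_shape(att_out))   # expected: (None, 32, 32, 256)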
Encoder block
# Encoder block consisting of a residual convolutional block and a max pooling layer
def encoder_block(inputs, filter_size, filter_num, dropout_rate, batch_norm):
    conv = res_conv_block(inputs, filter_size, filter_num,
                          dropout_rate, batch_norm)
    pool = MaxPooling2D(pool_size=(2, 2))(conv)
    return conv, pool
Decoder block
# Decoder block consisting of upsampling, attention mechanism, concatenation, and residual convolutional block
def decoder_block(input, conv, filter_size, filter_num, dropout_rate, batch_norm, up_samp_size, axis):
    # Create a gating signal from the input
    gating = gating_signal(input, filter_num, batch_norm)
    # Apply attention to the corresponding encoder output
    att = attention_block(conv, gating, filter_num)
    # Upsample the input
    up = UpSampling2D(size=(up_samp_size, up_samp_size),
                      data_format="channels_last")(input)
    # Concatenate the upsampled input with the attention output
    up = Concatenate(axis=axis)([up, att])
    # Apply a residual convolutional block to the concatenated output
    up_conv = res_conv_block(up, filter_size, filter_num,
                             dropout_rate, batch_norm)
    return up_conv
Define Attention ResUNet Model
def Attention_Res_UNet(input_shape, NUM_CLASSES=1, dropout_rate=0.0, batch_norm=True):
    FILTER_NUM = 64    # number of basic filters for the first layer
    FILTER_SIZE = 3    # size of the convolutional filter
    UP_SAMP_SIZE = 2   # size of the upsampling filters
    inputs = Input(input_shape, dtype=tf.float32)
    axis = 3
    # Downsampling layers (Encoder Block)
    conv_128, pool_64 = encoder_block(
        inputs, FILTER_SIZE, FILTER_NUM, dropout_rate, batch_norm)
    conv_64, pool_32 = encoder_block(
        pool_64, FILTER_SIZE, 2*FILTER_NUM, dropout_rate, batch_norm)
    conv_32, pool_16 = encoder_block(
        pool_32, FILTER_SIZE, 4*FILTER_NUM, dropout_rate, batch_norm)
    conv_16, pool_8 = encoder_block(
        pool_16, FILTER_SIZE, 8*FILTER_NUM, dropout_rate, batch_norm)
    conv_8 = res_conv_block(
        pool_8, FILTER_SIZE, 16*FILTER_NUM, dropout_rate, batch_norm)
    # Upsampling layers (Decoder Block)
    up_conv_16 = decoder_block(conv_8, conv_16, FILTER_SIZE,
                               8*FILTER_NUM, dropout_rate, batch_norm, UP_SAMP_SIZE, axis)
    up_conv_32 = decoder_block(up_conv_16, conv_32, FILTER_SIZE,
                               4*FILTER_NUM, dropout_rate, batch_norm, UP_SAMP_SIZE, axis)
    up_conv_64 = decoder_block(up_conv_32, conv_64, FILTER_SIZE,
                               2*FILTER_NUM, dropout_rate, batch_norm, UP_SAMP_SIZE, axis)
    up_conv_128 = decoder_block(up_conv_64, conv_128, FILTER_SIZE,
                                FILTER_NUM, dropout_rate, batch_norm, UP_SAMP_SIZE, axis)
    # Final 1x1 convolutional layer
    conv_final = Conv2D(NUM_CLASSES, kernel_size=(1, 1))(up_conv_128)
    conv_final = BatchNormalization(axis=axis)(conv_final)
    conv_final = Activation('sigmoid')(conv_final)
    model = Model(inputs, conv_final, name="AttentionResUNet")
    return model
Model Summary
input_shape = (128, 128, 3)
model = Attention_Res_UNet(input_shape)
model.summary()
Output:
Model: "AttentionResUNet"
βββββββββββββββββββββββ³ββββββββββββββββββββ³βββββββββββββ³ββββββββββββββββββββ
β Layer (type) β Output Shape β Param # β Connected to β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β input_layer_2 β (None, 128, 128, β 0 β - β
β (InputLayer) β 3) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_96 (Conv2D) β (None, 128, 128, β 1,792 β input_layer_2[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_96[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_62 β (None, 128, 128, β 0 β batch_normalizatβ¦ β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_98 (Conv2D) β (None, 128, 128, β 256 β input_layer_2[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_97 (Conv2D) β (None, 128, 128, β 36,928 β activation_62[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_98[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_97[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_26 (Add) β (None, 128, 128, β 0 β batch_normalizatβ¦ β
β β 64) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_63 β (None, 128, 128, β 0 β add_26[0][0] β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β max_pooling2d_8 β (None, 64, 64, β 0 β activation_63[0]β¦ β
β (MaxPooling2D) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_99 (Conv2D) β (None, 64, 64, β 73,856 β max_pooling2d_8[β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_99[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_64 β (None, 64, 64, β 0 β batch_normalizatβ¦ β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_101 (Conv2D) β (None, 64, 64, β 8,320 β max_pooling2d_8[β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_100 (Conv2D) β (None, 64, 64, β 147,584 β activation_64[0]β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_101[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_100[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_27 (Add) β (None, 64, 64, β 0 β batch_normalizatβ¦ β
β β 128) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_65 β (None, 64, 64, β 0 β add_27[0][0] β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β max_pooling2d_9 β (None, 32, 32, β 0 β activation_65[0]β¦ β
β (MaxPooling2D) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_102 (Conv2D) β (None, 32, 32, β 295,168 β max_pooling2d_9[β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_102[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_66 β (None, 32, 32, β 0 β batch_normalizatβ¦ β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_104 (Conv2D) β (None, 32, 32, β 33,024 β max_pooling2d_9[β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_103 (Conv2D) β (None, 32, 32, β 590,080 β activation_66[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_104[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_103[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_28 (Add) β (None, 32, 32, β 0 β batch_normalizatβ¦ β
β β 256) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_67 β (None, 32, 32, β 0 β add_28[0][0] β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β max_pooling2d_10 β (None, 16, 16, β 0 β activation_67[0]β¦ β
β (MaxPooling2D) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_105 (Conv2D) β (None, 16, 16, β 1,180,160 β max_pooling2d_10β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_105[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_68 β (None, 16, 16, β 0 β batch_normalizatβ¦ β
β (Activation) β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_107 (Conv2D) β (None, 16, 16, β 131,584 β max_pooling2d_10β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_106 (Conv2D) β (None, 16, 16, β 2,359,808 β activation_68[0]β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_107[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_106[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_29 (Add) β (None, 16, 16, β 0 β batch_normalizatβ¦ β
β β 512) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_69 β (None, 16, 16, β 0 β add_29[0][0] β
β (Activation) β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β max_pooling2d_11 β (None, 8, 8, 512) β 0 β activation_69[0]β¦ β
β (MaxPooling2D) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_108 (Conv2D) β (None, 8, 8, β 4,719,616 β max_pooling2d_11β¦ β
β β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 8, 8, β 4,096 β conv2d_108[0][0] β
β (BatchNormalizatioβ¦ β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_70 β (None, 8, 8, β 0 β batch_normalizatβ¦ β
β (Activation) β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_110 (Conv2D) β (None, 8, 8, β 525,312 β max_pooling2d_11β¦ β
β β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_109 (Conv2D) β (None, 8, 8, β 9,438,208 β activation_70[0]β¦ β
β β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 8, 8, β 4,096 β conv2d_110[0][0] β
β (BatchNormalizatioβ¦ β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 8, 8, β 4,096 β conv2d_109[0][0] β
β (BatchNormalizatioβ¦ β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_30 (Add) β (None, 8, 8, β 0 β batch_normalizatβ¦ β
β β 1024) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_71 β (None, 8, 8, β 0 β add_30[0][0] β
β (Activation) β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_111 (Conv2D) β (None, 8, 8, 512) β 524,800 β activation_71[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 8, 8, 512) β 2,048 β conv2d_111[0][0] β
β (BatchNormalizatioβ¦ β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_72 β (None, 8, 8, 512) β 0 β batch_normalizatβ¦ β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_113 (Conv2D) β (None, 8, 8, 512) β 262,656 β activation_72[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_transpose_8 β (None, 8, 8, 512) β 2,359,808 β conv2d_113[0][0] β
β (Conv2DTranspose) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_112 (Conv2D) β (None, 8, 8, 512) β 1,049,088 β activation_69[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_31 (Add) β (None, 8, 8, 512) β 0 β conv2d_transposeβ¦ β
β β β β conv2d_112[0][0] β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_73 β (None, 8, 8, 512) β 0 β add_31[0][0] β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_114 (Conv2D) β (None, 8, 8, 1) β 513 β activation_73[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_74 β (None, 8, 8, 1) β 0 β conv2d_114[0][0] β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_16 β (None, 16, 16, 1) β 0 β activation_74[0]β¦ β
β (UpSampling2D) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β lambda_8 (Lambda) β (None, 16, 16, β 0 β up_sampling2d_16β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β multiply_8 β (None, 16, 16, β 0 β lambda_8[0][0], β
β (Multiply) β 512) β β activation_69[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_115 (Conv2D) β (None, 16, 16, β 262,656 β multiply_8[0][0] β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_17 β (None, 16, 16, β 0 β activation_71[0]β¦ β
β (UpSampling2D) β 1024) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_115[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β concatenate_8 β (None, 16, 16, β 0 β up_sampling2d_17β¦ β
β (Concatenate) β 1536) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_116 (Conv2D) β (None, 16, 16, β 7,078,400 β concatenate_8[0]β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_116[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_75 β (None, 16, 16, β 0 β batch_normalizatβ¦ β
β (Activation) β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_118 (Conv2D) β (None, 16, 16, β 786,944 β concatenate_8[0]β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_117 (Conv2D) β (None, 16, 16, β 2,359,808 β activation_75[0]β¦ β
β β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_118[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 2,048 β conv2d_117[0][0] β
β (BatchNormalizatioβ¦ β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_32 (Add) β (None, 16, 16, β 0 β batch_normalizatβ¦ β
β β 512) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_76 β (None, 16, 16, β 0 β add_32[0][0] β
β (Activation) β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_119 (Conv2D) β (None, 16, 16, β 131,328 β activation_76[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 16, 16, β 1,024 β conv2d_119[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_77 β (None, 16, 16, β 0 β batch_normalizatβ¦ β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_121 (Conv2D) β (None, 16, 16, β 65,792 β activation_77[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_transpose_9 β (None, 16, 16, β 590,080 β conv2d_121[0][0] β
β (Conv2DTranspose) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_120 (Conv2D) β (None, 16, 16, β 262,400 β activation_67[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_33 (Add) β (None, 16, 16, β 0 β conv2d_transposeβ¦ β
β β 256) β β conv2d_120[0][0] β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_78 β (None, 16, 16, β 0 β add_33[0][0] β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_122 (Conv2D) β (None, 16, 16, 1) β 257 β activation_78[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_79 β (None, 16, 16, 1) β 0 β conv2d_122[0][0] β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_18 β (None, 32, 32, 1) β 0 β activation_79[0]β¦ β
β (UpSampling2D) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β lambda_9 (Lambda) β (None, 32, 32, β 0 β up_sampling2d_18β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β multiply_9 β (None, 32, 32, β 0 β lambda_9[0][0], β
β (Multiply) β 256) β β activation_67[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_123 (Conv2D) β (None, 32, 32, β 65,792 β multiply_9[0][0] β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_19 β (None, 32, 32, β 0 β activation_76[0]β¦ β
β (UpSampling2D) β 512) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_123[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β concatenate_9 β (None, 32, 32, β 0 β up_sampling2d_19β¦ β
β (Concatenate) β 768) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_124 (Conv2D) β (None, 32, 32, β 1,769,728 β concatenate_9[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_124[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_80 β (None, 32, 32, β 0 β batch_normalizatβ¦ β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_126 (Conv2D) β (None, 32, 32, β 196,864 β concatenate_9[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_125 (Conv2D) β (None, 32, 32, β 590,080 β activation_80[0]β¦ β
β β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_126[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 1,024 β conv2d_125[0][0] β
β (BatchNormalizatioβ¦ β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_34 (Add) β (None, 32, 32, β 0 β batch_normalizatβ¦ β
β β 256) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_81 β (None, 32, 32, β 0 β add_34[0][0] β
β (Activation) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_127 (Conv2D) β (None, 32, 32, β 32,896 β activation_81[0]β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 32, 32, β 512 β conv2d_127[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_82 β (None, 32, 32, β 0 β batch_normalizatβ¦ β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_129 (Conv2D) β (None, 32, 32, β 16,512 β activation_82[0]β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_transpose_10 β (None, 32, 32, β 147,584 β conv2d_129[0][0] β
β (Conv2DTranspose) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_128 (Conv2D) β (None, 32, 32, β 65,664 β activation_65[0]β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_35 (Add) β (None, 32, 32, β 0 β conv2d_transposeβ¦ β
β β 128) β β conv2d_128[0][0] β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_83 β (None, 32, 32, β 0 β add_35[0][0] β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_130 (Conv2D) β (None, 32, 32, 1) β 129 β activation_83[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_84 β (None, 32, 32, 1) β 0 β conv2d_130[0][0] β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_20 β (None, 64, 64, 1) β 0 β activation_84[0]β¦ β
β (UpSampling2D) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β lambda_10 (Lambda) β (None, 64, 64, β 0 β up_sampling2d_20β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β multiply_10 β (None, 64, 64, β 0 β lambda_10[0][0], β
β (Multiply) β 128) β β activation_65[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_131 (Conv2D) β (None, 64, 64, β 16,512 β multiply_10[0][0] β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_21 β (None, 64, 64, β 0 β activation_81[0]β¦ β
β (UpSampling2D) β 256) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_131[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β concatenate_10 β (None, 64, 64, β 0 β up_sampling2d_21β¦ β
β (Concatenate) β 384) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_132 (Conv2D) β (None, 64, 64, β 442,496 β concatenate_10[0β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_132[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_85 β (None, 64, 64, β 0 β batch_normalizatβ¦ β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_134 (Conv2D) β (None, 64, 64, β 49,280 β concatenate_10[0β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_133 (Conv2D) β (None, 64, 64, β 147,584 β activation_85[0]β¦ β
β β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_134[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 512 β conv2d_133[0][0] β
β (BatchNormalizatioβ¦ β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_36 (Add) β (None, 64, 64, β 0 β batch_normalizatβ¦ β
β β 128) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_86 β (None, 64, 64, β 0 β add_36[0][0] β
β (Activation) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_135 (Conv2D) β (None, 64, 64, β 8,256 β activation_86[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 64, 64, β 256 β conv2d_135[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_87 β (None, 64, 64, β 0 β batch_normalizatβ¦ β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_137 (Conv2D) β (None, 64, 64, β 4,160 β activation_87[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_transpose_11 β (None, 64, 64, β 36,928 β conv2d_137[0][0] β
β (Conv2DTranspose) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_136 (Conv2D) β (None, 64, 64, β 16,448 β activation_63[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_37 (Add) β (None, 64, 64, β 0 β conv2d_transposeβ¦ β
β β 64) β β conv2d_136[0][0] β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_88 β (None, 64, 64, β 0 β add_37[0][0] β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_138 (Conv2D) β (None, 64, 64, 1) β 65 β activation_88[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_89 β (None, 64, 64, 1) β 0 β conv2d_138[0][0] β
β (Activation) β β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_22 β (None, 128, 128, β 0 β activation_89[0]β¦ β
β (UpSampling2D) β 1) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β lambda_11 (Lambda) β (None, 128, 128, β 0 β up_sampling2d_22β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β multiply_11 β (None, 128, 128, β 0 β lambda_11[0][0], β
β (Multiply) β 64) β β activation_63[0]β¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_139 (Conv2D) β (None, 128, 128, β 4,160 β multiply_11[0][0] β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β up_sampling2d_23 β (None, 128, 128, β 0 β activation_86[0]β¦ β
β (UpSampling2D) β 128) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_139[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β concatenate_11 β (None, 128, 128, β 0 β up_sampling2d_23β¦ β
β (Concatenate) β 192) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_140 (Conv2D) β (None, 128, 128, β 110,656 β concatenate_11[0β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_140[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_90 β (None, 128, 128, β 0 β batch_normalizatβ¦ β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_142 (Conv2D) β (None, 128, 128, β 12,352 β concatenate_11[0β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_141 (Conv2D) β (None, 128, 128, β 36,928 β activation_90[0]β¦ β
β β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_142[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 256 β conv2d_141[0][0] β
β (BatchNormalizatioβ¦ β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β add_38 (Add) β (None, 128, 128, β 0 β batch_normalizatβ¦ β
β β 64) β β batch_normalizatβ¦ β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_91 β (None, 128, 128, β 0 β add_38[0][0] β
β (Activation) β 64) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β conv2d_143 (Conv2D) β (None, 128, 128, β 65 β activation_91[0]β¦ β
β β 1) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β batch_normalizatioβ¦ β (None, 128, 128, β 4 β conv2d_143[0][0] β
β (BatchNormalizatioβ¦ β 1) β β β
βββββββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββΌββββββββββββββββββββ€
β activation_92 β (None, 128, 128, β 0 β batch_normalizatβ¦ β
β (Activation) β 1) β β β
βββββββββββββββββββββββ΄ββββββββββββββββββββ΄βββββββββββββ΄ββββββββββββββββββββ
Total params: 39,090,377 (149.12 MB)
Trainable params: 39,068,871 (149.04 MB)
Non-trainable params: 21,506 (84.01 KB)
Step 6: Model Training
Prepare Data for Training
# Pair up images and masks; the min() guard skips any trailing index where
# one of the two arrays is shorter (e.g. a missing mask file in step 4)
s = min(img_bad.shape[0], mask_bad.shape[0])
img = []
mask = []
for i in range(s):
    img.append(img_bad[i])
    mask.append(mask_bad[i][:, :, 0:1])  # keep a single mask channel
img = np.array(img)
mask = np.array(mask)
mask = mask.astype(bool)  # binarize the mask values
mask.shape
Output:
(1000, 128, 128, 1)
Define Callbacks and Compile Model
# EarlyStopping monitors val_accuracy, so validation data must be supplied
# to fit() for it to take effect
call = EarlyStopping(monitor='val_accuracy', patience=5,
                     restore_best_weights=True)
arr = []

# Optional callback that stores a prediction after every epoch
class CustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        arr.append(self.model.predict(img[1:2]).reshape(128, 128))

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='binary_crossentropy', metrics=['accuracy'])
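Note that the training run below supplies no validation data, so val_accuracy is never computed and the EarlyStopping callback stays inactive for the full 50 epochs. A minimal variant that would activate it, assuming a 10% hold-out split (the split is an assumption, not part of the run shown):
# Hypothetical variant: hold out 10% of the data so val_accuracy exists
# and EarlyStopping can restore the best weights
history = model.fit(x=img, y=mask, epochs=50,
                    validation_split=0.1, callbacks=[call])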
Train Model
history = model.fit(x=img, y=mask, epochs=50, callbacks=[call])
Output:
Epoch 1/50
32/32 [==============================] - 89s 962ms/step - loss: 0.4492 - accuracy: 0.8589
Epoch 2/50
32/32 [==============================] - 22s 676ms/step - loss: 0.3339 - accuracy: 0.9085
Epoch 3/50
32/32 [==============================] - 22s 678ms/step - loss: 0.2804 - accuracy: 0.9188
Epoch 4/50
32/32 [==============================] - 22s 679ms/step - loss: 0.2390 - accuracy: 0.9282
Epoch 5/50
32/32 [==============================] - 22s 684ms/step - loss: 0.2261 - accuracy: 0.9261
Epoch 6/50
32/32 [==============================] - 22s 690ms/step - loss: 0.2024 - accuracy: 0.9345
Epoch 7/50
32/32 [==============================] - 22s 692ms/step - loss: 0.1948 - accuracy: 0.9338
Epoch 8/50
32/32 [==============================] - 22s 693ms/step - loss: 0.1869 - accuracy: 0.9359
Epoch 9/50
32/32 [==============================] - 22s 694ms/step - loss: 0.1766 - accuracy: 0.9385
Epoch 10/50
32/32 [==============================] - 22s 697ms/step - loss: 0.1765 - accuracy: 0.9373
Epoch 11/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1702 - accuracy: 0.9388
Epoch 12/50
32/32 [==============================] - 22s 702ms/step - loss: 0.1637 - accuracy: 0.9403
Epoch 13/50
32/32 [==============================] - 23s 705ms/step - loss: 0.1560 - accuracy: 0.9432
Epoch 14/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1505 - accuracy: 0.9458
Epoch 15/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1451 - accuracy: 0.9467
Epoch 16/50
32/32 [==============================] - 22s 702ms/step - loss: 0.1444 - accuracy: 0.9465
Epoch 17/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1419 - accuracy: 0.9474
Epoch 18/50
32/32 [==============================] - 22s 698ms/step - loss: 0.1366 - accuracy: 0.9488
Epoch 19/50
32/32 [==============================] - 22s 701ms/step - loss: 0.1317 - accuracy: 0.9502
Epoch 20/50
32/32 [==============================] - 22s 701ms/step - loss: 0.1356 - accuracy: 0.9475
Epoch 21/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1271 - accuracy: 0.9518
Epoch 22/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1206 - accuracy: 0.9541
Epoch 23/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1293 - accuracy: 0.9506
Epoch 24/50
32/32 [==============================] - 22s 701ms/step - loss: 0.1226 - accuracy: 0.9532
Epoch 25/50
32/32 [==============================] - 22s 701ms/step - loss: 0.1232 - accuracy: 0.9527
Epoch 26/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1362 - accuracy: 0.9481
Epoch 27/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1158 - accuracy: 0.9550
Epoch 28/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1057 - accuracy: 0.9595
Epoch 29/50
32/32 [==============================] - 22s 700ms/step - loss: 0.1132 - accuracy: 0.9555
Epoch 30/50
32/32 [==============================] - 22s 702ms/step - loss: 0.1097 - accuracy: 0.9577
Epoch 31/50
32/32 [==============================] - 22s 702ms/step - loss: 0.0975 - accuracy: 0.9621
Epoch 32/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0986 - accuracy: 0.9617
Epoch 33/50
32/32 [==============================] - 22s 698ms/step - loss: 0.1057 - accuracy: 0.9579
Epoch 34/50
32/32 [==============================] - 22s 701ms/step - loss: 0.0950 - accuracy: 0.9627
Epoch 35/50
32/32 [==============================] - 22s 703ms/step - loss: 0.0931 - accuracy: 0.9634
Epoch 36/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0878 - accuracy: 0.9655
Epoch 37/50
32/32 [==============================] - 22s 699ms/step - loss: 0.1033 - accuracy: 0.9596
Epoch 38/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0928 - accuracy: 0.9638
Epoch 39/50
32/32 [==============================] - 23s 704ms/step - loss: 0.0972 - accuracy: 0.9618
Epoch 40/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0953 - accuracy: 0.9623
Epoch 41/50
32/32 [==============================] - 22s 698ms/step - loss: 0.0808 - accuracy: 0.9679
Epoch 42/50
32/32 [==============================] - 22s 701ms/step - loss: 0.0744 - accuracy: 0.9708
Epoch 43/50
32/32 [==============================] - 22s 703ms/step - loss: 0.0691 - accuracy: 0.9732
Epoch 44/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0736 - accuracy: 0.9708
Epoch 45/50
32/32 [==============================] - 22s 698ms/step - loss: 0.0671 - accuracy: 0.9738
Epoch 46/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0681 - accuracy: 0.9733
Epoch 47/50
32/32 [==============================] - 22s 702ms/step - loss: 0.0685 - accuracy: 0.9728
Epoch 48/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0880 - accuracy: 0.9650
Epoch 49/50
32/32 [==============================] - 22s 699ms/step - loss: 0.0734 - accuracy: 0.9711
Epoch 50/50
32/32 [==============================] - 22s 700ms/step - loss: 0.0623 - accuracy: 0.9752
Step 7: Predictions
Helper Functions to Calculate Area
def maskArea(img):
    # Converts the white-pixel count of a 0-255 mask into cm^2,
    # assuming a 72 DPI rendering
    DPI = 72
    INCH_TO_CM = 2.54
    sum_of_pixels = img.sum() / 255
    img_area = ((1 / DPI) ** 2) * (INCH_TO_CM ** 2) * sum_of_pixels
    return img_area

def area(img):
    # Same conversion for a binary {0, 1} mask
    DPI = 72
    INCH_TO_CM = 2.54
    sum_of_pixels = img.sum()
    img_area = ((1 / DPI) ** 2) * (INCH_TO_CM ** 2) * sum_of_pixels
    return img_area
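For scale: at 72 DPI one pixel is 2.54/72 (about 0.0353) cm on a side, so each mask pixel contributes roughly 0.00124 cm^2. A short usage sketch on a thresholded prediction (the 0.5 cutoff matches the one used below):
# Estimate the lesion area (in cm^2) from a binary predicted mask
pred = model.predict(img[55:56]).reshape(128, 128)
binary_pred = (pred >= 0.5).astype(np.uint8)
print('Estimated lesion area: %.2f cm^2' % area(binary_pred))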
Predict and Display Results
i = 55
# Predict the mask for the i-th image
img_test_1 = model.predict(img[i:i+1]).reshape(128, 128)
# Plot the predicted mask
plt.figure(1)
plt.subplot(122)
plt.imshow(img_test_1, cmap=plt.cm.binary)
plt.title('Predicted Mask')
plt.colorbar()
# Plot the original image
plt.figure(2)
plt.subplot(122)
RGBimshow(img[i])
plt.title('Original Image')
# Plot the ground truth mask
plt.figure(3)
plt.subplot(122)
plt.imshow(mask[i].reshape(128, 128), cmap=plt.cm.binary)
plt.title('Ground Truth Mask')
# Convert the predicted mask to binary
img_test_2 = (img_test_1 >= 0.5)
# Plot the binary predicted mask
plt.figure(1)
plt.subplot(1, 2, 1)
plt.imshow(img_test_2)
plt.title('Binary Predicted Mask')
# Plot the binary predicted mask with binary colormap
plt.figure(2)
plt.subplot(1, 2, 1)
plt.imshow(img_test_2, cmap=plt.cm.binary)
plt.title('Binary Predicted Mask (Binary Colormap)')
i = 135
# Predict the mask for the i-th image
img_pred = model.predict(img[i:i+1]).reshape(128, 128)
img_pred = (img_pred >= 0.5)
# Plot the predicted mask
plt.figure(1)
plt.subplot(122)
plt.imshow(img_pred, cmap=plt.cm.binary)
plt.title('Predicted Mask')
plt.colorbar()
# Plot the original image
plt.figure(2)
plt.subplot(122)
RGBimshow(img[i])
plt.title('Original Image')
# Plot the ground truth mask
plt.figure(3)
plt.subplot(122)
plt.imshow(mask_bad[i][:, :, 0].reshape(128, 128), cmap=plt.cm.binary)
plt.title('Ground Truth Mask')
i = 57
# Predict the mask for the i-th image
img_pred = model.predict(img[i:i+1]).reshape(128, 128)
img_pred = (img_pred >= 0.5)
# Plot the predicted mask
plt.figure(1)
plt.subplot(122)
plt.imshow(img_pred, cmap=plt.cm.binary)
plt.title('Predicted Mask')
plt.colorbar()
# Plot the original image
plt.figure(2)
plt.subplot(122)
RGBimshow(img[i])
plt.title('Original Image')
# Plot the ground truth mask
plt.figure(3)
plt.subplot(122)
plt.imshow(mask_bad[i][:, :, 0].reshape(128, 128), cmap=plt.cm.binary)
plt.title('Ground Truth Mask')
Output:
1/1 ββββββββββββββββββββ 0s 21ms/step
1/1 ββββββββββββββββββββ 0s 22ms/step
1/1 ββββββββββββββββββββ 0s 22ms/step
Text(0.5, 1.0, 'Ground Truth Mask')
Segment Anything: A Foundation Model for Image Segmentation
In computer vision, partitioning an image into distinct segments or regions is a fundamental operation. This article, "Segment Anything: A Foundation Model for Image Segmentation", introduces the Attention Res-UNet, a model for delineating the distinct regions of an image.
In this article, we explore the idea of a foundation model designed for image segmentation: its architecture and a stage-by-stage implementation covering data preparation, model building, training, and prediction. We also discuss performance evaluation metrics and present case studies that illustrate its use across different fields.
Table of Contents
- Overview of Image Segmentation
- What is Attention Res-UNet
- Image Segmentation Stepwise Implementation
- Step 1: Import necessary libraries
- Step 2: Download and Extract Mask Images
- Step 3: Load Mask Images and Image Dataset
- Load Mask Images
- Load Image Dataset
- Step 4: Preprocessing
- Load and Resize Images and Masks
- Display Image and Mask
- Step 5: Model Building
- Define Helper Functions
- Attention block
- Encoder block
- Decoder block
- Define Attention ResUNet Model
- Model Summary
- Step 6: Model Training
- Prepare Data for Training
- Define Callbacks and Compile Model
- Train Model
- Step 7: Predictions
- Helper Functions to Calculate Area
- Predict and Display Results
- Application of Attention Res-UNet in Image Segmentation
- Performance Evaluation and Case Studies of Attention Res-UNet
- Case Studies:
- Conclusion