Training the machine learning model
After preprocessing comes the hard part: training a model. Various machine learning algorithms can be used to break CAPTCHAs, most commonly Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). CNNs are well suited to image recognition, while RNNs handle sequential data well, which makes them suitable for things like audio-based CAPTCHAs. The preprocessed images are fed to the machine learning model, which compares its predictions with the correct labels and repeatedly adjusts its weights and biases to reduce the error. In this way it gradually learns to recognize the patterns in the provided images.
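To make "adjusting weights and biases" concrete, here is a minimal sketch of gradient descent on a single linear neuron. It is a toy illustration, not the CAPTCHA model itself: the function name `train_linear_neuron`, the learning rate, and the synthetic data are all assumptions chosen for this example.

```python
import numpy as np


def train_linear_neuron(x, y, learning_rate=0.1, steps=500):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        pred = w * x + b
        error = pred - y
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
        grad_w = 2.0 * np.mean(error * x)
        grad_b = 2.0 * np.mean(error)
        # Move each parameter a small step against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic data generated from a known rule, so we can see what is learned.
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0
w, b = train_linear_neuron(x, y)
print(w, b)
```

A real network repeats the same idea over many more parameters, with the gradients computed automatically by the framework.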
But there is one hurdle. CAPTCHA images are highly variable, which makes it hard for a machine learning model to find patterns from a limited set of examples. Data augmentation addresses this by making the training data more varied, for example by rotating, scaling, and flipping the images. But before data augmentation, we need to split the data into two parts, one for training and the other for testing. This way, we can later measure how accurate our model is on images it has not seen. Libraries like TensorFlow can help you create CNNs for a wide variety of applications, so it is a reasonable choice here.
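The split-then-augment order described above can be sketched with NumPy alone. This is a rough illustration under stated assumptions, not the article's pipeline: the helper names (`split_data`, `rotate_and_scale`, `augment`), the 90/10 split, and the angle and scale ranges are all hypothetical, and images are assumed to be 2-D grayscale arrays. Flipping is left out here because a mirrored character may no longer match its label.

```python
import numpy as np


def split_data(images, labels, train_fraction=0.9, seed=0):
    """Shuffle once, then cut into a training part and a held-out part."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    cut = int(len(images) * train_fraction)
    train_idx, valid_idx = order[:cut], order[cut:]
    return (images[train_idx], labels[train_idx],
            images[valid_idx], labels[valid_idx])


def rotate_and_scale(image, angle_deg, scale):
    """Rotate and scale a 2-D image about its centre (nearest neighbour)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - cy, xs - cx
    # Inverse mapping: for every output pixel, look up its source pixel.
    src_x = np.rint((np.cos(theta) * dx + np.sin(theta) * dy) / scale + cx)
    src_y = np.rint((-np.sin(theta) * dx + np.cos(theta) * dy) / scale + cy)
    src_x, src_y = src_x.astype(int), src_y.astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)  # pixels mapped from outside stay 0
    out[inside] = image[src_y[inside], src_x[inside]]
    return out


def augment(images, rng, max_angle=8.0, scale_range=(0.9, 1.1)):
    """Apply a small random rotation and scaling to every image."""
    return np.stack([
        rotate_and_scale(img,
                         rng.uniform(-max_angle, max_angle),
                         rng.uniform(*scale_range))
        for img in images
    ])


# Toy data: 10 fake "images" of size 6 x 8 with integer labels.
images = np.arange(10 * 6 * 8, dtype="float32").reshape(10, 6, 8)
labels = np.arange(10)
x_tr, y_tr, x_te, y_te = split_data(images, labels)
# Only the training part is augmented; the held-out part stays untouched.
x_tr_aug = augment(x_tr, np.random.default_rng(1))
print(x_tr_aug.shape, x_te.shape)
```

Augmenting only after the split keeps altered copies of a test image from leaking into the training set.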
Python3
# Creating training dataset
dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset_train = (
    dataset_train.map(
        encode_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)

# Creating validation dataset
val_data = tf.data.Dataset.from_tensor_slices((x_valid, y_valid))
val_data = (
    val_data.map(
        encode_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)
Now let’s plot some images from the training data.
Python3
# Visualizing some training data
_, ax = plt.subplots(4, 4, figsize=(10, 5))
for batch in dataset_train.take(1):
    dir_img = batch["image"]
    img_labels = batch["label"]
    for i in range(16):
        img = (dir_img[i] * 255).numpy().astype("uint8")
        label = tf.strings.reduce_join(
            num_to_char(img_labels[i])).numpy().decode("utf-8")
        ax[i // 4, i % 4].imshow(img[:, :, 0].T, cmap="gray")
        ax[i // 4, i % 4].set_title(label)
        ax[i // 4, i % 4].axis("off")
plt.show()
Output: a 4 × 4 grid of CAPTCHA images from the training data, each titled with its label.
Python3
# CTC loss calculation
class LayerCTC(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        # Compute the training-time loss value
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * \
            tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * \
            tf.ones(shape=(batch_len, 1), dtype="int64")

        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # Return computed predictions
        return y_pred


def model_build():
    # Define the inputs to the model
    input_img = layers.Input(
        shape=(img_width, img_height, 1), name="image", dtype="float32"
    )
    img_labels = layers.Input(name="label", shape=(None,), dtype="float32")

    # First convolutional block
    x = layers.Conv2D(
        32,
        (3, 3),
        activation="relu",
        kernel_initializer="he_normal",
        padding="same",
        name="Conv1",
    )(input_img)
    x = layers.MaxPooling2D((2, 2), name="pool1")(x)

    # Second convolutional block
    x = layers.Conv2D(
        64,
        (3, 3),
        activation="relu",
        kernel_initializer="he_normal",
        padding="same",
        name="Conv2",
    )(x)
    x = layers.MaxPooling2D((2, 2), name="pool2")(x)

    # Reshaping the output before passing to RNN
    new_shape = ((img_width // 4), (img_height // 4) * 64)
    x = layers.Reshape(target_shape=new_shape, name="reshape")(x)
    x = layers.Dense(64, activation="relu", name="dense1")(x)
    x = layers.Dropout(0.2)(x)

    # RNNs
    x = layers.Bidirectional(layers.LSTM(
        128, return_sequences=True, dropout=0.25))(x)
    x = layers.Bidirectional(layers.LSTM(
        64, return_sequences=True, dropout=0.25))(x)

    # Output layer
    x = layers.Dense(
        len(char_to_num.get_vocabulary()) + 1,
        activation="softmax",
        name="dense2"
    )(x)

    # Calculate CTC loss at each step
    output = LayerCTC(name="ctc_loss")(img_labels, x)

    # Defining the model
    model = keras.models.Model(
        inputs=[input_img, img_labels],
        outputs=output,
        name="ocr_model_v1"
    )
    opt = keras.optimizers.Adam()

    # Compile the model
    model.compile(optimizer=opt)

    return model
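The output layer above has one unit more than the vocabulary because CTC reserves an extra "blank" class. As a rough illustration of how per-timestep probabilities become a label sequence, here is a minimal NumPy sketch of greedy CTC decoding. It is not code from the model above: the function name `ctc_greedy_decode` and the toy three-class example are hypothetical, and we assume the blank is the last class index, which to our understanding is the convention `keras.backend.ctc_batch_cost` follows.

```python
import numpy as np


def ctc_greedy_decode(probs, blank_index):
    """Greedy CTC decoding for one sample.

    probs has shape (timesteps, num_classes). We take the most likely
    class at each timestep, collapse consecutive repeats, and drop blanks.
    """
    best_path = np.argmax(probs, axis=-1)
    decoded = []
    previous = None
    for idx in best_path:
        if idx != previous and idx != blank_index:
            decoded.append(int(idx))
        previous = idx
    return decoded


# Toy example: classes 0 and 1 are characters, class 2 is the blank.
path = [0, 0, 2, 0, 1, 1, 2]
probs = np.eye(3)[path]  # one-hot "probabilities" that follow the path
print(ctc_greedy_decode(probs, blank_index=2))  # prints [0, 0, 1]
```

The blank between the two runs of class 0 is what lets the decoder keep a genuinely repeated character instead of merging it.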
After creating an instance of the model, let’s print its summary and the number of parameters it uses.
Python3
# Build the model
model = model_build()
model.summary()
Output:
Model: "ocr_model_v1"
__________________________________________________________________________________________________
 Layer (type)                      Output Shape           Param #   Connected to
==================================================================================================
 image (InputLayer)                [(None, 200, 50, 1)]   0         []
 Conv1 (Conv2D)                    (None, 200, 50, 32)    320       ['image[0][0]']
 pool1 (MaxPooling2D)              (None, 100, 25, 32)    0         ['Conv1[0][0]']
 Conv2 (Conv2D)                    (None, 100, 25, 64)    18496     ['pool1[0][0]']
 pool2 (MaxPooling2D)              (None, 50, 12, 64)     0         ['Conv2[0][0]']
 reshape (Reshape)                 (None, 50, 768)        0         ['pool2[0][0]']
 dense1 (Dense)                    (None, 50, 64)         49216     ['reshape[0][0]']
 dropout (Dropout)                 (None, 50, 64)         0         ['dense1[0][0]']
 bidirectional (Bidirectional)     (None, 50, 256)        197632    ['dropout[0][0]']
 bidirectional_1 (Bidirectional)   (None, 50, 128)        164352    ['bidirectional[0][0]']
 label (InputLayer)                [(None, None)]         0         []
 dense2 (Dense)                    (None, 50, 21)         2709      ['bidirectional_1[0][0]']
 ctc_loss (LayerCTC)               (None, 50, 21)         0         ['label[0][0]',
                                                                     'dense2[0][0]']
==================================================================================================
Total params: 432,725
Trainable params: 432,725
Non-trainable params: 0
__________________________________________________________________________________________________
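The parameter counts in the summary can be checked by hand. The sketch below recomputes them from the layer shapes using the standard formulas for convolution, dense, and LSTM layers; the helper names are hypothetical and exist only for this check. The input sizes come from the output shapes in the summary (for example, the reshape layer yields 12 × 64 = 768 features per timestep).

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights for every kernel position plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """One weight per input-unit pair plus one bias per unit."""
    return in_features * units + units


def bilstm_params(in_features, units):
    """An LSTM has 4 gates, each with input weights, recurrent weights,
    and a bias; a bidirectional wrapper holds two such LSTMs."""
    one_direction = 4 * (units * (in_features + units) + units)
    return 2 * one_direction


counts = {
    "Conv1": conv2d_params(3, 3, 1, 32),
    "Conv2": conv2d_params(3, 3, 32, 64),
    "dense1": dense_params(768, 64),
    "bidirectional": bilstm_params(64, 128),
    "bidirectional_1": bilstm_params(256, 64),
    "dense2": dense_params(128, 21),
}
total = sum(counts.values())
print(counts)
print(total)  # prints 432725
```

Most of the parameters sit in the two bidirectional LSTM layers, not in the convolutional blocks.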
Now we are ready to train the model. We will train it for up to 100 epochs and use early stopping so that the model does not overfit the data.
Python3
# Early Stopping Parameters and EPOCH
epochs = 100
early_stopping_patience = 10

early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=early_stopping_patience,
    restore_best_weights=True
)

# Training the model
history = model.fit(
    dataset_train,
    validation_data=val_data,
    epochs=epochs,
    callbacks=[early_stopping],
)
Output:
Epoch 80/100
59/59 [==============================] - 2s 36ms/step - loss: 1.7622 - val_loss: 7.1511
Epoch 81/100
59/59 [==============================] - 2s 35ms/step - loss: 1.7216 - val_loss: 7.0523
Epoch 82/100
59/59 [==============================] - 3s 47ms/step - loss: 1.5814 - val_loss: 7.1403
Epoch 83/100
59/59 [==============================] - 2s 37ms/step - loss: 1.6464 - val_loss: 7.0921
Epoch 84/100
59/59 [==============================] - 2s 35ms/step - loss: 1.6113 - val_loss: 7.1740
Epoch 85/100
59/59 [==============================] - 2s 35ms/step - loss: 1.5529 - val_loss: 7.1272
Epoch 86/100
59/59 [==============================] - 2s 39ms/step - loss: 1.5346 - val_loss: 7.0750
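To see what the patience setting does, here is a simplified pure-Python model of the early stopping rule. It is a sketch of the idea, not the Keras implementation: the function name `early_stopping_epoch` and the loss values are made up for illustration, and it assumes any strict decrease in validation loss counts as an improvement.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; best_epoch is the epoch whose weights
    would be restored when restore_best_weights is enabled.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through every epoch without exhausting the patience.
    return len(val_losses), best_epoch


# Hypothetical validation losses: the best value is reached at epoch 3.
losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.3]
print(early_stopping_epoch(losses, patience=3))  # prints (6, 3)
```

With a patience of 10, as in the training code above, a run may continue for up to ten epochs past its best validation loss before it is cut short.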
How to Break a CAPTCHA System with Machine Learning?
CAPTCHA, short for Completely Automated Public Turing Test to Tell Computers and Humans Apart, is a technology that helps distinguish humans from bots and protects your site from malicious use. But this technology has begun to show its age. CAPTCHA was supposed to be a robust system, but artificial intelligence is rendering it almost useless. To break a CAPTCHA, we need a machine learning model, which we first have to train. After training, all that is required is to feed the model a CAPTCHA of the kind it was trained on, and it will attempt to solve it for you.
In this article, we will explore how one can break a CAPTCHA system with the help of machine learning, and we will discuss the complete process in detail. We will also cover the limitations of this approach and the ethical and moral issues that need to be considered when attempting it. It should be remembered that our intention behind breaking CAPTCHAs should be to educate ourselves and to highlight how poorly the system can filter out non-humans. CAPTCHAs still protect many sites from malicious attacks and help safeguard the internet. So, using bots to break CAPTCHAs on websites without permission is unethical at best and may also be illegal, depending on your location.