Deployment
Step 1: Check the saved model
Python3
!saved_model_cli show --dir {path} --all
Output:
2023-09-15 14:34:42.403572: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['__saved_model_init_op']:
The given SavedModel SignatureDef contains the following input(s):
The given SavedModel SignatureDef contains the following output(s):
outputs['__saved_model_init_op'] tensor_info:
dtype: DT_INVALID
shape: unknown_rank
name: NoOp
Method name is:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['conv2d_input'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 32, 32, 3)
name: serving_default_conv2d_input:0
The given SavedModel SignatureDef contains the following output(s):
outputs['dense_1'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 10)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
The MetaGraph with tag set ['serve'] contains the following ops: {'AssignVariableOp', 'StringJoin', 'BiasAdd', 'StatefulPartitionedCall', 'Pack', 'MaxPool', 'SaveV2', 'VarHandleOp', 'Identity', 'Softmax', 'NoOp', 'StaticRegexFullMatch', 'Relu', 'Const', 'DisableCopyOnRead', 'MatMul', 'MergeV2Checkpoints', 'Reshape', 'Conv2D', 'Select', 'Placeholder', 'RestoreV2', 'ShardedFilename', 'ReadVariableOp'}
Concrete Functions:
Function Name: '__call__'
Option #1
Callable with:
Argument #1
inputs: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='inputs')
Argument #2
DType: bool
Value: True
Argument #3
DType: NoneType
Value: None
Option #2
Callable with:
Argument #1
conv2d_input: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='conv2d_input')
Argument #2
DType: bool
Value: True
Argument #3
DType: NoneType
Value: None
Option #3
Callable with:
Argument #1
conv2d_input: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='conv2d_input')
Argument #2
DType: bool
Value: False
Argument #3
DType: NoneType
Value: None
Option #4
Callable with:
Argument #1
inputs: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='inputs')
Argument #2
DType: bool
Value: False
Argument #3
DType: NoneType
Value: None
Function Name: '_default_save_signature'
Option #1
Callable with:
Argument #1
conv2d_input: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='conv2d_input')
Function Name: 'call_and_return_all_conditional_losses'
Option #1
Callable with:
Argument #1
inputs: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='inputs')
Argument #2
DType: bool
Value: True
Argument #3
DType: NoneType
Value: None
Option #2
Callable with:
Argument #1
inputs: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='inputs')
Argument #2
DType: bool
Value: False
Argument #3
DType: NoneType
Value: None
Option #3
Callable with:
Argument #1
conv2d_input: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='conv2d_input')
Argument #2
DType: bool
Value: False
Argument #3
DType: NoneType
Value: None
Option #4
Callable with:
Argument #1
conv2d_input: TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='conv2d_input')
Argument #2
DType: bool
Value: True
Argument #3
DType: NoneType
Value: None
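The same signature information can also be read programmatically. Below is a minimal sketch, assuming `path` is the SavedModel directory inspected above:
Python3
import tensorflow as tf

# Load the SavedModel and look up its serving signature
loaded = tf.saved_model.load(path)
infer = loaded.signatures["serving_default"]

# Print the input and output specs, mirroring the CLI output above
print(infer.structured_input_signature)
print(infer.structured_outputs)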
Step 2: Define the model directory
Python3
import os

# Define the model directory
os.environ["MODEL_DIR"] = my_dir
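Note that TensorFlow Serving expects each model version in a numbered subdirectory under the base path (for example, my_dir/1/saved_model.pb). A minimal sketch of exporting the trained model in that layout, assuming model is the Keras model from the training step:
Python3
import os
import tensorflow as tf

# Export the model under a numeric version subdirectory,
# which is the directory layout TensorFlow Model Server expects
version = 1
export_path = os.path.join(my_dir, str(version))
tf.saved_model.save(model, export_path)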
Step 3: Start TensorFlow Model Server
Python3
import subprocess

# Command to start TensorFlow Model Server
command = f"nohup tensorflow_model_server --rest_api_port=8501 --model_name=CIFARModel --model_base_path='{my_dir}' > server.log 2>&1"

# Execute the command using subprocess
subprocess.Popen(command, shell=True)
Output:
<Popen: returncode: None args: "nohup tensorflow_model_server --rest_api_por...>
Check the server log
Python3
!tail server.log
Output:
2023-09-15 14:34:44.080115: E external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:828]
tfg_optimizer{} failed: NOT_FOUND: Op type not registered 'DisableCopyOnRead' in binary running on GFG19509-LAPTOP.
Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph
which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph,
as contrib ops are lazily registered when the module is first accessed.
when importing GraphDef to MLIR module in GrapplerHook
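Despite the grappler warning in the log, the model is still served. Before sending prediction requests, you can confirm it loaded by querying the server's model status endpoint. A minimal sketch, assuming the server from Step 3 is running locally:
Python3
import requests

# Query TensorFlow Serving's model status endpoint; a state of
# "AVAILABLE" means the model is ready to accept prediction requests
status = requests.get('http://localhost:8501/v1/models/CIFARModel')
print(status.json())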
Step 4: Request predictions for the test images from TensorFlow Serving
Plot the first 10 test images with their labels
Python3
import numpy as np
import matplotlib.pyplot as plt

def plot_images(images, titles, rows=2, cols=5):
    fig, axes = plt.subplots(rows, cols, figsize=(13, 5))
    for i, ax in enumerate(axes.ravel()):
        ax.imshow(images[i].reshape(32, 32, 3))
        ax.axis('off')
        ax.set_title(titles[i])

# Select the first 10 images from data_test and their corresponding labels
sample_indices = np.linspace(0, 9, 10, dtype=int)
sample_images = [data_test[i] for i in sample_indices]
sample_labels = [classes[label_test[i].item()] for i in sample_indices]

# Plot the selected images
plot_images(sample_images, sample_labels)
plt.show()
Output:
(A 2×5 grid of the first 10 test images with their class labels.)
Create JSON Object
Create a JSON object with the json library, as shown in the code below. The code follows the official TensorFlow Serving example; modify it for your needs as necessary.
Python3
import json

# Define the signature name
signature_name = "serving_default"

# Consider the first 10 data_test images
instances = data_test[0:10].tolist()

# Create a dictionary
data_dict = {
    "signature_name": signature_name,
    "instances": instances
}

# Convert the dictionary to a JSON string
data = json.dumps(data_dict)

# Print the JSON data
print(f'Data: {data[:50]} ... {data[-52:]}')
Output:
Data: {"signature_name": "serving_default", "instances": ... 164, 163, 204], [182, 182, 225], [186, 185, 223]]]]}
Run Experiments
To run experiments, we send the JSON object defined above to the model server using the requests library and inspect the predictions.
Install Requests
We will install the requests library. Run the following bash command:
!pip install -q requests
Run the following Python 3 code:
Python3
import requests
import json

# Define the API endpoint
api_url = 'http://localhost:8501/v1/models/CIFARModel:predict'

# Set the request headers
headers = {"content-type": "application/json"}

# Send a POST request to the API with the JSON data
response = requests.post(api_url, data=data, headers=headers)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response and extract predictions
    response_data = json.loads(response.text)
    predictions = response_data['predictions']
else:
    print(f"Failed to make a request. Status code: {response.status_code}")
Now run the following code to map each prediction to its class name:
Python3
for prediction in predictions:
    # The predicted class is the index of the highest probability
    target = max(prediction)
    object_ = prediction.index(target)
    print(classes[object_])
Output:
cat
ship
ship
aeroplane
frog
frog
automobile
frog
cat
truck
The predictions served by the model are roughly 70% correct, which matches the model's test accuracy.
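To check this against the ground truth programmatically, here is a minimal sketch comparing predicted and actual classes, assuming label_test and classes from the earlier data-loading step:
Python3
import numpy as np

# Compare the predicted class of each test image with its true label
pred_classes = np.argmax(predictions, axis=1)
for i, pred in enumerate(pred_classes):
    print(f"predicted: {classes[pred]:<12} actual: {classes[label_test[i].item()]}")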
Serving a TensorFlow Model
TensorFlow Serving stands as a versatile and high-performance system tailored for serving machine learning models in production settings. Its primary objective is to simplify the deployment of novel algorithms and experiments while maintaining consistent server architecture and APIs. While it seamlessly integrates with TensorFlow models, TensorFlow Serving’s adaptability also enables the service to be expanded for serving diverse model types and data beyond TensorFlow.