Production Machine Learning (14 May 2023)

For the past two weeks, I’ve been working through Kaggle and Coursera courses. Here are some notes I took that might be useful:

Adapting to Data: Different kinds of data changes

  • Change in distribution
  • Change in dependencies and change in ingested data
  • Code smell
  • Model not updated to new data (cold start problem)
    • Dynamic training
    • Understand the model's limits
  • Roll back to an old model with model versioning
  • Concept drift
    • Change in P(Y|X) is a shift in the underlying relationship between model input and output
  • Data drift (covariate shift)
    • Change in P(X) is a shift in the distribution of the input data
  • Prediction drift
    • Change in P(Ŷ|X) is a shift in the model's predictions
  • Label drift (output shift)
    • Change in P(Y) is a shift in the distribution of the output labels
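To make the data drift case (a change in P(X)) more concrete, here is a minimal sketch that compares one feature's training values against its serving values using a two-sample Kolmogorov-Smirnov statistic. The function names, the sample values, and the 0.3 threshold are all assumptions made for illustration; they are not from the course, and a real monitor would need a threshold tuned to the data.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    ref = sorted(reference)
    cur = sorted(current)
    gap = 0.0
    # The largest gap between two step-function CDFs occurs at a sample point.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the CDF gap exceeds an (arbitrary, assumed) threshold."""
    return ks_statistic(reference, current) > threshold


# Hypothetical values of a single input feature.
training_feature = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1]
serving_same = [1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.2, 1.1]
serving_shifted = [x + 2.0 for x in training_feature]

print(has_drifted(training_feature, serving_same))     # False
print(has_drifted(training_feature, serving_shifted))  # True
```

This only looks at the inputs, so it cannot detect concept drift: P(Y|X) can change while P(X) stays the same, and catching that requires comparing predictions against fresh labels.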

Tuning Performance to reduce training time

  • Input/output constraint
    • Commonly occurs: large inputs, input requires parsing, small models
    • Take action: store efficiently, parallelize reads, consider batch size
  • CPU constraint
    • Commonly occurs: expensive computation, underpowered hardware
    • Take action: train on a faster accelerator, upgrade the processor, run on TPU, simplify the model
  • Memory constraint
    • Commonly occurs: large number of inputs, complex models
    • Take action: add more memory, use fewer layers, reduce batch size
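Before picking an action, it helps to know whether a training step spends most of its time reading input or computing. The sketch below times the two phases separately; `load_batch`, `train_step`, and the 0.5 threshold are hypothetical stand-ins chosen for illustration, not part of the course material.

```python
import time


def profile_step(load_batch, train_step, n_steps=20):
    """Return (seconds spent loading input, seconds spent computing)."""
    io_time = 0.0
    compute_time = 0.0
    for _ in range(n_steps):
        t0 = time.perf_counter()
        batch = load_batch()
        t1 = time.perf_counter()
        train_step(batch)
        t2 = time.perf_counter()
        io_time += t1 - t0
        compute_time += t2 - t1
    return io_time, compute_time


def likely_bottleneck(io_time, compute_time, threshold=0.5):
    """Label the step as input-bound if loading takes more than `threshold` of the time."""
    total = io_time + compute_time
    if total == 0:
        return "unknown"
    return "input/output" if io_time / total > threshold else "compute"


def slow_load():
    # Hypothetical loader: the sleep simulates slow reads and parsing.
    time.sleep(0.005)
    return list(range(100))


def cheap_train(batch):
    # Hypothetical training step: trivially cheap compute.
    return sum(batch)


io_seconds, compute_seconds = profile_step(slow_load, cheap_train, n_steps=5)
print(likely_bottleneck(io_seconds, compute_seconds))
```

If the loader dominates, the input/output actions (store efficiently, parallelize reads) are the ones worth trying first. Memory pressure does not show up in this timing and has to be checked separately.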


Distribution strategies in tf.distribute:

  • MirroredStrategy
  • MultiWorkerMirroredStrategy
  • TPUStrategy
  • ParameterServerStrategy
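As a rough rule of thumb for choosing among these, the sketch below maps a hardware setup to the name of a tf.distribute strategy class. The helper `suggest_strategy` and its parameters are hypothetical and my own simplification, not a TensorFlow API; it returns class names as strings so it runs without TensorFlow installed.

```python
def suggest_strategy(num_machines, accelerator="gpu", synchronous=True):
    """Return the name of a tf.distribute strategy class for a given setup."""
    if accelerator == "tpu":
        # TPUs have their own strategy.
        return "tf.distribute.TPUStrategy"
    if not synchronous:
        # Asynchronous training with dedicated parameter servers.
        return "tf.distribute.ParameterServerStrategy"
    if num_machines > 1:
        # Synchronous training across several workers.
        return "tf.distribute.MultiWorkerMirroredStrategy"
    # Synchronous training across the GPUs of one machine.
    return "tf.distribute.MirroredStrategy"


print(suggest_strategy(num_machines=1))
print(suggest_strategy(num_machines=4))
print(suggest_strategy(num_machines=1, accelerator="tpu"))
print(suggest_strategy(num_machines=4, synchronous=False))
```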


  1. Create a strategy object

    import tensorflow as tf

    strategy = tf.distribute.MultiWorkerMirroredStrategy()

  2. Wrap creation of model parameters within strategy scope

    with strategy.scope():
        # create_model() is assumed to return an uncompiled tf.keras.Model
        model = create_model()
        model.compile(
            loss='sparse_categorical_crossentropy',
            optimizer=tf.keras.optimizers.Adam(0.0001),
            metrics=['accuracy'])

  3. Scale the batch size by the number of replicas in the cluster

    per_replica_batch_size = 64
    global_batch_size = (per_replica_batch_size
                         * strategy.num_replicas_in_sync)
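To see what the batch-size scaling means in numbers, here is a small worked example in plain Python. The per-replica batch size of 64 comes from the snippet above; the replica counts and the 50,000-example dataset are hypothetical, and the helper names are mine.

```python
import math


def global_batch_size(per_replica_batch_size, num_replicas):
    """Each replica processes its own slice, so the global batch is the product."""
    return per_replica_batch_size * num_replicas


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


per_replica = 64
num_examples = 50_000  # hypothetical dataset size

for replicas in (1, 4, 8):
    batch = global_batch_size(per_replica, replicas)
    print(replicas, batch, steps_per_epoch(num_examples, batch))
# 1 64 782
# 4 256 196
# 8 512 98
```

Each replica still handles 64 examples per step, so adding replicas cuts the number of steps per epoch without changing the per-device workload.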

Readings: Designing High-performance ML Systems

In this module, you focus on either I/O performance or computational speed, depending on the model. For more information, see the following readings and videos.

● How to Evaluate the Performance of Your Machine Learning Model

● Best practices for performance and cost optimization for machine learning

● How To Improve Machine Learning Model Performance: Five Ways

● Distributed TensorFlow model training on Cloud AI Platform (TF Dev Summit ‘20)

● Distributed training with TensorFlow

● Speeding Up Neural Network Training with Data Echoing

● Machine Learning Performance Improvement Cheat Sheet

● Building a High-Performance Data Pipeline with Tensorflow 2.x

● AutoML Tables


● Introduction to Kubeflow

● Orchestrating TFX Pipelines

● Introduction to Machine Learning Pipelines with Kubeflow

● Kubeflow — a machine learning toolkit for Kubernetes

● ML for Mobile and Edge Devices - TensorFlow Lite

● TensorFlow Lite Examples | Machine Learning Mobile Apps

● Optimize TensorFlow models for mobile and embedded devices

● The Essential Guide To Learn TensorFlow Mobile and Tensorflow Lite