{"id":53933,"date":"2025-09-25T10:45:30","date_gmt":"2025-09-25T00:45:30","guid":{"rendered":"https:\/\/www.cloudproinc.com.au\/?p=53933"},"modified":"2025-09-25T10:45:33","modified_gmt":"2025-09-25T00:45:33","slug":"deploy-a-model-with-tensorflow-serving","status":"publish","type":"post","link":"https:\/\/cloudproinc.azurewebsites.net\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/","title":{"rendered":"Deploy a Model with TensorFlow Serving"},"content":{"rendered":"\n<p>In this blog post we will walk through how to package a TensorFlow model, serve it locally with Docker, and scale it on Kubernetes with TensorFlow Serving. It is aimed at technical teams who want a reliable, fast, and maintainable way to serve models in production.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>At a high level, TensorFlow Serving is a purpose-built, high-performance inference server. It loads models in TensorFlow\u2019s SavedModel format, exposes standard REST and gRPC endpoints, and supports model versioning and batching out of the box. Compared to DIY Flask or FastAPI wrappers, it\u2019s faster to stand up, easier to operate, and designed for zero-downtime upgrades.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-is-tensorflow-serving\">What is TensorFlow Serving<\/h2>\n\n\n\n<p>TensorFlow Serving (TF Serving) is a C++ server that:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reads TensorFlow SavedModel directories (versioned as 1, 2, 3\u2026)<\/li>\n\n\n\n<li>Serves predictions over HTTP\/REST (default port 8501) and gRPC (default port 8500)<\/li>\n\n\n\n<li>Hot-reloads new model versions and supports canarying\/rollback<\/li>\n\n\n\n<li>Optionally batches requests for higher throughput<\/li>\n<\/ul>\n\n\n\n<p>Because it\u2019s optimized in C++ and tightly integrated with TensorFlow runtimes (CPU and GPU), you get strong performance without writing server code. 
Your team focuses on model training and packaging; TF Serving handles the serving.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-prerequisites\">Prerequisites<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/category\/docker\/\">Docker <\/a>installed locally<\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/category\/python\/\">Python <\/a>3.9+ and TensorFlow for exporting a model<\/li>\n\n\n\n<li>curl for quick REST testing<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-1-export-a-savedmodel\">Step 1: Export a SavedModel<\/h2>\n\n\n\n<p>We\u2019ll create a simple Keras model and export it in the SavedModel format, versioned under <code>models\/my_model\/1<\/code>. TF Serving looks for numeric subfolders representing versions.<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-126bd373b1dfa52a2a65a016dcd5515f\"><code>import tensorflow as tf\nimport numpy as np\n\n# Build a tiny model\nmodel = tf.keras.Sequential(&#91;\n    tf.keras.layers.Input(shape=(4,), name=\"features\"),\n    tf.keras.layers.Dense(16, activation=\"relu\"),\n    tf.keras.layers.Dense(1, activation=\"sigmoid\")\n])\nmodel.compile(optimizer=\"adam\", loss=\"binary_crossentropy\")\n\n# Train on dummy data (replace with your real data)\nx = np.random.rand(200, 4).astype(\"float32\")\ny = (x.mean(axis=1) &gt; 0.5).astype(\"float32\")\nmodel.fit(x, y, epochs=3, verbose=0)\n\n# Export as SavedModel (version 1)\n# Note: on Keras 3 (TF 2.16+), prefer model.export(export_path), which\n# produces a SavedModel with a serving signature\nexport_path = \"models\/my_model\/1\"\ntf.saved_model.save(model, export_path)\nprint(\"SavedModel exported to\", export_path)\n<\/code><\/pre>\n\n\n\n<p>This export includes a default signature (<code>serving_default<\/code>) that TF Serving will use for inference.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-2-serve-locally-with-docker\">Step 2: Serve locally with Docker<\/h2>\n\n\n\n<p>Run the official TF 
Serving container, mounting your model directory and exposing REST and gRPC ports:<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-ff91b40ca90d0bf29f17660bb5a8e462\"><code>docker run --rm -p 8501:8501 -p 8500:8500 \\\n  -v \"$PWD\/models\/my_model:\/models\/my_model\" \\\n  -e MODEL_NAME=my_model \\\n  --name tfserving \\\n  tensorflow\/serving:latest\n<\/code><\/pre>\n\n\n\n<p>What this does:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Binds REST on <code>localhost:8501<\/code> and gRPC on <code>localhost:8500<\/code><\/li>\n\n\n\n<li>Loads the highest numeric version under <code>\/models\/my_model<\/code><\/li>\n\n\n\n<li>Exposes the model under the name <code>my_model<\/code><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-3-send-a-prediction\">Step 3: Send a prediction<\/h2>\n\n\n\n<p>Use REST for a quick test:<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-cab3162e032bc1d1e230925fd4122e36\"><code># Model status\ncurl http:\/\/localhost:8501\/v1\/models\/my_model\n\n# Predict (two rows, 4 features each)\ncurl -X POST http:\/\/localhost:8501\/v1\/models\/my_model:predict \\\n  -H \"Content-Type: application\/json\" \\\n  -d '{\"instances\": &#91;&#91;0.1,0.2,0.3,0.4],&#91;0.9,0.8,0.1,0.0]]}'\n<\/code><\/pre>\n\n\n\n<p>You\u2019ll get back a JSON with <code>predictions<\/code>. 
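The same REST call can be made from application code. A minimal sketch using only the Python standard library (the `predict` helper and its arguments are illustrative, not part of TF Serving):

```python
import json
from urllib import request

def build_predict_url(host, model, version=None):
    # TF Serving REST pattern: /v1/models/<name>[/versions/<n>]:predict
    url = f"http://{host}/v1/models/{model}"
    if version is not None:
        url += f"/versions/{version}"
    return url + ":predict"

def predict(host, model, instances, version=None):
    # POST a body of the form {"instances": [...]} and return the
    # "predictions" list from the JSON response.
    body = json.dumps({"instances": instances}).encode("utf-8")
    req = request.Request(
        build_predict_url(host, model, version),
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"]

# Example (requires the container from Step 2 to be running):
# predict("localhost:8501", "my_model", [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.1, 0.0]])
```

Pinning `version` in the URL is optional; without it, TF Serving routes the request to the latest loaded version.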
In production, you can switch to gRPC for lower latency and better throughput, but REST is perfect for quick testing and many web services.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-4-upgrade-and-roll-back-with-versions\">Step 4: Upgrade and roll back with versions<\/h2>\n\n\n\n<p>To deploy a new model version without downtime:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Export your updated model to <code>models\/my_model\/2<\/code><\/li>\n\n\n\n<li>Place it alongside version 1 on the same path<\/li>\n\n\n\n<li>TF Serving will detect the new version and start serving it once loaded<\/li>\n<\/ul>\n\n\n\n<p>Roll back by removing or disabling version 2; the server will return to serving the latest available version. You can tune how quickly it polls the filesystem with <code>--file_system_poll_wait_seconds<\/code> if needed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-5-serve-multiple-models\">Step 5: Serve multiple models<\/h2>\n\n\n\n<p>For multi-model setups, point TF Serving at a model config file:<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-e907c81bf0b092937a6f6862c360b631\"><code># models.config (textproto)\nmodel_config_list: {\n  config: {\n    name: \"fraud_model\"\n    base_path: \"\/models\/fraud_model\"\n    model_platform: \"tensorflow\"\n  }\n  config: {\n    name: \"churn_model\"\n    base_path: \"\/models\/churn_model\"\n    model_platform: \"tensorflow\"\n  }\n}\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-c54fafda815531f1f5b85707e2a7945d\"><code>docker run --rm -p 8501:8501 -p 8500:8500 \\\n  -v \"$PWD\/models:\/models\" \\\n  -v \"$PWD\/models.config:\/models\/models.config\" \\\n  tensorflow\/serving:latest \\\n  --model_config_file=\/models\/models.config \\\n  --strict_model_config=false\n<\/code><\/pre>\n\n\n\n<h2 
class=\"wp-block-heading\" id=\"h-step-6-move-to-kubernetes\">Step 6: Move to Kubernetes<\/h2>\n\n\n\n<p>On Kubernetes, mount your model directory from a PersistentVolume and expose a Service. A minimal example:<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-a055400ee6dd90370b4a9570cead7f31\"><code>---\napiVersion: apps\/v1\nkind: Deployment\nmetadata:\n  name: tfserving\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: tfserving\n  template:\n    metadata:\n      labels:\n        app: tfserving\n    spec:\n      containers:\n      - name: tfserving\n        image: tensorflow\/serving:latest\n        args:\n          - \"--model_name=my_model\"\n          - \"--model_base_path=\/models\/my_model\"\n          - \"--port=8500\"\n          - \"--rest_api_port=8501\"\n        ports:\n          - containerPort: 8501\n          - containerPort: 8500\n        volumeMounts:\n          - name: model-volume\n            mountPath: \/models\/my_model\n      volumes:\n        - name: model-volume\n          persistentVolumeClaim:\n            claimName: tf-model-pvc\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: tfserving\nspec:\n  selector:\n    app: tfserving\n  ports:\n    - name: http\n      port: 8501\n      targetPort: 8501\n<\/code><\/pre>\n\n\n\n<p>Add an Ingress or API gateway with TLS, and consider autoscaling:<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-86e72bc145ec42c323f94e72182177b0\"><code>apiVersion: autoscaling\/v2\nkind: HorizontalPodAutoscaler\nmetadata:\n  name: tfserving\nspec:\n  scaleTargetRef:\n    apiVersion: apps\/v1\n    kind: Deployment\n    name: tfserving\n  minReplicas: 1\n  maxReplicas: 5\n  metrics:\n  - type: Resource\n    resource:\n      name: cpu\n      target:\n        type: Utilization\n        averageUtilization: 
70\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-performance-and-reliability-tips\">Performance and reliability tips<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Batching: Enable batching to increase throughput under load.<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-f6bac79a4b241b0a9b3311360f9c7d7d\"><code># batching_config.txt (textproto; each field is a wrapper message)\nmax_batch_size { value: 32 }\nbatch_timeout_micros { value: 2000 }\nnum_batch_threads { value: 8 }\nmax_enqueued_batches { value: 100 }\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-f151740ba862159f37b0f23d9bfdf307\"><code>docker run --rm -p 8501:8501 -p 8500:8500 \\\n  -v \"$PWD\/models\/my_model:\/models\/my_model\" \\\n  -v \"$PWD\/batching_config.txt:\/models\/batching_config.txt\" \\\n  -e MODEL_NAME=my_model \\\n  tensorflow\/serving:latest \\\n  --enable_batching=true \\\n  --batching_parameters_file=\/models\/batching_config.txt\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>CPU vs GPU: For heavy models or large batches, use <code>tensorflow\/serving:latest-gpu<\/code> with NVIDIA Container Toolkit.<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-5f04f8b4d4b700398246abfec4701009\"><code>docker run --gpus all -p 8501:8501 -p 8500:8500 \\\n  -v \"$PWD\/models\/my_model:\/models\/my_model\" \\\n  -e MODEL_NAME=my_model \\\n  tensorflow\/serving:latest-gpu\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Model size and cold starts: Keep models lean, and pre-warm by sending a small request after rollout.<\/li>\n\n\n\n<li>Versioning strategy: Always deploy to a new numeric folder (e.g., <code>\/2<\/code>), test, then cut traffic. 
Keep N-1 for quick rollback.<\/li>\n\n\n\n<li>Input validation: Enforce shapes and dtypes at your API edge to avoid malformed requests reaching TF Serving.<\/li>\n\n\n\n<li>Observability: Log request IDs at the caller, track latency and error rates, and capture model version in every metric\/event.<\/li>\n\n\n\n<li>Security: Put TF Serving behind an Ingress or API gateway with TLS and authentication. Restrict direct access to ports 8500\/8501.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-common-pitfalls\">Common pitfalls<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Signature mismatches: Ensure your client payload matches the SavedModel signature (<code>serving_default<\/code>). If in doubt, inspect with <code>saved_model_cli show --dir &lt;path> --all<\/code>.<\/li>\n\n\n\n<li>Wrong JSON shape: REST <code>instances<\/code> must match the model\u2019s expected shape. For a single vector input, wrap it as a list of lists.<\/li>\n\n\n\n<li>Mount paths: The container must see versioned subfolders under the base path (<code>\/models\/my_model\/1<\/code>, <code>\/2<\/code>, \u2026).<\/li>\n\n\n\n<li>Resource limits: Without CPU\/memory limits in Kubernetes, noisy neighbors can cause latency spikes. Set requests\/limits and autoscaling.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-this-approach-works\">Why this approach works<\/h2>\n\n\n\n<p>TF Serving abstracts the serving layer with an optimized, battle-tested server. Docker makes it reproducible on a laptop, CI, or any cloud VM. Kubernetes adds elasticity, resilience, and a paved path to GitOps and blue\/green rollouts. Together, they remove bespoke server code and let your team focus on model quality and business impact.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-wrap-up\">Wrap-up<\/h2>\n\n\n\n<p>You now have a clean path from a trained TensorFlow model to a production-ready, scalable serving stack. 
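The input-validation and JSON-shape pitfalls above are worth enforcing at the API edge. A minimal pre-flight check as a sketch (the helper name and the feature count of 4 are illustrative, matching the toy model from Step 1):

```python
def validate_instances(instances, n_features=4):
    # Reject payloads whose shape does not match the model's (batch, n_features) input.
    if not isinstance(instances, list) or not instances:
        raise ValueError("instances must be a non-empty list of rows")
    for row in instances:
        # A single vector must be wrapped as a list of lists, so each row is a list.
        if not isinstance(row, list) or len(row) != n_features:
            raise ValueError(f"each row must be a list of {n_features} values")
        if not all(isinstance(v, (int, float)) for v in row):
            raise ValueError("features must be numeric")
    return True

# validate_instances([[0.1, 0.2, 0.3, 0.4]])  -> passes
# validate_instances([0.1, 0.2, 0.3, 0.4])    -> raises (flat vector, not wrapped)
```

Rejecting malformed payloads before they are forwarded keeps bad requests from ever reaching TF Serving.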
Start with Docker for fast iteration, then move to Kubernetes when you need high availability and autoscaling. If you want help adapting this for your environment (object storage model syncing, canarying, observability, security), CloudProinc.com.au can assist with reference architectures and hands-on implementation.<\/p>\n\n\n\n<ul class=\"wp-block-yoast-seo-related-links yoast-seo-related-links\">\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/build-a-keras-model-for-real-projects\/\">Build a Keras Model for Real Projects<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/14\/publish-a-port-from-a-container-to-your-computer\/\">Publish a Port from a Container to Your Computer<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/keras-functional-api\/\">Keras Functional API<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/21\/run-pytorch-in-net-with-torchsharp\/\">Run PyTorch in .NET with TorchSharp<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Learn to package, serve, and scale TensorFlow models using Docker and Kubernetes with TensorFlow Serving. 
Practical steps, code, and production tips for teams.<\/p>\n","protected":false},"author":1,"featured_media":53941,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_yoast_wpseo_focuskw":"Deploy a Model with TensorFlow Serving","_yoast_wpseo_title":"","_yoast_wpseo_metadesc":"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.","_yoast_wpseo_opengraph-title":"","_yoast_wpseo_opengraph-description":"","_yoast_wpseo_twitter-title":"","_yoast_wpseo_twitter-description":"","_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[24,13,92],"tags":[],"class_list":["post-53933","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-blog","category-tensorflow"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Deploy a Model with TensorFlow Serving - CPI Consulting<\/title>\n<meta name=\"description\" content=\"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Deploy a Model with TensorFlow Serving\" \/>\n<meta property=\"og:description\" content=\"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/\" \/>\n<meta property=\"og:site_name\" content=\"CPI Consulting\" \/>\n<meta property=\"article:published_time\" content=\"2025-09-25T00:45:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-09-25T00:45:33+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cloudproinc.azurewebsites.net\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1536\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"CPI Staff\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"CPI Staff\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/\"},\"author\":{\"name\":\"CPI Staff\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/person\\\/192eeeb0ce91062126ce3822ae88fe6e\"},\"headline\":\"Deploy a Model with TensorFlow 
Serving\",\"datePublished\":\"2025-09-25T00:45:30+00:00\",\"dateModified\":\"2025-09-25T00:45:33+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/\"},\"wordCount\":804,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#primaryimage\"},\"thumbnailUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png\",\"articleSection\":[\"AI\",\"Blog\",\"TensorFlow\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/\",\"url\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/\",\"name\":\"Deploy a Model with TensorFlow Serving - CPI Consulting\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#primaryimage\"},\"thumbnailUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png\",\"datePublished\":\"2025-09-25T00:45:30+00:00\",\"dateModified\":\"2025-09-25T00:45:33+00:00\",\"description\":\"Learn how to deploy a model with TensorFlow Serving on Docker and 
Kubernetes for fast and reliable production serving.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#primaryimage\",\"url\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png\",\"contentUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png\",\"width\":1536,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/deploy-a-model-with-tensorflow-serving\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Deploy a Model with TensorFlow Serving\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#website\",\"url\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/\",\"name\":\"Cloud Pro Inc - CPI Consulting Pty Ltd\",\"description\":\"Cloud, AI &amp; Cybersecurity Consulting | 
Melbourne\",\"publisher\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\",\"name\":\"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd\",\"url\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/favfinalfile.png\",\"contentUrl\":\"\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/favfinalfile.png\",\"width\":500,\"height\":500,\"caption\":\"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd\"},\"image\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/person\\\/192eeeb0ce91062126ce3822ae88fe6e\",\"name\":\"CPI Staff\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"caption\":\"CPI 
Staff\"},\"sameAs\":[\"http:\\\/\\\/www.cloudproinc.com.au\"],\"url\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/index.php\\\/author\\\/cpiadmin\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Deploy a Model with TensorFlow Serving - CPI Consulting","description":"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/","og_locale":"en_US","og_type":"article","og_title":"Deploy a Model with TensorFlow Serving","og_description":"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.","og_url":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/","og_site_name":"CPI Consulting","article_published_time":"2025-09-25T00:45:30+00:00","article_modified_time":"2025-09-25T00:45:33+00:00","og_image":[{"width":1536,"height":1024,"url":"https:\/\/cloudproinc.azurewebsites.net\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png","type":"image\/png"}],"author":"CPI Staff","twitter_card":"summary_large_image","twitter_misc":{"Written by":"CPI Staff","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#article","isPartOf":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/"},"author":{"name":"CPI Staff","@id":"https:\/\/cloudproinc.azurewebsites.net\/#\/schema\/person\/192eeeb0ce91062126ce3822ae88fe6e"},"headline":"Deploy a Model with TensorFlow Serving","datePublished":"2025-09-25T00:45:30+00:00","dateModified":"2025-09-25T00:45:33+00:00","mainEntityOfPage":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/"},"wordCount":804,"commentCount":0,"publisher":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#organization"},"image":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png","articleSection":["AI","Blog","TensorFlow"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/","url":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/","name":"Deploy a Model with TensorFlow Serving - CPI 
Consulting","isPartOf":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#website"},"primaryImageOfPage":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#primaryimage"},"image":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png","datePublished":"2025-09-25T00:45:30+00:00","dateModified":"2025-09-25T00:45:33+00:00","description":"Learn how to deploy a model with TensorFlow Serving on Docker and Kubernetes for fast and reliable production serving.","breadcrumb":{"@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#primaryimage","url":"\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png","contentUrl":"\/wp-content\/uploads\/2025\/09\/deploy-a-model-with-tensorflow-serving-on-docker-and-kubernetes.png","width":1536,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/cloudproinc.com.au\/index.php\/2025\/09\/25\/deploy-a-model-with-tensorflow-serving\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/cloudproinc.azurewebsites.net\/"},{"@type":"ListItem","position":2,"name":"Deploy a Model with TensorFlow Serving"}]},{"@type":"WebSite","@id":"https:\/\/cloudproinc.azurewebsites.net\/#website","url":"https:\/\/cloudproinc.azurewebsites.net\/","name":"Cloud Pro Inc - CPI Consulting Pty Ltd","description":"Cloud, AI &amp; Cybersecurity Consulting | 