Official Google Cloud Certified Professional Machine Learning Engineer Study Guide

Table of Contents

Cover

Table of Contents

Title Page

Copyright

Dedication

Acknowledgments

About the Author

About the Technical Editors

About the Technical Proofreader

Google Technical Reviewer

Introduction

Google Cloud Professional Machine Learning Engineer Certification

Who Should Buy This Book

How This Book Is Organized

Bonus Digital Contents

Conventions Used in This Book

Google Cloud Professional ML Engineer Objective Map

How to Contact the Publisher

Assessment Test

Answers to Assessment Test

Chapter 1: Framing ML Problems

Translating Business Use Cases

Machine Learning Approaches

ML Success Metrics

Responsible AI Practices

Summary

Exam Essentials

Review Questions

Chapter 2: Exploring Data and Building Data Pipelines

Visualization

Statistics Fundamentals

Data Quality and Reliability

Establishing Data Constraints

Running TFDV on Google Cloud Platform

Organizing and Optimizing Training Datasets

Handling Missing Data

Data Leakage

Summary

Exam Essentials

Review Questions

Chapter 3: Feature Engineering

Consistent Data Preprocessing

Encoding Structured Data Types

Class Imbalance

Feature Crosses

TensorFlow Transform

GCP Data and ETL Tools

Summary

Exam Essentials

Review Questions

Chapter 4: Choosing the Right ML Infrastructure

Pretrained vs. AutoML vs. Custom Models

Pretrained Models

AutoML

Custom Training

Provisioning for Predictions

Summary

Exam Essentials

Review Questions

Chapter 5: Architecting ML Solutions

Designing Reliable, Scalable, and Highly Available ML Solutions

Choosing an Appropriate ML Service

Data Collection and Data Management

Automation and Orchestration

Serving

Summary

Exam Essentials

Review Questions

Chapter 6: Building Secure ML Pipelines

Building Secure ML Systems

Identity and Access Management

Privacy Implications of Data Usage and Collection

Summary

Exam Essentials

Review Questions

Chapter 7: Model Building

Choice of Framework and Model Parallelism

Modeling Techniques

Transfer Learning

Semi‐supervised Learning

Data Augmentation

Model Generalization and Strategies to Handle Overfitting and Underfitting

Summary

Exam Essentials

Review Questions

System Design with Kubeflow/TFX

Hybrid or Multicloud Strategies

Summary

Exam Essentials

Review Questions

Chapter 12: Model Monitoring, Tracking, and Auditing Metadata

Model Monitoring

Model Monitoring on Vertex AI

Logging Strategy

Model and Dataset Lineage

Vertex AI Experiments

Vertex AI Debugging

Summary

Exam Essentials

Review Questions

Chapter 13: Maintaining ML Solutions

MLOps Maturity

Retraining and Versioning Models

Feature Store

Vertex AI Permissions Model

Common Training and Serving Errors

Summary

Exam Essentials

Review Questions

Chapter 14: BigQuery ML

BigQuery – Data Access

BigQuery ML Algorithms

Explainability in BigQuery ML

BigQuery ML vs. Vertex AI Tables

Interoperability with Vertex AI

BigQuery Design Patterns

Summary

Exam Essentials

Review Questions

Appendix: Answers to Review Questions

Chapter 1: Framing ML Problems

Chapter 2: Exploring Data and Building Data Pipelines

Chapter 3: Feature Engineering

Chapter 4: Choosing the Right ML Infrastructure

Chapter 5: Architecting ML Solutions

Chapter 6: Building Secure ML Pipelines

Chapter 7: Model Building

Chapter 8: Model Training and Hyperparameter Tuning

Chapter 9: Model Explainability on Vertex AI

Chapter 10: Scaling Models in Production

Chapter 11: Designing ML Training Pipelines

Chapter 12: Model Monitoring, Tracking, and Auditing Metadata

Chapter 13: Maintaining ML Solutions

Chapter 14: BigQuery ML

Index

End User License Agreement

List of Tables

Chapter 1

TABLE 1.1 ML problem types

TABLE 1.2 Structured data

TABLE 1.3 Time‐Series Data

TABLE 1.4 Confusion matrix for a binary classification example

TABLE 1.5 Summary of metrics

Chapter 2

TABLE 2.1 Mean, median, and mode for outlier detection

Chapter 3

TABLE 3.1 One‐hot encoding example

TABLE 3.2 Run a TFX pipeline on GCP

Chapter 4

TABLE 4.1 Vertex AI AutoML Tables algorithms

TABLE 4.2 AutoML algorithms

TABLE 4.3 Problems solved using AutoML

TABLE 4.4 Summary of the recommendation types available in Retail AI

Chapter 5

TABLE 5.1 ML workflow to GCP services mapping

TABLE 5.2 When to use BigQuery ML vs. AutoML vs. a custom model

TABLE 5.3 Google Cloud tools to read BigQuery data

TABLE 5.4 NoSQL data store options

Chapter 6

TABLE 6.1 Difference between server‐side and client‐side encryption

TABLE 6.2 Strategies for handling sensitive data

TABLE 6.3 Techniques to handle sensitive fields in data

Chapter 7

TABLE 7.1 Distributed training strategies using TensorFlow

List of Illustrations

Chapter 1

FIGURE 1.1 Business case to ML problem

FIGURE 1.2 AUC

FIGURE 1.3 AUC PR

Chapter 2

FIGURE 2.1 Box plot showing quartiles

FIGURE 2.2 Line plot

FIGURE 2.3 Bar plot

FIGURE 2.4 Data skew

FIGURE 2.5 TensorFlow Data Validation

FIGURE 2.6 Dataset representation

FIGURE 2.7 Credit card data representation

FIGURE 2.8 Downsampling credit card data

Chapter 3

FIGURE 3.1 Difficult to separate by line or a linear method

FIGURE 3.2 Difficult to separate classes by line

FIGURE 3.3 Summary of feature columns (Google Cloud via Coursera, www.coursera...)

FIGURE 3.4 TensorFlow Transform

Chapter 4

FIGURE 4.1 Pretrained, AutoML, and custom models

FIGURE 4.2 Analyzing a photo using Vision AI

FIGURE 4.3 Vertex AI AutoML, providing a “budget”

FIGURE 4.4 Choosing the size of model in Vertex AI

FIGURE 4.5 TPU system architecture

FIGURE 8.4 Creating a managed notebook

FIGURE 8.5 Opening the managed notebook

FIGURE 8.6 Exploring frameworks available in a managed notebook

FIGURE 8.7 Data integration with Google Cloud Storage within a managed noteb...

FIGURE 8.8 Data Integration with BigQuery within a managed notebook

FIGURE 8.9 Scaling up the hardware from a managed notebook

FIGURE 8.10 Git integration within a managed notebook

FIGURE 8.11 Scheduling or executing code in the notebook

FIGURE 8.12 Submitting the notebook for execution

FIGURE 8.13 Scheduling the notebook for execution

FIGURE 8.14 Choosing TensorFlow framework to create a user‐managed notebook...

FIGURE 8.15 Create a user‐managed TensorFlow notebook

FIGURE 8.16 Exploring the network

FIGURE 8.17 Training in the Vertex AI console

FIGURE 8.18 Vertex AI training architecture for a prebuilt container

FIGURE 8.19 Vertex AI training console for pre‐built containers Source: Googl...

FIGURE 8.20 Vertex AI training architecture for custom containers

FIGURE 8.21 ML model parameter and hyperparameter

FIGURE 8.22 Configure hyperparameter tuning by training the pipeline UI Sourc...

FIGURE 8.23 Enabling an interactive shell in the Vertex AI console Source: Go...

FIGURE 8.24 Web terminal to access an interactive shell Source: Google LLC.

Chapter 9

FIGURE 9.1 SHAP model explainability

FIGURE 9.2 Feature attribution using integrated gradients for cat image

Chapter 10

FIGURE 10.1 TF model serving options

FIGURE 10.2 Static reference architecture

FIGURE 10.3 Dynamic reference architecture

FIGURE 10.4 Caching architecture

FIGURE 10.5 Deploying to an endpoint

FIGURE 10.6 Sample prediction request

FIGURE 10.7 Batch prediction job in Console

Chapter 11

FIGURE 11.1 Relation between model data and ML code for MLOps

FIGURE 11.2 End‐to‐end ML development workflow

FIGURE 11.3 Kubeflow architecture

FIGURE 11.4 Kubeflow components and pods

FIGURE 11.5 Vertex AI Pipelines

FIGURE 11.6 Vertex AI Pipelines condition for deployment

FIGURE 11.7 Lineage tracking with Vertex AI Pipelines

FIGURE 11.8 Lineage tracking in Vertex AI Metadata store

FIGURE 11.9 Continuous training and CI/CD

FIGURE 11.10 CI/CD with Kubeflow Pipelines

FIGURE 11.11 Kubeflow Pipelines on GCP

FIGURE 11.12 TFX pipelines, libraries, and components

Mona Mona

Pratap Ramamurthy

Copyright © 2024 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada and the United Kingdom.

ISBNs: 9781119944461 (paperback), 9781119981848 (ePDF), 9781119981565 (ePub)

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at www.wiley.com/go/permission.

Trademarks: WILEY and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Google Cloud is a trademark of Google, Inc. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and authors have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our website at www.wiley.com.

Library of Congress Control Number: 2023931675

Cover image: © Getty Images Inc./Jeremy Woodhouse

Cover design: Wiley

Acknowledgments

Although this book bears my name as author, many other people contributed to its creation. Without their help, this book wouldn't exist, or at best would exist in a lesser form. Pratap Ramamurthy, my co‐author, contributed a third of the content of this book. Kim Wimpsett, the development editor, Christine O'Connor, the managing editor, and Saravanan Dakshinamurthy, the production specialist, oversaw the book as it progressed through all its stages. Arielle Guy was the book's proofreader and Judy Flynn was the copyeditor. Last but not least, thanks to Hitesh Hinduja for being an amazing reviewer throughout the book writing process.

I'd also like to thank Jim Minatel and Melissa Burlock at Wiley, and Dan Sullivan, who helped connect me with Wiley to write this book.

—Mona Mona

This book is the product of hard work by many people, and it was wonderful to see everyone come together as a team, starting with Jim Minatel and Melissa Burlock from Wiley and including Kim Wimpsett, Christine O'Connor, Saravanan Dakshinamurthy, Judy Flynn, Arielle Guy, and the reviewers.

Most importantly, I would like to thank Mona for spearheading this huge effort. Her knowledge from her previous writing experience and her leadership from start to finish were crucial to bringing this book to completion.

—Pratap Ramamurthy

About the Author

Mona Mona is an AI/ML specialist at Google Public Sector. She is a speaker and the author of the book Natural Language Processing with AWS AI Services. She was a senior AI/ML specialist solution architect at AWS before joining Google. She has 14 certifications and has created courses for AWS AI/ML Certification Specialty Exam readiness. She has authored 17 blogs on AI/ML and co‐authored the research paper "AWS CORD‐19 Search: A Neural Search Engine for COVID‐19 Literature," which won an award at the Association for the Advancement of Artificial Intelligence (AAAI) conference. She can be reached at monasheetal3@gmail.com.

Pratap Ramamurthy loves to solve problems using machine learning. Currently he is an AI/ML specialist at Google Public Sector. Previously he worked at AWS as a partner solution architect where he helped build the partner ecosystem for Amazon SageMaker. Later he was a principal solution architect at H2O.ai, a company that works on machine learning algorithms for structured data and natural language. Prior to that he was a developer and a researcher. To his credit he has several research papers in networking, server profiling technology, genetic algorithms, and optoelectronics. He holds three patents related to cloud technologies. In his spare time, he likes to teach AI using modern board games. He can be reached at pratap.ram@gmail.com.

About the Technical Editors

Hitesh Hinduja is an ardent artificial intelligence (AI) and data platforms enthusiast currently working as a senior manager in Azure Data and AI at Microsoft. He worked as a senior manager in AI at Ola Electric, where he led a team of 30+ people in the areas of machine learning, statistics, computer vision, deep learning, natural language processing, and reinforcement learning. He has filed 14+ patents in India and the United States and has numerous research publications under his name. Hitesh has been associated in research roles at India's top B‐schools: Indian School of Business, Hyderabad, and the Indian Institute of Management, Ahmedabad. He is also actively involved in training and mentoring and has been invited as a guest speaker by various corporations and associations across the globe. He is an avid learner and enjoys reading books in his free time.

Kanchana Patlolla is an AI innovation program leader at Google Cloud. Previously she worked as an AI/ML specialist in Google Cloud Platform. She has architected solutions with major public cloud providers for financial services companies on their journey to the cloud, particularly in Big Data and machine learning. In her spare time, she loves to try different cuisines and relax with her kids.

About the Technical Proofreader

Adam Vincent is an experienced educator with a passion for spreading knowledge and helping people expand their skill sets. He is multi‐certified in Google Cloud, is a Google Cloud Authorized Trainer, and has created multiple courses about machine learning. Adam also loves playing with data and automating everything. When he is not behind a screen, he enjoys playing tabletop games with friends and family, reading sci‐fi and fantasy novels, and hiking.

Google Technical Reviewer

Wiley and the authors wish to thank the Google Technical Reviewer Emma Freeman for her thorough review of the proofs for this book.

Why Become Professional ML Engineer (PMLE) Certified?

There are several good reasons to get your PMLE certification.

Provides proof of professional achievement

Certifications are quickly becoming status symbols in the computer service industry. Organizations, including members of the computer service industry, are recognizing the benefits of certification.

Increases your marketability

According to Forbes (www.forbes.com/sites/louiscolumbus/2020/02/10/15-toppaying-it-certifications-in-2020/?sh=12f63aa8358e), jobs that require GCP certifications are the highest‐paying jobs for the second year in a row, paying an average salary of $175,761/year. As a result, many engineers want to get certified. Of the many certifications that GCP offers, the AI/ML certified engineer is a new certification and is still evolving.

Provides an opportunity for advancement

IDC's research (www.idc.com/getdoc.jsp?containerId=IDC_P40729) indicates that while AI/ML adoption is on the rise, cost, lack of expertise, and lack of life cycle management tools are the top three inhibitors to realizing AI and ML at scale.

This book is the first in the market to talk about Google Cloud AI/ML tools and the technology covering the latest Professional ML Engineer certification guidelines released on February 22, 2022.

Recognizes Google as a leader in open source and AI

Google is the main contributor to many of the path‐breaking open source projects that dramatically changed the landscape of AI/ML, including TensorFlow, Kubeflow, Word2vec, BERT, and T5. Although these projects are in the open source domain, Google has the distinct ability to bring them to market through the Google Cloud Platform (GCP). In this regard, the other cloud providers are frequently seen as trailing Google's offering.

Raises customer confidence

As the IT community, users, small business owners, and the like become more familiar with PMLE certified professionals, more of them will realize that a PMLE certified professional is more qualified than a noncertified individual to architect secure, cost‐effective, and scalable ML solutions in the Google Cloud environment.

How to Become Certified

You do not have to work for a particular company. It's not a secret society. There is no prerequisite to take this exam. However, there is a recommendation to have 3+ years of industry experience, including one or more years designing and managing solutions using Google Cloud.

This exam is 2 hours long and has 50–60 multiple‐choice questions. You can register for this exam in two ways:

Take the online‐proctored exam from home or anywhere else. You can review the online testing requirements at www.webassessor.com/wa.do?page=certInfo&branding=GOOGLECLOUD&tabs=13.

Take the on‐site, proctored exam at a testing center.

We usually prefer the on‐site option because we like the focus time in a proctored environment. We have taken all our certifications in a test center. You can locate a test center near you at www.kryterion.com/Locate-Test-Center.

Who Should Buy This Book

This book is intended to help students, developers, data scientists, IT professionals, and ML engineers gain expertise in the ML technology on the Google Cloud Platform and take the Professional Machine Learning Engineer exam. This book intends to take readers through the machine learning process starting from data and moving on through feature engineering, model training, and deployment on the Google Cloud. It also walks readers through best practices for when
