GOOGLE CLOUD PROFESSIONAL MACHINE LEARNING ENGINEER EXAM OBJECTIVES COVERED IN THIS CHAPTER:

2.1 Exploring and preprocessing organization‐wide data (e.g., Cloud Storage, BigQuery, Cloud Spanner, Cloud SQL, Apache Spark, Apache Hadoop). Considerations include:

- Organizing different types of data (e.g., tabular, text, speech, images, videos) for efficient training
- Data preprocessing (e.g., Dataflow, TensorFlow Extended [TFX], BigQuery)

2.2 Model prototyping using Jupyter notebooks. Considerations include:

- Choosing the appropriate Jupyter backend on Google Cloud (e.g., Vertex AI Workbench notebooks, notebooks on Dataproc)
- Using Spark kernels
- Integration with code source repositories
- Developing models in Vertex AI Workbench by using common frameworks (e.g., TensorFlow, PyTorch, sklearn, Spark, JAX)

3.2 Training models. Considerations include:

- Organizing training data (e.g., tabular, text, speech, images, videos) on Google Cloud (e.g., Cloud Storage, BigQuery)
- Ingestion of various file types (e.g., CSV, JSON, images, Hadoop, databases) into training
- Training using different SDKs (e.g., Vertex AI custom training, Kubeflow on Google Kubernetes Engine, AutoML, tabular workflows)
- Using distributed training to organize reliable pipelines
- Hyperparameter tuning
- Troubleshooting ML model training failures

3.3 Choosing appropriate hardware for training. Considerations include:

Distributed training with TPUs and GPUs (e.g., Reduction Server on Vertex AI, Horovod)
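The core idea behind tools such as Horovod and Reduction Server on Vertex AI is all-reduce: each worker computes gradients on its shard of data, and those gradients are averaged across workers before the shared model is updated. The following is a minimal conceptual sketch of that averaging step in plain NumPy; the worker names and gradient values are hypothetical, and real distributed training would use a framework's collective-communication primitives rather than this in-process loop.

```python
import numpy as np

def allreduce_mean(worker_grads):
    """Conceptual all-reduce: average each parameter's gradient across workers.

    worker_grads is a list (one entry per worker) of lists of gradient arrays,
    one array per model parameter.
    """
    return [np.mean(np.stack(grads_for_param), axis=0)
            for grads_for_param in zip(*worker_grads)]

# Two hypothetical workers, each holding gradients for two parameters.
worker_0 = [np.array([1.0, 2.0]), np.array([0.5])]
worker_1 = [np.array([3.0, 4.0]), np.array([1.5])]

averaged = allreduce_mean([worker_0, worker_1])
# Each worker would then apply `averaged` to its copy of the model,
# keeping all replicas in sync.
```

In practice, Reduction Server and Horovod optimize exactly this exchange (bandwidth, overlap with backpropagation) across GPU or TPU workers, which is why they matter for reliable large-scale pipelines.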

Google Cloud Data and Analytics Overview

[Figure: Google Cloud data and analytics pipeline stages, including Collect and Process]