BigQuery Machine Learning (BigQuery ML) enables users to create and execute machine learning models in BigQuery using SQL queries. The goal is to democratise machine learning by enabling SQL practitioners to build models using their existing tools and to increase development speed by eliminating the need for data movement.
There is an ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery. In this lab you will use this data to create a model that predicts whether a visitor will make a transaction.
What you'll learn
How to create, evaluate and use machine learning models in BigQuery
For each lab, you get a new Google Cloud project and set of resources for a fixed time at no cost.
Sign in to Qwiklabs using an incognito window.
Note the lab's access time (for example, 1:15:00), and make sure you can finish within that time.
There is no pause feature. You can restart if needed, but you have to start at the beginning.
When ready, click Start lab.
Note your lab credentials (Username and Password). You will use them to sign in to the Google Cloud Console.
Click Open Google Console.
Click Use another account and copy/paste credentials for this lab into the prompts.
If you use other credentials, you'll receive errors or incur charges.
Accept the terms and skip the recovery resource page.
Open BigQuery Console
In the Google Cloud Console, select Navigation menu > BigQuery.
The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and lists UI updates.
Click Done.
Task 1. Explore the data
In this task, you explore and prepare a public dataset for a machine learning model. You execute a SQL query to inspect a sample of Google Analytics data and save the result as your training_data.
The data we will use in this lab sits in the bigquery-public-data project, that is available to all. Let's take a look at a sample of this data.
To verify that the bqml_lab dataset is present, click the arrow to the left of your Project_ID which starts as qwiklabs-gcp-00-XXXXXXXXXX. The bqml_lab dataset should be listed underneath.
Click to create a new SQL Query. A new tab is displayed in BigQuery Studio.
Explain the query
Add the query to Untitled query box.
SELECT
IF(totals.transactions IS NULL, 0, 1) AS label,
IFNULL(device.operatingSystem, "") AS os,
device.isMobile AS is_mobile,
IFNULL(geoNetwork.country, "") AS country,
IFNULL(totals.pageviews, 0) AS pageviews
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20160801' AND '20170631'
LIMIT 10000;
Select the query.
Click Gemini Code Assist button immediately to the left of the query.
Click Explain this query.
You see the dialog for Gemini open to the right of BigQuery Studio.
The Welcome to Gemini in Cloud Console message is displayed in the Gemini pane. Click Start Chatting. You can see an explanation for the query like the one below in the chat window.
Note: If no explanation appears in the chat window, please click Explain this query again after selecting the entire query.
Review the explanation.
Note: As a generative tool, the response from Gemini Code Assist will differ slightly each time, but the key points should be similar.
Gemini returns a response similar to the following:
This GoogleSQL query is designed to extract a sample of Google Analytics session data and transform it into a format suitable for machine learning or analytical purposes.
Click Run.
Click Save and then select Save view.
In the Save view dialog box, click Dataset and select bqml_lab.
For Table, type training_data and then click Save.
Click Check my progress to verify the objective.
Create a view table.
Task 2. Create a model
In this task, you generate a new machine learning model to predict visitor transactions using a SQL query natural language prompt in BigQuery. You specify a logistic regression model type and train it using the existing training_data.
Click to create a new SQL Query. A new tab is displayed in BigQuery Studio.
Click to access the SQL generation tool. You see the Generate SQL with Gemini dialog appear. You can enter a natural language prompt in this window to generate a new SQL statement.
Copy and paste the prompt below
Create a model name sample_model, type logistic_reg to predict whether a visitor will make a transaction using the training_data view table.
Click Generate. Gemini suggests a SQL query similar to the one below.
CREATE MODEL
`{{{project_0.project_id|set at lab start}}}`.`bqml_lab`.`sample_model` OPTIONS ( model_type = 'LOGISTIC_REG',
input_label_cols = ['label']) AS
SELECT
label,
os,
is_mobile,
country,
pageviews
FROM
`{{{project_0.project_id|set at lab start}}}`.`bqml_lab`.`training_data`;
Note: If the generated query results in the error message, consider adjusting the query syntax or replacing the entire query to match the example provided.
Click Insert.
Click Run.
In this case, bqml_lab is the name of the dataset, sample_model is the name of the model, training_data is the transactions data we looked at in the previous task. The model type specified is binary logistic regression.
Running the CREATE MODEL command creates a Query Job that will run asynchronously so you can, for example, close or refresh the BigQuery UI window.
Click Check my progress to verify the objective.
Create a model.
[Optional] Model information & training statistics
If interested, you can get information about the model by clicking on bqml_lab dataset on the left-hand menu and then click the sample_model dataset in the UI. Under Details, you should find some basic model info and training options used to produce the model. Under Training, you should see a table similar to this:
Task 3. Evaluate the Model
In this task, you evaluate the performance of your machine learning model by using the ML.EVALUATE function. This provides key metrics that show how accurately the model predicts visitor transactions.
Click to create a new SQL Query. A new tab is displayed in BigQuery Studio.
Click to access the SQL generation tool. You see the Generate SQL with Gemini dialog appear. You can enter a natural language prompt in this window to generate a new SQL statement.
Copy and paste the prompt below.
Write a query to evaluate the performance of the model `bqml_lab.sample_model` using the `ml.EVALUATE` function.
Click Generate. Gemini suggests a SQL query similar to the one below.
SELECT
*
FROM
ML.EVALUATE( MODEL `{{{project_0.project_id|set at lab start}}}`.`bqml_lab`.`sample_model`,
TABLE `{{{project_0.project_id|set at lab start}}}`.`bqml_lab`.`training_data`);
Note: If the generated query results in the error message, consider adjusting the query syntax or replacing the entire query to match the example provided.
Click Insert.
Click Run.
You should see a table similar to this:
Task 4. Use the model
In this task, you learn to use BigQuery's ML.PREDICT function to make predictions, but first, you must debug a query that uses an incorrect function. You will use Gemini to identify and correct the syntax error before running the query to predict the top 10 purchasing countries.
Click to create a new SQL Query and run the below query:
SELECT
IF(totals.transactions IS NULL, 0, 1) AS label,
IFNULL(device.operatingSystem, "") AS os,
device.isMobile AS is_mobile,
IFNULL(geoNetwork.country, "") AS country,
IFNULL(totals.pageviews, 0) AS pageviews,
fullVisitorId
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170701' AND '20170801';
You'll realize the SELECT and FROM portions of the query is similar to that used to generate training data. There is the additional fullVisitorId column which you will use for predicting transactions by individual user.The WHERE portion reflects the change in time frame (July 1 to August 1 2017).
Let's save this July data so we can use it in the next steps to make predictions using our model.
Click Save and then select Save view.
In the Save view dialog box, click Dataset and select bqml_lab.
For Table, type july_data and then click Save.
Predict purchases per country/region
With this query you will try to predict the number of transactions made by visitors of each country or region, sort the results, and select the top 10 by purchases:
Click the to create a new SQL query.
Copy and paste the query below.
SELECT
country,
TOTAL(predicted_label) as total_predicted_purchases
FROM
ml.PREDICT(MODEL `bqml_lab.sample_model`, (
SELECT * FROM `bqml_lab.july_data`))
GROUP BY country
ORDER BY total_predicted_purchases DESC
LIMIT 10;
Click Run. You confirm the query fails to run and you get the following error:
Function not found: TOTAL at [3:3]
Debug the code with Gemini and the error
Click the Gemini chat window in BigQuery.
In the chat window, copy and paste the following question.
Why am I getting "Function not found: TOTAL at [3:3]" when I run this query?
Press <SHIFT><ENTER>, or <SHIFT><return> on Mac, to create a new line in the chat window.
Select the query and copy it.
Paste it immediately after the question you asked.
Press <SHIFT><ENTER>, or <SHIFT><return> on Mac, to create a new line in the chat window.
Copy and paste the following sentence below:
Please suggest new code to resolve any issues.
Click Send prompt. You get a response from Gemini.
Review the suggestions in the response. Based upon these suggestions, indicates that TOTAL() is not a valid SQL aggregation function in BigQuery.
Within the suggestions, a refined query with a potential solution similar to the one below is provided:
SELECT
country,
SUM(predicted_label) as total_predicted_purchases
FROM
ml.PREDICT(MODEL `bqml_lab.sample_model`, (
SELECT * FROM `bqml_lab.july_data`))
GROUP BY country
ORDER BY total_predicted_purchases DESC
LIMIT 10;
Copy the refined query.
Click to create a new SQL query.
Paste the refined query in the new untitled query tab.
Click Run.
In this query, you're using ml.PREDICT and the BigQuery ML portion of the query is wrapped with standard SQL commands. For this lab you''re interested in the country and the sum of purchases for each country, so that's why SELECT, GROUP BY and ORDER BY. LIMIT is used to ensure you only get the top 10 results.
You should see a table similar to this:
Click Check my progress to verify the objective.
Fix the error and retrieve the top 10 purchases for each country.
Challenge
Predict purchases per user
This time you will try to predict the number of transactions each visitor makes, sort the results, and select the top 10 visitors by transactions.
End your lab
When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.
You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.
The number of stars indicates the following:
1 star = Very dissatisfied
2 stars = Dissatisfied
3 stars = Neutral
4 stars = Satisfied
5 stars = Very satisfied
You can close the dialog box if you don't want to provide feedback.
For feedback, suggestions, or corrections, please use the Support tab.
Copyright 2025 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
Moduły tworzą projekt Google Cloud i zasoby na określony czas.
Moduły mają ograniczenie czasowe i nie mają funkcji wstrzymywania. Jeśli zakończysz moduł, musisz go zacząć od początku.
Aby rozpocząć, w lewym górnym rogu ekranu kliknij Rozpocznij moduł.
Użyj przeglądania prywatnego
Skopiuj podaną nazwę użytkownika i hasło do modułu.
Kliknij Otwórz konsolę w trybie prywatnym.
Zaloguj się w konsoli
Zaloguj się z użyciem danych logowania do modułu. Użycie innych danych logowania może spowodować błędy lub naliczanie opłat.
Zaakceptuj warunki i pomiń stronę zasobów przywracania.
Nie klikaj Zakończ moduł, chyba że właśnie został przez Ciebie zakończony lub chcesz go uruchomić ponownie, ponieważ spowoduje to usunięcie wyników i projektu.
Ta treść jest obecnie niedostępna
Kiedy dostępność się zmieni, wyślemy Ci e-maila z powiadomieniem
Świetnie
Kiedy dostępność się zmieni, skontaktujemy się z Tobą e-mailem
Jeden moduł, a potem drugi
Potwierdź, aby zakończyć wszystkie istniejące moduły i rozpocząć ten
Aby uruchomić moduł, użyj przeglądania prywatnego
Uruchom ten moduł w oknie incognito lub przeglądania prywatnego. Dzięki temu unikniesz konfliktu między swoim kontem osobistym a kontem do nauki, co mogłoby spowodować naliczanie dodatkowych opłat na koncie osobistym.
In this lab, you learn how to create and evaluate a machine learning model with BigQuery ML and use the model to predict purchase behavior.
Czas trwania:
Konfiguracja: 1 min
·
Dostęp na 60 min
·
Ukończono w 60 min