Відгуки: Serverless Data Processing with Dataflow - Writing an ETL Pipeline using Apache Beam and Dataflow (Python)
11439 відгуків
Luis Antonio C. · Відгук надано 18 хвилин тому
Could not complete Part1 Task 6 run the pipeline (using the provided Solution code) due to the following error: raise BeamIOError("Match operation failed", exceptions) apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://qwiklabs-gcp-00-f5855126f119/events.json': RefreshError(TransportError("Failed to retrieve https://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable. Last exception: HTTPSConnectionPool(host='metadata.google.internal', port=443): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1017)')))"))}
Lingmin M. · Відгук надано близько 7 годин тому
Qwiklabs Dataflow Lab Fix Summary Problems SSL/Metadata auth failure — google-auth 2.44.0+ broke the default HTTP transport, causing CERTIFICATE_VERIFY_FAILED errors when the notebook tried to reach GCP's metadata server Zone resource exhaustion — n1-standard-1 (Dataflow's default) was completely unavailable across all us-central1 zones for Qwiklabs accounts Pipeline exits early — p.run() doesn't wait for completion, causing silent failures argparse rejects extra flags — parse_args() blocks passing extra Beam arguments like --worker_machine_type Fixes 1. Fix SSL auth error bashexport GCE_METADATA_MTLS_MODE=none 2. Use e2-standard-2 machine type Add to your run command: bash--worker_machine_type=e2-standard-2 3. Fix pipeline code In my_pipeline.py: python# Change this: opts = parser.parse_args() options = PipelineOptions() p.run() # To this: opts, pipeline_args = parser.parse_known_args() options = PipelineOptions(pipeline_args) p.run().wait_until_finish() 4. Full working command bashcd $BASE_DIR export PROJECT_ID=$(gcloud config get-value project) export GCE_METADATA_MTLS_MODE=none python3 my_pipeline.py \ --project=${PROJECT_ID} \ --region=us-central1 \ --stagingLocation=gs://$PROJECT_ID/staging/ \ --tempLocation=gs://$PROJECT_ID/temp/ \ --runner=DataflowRunner \ --worker_machine_type=e2-standard-2 5. For the Dataflow Template UI Under Optional Parameters → uncheck "Use default machine type" → Series: E2 → Machine type: e2-standard-2
David O. · Відгук надано близько 8 годин тому
console did not open
Nihal Hussain M. · Відгук надано близько 9 годин тому
Mara Malina F. · Відгук надано близько 10 годин тому
VERY BAD>> I am unable to run my jobs are with DataflowRunner as I am always getting resources contraints errors.. job is not able to spn up us-cerntral1 region.. I am getting same error in all labs which requires to submit jobs on dataflow. I am able to run with DirectRunner. Please help in this as I have spent too many hours but end up getting same error again and again
Mallikarjunarao G. · Відгук надано близько 23 годин тому
couldnt finisih it lots of errors in the step by step or resources available
Sebastián P. · Відгук надано 1 день тому
couldn't finish because dataflow job could get the resources to actually run. Stupid and a waste of my time. still can't run due to limited resources
Tyler W. · Відгук надано 1 день тому
couldn't finish because dataflow job could get the resources to actually run. Stupid and a waste of my time.
Tyler W. · Відгук надано 1 день тому
1. There are 2 issues running the lab: - lab 1: apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://qwiklabs-gcp-04-9d1f7241cd59/events.json': RefreshError(TransportError("Failed to retrieve https://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable. Last exception: HTTPSConnectionPool(host='metadata.google.internal', port=443): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate - lab 2 Startup of the worker pool in us-central1 failed to bring up any of the desired 1 workers P.S. Not working http.client transport only supports the http scheme, httpswas specified P.S.S. export GCE_METADATA_MTLS_MODE=none resolves an issue with certificate. But still insufficient resources to run the job.
Igor P. · Відгук надано 1 день тому
Luis Antonio C. · Відгук надано 2 дні тому
Lingmin M. · Відгук надано 2 дні тому
Lingmin M. · Відгук надано 2 дні тому
Poor instructions - terrible!
Leighton C. · Відгук надано 3 дні тому
My conclusion from this lab is that I should not use or depend on this type of platform for production loads. It if does not even work in a controlled lab, then production is a no-go! ERROR:apache_beam.runners.dataflow.dataflow_runner:2026-04-01T09:25:07.346Z: JOB_MESSAGE_ERROR: Startup of the worker pool in us-central1 failed to bring up any of the desired 1 workers. This is likely a quota issue or a Compute Engine stockout. The service will retry. For troubleshooting steps, see https://cloud.google.com/dataflow/docs/guides/common-errors#worker-pool-failure for help troubleshooting. ZONE_RESOURCE_POOL_EXHAUSTED: Instance 'my-pipeline-1775035379230-04010223-mm24-harness-86c2' creation failed: The zone 'projects/qwiklabs-gcp-02-e142b19585b8/zones/us-central1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
Mikael W. · Відгук надано 4 дні тому
Luis Antonio C. · Відгук надано 4 дні тому
In the Python script, allow users to pass alternative machine types, so there is no possibility of being unable to complete the lab due to insufficient resources being available (for N1?)
Felix V. · Відгук надано 4 дні тому
1. There are 2 issues running the lab: - lab 1: apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://qwiklabs-gcp-04-9d1f7241cd59/events.json': RefreshError(TransportError("Failed to retrieve https://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable. Last exception: HTTPSConnectionPool(host='metadata.google.internal', port=443): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate - lab 2 Startup of the worker pool in us-central1 failed to bring up any of the desired 1 workers P.S. Not working http.client transport only supports the http scheme, httpswas specified P.S.S. export GCE_METADATA_MTLS_MODE=none resolves an issue with certificate. But still insufficient resources to run the job.
Igor P. · Відгук надано 4 дні тому
1. There are 2 issues running the lab: - lab 1: apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://qwiklabs-gcp-04-9d1f7241cd59/events.json': RefreshError(TransportError("Failed to retrieve https://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable. Last exception: HTTPSConnectionPool(host='metadata.google.internal', port=443): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate - lab 2 Startup of the worker pool in us-central1 failed to bring up any of the desired 1 workers P.S. Not working http.client transport only supports the http scheme, httpswas specified
Igor P. · Відгук надано 4 дні тому
Luis Antonio C. · Відгук надано 4 дні тому
Charles F. · Відгук надано 5 днів тому
Too much confuse the Python file and Instructions were not clear to proceed and modify the files. The code is running with errors. Need support.
Venkateswarlu Kuriseti N. · Відгук надано 5 днів тому
konda l. · Відгук надано 5 днів тому
Jorge Alberto M. · Відгук надано 7 днів тому
There is currently an issue with google.auth library (https://github.com/googleapis/google-cloud-python/issues/16090) that prevents me from finishing the lab - took a long time to find the root couse and I run out of time. Additionally, the worker zones and machine types have to be specified because without them the dataflow cannot spawn workers and jobs keep failing.
Przemyslaw S. · Відгук надано 8 днів тому
Ми не гарантуємо, що опубліковані відгуки написали клієнти, які придбали продукти чи скористалися ними. Відгуки не перевіряються Google.