Serverless Data Processing with Dataflow - Writing an ETL Pipeline using Apache Beam and Dataflow (Python) Reviews

11284 reviews

Akash R. · Reviewed over 3 years ago

Abhishek P. · Reviewed over 3 years ago

Runtime error using the provided solution; I wasn't able to advance the lab.
(base) jupyter@theia-20221017-211349:~/project/training-data-analyst/quests/dataflow_python/1_Basic_ETL/lab$ python3 my_pipeline.py --project=${PROJECT_ID} --region=us-central1 --stagingLocation=gs://$PROJECT_ID/staging/ --tempLocation=gs://$PROJECT_ID/temp/ --runner=DataflowRunner
/opt/conda/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py:2285: BeamDeprecationWarning: options is deprecated since First stable release. References to <pipeline>.options will not be supported
  is_streaming_pipeline = p.options.view_as(StandardOptions).streaming
/opt/conda/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to <pipeline>.options will not be supported
  temp_location = p.options.view_as(GoogleCloudOptions).temp_location
INFO:root:Building pipeline ...
INFO:apache_beam.runners.portability.stager:Downloading source distribution of the SDK from PyPi
INFO:apache_beam.runners.portability.stager:Executing command: ['/opt/conda/bin/python3', '-m', 'pip', 'download', '--dest', '/tmp/tmpw1vdo9yu', 'apache-beam==2.42.0', '--no-deps', '--no-binary', ':all:']
ERROR: Ignored the following versions that require a different python version: 2.10.0 Requires-Python >=2.7,<3.0; 2.3.0 Requires-Python >=2.7,<3.0; 2.4.0 Requires-Python >=2.7,<3.0; 2.5.0 Requires-Python >=2.7,<3.0; 2.6.0 Requires-Python >=2.7,<3.0; 2.7.0 Requires-Python >=2.7,<3.0; 2.8.0 Requires-Python >=2.7,<3.0; 2.9.0 Requires-Python >=2.7,<3.0
ERROR: Could not find a version that satisfies the requirement apache-beam==2.42.0 (from versions: 0.6.0, 2.0.0, 2.1.0, 2.1.1, 2.2.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0, 2.21.0, 2.22.0, 2.23.0, 2.24.0, 2.25.0, 2.26.0, 2.27.0, 2.28.0, 2.29.0rc1, 2.29.0, 2.30.0rc1, 2.30.0, 2.31.0rc1, 2.31.0, 2.32.0rc1, 2.32.0, 2.33.0rc1, 2.33.0rc2, 2.33.0, 2.34.0rc1, 2.34.0rc2, 2.34.0, 2.35.0rc2, 2.35.0rc3, 2.35.0rc5, 2.35.0rc6, 2.35.0rc8, 2.35.0, 2.36.0rc1, 2.36.0rc2, 2.36.0rc3, 2.36.0, 2.37.0rc1, 2.37.0rc2, 2.37.0rc3, 2.37.0, 2.38.0rc1, 2.38.0, 2.39.0rc1, 2.39.0rc2, 2.39.0, 2.40.0rc1, 2.40.0rc2, 2.40.0, 2.41.0rc1, 2.41.0rc2, 2.41.0, 2.42.0rc0, 2.42.0rc1, 2.42.0rc2)
ERROR: No matching distribution found for apache-beam==2.42.0
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 89, in check_output
    out = subprocess.check_output(*args, **kwargs)
  File "/opt/conda/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/opt/conda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-m', 'pip', 'download', '--dest', '/tmp/tmpw1vdo9yu', 'apache-beam==2.42.0', '--no-deps', '--no-binary', ':all:']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "my_pipeline.py", line 106, in <module>
    run()
  File "my_pipeline.py", line 103, in run
    p.run()
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/pipeline.py", line 550, in run
    self._options).run(False)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/pipeline.py", line 574, in run
    return self.runner.run_pipeline(self, self._options)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 475, in run_pipeline
    artifacts = environments.python_sdk_dependencies(options)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/transforms/environments.py", line 806, in python_sdk_dependencies
    skip_prestaged_dependencies=skip_prestaged_dependencies)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 310, in create_job_resources
    Stager._create_beam_sdk(sdk_remote_location, temp_dir))
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 822, in _create_beam_sdk
    sdk_local_file = Stager._download_pypi_sdk_package(temp_dir)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 936, in _download_pypi_sdk_package
    processes.check_output(cmd_args)
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 97, in check_output
    .format(traceback.format_exc(), args[0][6], error.output))
RuntimeError: Full traceback: Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 89, in check_output
    out = subprocess.check_output(*args, **kwargs)
  File "/opt/conda/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/opt/conda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-m', 'pip', 'download', '--dest', '/tmp/tmpw1vdo9yu', 'apache-beam==2.42.0', '--no-deps', '--no-binary', ':all:']' returned non-zero exit status 1.
Pip install failed for package: apache-beam==2.42.0
Output from execution of subprocess: b''

Juan M. · Reviewed over 3 years ago

John D. · Reviewed over 3 years ago

Ricardo Q. · Reviewed over 3 years ago

Eric T. · Reviewed over 3 years ago

Marcos T. · Reviewed over 3 years ago

Mark C. · Reviewed over 3 years ago

Andrei B. · Reviewed over 3 years ago

Antonio C. · Reviewed over 3 years ago

Couganadane V. · Reviewed over 3 years ago

Gadamsetty S. · Reviewed over 3 years ago

Nikolaus M. · Reviewed over 3 years ago

Constructive lab

Ali M. · Reviewed over 3 years ago

Tamer K. · Reviewed over 3 years ago

Prajjwal S. · Reviewed over 3 years ago

So far, this is the best-explained lab I have done on Cloud Skills Boost. It has time tags before the tasks and allows at least double the required time if you only follow the lab step by step. It gives you the opportunity to try things yourself, and the solution code is an extra, not a step...

Ricardo Miguel N. · Reviewed over 3 years ago

Prashant K. · Reviewed over 3 years ago

jairo m. · Reviewed over 3 years ago

khadija b. · Reviewed over 3 years ago

Prajjwal S. · Reviewed over 3 years ago

Noé G. · Reviewed over 3 years ago

pratik B. · Reviewed over 3 years ago

Deshen C. · Reviewed over 3 years ago

Kyrylo K. · Reviewed over 3 years ago

We do not ensure the published reviews originate from consumers who have purchased or used the products. Reviews are not verified by Google.