TFX Metadata reviews

2231 reviews

The cluster deployment is stuck at 64% and no longer progresses. Current errors: [GCE_STOCKOUT]:

Amine N. · Reviewed about a year ago

AI Platform Pipelines is deprecated and is not functioning. Cannot deploy the pipeline package.

Kunal A. · Reviewed about a year ago

The cluster deployment is stuck at 64% and no longer progresses.

Amine N. · Reviewed about a year ago

RAAGHAVAN K. · Reviewed about a year ago

Jesus Eduardo J. · Reviewed about a year ago

One more deprecated lab.

Praveen C. · Reviewed about a year ago

The tuner failed, so all the following steps failed too. I tried to archive the run, but it gave me the same error:
2024-09-03 08:33:37.709428: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lo ... --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353041.309480180","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353041.309478959","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/3443186458.py in <module> ----> 1 for artifact_type in store.get_artifact_types(): 2 print(artifact_type.name) 
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifact_types(self) 685 response = metadata_store_service_pb2.GetArtifactTypesResponse() 686 --> 687 self._call('GetArtifactTypes', request, response) 688 result = [] 689 for x in response.artifact_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = 
"{"created":"@1725353064.696169603","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353064.696167689","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/4236999377.py in <module> ----> 1 for execution_type in store.get_execution_types(): 2 print(execution_type.name) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_execution_types(self) 724 response = metadata_store_service_pb2.GetExecutionTypesResponse() 725 --> 726 self._call('GetExecutionTypes', request, response) 727 result = [] 728 for x in response.execution_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 
206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353069.086763692","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353069.086761886","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/768924533.py in <module> ----> 1 for context_type in store.get_context_types(): 2 print(context_type.name) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_context_types(self) 763 response = 
metadata_store_service_pb2.GetContextTypesResponse() 764 --> 765 self._call('GetContextTypes', request, response) 766 result = [] 767 for x in response.context_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353183.044450646","description":"Failed to pick 
subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353183.044449468","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/1900152474.py in <module> 1 with metadata.Metadata(connection_config) as store: ----> 2 schema_artifacts = store.get_artifacts_by_type(standard_artifacts.Schema.TYPE_NAME) 3 stats_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleStatistics.TYPE_NAME) 4 anomalies_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleAnomalies.TYPE_NAME) /opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name) 250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]: 251 """Fetches artifacts given artifact type name.""" --> 252 return self.store.get_artifacts_by_type(type_name) 253 254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states. /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name) 585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse() 586 --> 587 self._call('GetArtifactsByType', request, response) 588 result = [] 589 for x in response.artifacts: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 
206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/4213622925.py in <module> ----> 1 schema_file = os.path.join(schema_artifacts[-1].uri, 'schema.pbtxt') 2 print("Generated schame file:{}".format(schema_file)) 3 4 stats_path = stats_artifacts[-1].uri 5 train_stats_file = os.path.join(stats_path, 'train', 'stats_tfrecord') NameError: name 'schema_artifacts' is not defined -------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/1981782273.py in <module> ----> 1 schema = tfdv.load_schema_text(schema_file) 2 tfdv.display_schema(schema=schema) NameError: name 'schema_file' is not defined -------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/4235532081.py in <module> ----> 1 train_stats = tfdv.load_statistics(train_stats_file) 2 eval_stats = tfdv.load_statistics(eval_stats_file) 3 tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats, 4 lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET') NameError: name 'train_stats_file' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/804134886.py in <module> ----> 1 train_anomalies = tfdv.load_anomalies_text(train_anomalies_file) 2 tfdv.display_anomalies(train_anomalies) NameError: name 'train_anomalies_file' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) 
/tmp/ipykernel_18688/2432183026.py in <module> ----> 1 eval_anomalies = tfdv.load_anomalies_text(eval_anomalies_file) 2 tfdv.display_anomalies(eval_anomalies) NameError: name 'eval_anomalies_file' is not defined --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353200.878418859","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353200.878417252","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/3595433068.py in <module> 1 with metadata.Metadata(connection_config) as store: ----> 2 model_eval_artifacts = store.get_artifacts_by_type(standard_artifacts.ModelEvaluation.TYPE_NAME) 3 hyperparam_artifacts = store.get_artifacts_by_type(standard_artifacts.HyperParameters.TYPE_NAME) 4 5 model_eval_path = 
model_eval_artifacts[-1].uri /opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name) 250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]: 251 """Fetches artifacts given artifact type name.""" --> 252 return self.store.get_artifacts_by_type(type_name) 253 254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states. /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name) 585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse() 586 --> 587 self._call('GetArtifactsByType', request, response) 588 result = [] 589 for x in response.artifacts: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/2981799238.py in <module> 1 # Latest pipeline run Tuner search space. 
----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['space'] NameError: name 'best_hparams_path' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/163579549.py in <module> 1 # Latest pipeline run Tuner searched best_hyperparameters artifacts. ----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['values'] NameError: name 'best_hparams_path' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/3972218752.py in <module> ----> 1 eval_result = tfma.load_eval_result(model_eval_path) 2 tfma.view.render_slicing_metrics( 3 eval_result, slicing_column='Wilderness_Area') NameError: name 'model_eval_path' is not defined

Wiehan W. · Reviewed about a year ago

Amine N. · Reviewed about a year ago

The lab is broken

Boris Enrique M. · Reviewed more than a year ago

Another outdated lab for this course

Liam B. · Reviewed more than a year ago

For some reason the pipeline run in this lab has the tuner turned on when it really should not be using the tuner based on previous labs. The run failed inside the tuner stage, here's the logs: time="2024-08-25T20:59:56.022Z" level=info msg="capturing logs" argo=true 2024-08-25 20:59:56.628983: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib 2024-08-25 20:59:56.629036: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. INFO:absl:tensorflow_ranking is not available: No module named 'tensorflow_ranking' INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text' INFO:absl:Running driver for Tuner INFO:absl:MetadataStore with gRPC connection initialized INFO:absl:Adding KFP pod name tfx-covertype-lab-04-h5w92-4165073768 to execution INFO:absl:Running executor for Tuner INFO:absl:Attempting to infer TFX Python dependency for beam INFO:absl:Copying all content from install dir /tfx-src/tfx to temp dir /tmp/tmpcocfdac0/build/tfx INFO:absl:Generating a temp setup file at /tmp/tmpcocfdac0/build/tfx/setup.py INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpcocfdac0/build/tfx/setup.log INFO:absl:Added --extra_package=/tmp/tmpcocfdac0/build/tfx/dist/tfx_ephemeral-0.25.0.tar.gz to beam args WARNING:absl:workerCount is overridden with 2 INFO:absl:json_inputs='{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\"train\", \"eval\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": 
{"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}'. INFO:absl:json_outputs='{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}'. 
INFO:absl:json_exec_properties='{"custom_config": "{\"ai_platform_training_args\": {\"masterConfig\": {\"imageUri\": \"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\"}, \"project\": \"qwiklabs-gcp-01-8b52d2fcf958\", \"region\": \"us-central1\", \"serviceAccount\": \"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\"}}", "eval_args": "{\"num_steps\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\"num_steps\": 5000}", "tune_args": "{\n \"num_parallel_trials\": 3\n}", "tuner_fn": null}'. WARNING:googleapiclient.discovery_cache:file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect from google.appengine.api import memcache ModuleNotFoundError: No module named 'google.appengine' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module> from oauth2client.contrib.locked_file import LockedFile ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module> from oauth2client.locked_file import LockedFile ModuleNotFoundError: No module named 'oauth2client.locked_file' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 42, in autodetect from . 
import file_cache File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module> "file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth" ImportError: file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth INFO:googleapiclient.discovery:URL being requested: GET https://www.googleapis.com/discovery/v1/apis/ml/v1/rest INFO:absl:TrainingInput={'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": 
"1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'region': 'us-central1', 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com', 'workerCount': 2, 'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard'} INFO:absl:Submitting job='tfx_tuner_20240825210006', project='qwiklabs-gcp-01-8b52d2fcf958' to AI Platform. 
INFO:googleapiclient.discovery:URL being requested: POST https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs?alt=json INFO:googleapiclient.discovery:URL being requested: GET https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006?alt=json ERROR:absl:Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": 
{"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. 
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}. Traceback (most recent call last): File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module> main() File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main execution_info = launcher.launch() File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 209, in launch copy.deepcopy(execution_decision.exec_properties)) File "/tfx-src/tfx/orchestration/launcher/in_process_component_launcher.py", line 72, in _run_executor copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties)) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/tuner/executor.py", line 121, in Do job_id) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 305, in start_aip_training job_labels=job_labels) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 194, in _launch_aip_training raise RuntimeError(err_msg) RuntimeError: Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. 
Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', 
'--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. 
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}. time="2024-08-25T21:09:59.279Z" level=info msg="sub-process exited" argo=true error="<nil>" time="2024-08-25T21:09:59.279Z" level=error msg="cannot save artifact /mlpipeline-ui-metadata.json" argo=true error="stat /mlpipeline-ui-metadata.json: no such file or directory" Error: exit status 1

Vu N. · Reviewed more than a year ago

Muhammad H. · Reviewed more than a year ago

Xiong w. · Reviewed more than a year ago

Probably the worst of your labs. And that's saying a lot because few work at all. Aren't you embarrassed to waste our time like this? How do you sleep at night?

Ilsa C. · Reviewed more than a year ago

incomplete lab info

Rupesh K. · Reviewed more than a year ago

Srinibash S. · Reviewed more than a year ago

Americo V. · Reviewed more than a year ago

Manuel G. · Reviewed more than a year ago

Muhammad H. · Reviewed more than a year ago

SHIVANK U. · Reviewed more than a year ago

Eddison L. · Reviewed more than a year ago

Pablo R. · Reviewed more than a year ago

Muhammad H. · Reviewed more than a year ago

multiple errors in the lab

Cristina D. · Reviewed more than a year ago

Pipeline deployment fails every time, even when all steps are followed properly.

Aiswaria A. · Reviewed more than a year ago

We cannot certify that the published reviews come from consumers who have purchased or used the products. Reviews are not verified by Google.