TFX Metadata reviews

2231 reviews

The cluster deployment is stuck at 64% and makes no further progress. Current errors: [GCE_STOCKOUT]

Amine N. · Reviewed about 1 year ago

AI Platform Pipelines is deprecated and is not functioning. Cannot deploy the pipeline package.

Kunal A. · Reviewed about 1 year ago

The cluster deployment is stuck at 64% and makes no further progress.

Amine N. · Reviewed about 1 year ago

RAAGHAVAN K. · Reviewed about 1 year ago

Jesus Eduardo J. · Reviewed about 1 year ago

One more deprecated lab.

Praveen C. · Reviewed about 1 year ago

The tuner failed, so all the subsequent steps failed. I tried to archive the run, but it gave me the same error:
2024-09-03 08:33:37.709428: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lo ...
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    201 try:
--> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    203 except grpc.RpcError as e:
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945 wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)
    947
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848 else:
--> 849 raise _InactiveRpcError(state)
    850
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1725353041.309480180","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353041.309478959","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
During handling of the above exception, another exception occurred:
UnavailableError Traceback (most recent call last)
/tmp/ipykernel_18688/3443186458.py in <module>
----> 1 for artifact_type in store.get_artifact_types():
      2 print(artifact_type.name)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifact_types(self)
    685 response = metadata_store_service_pb2.GetArtifactTypesResponse()
    686
--> 687 self._call('GetArtifactTypes', request, response)
    688 result = []
    689 for x in response.artifact_types:
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response)
    175 while True:
    176 try:
--> 177 return self._call_method(method_name, request, response)
    178 except errors.AbortedError:
    179 num_retries -= 1
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    205 # description.
    206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
    208
    209 def _swig_call(self, method, request, response) -> None:
UnavailableError: failed to connect to all addresses
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    201 try:
--> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    203 except grpc.RpcError as e:
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945 wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)
    947
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848 else:
--> 849 raise _InactiveRpcError(state)
    850
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1725353064.696169603","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353064.696167689","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
During handling of the above exception, another exception occurred:
UnavailableError Traceback (most recent call last)
/tmp/ipykernel_18688/4236999377.py in <module>
----> 1 for execution_type in store.get_execution_types():
      2 print(execution_type.name)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_execution_types(self)
    724 response = metadata_store_service_pb2.GetExecutionTypesResponse()
    725
--> 726 self._call('GetExecutionTypes', request, response)
    727 result = []
    728 for x in response.execution_types:
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response)
    175 while True:
    176 try:
--> 177 return self._call_method(method_name, request, response)
    178 except errors.AbortedError:
    179 num_retries -= 1
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    205 # description.
    206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
    208
    209 def _swig_call(self, method, request, response) -> None:
UnavailableError: failed to connect to all addresses
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    201 try:
--> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    203 except grpc.RpcError as e:
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945 wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)
    947
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848 else:
--> 849 raise _InactiveRpcError(state)
    850
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1725353069.086763692","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353069.086761886","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
During handling of the above exception, another exception occurred:
UnavailableError Traceback (most recent call last)
/tmp/ipykernel_18688/768924533.py in <module>
----> 1 for context_type in store.get_context_types():
      2 print(context_type.name)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_context_types(self)
    763 response = metadata_store_service_pb2.GetContextTypesResponse()
    764
--> 765 self._call('GetContextTypes', request, response)
    766 result = []
    767 for x in response.context_types:
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response)
    175 while True:
    176 try:
--> 177 return self._call_method(method_name, request, response)
    178 except errors.AbortedError:
    179 num_retries -= 1
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    205 # description.
    206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
    208
    209 def _swig_call(self, method, request, response) -> None:
UnavailableError: failed to connect to all addresses
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    201 try:
--> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    203 except grpc.RpcError as e:
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945 wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)
    947
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848 else:
--> 849 raise _InactiveRpcError(state)
    850
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1725353183.044450646","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353183.044449468","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
During handling of the above exception, another exception occurred:
UnavailableError Traceback (most recent call last)
/tmp/ipykernel_18688/1900152474.py in <module>
      1 with metadata.Metadata(connection_config) as store:
----> 2 schema_artifacts = store.get_artifacts_by_type(standard_artifacts.Schema.TYPE_NAME)
      3 stats_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleStatistics.TYPE_NAME)
      4 anomalies_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleAnomalies.TYPE_NAME)
/opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name)
    250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]:
    251 """Fetches artifacts given artifact type name."""
--> 252 return self.store.get_artifacts_by_type(type_name)
    253
    254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states.
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name)
    585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse()
    586
--> 587 self._call('GetArtifactsByType', request, response)
    588 result = []
    589 for x in response.artifacts:
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response)
    175 while True:
    176 try:
--> 177 return self._call_method(method_name, request, response)
    178 except errors.AbortedError:
    179 num_retries -= 1
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    205 # description.
    206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
    208
    209 def _swig_call(self, method, request, response) -> None:
UnavailableError: failed to connect to all addresses
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/4213622925.py in <module>
----> 1 schema_file = os.path.join(schema_artifacts[-1].uri, 'schema.pbtxt')
      2 print("Generated schame file:{}".format(schema_file))
      3
      4 stats_path = stats_artifacts[-1].uri
      5 train_stats_file = os.path.join(stats_path, 'train', 'stats_tfrecord')
NameError: name 'schema_artifacts' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/1981782273.py in <module>
----> 1 schema = tfdv.load_schema_text(schema_file)
      2 tfdv.display_schema(schema=schema)
NameError: name 'schema_file' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/4235532081.py in <module>
----> 1 train_stats = tfdv.load_statistics(train_stats_file)
      2 eval_stats = tfdv.load_statistics(eval_stats_file)
      3 tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats,
      4 lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET')
NameError: name 'train_stats_file' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/804134886.py in <module>
----> 1 train_anomalies = tfdv.load_anomalies_text(train_anomalies_file)
      2 tfdv.display_anomalies(train_anomalies)
NameError: name 'train_anomalies_file' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/2432183026.py in <module>
----> 1 eval_anomalies = tfdv.load_anomalies_text(eval_anomalies_file)
      2 tfdv.display_anomalies(eval_anomalies)
NameError: name 'eval_anomalies_file' is not defined
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    201 try:
--> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    203 except grpc.RpcError as e:
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945 wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)
    947
/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848 else:
--> 849 raise _InactiveRpcError(state)
    850
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1725353200.878418859","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353200.878417252","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
During handling of the above exception, another exception occurred:
UnavailableError Traceback (most recent call last)
/tmp/ipykernel_18688/3595433068.py in <module>
      1 with metadata.Metadata(connection_config) as store:
----> 2 model_eval_artifacts = store.get_artifacts_by_type(standard_artifacts.ModelEvaluation.TYPE_NAME)
      3 hyperparam_artifacts = store.get_artifacts_by_type(standard_artifacts.HyperParameters.TYPE_NAME)
      4
      5 model_eval_path = model_eval_artifacts[-1].uri
/opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name)
    250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]:
    251 """Fetches artifacts given artifact type name."""
--> 252 return self.store.get_artifacts_by_type(type_name)
    253
    254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states.
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name)
    585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse()
    586
--> 587 self._call('GetArtifactsByType', request, response)
    588 result = []
    589 for x in response.artifacts:
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response)
    175 while True:
    176 try:
--> 177 return self._call_method(method_name, request, response)
    178 except errors.AbortedError:
    179 num_retries -= 1
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response)
    205 # description.
    206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
    208
    209 def _swig_call(self, method, request, response) -> None:
UnavailableError: failed to connect to all addresses
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/2981799238.py in <module>
      1 # Latest pipeline run Tuner search space.
----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['space']
NameError: name 'best_hparams_path' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/163579549.py in <module>
      1 # Latest pipeline run Tuner searched best_hyperparameters artifacts.
----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['values']
NameError: name 'best_hparams_path' is not defined
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_18688/3972218752.py in <module>
----> 1 eval_result = tfma.load_eval_result(model_eval_path)
      2 tfma.view.render_slicing_metrics(
      3 eval_result, slicing_column='Wilderness_Area')
NameError: name 'model_eval_path' is not defined

Wiehan W. · Reviewed about 1 year ago
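
The tracebacks in the review above all trace back to one failure, `UnavailableError: failed to connect to all addresses`, and the later `NameError`s follow only because the variables those notebook cells expected were never assigned. As a rough sketch (not part of the lab, and deliberately using only the Python standard library rather than the `ml_metadata` client), a notebook could probe the metadata server's TCP endpoint first and skip the query when nothing is listening. The helper names `endpoint_reachable` and `query_if_reachable` are hypothetical, and a successful TCP connection is only a necessary condition: it does not prove that the gRPC service behind the port is healthy.

```python
import socket


def endpoint_reachable(host: str, port: int, timeout_sec: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_sec):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def query_if_reachable(host, port, query, timeout_sec=3.0):
    """Call query() only when the endpoint accepts connections; else return None.

    `query` is any zero-argument callable supplied by the caller (hypothetical
    here), for example a function that wraps a metadata-store lookup.
    """
    if not endpoint_reachable(host, port, timeout_sec):
        print(f"Metadata endpoint {host}:{port} is unreachable; skipping query.")
        return None
    return query()
```

A downstream cell can then check the result for `None` before indexing into it, so that one connection failure is reported once instead of cascading into a series of unrelated-looking `NameError`s.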

Amine N. · Reviewed about 1 year ago

The lab is broken

Boris Enrique M. · Reviewed more than 1 year ago

Another outdated lab for this course

Liam B. · Reviewed more than 1 year ago

For some reason, the pipeline run in this lab has the tuner turned on, when based on the previous labs it really should not be using the tuner. The run failed inside the tuner stage; here are the logs:
time="2024-08-25T20:59:56.022Z" level=info msg="capturing logs" argo=true
2024-08-25 20:59:56.628983: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib
2024-08-25 20:59:56.629036: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO:absl:tensorflow_ranking is not available: No module named 'tensorflow_ranking'
INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text'
INFO:absl:Running driver for Tuner
INFO:absl:MetadataStore with gRPC connection initialized
INFO:absl:Adding KFP pod name tfx-covertype-lab-04-h5w92-4165073768 to execution
INFO:absl:Running executor for Tuner
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tfx-src/tfx to temp dir /tmp/tmpcocfdac0/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpcocfdac0/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpcocfdac0/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpcocfdac0/build/tfx/dist/tfx_ephemeral-0.25.0.tar.gz to beam args
WARNING:absl:workerCount is overridden with 2
INFO:absl:json_inputs='{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\"train\", \"eval\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}'.
INFO:absl:json_outputs='{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}'.
INFO:absl:json_exec_properties='{"custom_config": "{\"ai_platform_training_args\": {\"masterConfig\": {\"imageUri\": \"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\"}, \"project\": \"qwiklabs-gcp-01-8b52d2fcf958\", \"region\": \"us-central1\", \"serviceAccount\": \"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\"}}", "eval_args": "{\"num_steps\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\"num_steps\": 5000}", "tune_args": "{\n \"num_parallel_trials\": 3\n}", "tuner_fn": null}'.
WARNING:googleapiclient.discovery_cache:file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect
    from google.appengine.api import memcache
ModuleNotFoundError: No module named 'google.appengine'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module>
    from oauth2client.contrib.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module>
    from oauth2client.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.locked_file'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 42, in autodetect
    from . import file_cache
  File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module>
    "file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth"
ImportError: file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
INFO:googleapiclient.discovery:URL being requested: GET https://www.googleapis.com/discovery/v1/apis/ml/v1/rest
INFO:absl:TrainingInput={'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'region': 'us-central1', 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com', 'workerCount': 2, 'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard'}
INFO:absl:Submitting job='tfx_tuner_20240825210006', project='qwiklabs-gcp-01-8b52d2fcf958' to AI Platform.
INFO:googleapiclient.discovery:URL being requested: POST https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs?alt=json
INFO:googleapiclient.discovery:URL being requested: GET https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006?alt=json
ERROR:absl:Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}.
Traceback (most recent call last):
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 209, in launch
    copy.deepcopy(execution_decision.exec_properties))
  File "/tfx-src/tfx/orchestration/launcher/in_process_component_launcher.py", line 72, in _run_executor
    copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties))
  File "/tfx-src/tfx/extensions/google_cloud_ai_platform/tuner/executor.py", line 121, in Do
    job_id)
  File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 305, in start_aip_training
    job_labels=job_labels)
  File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 194, in _launch_aip_training
    raise RuntimeError(err_msg)
RuntimeError: Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. 
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}. time="2024-08-25T21:09:59.279Z" level=info msg="sub-process exited" argo=true error="<nil>" time="2024-08-25T21:09:59.279Z" level=error msg="cannot save artifact /mlpipeline-ui-metadata.json" argo=true error="stat /mlpipeline-ui-metadata.json: no such file or directory" Error: exit status 1

Vu N. · Reviewed more than 1 year ago

Muhammad H. · Reviewed more than 1 year ago

Xiong w. · Reviewed more than 1 year ago

Probably the worst of your labs. And that's saying a lot because few work at all. Aren't you embarrassed to waste our time like this? How do you sleep at night?

Ilsa C. · Reviewed more than 1 year ago

incomplete lab info

Rupesh K. · Reviewed more than 1 year ago

Srinibash S. · Reviewed more than 1 year ago

Americo V. · Reviewed more than 1 year ago

Manuel G. · Reviewed more than 1 year ago

Muhammad H. · Reviewed more than 1 year ago

SHIVANK U. · Reviewed more than 1 year ago

Eddison L. · Reviewed more than 1 year ago

Pablo R. · Reviewed more than 1 year ago

Muhammad H. · Reviewed more than 1 year ago

multiple errors in the lab

Cristina D. · Reviewed more than 1 year ago

Pipeline deployment fails every time, even when all steps are followed properly.

Aiswaria A. · Reviewed more than 1 year ago

We do not guarantee that published reviews come from consumers who have purchased or used the products. Google does not verify reviews.