TFX Metadata reviews
2231 reviews
The cluster deployment is blocked at 64% and does not progress any further. Current errors: [GCE_STOCKOUT]:
Amine N. · Reviewed about 1 year ago
AI Platform Pipelines is deprecated and is not functioning. Cannot deploy the pipeline package.
Kunal A. · Reviewed about 1 year ago
The cluster deployment is blocked at 64% and does not progress any further.
Amine N. · Reviewed about 1 year ago
RAAGHAVAN K. · Reviewed about 1 year ago
Jesus Eduardo J. · Reviewed about 1 year ago
One more deprecated lab
Praveen C. · Reviewed about 1 year ago
The tuner failed, so all the following steps failed (I tried to archive the run, but it gave me the same error): 2024-09-03 08:33:37.709428: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lo ... --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353041.309480180","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353041.309478959","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/3443186458.py in <module> ----> 1 for artifact_type in store.get_artifact_types(): 2 print(artifact_type.name) 
/opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifact_types(self) 685 response = metadata_store_service_pb2.GetArtifactTypesResponse() 686 --> 687 self._call('GetArtifactTypes', request, response) 688 result = [] 689 for x in response.artifact_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = 
"{"created":"@1725353064.696169603","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353064.696167689","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/4236999377.py in <module> ----> 1 for execution_type in store.get_execution_types(): 2 print(execution_type.name) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_execution_types(self) 724 response = metadata_store_service_pb2.GetExecutionTypesResponse() 725 --> 726 self._call('GetExecutionTypes', request, response) 727 result = [] 728 for x in response.execution_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 
206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353069.086763692","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353069.086761886","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/768924533.py in <module> ----> 1 for context_type in store.get_context_types(): 2 print(context_type.name) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_context_types(self) 763 response = 
metadata_store_service_pb2.GetContextTypesResponse() 764 --> 765 self._call('GetContextTypes', request, response) 766 result = [] 767 for x in response.context_types: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353183.044450646","description":"Failed to pick 
subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353183.044449468","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/1900152474.py in <module> 1 with metadata.Metadata(connection_config) as store: ----> 2 schema_artifacts = store.get_artifacts_by_type(standard_artifacts.Schema.TYPE_NAME) 3 stats_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleStatistics.TYPE_NAME) 4 anomalies_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleAnomalies.TYPE_NAME) /opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name) 250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]: 251 """Fetches artifacts given artifact type name.""" --> 252 return self.store.get_artifacts_by_type(type_name) 253 254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states. /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name) 585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse() 586 --> 587 self._call('GetArtifactsByType', request, response) 588 result = [] 589 for x in response.artifacts: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 
206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/4213622925.py in <module> ----> 1 schema_file = os.path.join(schema_artifacts[-1].uri, 'schema.pbtxt') 2 print("Generated schame file:{}".format(schema_file)) 3 4 stats_path = stats_artifacts[-1].uri 5 train_stats_file = os.path.join(stats_path, 'train', 'stats_tfrecord') NameError: name 'schema_artifacts' is not defined -------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/1981782273.py in <module> ----> 1 schema = tfdv.load_schema_text(schema_file) 2 tfdv.display_schema(schema=schema) NameError: name 'schema_file' is not defined -------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/4235532081.py in <module> ----> 1 train_stats = tfdv.load_statistics(train_stats_file) 2 eval_stats = tfdv.load_statistics(eval_stats_file) 3 tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats, 4 lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET') NameError: name 'train_stats_file' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/804134886.py in <module> ----> 1 train_anomalies = tfdv.load_anomalies_text(train_anomalies_file) 2 tfdv.display_anomalies(train_anomalies) NameError: name 'train_anomalies_file' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) 
/tmp/ipykernel_18688/2432183026.py in <module> ----> 1 eval_anomalies = tfdv.load_anomalies_text(eval_anomalies_file) 2 tfdv.display_anomalies(eval_anomalies) NameError: name 'eval_anomalies_file' is not defined --------------------------------------------------------------------------- _InactiveRpcError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 201 try: --> 202 response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec)) 203 except grpc.RpcError as e: /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression) 945 wait_for_ready, compression) --> 946 return _end_unary_response_blocking(state, call, False, None) 947 /opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline) 848 else: --> 849 raise _InactiveRpcError(state) 850 _InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "failed to connect to all addresses" debug_error_string = "{"created":"@1725353200.878418859","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1725353200.878417252","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}" > During handling of the above exception, another exception occurred: UnavailableError Traceback (most recent call last) /tmp/ipykernel_18688/3595433068.py in <module> 1 with metadata.Metadata(connection_config) as store: ----> 2 model_eval_artifacts = store.get_artifacts_by_type(standard_artifacts.ModelEvaluation.TYPE_NAME) 3 hyperparam_artifacts = store.get_artifacts_by_type(standard_artifacts.HyperParameters.TYPE_NAME) 4 5 model_eval_path = 
model_eval_artifacts[-1].uri /opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py in get_artifacts_by_type(self, type_name) 250 self, type_name: Text) -> List[metadata_store_pb2.Artifact]: 251 """Fetches artifacts given artifact type name.""" --> 252 return self.store.get_artifacts_by_type(type_name) 253 254 # TODO(b/145751019): Remove this once migrated to use MLMD built-in states. /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in get_artifacts_by_type(self, type_name) 585 response = metadata_store_service_pb2.GetArtifactsByTypeResponse() 586 --> 587 self._call('GetArtifactsByType', request, response) 588 result = [] 589 for x in response.artifacts: /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call(self, method_name, request, response) 175 while True: 176 try: --> 177 return self._call_method(method_name, request, response) 178 except errors.AbortedError: 179 num_retries -= 1 /opt/conda/lib/python3.7/site-packages/ml_metadata/metadata_store/metadata_store.py in _call_method(self, method_name, request, response) 205 # description. 206 # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode --> 207 raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error 208 209 def _swig_call(self, method, request, response) -> None: UnavailableError: failed to connect to all addresses --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/2981799238.py in <module> 1 # Latest pipeline run Tuner search space. 
----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['space'] NameError: name 'best_hparams_path' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/163579549.py in <module> 1 # Latest pipeline run Tuner searched best_hyperparameters artifacts. ----> 2 json.loads(file_io.read_file_to_string(best_hparams_path))['values'] NameError: name 'best_hparams_path' is not defined --------------------------------------------------------------------------- NameError Traceback (most recent call last) /tmp/ipykernel_18688/3972218752.py in <module> ----> 1 eval_result = tfma.load_eval_result(model_eval_path) 2 tfma.view.render_slicing_metrics( 3 eval_result, slicing_column='Wilderness_Area') NameError: name 'model_eval_path' is not defined
Wiehan W. · Reviewed about 1 year ago
Amine N. · Reviewed about 1 year ago
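In the traceback above, every metadata store query fails with `StatusCode.UNAVAILABLE` ("failed to connect to all addresses"), and the later cells then raise `NameError` because variables such as `schema_artifacts` were never assigned. A minimal sketch of a pre-flight check that could be run before those cells, assuming the metadata service is exposed over plain TCP; the helper name `endpoint_reachable` and the host and port values are hypothetical and do not come from the lab. A successful TCP connection only shows that something is listening, not that the gRPC service itself is healthy.

```python
import socket


def endpoint_reachable(host: str, port: int, timeout_sec: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unresolved host)
        # when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout_sec):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical address; substitute the metadata gRPC service endpoint.
    host, port = "localhost", 8080
    if endpoint_reachable(host, port):
        print("Endpoint reachable; the metadata store queries can be attempted.")
    else:
        print("Endpoint unreachable; skip the cells that depend on store results.")
```

Checking reachability first makes the root cause (no connection to the metadata service) visible once, instead of surfacing as a chain of unrelated-looking `NameError`s in downstream cells.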
The lab is broken
Boris Enrique M. · Reviewed more than 1 year ago
Another outdated lab for this course
Liam B. · Reviewed more than 1 year ago
For some reason the pipeline run in this lab has the tuner turned on when it really should not be using the tuner based on previous labs. The run failed inside the tuner stage, here's the logs: time="2024-08-25T20:59:56.022Z" level=info msg="capturing logs" argo=true 2024-08-25 20:59:56.628983: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib 2024-08-25 20:59:56.629036: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. INFO:absl:tensorflow_ranking is not available: No module named 'tensorflow_ranking' INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text' INFO:absl:Running driver for Tuner INFO:absl:MetadataStore with gRPC connection initialized INFO:absl:Adding KFP pod name tfx-covertype-lab-04-h5w92-4165073768 to execution INFO:absl:Running executor for Tuner INFO:absl:Attempting to infer TFX Python dependency for beam INFO:absl:Copying all content from install dir /tfx-src/tfx to temp dir /tmp/tmpcocfdac0/build/tfx INFO:absl:Generating a temp setup file at /tmp/tmpcocfdac0/build/tfx/setup.py INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpcocfdac0/build/tfx/setup.log INFO:absl:Added --extra_package=/tmp/tmpcocfdac0/build/tfx/dist/tfx_ephemeral-0.25.0.tar.gz to beam args WARNING:absl:workerCount is overridden with 2 INFO:absl:json_inputs='{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\"train\", \"eval\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": 
{"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}'. INFO:absl:json_outputs='{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}'. 
INFO:absl:json_exec_properties='{"custom_config": "{\"ai_platform_training_args\": {\"masterConfig\": {\"imageUri\": \"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\"}, \"project\": \"qwiklabs-gcp-01-8b52d2fcf958\", \"region\": \"us-central1\", \"serviceAccount\": \"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\"}}", "eval_args": "{\"num_steps\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\"num_steps\": 5000}", "tune_args": "{\n \"num_parallel_trials\": 3\n}", "tuner_fn": null}'. WARNING:googleapiclient.discovery_cache:file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect from google.appengine.api import memcache ModuleNotFoundError: No module named 'google.appengine' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module> from oauth2client.contrib.locked_file import LockedFile ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module> from oauth2client.locked_file import LockedFile ModuleNotFoundError: No module named 'oauth2client.locked_file' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 42, in autodetect from . 
import file_cache File "/usr/local/lib/python3.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module> "file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth" ImportError: file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth INFO:googleapiclient.discovery:URL being requested: GET https://www.googleapis.com/discovery/v1/apis/ml/v1/rest INFO:absl:TrainingInput={'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": 
"1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'region': 'us-central1', 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com', 'workerCount': 2, 'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard'} INFO:absl:Submitting job='tfx_tuner_20240825210006', project='qwiklabs-gcp-01-8b52d2fcf958' to AI Platform. 
INFO:googleapiclient.discovery:URL being requested: POST https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs?alt=json INFO:googleapiclient.discovery:URL being requested: GET https://ml.googleapis.com/v1/projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006?alt=json ERROR:absl:Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": 
{"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', '--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. 
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}. Traceback (most recent call last): File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module> main() File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main execution_info = launcher.launch() File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 209, in launch copy.deepcopy(execution_decision.exec_properties)) File "/tfx-src/tfx/orchestration/launcher/in_process_component_launcher.py", line 72, in _run_executor copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties)) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/tuner/executor.py", line 121, in Do job_id) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 305, in start_aip_training job_labels=job_labels) File "/tfx-src/tfx/extensions/google_cloud_ai_platform/runner.py", line 194, in _launch_aip_training raise RuntimeError(err_msg) RuntimeError: Job 'projects/qwiklabs-gcp-01-8b52d2fcf958/jobs/tfx_tuner_20240825210006' did not succeed. 
Detailed response {'jobId': 'tfx_tuner_20240825210006', 'trainingInput': {'scaleTier': 'CUSTOM', 'masterType': 'standard', 'workerType': 'standard', 'workerCount': '2', 'region': 'us-central1', 'masterConfig': {'imageUri': 'gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f', 'containerCommand': ['python', '-m', 'tfx.scripts.run_executor', '--executor_class_path', 'tfx.extensions.google_cloud_ai_platform.tuner.executor._WorkerExecutor', '--inputs', '{"examples": [{"artifact": {"id": "5", "type_id": "18", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transformed_examples/5", "properties": {"split_names": {"string_value": "[\\"train\\", \\"eval\\"]"}}, "custom_properties": {"name": {"string_value": "transformed_examples"}, "state": {"string_value": "published"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483968", "last_update_time_since_epoch": "1724619580784"}, "artifact_type": {"id": "18", "name": "Examples", "properties": {"split_names": "STRING", "version": "INT", "span": "INT"}}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "Examples"}], "transform_graph": [{"artifact": {"id": "4", "type_id": "22", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Transform/transform_graph/5", "custom_properties": {"state": {"string_value": "published"}, "name": {"string_value": "transform_graph"}, "producer_component": {"string_value": "Transform"}}, "state": "LIVE", "create_time_since_epoch": "1724619483965", "last_update_time_since_epoch": "1724619580782"}, "artifact_type": {"id": "22", "name": "TransformGraph"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "TransformGraph"}]}', 
'--outputs', '{"best_hyperparameters": [{"artifact": {"id": "9", "type_id": "28", "uri": "gs://qwiklabs-gcp-01-8b52d2fcf958-kubeflowpipelines-default//tfx_covertype_lab_04/aec7304e-9b62-4712-ad01-85d8dd6f711d/Tuner/best_hyperparameters/8", "custom_properties": {"name": {"string_value": "best_hyperparameters"}, "producer_component": {"string_value": "Tuner"}}}, "artifact_type": {"id": "28", "name": "HyperParameters"}, "__artifact_class_module__": "tfx.types.standard_artifacts", "__artifact_class_name__": "HyperParameters"}]}', '--exec-properties', '{"custom_config": "{\\"ai_platform_training_args\\": {\\"masterConfig\\": {\\"imageUri\\": \\"gcr.io/qwiklabs-gcp-01-8b52d2fcf958/tfx_covertype_lab_04@sha256:f29bff7ce54b6232257cd50901602a0349b3f07ca7df72bbbe7c94837e45925f\\"}, \\"project\\": \\"qwiklabs-gcp-01-8b52d2fcf958\\", \\"region\\": \\"us-central1\\", \\"serviceAccount\\": \\"tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com\\"}}", "eval_args": "{\\"num_steps\\": 500}", "kfp_pod_name": "tfx-covertype-lab-04-h5w92-4165073768", "module_file": "model.py", "train_args": "{\\"num_steps\\": 5000}", "tune_args": "{\\n \\"num_parallel_trials\\": 3\\n}", "tuner_fn": null}']}, 'serviceAccount': 'tfx-tuner-caip-service-account@qwiklabs-gcp-01-8b52d2fcf958.iam.gserviceaccount.com'}, 'createTime': '2024-08-25T21:00:25Z', 'startTime': '2024-08-25T21:09:45Z', 'endTime': '2024-08-25T21:09:48Z', 'state': 'FAILED', 'errorMessage': 'The replica worker 0 exited with a non-zero status of 1. Termination reason: Error. 
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=478749246700&resource=ml_job%2Fjob_id%2Ftfx_tuner_20240825210006&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22tfx_tuner_20240825210006%22 ', 'trainingOutput': {}, 'labels': {'tfx_executor': 'ensions-google_cloud_ai_platform-tuner-executor-_workerexecutor', 'tfx_py_version': '3-7', 'tfx_runner': 'kfp', 'tfx_version': '0-25-0'}, 'etag': 'iPZ4v+6czZE=', 'jobPosition': '0'}. time="2024-08-25T21:09:59.279Z" level=info msg="sub-process exited" argo=true error="<nil>" time="2024-08-25T21:09:59.279Z" level=error msg="cannot save artifact /mlpipeline-ui-metadata.json" argo=true error="stat /mlpipeline-ui-metadata.json: no such file or directory" Error: exit status 1
Vu N. · Reviewed over 1 year ago
Muhammad H. · Reviewed over 1 year ago
Xiong w. · Reviewed over 1 year ago
Probably the worst of your labs. And that's saying a lot because few work at all. Aren't you embarrassed to waste our time like this? How do you sleep at night?
Ilsa C. · Reviewed over 1 year ago
incomplete lab info
Rupesh K. · Reviewed over 1 year ago
Srinibash S. · Reviewed over 1 year ago
Americo V. · Reviewed over 1 year ago
Manuel G. · Reviewed over 1 year ago
Muhammad H. · Reviewed over 1 year ago
SHIVANK U. · Reviewed over 1 year ago
Eddison L. · Reviewed over 1 year ago
Pablo R. · Reviewed over 1 year ago
Muhammad H. · Reviewed over 1 year ago
multiple errors in the lab
Cristina D. · Reviewed over 1 year ago
Pipeline deployment fails every time, even when all steps are followed properly.
Aiswaria A. · Reviewed over 1 year ago
Google does not guarantee that published reviews were written by consumers who purchased or used the product. Reviews are not verified by Google.