1 year ago
#378701
Abubakr Abd Elrazig Hassan
Cannot use tensorboard with Vertex AI Custom job
I'm trying to launch a custom training job on Vertex AI through XManager. When I run custom jobs with TensorBoard enabled, I get a TensorBoard instance in experiments -> tensorboard instances and a button on the custom job page that says OPEN TENSORBOARD. However, the button leads to an empty page that says "Not found: TensorboardExperiment".
- I observed this behaviour when running my own custom job and when running XManager's example cifar10_tensorflow. Note that in both cases the job runs to completion without problems.
- I can visualise the logs locally with the standard tensorboard package by passing the Cloud Storage directory that contains the experiment logs as log_dir.
- I can upload the experiment logs to Vertex AI TensorBoard manually using:
tb-gcp-uploader --tensorboard_resource_name \
TENSORBOARD_INSTANCE_NAME \
--logdir=LOG_DIR \
--experiment_name=TB_EXPERIMENT_NAME --one_shot=True
- For more details check out the discussion: https://github.com/deepmind/xmanager/issues/15
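The manual upload described above could also be scripted while the OPEN TENSORBOARD button remains broken. Below is a minimal sketch, assuming tb-gcp-uploader is installed and on PATH. The helper names build_uploader_command and upload_logs are hypothetical, the flags are exactly the ones from the command above, and the upper-case values are placeholders to replace with real ones.

```python
import shlex
import subprocess


def build_uploader_command(tensorboard_resource_name, logdir, experiment_name, one_shot=True):
    """Build the tb-gcp-uploader argument list (hypothetical helper).

    Only the flags shown in the manual command are used here.
    """
    return [
        "tb-gcp-uploader",
        "--tensorboard_resource_name", tensorboard_resource_name,
        f"--logdir={logdir}",
        f"--experiment_name={experiment_name}",
        f"--one_shot={one_shot}",
    ]


def upload_logs(tensorboard_resource_name, logdir, experiment_name, dry_run=True):
    """Print the command (dry run) or execute it; returns the argument list."""
    cmd = build_uploader_command(tensorboard_resource_name, logdir, experiment_name)
    if dry_run:
        # Show the exact command line without contacting Vertex AI.
        print(shlex.join(cmd))
    else:
        # Assumption: tb-gcp-uploader is on PATH and credentials are configured.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder values copied from the manual command; replace before a real run.
    upload_logs("TENSORBOARD_INSTANCE_NAME", "LOG_DIR", "TB_EXPERIMENT_NAME", dry_run=True)
```

With dry_run=True the sketch only prints the command, so it can be checked against the manual invocation before anything is uploaded. This works around the missing experiment; it does not explain why the custom job itself fails to create one.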
tensorboard
google-cloud-ml
google-cloud-vertex-ai
0 Answers