I encountered an issue where HuggingFace's Trainer() would not start when using Google Cloud's Vertex AI Workbench.
A similar bug was reported on the following page:
HuggingFace Trainer() does nothing - only on Vertex AI workbench, works on colab
I am having issues getting the Trainer() function in huggingface to actually do anything on Vertex AI workbench notebooks.
I'm totally stumped and have no idea how to even begin to try debug this.
...
Initially, I had selected the "PyTorch" environment, and this is where the bug occurred.

As described in the post above, switching to the "Python" environment resolved the issue.

Note that when using this environment, you first need to run the following:
conda install pytorch cudatoolkit=11.0 -c pytorch
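After the install finishes, it may be worth confirming from a notebook cell that the kernel can import PyTorch and see the GPU before calling Trainer() again. The following is a minimal sketch, not part of the original fix: `check_torch_environment` is a hypothetical helper name I made up for illustration, and it only relies on `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def check_torch_environment():
    """Report whether PyTorch is importable in this kernel and whether it sees a GPU.

    Hypothetical helper for illustration only; the name is not from any library.
    """
    report = {"torch_installed": False, "torch_version": None, "cuda_available": None}

    # find_spec returns None when the package is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    try:
        import torch
    except ImportError:
        # The package is present but cannot be imported (for example, a broken install).
        return report

    report["torch_installed"] = True
    report["torch_version"] = str(torch.__version__)
    # False here means PyTorch was installed but cannot use CUDA.
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


print(check_torch_environment())
```

If `torch_installed` comes back as False, the conda command above probably ran in a different environment than the notebook kernel. If `cuda_available` is False, training can still run, but only on the CPU.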
I hope this serves as a useful reference for anyone experiencing the same issue.


