I think this example notebook shows how to train a transformer model on the free Colab TPUs: https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/causal_language_modeling_flax.ipynb