TensorFlow has a pow function. This is the TensorFlow Lite CPU implementation, part of an inline header file over 8k lines long: https://github.com/tensorflow/tensorflow/blob/8c02285dc2664c2c74edbe7d2486f0845a4c499c/tensorflow/lite/kernels/internal/optimized/optimized_ops.h#L6254 . It automatically switches to an integer-power optimization when the exponent is approximately an integer. The general floating-point path is labeled "slow". The file is too big to review easily on my phone.
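
To make the "approximate integer" idea concrete, here is a minimal C++ sketch of that kind of fast path. It is not copied from the linked file: the names `PowByInt` and `PowWithIntFastPath`, the tolerance `eps`, the `exponent < 1024` cap and the "exponent must be at least 1" condition are all my own assumptions for illustration.

```cpp
#include <cmath>

// Hypothetical helper: x^n for integer n >= 0 using repeated squaring,
// which needs O(log n) multiplications instead of a general pow call.
inline float PowByInt(float x, int n) {
  float result = 1.0f;
  while (n > 0) {
    if (n & 1) result *= x;  // fold in the current bit of the exponent
    x *= x;                  // square the base for the next bit
    n >>= 1;
  }
  return result;
}

// Hypothetical wrapper: take the integer path when the exponent is within
// eps of an integer >= 1, otherwise fall back to the general std::pow.
// The tolerance and the upper bound are assumed values, not TensorFlow's.
inline float PowWithIntFastPath(float base, float exponent,
                                float eps = 1e-5f) {
  // Range check first so the float-to-int conversion below is well defined.
  if (exponent >= 1.0f - eps && exponent < 1024.0f) {
    const int n = static_cast<int>(std::round(exponent));
    if (std::fabs(exponent - static_cast<float>(n)) < eps) {
      return PowByInt(base, n);
    }
  }
  return std::pow(base, exponent);  // general (slower) floating-point path
}
```

With this sketch, `PowWithIntFastPath(2.0f, 3.0f)` takes the integer path, while `PowWithIntFastPath(2.0f, 0.5f)` falls through to `std::pow`.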

TensorFlow has GPU implementations of most of its ops, and a heavily maintained tensor (numerical array) class. TensorFlow's build system is Bazel, which has poor compatibility, but the codebase itself is just a pile of compilable source files.

GPU support is highly valued, but a fast floating-point pow approximation could be missing from TensorFlow. I'm unsure; it's hard to check.
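
For reference, this is roughly what I mean by a fast floating-point pow approximation: a sketch of the well-known bit trick that computes pow(x, y) as exp2(y * log2(x)), with both steps approximated by reading or writing the IEEE 754 float bits. None of this comes from TensorFlow. The names `FastLog2`, `FastExp2`, `FastPowApprox` and the constant `kSigma` are my own, it assumes 32-bit IEEE 754 floats, and it only handles finite, positive, normal `x` (no zero, negatives, infinities or NaN). Accuracy is only on the order of a few percent for small exponents and degrades as |y| grows.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumed centering constant: the linear approximation log2(1 + m) ~ m is
// off by at most about 0.086, so adding half of that balances the error.
constexpr float kSigma = 0.0430f;

// Approximate log2(x) for finite, positive, normal x. Interpreted as an
// integer, the float's bits are roughly (log2(x) + 127) * 2^23.
inline float FastLog2(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined bit copy
  return static_cast<float>(bits) * (1.0f / 8388608.0f) - 127.0f + kSigma;
}

// Approximate 2^p by building the bit pattern directly (inverse of above).
inline float FastExp2(float p) {
  float v = p + 127.0f - kSigma;
  // Clamp so the conversion to an unsigned integer stays in range.
  v = std::min(std::max(v, 0.0f), 254.0f);
  const std::uint32_t bits = static_cast<std::uint32_t>(v * 8388608.0f);
  float out;
  std::memcpy(&out, &bits, sizeof out);
  return out;
}

// Approximate pow(x, y) = 2^(y * log2(x)) for x > 0.
inline float FastPowApprox(float x, float y) {
  return FastExp2(y * FastLog2(x));
}
```

The trade-off is the usual one: a couple of multiplies and adds instead of a full `std::pow` call, in exchange for low precision, so it would only suit workloads that tolerate that error.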