I found that using MKL vector math functions such as `vsTanh` (https://un5zg2tx7j2d6pzvyg1g.julianrbryant.com/en-us/mkl-developer-reference-fortran-v-tanh) is considerably faster than doing `vector.mapv(|x| x.tanh())`.
Would it be worth including this in the ndarray crate behind a feature gate?
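
To make the idea concrete, here is a rough sketch of what such a gate could look like. It is only an illustration under a few assumptions: the feature name `mkl` and the helper `tanh_into` are hypothetical (neither exists in ndarray today), it works on plain `f32` slices rather than on array types, and the FFI declaration assumes MKL's LP64 interface, where `MKL_INT` is a C `int`. Without the feature enabled it falls back to the scalar `tanh`.

```rust
// Hypothetical sketch: `mkl` is an assumed Cargo feature name and
// `tanh_into` is an illustrative helper, not an existing ndarray API.

#[cfg(feature = "mkl")]
mod ffi {
    use std::os::raw::c_int;

    // Assumes the program is linked against MKL with the LP64 interface
    // (MKL_INT = int). Computes y[i] = tanh(a[i]) for i in 0..n.
    extern "C" {
        pub fn vsTanh(n: c_int, a: *const f32, y: *mut f32);
    }
}

/// MKL-backed path, compiled only when the (hypothetical) `mkl` feature is on.
#[cfg(feature = "mkl")]
fn tanh_impl(input: &[f32], out: &mut [f32]) {
    use std::os::raw::c_int;
    // Guard the cast: the LP64 interface takes a 32-bit length.
    assert!(input.len() <= c_int::MAX as usize);
    unsafe {
        ffi::vsTanh(input.len() as c_int, input.as_ptr(), out.as_mut_ptr());
    }
}

/// Portable fallback: element-wise scalar tanh from the standard library.
#[cfg(not(feature = "mkl"))]
fn tanh_impl(input: &[f32], out: &mut [f32]) {
    for (o, &x) in out.iter_mut().zip(input) {
        *o = x.tanh();
    }
}

/// Writes tanh(input[i]) into out[i]. Panics if the lengths differ.
pub fn tanh_into(input: &[f32], out: &mut [f32]) {
    assert_eq!(input.len(), out.len());
    tanh_impl(input, out);
}
```

On the ndarray side, I imagine this would only apply to contiguous arrays (for example, when `as_slice()` returns `Some`), with other layouts going through the existing `mapv` path. Note also that on the 2024 edition the extern block would have to be written as `unsafe extern "C"`.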