Missed autovectorization chances
While the developers of babl recognize SIMD as an important source of acceleration, the current form of many conversions is a bit hostile to autovectorization. The main issue is that the function signature gives the compiler no guarantee that the loop is free of loop-carried dependences: the way BablFunc... is too vague. We could use some pragmas or some restrict qualifiers to assert the lack of dependence, but the handiest cross-platform one, #pragma omp simd, applies to for loops only, not to the while loops we use.
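To make the difference concrete, here is a rough sketch of the two loop shapes. This is not babl code: the function names, the signatures and the 0.5 scale factor are made up for illustration, and the first function only imitates the general style of an untyped-pointer conversion. I am assuming GCC or Clang with -O3 and -fopenmp-simd; without that flag the pragma is simply ignored and the restrict qualifiers have to carry the no-aliasing promise on their own.

```c
#include <assert.h>

/* Hypothetical babl-style conversion: untyped char pointers and a
 * counting-down while loop. The compiler must assume that src and dst
 * may overlap, so it cannot prove the iterations are independent. */
void scale_while(const char *src, char *dst, long samples)
{
    assert(samples >= 0);
    const float *in = (const float *)src;
    float *out = (float *)dst;
    long n = samples;

    while (n--)
        *out++ = *in++ * 0.5f;
}

/* Same computation, restated as a canonical for loop over typed,
 * restrict-qualified pointers. restrict promises that the two buffers
 * do not overlap; the pragma additionally asks for SIMD code. */
void scale_for(const float *restrict in, float *restrict out, long samples)
{
    assert(samples >= 0);

    #pragma omp simd
    for (long i = 0; i < samples; i++)
        out[i] = in[i] * 0.5f;
}
```

Both functions compute the same result for non-overlapping buffers; only the second hands the compiler the guarantees it needs. Whether a given compiler actually vectorizes either one still depends on version and flags, so it is worth checking the generated assembly.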
I cooked up a very unscientific example at https://gcc.godbolt.org/z/hrYvee. Play with the code and you can see how it vectorizes. By default, the first (current babl-style) sample should compile to an unrolled but unvectorized loop, while the second has a vectorized loop plus an unvectorized footer for the leftover elements.
PS: The char* thing is fishy, since a char pointer is allowed to alias any other object. But I think others have mentioned it too.