Sunday, July 06, 2008

C++ Template magic

Now that Johan has talked about why the STL simply rocks, I have to add a quick note about C++ templates in general, or to be precise, about template specialization. I recently wrote a ring buffer template, something like this:
template <class T, int TSize>
class RingBuffer
{
public:
    RingBuffer();
    // ...
    void append(int count, T* first);
private:
    std::vector<T> buffer;
    int start;
    int end;
};

template <class T, int TSize>
RingBuffer<T,TSize>::RingBuffer()
{
    buffer.resize(TSize);
    start = 0;
    end = 0;
}

template <class T, int TSize>
void RingBuffer<T,TSize>::append(int count, T* first)
{
    // code omitted: make sure count elements fit, otherwise return
    // now there are two cases: either count elements fit completely,
    // or we have to wrap around at the end of the ring buffer.
    // the case of a wrap around is ignored here, too

    copyHelper(&buffer[start], first, count);
}
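Now the function copyHelper, which needs to be declared before append() uses it, looks like this:
// Only T is a template parameter: a TSize parameter could not be
// deduced from the arguments of the call in append().
template <class T>
static inline void copyHelper(T* dest, T* src, int count)
{
    while (count--) {
        *dest++ = *src++;
    }
}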
copyHelper simply assigns count elements one by one with operator=(), since T can also be a class. But when T is, e.g., a simple char, copying element by element can perform really badly compared to a single block copy. However, there is a solution: template specialization. That means we add a specialized version of copyHelper() for char, and the compiler is then clever enough to pick the correct one:
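// An explicit specialization must not repeat the storage class
// specifier (static) of the primary template.
template <>
inline void copyHelper<char>(char* dest, char* src, int count)
{
    // needs #include <cstring>; count is a byte count because sizeof(char) == 1
    memcpy(dest, src, count);
}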
Now the code is really fast in the case of char. In other words, with template specialization we can heavily tune template code in a lot of cases. And ideally, STL implementations do exactly this, don't they? Does Qt use template specialization for its template classes?
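
To make this concrete, here is a minimal, self-contained sketch of the specialization described above. It is not the original ring buffer code: the const-qualified src parameter is my assumption, and the Tracked type, its assignment counter and copyHelperDemo() are hypothetical names that exist only to show which version of copyHelper gets picked.

```cpp
#include <cassert>
#include <cstring>

// Generic version: copies element by element with operator=.
template <class T>
inline void copyHelper(T* dest, const T* src, int count)
{
    while (count--) {
        *dest++ = *src++;
    }
}

// Explicit specialization for char: one block copy instead of a loop.
template <>
inline void copyHelper<char>(char* dest, const char* src, int count)
{
    std::memcpy(dest, src, static_cast<std::size_t>(count));
}

// Hypothetical element type that counts how often operator= runs.
struct Tracked
{
    Tracked() : value(0) {}
    Tracked& operator=(const Tracked& other)
    {
        value = other.value;
        ++assignments;
        return *this;
    }
    int value;
    static int assignments;
};
int Tracked::assignments = 0;

// Small self-check: the generic version is used for Tracked,
// the specialization for char.
inline void copyHelperDemo()
{
    Tracked src[3];
    Tracked dst[3];
    src[0].value = 1;
    src[1].value = 2;
    src[2].value = 3;
    copyHelper(dst, src, 3);              // T is deduced as Tracked
    assert(dst[2].value == 3);
    assert(Tracked::assignments == 3);    // one operator= call per element

    const char text[] = "ring";
    char out[sizeof text];
    copyHelper(out, text, static_cast<int>(sizeof text)); // T is deduced as char
    assert(std::strcmp(out, "ring") == 0);
    assert(Tracked::assignments == 3);    // the char copy did not touch Tracked
}
```

Calling copyHelperDemo() from main() runs both paths. Whether the memcpy version is measurably faster depends on the compiler and its optimization level, so it is worth measuring before relying on it.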

2 comments:

Unknown said...

Qt uses template specialization quite extensively in performance-critical drawing code.

An easy way to see this: in qt-copy/src, run:

find | egrep "\.(cpp|h)$" | xargs grep "template <>"

Unknown said...

A very nice way to reduce the number of overloaded functions in your actual class is to make use of type traits. Have a traits class like this:


// needs #include <algorithm> for std::copy and #include <cstring> for memcpy
template <typename T_>
struct TypeTraits
{
    inline static void copy(const T_ * src, T_ * dest, int count)
    {
        std::copy(src, src + count, dest);
    }
};

template <>
struct TypeTraits<char>
{
    inline static void copy(const char * src, char * dst, int count)
    {
        memcpy(dst, src, count);
    }
};


In your RingBuffer, you can then just write


TypeTraits<T_>::copy(src, dst, count);


and be done with it. That way your RingBuffer template will still be fairly readable, while it benefits from specialization on PODs.
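
For illustration, here is a rough sketch of how the traits class could plug into a ring buffer. The RingBuffer below is a deliberately simplified, hypothetical stand-in for the one in the post: it never wraps around, append() takes a const pointer and returns a bool, and size() and at() are accessors I added only so the result can be checked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Generic traits: element-wise copy via std::copy.
template <typename T>
struct TypeTraits
{
    static void copy(const T* src, T* dst, int count)
    {
        std::copy(src, src + count, dst);
    }
};

// Specialization for char: a single block copy.
template <>
struct TypeTraits<char>
{
    static void copy(const char* src, char* dst, int count)
    {
        std::memcpy(dst, src, static_cast<std::size_t>(count));
    }
};

// Simplified, hypothetical ring buffer: it only fills up linearly
// and rejects data that does not fit (no wrap-around handling).
template <class T, int TSize>
class RingBuffer
{
public:
    RingBuffer() : buffer(TSize), end(0) {}

    // Returns false if count is negative or the elements do not fit.
    bool append(int count, const T* first)
    {
        if (count < 0 || count > TSize - end) {
            return false;
        }
        // The class itself stays free of per-type special cases.
        TypeTraits<T>::copy(first, buffer.data() + end, count);
        end += count;
        return true;
    }

    int size() const { return end; }
    const T& at(int index) const { return buffer[index]; }

private:
    std::vector<T> buffer;
    int end;
};
```

With this, RingBuffer<char, 8> goes through the memcpy path and RingBuffer<int, 4> through std::copy, without any overloads inside the class.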