Just Let It Flow

March 12, 2009

UNICODE Independence

Filed under: Code,Windows — adeyblue @ 7:16 am

As everybody who’s ever done some Windows programming knows, string handling functions come in two flavours: ‘A’ versions for char-based strings and ‘W’ versions for WCHAR-based strings, with the actual function names being macros that expand to the correct version depending on whether UNICODE has been defined. This is fine and all, but since macros are generally frowned upon in C++, surely there’s a better way to go about this? What we want is a method that chooses which flavour of the function to call based on the type of its arguments, rather than on a global setting that requires user intervention to get the opposite flavour.

The usual C++ alternatives to macros are inline functions or templates. As plain inline functions don’t give us any special compile-time facilities they are of no use here, so we’re left with templates. They are a good choice for doing something like this as we can leverage the metaprogramming and type identification facilities they provide. The one we’ll be using as the key to our system is type equivalence, along with a compile-time if template to select between the ‘A’ and ‘W’ versions of the functions.
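To make the combination of type equivalence and a compile-time if concrete, here is a minimal sketch. It assumes the standard library’s std::is_same in place of boost::is_same purely to avoid the dependency, and NarrowInfo, WideInfo, InfoFor and FlavourOf are hypothetical names made up for illustration rather than anything from the Windows headers.

```cpp
#include <type_traits>

// Compile-time "if": only the two partial specializations are defined.
template<bool Cond, class TrueType, class FalseType>
struct IfThenElse;

template<class TrueType, class FalseType>
struct IfThenElse<true, TrueType, FalseType> { typedef TrueType type; };

template<class TrueType, class FalseType>
struct IfThenElse<false, TrueType, FalseType> { typedef FalseType type; };

// Hypothetical stand-ins for an 'A' type and a 'W' type.
struct NarrowInfo { static char suffix() { return 'A'; } };
struct WideInfo   { static char suffix() { return 'W'; } };

// Type equivalence supplies the condition; IfThenElse picks the flavour.
template<class CharType>
struct InfoFor
{
    typedef typename IfThenElse<
        std::is_same<CharType, char>::value,
        NarrowInfo,
        WideInfo
    >::type type;
};

// Reports which flavour was selected for a given character type.
template<class CharType>
char FlavourOf()
{
    return InfoFor<CharType>::type::suffix();
}
```

Calling FlavourOf<char>() goes through NarrowInfo and FlavourOf<wchar_t>() through WideInfo, with the choice made entirely at compile time.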

The heart of the technique is a single use of the ternary operator. It is fed the condition “Is the character type narrow?” and returns the A version if so, and the W version if not. Because both operands of the ternary operator must share a common type, both sides are cast to a common void(__stdcall*)(void) function pointer. The types of both functions must be preserved, however, so that the result can be cast back to the original type to enforce argument type checking. This is where boost comes in. As a function name is an expression rather than a type in its own right, BOOST_TYPEOF is leveraged to get the real type from both the A and W versions of the function.

Once we’ve got the correct resulting type, boost is employed again, specifically its add_pointer template, to turn the function type back into a callable function pointer type. The eventual result of all this is the correct function pointer, just as if you’d written it directly into the source code, with all the same facilities.
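Written out by hand, without any macro, the steps above might look like the sketch below. It is portable rather than Windows-specific, so some substitutions are assumed: LengthA and LengthW are hypothetical stand-ins for a real A/W pair, decltype replaces BOOST_TYPEOF, std::conditional and std::add_pointer replace IfThenElse and boost::add_pointer, and the common pointer type drops __stdcall.

```cpp
#include <cstring>
#include <cwchar>
#include <type_traits>

// Hypothetical stand-ins for an A/W pair such as lstrlenA and lstrlenW.
int LengthA(const char* s)    { return static_cast<int>(std::strlen(s)); }
int LengthW(const wchar_t* s) { return static_cast<int>(std::wcslen(s)); }

// Common type for both arms of the ternary; the post uses void(__stdcall*)().
typedef void (*CommonFn)();

template<class CharType>
int Length(const CharType* text)
{
    const bool isNarrow = std::is_same<CharType, char>::value;

    // Function type (not pointer type) of the flavour matching CharType.
    typedef typename std::conditional<
        isNarrow, decltype(LengthA), decltype(LengthW)
    >::type FunctionType;

    // Turn the function type back into a callable pointer type.
    typedef typename std::add_pointer<FunctionType>::type FunctionPtr;

    // Both arms are cast to one type so the ternary compiles...
    CommonFn chosen = isNarrow
        ? reinterpret_cast<CommonFn>(&LengthA)
        : reinterpret_cast<CommonFn>(&LengthW);

    // ...and casting back restores the real signature, so the
    // compiler still checks the argument types at the call.
    return reinterpret_cast<FunctionPtr>(chosen)(text);
}
```

Length("abc") ends up calling LengthA and Length(L"abc") ends up calling LengthW, while passing a mismatched argument type to the restored pointer is still a compile error.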

As all this produces a lot of code to type out every time we want to use it, we turn back to the thing that started us off on this jaunt in the first place: a macro. This is a necessary evil, however, as we need the function macro in its unexpanded state to weave our magic. After implementing the macro, a call to MessageBox looks something like

template<class CharType>
int DisplayMessage(const std::basic_string<CharType>& message, const CharType* title)
{
    return WIN_FUNC(MessageBox)(NULL, message.c_str(), title, MB_OK);
}
 
// elsewhere
std::wstring wideHello = L"Hello there from the WCHARs";
const WCHAR* wideTitle = L"Howdy";
DisplayMessage(wideHello, wideTitle);
std::string hello = "Hello there from the narrow chars";
const char* narrowTitle = "Howdy";
DisplayMessage(hello, narrowTitle);

As you can see, the usage is the same as that of the TEXT macro with as little clutter around the modified area as possible, except now the function call looks like an object that takes the function name as a constructor parameter with the arguments to its operator() being those of the function.
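The reason the macro has to receive the function name in its unexpanded state can be shown with a small portable sketch. Greet, GreetA, GreetW and the PASTE_A, PASTE_W and CALL_PLAIN macros are hypothetical names that only mimic the pattern used by the Windows headers.

```cpp
// Hypothetical A/W pair following the Windows header naming pattern.
int GreetA() { return 1; }
int GreetW() { return 2; }

// What the Windows headers do for real API names when UNICODE is defined.
#define Greet GreetW

// Operands of ## are pasted before they are macro-expanded, so the
// argument Greet stays unexpanded and we get GreetA / GreetW.
#define PASTE_A(Function) Function##A
#define PASTE_W(Function) Function##W

// Without ##, the argument is expanded first: this calls GreetW.
#define CALL_PLAIN(Function) Function()

int CallNarrow() { return PASTE_A(Greet)(); }
int CallWide()   { return PASTE_W(Greet)(); }
int CallPlain()  { return CALL_PLAIN(Greet); }
```

If the argument were expanded before pasting, PASTE_A(Greet) would try to name GreetWA, which is the kind of result the header comments further down warn about.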

Speaking of the TEXT macro, we need an equivalent replacement to use with our functions so that correctly typed string and character literals can be passed as function parameters. The format is exactly the same, but it requires a few different type trait helpers from boost as well as a helpful hack via boost preprocessor to make up for the preprocessor’s shortfalls. There are also types in the Windows headers that depend on the UNICODE macro, but as the underlying principles of the method aren’t tied to functions or strings, a third macro can be introduced to deal with them. This one is much simpler, needing just a single compile-time if.
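For comparison, here is a rough sketch of the same idea for literals using a different technique from the header in this post: overloaded functions choose between the narrow and wide literal instead of the cast-based ternary with is_array. LiteralPicker, PICK_TEXT, Greeting and Separator are hypothetical names, and unlike WIN_TEXT this version takes the character type as an explicit macro argument.

```cpp
// Selects the narrow or the wide argument depending on CharType.
// One overload handles string literals, the other character literals.
template<class CharType>
struct LiteralPicker;

template<>
struct LiteralPicker<char>
{
    static const char* pick(const char* narrow, const wchar_t*) { return narrow; }
    static char pick(char narrow, wchar_t) { return narrow; }
};

template<>
struct LiteralPicker<wchar_t>
{
    static const wchar_t* pick(const char*, const wchar_t* wide) { return wide; }
    static wchar_t pick(char, wchar_t wide) { return wide; }
};

// Hypothetical macro: L##Str builds the wide literal from the narrow one,
// so the caller only ever writes the narrow form.
#define PICK_TEXT(CharType, Str) LiteralPicker<CharType>::pick(Str, L##Str)

template<class CharType>
const CharType* Greeting() { return PICK_TEXT(CharType, "Howdy"); }

template<class CharType>
CharType Separator() { return PICK_TEXT(CharType, ','); }
```

Greeting<char>() yields the narrow string and Greeting<wchar_t>() the wide one, and Separator works the same way for character literals.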

An example showing all of the macros in action would look something like this:

// note: the macros in the header below require the template parameter
// be named CharType although it's easily editable
template<class CharType>
BOOL LaunchProcess(const std::basic_string<CharType>& process)
{
    WIN_TYPE(STARTUPINFO) si = {0};
    si.cb = sizeof(si); // CreateProcess expects cb to hold the structure size
    PROCESS_INFORMATION pi = {0};
    std::basic_string<CharType> temp(process);
    BOOL launched = WIN_FUNC(CreateProcess)(NULL, &temp[0], NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi);
    if(launched)
    {
        WIN_FUNC(MessageBox)(NULL, process.c_str(), WIN_TEXT("Launched Process"), MB_OK);
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
    }
    return launched;
}

In unoptimized builds, both WIN_FUNC and WIN_TEXT incur two comparisons before the correct type is returned whereas WIN_TYPE takes one. In optimized builds with MinGW and VS 2008 you cannot tell that the macros have been used as all trace of them is optimized out.

Finally, here are the contents of the header that implements all the above. The Boost typeof, type_traits and preprocessor headers are required in order to compile.

#ifndef CHAR_TYPE_AGNOSTIC_MACROS_H
#define CHAR_TYPE_AGNOSTIC_MACROS_H
 
#pragma once
 
#include <boost/type_traits/add_pointer.hpp>
#include <boost/typeof/typeof.hpp>
#include <boost/type_traits/is_same.hpp>
#include <boost/type_traits/is_array.hpp>
#include <boost/preprocessor/tuple/rem.hpp>
#include "IfThenElse.hpp"
 
//
// The macros
//
// Helper just for readability sake
#define IS_EQUAL_TO_CHAR(CharType) \
    boost::is_same<CharType, char>::value
 
// Uses the IfThenElse struct from IfThenElse.hpp to pick the function type
// that contains the version of the func corresponding to CharType
 
#define PICK_RIGHT_TYPE(CharType, FunctionA, FunctionW) \
    typename IfThenElse /* typename is required since the result is always template parameter dependent */ \
    < \
        IS_EQUAL_TO_CHAR(CharType), \
        BOOST_TYPEOF(FunctionA), \
        BOOST_TYPEOF(FunctionW) \
    >::type
 
// The casts to a common type (here a function pointer taking and returning no arguments) 
// are required since both sides of the ternary operator must be of the same
// type, using the bare types violates the rule and generates a compile error.
// The cast to function pointer rather than straight void* is used because
// MinGW emits warnings when doing that
// The "returned" argument is the function that will be called
#define FUNC_DECISION(ResType, Cond, True, False) \
    ( \
        reinterpret_cast<ResType> \
        ( \
            Cond ? \
            reinterpret_cast<void(__stdcall*)()>(True) : \
            reinterpret_cast<void(__stdcall*)()>(False) \
        ) \
    )
 
// just like above but for non-functions
#define ARG_DECISION(ResType, Cond, True, False) \
    reinterpret_cast<ResType> \
    ( \
        Cond ? \
        reinterpret_cast<const void*>(True) : \
        reinterpret_cast<const void*>(False) \
    )
 
#define CALL_CORRECT_VERSION(CharType, FunctionA, FunctionW) \
    ( \
        FUNC_DECISION \
        ( \
            typename boost::add_pointer<PICK_RIGHT_TYPE(CharType, FunctionA, FunctionW)>::type, \
            IS_EQUAL_TO_CHAR(CharType), \
            FunctionA, \
            FunctionW \
        ) \
    )
 
// The macro to use when calling the function in source code 
// The concatenation is done here because we don't want Function to expand to its natural
// A or W type, otherwise we get stuff like GetWindowTextAA, GetWindowTextWA, etc.
// Arguments are specified in the normal way after the macro, for example
//
//template<class CharType>
//BOOL UserName(std::basic_string<CharType>& user)
//{
//    DWORD sizeRequired = 0;
//    WIN_FUNC(GetUserName)(NULL, &sizeRequired);
//    user.resize(sizeRequired);
//    return WIN_FUNC(GetUserName)(&user[0], &sizeRequired);
//}    
#define WIN_FUNC(Function) \
    CALL_CORRECT_VERSION \
    ( \
        CharType, \
        Function##A, \
        Function##W \
    )
 
// Argument to WIN_TEXT is a narrow string or character literal
//
// BOOST_PP_TUPLE_REM_CTOR is given an incorrect argument for the size of the tuple
// because commas delimit tuples values, if 1 is specified then everything after the first
// IS_EQUAL_TO_CHAR is discarded by the expanded macro
// so the argument is the number of commas in the expression enclosed by it
#define WIN_TEXT(Str) \
    ARG_DECISION \
    ( \
        BOOST_PP_TUPLE_REM_CTOR /* first arg is type to cast the result back to */ \
        ( \
            8, \
            ( \
                typename IfThenElse \
                < \
                    IS_EQUAL_TO_CHAR(CharType), /* Test if the CharType is char */ \
                    BOOST_PP_TUPLE_REM_CTOR /* It was char */\
                    ( \
                        3, \
                        ( \
                            typename IfThenElse /* determine whether we were passed a string or a char */ \
                            < \
                                boost::is_array<BOOST_TYPEOF(Str)>::value, \
                                const char*, /* was a string */ \
                                char /* was a char */ \
                            >::type \
                        ) \
                    ), \
                    BOOST_PP_TUPLE_REM_CTOR /* It was WCHAR */ \
                    ( \
                        3, \
                        ( \
                            typename IfThenElse \
                            < \
                                boost::is_array<BOOST_TYPEOF(Str)>::value, \
                                const wchar_t*, /* was a string */ \
                                wchar_t /* was a char */ \
                            >::type \
                        ) \
                    ) \
                >::type /* The type to cast back to, is the result of the equivalence applied to the results of the inner ifs */ \
            ) \
        ), \
        BOOST_PP_TUPLE_REM_CTOR /* Second argument to ARG_DECISION */ \
        ( \
            2, \
            ( \
                IS_EQUAL_TO_CHAR(CharType) /* Just a simple equivalence test */ \
            ) \
        ), \
        Str, /* Third arg */ \
        L##Str /* Fourth arg */ \
    )
 
// Like WIN_TEXT but for Windows types that are UNICODE dependent such as STARTUPINFO
#define WIN_TYPE(Type) \
    typename IfThenElse \
    < \
        IS_EQUAL_TO_CHAR(CharType),\
        Type##A, \
        Type##W \
    >::type
 
#endif // CHAR_TYPE_AGNOSTIC_MACROS_H

IfThenElse.hpp

#ifndef IF_THEN_ELSE_H
#define IF_THEN_ELSE_H
 
#pragma once
 
// Template "If" Function, copied from C++ Templates: A Complete Guide
// http://books.google.co.uk/books?id=EotSAwuBkJoC&pg=PA309&lpg=PA309&ct=result#PPA309,M1
template<bool cond, class TrueArg, class FalseArg>
struct IfThenElse;
 
template<class TrueType, class FalseType>
struct IfThenElse<true, TrueType, FalseType>
{
	typedef TrueType type;
};
 
template<class TrueType, class FalseType>
struct IfThenElse<false, TrueType, FalseType>
{
	typedef FalseType type;
};
 
#endif
