25 Sep 2011, 02:01

Positional Printf

Note: this article was originally found at Code Of The Day in Flipcode.

Printf-style formatting is well known, flexible and very useful in general. All the worries about unchecked parameters and so on have proven to be a non-íssue in my experience. However, it lacks parameter positioning, the way .NET allows. Why is this important for games? Any modern game needs to have a good localisation system; most game translators understand printf-style format strings and know what to do with them when they are among the text strings to be localised. There is a subtle kind of localised strings that can’t be easily translated most of the times: strings in which parameterized data changes position depending on the language. A good example is the date format, dd/mm/yy as we use in Europe as opposed to the mm/dd/yy format in the US. The format string would be “%d/%d/%d”, and the call could be something like:

printf(LTEXT\_DATE\_STRING, day, month, year);

How would you localise a string like that to the US without modifying the code? Using system locale functions defeats the point I’m making; another example are strings like “Rick’s Bar” as opposed to “El Bar de Rick”, where both “Rick” and “Bar” would change positions if they are parameters.

The function I present here wraps printf with an extension to format specifiers, which allows for parameter placement. In the given example, the US string would be “%{1}d/%{0}d/%{2}d”, where the number in parenthesis specifies which parameter should be used in that place of the format string. If no placement modifier is included, it assumes the next parameter from the previous format specifier. This way, the function will work fine with regular printf format strings.

The code will need modifications to be a full-blown function to use in your own application, and is only presented as an example of the technique. Check it, read the comments and modify at will. It is templatized in order to support both 8-bit chars and 16-bit wchar_t strings.

Caveats:

  • Asterisks (’*‘) in the format string can’t be supported this way.
  • It only supports “%c”, “%d”, “%g” and “%s” format strings (but will work with all format modifiers except the mentioned asterisk). Extending this support is trivial.
  • Portions of the code are system-dependent. This example currently works in Microsoft’s .NET compiler, but it shouldn’t be hard to modify for your own platform.
// -----------------------------------------------
// PositionalPrintfTest.cpp

#define UNICODE
#include <tchar.h>
#include <stdio.h>

#include "PositionalPrintf.h"

// Try different things with the fancy printf function
int _tmain(int argc, _TCHAR* argv[])
{
  printf(_T("%s %g %d %c\n"), _T("Hola"), 3.4f, 34, _T('3'));

  PositionalPrintf("%s %g %d %c\n", "Hola", 3.4f, 34, '3');
  PositionalPrintf(L"%s %g %d %c\n", L"Hola", 3.4f, 34, L'3');
  PositionalPrintf(_T("%s %g %d %c\n"), _T("Hola"), 3.4f, 34, _T('3'));

  PositionalPrintf("%{3}c %g - %{0}s's %{1}s\n%{3}c %{4}g - El %{1}s de %{0}s", "Rick", "Bar", 'S', 'E', 3.1415f);
}
// -----------------------------------------------
// PositionalPrintf.h

// Only two versions of this template are instantiated, char and wchar_t
// Any other attempt to use this function will result in a link error.
template<typename T>
void PositionalPrintf(const T *pszFmt, ...);
// -----------------------------------------------
// PositionalPrintf.cpp

#include <stdarg.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

// Templated functions for char/wchar_t operations.
template <typename T> int TempAtoi(const T *p);

template<> int TempAtoi(const char *p) { return atoi(p); }
template<> int TempAtoi(const wchar_t *p) { return _wtoi(p); }

template <typename T> void TempVPrintf(const T *p, va_list va);

template<> void TempVPrintf(const char *p, va_list va)     { vprintf(p, va); }


template<> void TempVPrintf(const wchar_t *p, va_list va)  { vwprintf(p, va); }

// The actual function
template<typename T>
void PositionalPrintf(const T *pszFmt, ...)
{
  enum TParam
  {
    TP_UNK,
    TP_CHAR,
    TP_INT,
    TP_STR,
    TP_FLOAT,
  };
  static const int MAX_PARAMS = 100;

  // Here we store the parameters we should be expecting on the stack


  // We decide this based on the format string.
  int nParams  = 0;
  int nextParam = 0;
  TParam aParams[MAX_PARAMS];

  // Here we store each '%' format element's data from the format string.
  struct
  {
    TParam param;
    int    pos;
  } aFormats[MAX_PARAMS];
  int nFormats = 0;

  for (int i = 0; i < MAX_PARAMS; ++i)
    aParams[i] = TP_UNK;

  // Scan the format string to detect expected parameters and the way to extract them from the stack.


  bool bError = false;
  for (const T *s = pszFmt; *s; ++s)
  {
    if (*s == T('%'))
    {
      // Found a format element. First thing to do is identify the position of the parameter
      // on the stack.


      int pos;
      if (*(s+1) == T('{'))
      {
        // Explicit position modifier
        pos = TempAtoi(s+2);
        nextParam = pos+1;
      }
      else
        pos = nextParam++; // Just se the next position.
      while (*s && *s != T('c') && *s != T('d') && *s != T('g') && *s != T('s'))
      {
        // Here we could detect '*' parameters (which won't work), missing '}', etc.


        s++;
      }
      if (!*s)
        break;

      if (pos >= MAX_PARAMS)
      {
        // error
        printf("ERROR! Parameter %d out of range.\n", pos);
        bError = true;
      }
      else 
      {
        // Identifica el tipo de parámetro que es


        // Aqui podemos extender esto un montón para cubrir todos los tipos de formato
        TParam p;
        if (*s == T('c'))      p = TP_CHAR;
        else if (*s == T('d')) p = TP_INT;
        else if (*s == T('s')) p = TP_STR;
        else                   p = TP_FLOAT;

        if (aParams[pos] != TP_UNK && aParams[pos] != p)
        {
          // error


          printf("ERROR! Parameter %d used with different format specifiers (%d and %d).\n", pos, aParams[pos], p);
          bError = true;
        }

        // Store the parameter type to be expected at position 'pos' on the stack.
        aParams[pos] = p;
        if (nParams <= pos)
          nParams = pos+1;

        // Store the parameter type and position on the stack, for this format element.
        if (nFormats >= MAX_PARAMS)
        {
          // error


          printf("ERROR! Too many format elements (%d)!\n", nFormats);
          bError = true;
        }
        else
        {
          aFormats[nFormats].param = p;
          aFormats[nFormats].pos   = pos;
          nFormats++;
        }
      }
    }
  }

  // Verify that all parameters on the stack are referenced. (optional)
  for (int i = 0; i < nParams; i++)
  {
    if (aParams[i] == TP_UNK)
    {
      // error


      // Benign if we assume that unused parameters are of INT size.
//      printf("ERROR! Parameter %d undefined.\n", i);
//      bError = true;
    }
  }
  if (bError)
    return;

  // Build a new format string removing the {n} modifiers
  T szNewFmt[3000];
  {
    T *p = szNewFmt;

    szNewFmt[0] = T(0);

    for (const T *s = pszFmt; *s; s++)
    {
      *p++ = *s;
      if (*s == T('%'))
      {
        if (*(s+1) == T('{'))
        {
          s += 2;
          while (*s && *s != T('}'))
            s++;
          if (!*s)
            break;
        }
      }
    }
    *p = T(0); // Zero-end the new format string


  }
  
  // Copy the parameters to the parameter buffer in correct order.
  // Here comes the system-dependent part
  char aParamBuf[3000];
  {
    char *p = aParamBuf;

    for (int i = 0; i < nFormats; ++i)
    {
      va_list va;
      va_start(va, pszFmt);

      // Skip stack parameters until the one we're looking for.


      // System-dependent: assumes that anything that is not of type double is of INT size
      for (int j = 0; j < aFormats[i].pos; ++j)
      {
        if (aParams[j] == TP_FLOAT)
          va_arg(va, double);
        else

          va_arg(va, int);
      }

      // Copy the parameter into the new parameter buffer
      // System-dependent: the size thing again.
      // System-dependent: assumes things about the order in which parameters are stored on the stack.
      if (aParams[j] == TP_FLOAT)
      {
        double d = va_arg(va, double);
        memcpy(p, &d, sizeof(d));
        p += sizeof(d);
      }
      else

      {
        int d = va_arg(va, int);
        memcpy(p, &d, sizeof(d));
        p += sizeof(d);
      }
    }
  }

  // System-dependent: assumes that aParamBuf can be cast straight to va_list.
  TempVPrintf(szNewFmt, (va_list)aParamBuf);
}

// Instantiate the char and wchar_t versions of the function.

template void PositionalPrintf(const char *pszFmt, ...);

template void PositionalPrintf(const wchar_t *pszFmt, ...);