Token Concatenation

2025-02-24, Mon

The other day I was checking some random Emacs built-in function when the DEFUN macro showed up in the source code. My next step was to check its definition to see what's behind the scenes, and here is what's in emacs/src/lisp.h:

#define DEFUN(lname, fnname, sname, minargs, maxargs, intspec, doc) \
SUBR_SECTION_ATTRIBUTE                                            \
static union Aligned_Lisp_Subr sname =                            \
   {{{ PVEC_SUBR << PSEUDOVECTOR_AREA_BITS },			    \
     { .a ## maxargs = fnname },				    \
     minargs, maxargs, lname, {intspec}, lisp_h_Qnil}};	    \
 Lisp_Object fnname

This macro defines a static union and then starts a function declaration, leaving the parameter list and body to be written right after the macro call. All seems natural, except what is this strange "##" thing doing there? I extracted a snippet from emacs/src/fileio.c and preprocessed it with gcc -E sample.c. So this:

DEFUN ("directory-name-p", Fdirectory_name_p, Sdirectory_name_p, 1, 1, 0,
         doc: /* Return non-nil if NAME ends with a directory separator character.  */)
(Lisp_Object name)
{ /* Omitted */  }

was transformed to this:

SUBR_SECTION_ATTRIBUTE static union Aligned_Lisp_Subr Sdirectory_name_p =
  {
    {
        { PVEC_SUBR << PSEUDOVECTOR_AREA_BITS },
        { .a1 = Fdirectory_name_p },
        1,
        1,
        "directory-name-p",
        {0},
        lisp_h_Qnil
    }
  };
Lisp_Object Fdirectory_name_p

(Lisp_Object name)
{  }

Apparently .a ## maxargs becomes .a1 after preprocessing. This result raises questions: where is this behavior defined? Namely, is it standard behavior specified by the C standard, or implementation-specific behavior of GCC? Are there any other exotic special characters used by C macros?
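To look at the pasting in isolation, here is a minimal sketch of the same pattern. DEFSUBR, fn_slot, subr, twice and add are hypothetical names made up for this illustration; they are not Emacs code, and the real DEFUN does considerably more.

```c
/* Hypothetical miniature of the DEFUN pattern; none of these names
   come from Emacs. */
union fn_slot {
  int (*a1) (int);       /* slot for one-argument functions */
  int (*a2) (int, int);  /* slot for two-argument functions */
};

struct subr {
  union fn_slot function;
  int maxargs;
  const char *name;
};

/* Forward declarations, so the initializers below can take the
   functions' addresses before their definitions appear. */
static int twice (int x);
static int add (int x, int y);

/* `.a ## maxargs` pastes `a` and the argument into one identifier,
   so maxargs = 2 selects the union member `a2`.  The macro ends in
   an unfinished declaration; the caller supplies the parameter list
   and the body. */
#define DEFSUBR(lname, fnname, sname, maxargs)        \
  static struct subr sname =                          \
    { { .a ## maxargs = fnname }, maxargs, lname };   \
  static int fnname

DEFSUBR ("twice", twice, Stwice, 1) (int x) { return 2 * x; }
DEFSUBR ("add", add, Sadd, 2) (int x, int y) { return x + y; }
```

After preprocessing, Stwice is initialized through .a1 and Sadd through .a2, so a caller can reach the functions as Stwice.function.a1 (21) and Sadd.function.a2 (40, 2).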

A quick check of the C99 document n1256.pdf [1][2] reveals that this ## thing is indeed well defined as "The ## operator" in Section 6.10.3.3, along with "The # operator" in Section 6.10.3.2. While the standard is not self-explanatory to me, GNU provides easier-to-understand examples in "The C Preprocessor" [3], Section 3.4 and Section 3.5.

Back to the earlier questions: # and ## are two operators recognized in macro definitions. # is used for stringizing and is only valid in function-like macros, where it must be followed by a parameter; ## is used for token pasting, also called token concatenation. With those questions answered, it's time for more questions, answered with examples:

#define SAMPLEFUNC(a, b, c)			\
  "a received as: " a				\
  "b received as: " b				\
  "c received as: " c

// Q: What argument does each parameter receive?
// A: A sequence of preprocessing tokens, split at commas; comments are stripped first
SAMPLEFUNC(what, is: /*this macro, foo*/, bar);
// "a received as: " what "b received as: " is: "c received as: " bar

// Q: What if not enough arguments are provided?
// A: Compilation error
SAMPLEFUNC(what, is);
// src/sample-func.c:37:20: error: too few arguments provided to function-like macro invocation
//    37 | SAMPLEFUNC(What, is);
//       |                    ^
// src/sample-func.c:27:9: note: macro 'SAMPLEFUNC' defined here
//    27 | #define SAMPLEFUNC(a, b, c) \
//       |         ^
//

// Q: Other than the comma, is there any other separator of macro arguments?
// A: No. Only commas outside nested parentheses separate arguments
SAMPLEFUNC(`~!@#$%^&*()--=+{}[]|\';''"":;<>,.,doc/**/);
// "a received as: " `~!@#$%^&*()--=+{}[]|\';''"":;<> "b received as: " . "c received as: " doc
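One detail about separators is worth a small sketch: a comma only splits arguments when it sits outside nested parentheses, while braces and brackets give no such protection. SECOND, paren_shielded and brace_not_shielded are hypothetical names used only for this illustration.

```c
/* Stringize the second argument so we can see what it received. */
#define SECOND(a, b) #b

/* The commas inside f(1, 2) and g(3, 4) are shielded by the
   parentheses, so the macro still sees exactly two arguments. */
const char *paren_shielded (void)
{
  return SECOND (f(1, 2), g(3, 4));
}

/* Braces do not shield the comma: the arguments are `{1` and `2}`. */
const char *brace_not_shielded (void)
{
  return SECOND ({1, 2});
}
```

Here paren_shielded () returns "g(3, 4)" and brace_not_shielded () returns "2}". Wrapping an argument in an extra pair of parentheses is the usual way to pass something that contains a comma.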

And my curiosity was satisfied. Let's conclude this post with an example featuring stringizing, concatenation, and variadic arguments all at once.

#define foo bar
#define greetings(x, y, z, ...)		\
  x ## y : # z : __VA_ARGS__

greetings(f, oo, baz, hi, hello, howdy);
// bar : "baz" : hi, hello, howdy;
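The expansion can also be inspected from inside a program instead of with gcc -E, by stringizing the result. The sketch below is one way to do it, not the only one: STR, XSTR and greetings_expansion are hypothetical helper names. The two macro levels matter, because an argument that is the operand of # is not macro-expanded first; XSTR lets its argument expand fully before STR turns it into a string. Both helpers are variadic because the expanded text contains commas.

```c
#define foo bar
#define greetings(x, y, z, ...) \
  x ## y : # z : __VA_ARGS__

/* STR stringizes its arguments exactly as written. */
#define STR(...) #__VA_ARGS__
/* XSTR's argument is not an operand of #, so it is fully
   macro-expanded before being handed on to STR. */
#define XSTR(...) STR (__VA_ARGS__)

/* Returns the text that greetings (...) expands to. */
const char *greetings_expansion (void)
{
  return XSTR (greetings (f, oo, baz, hi, hello, howdy));
}
```

The returned string starts with bar rather than foo, which shows that the pasted token foo was rescanned and replaced by the object-like macro, and it contains "baz" with its quotes, the product of # z.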

Footnotes:

[1] C - Project status and milestones on open-std.org: https://open-std.org/jtc1/sc22/wg14/www/projects.html

[2] Technically, n1256.pdf is not the official standard document; it is merely a draft.