Print Environment Variables

2025-03-11, Tue

Environment variables can be printed with printenv(1) and env(1) on Unix. This article is not about them; it is about environ(7) and the NULL pointer that ends the string array. The Debian manual page says that "The last pointer in this array has the value NULL", which makes traversing the environment variables either handy or tricky, depending on how the array is processed.

1. environ as string array

The canonical way to iterate over environ is no different from iterating over argv. Remember that argv[argc] is also NULL, just as the environ array ends with NULL.

#include <stdio.h>

void
print_str_array(char *desc, char **str_array)
{
  if (NULL != desc) fprintf(stdout, "%s\n", desc);
  char **str = str_array;
  while(*str != NULL) {
    fprintf(stdout, "%s\n", *str++);
  }
}

int
main(int argc, char *argv[])
{
  extern char **environ;
  print_str_array("====argv====", argv);
  print_str_array("====environ====", environ);
}

2. environ as char array

Instead of iterating over the string array through char **str, what if we do it through char *ch? Things get interesting when we want to advance through the array and detect its end. Namely:

  • We need to advance in steps of strlen(ch) + 1
  • We need to detect the end by comparing *ch with 0 (the NUL character), not by comparing the pointer with NULL

Now we have:

#include <stdio.h>
#include <string.h>

int
main(void)
{
  extern char **environ;
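  char *ch;
  /* Assumes the strings are stored back to back and followed by an empty
     string. Neither C nor POSIX guarantees this layout. */
  for (ch = *environ;
       *ch != 0; /* (NULL != ch) will not work */
       ch += strlen(ch) + 1) {
    fprintf(stdout, "%s\n", ch);
  }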
}

The awkward *ch != 0 test is there because we are manipulating chars (bytes) directly. If you are confused, picture the memory layout in your head and think about how the program interprets each variable. Given the same bytes in an array, the result can be quite different when we traverse them through different pointer types, e.g. (char **), (char *), or even (int *), (union some_random_type *), etc.

Take a byte holding 0x00 as an example. Through char *pc, dereferencing reads that single byte as a char value. Through char **ps, dereferencing reads sizeof(char *) bytes starting at that address and treats them as a char * value. In short, how we inspect memory determines what we see: the pointer type does not change what is stored, only how that storage is interpreted.

The following sections demonstrate how this mechanism works.

3. Other scenarios

Consider the following chunk of memory, where each byte is shown as its ASCII character and NUL is written as \0:

index  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
byte   f  o  o  =  b  a  r \0  n  a  m  e  =  m  e  c \0 \0

#include <stdio.h>
#include <string.h>

#define STR1 "foo=bar"
#define STR2 "name=mec"

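/* Total size of both strings, each with its terminating NUL. The enum makes
   this a constant expression, so buf is a plain array rather than a VLA. */
enum { MAXINDEX = sizeof(STR1) + sizeof(STR2) };

int
main(void)
{
  char buf[MAXINDEX + 1]; /* one extra byte for the final empty string */
  snprintf(&buf[0], sizeof(STR1), "%s", STR1);
  snprintf(&buf[sizeof(STR1)], sizeof(STR2), "%s", STR2);
  snprintf(&buf[sizeof(STR1) + sizeof(STR2)], 1 /* sizeof empty string */, "%s", "");
}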

Now we use three pointers to iterate this byte array, and see what happens.

  • char *pc
  • char **ps
  • int *pi

3.1. char *

First, it's (char *):

char *pc = buf;
for (int i = 0; i <= MAXINDEX; ++i) {
  fprintf(stdout, "%x ", *pc++);
}
// 66 6f 6f 3d 62 61 72 0 6e 61 6d 65 3d 6d 65 63 0 0 

The output here is the ASCII code of each byte in hexadecimal form.

3.2. char **

Next is (char **):

char **ps = (char**)buf;
char str2[sizeof(STR2)];
memcpy(str2, ps + 1, sizeof(STR2));
fprintf(stdout, "str2: %s\n", str2);
// str2: name=mec

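As shown above, ps + 1 lands on the second string, but only by coincidence. Pointer arithmetic on a char ** advances by sizeof(char *) bytes, which is 8 on a typical 64-bit system such as the one used here, and that happens to equal sizeof(STR1). With 4-byte pointers the same expression would point at "bar" instead. Note also that ps is only used for address arithmetic; dereferencing it would read the string bytes as if they were an address.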

3.3. int *

Last, let's try to interpret the byte array as numbers:

int *pi = (int *) buf;
for (int i = 0; i < MAXINDEX / sizeof(int); ++i) {
  fprintf(stdout, "%x, ", *pi++);
}
// 3d6f6f66, 726162, 656d616e, 63656d3d, 

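The remaining two bytes are ignored for simplicity. My system is little-endian, so the byte at the lowest address is the least significant byte of each int. Printing an int in hex therefore shows its bytes in reverse order; nothing is moved in memory. Strictly speaking, reading a char array through an int * is not portable C because of alignment and aliasing rules; it merely happens to work with GCC on this machine. For example: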

chars (bytes)    ints
66 6f 6f 3d      3d 6f 6f 66
62 61 72 00      00 72 61 62
6e 61 6d 65      65 6d 61 6e
3d 6d 65 63      63 65 6d 3d

I guess these three scenarios can be summarized as: one piece of data, many interpretations.

4. Conclusion

Key Takeaways:

  • What a pointer yields when dereferenced depends on its type. It is perfectly normal for pointers of different types to point to the same address and yield different values when dereferenced.
  • The compiler does a lot of work behind the scenes, such as scaling pointer arithmetic by the size of the pointed-to type.
  • Delay dereferencing a pointer so that pointer arithmetic can do the work.
  • When the code feels awkward, check whether you are processing memory through the appropriate type.
  • Regarding data storage, data presentation, and how data is retrieved and manipulated: in some sense, the relationship is not much different from the one between Data Structure and Algorithm.
  • When approaching a problem, check what the data is and how we use it.