Print Environment Variables
2025-03-11, Tue
Environment variables can be printed with printenv(1) and env(1) on Unix.
This article is not about them; it is about environ(7) and the NULL pointer
that ends the string array. The Debian manual says that "The last pointer in
this array has the value NULL", which means traversing the environment
variables can be either handy or tricky, depending on how the array is
processed.
1. environ as string array
The canonical way to iterate over environ is no different from argv.
Remember that argv[argc] is also NULL, just as environ ends with NULL.
#include <stdio.h>

void print_str_array(char *desc, char **str_array)
{
    if (NULL != desc)
        fprintf(stdout, "%s\n", desc);

    char **str = str_array;
    while (*str != NULL) {
        fprintf(stdout, "%s\n", *str++);
    }
}

int main(int argc, char *argv[])
{
    extern char **environ;

    print_str_array("====argv====", argv);
    print_str_array("====environ====", environ);
}
2. environ as char array
Instead of iterating over the string array through char **str, what if it
is done through char *ch? Things soon become interesting when we want to
advance through the array and check for the terminating NUL character.
Namely:
- We need to move forward in steps of strlen(ch) + 1
- We need to detect the end by comparing *ch with 0
Now we have:
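#include <stdio.h>
#include <string.h>

int main(void)
{
    extern char **environ;
    char *ch;

    /* Relies on the strings lying back to back in memory, which is how they
     * are typically laid out at process start on Linux; POSIX does not
     * guarantee it. */
    for (ch = *environ;
         *ch != 0; /* (NULL != ch) will not work */
         ch += strlen(ch) + 1) {
        fprintf(stdout, "%s\n", ch);
    }
}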
This awkward *ch != 0 comes into play because we're manipulating chars
(bytes) directly here. If you're confused, picture the memory layout in your
head and think about how variables are interpreted in the program. Given the
same content in an array of bytes, the result can be quite different when we
traverse those bytes through different pointer types, e.g. (char **),
(char *), or even (int *), (union some_random_type *), etc.
Take one byte holding the value 0x00 as an example. When we point to this
byte through char *pc, *pc reads it as a single char value. On the other
hand, if we point to this byte through char **ps, *ps reads sizeof(char *)
bytes starting at that address and treats them as a char * value.
In summary, how we inspect memory determines what we see. It doesn't
affect what's stored in memory, but it does affect how that storage is
interpreted. The following sections demonstrate how this mechanism works.
3. Other scenarios
Consider the following memory chunk, where each byte's value is shown as an
ASCII character and NUL is written as \0:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
f | o | o | = | b | a | r | \0 | n | a | m | e | = | m | e | c | \0 | \0 |
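#include <stdio.h>
#include <string.h>

#define STR1 "foo=bar"
#define STR2 "name=mec"

/* Index of the last byte: both strings with their NULs, plus one extra NUL. */
const int MAXINDEX = sizeof(STR1) + sizeof(STR2);

int main(void)
{
    char buf[MAXINDEX + 1];

    /* Lay the two strings out back to back, each with its terminating NUL. */
    snprintf(&buf[0], sizeof(STR1), STR1);
    snprintf(&buf[sizeof(STR1)], sizeof(STR2), STR2);
    /* An empty string marks the end of the chunk. */
    snprintf(&buf[sizeof(STR1) + sizeof(STR2)], 1 /* sizeof empty string */, "");
}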
Now we use three pointers to iterate over this byte array and see what happens:
- char *pc
- char **ps
- int *pi
3.1. char *
First, it's (char *):
char *pc = buf;
for (int i = 0; i <= MAXINDEX; ++i) {
    fprintf(stdout, "%x ", *pc++);
}
// 66 6f 6f 3d 62 61 72 0 6e 61 6d 65 3d 6d 65 63 0 0
The output here is the ASCII code of each byte in hexadecimal form.
3.2. char **
Next is (char **):
char **ps = (char **)buf;
char str2[sizeof(STR2)];
memcpy(str2, ps + 1, sizeof(STR2));
fprintf(stdout, "str2: %s\n", str2);
// str2: name=mec
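As shown above, the pointer arithmetic done by GCC locates the address of the second string: ps + 1 advances by sizeof(char *) bytes, which is 8 on my 64-bit system and happens to equal sizeof(STR1). With a different pointer size, or a first string of another length, ps + 1 would land somewhere else.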
3.3. int *
Last, let's try to interpret the byte array as numbers:
int *pi = (int *)buf;
for (int i = 0; i < MAXINDEX / sizeof(int); ++i) {
    fprintf(stdout, "%x, ", *pi++);
}
// 3d6f6f66, 726162, 656d616e, 63656d3d,
The remaining two bytes are ignored for simplicity. Because my current system is little-endian, the least significant byte of each int is stored first, so each group of four bytes appears in reverse order when it is printed as a number, e.g.
chars (bytes) | ints |
---|---|
66 6f 6f 3d | 3d 6f 6f 66 |
62 61 72 00 | 00 72 61 62 |
6e 61 6d 65 | 65 6d 61 6e |
3d 6d 65 63 | 63 65 6d 3d |
I guess these three scenarios could be summarized as: one piece of data, many interpretations.
4. Conclusion
Key Takeaways:
- How a pointer interprets its value when it is dereferenced depends on the type of the pointer. It is totally normal for pointers of different types to point to the same address yet return different values when dereferenced.
- The compiler is doing a lot of work behind the scenes, such as scaling pointer arithmetic by the size of the pointed-to type.
- Delay dereferencing a pointer so that pointer arithmetic can be leveraged.
- When feeling awkward, check whether you're processing memory with the appropriate type.
- Data storage, data presentation, and how data is retrieved and manipulated: in some sense, this is not much different from the relationship between data structures and algorithms.
- When approaching a problem, check what the data is and how we use it.