It is important to understand how the symbols & and * are used in C. As a unary operator, the symbol & is the address-of operator (as a binary operator it is the bitwise AND operator). The address-of operator may only be applied to something that occupies a storage cell, such as a variable or an array element. The resulting value is the address of the storage cell associated with that variable. If the type of the variable is X, the type of the address-of operation is pointer to X.
In expressions, the symbol * as a unary operator means pointer indirection (as a binary operator it means multiplication). The indirection operator can be applied to any expression whose data type is pointer to X. The result of applying the indirection operator is the value of type X in the storage cell that the pointer operand points to.
In a declaration statement, the symbol * adds "pointer to" to the type the declaration refers to. Note that the symbol & is NOT used in declarations.
A generic pointer (a pointer whose type matches any other pointer) is declared as follows.
void * ptr;
struct nn {
    int   ii;          ss:  +-------+-------+-------+
    char  cc;               |ii:    |cc:    |ff:    |
    float ff;               +-------+-------+-------+
} ss;

struct nn s2;          s2:  +-------+-------+-------+
                            |ii:    |cc:    |ff:    |
                            +-------+-------+-------+

int a1[3];             a1:  ******   +-------+-------+-------+
                            * --*->  |0:     |1:     |2:     |
                            ******   +-------+-------+-------+

struct {
    int i1;            a2:  +-------+-------+-------+
    int i2;                 |i1:    |i2:    |i3:    |
    int i3;                 +-------+-------+-------+
} a2;

int a2d[2][3];         a2d: ******   ******   +-------+-------+-------+
                            * --*->  * --*->  |0:     |1:     |2:     |
                            ******   ******   +-------+-------+-------+
                                     * --*->  |0:     |1:     |2:     |
                                     ******   +-------+-------+-------+

Assuming an int is 4 bytes, a char is 1 byte, and a float is 4 bytes, ss is a 9-byte storage cell containing 3 elements. (Actually, most compilers will allocate more space for ss, leaving unused space after the char so that each element in ss begins on a suitably aligned memory address, typically a multiple of 4 for an int or a float.) The optional identifier nn defines a structure type. Additional struct variables can be defined using just the struct identifier.
The array declaration creates two separate entities. First, there is an array of 3 int elements, which occupies 12 bytes of memory. The identifier a1 is NOT the array itself; it is a pointer value containing the address of the actual array. In this document, this pointer value will be referred to as an array reference. Furthermore, a1 is not a normal storage cell; it is a compile-time constant. As a consequence, a1 may not be assigned a value, and memory does not have to be allocated for it (although memory is allocated for the array itself). Since a1 is a constant, some compilers flag an attempt to take its address as an error, just as taking the address of the constant 5 would be an error. Other compilers, following standard C, allow the address-of operator to be applied to an array reference constant; the result holds the same address as a1 itself, although its type is pointer to array of 3 int rather than pointer to int. IMHO, taking the address of an array reference constant should be flagged at compile time as an error.
Notice how collections a1 and a2 are treated differently even though they are both collections of 3 integers. Identifier a1 is a pointer (constant), an array reference. On the other hand, a2 is the 12-byte collection itself. Array identifiers are treated differently from all other identifiers, and the inconsistency causes much confusion. Whenever arrays are involved, programmers inexperienced in C should take the time to carefully draw a diagram of the storage cells, their relationships, and their data types.
Multi-dimensional arrays are implemented by creating arrays of arrays. In the example above, a2d and a2d[0] contain the same address, that of a2d[0][0].
int aa[10];       | int aa[10];       |
int *p1;          | int *p1;          |
int *p2;          | int *p2;          |
int ii;           | int ii;           |
p1 = &ii;         | p1 = &ii;         |
p2 = aa;          | p2 = aa;          |
aa[5] = 20;       | *(aa+5) = 20;     |
p2[7] = 30;       | *(p2+7) = 30;     |
*p1 = 7;          | p1[0] = 7;        |
Perhaps another way of thinking about the programs above is that each has TWO arrays of integers. The first array contains ten integers and is pointed to by aa. The second array is an array of one integer and is pointed to by the expression &ii. Although there is no variable that stores the ten-integer array, the variable ii does contain the one-integer array.
Consider the program spat1.c. Draw a diagram of the variables in the program. Determine the data type of all the values to be printed. What errors/warnings would you expect when the program is compiled? If all errors are removed, what output would you expect from the corrected program?
Given the similarities between array references and pointers, how does one make sense of them?
int ii;
int aa[10];
for (ii=0; ii<10; ii++)
    aa[ii] = ii;

typedef struct nn {
    int value;
    struct nn *next;
} node;
node *top;
node *pp;
pp = top;
while (pp != NULL) {
    printf ("%d\n", (*pp).value);
    pp = (*pp).next;
}

int a1[10];
int a2[10];
int ii;
// Code that assigns values to a1.
for (ii=0; ii<10; ii++)
    a2[ii] = a1[ii];
// Note: a2 = a1 would not work, a2 is a constant! Even if a2 was
// not a constant, you would be assigning the pointer, not the array.

int quiz[30][10];    // 30 students, 10 quiz grades each
int *best;           // A dynamically assigned array reference.
int ii;
// Set best to grades of student with best grade on quiz 0.
best = quiz[0];
for (ii=1; ii<30; ii++)
    if (quiz[ii][0] > best[0])
        best = quiz[ii];
// Note best is an alias for some quiz[ii], so if best[1]
// got changed, quiz[ii][1] would also change.
int *aa;     // Really an array reference, but the pointer value is assigned dynamically.
int size;
int ii;
// Code that sets size.
aa = malloc (size * sizeof(int));    // Returns a pointer to size int-sized storage cells.
for (ii=0; ii<size; ii++) ...

In multi-dimensional arrays where the second dimension is variable, a pointer must be used. Consider the following example, where we need a 2-d array to store 3 types of grades (homework, quizzes, attendance) and there are a different number of scores for each type of grade.
int *(grade[3]);
grade[0] = malloc (HWS_CNT * sizeof(int));
grade[1] = malloc (QUIZ_CNT * sizeof(int));
grade[2] = malloc (ATT_CNT * sizeof(int));
typedef int age;
typedef char line[80];

After the definitions above, the following are equivalent.
int v1; | age v1; |
int *v2; | age *v2; |
char v3[80]; | line v3; |
char v4[10][80]; | line v4[10]; |
char (*v5)[80]; | line *v5; |
void ff(int arg[])
void ff(int *arg)

Perhaps the former is more readable, but the latter is more commonly used and more accurately describes the alias nature.
Note that just as in Java, if the contents of an array formal parameter are changed in a function, the contents of the array actual argument will also be changed. One can prevent that from happening by using const in the declaration.
void ff(int const arg[])
void ff(int const * arg)
If a 2-dimensional array is passed into a function, it is necessary to specify all but the first dimension in the declaration. If the parameter is a 10x30 array, all of the following are acceptable.
void ff(int arg[][30])
void ff(int arg[10][30])
void ff(int (*arg)[30])

The following are not acceptable because, for the compiler to compute the location of arg[ii] for an index value other than 0, the compiler must know how many elements are in each arg[ii]. The following declarations do not provide that information.
void ff(int *(arg[10]))
void ff(int **arg)
void ff(int arg[][])
char *s1;
char *s2;
char s3[9];
int const s3_SZ = (sizeof(s3) / sizeof(char));
s1 = "abc";
s2 = a2;
strcpy(s3, s1);               // Careful, what if s1 is bigger than s3
strncpy(s3, s2, 9);           // Better
strncpy(s3, s2, s3_SZ);       // Best
strncpy(s3, "123", s3_SZ);

Consider the second parameter in main. It is an array of strings. But since a string is an array of characters, that means the parameter is a 2d array of characters. However, it cannot be declared as char args[][] because each string is a different size. As was discussed above, when the second dimension in a 2d array is dynamic, the following declaration is required: char *(args[]) . Since a parameter is being declared, the following declaration is equivalent: char **args .
See string(3) for useful string functions.
int strcmp(char const *str1, char const *str2);
char *strcpy(char *str1, char const *str2);
size_t strlen(char const *str);
char *strcat(char *str1, char const *str2);