Main

Week 2

Today we will continue working with C, focusing on dynamic memory allocation, and loops. Lecture codes are in the Git repository

Future lecture code will continue to be added to the above Bitbucket repository, so you should be able to just git pull after you clone it the first time. te

1. Allocating memory and freeing it

  • Anatomy of malloc
  • Separate declaration and allocation
  • When should we free memory

2. Multidimensional arrays

int matrix[2][4];   // A 2-D array of integers

3. Pointers

  1. int **matrix;
  2. const int numRows = 2, numColumns = 4;
  3. int i;
  4.  
  5. matrix = (int **) malloc ( numRows * sizeof(int *) );
  6. for (i = 0; i < numRows; i++) {
  7.   matrix[i] = (int *) malloc ( numColumns * sizeof(int) );
  8. }

After executing the above code, the dynamically allocated memory can be loosely represented as follows:

Note: For operator precedence, consult this table. When in doubt, use parentheses or look it up!

When you use malloc, you simply "reserve" chunks of memory and keep track of their location. You still don't have any values stored in the memory, i.e., it is uninitialized.

To initialize all values to something reasonable, for example, 0 in the case of an integer array, you can change the above code as follows.

  1. int **matrix;
  2. const int numRows = 2, numColumns = 4;
  3. int i;
  4.  
  5. matrix = (int **) malloc ( numRows * sizeof(int *) );
  6. for (i = 0; i < numRows; i++) {
  7.   matrix[i] = (int *) malloc ( numColumns * sizeof(int) );
  8.   memset(matrix[i], 0, numColumns * sizeof(int)); // set all values to 0
  9. }

You can also do it without memset, but it can be less efficient:

  1. int **matrix;
  2. const int numRows = 2, numColumns = 4;
  3. int i, j;
  4.  
  5. matrix = (int **) malloc ( numRows * sizeof(int *) );
  6. for (i = 0; i < numRows; i++) {
  7.   matrix[i] = (int *) malloc ( numColumns * sizeof(int) );
  8.   for (j = 0; j < numColumns; j++) {
  9.     matrix[i][j] = 0;
  10.   }
  11. }

Now all the values in the 2x4 matrix array are initialized to 0:


Next, we will focus on function arguments and continue working with pointers. We will also introduce some simple Bash commands and take a look at a Makefile.

4. Function arguments -- passing by value vs passing by reference.

Suppose we are required to provide a function that doesn't return a value (void) and which takes a single integer argument and adds 1 to it.

First version:

  1. #include <stdio.h>
  2. void increment (int x) {
  3.     x++;
  4. }
  5. int main() {
  6.     int y = 2;
  7.     printf("y = %d\n", y);
  8.     increment (y);
  9.     printf("After increment: y = %d\n", y);
  10.     return 0;
  11. }

This compiles and runs, but the value of y does not change because it is passed by value, i.e., its value, 2, is copied into x, leaving y unmodified. In C, function parameters are always passed by value. Pass-by-reference is simulated in C by explicitly passing pointer values.

Another try:

  1. #include <stdio.h>
  2. void increment (int *x) {
  3.     (*x)++;
  4. }
  5. int main() {
  6.     int y = 2;
  7.     printf("y = %d\n", y);
  8.     increment (&y);
  9.     printf("After increment: y = %d\n", y);
  10.     return 0;
  11. }

This time y is incremented because instead of copying its value (2) to the increment function's argument x, we simulated passing it by reference, i.e., we passed the address of y, so that inside the increment function, the value of x is the address of y and it can be used to modify whatever memory location x points to (in this case, y) by dereferencing it with *x.

5. C typedefs

In this exercise, we define the struct Point datatype, which stores two integer coordinates, x and y. Then we implement a function to "move" the point by adding an integer distance to both the x and y fields of the struct.

  1. struct Point {
  2.     int x;
  3.     int y;
  4. };

Every time we want to use it to declare a variable or a function argument, we have to specify the type as struct Point. Why not just Point?

In C (and C++), users can specify new types by using typedef. These types are called user-defined types, as opposed to the basic types provided by the language.

We can declare a type for the Point structure in one of the following ways.

  1. typedef struct {
  2.   int x, y;
  3. } Point;

Or the following syntax would also work:

  1. struct PointStruct {
  2.   int x, y;
  3. };
  4. typedef struct PointStruct Point;

Now we no longer use "struct" when declaring variables of type Point,

  1. Point x;
  2. void movePoint_wrong(Point p, const int dist);
  3. void movePoint(Point *p, const int dist);
  4. void printPoint(Point p);
  5. void printPoint2(const Point *p);

Passing more complex data types as arguments.

As you recall, a C struct consists of a list of fields, each of which can have almost any type. The total memory required for a struct is the sum of the storage requirements of all the fields, plus any internal padding. When you pass a struct by value, you are copying its contents, which can be inefficient for very large structs. Hence, when you have read-only function arguments of type struct, you should consider passing them through a const * instead of passing the entire struct (so only the address is copied rather than the entire contents of the struct).

See the complete implementation, including a Makefile in the week2/norris subdirectory of the lecture exercises repository.

C also has enum types (similar to Python and Java) and union types. Here is an example of a union type used inside a struct, which contains a somewhat arbitrary collection of fields, just for illustration purposes (and also assumes that the struct Point type is defined).

  1. struct TableItem {
  2.     int row;
  3.     char label[20];
  4.     union {
  5.         char c;
  6.         int i;
  7.         char *s;
  8.         double d;
  9.         struct Point *p;
  10.     } value;
  11. };
What this means is that you can define a variable of type TableItem and store values that match any of the union components. For example
  1. TableItem t;
  2. t.row = 0;
  3. strcpy(t.label, "An integer");
  4. t.value.i = 100;

7. C function pointers

So far we have only considered pointers to variables. In C you can point to functions, too, for example, fptr below is a pointer to a function. You can assign to it any function that has a matching prototype (return type and list of arguments):

double (*fptr) (double x[]);

You must initialize the pointer (i.e., assign a function to it) before you can call it. For example:

  1. double (*fptr) (double x[]);
  2.  
  3. double computeAverage(double y[]) {
  4.    //...
  5. }
  6.  
  7. int main() {
  8.   double x = {1.0,2.0,3.0};
  9.   double avg = 0.0;
  10.  
  11.   fptr = &computeAverage;
  12.   avg = fptr(x);
  13.   printf("Average: %f\n", avg);
  14. }

Why are function pointers useful? They can be used to enable callbacks -- i.e., have a certain function be called when some event occurs, or to pass functions as arguments of functions, enabling more flexible implementations.

Here is an example of an array of function pointers.

Many more examples of function pointers can be found here.

8. Separate compilation and makefiles (a very initial look, much more later)

  • A nice introduction to C header files and separate compilation can be found here.
  1. # Define some variables:
  2. SOURCES = clock.c timer.c     # could also use patterns $(wildcard *.c)
  3. HEADERS = clock.h timer.h
  4. OBJECTS = $(SOURCES:.c=.o)
  5. LIBS = -lm
  6.  
  7. CC = gcc
  8. CFLAGS = -std=c11 -g
  9.  
  10. # Next, we have the rules defining how the action corresponding to each target
  11. # should be performed.
  12. #
  13. # Anatomy of a rule:
  14. # Target : Dependencies
  15. #    Bash commands (must be indented using TAB)
  16.  
  17. %.o: %.c
  18.     $(CC) -c $(CFLAGS) $<
  19.  
  20. tests: clock.exe timer.exe
  21.  
  22. clock.exe: testClock.o clock.o
  23.     $(CC) $(CFLAGS) -o $@ $<
  24.  
  25. timer.exe: testTimer.o timer.o
  26.     $(CC) $(CFLAGS) -o $@ $<
  27.  
  28. # ... some targets omitted ...
  29.  
  30. memcheckClock: clock.exe
  31.         valgrind --leak-check=yes --track-origins=yes clock.exe
  32.  
  33. clean:
  34.     $(RM) $(OBJECTS) *.exe

Memory bugs

When managing your own memory, it is not uncommon to make mistakes which result in runtime problems. That is, your code will compile without any errors or warnings, but it will not behave correctly at runtime. In the best case, it will crash with some runtime error. In the worst case, you will get an answer, but it may not be correct, or somebody can exploit your bug to get access to privileged data.

One of the most commonly available tools for checking for dynamic memory problems in your code is Valgrind. You should start learning about it here or by using this little tutorial, and then refer to the rest of the documentation as needed for more options and debugging strategies.

Green Marinee theme adapted by David Gilbert, powered by PmWiki