Linear Algebra Project in C, continued#

Today#

Linear Algebra, testing, debugging, continued#

There’s a new assignment (see above) for today’s class. Today’s work really is just a continuation of last class, so it somewhat bothers me to start over in a new assignment repo, as that isn’t how things would work in real life, where one continues to work on the same code in the same repo. Nevertheless, I think this is the best way to keep everyone on the same page, so I completed the work from last class, and made it the starter code for today.

Supporting arbitrary vector length#

In the homework, I said:

  • Start thinking about how to make this library more generic, in particular, how to avoid having it work only with the hardcoded vector size N = 3.

So that’s what I’ll work on now. You can follow this in this PR, where I pushed my commits, and if github classroom works the way I’m hoping it will, a pull request will be created in your assignment repo, which you can merge. This will (fingers crossed) pull in those changes into your assignment repo, so you’ll have those commits to work on top of later. (Alternatively, you could choose to repeat this work yourself.)

My goal is to demonstrate how to do incremental changes, while keeping the code working (that’s what the tests are helpful for), which eventually lead to having a more generic collection of linear algebra functions.

What I want, for now, is to have a vector_dot() function that doesn’t depend on the global N vector length constant. Instead, I’ll make that function work for a vector length of n, which is not a hard-coded number. It not being hard-coded means I don’t know its value ahead of time, so I’ll need to pass it in as an additional argument.

First, I actually just changed the function to do its work based on n:

diff --git a/class9/c/vector_dot.c b/class9/c/vector_dot.c
index 5e7c9b7..5b565d6 100644
--- a/class9/c/vector_dot.c
+++ b/class9/c/vector_dot.c
@@ -10,8 +10,10 @@

 double vector_dot(const double* x, const double* y)
 {
+  int n = N;
+
   double sum = 0.f;
-  for (int i = 0; i < N; i++) {
+  for (int i = 0; i < n; i++) {
     sum += x[i] * y[i];
   }
   return sum;

That’s not a lot of progress, since the function still uses the hard-coded N, but only in the beginning/setup.

The logical next step is then to pass in int n as an argument, so we truly don’t depend on the hard-coded N anymore:

diff --git a/class9/c/vector_dot.c b/class9/c/vector_dot.c
index 5b565d6..8b65202 100644
--- a/class9/c/vector_dot.c
+++ b/class9/c/vector_dot.c
@@ -8,10 +8,8 @@
 // x: first vector
 // y: second vector

-double vector_dot(const double* x, const double* y)
+double vector_dot(int n, const double* x, const double* y)
 {
-  int n = N;
-
   double sum = 0.f;
   for (int i = 0; i < n; i++) {
     sum += x[i] * y[i];

The issue here is that now my code doesn’t compile (or work) anymore. But I actually like to work this way, since the compiler is now going to tell me what needs to be fixed. After making the compiler happy again, I now have the full commit:

diff --git a/class9/c/linear_algebra.h b/class9/c/linear_algebra.h
index 40fc593..9ae8ef1 100644
--- a/class9/c/linear_algebra.h
+++ b/class9/c/linear_algebra.h
@@ -6,7 +6,7 @@

 #define N (3)

-double vector_dot(const double* x, const double* y);
+double vector_dot(int n, const double* x, const double* y);
 void vector_add(const double* x, const double* y, double* z);
 void matrix_vector_mul(const double A[N][N], const double* x, double* y);


diff --git a/class9/c/test_vector_dot.c b/class9/c/test_vector_dot.c
index df309b9..6ff8528 100644
--- a/class9/c/test_vector_dot.c
+++ b/class9/c/test_vector_dot.c
@@ -13,7 +13,7 @@ int main(int argc, char** argv)
   double x[N] = {1., 2., 3.};
   double y[N] = {2., 3., 4.};

-  assert(vector_dot(x, y) == 20.);
+  assert(vector_dot(N, x, y) == 20.);

   return 0;
 }
\ No newline at end of file
diff --git a/class9/c/vector_dot.c b/class9/c/vector_dot.c
index 5b565d6..8b65202 100644
--- a/class9/c/vector_dot.c
+++ b/class9/c/vector_dot.c
@@ -8,10 +8,8 @@
 // x: first vector
 // y: second vector

-double vector_dot(const double* x, const double* y)
+double vector_dot(int n, const double* x, const double* y)
 {
-  int n = N;
-
   double sum = 0.f;
   for (int i = 0; i < n; i++) {
     sum += x[i] * y[i];

It’s a bit awkward (but really, there is just no non-awkward way of handling this in C), as normally a vector is a thing that consists of a bunch of numbers, but also knows how many numbers are in it. We can make “objects” that aggregate multiple pieces of information in C using a struct, which we’ll do after catching up on some more bits and pieces.

Dynamic memory allocation#

After some more work to do the same arbitrary vector length changes for the remaining functions, I also decided to allocate the arrays dynamically That is another rather useful feature (but also complication) – if the vector length isn’t hard-coded, the compiler (at compile time) doesn’t know how much memory to reserve to store all the vector elements, so that memory needs to be allocated at run time, when the length is known. C has functions like malloc, calloc and free to deal with this.

Introducing struct vector#

Now it’s time to act on the realization that a vector does not only store its elements, but it also knows how many elements there are. That is a distinction to what an “array” is in C, in particular after it, as usual, decays to a pointer – that lets you read / write the array element values, but it doesn’t have any information about how many there are.

I introduced a struct vector and rewrote the vector_dot function to take two struct vectors in this and the following commit. It’s not all that pretty, in particular all the ampersands, but at least it works, and the structure of the code makes sense (to me, anyway).

There’s repeated code that initializes and cleans up a vector, so I figured that’s a good reason to have that done in helper functions, too:

Finally, in this commit I also introduced a VEC macro to make accessing these vectors look a bit less ugly (maybe – but there are other advantages to doing this, as we’ll see later).

Your turn#

  • Following the example of vector_dot() and its test, adopt vector_add() and its test to use the new struct vector way of doing things.

  • More challenging: Introduce a struct matrix and convert matrix_vector_mul() to be based on struct matrix and struct vector.

Homework#

  • As usual, complete the work above, and write down your thoughts along the way.

  • Look through the Topics for this class. Edit the wiki page to add a “+” to any topic you think is interesting and would like to learn more about, and add a “-” to any topic you think is not interesting or that you already know well. Feel free to add any topics that are currently missing that you think would be interesting to learn about.