WordPress & Full-stack development

I'm following the online Harvard CS50 course and I'm making my notes public!

David opens the lecture by telling us that the Hollywood "enhance" trick doesn't work in real life. WHAT??

Together with some volunteers, he prepared some pixel art to demonstrate that there's only so much you can do with dots on a screen. Resolution refers to how many dots (pixels) you have on the screen and if you have more pixels, the image will look clearer and less 'blocky'.

Hexadecimal

Now we see a grid of 0's and 1's, that represent pixels. 0 is black, 1 is white. But how do we store colors? We use RGB (some amount of red, some amount of green, some amount of blue). There is a standard notation to represent colors: hexadecimal. Black is #000000. The first two digits are red, the middle two digits are green, the last two digits are blue. So green would be #00FF00. Blue would be #0000FF.

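With hexadecimal, we use a combination of numbers and letters: 0 1 2 3 4 5 6 7 8 9 A B C D E F. Hexadecimal is otherwise known as base-16. So in a two-digit hexadecimal number you have two columns: the left one counts 16^1 (sixteens) and the right one counts 16^0 (ones). FF is therefore 15 × 16 + 15 = 255.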

  • 1 -> 01
  • 2 -> 02
  • 3 -> 03
  • 4 -> 04
  • 5 -> 05
  • 6 -> 06
  • 7 -> 07
  • 8 -> 08
  • 9 -> 09
  • 10 -> 0A
  • 11 -> 0B
  • 12 -> 0C
  • 13 -> 0D
  • 14 -> 0E
  • 15 -> 0F
  • 16 -> 10
  • ...
  • 255 -> FF

Memory

We can think of memory like a grid of locations, and each byte has an address. Computers typically use hexadecimal notation to number the bytes in their memory.

To make it clear that we're using hexadecimal notation, we prefix the number with 0x (zero x). The prefix doesn't change the value, it's just a heads-up that what follows is a hexadecimal number.

When we declare a variable in C, the computer puts it somewhere in memory. For example, int n = 50 could end up at an address like 0x123.

In C, you can use some special syntax to show the memory location of a variable.

int n = 50;
printf("%p\n",&n);

Pointers

A pointer is the address of a variable, and you can store that address in another variable.

int n = 50;
int *p = &n;

printf("%p",p);

Here we declare a pointer p that stores the address of the variable n.

However, we can also 'visit' the address and output the content.

int n = 50;
int *p = &n; 

printf("%i",*p);

Pointers typically take up more space than an int: 8 bytes on a modern 64-bit system, versus 4 bytes for an int.

Strings

In week 1, we used code like this:

string s = "HI!";

This string is taking up 4 bytes because there is always a null character (\0) at the end to signal the end of the string.

Remember that a string is basically an array of characters. s[0] would give you 'H' and s[3] would give you the \0.

Each element of this array lives at its own address in memory. For example, s[0] could be stored at 0x123. Because array elements are adjacent, s[1] would then be at 0x124.

But s is actually a pointer. It points to the start of the char array (in this case 0x123). This way, the computer knows where the string starts, and where it ends (thanks to the null character).

So a string is basically a pointer that points to the start of the array of characters. This code will print out the start of the character array:

string s = "HI!";
printf("%p\n",s);

This prints the same address as the address of the first character:

string s = "HI!";
printf("%p\n",s);
printf("%p\n",&s[0]);

If you also request the addresses of the following items in the array, you'd see that the addresses are back to back.

With this knowledge, we're going to take off the training wheels and we'll see 'strings' as char stars because the string data type actually doesn't exist in C (now we also don't need the CS50.h library anymore).

char *s = "HI!";

So this means, that you are now not declaring a char, but the address of a char (and specifically the address of the first char).

char *s = "HI!";
printf("%s",s);

In C, you can easily create your own datatypes.

For example, C doesn't have a datatype for a byte, but it does have uint8_t (which means a datatype for 8 bits). You could create your own datatype 'BYTE':

typedef uint8_t BYTE;

So essentially, in the CS50 header file, there was this line of code that allowed us to work with 'strings':

typedef char *string;

A string is just the address of its first byte.

Pointer Arithmetic

Pointer arithmetic means doing math on addresses (like looping through adjacent addresses).

To print out the string 'HI!', you could run this code:

char *s = "HI!";
printf("%c",s[0]);
printf("%c",s[1]);
printf("%c\n",s[2]);

But you could also do math on it:

char *s = "HI!";
printf("%c",*s);
printf("%c",*(s+1));
printf("%c\n",*(s+2));

Comparing strings

Now that we know that strings actually don't exist, we can figure out why we can't compare strings the same way we compare integers.

int i = 50; 
int j = 50; 

if(i == j){
  printf("same");
} else {
  printf("Different");
}

This will print out "same"

char *i = get_string("i: ");
char *j = get_string("j: ");

if(i == j){
  printf("same");
} else {
  printf("Different");
}

This will print out "Different", even if you type the same text twice.

It can't compare strings, because this code actually compares the addresses stored in i and j, and each call to get_string stores its result at a different address.

To compare strings, we use strcmp (from string.h). This function does the heavy lifting of going through all the characters and comparing them. It returns 0 when the two strings are equal.

Printing out pointer addresses

You can print out the pointer address by using %p

char *s = get_string("s: ");

printf("%p",s);

Copying

Let's say that you have a string. You create a duplicate of that string because you want to capitalize the first letter.

char *s = get_string("s: ");
char *t = s;

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

This code prints the capitalized string twice, instead of the original once and the capitalized copy once like you intended.

This is because the variable t points to the same address as the variable s. So it doesn't matter whether you use s or t, because you change the same value (the character at the first position).

So how do we fix this?

We include stdlib.h so we can use malloc and free.

Malloc

malloc stands for memory allocation and it takes 1 argument (the number of bytes you want to allocate). In this case we take the length of the string that the user entered in the console, plus one byte for the null character (strlen returns the length for "humans" and doesn't count the null character).

Malloc will ask the computer for a place in the memory that's enough to store the requested data allocation, and the function will return the address (and specifically the address of the first byte).

char *s = get_string("s: ");
char *t = malloc(strlen(s) +1);

for (int i = 0, n = strlen(s); i <= n; i++){
  // Copy all the characters 
  // Make sure to run the loop while i is lower or equal, so you include the final null character as well 
  t[i] = s[i];
}

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

You should make it a habit to always check for 'NULL' and abort the program early if there is not enough memory available.

char *s = get_string("s: ");
char *t = malloc(strlen(s) +1);

if(t == NULL){
  // out of memory
  return 1;
}

for (int i = 0, n = strlen(s); i <= n; i++){
  // Copy all the characters 
  // Make sure to run the loop while i is lower or equal, so you include the final null character as well 
  t[i] = s[i];
}

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

You should actually do this with get_string as well, because it can return NULL if there is not enough memory for the string you entered.

char *s = get_string("s: ");

if (s == NULL){
  return 1;
}


char *t = malloc(strlen(s) +1);

if(t == NULL){
  // out of memory
  return 1;
}

for (int i = 0, n = strlen(s); i <= n; i++){
  // Copy all the characters 
  // Make sure to run the loop while i is lower or equal, so you include the final null character as well 
  t[i] = s[i];
}

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

Good news: we don't actually need to loop through s to copy it into t. There is a function called strcpy (from string.h) that we can use. It replaces the for loop and also copies the backslash zero.

char *s = get_string("s: ");

if (s == NULL){
  return 1;
}


char *t = malloc(strlen(s) +1);

if(t == NULL){
  // out of memory
  return 1;
}

strcpy(t,s);

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

Free

When using malloc, you also have to use free. Essentially, you give back the memory that you no longer use so it can be used by something else. This is one reason why rebooting your computer can make it faster: not every programmer remembers to give back memory, and a reboot resets it.

At the bottom of the program, we can free t.

char *s = get_string("s: ");

if (s == NULL){
  return 1;
}


char *t = malloc(strlen(s) +1);

if(t == NULL){
  // out of memory
  return 1;
}

strcpy(t,s);

t[0] = toupper(t[0]);

printf("%s\n",s);
printf("%s\n", t);

free(t);
return 0; // success

NULL

There is a difference between 'NUL' and 'NULL'.

NUL is the character \0 and simply means "the string ends here".

NULL is the address 0.

Your program will never store data at address 0. It's reserved (technically a wasted byte) so that functions have a special value to return when something went wrong.

Valgrind

Valgrind is a program to check your usage of memory.

This can help you find errors, like when you forget to use 'free'.

When you forget to use 'free' and you run Valgrind, it will report something like "12 bytes in 1 blocks are definitely lost", otherwise known as a memory leak. Valgrind doesn't know when you should free your memory, but it will give you the line where you called malloc.

int *x = malloc(3 * sizeof(int)); // you can use sizeof to reserve room that is three times the space of an integer 

x[0] = 72;
x[1] = 73;
x[2] = 33;

free(x); // Valgrind will report a memory leak if you forget to free the memory. 

If everything is alright, Valgrind will report "no leaks are possible"

So you use Valgrind to detect memory related bugs.

Garbage values

If you declare a variable but don't give it a value and you start using it (printing it out or doing math on it), you could be manipulating garbage values (remnants from the past, maybe memory that was used by a different program).

We declare a new array with 1024 positions, but we don't assign any values to it and print out the values. This results in garbage values.

In this program:

  • We declare two variables x and y, both of them pointers to an integer
  • We malloc the size of an int (to give x something to point to)
  • We go to the address in x and put 42 there
  • We go to the address in y and put 13 there. This is a bug: y was never given an address, so it holds a garbage value
  • We set y equal to x so that they point to the same address
  • Now when we assign 13 to *y, we overwrite the value 42 with 13.

Quick summary

This is how you create two pointers that point to an integer:

int *x;
int *y;

Initially, they don't point to anything. The things they point to are called pointees, and setting them up is a separate step.

This is how you allocate a pointee and point x to it.

x = malloc(sizeof(int));

Now we dereference the pointer x to store the integer 42 into its pointee.

Dereferencing a pointer means that you access the value stored at the memory address pointed to by that pointer.

*x = 42;

Now we set up y so that it points to the same address as x.

y = x;

Now they point to the same thing, and if you dereference y and store 13 there (*y = 13), the value seen through x changes as well.

Swapping

If we want to swap the values of two variables a and b, we need a third 'tmp' variable (just like how you need a third glass to swap the contents of two glasses).

So we write this void function:

void swap(int a, int b){
  int tmp = a;
  a = b;
  b = tmp; 
}

But this doesn't work when we call it in the main function and pass the arguments. Why? There is a scoping issue.

This is an example of 'passing by value' (or 'passing by copy') because you're passing X and Y into the swap function by value. The function is actually getting copies of the variables.

There are conventions in place to determine how computers can use the memory. Generally speaking (if you visualize the memory as a big rectangle), the top of your memory so to speak is for the machine code (the 0's and 1's that you compiled).

Below that is the room for the global variables. Then you have the heap where you have a lot of memory available (it grows downwards) and at the bottom you have the stack (this grows in the opposite direction from heap and grows up).

When you use malloc, you're using heap memory. When you use functions with variables and arguments, you're using stack memory.

For instance, take the moment we ran the swap function. The very first function (main) goes at the bottom of the computer's memory.

Then main calls swap, and this stack frame goes on top of the main stack frame. As soon as swap returns, that memory essentially goes away and gets removed from the stack.

So how can we fix this issue so that the swap function works? We can solve this by passing by reference. In that case we won't pass copies of the values, but we'll pass the addresses.

void swap(int *a, int *b){
  int tmp = *a;
  *a = *b;
  *b = tmp; 
}

You can read the star as "go to the address stored in ...". So we go to the address stored in a and copy the value we find there into tmp. Then we go to the address stored in a again and replace that value with the value at the address stored in b. And finally, we go to the address stored in b and replace that value with tmp.

When we call this function in main, we can't just pass the values of x and y; we need to pass their addresses by using the ampersand.

int x = 1;
int y = 2;

swap(&x,&y);

By using ampersand, we pass the address and not the value of the variable.

Overflow

The heap goes down, and the stack goes up. This can cause collisions. If you call many functions (maybe recursively where you pile up stack frames), or maybe when you use malloc a lot, you can cause heap overflow or stack overflow. These are specific examples of what we're calling buffer overflow.

Scanf

C does not make it easy to get user input safely without causing buffer overflow. That's why we used the CS50 library that includes functions like get_char, get_double, get_float, etc.

So what's the built-in alternative to these functions? Scanf.

In the scanf function we pass two arguments:

  • The format code for the type of data we want to get from the user
  • Where to store the user input

We can't just pass a variable as the second argument, because then we'd be passing a copy of that value. Instead, we want to pass the address by using the ampersand.

#include <stdio.h>

int main(void){
    int n;
    printf("n: ");
    // In the argument of scanf we pass the format code, telling scanf that we want to 'scan' an integer from the user
    // In the second argument, we define where we want to store the variable
    scanf("%i",&n);
    printf("n: %i\n",n);
}

This program gets an integer from the user and prints it

But how can we get a string from the user?

There is a fundamental problem, because you don't know in advance how much space you need to allocate for the string.

The CS50 library's get_string goes through the user input one character at a time and keeps allocating more memory, so that there is always enough room for the input.

File I/O

Some common functions related to files:

  • fopen: open a file
  • fclose: close a file
  • fprintf: print to a file
  • fscanf: read data from a file
  • fread: read binary data from a file
  • fwrite: write binary data to a file
  • fseek: move around in a file (like using the timeline on a Netflix video)

We're going to put this into practice by creating a phonebook program that stores the data instead of forgetting it every time when we rerun the program.

When using fopen we need to pass which file we want to open (phonebook.csv) and we have to tell fopen how we want to open it ("r" for reading, "w" for writing, or "a" for appending).

In this case we'll use append so we add data to the end of the file and don't overwrite anything.

We store the return value of fopen in FILE *file. fopen returns a pointer that represents the opened file, and we pass that pointer to the other file functions.

We get the input from the user and we want to save that to the file. We use fprintf for this: you pass the file pointer, followed by what you want to print.

Then we close the file with fclose.

#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void){
    
    FILE *file = fopen("phonebook.csv","a");

    char *name = get_string("Name: ");
    char *number = get_string("Number: ");

    fprintf(file, "%s,%s\n",name,number);

    fclose(file);
}

This will work, but when working with fopen we should always check for NULL. You should do this every time when working with pointers.

Now we work on a program that copies a file:

#include <stdio.h>
#include <stdint.h>

typedef uint8_t BYTE;

int main(int argc, char *argv[]){
    // get a string from the command line

    //rb means Read Binary
    FILE *src = fopen(argv[1],"rb");
    //wb means Write Binary
    FILE *dst = fopen(argv[2],"wb");

    BYTE b;

    // How do we copy a file?
    // Loop through the file and copy all the bytes

    // fread will go through the value one byte at a time
    // We will store that in the address of b
    // the second argument defines the size of a byte
    // the third argument asks how many bytes you want to copy at a time (1)
    // the last argument asks you the file to read from (src)
    // We want to do this for as long as it succeeds. When there are no more bytes to read, it will return 0.
    while (fread(&b, sizeof(b), 1, src) != 0){
        // Where to find the byte? &b
        // what size? sizeof(b)
        // How many bytes at a time? 1
        // Where to write to? dst
        fwrite(&b, sizeof(b),1,dst);
    }

    fclose(dst);
    fclose(src);

}

With this knowledge, we can even manipulate files (at least bitmap files).

Bitmap files (.bmp) implement images as a grid of pixels, where the color of each pixel is stored as a fixed number of bits.


Shorts

Hexadecimal

Most western cultures use the decimal system, aka base-10 to represent numeric data (0 1 2 3 4 5 6 7 8 9)

Computers use the binary system, aka base-2, to represent numeric (and indeed all) data.

However, trying to parse a huge chain of 0s and 1s can be quite difficult. So we have the hexadecimal system (base-16), which is a more concise way to express the data on a computer's system (0 1 2 3 4 5 6 7 8 9 A B C D E F). It makes the mapping easy because a group of four binary digits (bits) has 16 different combinations, and each of those combinations maps to a single hexadecimal digit.

To make it obvious that we're working with hexadecimal numbers, we prefix them with 0x.

To convert binary to hexadecimal, you group the bits by four (from right to left) and convert each group to a single hexadecimal digit.

For example, we have this binary number:

01000110101000101011100100111101

To convert it, we want to group it in groups of 4.

0100 0110 1010 0010 1011 1001 0011 1101

Starting from right to left:

1101 -> D
0011 -> 3
1001 -> 9
1011 -> B
0010 -> 2
...

This becomes:

0x46A2B93D

So what's the point? We use this quite a lot because memory addresses are represented in hexadecimal.

Pointers

Pointers offer us a powerful way to pass data between functions. Earlier, we have always passed copies of data (passing by value).

Every file on your computer lives on your disk drive (HDD or SSD). However, disk drives are just storage space. We can't directly work there. Manipulation and use of data can only take place in RAM, so we have to move data there. Memory is basically a huge array of 8-bit wide bytes (512 MB, 1 GB, 2 GB, 4 GB ...). All the data in memory is lost when the computer is turned off.

When we move data into memory, it takes up space:

integer -> 4 bytes
character -> 1 byte
float -> 4 bytes
double -> 8 bytes (allows you to have more digits after the decimal point)
long long -> 8 bytes
string -> ????

Memory is a large array where every cell is 1 byte. Every cell has an address (index) so we can just get the value of the cell with a specific index.

When you define a character, you could for example get cell 4 (or 0x4 in hexadecimal) in your memory to store its value. If you want to store an integer, you could get cells 8, 9, 10 and 11.

If we want to put the string 'Lloyd' in memory, we need 6 cells because we also need 1 byte to hold the \0 so we know where the string ends.

Pointers are just addresses (of locations in memory where variables live).

int k;
k = 5;

This code creates a 'box' that can hold an integer, and then we put the value 5 into the box.

int* pk;
pk = &k;

What happens here?

We create a box called 'pk' that holds a pointer. We put the address of k into the box of pk. So the value in the box would be something like 0x80C74820.

So what is a pointer?

A pointer is a data item whose

  • value is a memory address
  • type describes the data located at that memory address

In a way, pointers make a computer environment more like the real world. For example, you have a paper notebook and you want to update it. Instead of creating a copy of the notebook, updating it, and returning a copy of the (updated) notebook, you can simply update the original notebook and don't have to return anything.

NULL pointer

The simplest pointer available to us in C is the NULL pointer (it points to nothing). When you create a pointer and you don't set its value immediately, you should always set the value of the pointer to NULL.

Ampersand operator

By using the ampersand operator (&) we can extract the address of an already existing variable.

If x is an int-type variable, then &x is a pointer-to-int whose value is the address of x.

If arr is an array of doubles, then &arr[i] is a pointer-to-double whose value is the address of the ith element of arr.

An array's name is actually just a pointer to its first element. That's why arrays seem to be an exception to the pass-by-value rule: when you pass an array as an argument, the function receives the address of the original elements and can update them. That wouldn't be the case if you passed an integer.

Dereferencing

Dereferencing means "go to the reference and read or change the value at this location".

If we have a pointer-to-char called pc, then *pc is the data that lives at the memory address stored inside the variable pc.

Star (*) is known as the dereference operator. It "goes to the reference" and accesses the data at the memory location, allowing you to manipulate it at will. This is similar to visiting your neighbor. You not only need to know their address to visit them, but you also have to go to that address to interact with them.

To "go to" the address, you use the * dereference operator.

What happens if we try to dereference a pointer whose value is NULL? You get a segmentation fault. This is actually good behaviour, because it defends against accidental dangerous manipulation of unknown pointers (which could break the program). That's why it's good practice to set your pointers to NULL immediately if you aren't setting them to a known, desired value. At the end of the day, it's better for your program to crash than to destroy other programs or functions.

Syntax

int* p;
  • The value of p is an address
  • We can dereference p with the * operator
  • If we do, what we'll find at the location is an int.

The star is actually part of both the type name and the variable name. So something like this doesn't do what you might expect, because it creates one pointer (pa) and two plain integers (pb and pc):

int* pa, pb, pc;

Instead, you'd have to do something like this:

int* pa, *pb, *pc;

How large is a string?

There is actually no datatype called string. It's a datatype that is created for us in cs50.h.

String is actually an alias for char*, which is a pointer to a character. Pointers are just addresses, so it's 4 or 8 bytes (depending on the system). On a 64-bit system it's 8 bytes. On a 32-bit machine it's 4 bytes.

What happens when we do this:

*pk = 35;

The star is the dereference operator here, so this means that we go to the address stored in pk and change what we find there. Since pk points to k, k is now 35.

int m; // create a new 'box' with room for an integer
m = 4; // Put 4 into the box
pk = &m; // Get the address of m and make the pointer pk now point to m

Defining custom types

With typedef we can create shorthand or rewritten names for data types. It can be an alias for a datatype that already exists or a new data type.

Alias 'byte' of the 'unsigned char' datatype.

typedef unsigned char byte;

Alias 'string' for 'char *'

typedef char * string;

Now we can simply use 'string' instead of 'char *' which isn't very intuitive.

You can also create a custom data type. For example when this is your struct:

struct car {
  int year;
  char model[10];
  char plate[7];
  int odometer;
  double engine_size;
};

Now you can create an alias 'car_t' for the data type 'struct car'.

typedef struct car car_t;

Dynamic Memory Allocation

So far we've pointed a pointer variable at another variable that already exists in our program. That requires us to know exactly how much memory we'll need at the moment our program is compiled.

But what if we don't know how much memory we'll need at compile-time? Then we can dynamically allocate memory from the heap while the program is running.

Stack vs heap

Statically allocated memory (typically everything that you give a name to) comes from the stack. Dynamically allocated memory (which is memory that you allocate when the program is running) comes from the heap.

The stack and the heap are actually the same big chunk of memory. The stack gets allocated from the bottom to the top and the heap will allocate downward.

If we want to statically obtain an integer on the stack:

int x;

If we want to dynamically obtain an integer on the heap:

int *px = malloc(4);

// Or you could use sizeof 
int *px = malloc(sizeof(int));

malloc

To access memory on the heap we need to use the function malloc(). As an argument, we pass the number of bytes that we need. Malloc will then return a pointer to that memory. If it can't find any memory, it will return NULL.

You always need to check for NULL after using malloc.

Get an integer from the user and allocate an array of that size on the stack

int x = get_int();
float stack_array[x];

Get an integer from the user and allocate an array of that size on the heap

int x = get_int();
float* heap_array = malloc(x * sizeof(float));

Dynamically allocated memory is not automatically returned to the system when the function in which it was created finishes execution. You need to return it manually with free() so it can be used again later.

Three golden rules for malloc() and free()

  1. Every block of memory that you malloc() must subsequently be free()d
  2. Only memory that you malloc() should be free()d. You don't have to free statically allocated memory.
  3. Do not free() a block of memory more than once

Call Stacks

When you call a function, the system sets aside space in memory for that function to do its necessary work. We call such chunks of memory stack frames or function frames.

It's possible for multiple functions' stack frames to exist in memory at a given time. For example, if main() calls move(), which then calls direction(), all three functions have open frames.

When a new function is called, a new frame is pushed onto the top of the stack and becomes the active frame. When a function finishes its work, its frame is popped off of the stack, and the frame immediately below it becomes the new active frame on the top of the stack. That function picks up immediately where it left off. This is why recursion works.

For example, consider the call stack for the factorial function from last week. Each recursive call gets its own frame, and the frames get stacked from bottom to top. Eventually the factorial function reaches its "if (n == 1)" statement and the stacked function calls start getting resolved.

As soon as a function returns, its frame gets popped from the call stack. Eventually all the frames will be removed from the call stack.

File Pointers

So far, our programs have been "forgetful". We start our program, do something and then stop the program without having any "evidence" that the program ever ran.

Persistent data is data that does not disappear when your program stops running, for example when you're creating a phone book program where users can enter their own data and save it.

All file manipulation functions live in stdio.h (that also includes printf). All of them accept FILE* as one of their parameters, except for the function fopen() which is used to get a file pointer in the first place.

fopen()

It opens a file and returns a file pointer to it.

Remember to always check the return value to make sure that you don't get back NULL so that you don't dereference a NULL pointer.

FILE* ptr = fopen(<filename>,<operation>);

Some possible operations are:

  • r -> read a file
  • w -> write to a file
  • a -> append

When you use 'write' instead of 'append', you'll overwrite any existing files. So for example for our phonebook, we'd use append because we don't want to overwrite the current phone numbers in our CSV file.

fclose()

Closes the file pointed to by the given file pointer.

fclose(<file pointer>);

fgetc()

Reads and returns the next character from the file pointed to.

Note: the operation of the file pointer passed in as a parameter must be "r" for read, or you will get an error.

int ch = fgetc(<file pointer>);

If this is the first time you run the function, it will return the first character.

This means that we can loop through all the characters in the file.

int ch; // int rather than char, so EOF can be distinguished from every valid character
while ((ch = fgetc(ptr)) != EOF)
  printf("%c",ch);

EOF is a special value that fgetc returns when it reaches the end of the file. So as long as we don't get EOF back, we keep looping through the file.

fputc()

Writes or appends the specified character to the pointed-to file.

Note: the operation of the file pointer passed in as a parameter must be "w" for write or "a" for append, or you will suffer an error.

fputc(<character>,<file pointer>);

With this knowledge, you could write a program that copies a file into another one (and basically recreate the Linux cp command).

int ch;
while ((ch = fgetc(ptr)) != EOF)
  fputc(ch, ptr2);

fread()

Reads <qty> units of size <size> from the file pointed to and stores them in memory in a buffer (usually an array) pointed to by <buffer>

Note: the operation of the file pointer passed in as a parameter must be "r" for read, or you will suffer an error.

Instead of looping through a file one character at a time, we can use fread() to loop through multiple characters at a time.

fread(<buffer>,<size>,<qty>,<file pointer>);

For example:

int arr[10];
fread(arr, sizeof(int), 10, ptr);

In this example, we read 40 bytes worth of information (10 integers of 4 bytes each) from the file ptr, and we store them in the array. The buffer needs to be a pointer, but since an array's name is treated as a pointer to its first element, we can just pass the name of the array.

We can also dynamically allocate our buffer on the heap:

double* arr2 = malloc(sizeof(double) * 80);
fread(arr2, sizeof(double), 80, ptr);

You can also use fread to get just 1 character back, but in that case you must remember to pass the address of the char variable.

char c; 
fread(&c, sizeof(char), 1, ptr);

fwrite()

Pretty much the same thing as fread, but for writing.

Note: the operation of the file pointer passed in as a parameter must be "w" for write or "a" for append, or you will suffer an error.

int arr[10];
fwrite(arr, sizeof(int), 10, ptr);

Instead of going from the file to the buffer, we now go from the buffer to the file.

If we want to dynamically allocate the chunk of memory:

double* arr2 = malloc(sizeof(double)*80);
fwrite(arr2, sizeof(double), 80, ptr);

And if you want to write a single character:

char c; 
fwrite(&c, sizeof(char), 1, ptr);

Other functions

There are a lot of other useful functions included in stdio.h, like:

  • fgets() reads a full string from a file
  • fputs() writes a full string to a file
  • fprintf() writes a formatted string to a file
  • fseek() allows you to rewind or fast-forward within a file
  • ftell() tells you at what (byte) position you are at within a file.
  • feof() tells you whether you've read to the end of a file
  • ferror() indicates whether an error has occurred while working with a file.