Dev Notes

Software Development Resources by David Egan.

Build an Array of Strings From a File in C


C, Strings
David Egan

Use getline() to read lines from a file, and add them to dynamically allocated memory:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "Please supply a file path:\n%s <file path>\n", argv[0]);
		return EXIT_FAILURE;
	}
	FILE *fp = fopen(argv[1], "r");
	if(!fp) {
		perror("ERROR");
		return EXIT_FAILURE;
	}
	
	// Read lines from file, allocate memory and build an array of lines
	// -----------------------------------------------------------------
	char *lineBuf = NULL;
	size_t n = 0;
	size_t nLines = 0;
	ssize_t lineLength = 0;
	size_t sizeIncrement = 10;
	char **lines = malloc(sizeIncrement * sizeof(char**));
	size_t i = 0;
	
	while ((lineLength = getline(&lineBuf, &n, fp)) != -1) {
		// Memory reallocation is expensive - don't reallocate on every iteration.
		if (i >= sizeIncrement) {
			sizeIncrement += sizeIncrement;

			// Don't just overwrite with realloc - the original
			// pointer may be lost if realloc fails.
			char **tmp = realloc(lines, sizeIncrement * sizeof(char**));
			if (!tmp) {
				perror("realloc");
				return EXIT_FAILURE;
			}
			lines = tmp;
		}
		// Remove \n from the line.
		lineBuf[strcspn(lineBuf, "\n")] = 0;

		// Allocate space on the heap for the line.
		*(lines + i) = malloc((lineLength + 1) * sizeof(char));

		// Copy the getline buffer into the new string.
		strcpy(*(lines + i), lineBuf);

		i++;

		// Keep track of the number of lines read for later use.
		nLines = i;
	}

	// Do something with the array of strings.
	printf("nLines: %lu\n", nLines);
	for (size_t k = 0; k < nLines; k++) {
		printf("%lu\t %s\n", k, *(lines + k));
	}
	
	// Free the buffer utilised by `getline()`.
	free(lineBuf);

	// Free the array of strings.
	for (size_t i = 0; i < nLines; i++)
		free(*(lines + i));
	free(lines);

	fclose(fp);
	return 0;
}

size_t vs ssize_t

Note that the return value of getline() is assigned to ssize_t lineLength.

The reason that the ssize_t type is used is that getline() can return a negative value (actually -1) in the event of failure or the end-of-file condition.

ssize_t is guaranteed to be able to store values in the range -1 to SSIZE_MAX and should be used for a count of bytes or an error indication.

Realloc

Note that you should check that realloc() succeeds by checking for a non-null return value - or responding to the null case as in the example above.

You can do this by assigning realloc to a temporary variable, only copying it across to your target variable when you are sure that realloc() succeeded.

Otherwise, you risk writing NULL to your target variable, thereby causing a memory leak (since you can’t access the pointer to free memory).

References


comments powered by Disqus