Dev Notes

Software Development Resources by David Egan.

Contrasting C and JavaScript - Make an Array of Words From a String


C, JavaScript
David Egan

This article contrasts how you would build an array of the “words” in a sentence in C and in JavaScript - that is, an array of substrings of the original string, delimited by space characters or the end of the string.

The objective is to write a function that mutates a passed-in array.

C Code

In C, function parameters are passed by value. The value of the argument is copied into the function’s stack frame, so any change to the parameter is visible only within the scope of the function.

For a function to modify data and for such changes to persist in the outside scope, the parameter must be passed as a pointer. In this way, the memory address of the “outside” variable is copied into the stack frame of the function, where it can be de-referenced to access (and change) the actual data.
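As a minimal sketch of the difference between the two approaches (the names incrementCopy and incrementInPlace are made up for illustration and are not part of the code in this article):

```c
#include <assert.h>
#include <stddef.h>

/* Receives a copy of the caller's value: the change is lost on return. */
void incrementCopy(int n)
{
	n++;
	(void)n; /* silence "set but not used" warnings */
}

/* Receives the address of the caller's variable and writes through it. */
void incrementInPlace(int *n)
{
	assert(n != NULL);
	(*n)++;
}
```

Given int x = 1, calling incrementCopy(x) leaves x unchanged, whereas incrementInPlace(&x) changes x to 2 in the caller's scope.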

In the case of an array of strings, the variable to be changed in the function is already a pointer to a pointer. The base data type is a char, with a char* being a pointer to a char, which can be used to represent an array of char - or when null-terminated, a string. An array of strings therefore can be represented by a pointer to a char * - in other words, char **.

If char ** is our array of strings, to modify this in a function we need to pass in a pointer to char **, or char ***.

Inside the function, dereferencing char ***words once (*words) gives the char ** that is the array of strings we’re working on.
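The following sketch isolates just that step. The helper name allocWordSlots and its return convention (0 on success, -1 on failure) are assumptions made for illustration; the full listing in this article does the same thing inline.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: point the caller's char ** at a newly allocated
 * array of n char * slots. Returns 0 on success, -1 on failure.
 */
int allocWordSlots(char ***words, size_t n)
{
	assert(words != NULL);
	char **slots = calloc(n, sizeof(*slots));
	if (slots == NULL)
		return -1;
	*words = slots; /* one dereference reaches the caller's char ** */
	return 0;
}
```

A caller would declare char **arr = NULL, pass &arr to the helper, and later release the array with free(arr).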

We need to make sure that the array has enough space to hold pointers to the required number of words. For this example, the required memory is allocated dynamically, which means that it must be carefully tracked by the programmer and freed later. The code therefore includes a function to free the array: it first frees each individual string, then the array itself.

Because memory allocation (and reallocation) is relatively expensive, the function initially allocates space for up to 10 words. Whenever a new word is detected, we check to see if this memory allocation needs to be increased. If this is the case, a realloc() is performed, adding space for an extra 10 “word pointers”. We add an extra 10 rather than just a single extra pointer to avoid re-allocating memory too frequently.
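The growth strategy can be sketched on its own as below. The helper name ensureCapacity, the CHUNK macro and the 0 / -1 return convention are assumptions for illustration only; the step size of 10 simply mirrors the figure used in this article.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK 10 /* assumed growth step, matching the 10 used in the text */

/*
 * Hypothetical helper: make sure *slots can hold `needed` pointers,
 * growing the capacity in steps of CHUNK. Returns 0 on success, or -1
 * if realloc() fails (in which case the original block is still valid).
 */
int ensureCapacity(char ***slots, size_t *capacity, size_t needed)
{
	assert(slots != NULL && capacity != NULL);
	if (needed <= *capacity)
		return 0; /* enough room already, no reallocation */
	size_t newCapacity = *capacity;
	while (newCapacity < needed)
		newCapacity += CHUNK;
	/* Assign to a temporary so the old pointer is not lost on failure. */
	char **tmp = realloc(*slots, newCapacity * sizeof(*tmp));
	if (tmp == NULL)
		return -1;
	*slots = tmp;
	*capacity = newCapacity;
	return 0;
}
```

Starting from a capacity of 10, requests for up to 10 slots do not touch the allocator at all; only the 11th word triggers a realloc(), which raises the capacity to 20.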

When saving individual words as strings in the array, we know the exact number of characters in the string. We allocate memory for these using calloc(): char *tmp = calloc(wordSize + 1, sizeof(*tmp));. This function initialises all memory to zero, so if we initialise with one more member than we need for the word characters, the final character will be 0 - which serves as a null terminator and saves us the bother of null-terminating the string later.
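As a standalone sketch of this idea (copyWord is a hypothetical helper name, and it uses memcpy() where the listing in this article uses strncpy(); either works here because the length is known):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy `len` characters starting at `start` into a
 * newly allocated, null-terminated string. The caller must free() the
 * result. Returns NULL if the allocation fails.
 */
char *copyWord(const char *start, size_t len)
{
	assert(start != NULL);
	/* len + 1 zeroed bytes: the extra byte is the terminator. */
	char *word = calloc(len + 1, sizeof(*word));
	if (word == NULL)
		return NULL;
	memcpy(word, start, len); /* the byte at index len stays 0 */
	return word;
}
```

For example, copyWord("cheese error", 6) yields the string "cheese" without any explicit write of the terminating 0.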

Note that all memory allocations are checked for success - if calloc() or realloc() returns NULL, we print an error message to stderr and exit the program with a failure status.


#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

const char str[] = "Out of cheese error +++ redo from start";

void printArray(char **arr, size_t n);
void allocFail(const char *msg);
void freeArray(char **arr, size_t n);

size_t setArray(char ***words, const char *str)
{
	size_t len = strlen(str);
	size_t nAllocatedWords = 10;
	// Each element of the array is a char *, i.e. sizeof(**words).
	*words = calloc(nAllocatedWords, sizeof(**words));
	if (*words == NULL) {
		allocFail("calloc() failed.");
	}
	size_t i = 0;
	size_t wordCount = 0;
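	while (i < len) {
		// Skip any run of whitespace before the next word. The cast avoids
		// undefined behaviour in isspace() if char is signed and negative.
		while (isspace((unsigned char)str[i]))
			i++;
		size_t j = i;
		// count until next space.
		while (j < len && !isspace((unsigned char)str[j]))
			j++;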
		// i is the index of the first letter of word
		// j - 1 is the index of the last.
		if (i != j) {
			wordCount++;
			if (wordCount > nAllocatedWords) {
				nAllocatedWords += 10;
				char **tmp = NULL;
				tmp = realloc(*words, sizeof(**words) * nAllocatedWords);
				if (!tmp) {
					allocFail("realloc failed.");
				}
				*words = tmp;
			}
			size_t wordSize = j - i;
			char *tmp = calloc(wordSize + 1, sizeof(*tmp));
			if (!tmp) {
				allocFail("calloc failed.");
			}
			(*words)[wordCount - 1] = tmp;
			strncpy(tmp, &str[i], j - i);
			i = j + 1;
		}
	}
	return wordCount;
}

int main(void)
{
	char **arr = NULL;
	size_t n = setArray(&arr, str);
	printArray(arr, n);
	freeArray(arr, n);
	return 0;
}

void printArray(char **arr, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		printf("%s\n", arr[i]);
	}
}

void allocFail(const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	exit(EXIT_FAILURE);
}

void freeArray(char **arr, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		free(arr[i]);
	}
	free(arr);
}

JavaScript Code

This is considerably shorter:

#!/usr/bin/env node

const str = "Out of cheese error +++ redo from start";

const res = [];
const setArray = (str, arr) => {
	arr.push(...str.split(" "));
};

setArray(str, res);
res.forEach((el) => console.log(el));

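Inside the setArray() function, we use the split() method to split the string into an array of strings, delimited by a space character. Note that, unlike the C version, split(" ") does not treat tabs or newlines as delimiters, and consecutive spaces produce empty strings in the result.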

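The resulting array is then unpacked into individual arguments by the spread syntax (...), and push() appends these values to our array. It is possible to mutate an array within a function like this in JavaScript because arr and res refer to the same array object: the function receives a copy of the reference, not a copy of the array.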

