Convert integer to array of char bytes
Bitshifting, C, C++
To convert an integer to an array of char
, you can bitmask against the integer in a loop to get the value for each byte.
Endianness & Bitshifting
On a little-endian system, int 256
is stored across 4 bytes in memory like this:
Multiplier | 256⁰ | 256¹ | 256² | 256³ |
---|---|---|---|---|
Value | 00 | 01 | 00 | 00 |
Address | 1000 | 1001 | 1002 | 1003 |
You might imagine that iteratively bitshifting this value and masking against 0xff
would provide the constituent bytes in the little-endian order, but this is not the case.
Bit shifting is always applied against a big-endian representation. This is because when the integer is loaded into the CPU’s register, it is effectively represented in big-endian format.
The CPU register is not byte-addressable, so in this context endianness does not make sense.
This means that bitwise operations are applied to a big-endian representation of the number.
int main()
{
int num = 256;
// When the integer is loaded into the CPU's register, it is the
// equivalent of conversion to big-endian format - the number in
// the CPU register is big-endian.
//
// This means that bitwise operations are applied to a big-endian
// representation of the number:
// 00 00 01 00
std::cout << "num: " << num << '\n'; // 256
std::cout << "(num >> 0) & 0xff = "
<< (num & 0xff) << '\n'; // 0
std::cout << "(num >> 8) & 0xff = "
<< ((num >> 8) & 0xff) << '\n'; // 1
std::cout << "(num >> 16) & 0xff = "
<< ((num >> 16) & 0xff) << '\n'; // 0
std::cout << "(num >> 24) & 0xff = "
<< ((num >> 24) & 0xff) << '\n'; // 0
return 0;
}
In short, you probably don’t need to worry about endianness when bit shifting - but bear in mind even if your system is little-endian, the bitshift operation takes place on a big endian representation.
The Register
A 32-bit CPU register holding a 32-bit value is neither big-endian nor little-endian - it is a register holding a 32-bit value see ref. It is kind of big-endian though, since the rightmost bit is the least significant bit and the leftmost bit is the most significant.
System Endianness: Linux
To find endianness on a modern Linux kernel:
lscpu | grep Endian
# result:
Byte Order: Little Endian
Conversion to an Array of Bytes
#include <iostream>
#include <iomanip>
#include <stdio.h>
#define BITS 8
const size_t BYTES_PER_INT = sizeof(int);
int intToCharArray(char* buffer, int in)
{
for (size_t i = 0; i < BYTES_PER_INT; i++) {
size_t shift = 8 * (BYTES_PER_INT - 1 - i);
buffer[i] = (in >> shift) & 0xff;
}
return 0;
}
int main()
{
std::cout << "Enter an integer: ";
unsigned int in;
if (!(std::cin >> in)) {
std::cout << "Not an integer. Exiting..." << '\n';
return 1;
}
char buffer[BYTES_PER_INT];
intToCharArray(buffer, in);
for (auto& el : buffer) {
std::cout << std::hex << std::setw(2) << std::setfill('0')
<< static_cast<unsigned int>(el & 0xff) << " ";
}
std::cout << '\n';
}
References
- Writing endian-independent code in C, IBM Developer article
comments powered by Disqus