Dec to Hex conversion exercise

Converting a decimal number to a hexadecimal string makes for a nice programming exercise.

I started doing regular programming exercises several years ago when I worked at Escalation Studios. I was new to the world of C++ while we worked on DOOM SnapMaps and that meant I had to learn a lot very quickly. My approach to learning is a combination of theory and practice with focus on the latter. Let me walk you through this simple coding exercise.

Analysis

We must start by doing the conversion manually, step by step. Obviously, everything in our computer hardware is represented in binary. This means we can start by converting from decimal to binary. Binary representation will help us align our thinking with the computer operations we’ll use to solve this problem.

Binary and hexadecimal numbers read right to left. The position index starts at 0. Also, the sum of each bit x at position n represents x * 2^n in decimal. For example, the binary number 0b100 represents 2^2 = 4 in decimal. We can extrapolate this to hexadecimal where the sum of each group of four bits (also known as a nibble) x at position n represents x * 16^n in decimal. The hexadecimal number 0xF00 represents 0b111100000000 in binary and 15 * 16^2 = 3840. You can use a scientific or programmer calculator in the operating system to try this and see the changes and results, which can really help to quickly try these conversions!

Start by picking a random decimal number and converting it to binary. This conversion is not the main focus of this exercise. However, I’ll just state to you that you need to perform modulo by 2 and then repeat a classic “divide by 2” and then stop when the number is zero.

Convert decimal number 42 to binary.

2^5 = 322^4 = 162^3 = 82^2 = 42^1 = 22^0 = 1
Bit101010
Value1 * 2^5 = 3201 * 2^3 = 801 * 2^1 = 20

The sum of the value row is 42, and the decimal 42 is 0b101010 in binary.

Now we can convert the binary number to hexadecimal representation. This is simply because the hexadecimal number system has base 16 and binary has base 2 and 16 / 2 = 4. Each nibble of the hexadecimal number is represented by four bits of the original binary number. What’s different with hexadecimal is the representation. Values in range [10, 15] are represented by letters A-F so the 16 hexadecimal symbols are 0-9 and A-F.

The conversion itself is similar to the previous one. We take each group of four bits from the right side of the binary number and convert them to one hexadecimal symbol. Since our binary number has only 6 positions, we can fill the two missing positions from the left with zeros giving us two sets of four bits, or two nibbles in hexadecimal.

Convert binary number 0b101010 respectively 0b00101010 to hexadecimal.

16^1 = 1616^0 = 1
Bit0 0 1 01 0 1 0
Hex2A
Value2 * 16^1 = 3210 * 16^0 = 10

The sum of the value row is again 42 in decimal, represented by 0x2A in hexadecimal. Now that we know how to convert between decimal, binary, and hexadecimal we can start coding the solution.

Solution

In C++ we can represent the input decimal number as the uint32_t type with a range of [0, 4294967295]. Since the type itself advertises how many bits it uses to store the number, it’s clear we could hard-code the value 32. We can do better than that and use sizeof to get the number of bytes and multiply that by 8 to get the full bit count: sizeof(uint32_t) * 8 = 32. Next, we need to define how many of those bits we need to get one hexadecimal symbol - as I mentioned previously, we need 4 bits to represent 2^4 = 16 values and hexadecimal symbols.

It is very important to recognize that we can optimize the program from the beginning. The first thing we should look at is how we’ll store the output. We know the input has 32 bits. In hexadecimal, that will be 8 symbols and we know this at compile time because it’s driven by the input type size. In C++ this is easy and we can allocate the output buffer by declaring char text[size + 1]. Character arrays or C strings must be null/zero terminated, and this allows us to calculate the string length without storing the length as an extra parameter. This is why we say the buffer is size + 1. Because memory initialization can potentially be expensive and is not always necessary, C++ defaults to having all memory uninitialized. We must initialize the last char to 0.

We can now translate the first bits to an output hexadecimal symbol. We must start by isolating the first group of 4 bits that make a hexadecimal symbol. To do this, we need to use binary multiplication, specifically the AND operation. Let’s try this on the original binary number 0b00101010.

0b00101010 (42)
&
0b00001111 (15)
=
0b00001010 (10)

By doing this, we’ve zeroed out all bits higher than the first four. We also need to consider the second group of 4 bits left in our byte. To get to them and move them to the right, we need to use another binary operation: the right bit shift. The following example shifts bits by 4 to the right.

0b00101010 (42)
>>
4
=
0b00000010 (2)

This way, we’ve gotten the two values we were looking for from our bit to hexadecimal table above (2 and A). Now we can translate each group of isolated 4 bits to a hexadecimal symbol. Let’s consider our options. We can look at using a built-in ASCII table since every character has an integer value.

DecHexBinChar
4830001100000
4931001100011
5032001100102
5133001100113
5234001101004
5335001101015
5436001101106
5537001101117
5638001110008
5739001110019
654101000001A
664201000010B
674301000011C
684401000100D
694501000101E
704601000110F

You can see that there’s a gap in the table. Values between 57-65 contain different symbols than those we need. This gap would force us to use a condition to skip to ASCII letters at positions 65-70 when our hexadecimal symbol value would be in range [10-15].

If we decided to use a loop to sequentially convert all bits to symbols, it could be slower. A condition inside the loop could have an impact on CPU branch prediction. This exercise is so simple that it would not matter, but I think it’s still worth mentioning, and a good problem to be aware of.

Another solution to consider is creating a local lookup table. Each index of the table would represent a single hexadecimal symbol.

char hex[] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };

A lookup table created at compile time is a good resource that can quickly perform translation for all values in range [0-15] to symbols 0-9 and A-F. Lookup tables are another great way to speed up any program. We can now calculate the hexadecimal value that we’ll use for an index to find a symbol in our lookup table.

(number >> bits) & 0xF

Here we rotate the input value by a number of bits and mask it with 0xF or 0b1111 to isolate the value of the first 4 bits. This value is the key to find the right hexadecimal symbol in our lookup table.

0b00101010 (42)
&
0xF (15)
=
0b00001010 (10)

Now we have all the pieces we needed to write a simple program to perform the conversion.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#include <iostream>
#include <memory>

const char hex[] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };

int main() {
	uint64_t number = 15;

	const size_t bits = 8 * sizeof(number);
	const size_t unit = 4;
	const size_t size = bits / unit;

	char text[size + 1];
	text[size] = 0;

	for(int i = 0; i < size; ++i) {
		text[(size - (i + 1))] = hex[(number >> (i * unit)) & 0xF];
	}

	printf("Decimal value %ld is 0x%s in hexadecimal.\n", (long)number, text);

	return 0;
}

For the decimal 42 input value, the program will print 0x000000000000002A. You can see I used a loop to go over the bits in input. This is because the program is flexible with the input number type uint64_t. If you want to change the input to a 32bit number represented by uint32_t it will still work. You can inspect and compile the program live using Compiler Explorer.

There is one improvement we can do to make the solution even better. We can trim the zeros from the output hexadecimal number by introducing a pointer to char variable that will point to the actual beginning of the output string.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
char *pretty = (char *)"";
for(int i = 0; i < size; ++i) {
    char *symbol = &text[(size - (i + 1))];
    *symbol = hex[(number >> (i * unit)) & 0xF];
    if(*symbol != '0') {
        pretty = symbol;
    }
}

printf("Decimal value %ld is 0x%s in hexadecimal.\n", (long)number, pretty);

Inside the if condition we check the symbol to be non-zero, and we adjust the char * pointer. The pointer will always hold the memory address of the beginning of the output string. The improvement will keep the solution short with good performance, as you can see for yourself in another Dec2Hex example on Compiler Explorer.

I’m sure the code produced in this exercise could be further improved. This exercise definitely helped me refresh. It was also great to about finding and implementing a solution to a simple problem.