The merge sort algorithm (implementation in Java)

1. Description

In computer science many algorithms are recursive by nature and they usually follow an approach called divide-and-conquer . In this particular case dividing-and-conquering a problems means to break it into smaller identical sub-problems, resolve the smaller sub-problems recursively (by dividing the smaller sub-problems into even smaller problems), and eventually combine all their results to determine the solution for the initial problem .

The Merge Sort algorithm is based on this technique :

1. The merge sort algorithm works on arrays (or lists). If the array has its size < = 1, then we consider the array already sorted and no change is done . This is considered to be the limit condition of this recursive approach . [Limit Condition]
2. If size of the array (or list) > 1, then we divide the array into two equally-sized sub-arrays . [Divide] – O(1)
3. At this point we recursively sort both of the two arrays . [Conquer] 2*T*(n/2)
4. We merge the two sorted sub-arrays into one sorted array . [Merge] O(n)

We will start by defining the MERGE function . This is function is responsible with the merging of two sorted arrays. The resulting array (in our case portion of the initial array composed by the two merged sub-arrays) will be also sorted :


L The initial unsorted array .
llim The starting index of the left sub-array;
mlim The stopping index for right sub-array, and the starting index for the right sub-array;
rlim The stopping index for the right sub-array;
LEFTL, RIGHTL Additional helper arrays .
SENTINEL A value to simplify our code, there’s nothing “bigger” than the Sentinel .

After we define the MERGE function, the actual MERGE-SORT function can be written as:

2. Java implementation Read More

Sieve of Eratosthenes (finding all prime numbers up to a specific integer)

1. Algorithm Description

Sieve of Eratosthenes is an ancient algorithm in mathematics that allows us to find all prime numbers up to a specific integer value . It works well on small inputs (the maximum value of the interval is 10.000.000 – 20.000.000 ) .

I am not going to enter into too many details, as the algorithm itself is not very complicated and there are plenty resources on the web describing the concept of a sieve in general and the Eratosthenes Sieve in particular . Still I will implement this algorithm, for the sake of having a “complete” (far from the truth) reference on algorithms and data-structures .

The algorithm is based on the following technique:

1. Create an array S of integers with size N (the maximum value of the interval). Mark all positions in the array as possible primes .
2. Start iterating the array from position 2 (the first prime number), and mark every multiple of 2 as non-prime: S[2*2], S[2*3], …
3. Find the next non-prime number (that wasn’t previously marked), in our case 3, and continue to mark as non-prime all its multiples: S[3*2], S[3*3], …
4. Find the next non-prime number (that wasn’t previously marked), in our case 5, and continue to mark as non-prime all its multiples: S[5*2], S[5*3], …
5. Repeat the logic, and stop when the next non-prime number, k, satisfies the following relationship k*k > p .

Sieve of Eratosthenes pseudo-code:

2. Algorithm Implementation Read More

Binary GCD (Stein’s Algorithm) in C

1. Algorithm Description

Binary GCD also known as Stein’s Algorithm is an algorithm that computes the greatest common divisor of two (positive) numbers . Discovered in 1967 by the Israeli programmer Josef Stein, it’s an alternative to the classical Euclid’s Algorithm, and is considered to be more efficient than this as it’s replacing divisions and multiplications with bitwise operations . The algorithm is recursive by nature, but loops can be used instead of recursion .

The algorithm can be described by the following rules (here and here) . Note that by B_GCD we will refer to a function that returns the greatest common divisor of two positive numbers .

1. B_GCD(0, 0) is not defined, but for convenience we will consider it 0 (zero), so B_GCD(0,0) = 0 ;
2. B_GCD(num1, 0) = num1 and B_GCD(0, num2) = num2 . The reason for this is beacuse zero is divided by everything ;
3. If num1 and num2 are even, B_GCD(num1, num2) = 2 * B_GCD(num1/2, num2/2), as 2 is a common divisor .
4. If num1 is even and num2 is odd, B_GCD(num1, num2) = B_GCD(num1 /2, num2), as 2 is not a common divisor . The steps are the same if num1 is odd and num2 is even : B_GCD(num1, num2) = B_GCD(num1, num2/2) .
5. If both num1 and num2 are odd, then if num1 >= num2 the B_GCD(num1, num2) = B_GCD((num1-num2)/2, num2), else B_GCD(num1, num2) = B_GCD((num2-num1)/2, num1) .
6. Step 4 and 5 are repeated until num1 = num2, or num1 = 0 .

We can also use pseudo code to describe the above algorithm .

Recursive Version of Binary GCD (Stein Algorithm)

The loop-version of the Binary GCD Algorithm

Read More

Euclid’s Algorithm: Reducing fraction to lowest terms

In the last article I’ve described Euclid’s algorithm for finding the greatest common divisor of two given numbers .A simple programming exercise I found in the book called “Algorithms in C” (Sedgewick) asks us to reduce a given fraction to lowest terms, using the Euclid’s Algorithm .

Fraction = 36 / 120
Fraction = Reduce(Fraction) = 3 / 12

1. C Implementation

Step 1:

We will start by defining the data-structure we will use for encapsulating the fraction information (numerator and denominator):

Step 2:

Usually when we play with C structures we should have two functions, for allocating and de-allocating memory associated with a structure (if you have an object-oriented background, this step will be equivalent with writing a constructor / destructor pair ) .

An important aspect is that the fraction_new() function fails if a 0 (zero) denominator is given .

Step 3:

On this step we will write a function that reduces the fraction to lowest terms . For this we will also need to determine the greatest common divisor of the fraction’s denominator and numerator in order to divide the fraction Read More

Euclid’s Algorithm

Recently I’ve started to implement (or reimplement) the most common algorithms a software developer should know . One of the nicest books I found on this topic is Algorithms in C (Robert Sedgewick) . Of course, there is this and this, but lately I am more interested on the “implementation” side of things than on maths and theory .

1. Algorithm Description

Euclid’s Algorithm is an efficient method for calculating the greatest common divisor of two numbers (aka GCD) . The  GCD of two numbers is considered the largest number that divides both of them (without leaving a reminder) .

Euclid’s algorithm is based on the principle that given two numbers a and b, if a is greater than b than the greatest common divisor of a and b is the same with the common divisor of b and (b – a) . If we transform the above property into “code” (actually pseudo-code) the algorithm looks like this:

The above pseudo-code is called the subtraction-variant . We can of course replace the repetitive subtractions with one division . The division-based version looks like this (in pseudo-code):

There is also a recursive version of the algorithm:

Read More