Divide and Conquer: the story of a brave developer
The bells have been rung and the gates closed. The army that came knocking was large and savage. Panic spread among the population. The ignorants screamed that they were doomed, that they could see death embrace their souls in her wooden grip, but the elders were calm and composed for they knew that their commander was dauntless and skilled. Despite being largely outnumbered, the commander was sure of himself. His eyes were dark and distant, and he moved with the subtle certainty that comes from knowing many things. Indeed, he knew many things.
The maximum subarray problem
The enemy was a problem that most of the skilled commanders fail to conquer. It’s called the maximum subarray problem and it goes as follows:
Given an array N of whole numbers, find the subarray of N with the maximum sum of its elements.
For example, the solution for the array [-3, 1, -8, 4, -1, 2, 1, -5, 5] would be [4, -1, 2, 1]. The sum of this subarray is 6, which is greater than the sum of the elements of any other subarray.
Brute Force Approach
The commander had the choice to charge the enemy with all the forces he have. It’s called the brute force attack.
The approach consist of calculating the sum of every possible subarray and returning the one with the maximum sum. We will start by subarrays of size 1, and then move up to the subarrays of size = the size of the input array. This method is guaranteed to solve the problem…
import numpy as npdef maximum_subarray(arr):
max_sum = arr
max_subarray =  for n in range(1, len(arr)+1):
for i in range(0, len(arr)-n+1):
subarray = arr[i:i+n]
sum = np.sum(subarray)
if (sum > max_sum):
max_sum = sum
max_subarray = subarray return max_subarray
n iterates over the sizes of the subarrays, and i iterates over the subarrays of size n. Notice that the time taken by this approach grows quadratically with the number of elements in the array. The time complexity of this algorithm is O(n²). This solution may be appropriate for small input arrays, but will not be efficient for large ones.
The opposing army was large indeed. Our commander can not risk going for a brute force attack because it will exhaust the power of his men and lead to an imminent defeat. He had to find another way. In the midst of a hollow silence, the one that usually precede the storm, our commander remembered the wise words of his mentor:
“Always remember: divide and conquer. A captain should endeavor to divide the forces of the enemy. Once this deed accomplished, he should defeat each part separately and drive his enemy to the abyss”
Divide and Conquer Approach
The divide-and-conquer approach consist of breaking the problem into several subproblems that are similar to the original problem but smaller in size, solving the subproblems recursively, and then combining these solutions to create a solution to the original problem.
- Divide the problem into subproblems that are smaller instances of the same problem
- Conquer the subproblems by solving them recursively. If the size of the subproblem is small enough, solve it manually.
- Combine the solutions of the subproblems into the solution for the original problem.
Let’s see how we can incorporate this technique in our case. Suppose we want to find a maximum subarray of the subarray A[low…high]. Divide-and-conquer suggest to divide the A[low…high] into two subarrays of as equal size as possible A[low…mid] and A[mid+1…high]. Any contiguous subarray A[i…j] of A[low…high] must satisfy one of the properties below:
- entirely contained in the subarray A[low…mid] (low ≤ i ≤ j ≤ mid)
- entirely contained in the subarray A[mid+1…high] (mid < i ≤ j ≤ high)
- crossing the midpoint (low ≤ i ≤ mid < j ≤ high)
A maximum subarray of A[low…high] must have the greatest sum over all subarrays entirely in A[low…mid], entirely in A[mid+1…high] or crossing the midpoint. We can find the maximum subarrays of A[low…mid] and A[mid+1…high] recursively since they are smaller instances of the original problem. The only thing left is to find the maximum subarray that crosses the midpoint, and return the subarray with the largest sum of the three.
Finding the maximum subarray crossing the midpoint is relatively easy. Note that this is not a smaller instance of the original problem because we added the restriction that the subarray must cross the midpoint, otherwise we could have solved it recursively. Any subarray A[i…j] crossing the midpoint is itself made of two subarrays A[i…mid] and A[mid+1…j]. We need to find the maximum subarray of A[i…mid] and A[mid+1…j]. Each of the two maximum subarrays should have respectively A[mid] as right-most element and A[mid+1] as left-most element. Concatenating these two subarrays will yield the maximum subarray crossing the midpoint.
def maximum_crossing_subarray(A, low, mid, high):
left_sum = float('-inf') #negative infinity
sum = 0
for i in range(mid, low-1, -1):
sum += A[i]
if (sum > left_sum):
left_sum = sum
max_left = i right_sum = float('-inf')
sum = 0
for j in range(mid+1, high+1):
sum += A[j]
if (sum > right_sum):
right_sum = sum
max_right = j return [A[max_left:max_right+1], left_sum + right_sum]
The function below takes as input the array A and the indices low, mid and high, and returns the maximum subarray that crosses the midpoint along with the sum of its elements. The complexity of this procedure is O(n).
With this linear-time algorithm, we can write the divide-and-conquer algorithm to solve the maximum-subarray problem:
import mathdef maximum_subarray(arr, low, high):
if low == high:
return [arr[low:high+1], arr[low]]
mid = math.floor((low+high)/2) max_subarray_left, left_sum =
maximum_subarray(arr, low, mid) max_subarray_right, right_sum =
maximum_subarray(arr, mid+1, high) max_crossing_subarray, cross_sum =
maximum_crossing_subarray(arr, low, mid, high) if (left_sum >= right_sum and left_sum >= cross_sum):
return [max_subarray_left, left_sum]
elif (right_sum >= left_sum and right_sum >= cross_sum):
return [max_subarray_right, right_sum]
return [max_crossing_subarray, cross_sum]
The time complexity of this algorithm is O(nlog(n)), which is better than O(n²) of the brute force approach. The divide-and-conquer approach yields an algorithm that is asymptotically faster than the brute force method.
The triumph of the underdog
The opposing army were constituted of men from different regions. Our commander have had the Machiavellian idea to feed stories to the enemy stating that, after their undeniable victory, the men of each region will try to take over the others and keep the spoils to themselves. Soon, distrust spread like a wildfire among the troups of the enemy and the army was divided. The rest is history…