Sorting Algorithms
In the book Algorithms for Dummies by John Paul Mueller and Luca Massaron, the authors listed 4 programs that sort data. I’m going to try to provide my high-level understanding of the sorting concepts for the first 3 python sorting programs (the fourth one, I’m struggling so until I get it, I will leave it off). I thought I would go ahead and proceed with this post because I don’t want to forget what I have struggled to understand thus far. I won’t be showing the programs themselves because you can get the book for that; I want to try to get at the concepts which the book does not really go into, or at least not at the level I need.
There are multiple sorting programs, all developed to try to speed up the sorting process. So, if one program works too slowly, another program may be faster.
Here’s the attached spreadsheet with the three sorts. The file uses a data array [9,5,7,4,2,8,1,10,6,3] to show how the programs are doing the sorting. In this post, I will use another example array [6,8,1,4,25,12]. I figured that between the two examples, you might be able to discern a pattern in the sorting approach and understand in a high-level way how the programs are sorting.
[Okay, I can’t do my usual upload of the document so you can see the file onscreen. But you can download it! With WordPress constantly updating their program, something gets out of whack, so for now you will have to download. I’d rather you look at it onscreen but you know with the constant updates…something always break.}
[New update: I found a new method that appears to work okay with the new theme. You will have to scroll around until you reach the upper left hand corner where you can start to see what is on the screen. ..You can download it! With WordPress constantly updating their program, something gets out of whack.}
Method 2 of embedding: I still want you to be able to see this onscreen.
Sort 1: Selection Sort
The concept is you sort from left to right, picking your smaller number. This is an interactive process, and thus a brute force process, where you start with finding the smallest number for the 1st position by comparing the left two positions and deciding which is the smaller number. Then with the chosen number, you next compare it with the number in the 3rd position and decide which is the smaller. With that smaller number, you move on to the 4th number and continue with the comparison, moving to the right, picking the smaller number. Once you get to the end, that smaller number will be swapped with the number in the 1st position.
Then you move to the 2nd position and compare the 2nd and 3rd numbers to decide which is the smaller. You do the same thing, moving to the right, each time picking the smaller number. At the end of this iteration, you will swap the smallest number with the number in the 2nd position.
You then continue with the 3rd position, repeating the aforementioned process until you have done all of the positions in the array.
Here’s an actual numerical example to help you understand better. The embedded spreadsheet found above works with data [9,5,7,4,2,8,1,10,6,3]; the example below will be a different example. Hopefully, with two different examples, you will see and understand what is going on.
ScanIndex is the position that we are seeking the smallest number. In python, the first position in an array is denoted as zero. Data to sort will be [6,8,1,4,25,12].
Please bear with the image on the screen. The best I could do to develop the example was to do it in table form which I did in Word Document. I don’t know if this makes sense.
The data being sorted is [6,8,1,4,25,12].
Sort 2: Insertion Sort
This one is a bit more convoluted and took me forever to figure out a high level way of describing the logic. Simplistically, this method asks which number is the smaller and then replace the number with the larger. Then you move down the list of numbers iteratively, in a very brute force method of sorting, asking which of the numbers are smaller. But it’s not really that simple.
The insertion sort might be better than the selection sort under certain situations but the book was not clear what those situations could be.
Okay, let’s see if I can describe the general methodology of sorting the numbers. You take the first two numbers and arbitrarily assign “minimum” position to the second number. You will also assign position 2 number to the variable “temp”. You are going to be trying to compare this “temp” number with other numbers as you go through the various positions, moving to the left.
So, in data [9,5,7,4,2,8,1,10,6,3], you would pick 9 and 5 as your starting point, 5 would be your temp number and you would arbitrarily say position 2 (or position 1 in python speak) is the minimum position. Then you ask yourself, is temp less then position 1 (or is 5 less than 9?). If the temp (which is in position 2) is less than the 1st position, you would replace the 2nd position with the larger number and you would say that the minimum position is now position 1, and since we’ve reached position 1, we would place the “temp” number into the minimum position. If position 2 is not less than position 1, you would leave as is and move on to the next iterative step which would be to look at positions 2 and 3. The data now looks like [5,9,7,4,2,8,1,10,6,3].
Got that?
Next, you would look at position 2 and 3 and arbitrarily assign minimum position to position 3 and the number in position 3 will be the “temp” that is being compared to all of the numbers. So, we’re looking at 9 and 7, with temp being 7. You would ask if the temp number is smaller than the number in position 2 (7 less than 9?), and if not, you would move onto the next iteration with positions 3 and 4. But if yes, replace the number in position 3 with the larger of the 2 numbers and assign the minimum position to position 2. Data becomes [5,9,9,4,2,8,1,10,6,3]. Now, you would then look the numbers in position 1 and 2 and ask if temp is smaller than the number is position 1 (7 less than 5?). If not, you would replace the minimum position marker with the temp number and then move on to the next iteration. (At this point data becomes [5,7,9,4,2,8,1,10,6,3]. If yes, then position 2 would be replaced with the larger of the 2 numbers and minimum position assigned to position 1.
As you can see, this will get wordy, so I’m going to try to strip the words down into an example, like we did in the Select Sort. Data will be [6,8,1,4,25,12]. The embedded spreadsheet will continue with the aforementioned example [9,5,7,4,2,8,1,10,6,3]. I think with two examples it will become clearer the process being taken to sort the numbers.
Data = [6,8,1,4,25,12].
I’m not sure that this example will be any better as the whole logic isn’t quite obvious but the program does work. At each iteration, you pick two positions to start from, starting from the far left and moving to the right at each iteration.
At each new iteration, after you pick your positions to look at, your Temp number will always be the right most number of the 2 positions you pick and that will be your initial Min position for that iteration. You will compare the Temp number to each number in the list, going from right to left.
So, the comparison of Temp moves from right to left and each iteration moves from left to right.
The asterix * is what the program would print out as a result.
Sort 3: MergeSort
This third sort may be easier to understand. This sorting involves a “divide and conquer” technique and is supposed to be markedly faster than the other two sorts mentioned.
Basically, what the program does is iteratively break down the array of numbers into individual numbers and then build back up the array by merging the numbers but in numerical order. The break down is in a tree-like fashion and generally, not always, start from the left-hand side of the tree.
The original data: [6,8,1,4,25,12]
The length of this data is 6, so divide it in half: left = [6,8 1] and right = [4,25,12].
Take the left first whose length is 3. The division becomes left = [6] and right = [8,1].
Now break down the right into [8] and [1]. Now merge and build this up into [1,8].
Now build up [6] and [1,8] into [1,6,8].
Before building up [1,6,8] and [4,25,12], the right-hand side has to be broken down first.
The breakdown would be left = [4] and right = [25,12]. Break down the right side further into [25] and [12].
Now begin the buildup, starting with the right side: [12,25].
Then build [4] and [12,25] into [4,12,25].
Finally merge [1,6,8] and [4,12,25] into [1,4,6,8,12,25].
I will admit which branch of the tree breakdown the program works on first is a bit confusing but the general concept is that the array is broken down into branches of the tree by halving the array each time. Then you build back up the tree by merging the individual numbers in order.
Wrap Up
Now to wrap things up, can you tell I wanted to be done? I didn’t even proofread. It’s after 10 and I wanted to be done before 10 so that I can wind down and be ready for bed and not lie in bed awake for a couple of hours.
Anyway, I had hoped to be able to explain the high-level concepts in how the sorting approached worked but I’m not really happy with what I’ve written here. While I do understand the approach in a very rudimentary way, the explanations are still not very clear. Within this post above, you will see an Excel that you can download, and that file has 3 tabs showing each of the sorting type discussed in this post with a data array composed of [9,5,7,4,2,8,1,10,6,3]. I’m hoping with 2 examples, you can get a glimmer of how the programs do the sorting.
You must be logged in to post a comment.