Tuesday, January 5, 2016

Algorithms: Merge Sort


I have decided to take an online Algorithms course for fun / for my own good / for my curiosity in my spare time on Coursera. I am going to post some of my notes here based on what I can find on the internet with my own notes on it. I find a lot of times it is harder to come from a non-traditional background and learn very comp-sciey type of classes as I don't have a background in all the vocab / theories / concepts, so I will basically be breaking down tech sounding concepts into plain English I can understand and sharing my notes :). I don't honestly think this will help me in my day to day work but perhaps down in the future it will be a foundation for something or extremely helpful in some rare case.

Starting off.. the study of algorithms appears to me to be the study of how long things take, making more efficient code, estimating the worst case scenario for how long your code will run, and the actual set of functions / processes / calculations to be measured. Merge sort is one of many ways to take a bunch of scrambled numbers and get them sorted. Merge sort specifically takes say 8 blocks, divides it into two sets of 4, then each of those two sets into four sets of 2, and stops there.. then merges them back one at a time into a new set of numbers in order (by comparing each number to see if it is higher/lower than the numbers in the set). Watch THIS little animated snipped on Wiki and all will be clear.

Very, very specific estimated times for a calculation with constants in them aren't terribly useful.. except on a very small scale calculation.. but seem good for the sake of understanding how to calculate it at all. Below is a merge sort in C language I found on a website (referenced below image) with some annotations on it.

The number of operations it takes to run this algorithm is: 6n*log(base 2)n + 6n The easiest way to prove it is to understand that each level does the same amount of work. And to multiply the amount of work by the number of levels. And that is the total work done.

Here is how to prove each level does the same amount of work: Number sub-problems doubles with each level (larger quanity of arrays). But amount of work to do halves (arrays get smaller). Doubling # of arrays and halve the size of each array.. will have constant amount of work. See notes on code below:
Original image and C code from http://faculty.cse.tamu.edu/djimenez/ut/utsa/cs3343/lecture2.html.

The number of levels will be:
So.. if we know there are (log(base 2)*n + 1) levels and 6n work per level, if we multiple those two things we can say merge sort will take this amount of operations for a n input array of n numbers: 6n*log(base 2)n + 6n