Mon. Jan 23rd, 2023

Bubble Sort

How it works

One of the simplest algorithms to implement, the bubble sort compares each two adjacent values in a list, and if the values are in the wrong order, they are swapped. This process is repeated until the data has been sorted.

Consider the following list of numbers:

22, 14, 9, 16

A bubble sort will begin by comparing the first two values, 22 and 14. Assuming the intention is to sort the values from lowest to highest, 22 and 14 are not in the correct order. So, they are swapped. The list is now:

14, 22, 9, 16

Having compared the first and second items in the list, we move along, and compare the new second and third items. These are 22 and 9. Again, they are not in the correct order, so they are swapped. The list is now:

14, 9, 22, 16

The process repeats; having compared the second and third items, we now compare the new third and fourth items. These are 22 and 16 – clearly, again they will need to be swapped. The list is now:

14, 9, 16, 22

At this point, we will stop for a minute.

What was demonstrated above is called one pass. We have passed over the whole list of values once. Some important points to understand at this point are:

  • The highest value in the list (22 in this example) has moved into the correct position in the list after one pass
  • In our list of four items, we completed three comparisons

Finishing the sort

Having completed one pass, we could then repeat the process, completing a SECOND pass. This would ensure the second-highest value ended up in its rightful place:

9, 14, 16, 22

In fact, in our example the data is now sorted. However, if we had four items to sort, completing three passes would guarantee that all of the data was in the correct order, so we might program our routine to complete n-1 passes, where n is the number of items in the list. This is because our routine would have to work for any combinations of data.

Improving efficiency

There are ways to improve the average efficiency of a bubble sort from the worst-case. Consider:

  • You can use a Boolean variable to track whether or not a value was swapped during any given pass. If no value was swapped, you can end the sort early without completing any remaining passes
  • Given that each pass will move the highest remaining unsorted value into place, each successive pass can miss out one more value than the previous, as the end of the list will be progressively more sorted, and therefore no further changes will be required

Pseudocode

sub BubbleSort(int[] data) 
  //note the nested loops
  for outer = 0 to len(data)-1  
    for inner = 0 to len(data)-1 
      if data[inner] > data[inner+1] then
         temp = data[inner]
         data[inner] = data[inner+1]
         data[inner+1]=temp
       end if
     end for
   end for
end sub  

Time efficiency

A bubble sort, at its worst, would have to complete n-1 passes in order to fully sort the data. Given that each pass requires up to n-1 comparisons and swaps to be completed, the algorithmic complexity is O(n2) – that is, if you increase the amount of data to be sorted by a factor of 2, the time taken to sort it will increase by a factor of 4.

Advantages

In terms of sorting algorithms, a bubble sort is simple to implement, and for small data sets, it’s performance is acceptable.

Disadvantages

Being O(n2) means that as the amount of data increases, the algorithm will become much slower – for example, 100x more data will result in a run time of 10,000x longer.