Posted by Marta on December 18, 2020 Viewed 7216 times
In this article you will learn how to calculate the running time of an algorithm using a very important computer science concept, known as Big O Notation.
Big O will help you to understand how your code scale when processing thousand of elements. Your code will be faster and more efficient memory and time wise. Plus calculate the running time of an algorithm using Big O is a question that comes up in all interview processes for big companies like Google, Amazon, Twitter, etc.
Big O notation looks like a complex mathematical concept, however the idea underneath is simple. The big o notation is a metric that describe the efficiency of an algorithm. That’s the definition, but what does that mean?
You could see big o as a mathematic operation that will express the rate in which the number of steps for your algorithm will increase as you increase the number of items to process. Or you could say that it answer the question: How does number of steps change as the data increase?
Big o notation is a useful metric. Will your program execute fast for 100000 items? Big O will answer that question.
Basically it helps computer scientists and programmers to categorise algorithm based on the their performance and how well they scale. You could refer an algorithm as a 22-steps algorithm, and another one as a 1000 steps algorithm, however that wouldn’t be very useful since the number of steps would change depending on the number of item to process.
However I could say an algorithm is linear( O(n) ). That means the numbers of steps will grow at same rate as the number of data to process, which is quite a long sentence. This can be expressed in one word: Linear. Making the communication more concise and consistent way. Plus big o notation provides a tool to analyse and compare algorithms.
For instance, let’s say we have the following algorithm that will return the number 100.
def function1(items): return 100
This algorithm always carry out the same number of steps to finish, just one step, regardless of the number of items to process. That means this algorithm is constant order, which is represented with this symbol O(n). Constant order is the fastest an algorithm can be.
Another example. Let’s say we have the following algorithm that print all the number in an array.
The number of steps this algorithm will carry out, will increase with the number of items. If the array contains one item, the print statement will execute once. If there are 2 items the print statement will execute twice and so on. This is called linear order.
The most used big orders are the following:
The best is constant order. Then we have logarithmic order, which means that the algorithm takes as many steps as it takes to keep halving the number of items until we remain with one.
Next we have linear order which we have already seen. And then we have quadratic and cubic order, for which the number of steps will start growing really fast as the number of items grow, as you can see in the graph. An algorithm is quadratic when there is a loop inside another loop. And an algorithm is cubic if there are three nested loops.
What if the situation changes? For instance, we have a search algorithm and the item we are looking for is the first one in the list. The algorithm will carry out just one test. Great. And in another case, the item we are searching is in the middle of the list, so the algorithm will carry out n/2 steps. The convention is that Big O notation usually refers to the worst scenario. The reason behind this pessimism is that you should optimise your algorithm for the worst.
Probably you understand all big o notation theory and why it is important by now. How can you apply all these ideas? Let’s see an example. We have the following algorithm to find a number in a list of numbers and return its position.
def search(list, item): for i in range(len(list)): if(list[i] == item): return i return -1
This code will check every number in the list, and see if the number is equal to the number we are looking for and if so, return it is position. Worst case, the number is not in the list, and the code will check every numbers in the list. That means this code is O(n).
Can you improve this? Actually, you can! You can improve this code using the binary search mechanism. The binary search algorithm will require that the list is sorted. Here is how the binary search works:
Searching for: 34 list= [1,4,2,8,43,5,28,49,14,34,67] => [1,2,4,5,8,14,28,34,43,49,67]
Then it will check the number in the middle, 14 in this case. Since the
number we are looking for is bigger, we will only search in the right half.
Searching for: 34 list = [28,34,43,49,67]
Next step, the algorithm will do the same. Check the middle element, if the middle element is smaller than the search number, check only the right half. If the middle element is bigger, continue the search on the first half. It will keep doing this until the list has one element.
searching for: 34 list = [28,34]
Here is how you can implement this in python:
def binary_search(list, item): # Get middle_index = get_middle_index(list) if(len(list) == 1 and list!=item): return -1 if(list[middle_index]==item): return middle_index elif(item < list[middle_index]): # Continue searching on the first half return binary_search(list[0:middle_index],item) else: # Searching on the left/second half return binary_search(list[middle_index:len(list)], item) # Calculate the middle position of a list def get_middle_index(list): if (len(list)<=2): return 1 if (len(list)%2==0): return int(len(list)/2) else: return int(len(list)/2)+1
This code will keep halving the number of items, that means O(log(n)). It is using recursion. The function calls itself, but with a subset of the initial data
And just for fun, you could use the following piece of code, which will run both search and compare the execution time. Feel free to play around with it, increase or decrease the size of the list and see how that impact the timing. Have fun with it!
import random import time # Build a list with 10.000.000 numbers(between 0 and 30)!! def build_dataset_with_10000000_items(): list =  ts = time.time() for i in range(0,10000000): n = random.randint(1,30) list.append(n) print('Time spent creating the list(sec) :'+ str(time.time()-ts)) return list list1 = build_dataset_with_10000000_items() print(list1) ts = time.time() search(list1,100) print('Time spend by normal search: '+str(time.time()-ts)) ts = time.time() binary_search(list1,100) print('Time spend by binary search: '+str(time.time()-ts))
The normal search will take 1.1767148971557617secs to run, and the binary search 0.07022404670715332 secs, it is much much faster.
Now that you know big o notation, you know to calculate the running time of an algorithm. Therefore you can analyse an algorithm and know how it will scale, as you increase the number of items to process.
I hope you enjoy this article. If you have questions or anything is unclear, please send a comment below and I will be happy to help.
If you enjoy this article, drop a like or subscribing to the youtube channel. You will get notified about future programming tips and videos!
Thanks for your time!! 🙂
Steady pace book with lots of worked examples. Starting with the basics, and moving to projects, data visualisation, and web applications
Unique lay-out and teaching programming style helping new concepts stick in your memory
Great guide for those who want to improve their skills when writing python code. Easy to understand. Many practical examples
Perfect Boook for anyone who has an alright knowledge of Java and wants to take it to the next level.
Excellent read for anyone who already know how to program and want to learn Best Practices
Perfect book for anyone transitioning into the mid/mid-senior developer level
Great book and probably the best way to practice for interview. Some really good information on how to perform an interview. Code Example in Java