Python has become the de facto standard for network automation. Many network engineers already use it on a daily basis to automate networking tasks, from configuration to operations to troubleshooting network problems. In this post, we will visit one of the more advanced topics in Python: we'll scratch the surface of its multiprocessing capabilities and learn how to use them to speed up script execution time.
First, we need to understand how a computer executes a Python script:
1- When you type #python <your_awesome_automation_script>.py in the shell, Python (which runs as a process) instructs your computer's processor to schedule a thread (the smallest unit of processing).
2- The allocated thread starts to execute your script, line by line. A thread can do anything: interact with I/O devices, connect to a router, print output, do math, anything.
3- Once the script hits EOF (End of File), the thread is terminated and returned to the free pool to be used by other processes.
In Linux, you can use #strace -p <pid> to trace the execution of a specific thread.
The more threads you assign to a script (as permitted by your processor and OS), the faster your script can run. Threads are sometimes called "workers" or "slaves".
I have a feeling you already have that little idea in your head: why wouldn't we assign a LOT of threads from all cores to the Python script in order to get the job done quickly?
The problem with assigning a lot of threads to one process without special handling is what's called a "race condition". The operating system allocates memory to your process (in this case, the Python process) to be used at runtime and accessed by all threads, ALL OF THEM AT THE SAME TIME. Now imagine one of those threads reading a piece of data before it has actually been written by another thread. You don't know the order in which the threads will attempt to access the shared data, and this is called a race condition.
You can read more about race conditions and how to avoid them in this link.
One of the available solutions is to make a thread acquire a lock before touching shared data. In fact, Python is by default optimized to run as a single-threaded process and has something called the GIL (Global Interpreter Lock). The GIL does not allow multiple threads to execute Python bytecode at the same time.
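To make the lock idea concrete, here is a minimal sketch using Python's threading module. The function name, thread count, and iteration count are just illustrative; the point is that the lock makes each update of the shared counter atomic with respect to the other threads, so the final result is deterministic (without the lock, updates could be lost):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    """Add 1 to the shared counter n times, taking the lock for each update."""
    global counter
    for _ in range(n):
        with lock:  # only one thread may touch the shared counter at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 -- deterministic because of the lock
```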
But rather than having multiple threads, why don't we have multiple processes?!
The beauty of multiprocessing over multithreading is that you don't have to be afraid of data corruption due to data shared among threads. Each spawned process has its own allocated memory that won't be accessed by other Python processes. This allows us to execute tasks in parallel, at the same time!
Also, from Python's point of view, each process has its own GIL, so there's no resource conflict or race condition here.
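You can see this isolation in action with a tiny sketch (the list contents here are arbitrary): a child process modifies a global list, but the parent's copy is untouched, because each process got its own memory.

```python
import multiprocessing as mp

data = ["original"]

def mutate():
    # Runs in the child process: this append lands in the child's own
    # copy of memory and is invisible to the parent.
    data.append("changed in child")

if __name__ == "__main__":
    p = mp.Process(target=mutate)
    p.start()
    p.join()
    print(data)  # ['original'] -- the parent's list is untouched
```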
Enough talk, let's jump to the code.
First, you need to import the module into your Python script:
import multiprocessing as mp
Second, you need to wrap your code in a Python function. This allows a process to "target" that function and run it as a parallel task.
Let's say we have code that connects to a router and executes a command on it using the netmiko library, and we want to connect to all devices in parallel. This is a sample "serial" version:
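A sketch of that serial version is below. netmiko's ConnectHandler and send_command are the library's real API, but the device type, IP addresses, and credentials are placeholders you would replace with your own; the netmiko import is done lazily inside the function so the sketch parses even where netmiko isn't installed:

```python
DEVICES = ["10.10.10.1", "10.10.10.2", "10.10.10.3"]  # placeholder router IPs

def device_params(host):
    """Build the netmiko connection dictionary for one router (placeholder creds)."""
    return {
        "device_type": "cisco_ios",
        "host": host,
        "username": "admin",
        "password": "admin",
    }

def get_arp(host):
    """Connect to one router, run 'show arp', and return the output."""
    from netmiko import ConnectHandler  # lazy import: only needed when actually connecting
    conn = ConnectHandler(**device_params(host))
    output = conn.send_command("show arp")
    conn.disconnect()
    return output

def run_serial():
    # One device at a time: total run time is the SUM of all per-device times.
    return {host: get_arp(host) for host in DEVICES}
```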
We need to assign a number of processes equal to the number of devices (each process will connect to one device and execute the command) and set the target of each process to the function we wrapped around our previous code.
Finally, we need to launch the processes.
Behind the scenes, the main thread executing the script "forks" a number of processes equal to the number of devices, each targeting a function that executes "show arp" on one device. All of them run at the same time and store their output without affecting each other. Brilliant!
This is a sample view of the running Python processes when you execute the full code:
One final thing needs to be done: "join" the forked processes back to the main thread/trunk, so the program finishes its execution smoothly.
I have a script that pushes some initial configuration to lab devices (9 routers). I re-wrote it using the same procedure above and measured the time. Here are the findings (the value we're looking for is the "real" row):
You can spot the difference: it's 6 times faster than the serial execution, and the gap grows as you add more devices.
In one simple sentence:
Don’t just Automate the task, make it fast!
In the next post, we will explore some additional options of Python's multiprocessing library.
Finally, I hope this has been informative for you, and I'd like to thank you for reading.