Hello coders!! In this article, we will be learning about forking and its implementation in Python. In computer science and technology, the term fork has primarily two meanings:
- Cloning a process
- Developing independently from a legal copy of the source code
Forking in Python:
The os.fork() function creates a copy of the process that calls it.
- The copy runs as a child process
- The data and code of the child process come from the parent process
- The child process has a different id
- The child process is also independent of the parent process
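The return value of os.fork() tells you which process you are in.
- Positive – parent process (the value is the child's process id)
- Zero – child process
- Negative – error in the creation of the process. This applies to the underlying C fork() call; in Python, os.fork() raises OSError instead of returning a negative value.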
Code for forking in Python:
import os

def parent_child_process():
    # os.fork() returns the child's process id in the parent and 0 in the child
    n = os.fork()
    pid = os.getpid()
    if n > 0:
        print("Parent process:", pid)
    else:
        print("Child process:", pid)

parent_child_process()
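Output: two lines, one printed by each process, each showing a different process id. The order of the two lines can vary between runs.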
Explanation of the code for Python forking:
- At first, we imported the os module, which provides a portable way to use operating system functionality. Note that os.fork() itself is only available on Unix-like systems.
- As per the fork() method’s return value, we have classified the process as either child or parent.
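To see the return value in action, here is a minimal sketch in which the parent waits for the child and reads its exit code. The helper name run_child_and_wait and the exit code 7 are made up for illustration and are not part of the example above; the sketch assumes a Unix-like system where os.fork() is available.

```python
import os

def run_child_and_wait(exit_code=7):
    """Fork, let the child exit with exit_code, and return what the parent observes."""
    pid = os.fork()
    if pid == 0:
        # Child branch: exit right away, skipping the parent's cleanup handlers
        os._exit(exit_code)
    # Parent branch: pid is the child's process id; block until the child ends
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print("Child exited with code", run_child_and_wait())
```

The child uses os._exit() rather than returning, so that it does not continue running the rest of the parent's program.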
Multiprocessing in Python:
multiprocessing.Pool is a class in Python's multiprocessing module that runs tasks in parallel across a pool of worker processes.
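from multiprocessing import Pool
import os

def double(i):
    print("I'm process", os.getpid())
    return i * 2

if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import
    with Pool() as pool:
        result = pool.map(double, [2, 3, 4, 5])
    print(result)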
As we can see, we have executed multiple processes in parallel. However, a problem arises in multiprocessing when a deadlock is created.
What is deadlock?
A deadlock is created between two or more processes when a process is waiting for another process to free a certain resource, whereas the other process is waiting for the former process to free its resource. Let’s see an illustration.
Process1 is waiting for resource2, which is being used by Process2.
Process2 is waiting for resource1, which is being used by Process1.
Thus, both processes are waiting for each other, giving rise to a deadlock.
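The circular wait described above can be sketched in a few lines. This is only an illustration under some assumptions: it uses two threads instead of two processes to keep the code short, and it uses a timeout so that the demo ends instead of hanging forever. The names circular_wait_demo, resource1 and resource2 are made up for this sketch.

```python
import threading

def circular_wait_demo(timeout=0.2):
    """Return, per worker, whether it managed to get its second resource."""
    resource1 = threading.Lock()
    resource2 = threading.Lock()
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both workers now hold their first resource
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            barrier.wait()  # keep holding until both attempts are over
            if acquired:
                second.release()

    t1 = threading.Thread(target=worker, args=("worker1", resource1, resource2))
    t2 = threading.Thread(target=worker, args=("worker2", resource2, resource1))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return got_second

if __name__ == "__main__":
    print(circular_wait_demo())
```

Neither worker gets its second resource, because each one is held by the other. Without the timeout, both would wait forever.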
Problem with using Python forking:
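fork() copies the parent's memory, but it does not copy its threads: only the thread that called fork() exists in the child process. If another thread in the parent held a lock at the moment of the fork, the child gets a copy of that lock in its locked state, with no thread left to release it. Any attempt by the child to acquire that lock then waits forever, which results in a deadlock.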
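This problem can be reproduced with a small sketch. It assumes a Unix-like system; the name child_sees_locked_lock is made up for illustration, and the child uses a timeout so that it reports the problem instead of hanging. Recent Python versions may also emit a DeprecationWarning when os.fork() is called while other threads are running, which hints at the same risk.

```python
import os
import threading

def child_sees_locked_lock():
    """Fork while another thread holds a lock; report whether the child is stuck."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            holding.set()   # tell the main thread the lock is now held
            release.wait()  # keep the lock until the parent says otherwise

    t = threading.Thread(target=holder)
    t.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, but the locked lock was
        got_it = lock.acquire(timeout=0.5)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    # Exit code 1 means the child could not acquire the lock
    return os.WEXITSTATUS(status) == 1

if __name__ == "__main__":
    print("Child was stuck on the lock:", child_sees_locked_lock())
```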
How to resolve this Problem?
This problem can be avoided by not using a plain fork() on its own in a program that has threads. Some of the methods for starting new processes are:
- fork() followed by an execve(): the child process does not inherit the module state and starts from scratch
- POSIX fork(): duplicates only the thread that calls fork()
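The first option, fork() followed by an exec call, can be sketched as follows. This is a minimal illustration, assuming a Unix-like system; the helper name run_fresh_interpreter is made up, and it uses os.execv(), one of Python's wrappers around the exec family, to replace the child with a brand new Python interpreter.

```python
import os
import sys

def run_fresh_interpreter(code):
    """Fork, replace the child with a new Python interpreter, return its exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            # Replace the child's memory image; nothing from the parent survives
            os.execv(sys.executable, [sys.executable, "-c", code])
        finally:
            # Only reached if execv itself failed
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(run_fresh_interpreter("import sys; sys.exit(5)"))
```

Because the new interpreter starts from scratch, it carries over no locks or module state from the parent. The multiprocessing module offers a comparable choice through its start methods, for example multiprocessing.get_context("spawn").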
Forking vs. Threading
Forking:
A new process is created that looks exactly like the old process, the only differences being a unique process id and its own memory space. The child process does not inherit any file locks set by the parent process. Any semaphores that are open in the parent process are also open in the child process.
Threading:
A thread is a lightweight process: essentially just a CPU state and a stack, with the enclosing process holding the rest (data, I/O, signals). Threading requires less overhead than forking because no new virtual memory space or environment has to be initialized. Each thread has its own unique ID, and the threads of the same process share the process instructions and data.
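The difference in memory sharing can be seen in a short sketch. It assumes a Unix-like system, and the name compare_memory_sharing is made up for illustration: a thread and a forked child both append to the same list, and we check what the parent sees afterwards.

```python
import os
import threading

def compare_memory_sharing():
    """Return the parent's list after a thread and a forked child each append to it."""
    data = []

    # A thread shares the process memory, so the parent sees its change
    t = threading.Thread(target=data.append, args=("thread",))
    t.start()
    t.join()

    # A forked child works on its own copy, so the parent does not see its change
    pid = os.fork()
    if pid == 0:
        data.append("child")
        os._exit(0)
    os.waitpid(pid, 0)
    return data

if __name__ == "__main__":
    print(compare_memory_sharing())
```

Only the thread's item ends up in the parent's list; the child's append happened in its own copy of the memory.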
Debugging a forked process in Python:
One can use pdb on the main process and winpdb on the fork to debug a fork. pdb is the standard library debugger; winpdb is an advanced debugger for Python that supports multiple threads and breakpoints. The program can be made to break early in the forked process, and the winpdb app can be attached once the break has been hit.
Forking HTTP server in Python:
HTTPServer is a subclass of socketserver.TCPServer. It does not use multiple threads or processes to handle requests. To add threading or forking, a new class can be created using the appropriate mix-in class from socketserver:
- class socketserver.ForkingMixIn
- class socketserver.ThreadingMixIn
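As a rough sketch of the mix-in approach, the class below combines ForkingMixIn with HTTPServer. The names ForkingHTTPServer, PidHandler and serve_one_request are made up for this illustration, and the sketch assumes a Unix-like system. The handler replies with its own process id so that we can check the request was handled in a forked child.

```python
import http.client
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ForkingMixIn

class ForkingHTTPServer(ForkingMixIn, HTTPServer):
    """HTTPServer variant that forks a child process for each request."""

class PidHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reply with the id of the process that handles this request
        body = str(os.getpid()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def serve_one_request():
    """Serve a single GET on a free local port and return the handler's process id."""
    server = ForkingHTTPServer(("127.0.0.1", 0), PidHandler)
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/")
    handler_pid = int(conn.getresponse().read())
    conn.close()
    worker.join()
    server.server_close()
    return handler_pid

if __name__ == "__main__":
    print("Server pid:", os.getpid(), "handler pid:", serve_one_request())
```

Swapping ForkingMixIn for ThreadingMixIn would handle each request in a thread of the same process instead.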
Killing a forked process in Python:
os.kill() is the function available in Python that sends a signal to the process with the given process id.
import os
import signal
my_pid = os.getpid()
os.kill(my_pid, signal.SIGINT)
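The snippet above sends the signal to the current process. To kill a forked child instead, the parent can pass the process id returned by os.fork(). Here is a minimal sketch, assuming a Unix-like system; the name kill_forked_child and the choice of SIGTERM are illustrative only.

```python
import os
import signal
import time

def kill_forked_child(sig=signal.SIGTERM):
    """Fork a sleeping child, kill it with sig, and return the signal that ended it."""
    pid = os.fork()
    if pid == 0:
        # Child: pretend to do long-running work
        time.sleep(60)
        os._exit(0)
    # Parent: pid is the child's process id
    os.kill(pid, sig)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None  # the child ended for another reason

if __name__ == "__main__":
    print("Child terminated by signal", kill_forked_child())
```

Calling os.waitpid() after the kill collects the child's exit status, so it does not linger as a zombie process.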
Conclusion:
In this article, we learned all about forking in Python. We also learned about deadlock and saw how to resolve deadlocks while forking. Then, we moved on to learn the difference between forking and threading and saw how to debug and kill a forked process in Python.
However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.
Happy Pythoning!