
September 16, 2024

Multithreading in Python: How to Work Around the GIL for Better Performance


 

Python's multithreading is often slower than expected. Why? The Global Interpreter Lock (GIL) is the reason. Understanding the GIL matters because it determines how Python handles running multiple tasks at once. This blog post explains why the GIL exists and how it shapes Python's multithreading and parallelism.

 

Multithreading vs. Multiprocessing

To understand how the GIL affects Python multithreading, you first need to distinguish multithreading from multiprocessing.

  • Multithreading runs multiple threads inside a single process. The threads share memory, so they can communicate cheaply. Multithreading suits tasks that spend most of their time waiting, such as handling many web server requests or reading many files.
  • Multiprocessing runs multiple processes, each with its own memory space and its own Python interpreter, and therefore its own GIL, so one process never blocks another. Multiprocessing suits CPU-bound tasks that need several cores for heavy calculations.

 

How Does GIL Affect Multithreading in Python?

The GIL allows only one thread to execute Python bytecode at a time. Even with many threads, your program cannot run Python code on several CPU cores simultaneously. Instead, the threads take turns, which slows your program down, particularly for CPU-bound work.

For example, you might use multithreading to sum a huge list of integers faster, expecting several threads to split the work and speed things up. But because of the GIL, only one thread can do the addition at a time, so you will not see the speedup you expect. Because of thread-switching overhead, adding more threads can even make things slower.

 

Real-World Example

To see how the GIL affects multithreading, consider a simple Python example:

 

 

import threading

def count_down(n):
    # A busy loop: pure Python work that holds the GIL while it runs
    while n > 0:
        n -= 1

# Two threads, each counting down from 1,000,000
thread1 = threading.Thread(target=count_down, args=(1000000,))
thread2 = threading.Thread(target=count_down, args=(1000000,))

thread1.start()
thread2.start()

thread1.join()
thread2.join()

 

In this example, two threads each count down from 1,000,000 to zero. You might expect two threads to cut the time roughly in half, but the GIL forces them to take turns, so the countdown may be no quicker than with a single thread.
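
To make the effect concrete, here is a minimal timing sketch of the same countdown, run once on a single thread and once split across two threads. The exact numbers depend on your machine and Python build, so treat it as an illustration rather than a benchmark.

import threading
import time

def count_down(n):
    while n > 0:
        n -= 1

N = 10_000_000

# Single-threaded baseline
start = time.perf_counter()
count_down(N)
print(f"One thread:  {time.perf_counter() - start:.2f}s")

# The same total work split across two threads
start = time.perf_counter()
thread1 = threading.Thread(target=count_down, args=(N // 2,))
thread2 = threading.Thread(target=count_down, args=(N // 2,))
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print(f"Two threads: {time.perf_counter() - start:.2f}s")

On a standard CPython build, the two-threaded version is usually no faster, and sometimes slower, because the threads spend extra time handing the GIL back and forth.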

 

Performance Implications

For CPU-bound work, the GIL can hurt performance. If you use multithreading to speed up heavy computation, the GIL may prevent you from getting the benefit you expect.

Note that the GIL does not affect all Python programs equally. It has little effect on I/O-bound programs, because such programs spend most of their time waiting for input/output, such as reading files or making network requests. While a thread is waiting, it releases the GIL, which lets other threads run.
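
As a rough illustration, the sketch below uses time.sleep as a stand-in for real I/O. Because sleep releases the GIL, five threads that each "wait" for one second finish in roughly one second overall rather than five.

import threading
import time

def io_task():
    # time.sleep releases the GIL, standing in for a real I/O wait
    time.sleep(1)

start = time.perf_counter()
threads = [threading.Thread(target=io_task) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Elapsed: {time.perf_counter() - start:.2f}s")  # roughly 1 second, not 5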

 

Working Around the GIL

The GIL can restrict Python applications, but there are ways to work around it depending on your goals. Despite the GIL, you can still achieve concurrency and parallelism:

 

With Multiprocessing

Multiprocessing sidesteps the GIL in a way multithreading cannot. Threads share one memory space and one interpreter, so they are all bound by the same GIL, whereas each process gets its own memory and its own Python interpreter, and therefore its own GIL. As a result, the GIL does not stop multiple processes from running at the same time on different CPU cores.

You can use Python's multiprocessing module for jobs that use a lot of CPU power, such as processing large files or doing complex calculations. It divides your work across several processes running on different cores, which lets you maximise your CPU cores and boost performance.

 

 

from multiprocessing import Process

def cpu_bound_task():
    # Simulate a CPU-bound task
    result = sum(i * i for i in range(10**6))

if __name__ == '__main__':
    processes = []
    for _ in range(4):  # Create 4 processes
        p = Process(target=cpu_bound_task)
        processes.append(p)
        p.start()

    for p in processes:
        p.join()  # Wait for all processes to finish

 

Here, each process runs cpu_bound_task independently, so the work is truly parallel across cores.

 

Asynchronous Programming

I/O-bound workloads such as network requests, file reading, and database interaction are less affected by the GIL. In these cases, async/await lets you manage many tasks efficiently.

Asynchronous programming lets you write code that does not block the main program during I/O operations. You can start a task, let it wait in the background, and move on to another task in the meantime. This approach suits web servers and chat applications that must handle many requests at once.

 

Here's a simple example using asyncio:

 

 

import asyncio

async def io_bound_task():
    await asyncio.sleep(1)  # Simulate an I/O-bound task
    print("Task completed")

async def main():
    # Launch five I/O-bound tasks and wait for them all to finish
    tasks = [io_bound_task() for _ in range(5)]
    await asyncio.gather(*tasks)

asyncio.run(main())

 

In this example, asyncio.run(main()) runs five io_bound_task coroutines concurrently; while one waits on I/O, the others can make progress. Because the tasks spend their time waiting rather than computing, the GIL is not a bottleneck, which makes this technique efficient for I/O-bound work.

 

Third-Party Solutions

Several third-party libraries offer higher-level tools for parallelism and concurrency that help you work around the GIL.

  • concurrent.futures: This standard-library module offers a high-level interface for running functions asynchronously in threads or processes. With a few code changes, you can switch between multithreading and multiprocessing.
  • joblib: A library popular in machine learning and data processing. joblib makes it easy to split embarrassingly parallel jobs into independent chunks that run in parallel. Under the hood it can use multiprocessing, giving you parallelism without worrying about the GIL (see the short sketch after this list).
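
Here is a minimal sketch of the joblib approach. The square function and the n_jobs value are illustrative placeholders rather than recommendations.

from joblib import Parallel, delayed

def square(i):
    return i * i

# n_jobs=4 asks joblib to spread the calls across up to four worker processes
results = Parallel(n_jobs=4)(delayed(square)(i) for i in range(10))
print(results)  # [0, 1, 4, 9, ...]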

 

For example, using concurrent.futures with processes:

 

 

from concurrent.futures import ProcessPoolExecutor

def cpu_bound_task(n):
    # n only labels which chunk of work this call represents
    result = sum(i * i for i in range(10**6))
    return result

if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        # Run four CPU-bound calls in separate worker processes
        results = list(executor.map(cpu_bound_task, range(4)))

 

In this case, the ProcessPoolExecutor runs each cpu_bound_task call in a separate process, making efficient use of multiple CPU cores.
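
Because concurrent.futures exposes the same interface for threads and processes, swapping in a thread pool for I/O-bound work is a small change. The sketch below is illustrative; the URL list and worker count are placeholder values.

from concurrent.futures import ThreadPoolExecutor
import urllib.request

# Placeholder URLs; any I/O-bound callable works the same way
urls = ["https://example.com"] * 4

def fetch(url):
    with urllib.request.urlopen(url) as response:
        return response.status

with ThreadPoolExecutor(max_workers=4) as executor:
    statuses = list(executor.map(fetch, urls))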

 

When to Optimise Your Code for the GIL

If your Python program is slower than expected, check whether the Global Interpreter Lock (GIL) is the culprit. Profiling reveals where your code spends most of its time. Tools such as cProfile, py-spy, and timeit can show whether thread contention is dragging things out.
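
As a starting point, here is a minimal cProfile sketch; workload is a placeholder for whatever function you suspect is slow.

import cProfile
import pstats

def workload():
    return sum(i * i for i in range(10**6))

# Profile the call and print the five most expensive entries
cProfile.run("workload()", "profile_stats")
pstats.Stats("profile_stats").sort_stats("cumulative").print_stats(5)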

After profiling, interpret the results. If threads spend a lot of time waiting to run rather than doing work, the GIL is probably the bottleneck. If your program mostly performs I/O, the GIL is unlikely to be what is slowing it down.

For CPU-intensive tasks, optimisation should include multiprocessing, since separate processes can run truly in parallel and evade the GIL. For I/O-bound tasks, asynchronous programming can help. By profiling your code and understanding the GIL, you can optimise your Python programs far more effectively.

 

Conclusion 

To work around the GIL, use multiprocessing to spread CPU-heavy work across multiple cores, and use asynchronous programming for tasks that wait on I/O. Third-party tools such as concurrent.futures and joblib offer further options for concurrency and parallelism. Understanding these trade-offs is the key to better Python performance.
