Threading in Python: A Complete Guide

Threading is one of the most common ways to implement concurrency or parallelism in programming. Python offers several constructs and classes for using threads to improve performance and responsiveness. However, it is also important to follow thread-safety best practices to avoid critical issues like race conditions and deadlocks.

In this article, we will take a deep dive into threading in Python. We will first discuss what threading is and why you might want to use it. Then we will look at ways to create and manage threads in Python. We will also explore some of the challenges of using threading and how to avoid them.

What is threading?

Threading (or multi-threading) is an execution model that enables programmers to implement concurrency or parallelism. A thread is a lightweight unit of a program that can run independently. All threads share the same memory space and resources of the main program.

Concurrency vs. parallelism

In a single-core environment, when you implement multi-threading, the processor switches between different threads, allowing each to run for a brief time slice. This process is known as concurrency.

It's important to note that concurrency doesn't guarantee true parallel execution. Threads take turns using the single core, creating an illusion of parallelism to make the program feel more dynamic.

However, the real power of multi-threading is realized in a multi-core environment, where each thread can run on a separate processor core simultaneously. This simultaneous execution is termed parallelism. It results in significant performance gains, as each thread can run on a different core at the same time without the need for context switching.

Why do we use threading?

There are several benefits of using threading in Python or programming in general:

Improved performance: Threads allow you to do more work in less time. For example, if you have to make API calls to two different servers, you can create and run two threads simultaneously, one for each API call.
Responsiveness: Threading boosts program responsiveness by allowing it to handle multiple requests simultaneously. For example, a web server may create a new thread for each incoming request. This allows the web server to concurrently respond to the requests of multiple users, enhancing overall experience.
Simplified code: When done right, threading can simplify your code by allowing you to break down large tasks into smaller, more manageable chunks. This adds to the overall maintainability of a codebase.
Increased scalability: Threading can also improve the scalability of your program by allowing it to be adapted to run on multiple cores or machines. For example, if you are migrating from a single-core to a multi-core architecture, you can leverage thread parallelism to scale up your application.
Simplified communication: Threads share the same memory space, which makes communication between them more straightforward than with multiple processes. This simplifies the implementation of tasks that require sharing data or coordination between different parts of a program.

Multithreading vs multiprocessing

Multi-processing involves running multiple processes simultaneously. Unlike a thread, each process gets its own dedicated memory space. Multi-processing is well-suited for intensive CPU-bound tasks, and it can take full advantage of multi-core processors. Each process may run on a separate CPU core, increasing overall performance.

Processes typically consume more system resources than threads due to their independent memory space. This can also limit the number of processes you can run concurrently. Inter-process communication (IPC) between processes is generally more complicated and expensive than inter-thread communication.

Threading vs. asynchronous programming

Asynchronous programming is a way of writing event-driven, non-blocking code. In an asynchronous program, time-consuming tasks (like I/O tasks) are performed in the background, without blocking the main thread. This task delegation is done through callbacks, promises, or other asynchronous techniques.

Asynchronous programming is a great fit for I/O-bound scenarios, such as web scraping, network requests, or database queries. It prevents blocking and maximizes the utilization of a single thread.
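
As a minimal sketch of this idea in Python, the following uses the standard asyncio module. The fetch() coroutine and its delays are hypothetical stand-ins for real I/O calls; both waits overlap on a single thread.

```python
import asyncio

async def fetch(name, delay):
    # asyncio.sleep stands in for a non-blocking I/O call (hypothetical workload)
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both coroutines wait concurrently on one thread;
    # gather() returns results in the order the coroutines were passed
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2))

results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```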

When to use threads vs. processes vs. asynchronous programming

Use threads when:

You want to boost responsiveness by handling I/O tasks efficiently.
You want efficient communication between different tasks.
You want more granular control over the execution and scheduling of tasks in a single process.
Resource efficiency is a priority. If you want to execute a large number of concurrent tasks without overloading system resources, threads are a good choice.

Use processes when:

You have multi-core CPUs and have to perform CPU-bound tasks.
You require strong isolation between tasks.
You want to scale to multiple machines, with each process running on a different machine.
Resource intensiveness is acceptable.

Use asynchronous programming when:

You want to simplify the code by avoiding the need to worry about thread synchronization.
You want to improve the performance of the program by avoiding the overhead of too many context switches.
You are building an application that handles real-time updates to data, like chat or streaming applications. Asynchronous apps excel at handling streams of data without blocking.
You are building a responsive web application using a JavaScript runtime or library, like Node.js or React.

Threading in Python – A definitive guide

Now that we have a good understanding of what threading is and when to use it, let's transition to talking about implementing threading in Python.

Overview of the Threading module in Python

The Python standard library offers a handy “threading” module to work with threads. The module makes it easy to create and manage threads in Python programs. Let’s get started!

A note regarding the Global Interpreter Lock

It’s worth mentioning here that the CPython implementation uses a Global Interpreter Lock (GIL) for thread synchronization. The GIL restricts the execution of Python bytecode to one thread at a time, even on multi-core processors.

For applications that require maximum resource utilization on multi-core machines, the official Python documentation recommends using the “multiprocessing” module. However, it's important to note that I/O-bound tasks can still benefit greatly from threading. During I/O-bound operations, like file I/O or database queries, the GIL is released, allowing multiple threads to progress concurrently.
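
To illustrate the point about I/O-bound work, here is a small sketch in which time.sleep() stands in for a blocking I/O call (an assumption made for the demo; sleep releases the GIL in a similar way). The fake_io() function is hypothetical. Four waits of 0.2 seconds should finish in roughly 0.2 seconds rather than 0.8.

```python
import threading
import time

def fake_io(results, index):
    # time.sleep releases the GIL, much like a blocking network or disk call
    time.sleep(0.2)
    results[index] = index * 10

results = [None] * 4
threads = [threading.Thread(target=fake_io, args=(results, i)) for i in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(results)  # [0, 10, 20, 30]
print(f"4 waits of 0.2s took about {elapsed:.2f}s")
```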

Creating threads

You can create a new thread by calling the threading.Thread() constructor. The constructor accepts different arguments, including the thread target function (target), the thread name (name), and a tuple of arguments for the target function (args). The target function contains the code that the thread will execute when it starts.

For example, the following piece of code imports the threading module, defines a target function, and then creates a new thread object using the threading.Thread() constructor.

import threading

def my_function():
    # Your thread's task goes here
    pass

my_thread = threading.Thread(target=my_function)

Starting threads

The above code created a thread object, but didn’t start it. To start the execution of a thread, we use the start() method exposed by the Thread object. Invoking this method runs the thread’s target function concurrently with the main program.

my_thread.start()

Calling join() on a thread

To wait for a thread to complete its execution, we can call the join() method on the Thread object. This causes the calling thread (often the main program) to block until the thread terminates.

my_thread.join()
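
Putting the three steps together, here is a minimal end-to-end sketch. The greet() function, the thread name, and the argument values are made up for illustration.

```python
import threading

messages = []  # only the worker thread writes here, so no lock is needed

def greet(name, punctuation="!"):
    # Record which thread ran the call
    thread_name = threading.current_thread().name
    messages.append(f"{thread_name}: hello {name}{punctuation}")

worker = threading.Thread(
    target=greet,
    name="greeter",               # optional, helps when debugging
    args=("world",),              # positional arguments as a tuple
    kwargs={"punctuation": "?"},  # keyword arguments as a dict
)
worker.start()  # run greet() in the new thread
worker.join()   # block until it finishes

print(messages)  # ['greeter: hello world?']
```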

Daemon threads

Daemon threads are threads that run in the background and don't prevent the main program from exiting. You can make a thread a daemon either:

When you create it, by passing daemon=True to the constructor.

By setting the Thread object's daemon property to True before invoking the start() method.

For example, the following code sets the daemon property to True, and then calls start().

my_thread.daemon = True  
my_thread.start()
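
The constructor variant can be sketched as follows. The background_task() loop is a hypothetical housekeeping job that never returns on its own. Because the thread is a daemon, the interpreter can still exit while it is running.

```python
import threading
import time

def background_task():
    # Hypothetical housekeeping loop that never returns on its own
    while True:
        time.sleep(0.1)

# Pass daemon=True when creating the thread
heartbeat = threading.Thread(target=background_task, daemon=True)
heartbeat.start()

print(heartbeat.daemon)      # True
print(heartbeat.is_alive())  # True
```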

Other useful functions

There are several other functions exposed by the threading module that a developer should know:

threading.active_count(): Returns the number of Thread objects that are currently alive.
threading.excepthook(args): A hook that is called to handle uncaught exceptions raised inside a thread. You can override it to customize how such exceptions are handled.
threading.current_thread(): Returns the Thread object corresponding to the calling thread.
threading.get_native_id(): Returns the kernel-assigned native thread identifier for the current thread.
threading.main_thread(): Returns the main Thread object, representing the initial thread of the program.
threading.stack_size(): Returns the thread stack size used when creating new threads. Optionally, you can provide a "size" argument to set the stack size for subsequently created threads.
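
Here is a short sketch of a few of these functions in use. The record_exception() hook and the faulty() target are hypothetical names chosen for the demo. Note that assigning to threading.excepthook changes the behavior for every thread in the process.

```python
import threading

errors = []

def record_exception(args):
    # args carries exc_type, exc_value, exc_traceback and thread
    errors.append((args.thread.name, args.exc_type.__name__))

# Replace the default hook, which prints a traceback to stderr
threading.excepthook = record_exception

def faulty():
    raise ValueError("boom")

t = threading.Thread(target=faulty, name="faulty-worker")
t.start()
t.join()

print(errors)  # [('faulty-worker', 'ValueError')]
print(threading.current_thread().name)
print(threading.active_count())
```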

Synchronizing threads using locks, rlocks, semaphores, and condition variables

Thread synchronization ensures that multiple threads can access shared data safely. It is essential for preventing race conditions, and it has to be applied carefully to avoid deadlocks.

Race conditions are errors that can occur when multiple threads access the same data at the same time and at least one of them modifies it, so the result depends on how the threads happen to interleave. Deadlocks are situations where two or more threads are each waiting for another to release a resource. This can cause the threads to block indefinitely, halting the program.

Locks

Locks are synchronization primitives that ensure that only one thread can execute a protected block of code at a time. The threading module offers a Lock class that can be used for this purpose. The Lock class has two main methods: acquire() and release().

At any time, a lock object can be in one of two possible states: “locked” or “unlocked”.

When you call acquire() on a locked Lock object, it blocks the current thread until another thread calls release() on the same lock.
When you call acquire() on an unlocked Lock object, the state of the Lock is immediately changed to “locked”.
When you call release() on a locked Lock object, the object’s state is immediately changed to “unlocked”. Calling release() on an already unlocked object raises a RuntimeError.

The following code gives a simplified example on how to create and use a lock.

import threading 

# Create a lock 
lock = threading.Lock() 

# Acquire the lock 
lock.acquire() 
# Critical section 
# …… 
# Release the lock when done 
lock.release()
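
To see a lock protecting real shared state, consider this sketch in which four threads increment one counter. The state dictionary and the add_many() function are hypothetical. The increment is a read-modify-write step, so the lock is held around it.

```python
import threading

state = {"count": 0}     # shared data
lock = threading.Lock()  # protects state["count"]

def add_many(n):
    for _ in range(n):
        lock.acquire()
        try:
            state["count"] += 1  # read-modify-write on shared state
        finally:
            lock.release()       # always release, even on error

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["count"])  # 40000
```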

RLocks (reentrant locks)

An RLock, or Reentrant Lock, is an extension of the basic lock that can be acquired multiple times by the same thread. It's especially useful in recursion scenarios, or when a function calls another function that also needs the lock already held by the calling function.

The threading module provides the RLock class for this purpose. Consider the following example, where the same thread acquires and releases the RLock multiple times:

import threading

class SharedData:
    def __init__(self):
        self.counter1 = 1
        self.counter2 = 2
        self.lock = threading.RLock()

    def incrementCounter1(self):
        self.lock.acquire()  # acquire again when called from updateCounters
        try:
            self.counter1 = self.counter1 + 1
        finally:
            self.lock.release()

    def updateCounter2(self):
        self.lock.acquire()  # acquire again when called from updateCounters
        try:
            self.counter2 = self.counter2 + self.counter1
        finally:
            self.lock.release()

    def updateCounters(self):
        self.lock.acquire()  # first acquire
        try:
            self.incrementCounter1()
            self.updateCounter2()
        finally:
            self.lock.release()  # final release; other threads can now acquire the lock
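
The difference from a plain Lock can be checked directly. In this small sketch, blocking=False makes acquire() return False instead of waiting forever, so the demo cannot deadlock. The variable names are illustrative.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

plain.acquire()
# A second acquire by the same thread cannot succeed on a plain Lock.
# blocking=False returns False immediately instead of deadlocking.
second_plain = plain.acquire(blocking=False)
plain.release()

reentrant.acquire()
# The owning thread may acquire an RLock again
second_reentrant = reentrant.acquire(blocking=False)
reentrant.release()  # one release per successful acquire
reentrant.release()

print(second_plain, second_reentrant)  # False True
```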

Semaphores

Semaphores are objects that maintain counters for controlling access to a resource. They allow a specific number of threads to access a resource concurrently. Each acquire() call decrements the counter, and each release() call increments it. If the counter is 0, the next acquire() call blocks until another thread calls release().

The threading module includes the Semaphore class for this purpose:

import threading 
# Create a semaphore with a maximum of 3 allowed threads  
semaphore = threading.Semaphore(3)  
# Acquire the semaphore 
semaphore.acquire()  
# Critical section – that the semaphore protects 
# …… 
# Release the semaphore when done 
semaphore.release()
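
The following sketch shows the limit in action: eight threads compete for a resource guarded by a semaphore of size 3, and a small stats dictionary (a hypothetical helper for this demo) records how many threads were ever inside at once.

```python
import threading
import time

semaphore = threading.Semaphore(3)  # at most 3 threads inside at once
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()       # protects the stats dictionary

def use_resource():
    semaphore.acquire()
    try:
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.05)            # hypothetical work on the limited resource
        with stats_lock:
            stats["active"] -= 1
    finally:
        semaphore.release()

threads = [threading.Thread(target=use_resource) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats["peak"] <= 3)  # True
```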

Condition variables

Condition variables are synchronization primitives that allow threads to wait for specific conditions to become true before proceeding. A condition variable is always linked to a lock. It is typically used to coordinate the execution of different threads in response to some shared state.

The Condition class in the threading module allows us to implement condition variables. Calling the wait() or wait_for() method of a condition variable object releases the linked lock and blocks until another thread calls notify() or notify_all(). The lock is re-acquired before the call returns.

Consider this example where a job processing thread waits for a job producing thread to create a job before starting its processing. The line comments provide explanations for the different lines of code.

import threading

condition = threading.Condition()

# job_available, fetch_new_job, and create_new_job are placeholders
# for your own application code.
def consume_job():
    with condition:
        # Wait until the producer notifies and job_available() returns True
        condition.wait_for(job_available)
        fetch_new_job()

def produce_job():
    with condition:
        create_new_job()
        condition.notify()  # notify the waiting thread
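
For a version you can run as-is, here is one possible way to fill in the job helpers. The jobs list, the "resize-image" job name, and the upper-casing "processing" step are all assumptions made for the sketch.

```python
import threading

condition = threading.Condition()
jobs = []       # shared state, protected by the condition's lock
processed = []

def job_available():
    return len(jobs) > 0

def consume_job():
    with condition:
        # wait_for releases the lock while waiting and re-acquires it
        # before returning, once job_available() is true
        condition.wait_for(job_available)
        job = jobs.pop(0)
    processed.append(job.upper())  # hypothetical processing step

def produce_job():
    with condition:
        jobs.append("resize-image")
        condition.notify()         # wake up one waiting thread

consumer = threading.Thread(target=consume_job)
producer = threading.Thread(target=produce_job)
consumer.start()
producer.start()
producer.join()
consumer.join()

print(processed)  # ['RESIZE-IMAGE']
```

The example works regardless of which thread runs first, because wait_for() checks the predicate before it starts waiting.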

Writing synchronization primitives using the “with” statement

The Lock, RLock, Condition, Semaphore, and BoundedSemaphore objects from the threading module can all be used with the “with” statement. The “with” statement plays a role similar to “Resource Acquisition Is Initialization” (RAII): the primitive is acquired when the block is entered and released automatically when the block exits, even if an exception is raised.

By using the “with” statement, you can avoid deadlocks caused by a forgotten release() call, and enhance the readability and maintainability of your code.

with my_lock:
    # important code here
    pass

is the same as:

my_lock.acquire()
try:
    # important code here
    pass
finally:
    my_lock.release()
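
A quick way to convince yourself of this is to raise an exception inside the block and check the lock afterwards. The risky_update() function below is hypothetical.

```python
import threading

my_lock = threading.Lock()

def risky_update():
    with my_lock:
        # The lock is held here
        raise ValueError("something went wrong inside the critical section")

try:
    risky_update()
except ValueError:
    pass

# The with statement released the lock despite the exception
print(my_lock.locked())  # False
```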

Creating thread pools

The “concurrent.futures” module in Python offers a “ThreadPoolExecutor” class, which allows developers to create and manage thread pools for handling asynchronous tasks. Thread pools are a great way to optimize resource utilization by reusing existing threads, instead of constantly creating new ones and destroying them.

The following code creates a thread pool and uses it to perform some asynchronous tasks. The line comments provide explanations for the different lines of code.

import concurrent.futures

# Function to simulate a time-consuming task
def perform_task(task_id):
    print(f"Task {task_id} started.")
    # Simulate some work
    result = task_id * 2
    print(f"Task {task_id} completed with result: {result}")
    return result

# Create a ThreadPoolExecutor instance
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # List of tasks to execute
    tasks = [11, 12, 13, 14, 15, 16, 17]

    # Submit tasks to the executor
    results = [executor.submit(perform_task, task_id) for task_id in tasks]

    # Wait for the list of tasks to finish
    concurrent.futures.wait(results)

# Get the results of the completed tasks
for future in results:
    result = future.result()
    print(f"Result: {result}")
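
Besides submit() and wait(), the same module offers two other common ways to collect results. The sketch below reuses a simplified perform_task() with a shorter, made-up task list: map() returns results in input order, while as_completed() yields futures as they finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def perform_task(task_id):
    return task_id * 2

tasks = [11, 12, 13]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in the same order as the inputs
    ordered = list(executor.map(perform_task, tasks))

    # as_completed() yields futures in completion order,
    # so keep a mapping from each future back to its task
    futures = {executor.submit(perform_task, t): t for t in tasks}
    by_task = {futures[f]: f.result() for f in as_completed(futures)}

print(ordered)  # [22, 24, 26]
print(by_task == {11: 22, 12: 24, 13: 26})  # True
```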

Using synchronized queues for safer thread communication

Using synchronized queues is a common way to implement safe communication between threads in Python. The Queue class from the queue module allows you to implement a synchronized queue that can have multiple producers and multiple consumers.

The queue module supports three types of queues:

First in, first out (FIFO): The items are removed in the order they were added.
Last in, first out (LIFO): This queue functions like a stack, where the most recently added items are the first to be removed.
Priority: Items are removed in order of their priority value, with the lowest-valued entries being removed first.

The following code creates a LIFO queue and defines a producer thread that inserts some items into the queue. It also initializes a consumer thread that processes values from the queue. The line comments provide explanations for the different lines of code.

import threading
import queue

# Create a synchronized LIFO queue
lifo_queue = queue.LifoQueue()

# Function to simulate a producer adding items to the queue
def producer():
    for i in range(1, 6):
        lifo_queue.put(i)
        print(f"Produced: {i}")

# Function to simulate a consumer removing items from the queue
def consumer():
    while True:
        item = lifo_queue.get()  # blocks until an item is available
        if item is None:         # sentinel value: no more work
            lifo_queue.task_done()
            break
        print(f"Consumed: {item}")
        lifo_queue.task_done()

# Create producer and consumer threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

# Start the threads
producer_thread.start()
consumer_thread.start()

# Wait for the producer to end
producer_thread.join()
# Wait until every produced item has been processed
lifo_queue.join()
# Signal the consumer to exit
lifo_queue.put(None)
consumer_thread.join()
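
To compare the three queue types described earlier, the sketch below puts the same values into each and drains them from a single thread. The drain() helper is hypothetical, and checking empty() is only reliable here because no other thread touches the queues.

```python
import queue

def drain(q):
    items = []
    while not q.empty():  # safe here: only one thread uses the queue
        items.append(q.get())
    return items

fifo, lifo, prio = queue.Queue(), queue.LifoQueue(), queue.PriorityQueue()
for value in (3, 1, 2):
    fifo.put(value)
    lifo.put(value)
    prio.put(value)

fifo_order = drain(fifo)
lifo_order = drain(lifo)
prio_order = drain(prio)

print(fifo_order)  # [3, 1, 2]
print(lifo_order)  # [2, 1, 3]
print(prio_order)  # [1, 2, 3]
```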

Worker threads vs. per-request threads

Worker threads and per-request threads are two common threading models used in Python and other programming languages.

In a worker thread model, a pool of pre-defined threads is created at the start of the application. These threads are designed to be long-lived, with the main thread consistently distributing incoming workloads across them.

Conversely, in a per-request model, the main thread spawns a new thread for each incoming request. These threads are short-lived: they terminate after processing the request.

Depending on your resource configurations and application requirements, you can use either worker threads or per-request threads.

Use worker threads when:

You want to have a smaller memory footprint by reducing the overhead of thread creation and destruction.
Your users can tolerate slight delays in responses, especially during peak hours, as worker threads may be busy processing other tasks.
You have long-running tasks that shouldn’t block the main thread.

Use per-request threads when:

You have abundant system resources, and can handle a large memory footprint, even during peak usage.
Your application performs short-lived tasks in response to user requests, such as serving web requests.
Your users require near-instantaneous responses, and your infrastructure can support the rapid creation and management of short-lived threads.
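
The two models can be sketched side by side as follows. The handle() function and the request names are hypothetical, and a real server would receive requests over time rather than from a fixed list.

```python
import queue
import threading

def handle(request):
    # Hypothetical request handler
    return f"handled {request}"

requests = ["r1", "r2", "r3", "r4"]
results_lock = threading.Lock()

# Per-request model: one short-lived thread per request
per_request_results = []

def handle_and_store(request):
    result = handle(request)
    with results_lock:
        per_request_results.append(result)

threads = [threading.Thread(target=handle_and_store, args=(r,)) for r in requests]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Worker model: a fixed set of long-lived threads pulls from a shared queue
work = queue.Queue()
worker_results = []

def worker():
    while True:
        request = work.get()
        if request is None:  # sentinel: shut this worker down
            work.task_done()
            break
        result = handle(request)
        with results_lock:
            worker_results.append(result)
        work.task_done()

workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()
for r in requests:
    work.put(r)
for _ in workers:
    work.put(None)           # one sentinel per worker
work.join()
for w in workers:
    w.join()

print(sorted(per_request_results))
print(sorted(worker_results))
```

In practice, the ThreadPoolExecutor class covered earlier implements the worker model for you, so a hand-written queue loop like this is mainly useful for understanding what happens under the hood.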

Conclusion

Multithreading is an important concept for developers to grasp, regardless of the language they are using. Python offers built-in classes and constructs that can be used to efficiently and safely manage a large number of threads.

This article has introduced you to some of the most important classes and constructs. You can use this knowledge to build scalable, multi-threaded applications that are free of race conditions and deadlocks.
