Low-level Thread Support

sys includes low-level functions for controlling and debugging thread behavior.

Check Interval

Python 2 uses a form of cooperative multitasking in its thread implementation. At a fixed interval, bytecode execution is paused and the interpreter checks if any signal handlers need to be executed. During the same interval check, the global interpreter lock is also released by the current thread and then reacquired, giving other threads an opportunity to take over execution by grabbing the lock first.

The default check interval is 100 bytecodes and the current value can always be retrieved with sys.getcheckinterval(). Changing the interval with sys.setcheckinterval() may have an impact on the performance of an application, depending on the nature of the operations being performed.

import sys
import threading
from Queue import Queue
import time

def show_thread(q, extraByteCodes):
    for i in range(5):
        for j in range(extraByteCodes):
            pass
        q.put(threading.current_thread().name)
    return

def run_threads(prefix, interval, extraByteCodes):
    print '%(prefix)s interval = %(interval)s with %(extraByteCodes)s extra operations' % locals()
    sys.setcheckinterval(interval)
    q = Queue()
    threads = [ threading.Thread(target=show_thread, name='%s T%s' % (prefix, i), 
                                 args=(q, extraByteCodes)
                                 )
                for i in range(3)
              ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    while not q.empty():
        print q.get()
    print
    return

run_threads('Default', interval=10, extraByteCodes=1000)
run_threads('Custom', interval=10, extraByteCodes=0)

When the check interval is smaller than the number of bytecodes in a thread, the interpreter may give another thread control so that it runs for a while. This is illustrated in the first set of output where the check interval is set to 100 (the default) and 1000 extra loop iterations are performed for each step through the i loop.

On the other hand, when the check interval is greater than the number of bytecodes being executed by a thread that doesn’t release control for another reason, the thread will finish its work before the interval comes up. This is illustrated by the order of the name values in the queue in the second example.

$ python sys_checkinterval.py
Default interval = 10 with 1000 extra operations
Default T0
Default T0
Default T0
Default T1
Default T2
Default T2
Default T0
Default T1
Default T2
Default T0
Default T1
Default T2
Default T1
Default T2
Default T1

Custom interval = 10 with 0 extra operations
Custom T0
Custom T0
Custom T0
Custom T0
Custom T0
Custom T1
Custom T1
Custom T1
Custom T1
Custom T1
Custom T2
Custom T2
Custom T2
Custom T2
Custom T2

Modifying the check interval is not as clearly useful as it might seem. Many other factors may control the context switching behavior of Python’s threads. For example, if a thread performs I/O, it releases the GIL and may therefore allow another thread to take over execution.

import sys
import threading
from Queue import Queue
import time

def show_thread(q, extraByteCodes):
    for i in range(5):
        for j in range(extraByteCodes):
            pass
        #q.put(threading.current_thread().name)
        print threading.current_thread().name
    return

def run_threads(prefix, interval, extraByteCodes):
    print '%(prefix)s interval = %(interval)s with %(extraByteCodes)s extra operations' % locals()
    sys.setcheckinterval(interval)
    q = Queue()
    threads = [ threading.Thread(target=show_thread, name='%s T%s' % (prefix, i), 
                                 args=(q, extraByteCodes)
                                 )
                for i in range(3)
              ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    while not q.empty():
        print q.get()
    print
    return

run_threads('Default', interval=100, extraByteCodes=1000)
run_threads('Custom', interval=10, extraByteCodes=0)

This example is modified from the first so that the thread prints directly to sys.stdout instead of appending to a queue. The output is much less predictable.

$ python sys_checkinterval_io.py
Default interval = 100 with 1000 extra operations
Default T0
Default T1
Default T1Default T2

Default T0Default T2

Default T2
Default T2
Default T1
Default T2
Default T1
Default T1
Default T0
Default T0
Default T0

Custom interval = 10 with 0 extra operations
Custom T0
Custom T0
Custom T0
Custom T0
Custom T0
Custom T1
Custom T1
Custom T1
Custom T1
Custom T2
Custom T2
Custom T2
Custom T1Custom T2

Custom T2

See also

dis
Disassembling your Python code with the dis module is one way to count bytecodes.

Debugging

Identifying deadlocks can be on of the most difficult aspects of working with threads. sys._current_frames() can help by showing exactly where a thread is stopped.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
#!/usr/bin/env python
# encoding: utf-8

import sys
import threading
import time

io_lock = threading.Lock()
blocker = threading.Lock()

def block(i):
    t = threading.current_thread()
    with io_lock:
        print '%s with ident %s going to sleep' % (t.name, t.ident)
    if i:
        blocker.acquire() # acquired but never released
        time.sleep(0.2)
    with io_lock:
        print t.name, 'finishing'
    return

# Create and start several threads that "block"
threads = [ threading.Thread(target=block, args=(i,)) for i in range(3) ]
for t in threads:
    t.setDaemon(True)
    t.start()

# Map the threads from their identifier to the thread object
threads_by_ident = dict((t.ident, t) for t in threads)

# Show where each thread is "blocked"
time.sleep(0.01)
with io_lock:
    for ident, frame in sys._current_frames().items():
        t = threads_by_ident.get(ident)
        if not t:
            # Main thread
            continue
        print t.name, 'stopped in', frame.f_code.co_name, 
        print 'at line', frame.f_lineno, 'of', frame.f_code.co_filename

# Let the threads finish
# for t in threads:
#     t.join()

The dictionary returned by sys._current_frames() is keyed on the thread identifier, rather than its name. A little work is needed to map those identifiers back to the thread object.

Since Thread-1 does not sleep, it finishes before its status is checked. Since it is no longer active, it does not appear in the output. Thread-2 acquires the lock blocker, then sleeps for a short period. Meanwhile Thread-3 tries to acquire blocker but cannot because Thread-2 already has it.

$ python sys_current_frames.py
Thread-1 with ident 4315942912 going to sleep
Thread-2 with ident 4320149504 going to sleep
Thread-1 finishing
Thread-3 with ident 4324356096 going to sleep
Thread-3 stopped in block at line 16 of sys_current_frames.py
Thread-2 stopped in block at line 17 of sys_current_frames.py

See also

threading
The threading module includes classes for creating Python threads.
Queue
The Queue module provides a thread-safe implementation of a FIFO data structure.
Python Threads and the Global Interpreter Lock
Jesse Noller’s article from the December 2007 issue of Python Magazine.
Inside the Python GIL
Presentation by David Beazley describing thread implementation and performance issues, including how the check interval and GIL are related.
Bookmark and Share