
Python Performance Optimization

Writing efficient Python code involves choosing the right data structures, avoiding unnecessary work, and measuring performance before optimizing.

Measure Before Optimizing

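"Premature optimization is the root of all evil" (Donald Knuth). Profile your code first to find the actual bottlenecks instead of guessing: use time.perf_counter() from the time module for quick wall-clock timings, timeit for micro-benchmarks of small snippets, and cProfile for a per-function breakdown of a whole program.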
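As a rough sketch of what "profile first" can look like, the snippet below runs a small workload under cProfile and captures the report as text. The functions slow_sum and workload are hypothetical names made up for this illustration; only cProfile, pstats, and io come from the standard library.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical entry point that calls slow_sum several times."""
    return [slow_sum(20_000) for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

# Format the collected statistics, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists call counts and time per function, which is usually enough to see where the time actually goes before changing any code.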

Choosing the Right Data Structure

Use the right tool for the job: lists for ordered collections, sets for fast membership testing (in is O(1) on average for sets vs O(n) for lists), dictionaries for key-based lookups (also O(1) on average), and tuples for fixed, immutable data. Set elements and dictionary keys must be hashable.
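One way to check the membership claim on your own machine is to time both lookups with timeit. This is a minimal sketch; the collection size and the repeat count of 200 are arbitrary choices, and the absolute numbers will differ between machines.

```python
import timeit

items_list = list(range(100_000))
items_set = set(items_list)
target = 99_999  # worst case for the list: the last element

# timeit.timeit accepts a callable; number controls how many times it runs.
list_time = timeit.timeit(lambda: target in items_list, number=200)
set_time = timeit.timeit(lambda: target in items_set, number=200)

print(f"list: {list_time:.6f} s for 200 lookups")
print(f"set:  {set_time:.6f} s for 200 lookups")
```

The list lookup scans elements one by one, while the set lookup hashes the target and jumps to its slot, so the gap grows with the size of the collection.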

Avoiding Unnecessary Work

Avoid repeated computation inside loops — calculate values once outside the loop if they don't change. Avoid building large intermediate lists when a generator would do.
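The two ideas above can be sketched as follows. The functions normalize_slow and normalize_fast are hypothetical examples written for this illustration: the first recomputes a loop-invariant value on every iteration, the second computes it once.

```python
import math


def normalize_slow(values):
    """Recomputes the same norm on every iteration."""
    result = []
    for v in values:
        norm = math.sqrt(sum(x * x for x in values))  # loop-invariant work
        result.append(v / norm)
    return result


def normalize_fast(values):
    """Computes the norm once, before the loop."""
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]


data = [3.0, 4.0]
print(normalize_fast(data))

# A generator expression feeds sum() one value at a time,
# so no intermediate list of squares is ever built.
total = sum(x * x for x in range(1_000_000))
print(total)
```

Both functions return the same values; the slow version simply does work proportional to the square of the input length.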

List Comprehensions vs Loops

List comprehensions are often faster than equivalent for-loops with .append() because the interpreter appends each item with a dedicated bytecode instruction instead of looking up and calling the .append method on every iteration.

String Concatenation

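Repeatedly concatenating strings with + in a loop can be slow: strings are immutable, so each concatenation may create a new string and copy everything built so far, which makes the loop quadratic in the worst case. Collect the pieces in a list and use "".join(list_of_strings) instead.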

Using Built-in Functions and Libraries

In CPython, built-in functions (sum(), max(), sorted()) are implemented in C, as are the core routines of libraries like NumPy, so they are usually much faster than hand-written Python loops for the same task.
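A small comparison, assuming CPython, can make this concrete. The helper manual_sum is a hypothetical hand-written loop used only for this illustration, and the timings it prints will vary between machines.

```python
import timeit


def manual_sum(values):
    """Hand-written loop, executed as Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(100_000))

# Both approaches must agree before their speed is compared.
assert manual_sum(values) == sum(values)

loop_time = timeit.timeit(lambda: manual_sum(values), number=50)
builtin_time = timeit.timeit(lambda: sum(values), number=50)

print(f"manual loop:  {loop_time:.4f} s")
print(f"built-in sum: {builtin_time:.4f} s")
```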

Caching with functools.lru_cache

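For expensive functions called repeatedly with the same arguments, @functools.lru_cache caches results automatically, avoiding redundant computation. The arguments must be hashable, and the function should return the same result for the same inputs, because cached calls skip the function body. maxsize=None makes the cache unbounded; a finite maxsize evicts the least recently used entries.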
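To see the cache at work, the sketch below records how often the function body really runs and inspects the statistics that lru_cache exposes. The function expensive_square and the calls list are hypothetical, chosen only for illustration.

```python
from functools import lru_cache

calls = []  # records every argument for which the function body actually runs


@lru_cache(maxsize=128)
def expensive_square(n):
    """Hypothetical stand-in for an expensive computation."""
    calls.append(n)
    return n * n


expensive_square(12)  # miss: the body runs
expensive_square(12)  # hit: the result comes from the cache
expensive_square(7)   # miss: new argument

info = expensive_square.cache_info()
print(info)   # named tuple with hits, misses, maxsize, currsize
print(calls)  # arguments that triggered a real computation

expensive_square.cache_clear()  # empties the cache, e.g. when results go stale
```

cache_info() is a quick way to confirm that a cache is being hit often enough to justify the memory it uses.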

Syntax

<pre><code>import time
from functools import lru_cache

# Measuring execution time
start = time.perf_counter()
# ... code to measure ...
end = time.perf_counter()
print(f"Took {end - start:.4f} seconds")

# Set vs List for membership testing
numbers_list = list(range(100000))
numbers_set = set(range(100000))
print(99999 in numbers_set)   # much faster: O(1)
print(99999 in numbers_list)  # slower: O(n)

# Efficient string concatenation
words = ["Python", "is", "fast"]
sentence = " ".join(words)   # good
# Avoid building the string piece by piece:
#   sentence = ""
#   for w in words:
#       sentence += w + " "

# List comprehension vs loop
squares_loop = []
for i in range(1000):
    squares_loop.append(i ** 2)

squares_comp = [i ** 2 for i in range(1000)]  # typically faster

# Caching expensive function calls
@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))  # fast, thanks to caching

# Using generators to save memory
def squares_generator(n):
    for i in range(n):
        yield i ** 2

total = sum(squares_generator(1000000))  # doesn't build a huge list in memory
</code></pre>

Revision Notes

• Profile first (time, timeit, cProfile) — don't guess at bottlenecks
• Use sets/dicts for O(1) membership tests instead of lists (O(n))
• "".join(list) instead of += for string concatenation in loops
• List comprehensions are often faster than manual append loops
• @lru_cache caches repeated function calls with the same arguments
• Generators avoid building large lists in memory

Optimize Fibonacci with Caching

Medium

Write a function fast_fibonacci(n) that computes the nth Fibonacci number (0-indexed, fib(0)=0, fib(1)=1) using @functools.lru_cache to avoid redundant recursive calls.

Input:

fast_fibonacci(10)

Output:

55

Hint:

Apply @lru_cache(maxsize=None) above a recursive function: if n < 2 return n, else return fast_fibonacci(n-1) + fast_fibonacci(n-2).

Solution:

<pre><code>from functools import lru_cache

@lru_cache(maxsize=None)
def fast_fibonacci(n):
    if n < 2:
        return n
    return fast_fibonacci(n - 1) + fast_fibonacci(n - 2)

print(fast_fibonacci(10))
</code></pre>
