Python Handbook

A Primer on Python's Advanced Usage

This handbook is written for experienced software engineers transitioning to or deepening their knowledge of Python. It moves beyond basic syntax to explore Python's internal architecture, memory model, performance characteristics, and advanced patterns. The goal is a fast, consumable guide to writing efficient, scalable, and robust Python code. The following topics are covered:

Language Fundamentals

Quick reference for Python's core syntax, control flow, and object-oriented constructs.

Memory & Concurrency

Understand Python's memory management and strategies for handling concurrency, including the GIL.

Data Structures

Dive into Python's built-in data structures and their performance characteristics.

I/O Operations

Explore efficient file, network, and asynchronous I/O handling.

Architecture & Patterns

Learn about Clean Architecture principles and Pythonic design patterns.

Performance & Gotchas

Advanced optimization tips and common pitfalls to avoid.

Language Fundamentals: Quick Reference

This section provides a quick overview of essential Python syntax for defining basic programs, controlling flow, and structuring code with classes and "interfaces." It's designed as a rapid refresher for experienced developers.

Basic Program & Output

The simplest Python program prints "Hello, World!" to the console.

# Hello, World!
print("Hello, World!")

# Basic variable assignment and f-string formatting
name = "Alice"
age = 30
print(f"My name is {name} and I am {age} years old.")

Control Flow: If/Else & Loops

Python uses indentation to delimit code blocks. Use `if/elif/else` for conditional logic, `for` to iterate over sequences, and `while` for condition-driven loops.

# Conditional Logic: if/elif/else
score = 85
if score >= 90:
    print("Grade: A")
elif score >= 80:
    print("Grade: B")
else:
    print("Grade: C or lower")

# For Loop: Iterating over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(f"I like {fruit}")

# For Loop: Using range
for i in range(3): # 0, 1, 2
    print(f"Iteration {i}")

# While Loop
count = 0
while count < 3:
    print(f"Count: {count}")
    count += 1

Defining Classes & Methods

Classes define blueprints for objects, encapsulating data (attributes) and behavior (methods). `__init__` is the constructor.

class Dog:
    # Class attribute
    species = "Canis familiaris"

    # Constructor method
    def __init__(self, name, age):
        self.name = name # Instance attribute
        self.age = age   # Instance attribute

    # Instance method
    def bark(self):
        return f"{self.name} says Woof!"

    # Another instance method
    def get_age_in_dog_years(self):
        return self.age * 7

# Creating objects (instances)
my_dog = Dog("Buddy", 3)
your_dog = Dog("Lucy", 5)

print(my_dog.name) # Output: Buddy
print(my_dog.bark()) # Output: Buddy says Woof!
print(your_dog.get_age_in_dog_years()) # Output: 35
print(Dog.species) # Accessing class attribute

Abstract Base Classes (ABCs) for "Interfaces"

Python doesn't have explicit interfaces like Java, but the `abc` module lets you define abstract base classes (ABCs) that force subclasses to implement specific methods, mimicking interface behavior.

from abc import ABC, abstractmethod

class Vehicle(ABC): # Inherit from ABC
    @abstractmethod
    def start(self):
        pass

    @abstractmethod
    def stop(self):
        pass

class Car(Vehicle):
    def start(self):
        print("Car engine started.")
    
    def stop(self):
        print("Car engine stopped.")

class Bicycle(Vehicle):
    def start(self):
        print("Bicycle started pedaling.")
    
    def stop(self):
        print("Bicycle stopped pedaling.")

# Usage
my_car = Car()
my_car.start()

my_bicycle = Bicycle()
my_bicycle.stop()

# This would raise a TypeError: Can't instantiate abstract class
# abstract_vehicle = Vehicle() 

Memory Management & Concurrency

This section dives into how Python handles memory and concurrency, which is critical for building high-performance applications. We'll explore automatic memory management via reference counting and garbage collection, and demystify the Global Interpreter Lock (GIL) and its implications for parallelism.

Automatic Memory Management

Reference Counting

The primary mechanism in CPython. Each object maintains a count of references to it. When a new reference to an object is created, its reference count increases; when a reference is removed or goes out of scope, the count decreases. Once an object's reference count drops to zero, it signifies that no part of the program is using it, and Python automatically deallocates the memory it occupied.

import sys

a = [] # Reference count of [] is 1
b = a  # Reference count of [] is 2
c = b  # Reference count of [] is 3

# print(sys.getrefcount(a)) # Output will be 4 (a, b, c, and the argument to getrefcount)

del a  # Reference count of [] is 2
del b  # Reference count of [] is 1

# When the last reference (c) is deleted, memory is reclaimed
del c

Generational Garbage Collector

This secondary mechanism exists specifically to detect and break reference cycles. It groups objects into three "generations" based on their age. Newer objects are checked more frequently, making the process efficient by focusing on objects most likely to become garbage.

import gc

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Create a reference cycle
a = Node(1)
b = Node(2)
a.next = b
b.next = a # a and b now reference each other

# Delete external references
del a
del b

# At this point, the two Node objects are still in memory: the cycle keeps
# their reference counts above zero.
# The generational garbage collector will eventually detect and reclaim them.
# gc.collect() # Manually trigger a full collection

# After collection, both objects are reclaimed.
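You can observe the collector directly: `gc.get_threshold()` reports the per-generation collection thresholds (the exact values are implementation-dependent), and `gc.collect()` returns the number of unreachable objects it found. A minimal sketch:

```python
import gc

class Node:
    """Two Nodes referencing each other form a cycle that
    reference counting alone cannot break."""
    def __init__(self, value):
        self.value = value
        self.next = None

# The three per-generation collection thresholds
print(gc.get_threshold())

a, b = Node(1), Node(2)
a.next, b.next = b, a
del a, b  # only the cycle keeps the objects alive now

unreachable = gc.collect()  # force a full collection
print(f"Collector found {unreachable} unreachable objects")
```

The count includes the objects' instance `__dict__`s, so it is typically higher than the number of cycle participants you created.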

The Global Interpreter Lock (GIL) & Concurrency Strategies

The GIL is a mutex in CPython that allows only one thread to execute Python bytecode at a time, even on multi-core processors. This simplifies memory management but limits true CPU-bound parallelism with threading. Choosing the right concurrency model is key.

  • Multi-threading — Best for: I/O-bound tasks (e.g., network requests, disk reads). GIL impact: limited for CPU work, but threads release the GIL while waiting on I/O. Key feature: shared memory simplifies data sharing between threads.
  • Multi-processing — Best for: CPU-bound tasks (e.g., heavy calculations, data processing). GIL impact: bypassed; each process has its own interpreter and GIL. Key feature: true parallelism across multiple CPU cores.
  • Asyncio — Best for: high-volume I/O-bound tasks (e.g., thousands of network connections). GIL impact: not applicable; runs on a single thread with an event loop. Key feature: highly efficient task switching with low overhead.

Example: Multi-threading (I/O-bound)

import threading
import time
import requests

def download_site(url):
    print(f"Starting download: {url}")
    response = requests.get(url) # This is an I/O-bound operation
    print(f"Finished download: {url}, size: {len(response.content)} bytes")

urls = [
    "https://www.example.com",
    "https://www.google.com",
    "https://www.bing.com",
]

# threads = []
# for url in urls:
#     thread = threading.Thread(target=download_site, args=(url,))
#     threads.append(thread)
#     thread.start()

# for thread in threads:
#     thread.join() # Wait for all threads to complete
# print("All downloads complete with threading.")

Example: Multi-processing (CPU-bound)

import multiprocessing
import time

def cpu_bound_task(n):
    print(f"Starting CPU-bound task for {n}...")
    sum_val = 0
    for i in range(n):
        sum_val += i * i # CPU-intensive calculation
    print(f"Finished CPU-bound task for {n}, sum: {sum_val}")
    return sum_val

numbers = [10**7, 10**7, 10**7]

# if __name__ == '__main__': # Required for multiprocessing on Windows/macOS
#     processes = []
#     for num in numbers:
#         process = multiprocessing.Process(target=cpu_bound_task, args=(num,))
#         processes.append(process)
#         process.start()

#     for process in processes:
#         process.join() # Wait for all processes to complete
#     print("All CPU-bound tasks complete with multiprocessing.")

Example: Asyncio (High-concurrency I/O)

import asyncio
import time

async def async_fetch_data(url):
    print(f"Async starting fetch: {url}")
    await asyncio.sleep(1) # Simulate async I/O operation (e.g., network call)
    print(f"Async finished fetch: {url}")
    return f"Data from {url}"

async def main_async():
    tasks = [
        async_fetch_data("http://api.service.com/1"),
        async_fetch_data("http://api.service.com/2"),
        async_fetch_data("http://api.service.com/3"),
    ]
    results = await asyncio.gather(*tasks) # Run coroutines concurrently
    print("All async fetches complete:", results)

# To run:
# if __name__ == '__main__':
#     asyncio.run(main_async())

Built-in Data Structures

Python's built-in data structures are highly optimized. Choosing the right one is fundamental to performance. This section compares their time complexities for common operations.

Performance Comparison

List

Mutable, ordered, dynamic arrays. Excellent for stacks and general collections. Performance suffers for insertions/deletions in the middle (`O(n)`) due to element shifting.

my_list = [1, 2, 3]
my_list.append(4) # Add to end: O(1) amortized
my_list.insert(0, 0) # Add to beginning: O(n)
item = my_list[2] # Access by index: O(1)
# print(my_list) # [0, 1, 2, 3, 4]
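When you need fast operations at both ends, the standard library's `collections.deque` (a doubly-linked list of blocks) avoids the O(n) shift that `list.insert(0, ...)` incurs; a brief sketch:

```python
from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)      # O(1), unlike list.insert(0, ...)
d.append(4)          # O(1)
first = d.popleft()  # O(1)
# Trade-off: indexing into the middle of a deque is O(n), unlike a list.
print(d)  # deque([1, 2, 3, 4])
```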

Tuple

Immutable, ordered collections. Ideal for fixed data like coordinates or records. Their immutability allows them to be used as dictionary keys.

my_tuple = (10, 20, "hello")
x = my_tuple[0] # Access by index: O(1)
# my_tuple[0] = 5 # TypeError: 'tuple' object does not support item assignment
# print(my_tuple) # (10, 20, 'hello')

Set

Mutable, unordered collections of unique elements, based on hash tables. Blazing fast (`O(1)`) for membership testing and removing duplicates.

my_set = {1, 2, 3, 2} # Duplicates are ignored
my_set.add(4) # Add element: O(1) average
is_present = 3 in my_set # Membership test: O(1) average
# print(my_set) # {1, 2, 3, 4} (order not guaranteed)
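Beyond membership tests, sets support fast algebraic operations, which are often the cleanest way to compare two collections:

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

print(evens | small)  # union
print(evens & small)  # intersection
print(evens - small)  # difference
print(evens ^ small)  # symmetric difference
```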

Dictionary

Mutable key-value stores, also based on hash tables. Extremely fast (`O(1)`) for lookups, insertions, and deletions by key. The workhorse of Python.

my_dict = {"name": "Alice", "age": 30}
my_dict["city"] = "New York" # Add/Update: O(1) average
name = my_dict["name"] # Lookup by key: O(1) average
# print(my_dict) # {'name': 'Alice', 'age': 30, 'city': 'New York'}
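The `collections` module builds on dict's O(1) lookups; `Counter` and `defaultdict` remove common aggregation boilerplate:

```python
from collections import Counter, defaultdict

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]

counts = Counter(words)        # tallies occurrences in one pass
print(counts.most_common(1))   # [('apple', 3)]

by_letter = defaultdict(list)  # missing keys default to a new list
for w in words:
    by_letter[w[0]].append(w)
print(dict(by_letter))
```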

I/O Operations

Efficient I/O is crucial for performance. Python offers robust tools for handling files, networks, and asynchronous operations. The key is to choose the right tool and use it idiomatically to prevent resource leaks and maximize throughput.

File I/O Best Practices

Always use the `with` statement for file operations. It guarantees that the file is closed correctly, even if errors occur, preventing resource leaks. For large files, use generators to iterate line-by-line instead of loading the entire file into memory.

# Good: Memory-efficient processing of a large file
def process_large_file(path):
    with open(path, 'r') as f:
        for line in f:
            yield line.strip().upper()

# Bad: Can cause MemoryError on very large files
def process_large_file_badly(path):
    with open(path, 'r') as f:
        lines = f.readlines() # Loads everything into memory
    return [line.strip().upper() for line in lines]
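For binary files, where line iteration doesn't apply, the same streaming idea works with fixed-size chunks. A sketch using `iter` with a sentinel (the chunk size here is an arbitrary choice):

```python
from functools import partial

def read_chunks(path, chunk_size=64 * 1024):
    """Yield fixed-size chunks without loading the whole file into memory."""
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns the sentinel b'' (end of file)
        for chunk in iter(partial(f.read, chunk_size), b''):
            yield chunk
```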

Network I/O: Sockets

The `socket` module provides low-level network access. For handling multiple clients concurrently without threads, use non-blocking sockets with the `select` module. This allows a single thread to monitor multiple sockets for I/O readiness, forming the basis of many high-performance servers.

import socket
import select
import time

# Client example for testing the servers
def socket_client(host, port, message):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(message.encode())
        data = s.recv(1024)
        print(f"Received from {host}:{port}: {data.decode()}")

# Basic Blocking Socket Server Example
def blocking_server():
    HOST = '127.0.0.1'
    PORT = 65432
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((HOST, PORT))
        s.listen()
        print(f"Blocking server listening on {HOST}:{PORT}")
        conn, addr = s.accept() # Blocks until a client connects
        with conn:
            print(f"Blocking server connected by {addr}")
            while True:
                data = conn.recv(1024) # Blocks until data is received
                if not data:
                    break
                conn.sendall(data.upper()) # Echo back uppercase
            print(f"Blocking server connection closed by {addr}")

# Basic Non-Blocking Socket Server with Select Example
def non_blocking_server():
    HOST = '127.0.0.1'
    PORT = 65433
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setblocking(False) # Set to non-blocking
    server_socket.bind((HOST, PORT))
    server_socket.listen(5)

    inputs = [server_socket]
    print(f"Non-blocking server listening on {HOST}:{PORT}")

    while True: # Keep running indefinitely
        readable, _, _ = select.select(inputs, [], [], 1.0) # 1.0s timeout
        if not readable:
            continue

        for sock in readable:
            if sock is server_socket:
                try:
                    conn, addr = sock.accept()
                    conn.setblocking(False)
                    inputs.append(conn)
                    print(f"Non-blocking server accepted connection from {addr}")
                except BlockingIOError:
                    pass # No incoming connection ready yet
            else:
                try:
                    data = sock.recv(1024)
                    if data:
                        sock.sendall(data.upper())
                    else:
                        print(f"Non-blocking server closing connection from {sock.getpeername()}")
                        inputs.remove(sock)
                        sock.close()
                except BlockingIOError:
                    pass # No data ready yet
                except ConnectionResetError:
                    print(f"Non-blocking server connection reset by peer {sock.getpeername()}")
                    inputs.remove(sock)
                    sock.close()

# To run these examples (e.g., in separate terminals or using threading):
# import threading
# threading.Thread(target=blocking_server).start()
# threading.Thread(target=non_blocking_server).start()
# time.sleep(1) # Give servers time to start
# threading.Thread(target=socket_client, args=('127.0.0.1', 65432, 'hello from blocking client')).start()
# threading.Thread(target=socket_client, args=('127.0.0.1', 65433, 'hello from non-blocking client')).start()

Asynchronous I/O: `asyncio`

`asyncio` is the modern standard for high-performance I/O-bound applications in Python. Using `async/await` syntax, it allows a single thread to manage thousands of concurrent connections efficiently via an event loop. It's ideal for web servers, database clients, and API gateways.

import asyncio
import time

async def async_fetch_data(url):
    print(f"Async starting fetch: {url}")
    await asyncio.sleep(1) # Simulate async I/O operation (e.g., network call)
    print(f"Async finished fetch: {url}")
    return f"Data from {url}"

async def main_async():
    tasks = [
        async_fetch_data("http://api.service.com/1"),
        async_fetch_data("http://api.service.com/2"),
        async_fetch_data("http://api.service.com/3"),
    ]
    results = await asyncio.gather(*tasks) # Run coroutines concurrently
    print("All async fetches complete:", results)

# To run:
# if __name__ == '__main__':
#     asyncio.run(main_async())

Clean Architecture & Design Patterns

Writing maintainable and scalable code goes beyond syntax. This section visualizes the principles of Clean Architecture and highlights key Pythonic design patterns that leverage the language's dynamic features for elegant solutions.

Clean Architecture Layers

Clean Architecture separates concerns into concentric layers with a strict dependency rule: dependencies only point inwards. This isolates your core business logic (Entities) from frameworks, databases, and UI, making the application adaptable, testable, and easier to maintain.

From innermost to outermost, the layers are: Entities, Use Cases, Adapters, and Frameworks & Drivers.

Practical Clean Architecture Examples

Here are conceptual Python code snippets illustrating how each layer of Clean Architecture might be implemented. This demonstrates the separation of concerns.

1. Entities (Domain Layer)

Pure business objects, independent of any framework or database.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class User:
    id: str
    name: str
    email: str
    # default_factory is evaluated at instantiation time; a plain
    # datetime.now() default would be computed once, at class definition
    created_at: datetime = field(default_factory=datetime.now)

    def is_active(self) -> bool:
        # Simple domain logic
        return True # Placeholder for more complex logic

2. Use Cases (Application Layer)

Application-specific business rules. Orchestrates entities and interacts with abstract interfaces (e.g., repositories).

from abc import ABC, abstractmethod
from typing import List, Optional
import uuid

# Abstract interface for user storage (defined in domain/application layer)
class UserRepository(ABC):
    @abstractmethod
    def get_by_id(self, user_id: str) -> Optional[User]:
        pass

    @abstractmethod
    def save(self, user: User) -> None:
        pass

    @abstractmethod
    def get_all(self) -> List[User]:
        pass

class CreateUserUseCase:
    def __init__(self, user_repo: UserRepository):
        self.user_repo = user_repo

    def execute(self, name: str, email: str) -> User:
        # Application-specific business logic
        if "@" not in email:
            raise ValueError("Invalid email format")
            
        user_id = str(uuid.uuid4())
        new_user = User(id=user_id, name=name, email=email)
        self.user_repo.save(new_user)
        return new_user

class GetUsersUseCase:
    def __init__(self, user_repo: UserRepository):
        self.user_repo = user_repo

    def execute(self) -> List[User]:
        return self.user_repo.get_all()

3. Adapters (Infrastructure Layer)

Concrete implementations of interfaces defined in the Use Cases layer. Connects to external services like databases.

# Example: In-memory implementation of UserRepository
class InMemoryUserRepository(UserRepository):
    def __init__(self):
        self._users = {} # Simulates a database

    def get_by_id(self, user_id: str) -> Optional[User]:
        return self._users.get(user_id)

    def save(self, user: User) -> None:
        self._users[user.id] = user

    def get_all(self) -> List[User]:
        return list(self._users.values())

# Example: Conceptual SQLAlchemy implementation (requires SQLAlchemy setup)
# from sqlalchemy import create_engine, Column, String, DateTime
# from sqlalchemy.orm import sessionmaker, declarative_base
# Base = declarative_base()

# class UserTable(Base):
#     __tablename__ = 'users'
#     id = Column(String, primary_key=True)
#     name = Column(String)
#     email = Column(String)
#     created_at = Column(DateTime)

# class SQLAlchemyUserRepository(UserRepository):
#     def __init__(self, session_factory):
#         self.session_factory = session_factory

#     def get_by_id(self, user_id: str) -> Optional[User]:
#         with self.session_factory() as session:
#             user_data = session.query(UserTable).filter_by(id=user_id).first()
#             return User(**user_data.__dict__) if user_data else None

#     def save(self, user: User) -> None:
#         with self.session_factory() as session:
#             user_table = UserTable(**user.__dict__)
#             session.add(user_table)
#             session.commit()

#     def get_all(self) -> List[User]:
#         with self.session_factory() as session:
#             return [User(**u.__dict__) for u in session.query(UserTable).all()]

4. Frameworks & Drivers (External Layer)

The outermost layer. Uses the Use Cases to drive the application, without the Use Cases knowing about the framework.

# Example: Flask Web Application (requires Flask)
# from flask import Flask, request, jsonify
# from dependency_injector.containers import DeclarativeContainer
# from dependency_injector.providers import Singleton, Factory

# # Simple Dependency Injection Container (conceptual)
# class Container(DeclarativeContainer):
#     user_repository = Singleton(InMemoryUserRepository) # Or SQLAlchemyUserRepository
#     create_user_use_case = Factory(CreateUserUseCase, user_repo=user_repository)
#     get_users_use_case = Factory(GetUsersUseCase, user_repo=user_repository)

# app = Flask(__name__)
# container = Container()

# @app.route('/users', methods=['POST'])
# def create_user_endpoint():
#     data = request.get_json()
#     name = data.get('name')
#     email = data.get('email')
#     try:
#         user = container.create_user_use_case().execute(name, email)
#         return jsonify({"id": user.id, "name": user.name, "email": user.email}), 201
#     except ValueError as e:
#         return jsonify({"error": str(e)}), 400

# @app.route('/users', methods=['GET'])
# def get_users_endpoint():
#     users = container.get_users_use_case().execute()
#     return jsonify([{"id": u.id, "name": u.name, "email": u.email} for u in users]), 200

# # To run:
# # if __name__ == '__main__':
# #     app.run(debug=True)
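Without a web framework, the wiring boils down to a composition root at the outermost layer: construct the concrete adapter, inject it into the use case, and drive it. A self-contained, framework-free sketch condensing the layers above (names match the earlier snippets, simplified for brevity):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
import uuid

@dataclass
class User:                      # Entity: pure domain object
    id: str
    name: str
    email: str
    created_at: datetime = field(default_factory=datetime.now)

class UserRepository(ABC):       # Port: abstraction the use case depends on
    @abstractmethod
    def save(self, user: User) -> None: ...

class CreateUserUseCase:         # Use case: knows only the abstraction
    def __init__(self, user_repo: UserRepository):
        self.user_repo = user_repo

    def execute(self, name: str, email: str) -> User:
        if "@" not in email:
            raise ValueError("Invalid email format")
        user = User(id=str(uuid.uuid4()), name=name, email=email)
        self.user_repo.save(user)
        return user

class InMemoryUserRepository(UserRepository):  # Adapter: concrete detail
    def __init__(self):
        self._users = {}
    def save(self, user: User) -> None:
        self._users[user.id] = user

# Composition root: the only place that names concrete classes
repo = InMemoryUserRepository()
use_case = CreateUserUseCase(user_repo=repo)
created = use_case.execute("Alice", "alice@example.com")
print(created.name, "stored:", created.id in repo._users)
```

Swapping `InMemoryUserRepository` for a database-backed adapter requires changing only this outermost wiring; the use case is untouched.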

Deployment Strategies (Briefcase)

Briefcase is a powerful tool that allows you to package Python applications for native distribution across multiple platforms, including desktop (Windows, macOS, Linux), mobile (iOS, Android), and even the web (via WebAssembly). It handles the complexities of creating platform-specific project structures, bundling dependencies, and generating native installers or app bundles.

Key Briefcase Commands

  • `briefcase new`: Initializes a new Briefcase project structure.
  • `briefcase create`: Generates the platform-specific project scaffolding (e.g., an Xcode project for iOS, an Android Studio project for Android, a native project for desktop). An optional platform argument targets a platform other than the current one.
  • `briefcase build`: Builds the native application artifact for the target platform (e.g., `.app` for macOS, `.exe` for Windows, `.apk` for Android).
  • `briefcase run`: Runs the built application on the target platform or emulator.
  • `briefcase dev`: Runs the application in a development environment without full packaging.

Cross-Platform Deployment with Briefcase

Briefcase abstracts away much of the platform-specific tooling, allowing developers to focus on their Python code. It integrates with underlying platform SDKs (like Xcode, Android SDK) to produce native applications.

  • Desktop Applications: Packages your Python app into native executables and installers for Windows (MSI), macOS (DMG), and Linux (AppImage, DEB, RPM). It bundles the Python interpreter and all dependencies.
  • Mobile Applications: Generates Xcode projects for iOS and Android Studio projects for Android. Your Python code runs within a native wrapper, leveraging platform capabilities.
  • Web Applications (via WebAssembly): Briefcase can also target web browsers by compiling Python to WebAssembly (WASM), often using tools like PyScript or similar technologies. This allows your Python application to run directly in a web browser, enabling rich client-side logic without server-side Python.

Example: Packaging for Web (Conceptual)

While the exact implementation details depend on the web backend (e.g., PyScript), the general flow involves Briefcase preparing your Python code and its dependencies to be compiled to WebAssembly and served as static web assets.

# Initialize a new Briefcase project (if not already done)
# briefcase new my-web-app

# Navigate into your project directory
# cd my-web-app

# Create the web project structure (this might involve a specific template for web)
# briefcase create web

# Build the web application (compiles Python to WASM, bundles assets)
# briefcase build web

# Run the web application (starts a local web server to serve the assets)
# briefcase run web

# This would typically generate a 'web' directory with HTML, JS, and WASM files
# that can be deployed to any static web host.

Performance & Common "Gotchas"

Even experienced developers can be caught by Python's peculiarities. This section covers key performance optimization strategies and explains common pitfalls, with details and best practices for each.

Key Optimization Strategies

  • Profile First: Don't guess. Use tools like `cProfile` and `line_profiler` to find actual bottlenecks before optimizing.
  • Use Built-ins: Python's built-in functions (`sum`, `len`) and standard library modules are often C-optimized and faster than manual implementations.
  • Embrace Generators: Use generators (`yield`) and generator expressions for large datasets to drastically reduce memory consumption.
  • Cache with `lru_cache`: For expensive functions with repeated calls, cache previously computed results using `@functools.lru_cache`.

Example: Profiling with `cProfile`

import cProfile
import time

def expensive_function():
    time.sleep(0.1)
    sum(range(10**6))

def another_function():
    time.sleep(0.05)
    [x*x for x in range(10**5)]

def main_program():
    for _ in range(5):
        expensive_function()
    another_function()

# To run the profiler:
# cProfile.run('main_program()')
# This will print a detailed report of function calls and times.
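While `cProfile` attributes time across a whole program, the standard library's `timeit` module is better suited to micro-benchmarking a single statement, since it repeats the statement many times for stable numbers. A small sketch (the workloads are arbitrary):

```python
import timeit

# Compare two ways of building the same list
loop_time = timeit.timeit(
    "out = []\nfor i in range(1000): out.append(i * i)", number=1000)
comp_time = timeit.timeit(
    "[i * i for i in range(1000)]", number=1000)

print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

The list comprehension usually wins because the loop body runs in C rather than as interpreted bytecode, but always measure on your own workload.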

Example: Generators for Memory Efficiency

import sys

# List comprehension (loads all into memory)
my_list = [i * 2 for i in range(1000000)]
# print(f"List size: {sys.getsizeof(my_list) / (1024*1024):.2f} MB")

# Generator expression (produces values on demand)
my_generator = (i * 2 for i in range(1000000))
# print(f"Generator size: {sys.getsizeof(my_generator):.2f} bytes") # Much smaller!

# You can iterate over a generator
# for value in my_generator:
#     pass # Process value here

Example: Caching with `functools.lru_cache`

import functools
import time

@functools.lru_cache(maxsize=None) # maxsize=None means unlimited cache
def fibonacci(n):
    if n <= 1:
        return n
    time.sleep(0.001) # Simulate expensive computation
    return fibonacci(n-1) + fibonacci(n-2)

# First call is slow
# start_time = time.time()
# fibonacci(30)
# print(f"First call: {time.time() - start_time:.4f} seconds")

# Subsequent calls with same input are fast (cached)
# start_time = time.time()
# fibonacci(30)
# print(f"Second call: {time.time() - start_time:.4f} seconds")

Common "Gotchas"

Mutable Default Arguments

The most famous Python pitfall.

The Problem

A function's default arguments are created ONCE, when the function is defined. If a default is mutable (like a list), it's shared across all calls, leading to unexpected behavior.

# Bad:
def add_item_bad(item, items=[]):
    items.append(item)
    return items
# print(add_item_bad(1)) # [1]
# print(add_item_bad(2)) # [1, 2] - Oops!

# Correct:
def add_item_good(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
# print(add_item_good(1)) # [1]
# print(add_item_good(2)) # [2] - Correct!
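You can see the shared object directly: a function's defaults live in its `__defaults__` attribute, created once at definition time:

```python
def add_item_bad(item, items=[]):
    items.append(item)
    return items

print(add_item_bad.__defaults__)  # ([],) - the single shared list
add_item_bad(1)
add_item_bad(2)
print(add_item_bad.__defaults__)  # ([1, 2],) - it has grown!
```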

Lambda Closures in Loops

A subtle scoping issue.

The Problem

Lambdas defined in a loop don't capture the loop variable's value at each iteration. They capture the variable itself, so they all see its final value after the loop finishes.

# Bad:
funcs_bad = []
for i in range(5):
    funcs_bad.append(lambda: i*2)
# print(funcs_bad[0]()) # All print 8 (4*2)

# Correct: Capture value with default arg
funcs_good = [lambda i=i: i*2 for i in range(5)]
# print(funcs_good[0]()) # 0
# print(funcs_good[3]()) # 6

Identity vs. Equality

`is` vs `==`

The Problem

`==` checks if values are equal. `is` checks if two variables point to the exact same object in memory. CPython caches small integers (-5 to 256) and some strings, which can make `is` behave unexpectedly.

a = 257
b = 257
# print(a == b) # True (values are equal)
# print(a is b) # Context-dependent: False in a REPL; may be True in a
#               # script, where identical constants can be shared

x = 10
y = 10
# print(x is y) # True (cached small integer)

list1 = [1, 2]
list2 = [1, 2]
# print(list1 == list2) # True
# print(list1 is list2) # False (different list objects)

Shallow vs. Deep Copies

Beware of nested mutable objects.

The Problem

A shallow copy creates a new collection but populates it with references to the original's elements. If elements are mutable, changes in one copy affect the other. A deep copy recursively duplicates all elements.

import copy

original = [1, [2, 3], 4]

# Shallow copy
shallow_copy = list(original) # or original[:]
shallow_copy[1].append(5)
# print(original) # [1, [2, 3, 5], 4] - Oops!

# Deep copy
deep_copy = copy.deepcopy(original)
deep_copy[1].append(6)
# print(original) # [1, [2, 3, 5], 4] - Original unchanged!

Modifying List While Iterating

Leads to skipped elements or errors.

The Problem

Modifying a list (adding/removing elements) while iterating over it can lead to unexpected behavior, such as skipping elements or `IndexError`.

numbers = [1, 2, 3, 4]
# Bad:
# for num in numbers:
#     if num % 2 == 0:
#         numbers.remove(num)
# print(numbers) # [1, 3] - but 3 was never visited: removals shift the indices

# Correct: Iterate over a copy or build new list
numbers = [1, 2, 3, 4]
new_numbers = [num for num in numbers if num % 2 != 0]
# print(new_numbers) # [1, 3]
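If the list must be mutated in place (for example, because other code holds a reference to it), slice assignment replaces the contents without rebinding the name:

```python
numbers = [1, 2, 3, 4]
alias = numbers  # another reference to the same list object

numbers[:] = [n for n in numbers if n % 2 != 0]  # mutate in place
print(numbers)  # [1, 3]
print(alias)    # [1, 3] - the alias sees the change too
```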

Global vs. Local Variables

Shadowing and unexpected assignments.

The Problem

If you assign to a variable inside a function, it becomes local unless explicitly declared `global` or `nonlocal`. This can lead to shadowing global variables or `UnboundLocalError`.

x = 10 # Global variable

def func_bad():
    x = 5 # This creates a NEW local 'x'
    # print(x) # 5

def func_good():
    global x # Refer to the global 'x'
    x = 5
    # print(x) # 5

# func_bad()
# print(x) # 10 (global x is unchanged)
# func_good()
# print(x) # 5 (global x is changed)
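The `UnboundLocalError` case is worth seeing explicitly, and `nonlocal` covers the analogous situation for enclosing (non-global) scopes:

```python
x = 10  # global

def shadowing():
    try:
        print(x)  # raises UnboundLocalError: 'x' is local here...
    except UnboundLocalError:
        print("UnboundLocalError: 'x' referenced before assignment")
    x = 5         # ...because this assignment makes it local for the
    return x      # WHOLE function body, even lines above it

def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind enclosing 'count' instead of a new local
        count += 1
        return count
    return increment

print(shadowing())           # 5
counter = make_counter()
print(counter(), counter())  # 1 2
```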

Late Binding Closures (Non-Lambda)

Similar to lambda, but with regular functions.

The Problem

When creating functions in a loop that refer to the loop variable, they "close over" the variable itself, not its value at the time of definition. All functions will use the variable's final value.

# Bad:
def create_multipliers_bad():
    multipliers = []
    for i in range(3):
        def multiplier():
            return i * 10
        multipliers.append(multiplier)
    return multipliers
# funcs = create_multipliers_bad()
# print(funcs[0]()) # All print 20 (2*10)

# Correct: Pass as default argument
def create_multipliers_good():
    multipliers = []
    for i in range(3):
        def multiplier(j=i): # Capture 'i' as default 'j'
            return j * 10
        multipliers.append(multiplier)
    return multipliers
# funcs = create_multipliers_good()
# print(funcs[0]()) # 0
# print(funcs[1]()) # 10

`is` with Strings and Interning

String interning can be tricky.

The Problem

CPython "interns" (caches) short strings and strings that look like identifiers to save memory. This means `is` might unexpectedly return `True` for two identical string literals.

s1 = "hello"
s2 = "hello"
# print(s1 is s2) # True (interned)

s3 = "hello world"
s4 = "hello world"
# print(s3 is s4) # Context-dependent: often True in a script (identical
#                 # constants are shared), often False in a REPL

s5 = "a" * 50             # constant-folded at compile time
s6 = "".join(["a"] * 50)  # built at runtime: always a distinct object
# print(s5 == s6) # True (same content)
# print(s5 is s6) # False (runtime-built strings are not interned)

# Always use == for string content comparison.
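When you genuinely want identity semantics for many repeated strings (e.g., keys parsed from a large file), `sys.intern` makes interning explicit and deterministic:

```python
import sys

# Build the strings at runtime so the compiler can't fold them into one constant
a = "".join(["data", "-", "key"])
b = "".join(["data", "-", "key"])
print(a is b)   # False - two distinct objects with equal content

ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True - interning guarantees a single shared object
print(ia == a)   # True - the content is unchanged
```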

Unpacking Errors

`ValueError` on unpacking.

The Problem

When unpacking sequences (like tuples or lists) into variables, the number of variables must exactly match the number of elements in the sequence, or a `ValueError` occurs.

data = (1, 2, 3)

# Correct:
a, b, c = data
# print(a, b, c) # 1 2 3

# Bad: too few variables for three values
# x, y = data # ValueError: too many values to unpack (expected 2)

# Bad: too many variables for three values
# p, q, r, s = data # ValueError: not enough values to unpack (expected 4, got 3)

# Use * for flexible unpacking (Python 3+)
first, *rest, last = (1, 2, 3, 4, 5)
# print(first, rest, last) # 1 [2, 3, 4] 5