Python Handbook for Backend Engineers Interview
How to use this handbook
Read Theory → Understand internals → Study code → Practice → Master edge cases
Subsection Workflow
For each subsection:
- Read the Theory carefully
- Type all code snippets by hand (avoid copy-paste)
- Execute the code in your terminal or Python shell
- Modify inputs and observe behavior
- Test edge cases intentionally
The handbook is your map. You still have to walk the path. Start today, not tomorrow.
SECTION 1 - PYTHON DATA STRUCTURES (Deep Internals)
1.1 List []
Theory - What Is a List Internally?
A Python list is not a linked list. It is a dynamic array ā a contiguous block of memory holding pointers (references) to Python objects.
Memory Layout of list [10, 20, 30]:
list object:
┌───────────────────────────────────┐
│ ob_size   = 3  (current length)   │
│ allocated = 4  (capacity)         │
│ ob_item ──────────────┐           │
└───────────────────────┼───────────┘
                        ▼
┌───────┬───────┬───────┬───────┐
│  ptr  │  ptr  │  ptr  │ null  │ ← array of PyObject* pointers
│  →10  │  →20  │  →30  │       │
└───────┴───────┴───────┴───────┘
Key insight: The list stores POINTERS, not actual objects.
That's why [1, "hello", 3.14, None] is perfectly valid.
Amortized Resizing: When the array is full, Python allocates a new larger array (~1.125x + constant), copies all pointers, and frees the old array. This makes append() amortized O(1).
CPython growth pattern for allocated slots:
0 ā 4 ā 8 ā 16 ā 24 ā 32 ā 40 ā 52 ā 64 ā 76 ā ...
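A quick way to observe this over-allocation from Python itself (exact byte counts vary across CPython versions and platforms, but the plateau-then-jump shape is stable):

```python
import sys

# Append repeatedly and record the list object's size.
lst = []
sizes = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# The size stays flat while spare capacity is consumed,
# then jumps when the backing array is reallocated.
print(sizes)
```

The long flat runs between jumps are the pre-allocated slots that make append() amortized O(1).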
Time & Space Complexity Table
| Operation | Example | Time | Space | Notes |
|---|---|---|---|---|
| Index access | lst[i] | O(1) | O(1) | Direct pointer arithmetic |
| Index assign | lst[i] = x | O(1) | O(1) | Replace pointer |
| Append | lst.append(x) | O(1)* | O(1) | *Amortized |
| Pop last | lst.pop() | O(1) | O(1) | No shifting needed |
| Pop at index | lst.pop(i) | O(n) | O(1) | Must shift elements left |
| Insert at index | lst.insert(i, x) | O(n) | O(1) | Must shift elements right |
| Delete by value | lst.remove(x) | O(n) | O(1) | Search + shift |
| Search | x in lst | O(n) | O(1) | Linear scan |
| Index of | lst.index(x) | O(n) | O(1) | Linear scan |
| Count | lst.count(x) | O(n) | O(1) | Full scan |
| Length | len(lst) | O(1) | O(1) | Stored as attribute |
| Sort | lst.sort() | O(n log n) | O(n) | Timsort algorithm |
| Reverse | lst.reverse() | O(n) | O(1) | In-place swap |
| Slice | lst[a:b] | O(b-a) | O(b-a) | Creates new list |
| Extend | lst.extend(other) | O(k) | O(k) | k = len(other) |
| Copy | lst.copy() | O(n) | O(n) | Shallow copy |
| Concatenate | lst1 + lst2 | O(n+m) | O(n+m) | New list created |
All List Methods - With Line-by-Line Comments
# ================================================================
# CREATION
# ================================================================
lst = [] # empty list
lst = [1, 2, 3] # literal syntax
lst = list(range(5)) # [0,1,2,3,4] → from any iterable
lst = list("hello") # ['h','e','l','l','o'] → from string
lst = [0] * 5 # [0,0,0,0,0] → ⚠️ careful with mutables!
# ================================================================
# ADDING ELEMENTS
# ================================================================
lst = [1, 2, 3]
lst.append(4)
# → [1, 2, 3, 4]
# append ALWAYS adds the object AS-IS (single element)
lst.append([5, 6])
# → [1, 2, 3, 4, [5, 6]] → nested! NOT flat!
# append does NOT unpack iterables
lst.extend([7, 8])
# → [1, 2, 3, 4, [5, 6], 7, 8]
# extend unpacks the iterable → adds each element individually
# equivalent to: for x in [7,8]: lst.append(x)
lst.insert(0, 99)
# → [99, 1, 2, 3, 4, [5, 6], 7, 8]
# insert(index, element) → shifts everything from index onwards → O(n)
lst += [10, 11]
# += is the same as extend → modifies in place
# NOT the same as lst = lst + [10, 11] (which creates a new list)
# ================================================================
# REMOVING ELEMENTS
# ================================================================
lst = [10, 20, 30, 20, 40]
val = lst.pop()
# → returns 40, lst = [10, 20, 30, 20]
# pop() with no argument removes the LAST element → O(1)
val = lst.pop(1)
# → returns 20, lst = [10, 30, 20]
# pop(i) removes at index i → O(n) because of shifting
lst.remove(30)
# → lst = [10, 20]
# remove deletes the FIRST occurrence of the VALUE
# ⚠️ raises ValueError if value not found!
del lst[0]
# → lst = [20]
# del by index → statement, not method
lst = [1, 2, 3, 4, 5]
del lst[1:3]
# → [1, 4, 5] → delete a slice
lst.clear()
# → [] → empties the list, same object (id unchanged)
# ================================================================
# SEARCHING / QUERYING
# ================================================================
lst = [10, 20, 30, 20, 40]
20 in lst # True → O(n) membership test
lst.index(20) # 1 → FIRST index of value
lst.index(20, 2) # 3 → search starting from index 2
# ⚠️ lst.index(99) → ValueError if not found!
lst.count(20) # 2 → count all occurrences
len(lst) # 5 → O(1), stored internally
# ================================================================
# SORTING AND REVERSING
# ================================================================
lst = [3, 1, 4, 1, 5, 9, 2, 6]
lst.sort()
# sorts IN PLACE → [1, 1, 2, 3, 4, 5, 6, 9]
# returns None! → common interview trap
lst.sort(reverse=True) # descending: [9, 6, 5, 4, 3, 2, 1, 1]
lst.sort(key=abs) # sort by absolute value
new_lst = sorted(lst)
# returns a NEW sorted list → original UNCHANGED
# sorted() works on any iterable, sort() only on lists
lst.reverse()
# reverses IN PLACE → returns None
# alternative: reversed(lst) returns an iterator (no new list)
# ================================================================
# COPYING
# ================================================================
lst = [1, [2, 3], 4]
shallow = lst.copy() # same as lst[:] or list(lst)
# ⚠️ shallow copy: nested objects are SHARED REFERENCES
shallow[1].append(99) # lst ALSO becomes [1, [2, 3, 99], 4]!
import copy
deep = copy.deepcopy(lst) # recursively copies EVERYTHING
deep[1].append(100) # lst unchanged → fully independent
Slicing Tricks
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Syntax: lst[start : stop : step]
# start = inclusive, stop = exclusive, step = direction & skip
lst[2:5] # [2, 3, 4] → index 2 up to (not including) 5
lst[:3] # [0, 1, 2] → from beginning
lst[7:] # [7, 8, 9] → from index 7 to end
lst[::2] # [0, 2, 4, 6, 8] → every 2nd element (even indices)
lst[1::2] # [1, 3, 5, 7, 9] → every 2nd starting at index 1
lst[::-1] # [9,8,7,6,5,4,3,2,1,0] → REVERSE (creates new list)
lst[2:6][::-1] # [5, 4, 3, 2] → reverse a sub-portion
lst[-1] # 9 → last element (negative index from end)
lst[-3:] # [7, 8, 9] → last 3 elements
lst[:-2] # [0,1,2,3,4,5,6,7] → all except last 2
# Assign to slice (in-place modification):
lst[2:4] = [20, 30] # replace elements at index 2 and 3
lst[1:1] = [99] # INSERT at index 1 (nothing removed)
lst[2:5] = [] # delete elements at index 2, 3, 4
# Copy with slice:
copy = lst[:] # shallow copy of entire list
List as Stack vs Queue
# ✅ LIST AS STACK (LIFO) → Efficient
stack = []
stack.append(1) # push → O(1)
stack.append(2)
stack.append(3)
top = stack[-1] # peek → O(1) (no removal)
val = stack.pop() # pop → O(1), returns 3
# ❌ LIST AS QUEUE (FIFO) → Inefficient!
queue = []
queue.append(1) # enqueue at end → O(1)
queue.append(2)
front = queue.pop(0) # dequeue from front → O(n)!
# WHY O(n)? Every remaining element must shift left:
# Before: [1, 2, 3, 4, 5]
# After: [2, 3, 4, 5] → indices 1,2,3,4 all shift to 0,1,2,3
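The shifting cost is easy to measure. A minimal timeit sketch comparing list-as-queue against collections.deque (absolute timings are machine-dependent; the gap is the point):

```python
from collections import deque
from timeit import timeit

n = 20_000

def drain_list():
    q = list(range(n))
    while q:
        q.pop(0)        # O(n) per dequeue → O(n^2) total

def drain_deque():
    q = deque(range(n))
    while q:
        q.popleft()     # O(1) per dequeue → O(n) total

t_list = timeit(drain_list, number=1)
t_deque = timeit(drain_deque, number=1)
print(f"list.pop(0): {t_list:.4f}s   deque.popleft: {t_deque:.4f}s")
```

On typical hardware the deque version is orders of magnitude faster at this size.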
# ✅ USE collections.deque FOR O(1) QUEUE
from collections import deque
q = deque()
q.append(1) # enqueue right → O(1)
q.append(2)
val = q.popleft() # dequeue left → O(1)
# deque is internally a doubly-linked list of fixed-size blocks
Common Patterns on Lists
# ================================================================
# PATTERN 1: TWO POINTERS ā Opposite Ends
# Use case: palindrome check, two-sum on sorted array
# ================================================================
def is_palindrome(s: str) -> bool:
    left, right = 0, len(s) - 1  # start at both ends
    while left < right:          # move toward center
        if s[left] != s[right]:  # mismatch found
            return False
        left += 1                # move inward
        right -= 1
    return True                  # all matched
# ================================================================
# PATTERN 2: TWO POINTERS ā Same Direction (Fast & Slow)
# Use case: remove duplicates from sorted array in-place
# ================================================================
def remove_duplicates(nums: list) -> int:
    if not nums:
        return 0
    slow = 0                          # slow = write position
    for fast in range(1, len(nums)):  # fast = read position
        if nums[fast] != nums[slow]:  # found new unique element
            slow += 1                 # advance write position
            nums[slow] = nums[fast]   # write unique element
    return slow + 1                   # length of unique portion
# ================================================================
# PATTERN 3: SLIDING WINDOW ā Fixed Size
# Use case: maximum sum of subarray of size k
# ================================================================
def max_sum_subarray(nums: list, k: int) -> int:
    window_sum = sum(nums[:k])  # sum of first window
    max_sum = window_sum
    for i in range(k, len(nums)):
        window_sum += nums[i]      # add element entering window
        window_sum -= nums[i - k]  # remove element leaving window
        max_sum = max(max_sum, window_sum)
    return max_sum
# Time: O(n), Space: O(1)
# ================================================================
# PATTERN 4: SLIDING WINDOW ā Variable Size
# Use case: smallest subarray with sum >= target
# ================================================================
def min_subarray_len(target: int, nums: list) -> int:
    left = 0
    current_sum = 0
    min_len = float('inf')
    for right in range(len(nums)):    # expand right edge
        current_sum += nums[right]
        while current_sum >= target:  # shrink from left while valid
            min_len = min(min_len, right - left + 1)
            current_sum -= nums[left]
            left += 1
    return min_len if min_len != float('inf') else 0
# Time: O(n) → each element visited at most twice
Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Modifying list while iterating
# ================================================================
lst = [1, 2, 3, 4, 5]
for x in lst:
    if x % 2 == 0:
        lst.remove(x) # ❌ DANGER! Skips elements!
# Result might look correct but is UNRELIABLE
# ✅ FIX 1: list comprehension (cleanest)
lst = [x for x in lst if x % 2 != 0]
# ✅ FIX 2: iterate over a copy
for x in lst[:]:
    if x % 2 == 0:
        lst.remove(x)
# ================================================================
# TRAP 2: Shallow copy of nested list
# ================================================================
grid = [[0] * 3] * 3 # ❌ All 3 rows are the SAME object!
grid[0][0] = 1
print(grid) # [[1,0,0],[1,0,0],[1,0,0]] → ALL rows changed!
grid = [[0] * 3 for _ in range(3)] # ✅ Each row is independent
grid[0][0] = 1
print(grid) # [[1,0,0],[0,0,0],[0,0,0]] → Only first row
# ================================================================
# TRAP 3: Mutable default argument
# ================================================================
def add_item(item, lst=[]): # ❌ lst is shared across ALL calls!
    lst.append(item)
    return lst
add_item(1) # [1]
add_item(2) # [1, 2] → NOT [2]!
def add_item(item, lst=None): # ✅ FIX: use None as sentinel
    if lst is None:
        lst = [] # create fresh list each call
    lst.append(item)
    return lst
# ================================================================
# TRAP 4: sort() returns None
# ================================================================
lst = [3, 1, 2]
result = lst.sort() # ❌ result = None !
result = sorted(lst) # ✅ result = [1, 2, 3], lst unchanged
# ================================================================
# TRAP 5: is vs == for list comparison
# ================================================================
[1, 2, 3] == [1, 2, 3] # True → value comparison
[1, 2, 3] is [1, 2, 3] # False → different objects in memory
# Always use == for value comparison
# Use 'is' ONLY for None checks: if x is None
Interview Tricks
# Enumerate → index + value together
for i, val in enumerate(["a", "b", "c"], start=1):
    print(i, val) # 1 a, 2 b, 3 c
# Zip → iterate parallel lists
names = ["Alice", "Bob"]
scores = [95, 87]
for name, score in zip(names, scores):
    print(f"{name}: {score}") # Alice: 95, Bob: 87
# Star unpacking
first, *middle, last = [1, 2, 3, 4, 5]
# first=1, middle=[2,3,4], last=5
# Flatten nested list → one-liner
nested = [[1, 2], [3, 4], [5]]
flat = [x for sub in nested for x in sub] # [1, 2, 3, 4, 5]
# Read: "for each sub in nested, for each x in sub, collect x"
# Transpose a matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = list(zip(*matrix)) # [(1,4,7),(2,5,8),(3,6,9)]
# *matrix unpacks rows as arguments to zip
# Frequency map
freq = {}
for char in "abracadabra":
    freq[char] = freq.get(char, 0) + 1
# Anagram key
key = ''.join(sorted("listen")) # "eilnst" → same as "silent"
# Check if sorted
is_sorted = all(lst[i] <= lst[i+1] for i in range(len(lst)-1))
Interview Problem 1: Two Sum
def two_sum(nums: list, target: int) -> list:
    """
    Problem: Find indices of two numbers that sum to target.
    Example: nums = [2, 7, 11, 15], target = 9 → [0, 1]
    Approach: Hash map → store each number's index.
    For each new number, check if the complement exists.
    """
    seen = {}                             # value → index
    for i, num in enumerate(nums):        # single pass through array
        complement = target - num         # what we need to find
        if complement in seen:            # O(1) hash map lookup
            return [seen[complement], i]  # found both indices!
        seen[num] = i                     # store this number's index
    return []                             # no solution found
# Time: O(n) → single pass
# Space: O(n) → hash map stores up to n elements
Interview Problem 2: Maximum Subarray (Kadane's Algorithm)
def max_subarray(nums: list) -> int:
    """
    Problem: Find the contiguous subarray with the largest sum.
    Example: [-2,1,-3,4,-1,2,1,-5,4] → 6 (subarray [4,-1,2,1])
    Approach: At each position, decide:
    - Start fresh from this element, OR
    - Extend the previous subarray
    """
    max_sum = nums[0]      # global max (handles all-negative)
    current_sum = nums[0]  # current subarray sum
    for i in range(1, len(nums)):  # start from index 1
        # Either extend current subarray or start a new one here
        current_sum = max(nums[i], current_sum + nums[i])
        max_sum = max(max_sum, current_sum)  # update global max if better
    return max_sum
# Time: O(n), Space: O(1)
Quick Cheatsheet - List
lst[::-1] # reverse (new copy)
lst.sort(key=lambda x: -x) # sort descending
[x for x in lst if x > 0] # filter positives
list(zip(*matrix)) # transpose matrix
[i for i,v in enumerate(lst) if v==target] # all indices of value
sum(lst) / len(lst) # average
lst[::2] # every other element
first, *rest = lst # unpack first and rest
all(x > 0 for x in lst) # check all positive
any(x > 0 for x in lst) # check any positive
lst.copy() # or lst[:] # shallow copy
import copy; copy.deepcopy(lst) # deep copy
[[0]*m for _ in range(n)] # n×m zero matrix (CORRECT way)
1.2 Tuple ()
Theory - What Makes Tuples Special?
A tuple is an immutable sequence. Once created, elements cannot be added, removed, or changed. This immutability has real performance benefits:
Why tuples are faster and more memory-efficient than lists:
1. FIXED SIZE: Python allocates exactly the memory needed
(no over-allocation for future growth like lists do)
2. IMMUTABILITY ALLOWS CACHING: Python interns/caches small tuples
3. NO RESIZE OVERHEAD: No dynamic array growth machinery needed
4. MORE COMPACT STORAGE: Fewer internal fields to maintain
Proof:
import sys
sys.getsizeof([1, 2, 3]) # e.g. 120 bytes (list → extra bookkeeping; exact value varies by Python version)
sys.getsizeof((1, 2, 3)) # e.g. 64 bytes (tuple → exact fit, no overhead)
Time & Space Complexity Table
| Operation | Time | Notes |
|---|---|---|
| Index access | O(1) | Same as list |
| Search (x in t) | O(n) | Linear scan |
| Count | O(n) | Full scan |
| Index | O(n) | Find first occurrence |
| Length | O(1) | Stored as attribute |
| Slice | O(k) | k = slice size, creates new tuple |
| Create from iterable | O(n) | Copy n elements |
When to Use Tuples - With Code
# ================================================================
# USE CASE 1: Dictionary keys (lists can't be dict keys → mutable!)
# ================================================================
location = (40.7128, -74.0060) # lat, lng as tuple key
distances = {(0, 0): 0, (1, 2): 5} # grid coords as keys
# d = {[0,0]: 0} ❌ TypeError: unhashable type: 'list'
# ================================================================
# USE CASE 2: Returning multiple values from a function
# ================================================================
def min_max(nums: list) -> tuple:
    return min(nums), max(nums) # implicitly creates a tuple
lo, hi = min_max([3, 1, 4, 1, 5, 9]) # unpack directly
# lo = 1, hi = 9
def divmod_custom(a, b) -> tuple:
    return a // b, a % b # quotient and remainder
quotient, remainder = divmod_custom(17, 5) # q=3, r=2
# ================================================================
# USE CASE 3: Fixed / named constants
# ================================================================
RGB_RED = (255, 0, 0) # color → shouldn't change
DIRECTIONS = [(0,1), (0,-1), (1,0), (-1,0)] # 4-directional grid traversal
DIAG_DIRS = [(1,1),(1,-1),(-1,1),(-1,-1)] # diagonal directions
# ================================================================
# USE CASE 4: Elements in a set (lists can't be in sets!)
# ================================================================
visited = set()
visited.add((0, 0)) # ✅ tuple in set
visited.add((1, 2))
# visited.add([1, 2]) ❌ TypeError: unhashable type: 'list'
# Common BFS/DFS pattern for grids:
from collections import deque
ROWS, COLS = 3, 3
queue = deque([(0, 0)]) # start with tuple coordinates
while queue:
    r, c = queue.popleft()
    for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
        new_pos = (r + dr, c + dc)
        if (0 <= r + dr < ROWS and 0 <= c + dc < COLS
                and new_pos not in visited):
            visited.add(new_pos)
            queue.append(new_pos)
Unpacking Tricks
# Basic unpacking
a, b, c = (1, 2, 3) # a=1, b=2, c=3
x, y = (10, 20) # x=10, y=20
# Star unpacking (Python 3)
first, *middle, last = (1, 2, 3, 4, 5)
# first=1, middle=[2,3,4], last=5
head, *tail = (1, 2, 3, 4) # head=1, tail=[2,3,4]
*init, last = (1, 2, 3, 4) # init=[1,2,3], last=4
# Swap without temp variable
a, b = b, a
# Python creates tuple (b, a) on the right, then unpacks to a, b
# Ignore values with _
_, important, _ = (1, 42, 3) # only care about middle
name, *_, age = ("Alice", "B", "C", 30) # skip middle fields
# Nested unpacking
(a, b), (c, d) = (1, 2), (3, 4) # a=1, b=2, c=3, d=4
# In for loops
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]
for num, char in pairs: # auto-unpack each tuple
    print(f"{num}: {char}")
namedtuple from collections
from collections import namedtuple
# Define a named tuple class
Point = namedtuple('Point', ['x', 'y']) # or 'x y' as space-separated string
p = Point(3, 4)
print(p.x, p.y) # 3 4 ā readable named access
print(p[0], p[1]) # 3 4 ā still works by index
print(p) # Point(x=3, y=4) ā readable repr!
# Immutable ā cannot modify:
# p.x = 5 ❌ AttributeError: can't set attribute
# More useful examples:
Employee = namedtuple('Employee', 'name dept salary')
emp = Employee('Alice', 'Engineering', 120000)
print(emp.name) # 'Alice'
print(emp.salary) # 120000
# Convert to dict
emp._asdict() # {'name': 'Alice', 'dept': 'Engineering', 'salary': 120000}
# Create from existing dict
data = {'name': 'Bob', 'dept': 'Sales', 'salary': 90000}
emp2 = Employee(**data) # unpack dict as keyword args
# Create modified copy (tuples are immutable, this returns NEW namedtuple)
emp3 = emp._replace(salary=130000) # emp unchanged, emp3 is new
print(emp.salary) # 120000 ā original unchanged
print(emp3.salary) # 130000
# Default values (Python 3.7+)
Config = namedtuple('Config', ['host', 'port', 'debug'], defaults=['localhost', 8080, False])
cfg = Config() # Config(host='localhost', port=8080, debug=False)
cfg2 = Config(port=9000) # Config(host='localhost', port=9000, debug=False)
Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Single-element tuple REQUIRES trailing comma
# ================================================================
not_a_tuple = (42) # This is just int 42 in parentheses!
type(not_a_tuple) # <class 'int'>
is_a_tuple = (42,) # The COMMA makes it a tuple
type(is_a_tuple) # <class 'tuple'>
also_tuple = 42, # Parentheses optional! Comma = tuple
type(also_tuple) # <class 'tuple'>
# ================================================================
# TRAP 2: Tuples containing mutables are partially mutable!
# ================================================================
t = ([1, 2], [3, 4])
t[0].append(5) # ✅ Works! t = ([1, 2, 5], [3, 4])
# The TUPLE is immutable (can't reassign t[0] to a new object)
# But the LIST inside is still mutable!
# t[0] = [9, 9] ❌ TypeError: 'tuple' object does not support item assignment
# ================================================================
# TRAP 3: Tuple concatenation creates NEW tuple
# ================================================================
t1 = (1, 2)
t2 = t1 + (3, 4) # (1, 2, 3, 4) → new tuple object, O(n+m)
# ❌ For repeated concatenation in a loop:
result = ()
for x in range(1000):
    result = result + (x,) # O(n²) total! Creates a new tuple each time
# ✅ Build a list, convert at the end:
items = []
for x in range(1000):
    items.append(x)
result = tuple(items) # O(n) total
# ================================================================
# TRAP 4: Tuple packing in return statements
# ================================================================
def get_coords():
    return 1, 2 # Python automatically packs as tuple (1, 2)
x, y = get_coords() # works perfectly
coords = get_coords() # coords = (1, 2)
Quick Cheatsheet - Tuple
(42,) # single-element tuple (COMMA required!)
a, b = b, a # swap values (tuple packing/unpacking)
first, *rest = (1, 2, 3, 4) # star unpacking
tuple(lst) # convert list to tuple
hash((1, 2, 3)) # tuples are hashable (can be dict keys/set elements)
Point = namedtuple('Point','x y')# readable tuples with named fields
t.count(x) # count occurrences in tuple
t.index(x) # first index of value
(1, 2) + (3, 4) # concatenation ā (1, 2, 3, 4)
(1, 2) * 3 # repetition → (1, 2, 1, 2, 1, 2)
1.3 Set {}
Theory - How Sets Work Internally
A Python set is implemented as a hash table ā like a dictionary but with only keys (no values).
How hashing works step by step:
1. Compute hash(element) → integer
2. index = hash(element) % table_size → find bucket
3. If bucket empty → INSERT here
4. If bucket occupied → COLLISION!
CPython uses "open addressing" (probing):
- Probe the next available slot using a perturbation formula
- NOT separate chaining (no linked lists)
5. Load factor > 2/3 → RESIZE table (~2x size)
Key requirement: elements must be HASHABLE (immutable).
✅ OK: int, float, str, tuple, frozenset, bool, None
❌ ERROR: list, dict, set (mutable → unhashable)
Why O(1) average? Because hash() maps directly to a bucket index.
Why O(n) worst? Because all elements could hash to the same bucket.
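The hashability rule above is easy to verify interactively:

```python
# Hashable objects work as set elements; mutable ones raise TypeError.
s = set()
s.add((1, 2))        # tuple → hashable, OK
s.add("hello")       # str → hashable, OK
try:
    s.add([1, 2])    # list → mutable, unhashable
except TypeError as e:
    print(e)         # unhashable type: 'list'
```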
Time & Space Complexity Table
| Operation | Example | Average | Worst | Notes |
|---|---|---|---|---|
| Add | s.add(x) | O(1) | O(n) | Worst case: resize |
| Remove | s.remove(x) | O(1) | O(n) | KeyError if missing |
| Discard | s.discard(x) | O(1) | O(n) | Silent if missing |
| Contains | x in s | O(1) | O(n) | Hash lookup |
| Pop | s.pop() | O(1) | O(1) | Removes arbitrary element |
| Union | s1 \| s2 | O(n+m) | - | New set |
| Intersection | s1 & s2 | O(min(n,m)) | - | Iterates smaller |
| Difference | s1 - s2 | O(n) | - | Elements in s1 not s2 |
| Symmetric diff | s1 ^ s2 | O(n+m) | - | In one but not both |
| Subset check | s1 <= s2 | O(n) | - | Is s1 ⊆ s2? |
| Length | len(s) | O(1) | O(1) | Stored internally |
All Set Methods
# ================================================================
# CREATION
# ================================================================
s = set() # ✅ empty set
s = {} # ❌ this is an empty DICT, NOT a set!
s = {1, 2, 3} # set literal
s = set([1, 2, 2, 3]) # from list → {1, 2, 3} (duplicates removed!)
s = set("hello") # {'h', 'e', 'l', 'o'} → unique chars
# ================================================================
# ADDING ELEMENTS
# ================================================================
s = {1, 2, 3}
s.add(4) # {1, 2, 3, 4}
s.add(2) # {1, 2, 3, 4} → no effect, already exists
s.update([5, 6], {7, 8}) # add multiple from multiple iterables
# {1, 2, 3, 4, 5, 6, 7, 8}
# ================================================================
# REMOVING ELEMENTS
# ================================================================
s = {1, 2, 3, 4, 5}
s.remove(3) # removes 3; ⚠️ KeyError if 3 not present!
s.discard(99) # ✅ no error even if 99 not in set
x = s.pop() # removes and returns an ARBITRARY element
s.clear() # empties the set
# ================================================================
# SET OPERATIONS
# ================================================================
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
# Union → all elements from both
a | b # {1, 2, 3, 4, 5, 6}
a.union(b) # same result
# Intersection → common elements only
a & b # {3, 4}
a.intersection(b) # same result
# Difference → in a but NOT in b
a - b # {1, 2}
a.difference(b) # same result
# Symmetric difference → in one but NOT both
a ^ b # {1, 2, 5, 6}
a.symmetric_difference(b) # same result
# ================================================================
# SUBSET / SUPERSET CHECKS
# ================================================================
{1, 2} <= {1, 2, 3} # True → is subset
{1, 2} < {1, 2, 3} # True → is PROPER subset (not equal)
{1, 2, 3} >= {1, 2} # True → is superset
{1, 2, 3}.isdisjoint({4, 5}) # True → no common elements
# ================================================================
# IN-PLACE OPERATIONS (modifies the set directly)
# ================================================================
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
a |= b # a = {1, 2, 3, 4, 5, 6}
a &= b # a = intersection
a -= b # a = difference
a ^= b # a = symmetric difference
frozenset - Immutable Set
# frozenset is an IMMUTABLE set
# → can be used as a dict key or as an element of another set
fs = frozenset([1, 2, 3])
# fs.add(4) ❌ AttributeError: 'frozenset' object has no attribute 'add'
# Use case 1: set of sets
set_of_sets = {frozenset([1, 2]), frozenset([3, 4])}
# Use case 2: dict key
cache = {frozenset(['a', 'b', 'c']): "result"}
# Use case 3: caching a set argument in memoization
import functools
@functools.cache
def compute(items: frozenset): # frozenset is hashable!
    return sum(items)
# frozenset supports all read operations:
fs1 = frozenset([1, 2, 3])
fs2 = frozenset([3, 4, 5])
fs1 & fs2 # frozenset({3})
fs1 | fs2 # frozenset({1, 2, 3, 4, 5})
fs1 - fs2 # frozenset({1, 2})
Common Patterns with Sets
# ================================================================
# PATTERN 1: Remove duplicates PRESERVING order
# ================================================================
lst = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
seen = set()
unique = []
for x in lst:
    if x not in seen: # O(1) lookup
        seen.add(x)
        unique.append(x)
# unique = [3, 1, 4, 5, 9, 2, 6]
# One-liner (tricky but useful in interviews):
seen = set()
unique = [x for x in lst if not (x in seen or seen.add(x))]
# Explanation: seen.add(x) returns None (falsy),
# so the condition is falsy exactly when x is new → x gets added and kept
# If order doesn't matter (simplest):
unique = list(set(lst)) # ⚠️ ORDER IS LOST
# ================================================================
# PATTERN 2: Find duplicates in a list
# ================================================================
lst = [1, 2, 3, 2, 4, 3, 5]
seen = set()
duplicates = set()
for x in lst:
    if x in seen:
        duplicates.add(x)
    seen.add(x)
# duplicates = {2, 3}
# ================================================================
# PATTERN 3: BFS/DFS visited tracking
# ================================================================
visited = set()
def dfs(node, graph):
    if node in visited: # O(1) check
        return
    visited.add(node) # O(1) add
    for neighbor in graph[node]:
        dfs(neighbor, graph)
# ================================================================
# PATTERN 4: Find common elements across multiple lists
# ================================================================
def common_elements(*lists):
    result = set(lists[0]) # start with the first list's elements
    for lst in lists[1:]:
        result &= set(lst) # intersect with each subsequent list
    return result
common_elements([1,2,3,4], [2,3,4,5], [3,4,5,6]) # {3, 4}
Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Empty set ā MUST use set(), not {}
# ================================================================
not_a_set = {} # ❌ empty DICT!
type(not_a_set) # <class 'dict'>
empty_set = set() # ✅ empty set
# ================================================================
# TRAP 2: Sets are UNORDERED
# ================================================================
s = {3, 1, 4, 1, 5}
# Don't rely on iteration order → it's NOT guaranteed
# {1, 3, 4, 5} → order depends on hash values
# ================================================================
# TRAP 3: Mutable objects cannot be in sets
# ================================================================
# s = {[1, 2]} ❌ TypeError: unhashable type: 'list'
# s = {{1, 2}} ❌ TypeError: unhashable type: 'set'
s = {(1, 2)} # ✅ tuple is hashable
s = {frozenset([1, 2])} # ✅ frozenset is hashable
# ================================================================
# TRAP 4: set.pop() is ARBITRARY (not FIFO, not LIFO)
# ================================================================
s = {10, 20, 30}
s.pop() # could return ANY element → unpredictable!
Interview Problem: Longest Consecutive Sequence
def longest_consecutive(nums: list) -> int:
    """
    Problem: Find the length of the longest consecutive sequence.
    Example: [100, 4, 200, 1, 3, 2] → 4 (sequence: 1,2,3,4)
    Must run in O(n) time.
    Key insight: Only start counting from the BEGINNING of a sequence.
    A number n is a sequence start if (n-1) is NOT in the set.
    """
    num_set = set(nums) # O(n) build, O(1) lookups
    max_length = 0
    for num in num_set:
        if num - 1 not in num_set: # num is the start of a sequence
            current = num
            length = 1
            while current + 1 in num_set: # extend the sequence
                current += 1
                length += 1
            max_length = max(max_length, length)
    return max_length
# Time: O(n) → each number visited at most twice
# Space: O(n) → hash set
Quick Cheatsheet - Set
set() # empty set (NOT {})
s.add(x) # add element O(1)
s.discard(x) # remove → no error if missing
s.remove(x) # remove → KeyError if missing
x in s # O(1) membership test
s1 & s2 # intersection
s1 | s2 # union
s1 - s2 # difference (in s1 not s2)
s1 ^ s2 # symmetric difference
s1 <= s2 # is s1 subset of s2?
s1.isdisjoint(s2) # no common elements?
list(set(lst)) # remove duplicates (loses order)
frozenset(s) # immutable set (hashable)
len(s) # count → O(1)
1.4 Dict {key: value}
Theory - How Python Dicts Work Internally
Python dicts use a hash table with open addressing, specifically CPython's "compact dict" design introduced in Python 3.6.
CPython Dict Structure (Python 3.6+):
Two arrays:
1. INDICES array (sparse): [slot_index, ...]
   → indexed by: hash(key) % table_size
   → contains: an index into the entries array, or EMPTY, or DELETED
2. ENTRIES array (dense): [(hash, key, value), ...]
   → compact → no holes between entries
   → new entries are always appended here
   → THIS is why dicts preserve insertion order since Python 3.7!
Lookup for d[key]:
Step 1: compute hash(key) → integer h
Step 2: slot = h % table_size
Step 3: look in indices[slot]:
        EMPTY → KeyError (not found)
        index i → check if entries[i].key == key
        YES → return entries[i].value
        NO → collision! probe the next slot
Step 4: probe: slot = (5*slot + 1 + perturb) % size
        (perturbation prevents clustering)
Load factor: resizes when 2/3 full
Resize: approximately doubles the indices table
Why is insertion order preserved? Because the entries array is append-only!
Iterating a dict = iterating entries in append order.
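The order guarantee above is easy to demonstrate:

```python
# Insertion order is preserved (guaranteed since Python 3.7).
d = {}
d['banana'] = 2
d['apple'] = 1
d['cherry'] = 3
print(list(d))  # ['banana', 'apple', 'cherry'] → insertion order, not sorted

# Deleting and re-inserting a key moves it to the end
# (its entry is appended anew to the dense entries array):
del d['banana']
d['banana'] = 2
print(list(d))  # ['apple', 'cherry', 'banana']
```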
Time & Space Complexity Table
| Operation | Example | Average | Worst | Notes |
|---|---|---|---|---|
| Get | d[k] | O(1) | O(n) | Hash collision worst case |
| Get safe | d.get(k) | O(1) | O(n) | Returns None if missing |
| Set | d[k] = v | O(1) | O(n) | May trigger resize |
| Delete | del d[k] | O(1) | O(n) | |
| Contains | k in d | O(1) | O(n) | Checks keys only |
| Length | len(d) | O(1) | O(1) | Stored internally |
| Keys view | d.keys() | O(1) | ā | View object (dynamic) |
| Values view | d.values() | O(1) | ā | View object (dynamic) |
| Items view | d.items() | O(1) | ā | View object (dynamic) |
| Iterate | for k in d | O(n) | O(n) | Over all keys |
| Copy | d.copy() | O(n) | O(n) | Shallow copy |
| Pop | d.pop(k) | O(1) | O(n) | Remove and return |
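The Worst column comes from hash collisions: if every key lands in the same slot, each lookup degrades into a linear probe chain. A quick sketch that forces this with a deliberately bad __hash__ (the class name is made up for the demo):

```python
class BadHash:
    """Every instance hashes to the same slot, forcing worst-case collisions."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 0                    # constant hash: all keys collide
    def __eq__(self, other):
        return isinstance(other, BadHash) and self.n == other.n

d = {BadHash(i): i for i in range(1000)}   # each insert probes past prior keys
print(d[BadHash(500)])   # 500 (still correct, but each lookup is O(n))
```

With a normal hash the same inserts touch roughly one slot each on average, which is where the O(1) column comes from.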
All Dict Methods
# ================================================================
# CREATION
# ================================================================
d = {} # empty dict
d = {'a': 1, 'b': 2} # literal syntax
d = dict(a=1, b=2) # keyword arguments
d = dict([('a', 1), ('b', 2)]) # from list of pairs
d = dict.fromkeys(['x', 'y', 'z'], 0) # {'x':0, 'y':0, 'z':0}
d = {k: v for k, v in [('a',1),('b',2)]} # dict comprehension
# ================================================================
# ACCESS
# ================================================================
d = {'a': 1, 'b': 2, 'c': 3}
d['a'] # 1 — direct access; ⚠️ KeyError if missing!
d.get('a') # 1 — safe access; returns None if missing
d.get('z', -1) # -1 — returns the given default if key missing
d.get('z') # None — the fallback default is None
# ================================================================
# MODIFICATION
# ================================================================
d['d'] = 4 # add new key or update existing
d.update({'e': 5, 'f': 6}) # merge dict (updates existing keys)
d.update(g=7, h=8) # keyword argument form
# setdefault — get OR set if missing (returns the value)
d.setdefault('count', 0) # if 'count' not in d, set it to 0
d.setdefault('count', 99) # already exists → keeps 0, returns 0
# Common pattern: group items
groups = {}
for item in ['apple', 'banana', 'avocado', 'blueberry']:
    key = item[0] # first letter
    groups.setdefault(key, []).append(item)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry']}
# ================================================================
# REMOVAL
# ================================================================
val = d.pop('a') # removes 'a', returns its value; ⚠️ KeyError if missing
val = d.pop('z', None) # ✅ safe: returns None instead of KeyError
key, val = d.popitem() # removes and returns the LAST inserted pair (Python 3.7+)
del d['b'] # delete key — statement, not method; ⚠️ KeyError if missing
d.clear() # remove ALL items, dict becomes {}
# ================================================================
# VIEWS ā Dynamic, Reflect Changes to Dict
# ================================================================
d = {'a': 1, 'b': 2, 'c': 3}
keys = d.keys() # dict_keys(['a', 'b', 'c']) — view
vals = d.values() # dict_values([1, 2, 3]) — view
items = d.items() # dict_items([('a',1),('b',2),('c',3)]) — view
# Views are LIVE — they reflect changes:
d['d'] = 4
print(keys) # dict_keys(['a', 'b', 'c', 'd']) — updated!
# Convert to a static list if you need to modify the dict while iterating:
list(d.keys()) # ['a', 'b', 'c', 'd']

Dict Comprehension
# Basic
squares = {x: x**2 for x in range(6)} # {0:0, 1:1, 2:4, 3:9, 4:16, 5:25}
# With condition
even_sq = {x: x**2 for x in range(10) if x % 2 == 0}
# Invert dict (swap keys and values)
d = {'a': 1, 'b': 2, 'c': 3}
inverted = {v: k for k, v in d.items()} # {1:'a', 2:'b', 3:'c'}
# ⚠️ Only works if values are unique and hashable!
# From two parallel lists
keys = ['name', 'age', 'city']
values = ['Alice', 30, 'NYC']
d = dict(zip(keys, values)) # {'name':'Alice','age':30,'city':'NYC'}
# OR:
d = {k: v for k, v in zip(keys, values)}
# Filter dict
scores = {'Alice': 90, 'Bob': 55, 'Charlie': 75}
passing = {name: score for name, score in scores.items() if score >= 60}
# {'Alice': 90, 'Charlie': 75}

Merging Dicts
d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
# Method 1: ** unpacking (Python 3.5+)
merged = {**d1, **d2} # {'a':1, 'b':3, 'c':4} — d2 wins conflicts
# Method 2: | operator (Python 3.9+)
merged = d1 | d2 # {'a':1, 'b':3, 'c':4}
# Method 3: update (modifies d1 in place)
d1.update(d2) # d1 = {'a':1, 'b':3, 'c':4}
# Method 4: |= in place (Python 3.9+)
d1 |= d2 # same as d1.update(d2)

collections — defaultdict, OrderedDict, Counter
from collections import defaultdict, OrderedDict, Counter
# ================================================================
# defaultdict — auto-creates a default value for missing keys
# ================================================================
# defaultdict(int) — missing keys default to 0
freq = defaultdict(int)
for char in "abracadabra":
    freq[char] += 1 # no need to check whether the key exists!
# defaultdict(int, {'a':5, 'b':2, 'r':2, 'c':1, 'd':1})
# defaultdict(list) — missing keys default to []
adj = defaultdict(list) # perfect for graph adjacency lists!
edges = [(1,2), (1,3), (2,4)]
for u, v in edges:
    adj[u].append(v) # no need for setdefault or an if-check
    adj[v].append(u) # undirected graph
# defaultdict(set) — missing keys default to set()
groups = defaultdict(set)
for word in ["eat", "tea", "tan", "ate"]:
    key = ''.join(sorted(word)) # anagram key
    groups[key].add(word)
# {'aet': {'eat', 'tea', 'ate'}, 'ant': {'tan'}}
# ================================================================
# Counter — a specialized dict for counting hashable objects
# ================================================================
c = Counter("abracadabra")
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
c.most_common(2) # [('a', 5), ('b', 2)] — top 2
c['a'] # 5 — count of 'a'
c['z'] # 0 — ✅ missing keys return 0, NOT KeyError!
# Counter arithmetic
c1 = Counter("aab") # {'a':2, 'b':1}
c2 = Counter("abb") # {'a':1, 'b':2}
c1 + c2 # Counter({'a':3, 'b':3}) — add counts
c1 - c2 # Counter({'a':1}) — subtract (keeps positive counts only)
c1 & c2 # Counter({'a':1, 'b':1}) — min of each
c1 | c2 # Counter({'a':2, 'b':2}) — max of each
list(Counter(a=2, b=3).elements()) # ['a','a','b','b','b'] — with multiplicity
# Anagram check:
def is_anagram(s1: str, s2: str) -> bool:
    return Counter(s1) == Counter(s2) # O(n)
# ================================================================
# OrderedDict — a dict with extra order-manipulation methods
# ================================================================
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
od.move_to_end('a') # move 'a' to the end → [b, c, a]
od.move_to_end('c', last=False) # move 'c' to the beginning → [c, b, a]
od.popitem(last=True) # pop the last item: ('a', 1)
od.popitem(last=False) # pop the first item: ('c', 3)
# LRU Cache pattern with OrderedDict (classic interview question!):
from collections import OrderedDict
class LRUCache:
    def __init__(self, capacity: int):
        self.cache = OrderedDict()
        self.cap = capacity

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key) # mark as recently used
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.cap:
            self.cache.popitem(last=False) # evict the LEAST recently used

⚠️ Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Modifying dict while iterating
# ================================================================
d = {'a': 1, 'b': 2, 'c': 3}
# for k in d:
#     if d[k] > 1: del d[k] # ❌ RuntimeError: dict changed size during iteration
# ✅ FIX: iterate over a copy of the keys
for k in list(d.keys()):
    if d[k] > 1:
        del d[k]
# ✅ BETTER: rebuild with a dict comprehension
d = {k: v for k, v in d.items() if v <= 1}
# ================================================================
# TRAP 2: What CAN and CANNOT be a dict key
# ================================================================
# Keys must be HASHABLE (immutable):
{1: 'int', 'a': 'str', (1,2): 'tuple', True: 'bool'} # ✅ all valid
# {[1,2]: 'list'} # ❌ TypeError: unhashable type: 'list'
# {{1}: 'set'} # ❌ TypeError: unhashable type: 'set'
# ================================================================
# TRAP 3: True, False, 1, 0 as dict keys
# ================================================================
d = {True: 'true', 1: 'one'}
# True == 1 is True AND hash(True) == hash(1)
# So they are the SAME KEY!
print(d) # {True: 'one'} — 1 overwrote the value for True!
d = {False: 'false', 0: 'zero'}
print(d) # {False: 'zero'} — 0 overwrote the value for False!
# ================================================================
# TRAP 4: dict ordering guarantee
# ================================================================
# Python 3.7+: insertion order GUARANTEED by the language spec
# Python 3.6: CPython implementation detail (not guaranteed by the spec)
# Python < 3.6: NO ordering guarantee at all

Interview Problem: Group Anagrams
from collections import defaultdict
def group_anagrams(strs: list) -> list:
    """
    Problem: Group strings that are anagrams of each other.
    Example: ["eat","tea","tan","ate","nat","bat"]
      → [["eat","tea","ate"], ["tan","nat"], ["bat"]]
    Key insight: anagrams share the same sorted form.
    "eat", "tea", "ate" all sort to "aet"
    """
    groups = defaultdict(list)
    for s in strs:
        key = ''.join(sorted(s)) # "eat" → "aet" — same for all anagrams
        groups[key].append(s) # group by the sorted key
    return list(groups.values())
# Time: O(n × k log k) — n strings, k = max string length
# Space: O(n × k)

Interview Problem: Subarray Sum Equals K
def subarray_sum(nums: list, k: int) -> int:
    """
    Problem: Count subarrays that sum to exactly k.
    Example: nums = [1,1,1], k = 2 → 2
    Key insight: if prefix[j] - prefix[i] = k,
    then subarray nums[i..j-1] sums to k.
    We need: how many times has (current_prefix - k) appeared before?
    """
    count = 0
    prefix_sum = 0
    freq = {0: 1} # the empty prefix has sum 0, seen once
    for num in nums:
        prefix_sum += num # running prefix sum
        # If (prefix_sum - k) appeared before, those subarrays sum to k
        count += freq.get(prefix_sum - k, 0)
        freq[prefix_sum] = freq.get(prefix_sum, 0) + 1
    return count
# Time: O(n), Space: O(n)

Quick Cheatsheet — Dict
d.get(k, default) # safe access — no KeyError
d.setdefault(k, []).append(v) # get-or-create, then use
d.pop(k, None) # remove safely — no KeyError
Counter(lst).most_common(k) # top k most frequent elements
defaultdict(list) # auto-create [] for missing keys
defaultdict(int) # auto-create 0 for missing keys
{**d1, **d2} # merge (d2 wins on conflicts)
d1 | d2 # merge (Python 3.9+)
{v: k for k, v in d.items()} # invert a dict
sorted(d, key=d.get, reverse=True) # sort keys by value, descending
{k: v for k, v in d.items() if v > 0} # filter dict entries
k in d # key membership — O(1)

SECTION 2 — ITERABLES, ITERATORS, AND GENERATORS
2.1 Iterable vs Iterator Protocol
Theory
This is one of the most misunderstood topics in Python. Get this right and everything else clicks.
THE DISTINCTION

ITERABLE: any object you can loop over
  → has an __iter__() method
  → __iter__() returns an ITERATOR
  → examples: list, tuple, str, dict, set, range, file
  → can be iterated MULTIPLE times (fresh iterator each time)

ITERATOR: an object that produces values one at a time
  → has a __next__() method (returns the next value or raises StopIteration)
  → has an __iter__() that returns ITSELF
  → can only go FORWARD — it cannot be reset
  → an iterator IS an iterable (but not vice versa)
  → once exhausted → done forever

Relationship:
  iterable.__iter__() → iterator
  iterator.__next__() → next value
  iterator.__next__() → next value
  iterator.__next__() → StopIteration (exhausted)
How For Loops Work Internally
lst = [10, 20, 30]
# What you write:
for x in lst:
    print(x)
# What Python actually does:
_iterator = iter(lst) # Step 1: calls lst.__iter__() → returns an iterator
while True:
    try:
        x = next(_iterator) # Step 2: calls _iterator.__next__()
        print(x)
    except StopIteration: # Step 3: caught → the loop ends cleanly
        break
# iter() built-in → calls __iter__()
# next() built-in → calls __next__()

Iterable vs Iterator — Concrete Examples
# ================================================================
# ITERABLES — can create multiple independent iterators
# ================================================================
lst = [1, 2, 3]
iter1 = iter(lst) # fresh iterator
iter2 = iter(lst) # another fresh iterator — independent!
next(iter1) # 1
next(iter1) # 2
next(iter2) # 1 — starts from the beginning, independent!
# ================================================================
# ITERATORS — return themselves from iter()
# ================================================================
it = iter(lst) # iterator object
iter(it) is it # True! iter(iterator) returns same object
next(it) # 1
next(it) # 2
next(it) # 3
# next(it) # ⚠️ StopIteration — exhausted!
# next() with a default — no error on exhaustion:
next(it, 'DONE') # 'DONE' — returns the default instead of raising
# ================================================================
# WHICH IS WHICH? Quick test:
# ================================================================
lst = [1, 2, 3]
print(hasattr(lst, '__iter__')) # True → it's iterable
print(hasattr(lst, '__next__')) # False → it's NOT an iterator
it = iter(lst)
print(hasattr(it, '__iter__')) # True → iterators are also iterable
print(hasattr(it, '__next__')) # True → it has __next__ → it's an iterator

Building a Custom Iterable and Iterator
# ================================================================
# CUSTOM ITERABLE — a separate class serves as the iterator
# ================================================================
class CountDown:
    """Iterable: can be looped over multiple times."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return CountDownIterator(self.n) # returns a NEW iterator each call

class CountDownIterator:
    """Iterator: tracks state, single-use."""
    def __init__(self, n):
        self.current = n
    def __iter__(self):
        return self # an iterator returns itself
    def __next__(self):
        if self.current <= 0:
            raise StopIteration # signal the end of iteration
        value = self.current
        self.current -= 1 # advance state
        return value

cd = CountDown(3)
for x in cd: print(x) # 3 2 1
for x in cd: print(x) # 3 2 1 again! (fresh iterator)
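The flip side of the re-iterable CountDown: a raw iterator or generator is single-use. Once exhausted it stays exhausted, and a second loop silently produces nothing (a quick demo):

```python
gen = (x * x for x in [1, 2, 3])   # generator expressions are iterators
print(list(gen))   # [1, 4, 9]
print(list(gen))   # [] (exhausted; no error, just an empty result)
```

This "silently empty second pass" is a classic source of bugs when a generator is accidentally consumed twice.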
# ================================================================
# COMBINED ITERABLE+ITERATOR — single class (one-time use)
# ================================================================
class InfiniteCounter:
    """Iterator that also IS its own iterable."""
    def __init__(self, start=0):
        self.current = start
    def __iter__(self):
        return self # returns itself
    def __next__(self):
        value = self.current
        self.current += 1
        return value # infinite — never raises StopIteration

counter = InfiniteCounter(10)
import itertools
list(itertools.islice(counter, 5)) # [10, 11, 12, 13, 14]

2.2 Common Iteration Patterns
# ================================================================
# enumerate — index + value
# ================================================================
fruits = ['apple', 'banana', 'cherry']
for i, fruit in enumerate(fruits):
    print(f"{i}: {fruit}") # 0: apple, 1: banana, 2: cherry
for i, fruit in enumerate(fruits, start=1): # custom start
    print(f"{i}: {fruit}") # 1: apple, 2: banana, 3: cherry
# Common use: find an index while iterating
idx = next((i for i, v in enumerate(lst) if v == target), -1)
# ================================================================
# zip — iterate parallel sequences
# ================================================================
names = ['Alice', 'Bob', 'Charlie']
scores = [95, 87, 92]
for name, score in zip(names, scores):
    print(f"{name}: {score}") # Alice: 95, etc.
# ⚠️ zip stops at the SHORTEST sequence:
list(zip([1,2,3], [10,20])) # [(1,10), (2,20)] — the 3 is dropped!
# zip_longest — pads the shorter iterable with fillvalue
from itertools import zip_longest
list(zip_longest([1,2,3], [10,20], fillvalue=0))
# [(1,10), (2,20), (3,0)]
# Unzip (transpose a list of tuples):
pairs = [(1,'a'), (2,'b'), (3,'c')]
nums, chars = zip(*pairs) # nums=(1,2,3), chars=('a','b','c')
# ================================================================
# map — apply a function to each element
# ================================================================
nums = [1, 2, 3, 4]
squared = list(map(lambda x: x**2, nums)) # [1, 4, 9, 16]
# → Usually prefer a list comprehension: [x**2 for x in nums]
# map with multiple iterables:
list(map(lambda a, b: a+b, [1,2,3], [10,20,30])) # [11, 22, 33]
# ================================================================
# filter — keep elements where the function returns True
# ================================================================
nums = [1, 2, 3, 4, 5, 6]
evens = list(filter(lambda x: x % 2 == 0, nums)) # [2, 4, 6]
# → Usually prefer: [x for x in nums if x % 2 == 0]
# ================================================================
# sorted — flexible, powerful
# ================================================================
words = ['banana', 'apple', 'Cherry', 'date']
sorted(words) # ['Cherry', 'apple', 'banana', 'date']
sorted(words, key=str.lower) # ['apple', 'banana', 'Cherry', 'date']
sorted(words, key=len) # ['date', 'apple', 'banana', 'Cherry']
sorted(words, key=len, reverse=True) # longest first
# Multi-key sort (tuple key):
students = [('Alice', 90), ('Bob', 85), ('Charlie', 90)]
sorted(students, key=lambda s: (-s[1], s[0]))
# sort by score DESC, then name ASC
# [('Alice', 90), ('Charlie', 90), ('Bob', 85)]
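The tuple-key trick works, and so does an alternative that relies on Python's sort being stable (equal keys keep their original relative order): sort by the secondary key first, then by the primary key.

```python
students = [('Alice', 90), ('Bob', 85), ('Charlie', 90)]
by_name = sorted(students)                       # secondary key: name ASC
by_score = sorted(by_name, key=lambda s: -s[1])  # primary key: score DESC
print(by_score)  # [('Alice', 90), ('Charlie', 90), ('Bob', 85)]
```

Stability is what makes chained sorts like this safe; an unstable sort could scramble the name order within equal scores.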
# ================================================================
# reversed — iterate backwards (returns an iterator)
# ================================================================
for x in reversed([1, 2, 3]):
    print(x) # 3, 2, 1
# reversed() is lazy — it doesn't create a new list
# ================================================================
# any / all — short-circuit boolean checks
# ================================================================
nums = [1, -2, 3, -4, 5]
any(x > 0 for x in nums) # True — stops at the first True
all(x > 0 for x in nums) # False — stops at the first False
any([]) # False — no elements = no True found
all([]) # True — vacuously true (no False found)
# Practical use:
has_even = any(x % 2 == 0 for x in nums)
all_unique = len(nums) == len(set(nums))

2.3 Comprehensions — All Four Types
# ================================================================
# LIST COMPREHENSION
# Syntax: [expression for item in iterable if condition]
# ================================================================
squares = [x**2 for x in range(10)] # [0,1,4,9,...,81]
evens = [x for x in range(20) if x % 2 == 0] # [0,2,4,...,18]
upper = [c.upper() for c in "hello"] # ['H','E','L','L','O']
# If-else in expression (not at the end):
labels = ["even" if x%2==0 else "odd" for x in range(5)]
# ['even','odd','even','odd','even']
# ================================================================
# NESTED LIST COMPREHENSION
# ================================================================
# Flatten nested list:
nested = [[1, 2, 3], [4, 5], [6]]
flat = [x for sublist in nested for x in sublist] # [1,2,3,4,5,6]
# Rule: outer loop comes FIRST, inner loop comes SECOND
# Matrix creation:
rows, cols = 3, 4
matrix = [[0] * cols for _ in range(rows)] # 3×4 zero matrix
# All pairs where i != j:
pairs = [(i,j) for i in range(3) for j in range(3) if i != j]
# [(0,1),(0,2),(1,0),(1,2),(2,0),(2,1)]
# Transpose matrix:
matrix = [[1,2,3],[4,5,6],[7,8,9]]
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]
# [[1,4,7],[2,5,8],[3,6,9]]
# ================================================================
# SET COMPREHENSION
# ================================================================
unique_lengths = {len(word) for word in ["hi","hello","hey","hi"]} # {2, 5, 3}
# ================================================================
# DICT COMPREHENSION
# ================================================================
word_lengths = {word: len(word) for word in ["hi","hello","hey"]}
# {'hi': 2, 'hello': 5, 'hey': 3}
squares_map = {x: x**2 for x in range(6)}
# {0:0, 1:1, 2:4, 3:9, 4:16, 5:25}
# ================================================================
# GENERATOR EXPRESSION
# ================================================================
# Same syntax as a list comprehension but with () — produces values LAZILY
gen = (x**2 for x in range(1_000_000)) # No memory used yet!
next(gen) # 0 — computed on demand
next(gen) # 1
# Use directly in functions (no extra parentheses needed):
total = sum(x**2 for x in range(100)) # sum of squares, memory efficient
any(x > 50 for x in data) # short-circuits at first True
max(len(word) for word in words) # longest word length
# Memory comparison:
import sys
list_comp = [x**2 for x in range(10000)] # stores 10000 integers
gen_exp = (x**2 for x in range(10000)) # stores only generator object
sys.getsizeof(list_comp) # ~87,624 bytes
sys.getsizeof(gen_exp) # ~112 bytes (just the generator object!)

2.4 Generators (Functions with yield)
Theory
A generator is a function that uses yield instead of return. When called, it returns a generator object that produces values lazily — computing each value only when requested.
Key difference:
  return: the function terminates, its stack frame is DESTROYED
  yield: the function SUSPENDS, its stack frame is PRESERVED
         → it can resume from exactly where it left off
Generator lifecycle:
  gen = my_gen() → returns a generator object (no code runs!)
  next(gen) → runs until the first yield, returns the yielded value
  next(gen) → resumes after that yield, runs until the next yield
  next(gen) → StopIteration when the function returns / falls off the end
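The "no code runs!" point in the lifecycle is easy to verify with a side effect (a minimal sketch; the function name is just for the demo):

```python
def demo():
    print("body started")   # runs on the first next(), NOT at call time
    yield 1

g = demo()                  # nothing printed yet: just a generator object
print(next(g))              # prints "body started", then 1
```

This is why a buggy argument check inside a generator won't fire until the first next() call, a subtlety that surprises many candidates.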
# ================================================================
# Basic generator function
# ================================================================
def count_up(n: int):
    """Yields the numbers from 1 to n."""
    i = 1
    while i <= n:
        yield i # PAUSE here, hand i to the caller
        i += 1 # RESUME here on the next call to next()
    # Falling off the end raises StopIteration automatically

gen = count_up(3)
print(type(gen)) # <class 'generator'>
next(gen) # 1 — runs until the first yield
next(gen) # 2 — resumes, runs until the second yield
next(gen) # 3
# next(gen) # StopIteration!
for x in count_up(5): # the for loop handles StopIteration automatically
    print(x) # 1 2 3 4 5
# ================================================================
# Generator for infinite sequence (Fibonacci)
# ================================================================
def fibonacci():
    a, b = 0, 1
    while True: # infinite — never stops on its own
        yield a
        a, b = b, a + b # next Fibonacci number

fib = fibonacci()
[next(fib) for _ in range(10)] # [0,1,1,2,3,5,8,13,21,34]
# ================================================================
# Generator for reading large files — memory efficient
# ================================================================
def read_large_file(filepath: str):
    """
    Process a file line by line.
    Uses O(1) memory regardless of file size —
    it can process a 10GB file with constant memory.
    """
    with open(filepath, 'r') as f:
        for line in f: # the file object is itself an iterator
            yield line.strip() # yield one line at a time

# Process a 10GB log file with constant memory:
for line in read_large_file('huge_log.txt'):
    if 'ERROR' in line:
        print(line) # process one line, discard it, move to the next
# ================================================================
# yield from — delegate to a sub-generator
# ================================================================
def flatten(nested):
    """Recursively flatten any nested iterable."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item) # delegate recursively
        else:
            yield item # yield the leaf value

list(flatten([1, [2, [3, 4], 5], [6, [7]]])) # [1,2,3,4,5,6,7]
# Without yield from, you'd need:
#     for val in flatten(item):
#         yield val
# ================================================================
# Generator send() — two-way communication
# ================================================================
def running_average():
    """Generator that computes a running average as values are sent in."""
    total = 0
    count = 0
    average = None
    while True:
        value = yield average # yield the current avg, receive a new value
        if value is None:
            break
        total += value
        count += 1
        average = total / count

gen = running_average()
next(gen) # None — prime the generator (run to the first yield)
gen.send(10) # 10.0 — running avg: 10/1
gen.send(20) # 15.0 — running avg: 30/2
gen.send(30) # 20.0 — running avg: 60/3
# ================================================================
# Generator Pipeline — real-world data processing
# ================================================================
def read_logs(filename):
    """Stage 1: Read lines."""
    with open(filename) as f:
        for line in f:
            yield line.strip()

def parse_json_lines(lines):
    """Stage 2: Parse each line."""
    import json
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue # skip malformed lines (never use a bare except)

def filter_errors(records):
    """Stage 3: Keep only error records."""
    for record in records:
        if record.get('level') == 'ERROR':
            yield record

def add_metadata(records):
    """Stage 4: Enrich records."""
    for record in records:
        record['processed'] = True
        yield record

# Build the pipeline — no data is loaded yet!
lines = read_logs('app.log')
records = parse_json_lines(lines)
errors = filter_errors(records)
enriched = add_metadata(errors)
# Data flows through ONE record at a time — constant memory!
for record in enriched:
    save_to_db(record)

2.5 itertools Module — Must-Know for Interviews
import itertools
# ================================================================
# combinations — choose r items (order doesn't matter, no repetition)
# ================================================================
list(itertools.combinations([1,2,3,4], 2))
# [(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)] — C(4,2) = 6
# DSA use: generate all pairs, all subsets of size k
# ================================================================
# permutations — all orderings of r items
# ================================================================
list(itertools.permutations([1,2,3], 2))
# [(1,2),(1,3),(2,1),(2,3),(3,1),(3,2)] — P(3,2) = 6
list(itertools.permutations([1,2,3]))
# [(1,2,3),(1,3,2),(2,1,3),(2,3,1),(3,1,2),(3,2,1)] — 3! = 6
# ================================================================
# combinations_with_replacement — choose r with repetition allowed
# ================================================================
list(itertools.combinations_with_replacement([1,2,3], 2))
# [(1,1),(1,2),(1,3),(2,2),(2,3),(3,3)]
# ================================================================
# product — cartesian product (like nested for loops)
# ================================================================
list(itertools.product([1,2], ['a','b']))
# [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
# product with repeat:
list(itertools.product([0,1], repeat=3)) # all 3-bit binary numbers
# [(0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0),(1,1,1)]
# DSA use: brute-force all combinations
# ================================================================
# chain — concatenate multiple iterables lazily
# ================================================================
list(itertools.chain([1,2],[3,4],[5])) # [1,2,3,4,5]
# chain.from_iterable — flatten one level of nesting
list(itertools.chain.from_iterable([[1,2],[3,4],[5]])) # [1,2,3,4,5]
# ================================================================
# islice — slice a generator (you can't use [a:b] on iterators!)
# ================================================================
gen = (x**2 for x in range(100))
list(itertools.islice(gen, 5)) # [0,1,4,9,16] — first 5
list(itertools.islice(range(100), 10, 20)) # [10,11,...,19] — slice by indices
# Backend use: pagination!
def get_page(iterable, page_size, page_num):
    """Get a specific page from any iterable."""
    start = page_size * (page_num - 1)
    return list(itertools.islice(iterable, start, start + page_size))
# ================================================================
# takewhile / dropwhile
# ================================================================
list(itertools.takewhile(lambda x: x < 5, [1,3,5,2,1]))
# [1,3] — stops at the FIRST element >= 5 (even though 2,1 come after!)
list(itertools.dropwhile(lambda x: x < 5, [1,3,5,2,1]))
# [5,2,1] — drops while the condition is True, then yields ALL remaining
# ================================================================
# groupby — groups CONSECUTIVE elements (⚠️ MUST SORT FIRST!)
# ================================================================
data = [('A',1),('A',2),('B',3),('B',4),('A',5)]
# Without sorting — 'A' appears in two separate groups:
for key, group in itertools.groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# A [('A',1),('A',2)]
# B [('B',3),('B',4)]
# A [('A',5)] — a separate group!
# With sorting — all same-key items are grouped:
for key, group in itertools.groupby(sorted(data, key=lambda x: x[0]),
                                    key=lambda x: x[0]):
    print(key, list(group))
# A [('A',1),('A',2),('A',5)]
# B [('B',3),('B',4)]
# ================================================================
# accumulate — running accumulation (prefix sums!)
# ================================================================
list(itertools.accumulate([1,2,3,4,5])) # [1,3,6,10,15] — prefix sums
import operator
list(itertools.accumulate([1,2,3,4,5], operator.mul)) # [1,2,6,24,120] — prefix products
list(itertools.accumulate([3,1,4,1,5], max)) # [3,3,4,4,5] — running max
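Prefix sums built with accumulate enable O(1) range-sum queries after O(n) preprocessing, a classic DSA pattern (the helper name below is just for the sketch):

```python
import itertools

nums = [3, 1, 4, 1, 5]
prefix = [0] + list(itertools.accumulate(nums))  # [0, 3, 4, 8, 9, 14]

def range_sum(i, j):
    """Sum of nums[i..j] inclusive, O(1) per query."""
    return prefix[j + 1] - prefix[i]

print(range_sum(1, 3))   # 6, that's 1 + 4 + 1
```

The leading 0 sentinel makes the formula uniform: no special case for ranges that start at index 0.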
# ================================================================
# count / cycle / repeat — infinite iterators
# ================================================================
counter = itertools.count(start=10, step=2)
[next(counter) for _ in range(4)] # [10, 12, 14, 16]
cycler = itertools.cycle(['R','G','B'])
[next(cycler) for _ in range(7)] # ['R','G','B','R','G','B','R']
list(itertools.repeat(0, 5)) # [0,0,0,0,0]

Interview Problem: Generate All Subsets
import itertools
def subsets(nums: list) -> list:
    """
    Generate all subsets (the power set) of nums.
    Example: [1,2,3] → [[],[1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]]
    """
    result = []
    for r in range(len(nums) + 1): # for each subset size 0..n
        for combo in itertools.combinations(nums, r):
            result.append(list(combo))
    return result

# Alternative using a bitmask (no itertools):
def subsets_bits(nums: list) -> list:
    n = len(nums)
    result = []
    for mask in range(1 << n): # 2^n possible masks (0 to 2^n - 1)
        subset = []
        for i in range(n):
            if mask & (1 << i): # if the i-th bit is set
                subset.append(nums[i]) # include nums[i]
        result.append(subset)
    return result
# Time: O(n × 2^n), Space: O(n × 2^n)

Interview Problem: Paginate a Large Dataset
import itertools
def paginate(data_generator, page_size: int, page_num: int) -> dict:
    """
    Paginate any iterable (works with generators — memory efficient!).
    page_num is 1-indexed.
    Real-world use: paginate database cursor results without
    loading the entire result set into memory.
    """
    start = page_size * (page_num - 1)
    page_items = list(itertools.islice(data_generator, start, start + page_size))
    return {
        "page": page_num,
        "page_size": page_size,
        "items": page_items,
        "has_more": len(page_items) == page_size, # there might be more
    }

# Usage with a generator (memory efficient):
def stream_users():
    for i in range(1, 10001):
        yield {"id": i, "name": f"User_{i}"}

# Get page 3, 20 items per page:
result = paginate(stream_users(), page_size=20, page_num=3)
# → {"page": 3, "items": [user41..user60], "has_more": True}

Quick Cheatsheet — Iterables, Iterators, Generators
# Iterable vs Iterator
iter(lst) # get iterator from iterable
next(it) # get next value from iterator
next(it, default) # get next or return default (no error)
# Generator function
def gen(): yield value # pauses and resumes
yield from sub_gen() # delegate to another generator
# Generator expression (lazy, memory efficient)
(x**2 for x in lst) # vs [x**2 for x in lst] (eager)
# itertools essentials
itertools.combinations(lst, r) # C(n,r) → order doesn't matter
itertools.permutations(lst, r) # P(n,r) → order matters
itertools.product(a, b) # cartesian product
itertools.chain(*lists) # concatenate iterables lazily
itertools.chain.from_iterable(nested) # flatten one level
itertools.accumulate(lst) # prefix sums
itertools.islice(gen, start, stop) # slice any iterator
itertools.groupby(sorted_data, key) # group (SORT FIRST!)
itertools.takewhile(pred, it) # take while condition True
itertools.dropwhile(pred, it) # drop while condition True
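The SORT FIRST warning on groupby deserves a demonstration, since it is a frequent interview trap. A minimal sketch (sample data invented):

```python
import itertools

data = ["apple", "avocado", "banana", "cherry", "apricot"]

# groupby only groups ADJACENT equal keys, so unsorted input
# starts a fresh group every time the key changes:
keys = [k for k, _ in itertools.groupby(data, key=lambda w: w[0])]
print(keys)  # ['a', 'b', 'c', 'a'] -- 'a' appears twice!

# Sort by the same key first, then group:
data.sort(key=lambda w: w[0])
groups = {k: list(g) for k, g in itertools.groupby(data, key=lambda w: w[0])}
print(groups)
# {'a': ['apple', 'avocado', 'apricot'], 'b': ['banana'], 'c': ['cherry']}
```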
# Common patterns
sum(x**2 for x in lst) # generator in function call
list(map(fn, lst)) # apply fn to each element
list(filter(fn, lst)) # keep elements where fn is True
[x for sub in nested for x in sub] # flatten nested list

SECTION 3 — DECORATORS
3.1 Functions as First-Class Objects
📘 Theory
In Python, functions are objects. They can be assigned to variables, passed as arguments, returned from other functions, and stored in data structures. This is the foundation that makes decorators possible.
First-class object means a function can:
- Be assigned to a variable
- Be passed as an argument to another function
- Be returned from a function
- Be stored in a list, dict, or set
- Have attributes
- Be created at runtime (anonymously, with lambda)
# ================================================================
# Functions can be assigned to variables
# ================================================================
def greet(name: str) -> str:
return f"Hello, {name}!"
say_hi = greet # NO parentheses = reference to function object
# With parentheses greet() = CALL the function
print(say_hi) # <function greet at 0x...>
say_hi("Alice") # "Hello, Alice!" → calling via new name
# ================================================================
# Functions can be passed as arguments
# ================================================================
def apply(func, value):
"""Accepts a function and calls it with value."""
return func(value)
apply(len, "hello") # 5 → passed built-in len
apply(str.upper, "hello") # "HELLO" → passed method
apply(greet, "Bob") # "Hello, Bob!" → passed custom function
# ================================================================
# Functions can be returned from functions
# ================================================================
def make_multiplier(n: int):
"""Factory function — returns a customized function."""
def multiplier(x): # inner function defined inside outer
return x * n # 'n' is captured from enclosing scope
return multiplier # return the function OBJECT (no parentheses)
double = make_multiplier(2) # double is now a function
triple = make_multiplier(3)
double(5) # 10
triple(5) # 15
double(triple(4)) # double(12) = 24
# ================================================================
# Closures — inner function remembers enclosing scope
# ================================================================
def make_counter(start: int = 0):
"""Returns a counter function that remembers its count."""
count = start # enclosed variable — lives in closure
def increment(by: int = 1):
nonlocal count # ← tells Python: modify the ENCLOSING count
count += by # without nonlocal, this would be UnboundLocalError!
return count
return increment
counter_a = make_counter(0)
counter_b = make_counter(100) # independent closure!
counter_a() # 1
counter_a() # 2
counter_a(5) # 7
counter_b() # 101 — completely independent!
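A closure's captured variables can be inspected at runtime through the standard `__code__.co_freevars` and `__closure__` attributes, which is handy for explaining in an interview where the state actually lives. A small sketch using the same counter pattern:

```python
def make_counter(start: int = 0):
    count = start
    def increment(by: int = 1):
        nonlocal count
        count += by
        return count
    return increment

ctr = make_counter(10)
ctr()  # 11

# The captured variable lives in a "cell" attached to the function:
print(ctr.__code__.co_freevars)          # ('count',)
print(ctr.__closure__[0].cell_contents)  # 11
```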
# ================================================================
# nonlocal vs global
# ================================================================
x = 10 # module-level variable
def outer():
x = 20 # outer local variable
def inner():
nonlocal x # refers to outer()'s x, NOT module x
x += 5
print(x) # 25
inner()
print(x) # 25 — modified by inner!
# global keyword (rarely needed, usually bad practice):
def bad_practice():
global x # refers to module-level x
x += 1 # modifies global state — avoid!
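A related scoping trap worth knowing: closures capture variables, not values. The classic loop-variable pitfall, as a sketch:

```python
# Lambdas capture the VARIABLE i, not its value at creation time,
# so by the time they run, i holds its final loop value:
fns = [lambda: i for i in range(3)]
print([f() for f in fns])  # [2, 2, 2]

# Fix: freeze the current value with a default argument:
fns = [lambda i=i: i for i in range(3)]
print([f() for f in fns])  # [0, 1, 2]
```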
# ================================================================
# Functions stored in data structures
# ================================================================
operations = {
'add': lambda a, b: a + b,
'subtract': lambda a, b: a - b,
'multiply': lambda a, b: a * b,
'divide': lambda a, b: a / b if b != 0 else None
}
operations['add'](3, 4) # 7
operations['multiply'](3, 4) # 12
# Dispatch table pattern (replaces long if-elif chains):
def process(op: str, a: int, b: int):
handler = operations.get(op)
if not handler:
raise ValueError(f"Unknown operation: {op}")
return handler(a, b)

3.2 Basic Decorator Syntax
📘 Theory
A decorator is a function that takes a function and returns a modified version of it. It wraps the original function to add behavior before, after, or around it.
Pattern:
original_func = decorator(original_func)
@ syntax is just syntactic sugar for the above line.
import functools
# ================================================================
# Step 1: The manual wrapper pattern (what @ does under the hood)
# ================================================================
def my_decorator(func):
"""
Takes a function and returns a wrapped version.
This IS the decorator.
"""
def wrapper(*args, **kwargs): # accepts any arguments
print("── Before function call ──")
result = func(*args, **kwargs) # call the ORIGINAL function
print("── After function call ──")
return result # MUST return result!
return wrapper # return the wrapper function
def say_hello(name: str):
print(f"Hello, {name}!")
# Manual decoration:
say_hello = my_decorator(say_hello) # wrap and reassign
say_hello("Alice")
# ── Before function call ──
# Hello, Alice!
# ── After function call ──
# ================================================================
# Step 2: @ syntax sugar (identical result, cleaner)
# ================================================================
@my_decorator # ← equivalent to:
def say_hello(name: str): # say_hello = my_decorator(say_hello)
print(f"Hello, {name}!")
# The @ line runs at FUNCTION DEFINITION time, not call time
# ================================================================
# Step 3: ALWAYS use functools.wraps (preserves metadata)
# ================================================================
def my_decorator(func):
@functools.wraps(func) # ← copies name, docstring, etc.
def wrapper(*args, **kwargs):
"""This is wrapper's docstring."""
print("Before")
result = func(*args, **kwargs)
print("After")
return result
return wrapper
@my_decorator
def say_hello(name: str):
"""Greet someone by name."""
print(f"Hello, {name}!")
# Without @wraps:
# say_hello.__name__ → 'wrapper' ❌ WRONG (breaks debugging)
# say_hello.__doc__ → "This is wrapper's docstring." ❌ WRONG
# With @wraps:
# say_hello.__name__ → 'say_hello' ✅ correct!
# say_hello.__doc__ → "Greet someone by name." ✅ correct!
# ================================================================
# Decorator that measures execution time
# ================================================================
import time
def timer(func):
"""Decorator: prints how long a function takes."""
@functools.wraps(func)
def wrapper(*args, **kwargs):
start = time.perf_counter() # high-precision timer
result = func(*args, **kwargs) # run the function
elapsed = time.perf_counter() - start
print(f"[TIMER] {func.__name__} took {elapsed:.6f}s")
return result
return wrapper
@timer
def slow_function(n: int) -> int:
"""Sum first n numbers."""
return sum(range(n))
slow_function(1_000_000)
# [TIMER] slow_function took 0.034521s
# 499999500000
# ================================================================
# Decorator that logs function calls
# ================================================================
def log_calls(func):
"""Decorator: logs every call with arguments and return value."""
@functools.wraps(func)
def wrapper(*args, **kwargs):
# Format arguments for display
args_str = ', '.join(repr(a) for a in args)
kwargs_str = ', '.join(f"{k}={v!r}" for k, v in kwargs.items())
signature = ', '.join(filter(None, [args_str, kwargs_str]))
print(f"CALL → {func.__name__}({signature})")
result = func(*args, **kwargs)
print(f"RETURN → {func.__name__} = {result!r}")
return result
return wrapper
@log_calls
def add(a: int, b: int) -> int:
return a + b
add(3, 5)
# CALL → add(3, 5)
# RETURN → add = 8
add(a=10, b=20)
# CALL → add(a=10, b=20)
# RETURN → add = 30

3.3 Decorators with Arguments (Decorator Factories)
📘 Theory
When a decorator needs its own arguments, you need one extra level of nesting. This creates a "decorator factory" ā a function that returns a decorator.
Without arguments: @decorator
decorator(func) → wrapper
With arguments: @decorator(arg1, arg2)
decorator(arg1, arg2) → actual_decorator
actual_decorator(func) → wrapper
Triple nesting:
factory(args) → decorator
decorator(func) → wrapper
wrapper(...) → result
import functools
import time
# ================================================================
# @retry(max_attempts=3, delay=1.0) — with arguments
# ================================================================
def retry(max_attempts: int = 3, delay: float = 1.0, exceptions=(Exception,)):
"""
Decorator FACTORY — returns a decorator configured with these args.
Level 1: retry(...) → returns decorator
Level 2: decorator(func) → returns wrapper
Level 3: wrapper(...) → runs the actual logic
"""
def decorator(func): # level 2 — the actual decorator
@functools.wraps(func)
def wrapper(*args, **kwargs): # level 3 — the wrapper
last_exc = None
for attempt in range(1, max_attempts + 1):
try:
return func(*args, **kwargs) # try the call
except exceptions as e:
last_exc = e
if attempt < max_attempts:
print(f"Attempt {attempt} failed: {e}. "
f"Retrying in {delay}s...")
time.sleep(delay)
raise last_exc # all attempts failed
return wrapper
return decorator # level 1 returns decorator
@retry(max_attempts=3, delay=0.5, exceptions=(ConnectionError,))
def fetch_data(url: str) -> str:
"""Fetch data from URL ā may fail transiently."""
import random
if random.random() < 0.7:
raise ConnectionError("Network timeout")
return f"data from {url}"
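A classic factory pitfall: forgetting the parentheses. The sketch below (a simplified retry, names hypothetical) shows what happens when a decorator factory is applied without being called:

```python
import functools

def retry(max_attempts: int = 3):
    """Decorator FACTORY -- must be CALLED, even with no arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_exc = e
            raise last_exc
        return wrapper
    return decorator

@retry()        # correct: retry() returns the decorator
def ok() -> int:
    return 1

@retry          # WRONG: this calls retry(broken), so max_attempts
def broken():   # becomes the function, and 'broken' is rebound
    return 1    # to the inner 'decorator'

ok()        # 1
# broken()  # TypeError: decorator() missing 1 required positional argument: 'func'
```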
# ================================================================
# @validate_types — checks argument types at runtime
# ================================================================
def validate_types(**type_map):
"""
Validates argument types before calling function.
Usage: @validate_types(name=str, age=int)
"""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
# Map positional args to parameter names
import inspect
sig = inspect.signature(func)
bound = sig.bind(*args, **kwargs)
bound.apply_defaults()
for param_name, expected_type in type_map.items():
if param_name in bound.arguments:
value = bound.arguments[param_name]
if not isinstance(value, expected_type):
raise TypeError(
f"Argument '{param_name}' must be "
f"{expected_type.__name__}, "
f"got {type(value).__name__}"
)
return func(*args, **kwargs)
return wrapper
return decorator
@validate_types(name=str, age=int)
def create_user(name: str, age: int) -> dict:
return {"name": name, "age": age}
create_user("Alice", 30) # ✅ works
# create_user("Alice", "30") # ❌ TypeError: 'age' must be int, got str
# ================================================================
# @repeat(n) — repeat function call n times
# ================================================================
def repeat(n: int):
"""Call the decorated function n times."""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
results = []
for _ in range(n):
results.append(func(*args, **kwargs))
return results # return list of all results
return wrapper
return decorator
@repeat(3)
def say(message: str) -> str:
print(message)
return message
say("Hello!")
# Hello!
# Hello!
# Hello!
# returns: ['Hello!', 'Hello!', 'Hello!']

3.4 Class-Based Decorators
📘 Theory
A class can act as a decorator by implementing __call__. This is useful when the decorator needs to maintain state across calls.
import functools
import time # used by the RateLimit example below
# ================================================================
# Class decorator: counts how many times function is called
# ================================================================
class CountCalls:
"""Decorator that counts invocations."""
def __init__(self, func):
functools.update_wrapper(self, func) # preserve metadata
self.func = func
self.count = 0 # mutable state!
def __call__(self, *args, **kwargs): # makes instance callable
self.count += 1
print(f"[{self.func.__name__}] Call #{self.count}")
return self.func(*args, **kwargs)
@CountCalls
def say_hi(name: str) -> str:
return f"Hi, {name}!"
say_hi("Alice") # [say_hi] Call #1
say_hi("Bob") # [say_hi] Call #2
say_hi("Charlie") # [say_hi] Call #3
say_hi.count # 3 — access decorator state directly!
# ================================================================
# Class decorator: caches results (manual memoization)
# ================================================================
class Memoize:
"""Decorator that caches function results."""
def __init__(self, func):
functools.update_wrapper(self, func)
self.func = func
self.cache = {} # stores (args, kwargs) → result
def __call__(self, *args, **kwargs):
# Create hashable cache key
key = (args, tuple(sorted(kwargs.items())))
if key not in self.cache:
self.cache[key] = self.func(*args, **kwargs) # compute once
return self.cache[key] # return cached
def clear_cache(self):
"""Allow manual cache invalidation."""
self.cache.clear()
@Memoize
def expensive_computation(n: int) -> int:
print(f"Computing for {n}...") # only prints first time per unique n
return n ** 2
expensive_computation(5) # Computing for 5... → 25
expensive_computation(5) # 25 (cached — no print!)
expensive_computation(10) # Computing for 10... → 100
expensive_computation.clear_cache() # bust cache
expensive_computation(5) # Computing for 5... → 25 (recomputed)
# ================================================================
# Class decorator with arguments
# ================================================================
class RateLimit:
"""Class decorator factory for rate limiting."""
def __init__(self, max_calls: int, period: float):
self.max_calls = max_calls
self.period = period
def __call__(self, func): # called with the function
from collections import deque
calls = deque()
@functools.wraps(func)
def wrapper(*args, **kwargs):
now = time.time()
# Remove timestamps outside window
while calls and calls[0] < now - self.period:
calls.popleft()
if len(calls) >= self.max_calls:
raise Exception(
f"Rate limit exceeded: "
f"{self.max_calls} calls per {self.period}s"
)
calls.append(now)
return func(*args, **kwargs)
return wrapper
@RateLimit(max_calls=3, period=60) # 3 calls per minute
def api_call(endpoint: str) -> str:
return f"Response from {endpoint}"

3.5 Built-in Decorators
# ================================================================
# @staticmethod — method that needs no self or cls
# ================================================================
class MathUtils:
@staticmethod
def is_prime(n: int) -> bool:
"""No access to instance or class ā just a namespaced function."""
if n < 2: return False
for i in range(2, int(n**0.5) + 1):
if n % i == 0:
return False
return True
@staticmethod
def gcd(a: int, b: int) -> int:
while b:
a, b = b, a % b
return a
# Call on class OR instance — no instance needed:
MathUtils.is_prime(17) # True
MathUtils.gcd(12, 8) # 4
# ================================================================
# @classmethod — method that receives the CLASS as first argument
# ================================================================
class Pizza:
def __init__(self, ingredients: list):
self.ingredients = ingredients
def __repr__(self):
return f"Pizza({self.ingredients})"
@classmethod
def margherita(cls) -> 'Pizza':
"""Alternative constructor ā factory method pattern."""
return cls(['mozzarella', 'tomatoes']) # cls = Pizza class
@classmethod
def pepperoni(cls) -> 'Pizza':
return cls(['mozzarella', 'pepperoni', 'tomatoes'])
@classmethod
def from_string(cls, ingredients_str: str) -> 'Pizza':
"""Create from comma-separated string."""
return cls(ingredients_str.split(','))
# Create without calling __init__ directly:
p1 = Pizza.margherita() # Pizza(['mozzarella', 'tomatoes'])
p2 = Pizza.pepperoni() # Pizza(['mozzarella', 'pepperoni', 'tomatoes'])
p3 = Pizza.from_string("cheese,mushrooms,olives")
# ================================================================
# @property — method that behaves like an attribute
# ================================================================
import math
class Circle:
def __init__(self, radius: float):
self._radius = radius # convention: _ prefix = "private"
@property
def radius(self) -> float:
"""Getter — called when you access circle.radius"""
return self._radius
@radius.setter
def radius(self, value: float) -> None:
"""Setter — called when you do circle.radius = x"""
if value < 0:
raise ValueError(f"Radius must be non-negative, got {value}")
self._radius = value
@radius.deleter
def radius(self) -> None:
"""Deleter — called when you do del circle.radius"""
print("Deleting radius")
del self._radius
@property
def area(self) -> float:
"""Computed read-only property — no setter."""
return math.pi * self._radius ** 2
@property
def circumference(self) -> float:
return 2 * math.pi * self._radius
c = Circle(5)
c.radius # 5 → calls getter
c.radius = 10 # calls setter (validates!)
c.area # 314.159... → computed on access
# c.area = 100 # ❌ AttributeError: can't set attribute (no setter)
# c.radius = -1 # ❌ ValueError: Radius must be non-negative
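A related stdlib tool worth knowing: `functools.cached_property` (Python 3.8+) computes a property once per instance and caches it in the instance's `__dict__`. A sketch with a minimal Circle redefined here so it runs standalone:

```python
import functools
import math

class Circle:
    def __init__(self, radius: float):
        self.radius = radius

    @functools.cached_property
    def area(self) -> float:
        print("computing area...")   # printed only on FIRST access
        return math.pi * self.radius ** 2

c = Circle(5)
c.area      # computing area... -> 78.53981633974483
c.area      # cached in the instance __dict__ -- no recompute, no print
del c.area  # invalidate: the next access recomputes
```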
# ================================================================
# @functools.lru_cache — memoization (CRUCIAL FOR DSA!)
# ================================================================
import functools
@functools.lru_cache(maxsize=None) # maxsize=None means unlimited cache
def fib(n: int) -> int:
"""
Fibonacci with automatic memoization.
Without cache: O(2^n) — exponential
With cache: O(n) — linear!
"""
if n <= 1:
return n
return fib(n - 1) + fib(n - 2)
fib(50) # instant! Without cache: would take forever
fib(100) # still instant
fib.cache_info() # CacheInfo(hits=99, misses=101, maxsize=None, currsize=101)
fib.cache_clear() # clear the cache (useful for testing)
# Python 3.9+ shorthand (unlimited cache, slightly faster):
@functools.cache
def fib(n: int) -> int:
if n <= 1: return n
return fib(n-1) + fib(n-2)
# ⚠️ INTERVIEW TRAP: lru_cache requires HASHABLE arguments!
# @functools.cache
# def process(lst: list): # ❌ TypeError: unhashable type: 'list'
# pass
# ✅ FIX: use tuple instead of list as argument
@functools.cache
def process(items: tuple) -> int:
return sum(items)
process((1, 2, 3)) # ✅ tuple is hashable
# ================================================================
# @dataclass — auto-generate __init__, __repr__, __eq__
# ================================================================
from dataclasses import dataclass, field
@dataclass
class Employee:
name: str
department: str
salary: float
skills: list = field(default_factory=list) # ✅ correct mutable default
is_active: bool = True # simple default
# Python auto-generates:
# __init__(self, name, department, salary, skills=<factory>, is_active=True)
# __repr__ → "Employee(name='Alice', department='Engineering', ...)"
# __eq__ → compares all fields by value
e1 = Employee("Alice", "Engineering", 120_000)
e2 = Employee("Alice", "Engineering", 120_000)
e1 == e2 # True — compares values (not identity)
e1 is e2 # False — different objects
e1.skills.append("Python") # ['Python']
# Frozen dataclass (immutable — can be used as dict key):
@dataclass(frozen=True)
class Point:
x: float
y: float
p = Point(1.0, 2.0)
# p.x = 5 # ❌ FrozenInstanceError: cannot assign to field 'x'
# Ordered dataclass (enables <, >, <=, >=):
@dataclass(order=True)
class Version:
major: int
minor: int
patch: int
v1 = Version(1, 2, 3)
v2 = Version(1, 3, 0)
v1 < v2 # True — compared field by field: major → minor → patch

3.6 Real Backend Use Cases
import functools
import time
from collections import deque
# ================================================================
# Authentication decorator
# ================================================================
def require_auth(func):
"""Reject requests missing valid auth token."""
@functools.wraps(func)
def wrapper(request: dict, *args, **kwargs):
token = request.get('auth_token')
if not token:
return {"error": "Unauthorized", "status": 401}
if not is_valid_token(token): # your validation logic
return {"error": "Invalid token", "status": 403}
return func(request, *args, **kwargs)
return wrapper
def is_valid_token(token: str) -> bool:
return token == "secret123" # simplified
@require_auth
def get_user_profile(request: dict) -> dict:
user_id = request.get('user_id')
return {"user_id": user_id, "status": 200}
get_user_profile({"auth_token": "secret123", "user_id": 42}) # ✅ works
get_user_profile({"user_id": 42}) # {"error": "Unauthorized", "status": 401}
# ================================================================
# Rate limiting decorator (sliding window)
# ================================================================
def rate_limit(max_calls: int, period: float):
"""
Allow max_calls within period seconds per function.
Uses sliding window with deque.
"""
def decorator(func):
calls = deque() # timestamps of recent calls
@functools.wraps(func)
def wrapper(*args, **kwargs):
now = time.time()
# Remove timestamps outside the window
while calls and calls[0] < now - period:
calls.popleft()
if len(calls) >= max_calls:
wait = period - (now - calls[0])
raise Exception(
f"Rate limit exceeded. "
f"Try again in {wait:.1f}s"
)
calls.append(now)
return func(*args, **kwargs)
return wrapper
return decorator
@rate_limit(max_calls=5, period=60) # 5 calls per minute
def search_api(query: str) -> list:
return [f"result for {query}"]
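To see the sliding-window limiter in action, here is a self-contained sketch (the decorator is restated so the snippet runs standalone, with a deliberately short window so the demo finishes quickly):

```python
import functools
import time
from collections import deque

def rate_limit(max_calls: int, period: float):
    """Sliding-window limiter (same shape as the decorator above)."""
    def decorator(func):
        calls = deque()  # timestamps of recent calls
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = time.time()
            while calls and calls[0] < now - period:
                calls.popleft()          # drop timestamps outside the window
            if len(calls) >= max_calls:
                raise Exception("Rate limit exceeded")
            calls.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(max_calls=2, period=0.5)
def ping() -> str:
    return "pong"

ping()  # ok
ping()  # ok
try:
    ping()  # third call inside the window
except Exception as e:
    print(e)  # Rate limit exceeded
time.sleep(0.6)
ping()  # old timestamps slid out of the window -- allowed again
```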
# ================================================================
# Caching with TTL decorator
# ================================================================
def cache_with_ttl(ttl_seconds: float = 300):
"""
Cache function results with time-to-live expiry.
Cache key = function arguments (must be hashable).
"""
def decorator(func):
# cache: args_key ā (result, expiry_timestamp)
cache = {}
@functools.wraps(func)
def wrapper(*args, **kwargs):
# Create hashable key
key = (args, tuple(sorted(kwargs.items())))
now = time.time()
if key in cache:
result, expiry = cache[key]
if now < expiry: # still fresh
return result # cache HIT ✅
# Cache MISS — compute result
result = func(*args, **kwargs)
cache[key] = (result, now + ttl_seconds)
return result
# Expose cache management on the wrapper
def invalidate(*args, **kwargs):
key = (args, tuple(sorted(kwargs.items())))
cache.pop(key, None)
wrapper.cache = cache
wrapper.invalidate = invalidate
wrapper.clear = cache.clear
return wrapper
return decorator
@cache_with_ttl(ttl_seconds=60)
def get_user(user_id: int) -> dict:
"""Expensive DB query ā cached for 60 seconds."""
print(f"DB query for user {user_id}")
return {"id": user_id, "name": f"User_{user_id}"}
get_user(42) # DB query for user 42 (cache miss)
get_user(42) # (cache hit ā no print)
get_user.invalidate(42) # bust cache for user 42
get_user(42) # DB query for user 42 (cache miss again)
# ================================================================
# Retry with exponential backoff
# ================================================================
def retry_with_backoff(
max_attempts: int = 3,
base_delay: float = 1.0,
backoff: float = 2.0,
exceptions: tuple = (Exception,)
):
"""Retry with exponential backoff on failure."""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
delay = base_delay
for attempt in range(1, max_attempts + 1):
try:
return func(*args, **kwargs)
except exceptions as e:
if attempt == max_attempts:
raise # re-raise on final attempt
print(f"Attempt {attempt} failed: {e}. "
f"Retrying in {delay:.1f}s...")
time.sleep(delay)
delay *= backoff # exponential: 1 → 2 → 4 → 8...
return wrapper
return decorator
@retry_with_backoff(max_attempts=4, base_delay=0.5, backoff=2.0,
exceptions=(ConnectionError, TimeoutError))
def call_payment_api(amount: float) -> dict:
"""Call external payment API ā may fail transiently."""
import random
if random.random() < 0.6:
raise ConnectionError("Payment gateway timeout")
return {"status": "success", "amount": amount}

⚠️ Decorator Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Forgetting to return result from wrapper
# ================================================================
def bad_decorator(func):
def wrapper(*args, **kwargs):
func(*args, **kwargs) # ❌ forgot return!
return wrapper
@bad_decorator
def add(a, b): return a + b
add(3, 4) # None! Result lost.
# ✅ FIX:
def good_decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
return func(*args, **kwargs) # ✅ always return!
return wrapper
# ================================================================
# TRAP 2: Forgetting @functools.wraps
# ================================================================
def naked_decorator(func):
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
@naked_decorator
def my_function():
"""My docstring."""
pass
my_function.__name__ # 'wrapper' ❌ WRONG (breaks logging, testing)
my_function.__doc__ # None ❌ WRONG (docstring lost)
# ================================================================
# TRAP 3: Stacking decorators — order matters!
# ================================================================
@decorator_a # applied SECOND (outermost)
@decorator_b # applied FIRST (innermost)
def func(): pass
# Equivalent to:
func = decorator_a(decorator_b(func))
# When func() is called: decorator_a's wrapper runs first,
# then calls decorator_b's wrapper, which calls original func
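The call order is easy to verify with a tracing decorator; the sketch below records enter/exit events in a list instead of relying on prints:

```python
import functools

trace = []

def tag(label):
    """Factory: wraps func and records enter/exit order in 'trace'."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            trace.append(f"enter {label}")
            result = func(*args, **kwargs)
            trace.append(f"exit {label}")
            return result
        return wrapper
    return decorator

@tag("outer")   # applied SECOND -> runs FIRST at call time
@tag("inner")   # applied FIRST  -> closest to the original func
def work():
    trace.append("work")

work()
print(trace)
# ['enter outer', 'enter inner', 'work', 'exit inner', 'exit outer']
```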
# ================================================================
# TRAP 4: lru_cache with mutable arguments
# ================================================================
@functools.cache
def process(data: list) -> int: # ❌ calling with a list → TypeError: unhashable type: 'list'
return sum(data)
# ✅ FIX: convert to tuple before calling
@functools.cache
def process(data: tuple) -> int: # tuple is hashable
return sum(data)
process(tuple([1, 2, 3])) # ✅
# ================================================================
# TRAP 5: Class-based decorator vs instance methods
# ================================================================
class MyDecorator:
def __init__(self, func):
functools.update_wrapper(self, func)
self.func = func
def __call__(self, *args, **kwargs):
return self.func(*args, **kwargs)
# Works fine for standalone functions:
@MyDecorator
def standalone(): pass
# ⚠️ May have issues with instance methods (self not passed correctly)
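One common fix, sketched below with illustrative class names: implement `__get__` so the decorator participates in the descriptor protocol and binds `self` correctly when used on a method:

```python
import functools
import types

class MethodFriendly:
    """Class decorator that also works on instance methods: __get__
    binds the callable to the instance, so 'self' is passed through."""
    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.count = 0

    def __call__(self, *args, **kwargs):
        self.count += 1
        return self.func(*args, **kwargs)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                      # accessed on the class
        return types.MethodType(self, obj)   # bind like a normal method

class Greeter:
    @MethodFriendly
    def hello(self, name: str) -> str:
        return f"hello {name}"

g = Greeter()
g.hello("x")          # works: self is bound via __get__
Greeter.hello.count   # state still accessible on the class attribute
```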
# For methods, prefer function-based decorators

📦 Quick Cheatsheet — Decorators
# Basic decorator template
def decorator(func):
@functools.wraps(func) # ALWAYS include this!
def wrapper(*args, **kwargs):
# ... before ...
result = func(*args, **kwargs)
# ... after ...
return result # ALWAYS return result!
return wrapper
# Decorator with arguments (triple nesting)
def decorator_factory(arg1, arg2):
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
return decorator
# Class-based decorator
class Decorator:
def __init__(self, func):
functools.update_wrapper(self, func)
self.func = func
def __call__(self, *args, **kwargs):
return self.func(*args, **kwargs)
# Built-in decorators
@staticmethod # no self/cls
@classmethod # receives cls (alternative constructors)
@property # getter/setter as attribute
@functools.lru_cache(maxsize=None) # memoize (unlimited cache)
@functools.cache # same, Python 3.9+
@dataclass # auto __init__, __repr__, __eq__
@dataclass(frozen=True) # immutable dataclass
@dataclass(order=True) # adds <, >, <=, >=
# Stacking: bottom decorator applied first
@outer # applied second
@inner # applied first
def func(): pass
# func = outer(inner(func))

SECTION 4 — IMPORTANT PYTHON BUILT-INS AND TRICKS
4.1 Sorting with Lambda Keys
# ================================================================
# sorted() — returns a NEW list, works on ANY iterable
# sort() — IN PLACE, lists only, returns None
# ================================================================
# Single key
words = ["banana", "Apple", "cherry", "date"]
sorted(words) # ['Apple', 'banana', 'cherry', 'date']
sorted(words, key=str.lower) # ['Apple', 'banana', 'cherry', 'date']
sorted(words, key=len) # ['date', 'Apple', 'banana', 'cherry']
sorted(words, key=len, reverse=True) # ['banana', 'cherry', 'Apple', 'date']
sorted(words, key=lambda w: w[-1]) # sort by last character
# Multi-key sort — use a TUPLE as key
students = [("Alice", 90, 20), ("Bob", 85, 22), ("Charlie", 90, 19)]
# Sort by score DESC, then age ASC, then name ASC:
sorted(students, key=lambda s: (-s[1], s[2], s[0]))
# [('Charlie', 90, 19), ('Alice', 90, 20), ('Bob', 85, 22)]
# Trick: negate numeric fields for descending (strings can't be negated)
# For descending string sort, use a wrapper or cmp_to_key
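The stability of Python's sort gives one clean way to mix a descending string key with an ascending one: sort by the secondary key first, then by the primary. A sketch:

```python
# Goal: length ASC, but alphabetical DESC within each length.
words = ["bb", "aa", "ccc", "ab"]

# Sort by the secondary (string) key descending first...
tmp = sorted(words, reverse=True)   # ['ccc', 'bb', 'ab', 'aa']
# ...then by the primary key; the sort is stable, so ties keep
# their (descending) relative order from the first pass:
result = sorted(tmp, key=len)
print(result)  # ['bb', 'ab', 'aa', 'ccc']
```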
# ================================================================
# min() and max() with key
# ================================================================
nums = [-5, 3, -2, 8, -1]
min(nums) # -5 (smallest value)
min(nums, key=abs) # -1 (smallest absolute value)
max(nums, key=abs) # 8 (largest absolute value)
words = ["banana", "apple", "cherry"]
max(words, key=len) # "banana" (longest)
min(words, key=lambda w: w[-1]) # "banana" (last char 'a' is smallest)
# min/max with default (for empty iterables):
min([], default=0) # 0 — no ValueError
max([], default=float('-inf')) # -inf
# ================================================================
# Custom comparator with functools.cmp_to_key
# ================================================================
from functools import cmp_to_key
def compare(a, b):
"""
Returns:
negative → a comes first
zero → equal
positive → b comes first
"""
if a + b > b + a: # "93" > "39" → 9 before 3
return -1
elif a + b < b + a:
return 1
return 0
nums = ["9", "30", "34", "5", "3"]
sorted(nums, key=cmp_to_key(compare)) # ['9', '5', '34', '3', '30']
''.join(sorted(nums, key=cmp_to_key(compare))) # "9534330"

4.2 Number and Math Tricks
import math
# ================================================================
# divmod — quotient and remainder simultaneously
# ================================================================
q, r = divmod(17, 5) # q=3, r=2
# Equivalent to (17 // 5, 17 % 5), computed in one call
# Use case: convert seconds to hours:minutes:seconds
def seconds_to_hms(total_seconds: int) -> tuple:
minutes, seconds = divmod(total_seconds, 60)
hours, minutes = divmod(minutes, 60)
return hours, minutes, seconds
seconds_to_hms(3661) # (1, 1, 1) → 1h 1m 1s
# ================================================================
# pow with modulo — fast modular exponentiation
# ================================================================
pow(2, 10) # 1024
pow(2, 10, 1000) # 24 → (2^10) % 1000
# Built-in uses fast exponentiation: O(log n) — much faster than
# (2 ** 10) % 1000 for large exponents
# Competitive programming: (a * b) % mod
MOD = 10**9 + 7
pow(123456789, 987654321, MOD) # instant!
# ================================================================
# Integer division edge cases
# ================================================================
7 // 2 # 3 → floor division (toward -inf)
-7 // 2 # -4 → floors toward negative infinity!
# In most languages: -7 / 2 = -3 (truncates toward zero)
# Python is different — it FLOORS
int(-7 / 2) # -3 → truncate toward zero
math.trunc(-7 / 2) # -3 → same
# ================================================================
# abs — absolute value
# ================================================================
abs(-5) # 5
abs(3 + 4j) # 5.0 → magnitude of complex number
# ================================================================
# round — Python uses banker's rounding (round half to even)
# ================================================================
round(2.5) # 2 → rounds to EVEN (not 3!)
round(3.5) # 4 → rounds to EVEN
round(3.14159, 2) # 3.14
# ================================================================
# math module ā essential functions
# ================================================================
math.floor(3.7) # 3 ā round down
math.ceil(3.2) # 4 ā round up
math.sqrt(16) # 4.0 ā square root (returns float)
math.isqrt(17) # 4 ā integer square root (Python 3.8+)
math.log(100, 10) # 2.0 ā log base 10
math.log2(1024) # 10.0
math.log(math.e) # 1.0 ā natural log (base e)
math.gcd(12, 8) # 4
math.gcd(12, 8, 6) # 2 ā multiple args (Python 3.9+)
math.lcm(4, 6) # 12 ā LCM (Python 3.9+)
# Before 3.9: lcm(a,b) = abs(a*b) // math.gcd(a,b)
math.factorial(5) # 120
math.comb(5, 2) # 10 ā C(5,2) combinations
math.perm(5, 2) # 20 ā P(5,2) permutations
math.inf # float infinity
-math.inf # negative infinity
math.pi # 3.141592653589793
math.e # 2.718281828459045
math.isclose(0.1 + 0.2, 0.3) # True ā float comparison (not ==!)
# ================================================================
# Binary, Hex, Octal conversions
# ================================================================
bin(10) # '0b1010'
hex(255) # '0xff'
oct(8) # '0o10'
int('1010', 2) # 10 ā binary string to int
int('ff', 16) # 255 ā hex string to int
int('10', 8) # 8 ā octal string to int
# ================================================================
# ord / chr → character ↔ ASCII code
# ================================================================
ord('a') # 97
ord('A') # 65
ord('0') # 48
chr(97) # 'a'
chr(65) # 'A'
# Char to 0-25 index (lowercase):
ord('c') - ord('a') # 2
# Char to 0-9 index (digit):
ord('7') - ord('0') # 7
# Check if char is lowercase letter (fast):
'a' <= ch <= 'z' # True for lowercase
# Shift char by n positions (like Caesar cipher):
chr((ord('a') - ord('a') + 3) % 26 + ord('a'))  # 'd' → shift 'a' by 3
4.3 String Methods — Complete Reference
# ================================================================
# STRIPPING WHITESPACE
# ================================================================
s = " Hello, World! "
s.strip() # 'Hello, World!' ā both sides
s.lstrip() # 'Hello, World! ' ā left only
s.rstrip() # ' Hello, World!' ā right only
s.strip('! ') # 'Hello, World' ā strip specific chars
# ================================================================
# SPLITTING
# ================================================================
"a,b,c".split(',') # ['a', 'b', 'c']
"hello world".split() # ['hello', 'world'] ā any whitespace
"a b c".split() # ['a', 'b', 'c'] ā consecutive = one
"a b c".split(' ') # ['a', '', 'b', '', 'c'] ā each space
"a,,b".split(',') # ['a', '', 'b'] ā empty string between
"a:b:c:d".split(':', 1) # ['a', 'b:c:d'] ā maxsplit=1
"a:b:c:d".split(':', 2) # ['a', 'b', 'c:d']
"hello".splitlines() # ['hello'] (splits on \n, \r\n etc.)
# ================================================================
# JOINING
# ================================================================
','.join(['a', 'b', 'c']) # 'a,b,c'
' '.join(['Hello', 'World']) # 'Hello World'
''.join(['a', 'b', 'c']) # 'abc'
# ⭐ KEY INTERVIEW TRICK: Build strings with join, NOT +=
# ❌ O(n²):
result = ""
for char in "hello" * 100:
    result += char          # creates a new string each time!
# ✅ O(n):
parts = []
for char in "hello" * 100:
    parts.append(char)      # O(1) append
result = ''.join(parts)     # O(n) once at the end
# ================================================================
# SEARCHING
# ================================================================
"hello world".find('world') # 6 ā index of first match (-1 if not found)
"hello world".rfind('l') # 9 ā search from RIGHT
"hello world".index('world') # 6 ā like find but ValueError if not found
"hello".count('l') # 2 ā count all occurrences
"hello".startswith('he') # True
"hello".endswith('lo') # True
"hello".startswith(('he', 'Hi')) # True ā tuple of prefixes!
"hello".endswith(('.txt', '.csv')) # False
# ================================================================
# REPLACING
# ================================================================
"hello world".replace('world', 'Python') # 'hello Python'
"aabaa".replace('a', 'x', 2) # 'xxbaa' ā replace first 2 only
# ================================================================
# CASE METHODS
# ================================================================
"hello".upper() # 'HELLO'
"HELLO".lower() # 'hello'
"hello world".title() # 'Hello World'
"hello world".capitalize() # 'Hello world' ā only first char upper
"Hello World".swapcase() # 'hELLO wORLD'
"hello".casefold() # 'hello' ā aggressive lowercase (for comparison)
# ================================================================
# CHECKING CHARACTER TYPES
# ================================================================
"abc".isalpha() # True ā all alphabetic
"123".isdigit() # True ā all digits
"abc123".isalnum() # True ā all alphanumeric
" ".isspace() # True ā all whitespace
"hello".islower() # True
"HELLO".isupper() # True
"Hello World".istitle() # True
# ================================================================
# PADDING AND ALIGNMENT
# ================================================================
"hello".center(11, '-') # '---hello---'
"42".zfill(5) # '00042' ā pad with zeros
"hello".ljust(10, '.') # 'hello.....'
"hello".rjust(10, '.') # '.....hello'
# ================================================================
# OTHER USEFUL METHODS
# ================================================================
"hello world".partition(' ') # ('hello', ' ', 'world') ā splits on FIRST match
"hello world hello".rpartition(' ') # ('hello world', ' ', 'hello') ā from right
" hello ".strip() == "hello" # True ā common cleaning pattern4.4 f-strings and Format Tricks
name = "Alice"
age = 30
pi = 3.14159265
n = 1_000_000
# ================================================================
# Basic f-strings
# ================================================================
f"Name: {name}, Age: {age}" # 'Name: Alice, Age: 30'
f"Pi ā {pi:.2f}" # 'Pi ā 3.14' ā 2 decimal places
f"Pi ā {pi:.4f}" # 'Pi ā 3.1416'
# ================================================================
# Alignment and width
# ================================================================
f"{42:>10}" # ' 42' ā right align in 10
f"{42:<10}" # '42 ' ā left align
f"{42:^10}" # ' 42 ' ā center
f"{'hello':>10}" # ' hello'
f"{'hello':*^11}" # '***hello***' ā fill with *
# ================================================================
# Number formatting
# ================================================================
f"{n:,}" # '1,000,000' ā thousand separators
f"{n:_}" # '1_000_000' ā underscore separator
f"{0.85:.1%}" # '85.0%' ā percentage
f"{255:b}" # '11111111' ā binary
f"{255:x}" # 'ff' ā hex (lowercase)
f"{255:X}" # 'FF' ā hex (uppercase)
f"{255:o}" # '377' ā octal
f"{1234.5678:.2e}" # '1.23e+03' ā scientific notation
# ================================================================
# Debug shorthand (Python 3.8+)
# ================================================================
x = 42
f"{x=}" # 'x=42' ā prints name and value!
f"{pi=:.2f}" # 'pi=3.14'
f"{name=}" # "name='Alice'"
# ================================================================
# Expressions inside f-strings
# ================================================================
f"{2 ** 10}" # '1024'
f"{', '.join(['a','b','c'])}" # 'a, b, c'
f"{'yes' if age >= 18 else 'no'}" # 'yes'
f"{len(name)}" # '5'4.5 *args and **kwargs
# ================================================================
# *args ā collect positional arguments into TUPLE
# ================================================================
def my_sum(*args) -> float:
"""Accept any number of positional arguments."""
print(type(args)) # <class 'tuple'>
return sum(args)
my_sum(1, 2, 3, 4, 5) # 15
my_sum() # 0 ā empty tuple is fine
# ================================================================
# **kwargs ā collect keyword arguments into DICT
# ================================================================
def print_info(**kwargs) -> None:
"""Accept any number of keyword arguments."""
print(type(kwargs)) # <class 'dict'>
for key, value in kwargs.items():
print(f" {key}: {value}")
print_info(name="Alice", age=30, city="NYC")
# name: Alice
# age: 30
# city: NYC
# ================================================================
# Combining positional, *args, keyword, **kwargs
# ================================================================
def full_func(a, b, *args, keyword_only=None, **kwargs):
"""
a, b ā required positional args
*args ā extra positional args (tuple)
keyword_only ā must be passed as keyword (after *)
**kwargs ā extra keyword args (dict)
"""
print(f"a={a}, b={b}")
print(f"args={args}")
print(f"keyword_only={keyword_only}")
print(f"kwargs={kwargs}")
full_func(1, 2, 3, 4, keyword_only="yes", extra="hello")
# a=1, b=2
# args=(3, 4)
# keyword_only=yes
# kwargs={'extra': 'hello'}
# ================================================================
# Unpacking in function calls
# ================================================================
def add(a, b, c):
return a + b + c
nums = [1, 2, 3]
add(*nums) # 6 ā unpack list as positional args
config = {'a': 1, 'b': 2, 'c': 3}
add(**config) # 6 ā unpack dict as keyword args
# Combine both:
add(*[1, 2], **{'c': 3}) # 6
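The same * and ** unpacking also works inside list, tuple, set, and dict literals (PEP 448, Python 3.5+) — a quick sketch:

```python
# Unpacking inside literals (Python 3.5+, PEP 448)
a, b = [1, 2], [3, 4]
merged_list = [*a, *b, 5]            # [1, 2, 3, 4, 5]
as_tuple = (*a, *b)                  # (1, 2, 3, 4)
d1 = {'x': 1}
d2 = {'y': 2, 'x': 9}
merged_dict = {**d1, **d2}           # {'x': 9, 'y': 2} → later keys win
```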
# ================================================================
# Forwarding args (pass-through decorators, wrapper functions)
# ================================================================
def wrapper(func, *args, **kwargs):
"""Forward all arguments to func unchanged."""
return func(*args, **kwargs)
wrapper(add, 1, 2, 3) # 6
wrapper(add, a=1, b=2, c=3)  # 6
4.6 Walrus Operator := (Python 3.8+)
# ================================================================
# Walrus assigns AND returns value in one expression
# ================================================================
# Without walrus ā read input in loop:
data = input()
while data != "quit":
process(data)
data = input()
# With walrus ā cleaner:
while (data := input()) != "quit": # assign AND check in one line
process(data)
# ================================================================
# In list comprehension ā compute once, use twice
# ================================================================
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Without walrus ā compute x**2 twice:
results = [x**2 for x in data if x**2 > 20]
# With walrus ā compute once, reuse:
results = [y for x in data if (y := x**2) > 20]
# [25, 36, 49, 64, 81, 100]
# ================================================================
# In while loop for chunked processing
# ================================================================
import io
buffer = io.BytesIO(b"Hello, World! Goodbye!")
# Without walrus:
chunk = buffer.read(4)
while chunk:
process(chunk)
chunk = buffer.read(4)
# With walrus:
while chunk := buffer.read(4):
process(chunk)
# ================================================================
# In regex matching
# ================================================================
import re
texts = ["Hello 42", "World", "Python 3.11", "No numbers"]
# With walrus ā check and use match in one step:
matches = [m.group() for text in texts
if (m := re.search(r'\d+\.?\d*', text))]
# ['42', '3.11']
4.7 isinstance, type, and Type Checking
# ================================================================
# isinstance ā check if object is instance of class (or subclasses)
# ================================================================
isinstance(5, int) # True
isinstance(5.0, float) # True
isinstance(True, int) # True! (bool IS a subclass of int)
isinstance("hello", str) # True
isinstance([1,2], list) # True
# Check multiple types at once with tuple:
isinstance(5, (int, float)) # True ā is it int OR float?
isinstance("hi", (list, tuple, str)) # True
# ================================================================
# type ā check EXACT type (no subclass matching)
# ================================================================
type(5) # <class 'int'>
type(5) == int # True
type(True) == bool # True
type(True) == int # False! (True is bool, not exactly int)
# isinstance vs type:
class Animal: pass
class Dog(Animal): pass
dog = Dog()
isinstance(dog, Animal) # True ā Dog IS-A Animal
type(dog) == Animal # False ā exact type is Dog
# Rule: prefer isinstance() for type checks (respects inheritance)
# ================================================================
# hasattr ā check if object has an attribute/method
# ================================================================
hasattr([], 'append') # True
hasattr([], 'push') # False
hasattr(5, '__iter__') # False ā int is not iterable
hasattr([], '__iter__') # True ā list is iterable
# Duck typing check:
def is_iterable(obj) -> bool:
return hasattr(obj, '__iter__')
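One caveat on the hasattr check above: objects that implement only __getitem__ (the legacy sequence protocol) are iterable yet have no __iter__. A more robust sketch asks iter() directly — the Legacy class below is a hypothetical example, not from the text:

```python
def is_iterable_robust(obj) -> bool:
    """iter() also accepts __getitem__-only sequences, unlike hasattr."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False

class Legacy:                      # hypothetical __getitem__-only class
    def __getitem__(self, i):
        if i >= 3:
            raise IndexError
        return i

is_iterable_robust(Legacy())   # True — but hasattr(Legacy(), '__iter__') is False
is_iterable_robust(5)          # False
```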
# ================================================================
# callable ā check if object can be called
# ================================================================
callable(len) # True ā built-in function
callable([]) # False ā list is not callable
callable(lambda: None) # True
class WithCall:
def __call__(self): pass
callable(WithCall())    # True → has __call__
4.8 Important Built-in Functions
# ================================================================
# zip, enumerate ā covered in Section 2
# any, all ā covered in Section 2
# map, filter ā covered in Section 2
# sorted, min, max ā covered in 4.1
# ================================================================
# ================================================================
# sum ā with start value
# ================================================================
sum([1, 2, 3, 4, 5]) # 15
sum([1, 2, 3], 10) # 16 ā start=10 (10+1+2+3)
sum([[1,2],[3,4]], []) # [1,2,3,4] ā flatten (not recommended)
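sum(..., []) re-copies the accumulator list on every step, so flattening that way is quadratic; the usual O(n) alternative is itertools.chain — a quick sketch:

```python
# Preferred flatten: itertools.chain.from_iterable → O(n), no repeated copying
import itertools

nested = [[1, 2], [3, 4], [5]]
flat = list(itertools.chain.from_iterable(nested))   # [1, 2, 3, 4, 5]
# sum(nested, []) builds a brand-new list at each step → O(n²) overall
```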
# ================================================================
# range ā memory efficient integer sequence
# ================================================================
range(5) # 0,1,2,3,4
range(2, 8) # 2,3,4,5,6,7
range(0, 10, 2) # 0,2,4,6,8
range(10, 0, -1) # 10,9,8,7,6,5,4,3,2,1
range(5, 5) # empty range
list(range(5)) # [0,1,2,3,4] ā convert to list
# ================================================================
# len, id, hash
# ================================================================
len("hello") # 5
len([1, 2, 3]) # 3
len({}) # 0
id(42) # memory address of object
id("hello") # different address
hash("hello") # integer hash value
hash((1, 2, 3)) # tuples are hashable
# hash([1, 2, 3]) # TypeError: unhashable type: 'list'
# ================================================================
# open ā file handling
# ================================================================
# Reading:
with open('file.txt', 'r') as f:
content = f.read() # entire file as string
lines = f.readlines() # list of lines
for line in f: # iterate line by line (memory efficient)
process(line.rstrip('\n'))
# Writing:
with open('file.txt', 'w') as f: # 'w' = overwrite, 'a' = append
f.write("Hello\n")
f.writelines(["line1\n", "line2\n"])
# ================================================================
# vars, dir ā introspection
# ================================================================
vars() # dict of current local variables
vars(obj) # obj's __dict__
dir(obj) # list of obj's attributes and methods
dir([]) # ['append','clear','copy',...,'sort']
# ================================================================
# map to multiple types
# ================================================================
list(map(int, "12345")) # [1, 2, 3, 4, 5]
list(map(int, ["1","2","3"])) # [1, 2, 3]
# Reading multiple integers from a line (competitive programming):
a, b, c = map(int, input().split())
nums = list(map(int, input().split()))
⚠️ Built-ins Edge Cases & Interview Traps
# ================================================================
# TRAP 1: Chained comparison is not always what you expect
# ================================================================
1 < 2 < 3 # True ā Python supports chaining
1 < 2 > 1 # True ā also works
True == 1 == 1.0 # True ā all equal by value
# ================================================================
# TRAP 2: Integer caching (-5 to 256)
# ================================================================
a = 256; b = 256; a is b # True ā CPython caches -5 to 256
a = 257; b = 257; a is b # usually False → outside cache range (may be True if both literals are compiled in the same unit)
# Rule: NEVER use 'is' to compare integers. Always use ==.
# ================================================================
# TRAP 3: String interning
# ================================================================
a = "hello"; b = "hello"; a is b # True ā short strings interned
a = "hello world"; b = "hello world"
a is b # may be True or False ā don't rely on it!
# Rule: NEVER use 'is' to compare strings. Always use ==.
# ================================================================
# TRAP 4: bool is subclass of int
# ================================================================
True + True # 2
True * 5 # 5
False + 1 # 1
sum([True, False, True, True]) # 3 ā count of True values!
isinstance(True, int) # True
True == 1 # True
False == 0 # True
# Use case: count True values
booleans = [True, False, True, True, False]
sum(booleans) # 3 ā count of True
# ================================================================
# TRAP 5: Float comparison
# ================================================================
0.1 + 0.2 == 0.3 # False! (floating point precision)
0.1 + 0.2 # 0.30000000000000004
import math
math.isclose(0.1 + 0.2, 0.3) # True ā correct way to compare floats
abs((0.1 + 0.2) - 0.3) < 1e-9 # True → alternative
📦 Quick Cheatsheet — Built-ins & Tricks
# Sorting
sorted(lst, key=lambda x: (-x[1], x[0])) # multi-key: score DESC, name ASC
min(lst, key=len) # shortest string
max(lst, key=abs) # largest by absolute value
# Math
pow(base, exp, mod) # fast modular exponentiation O(log n)
divmod(17, 5) # (3, 2) quotient + remainder
math.isqrt(n) # integer square root
math.gcd(a, b) # greatest common divisor
math.comb(n, r) # C(n,r) combinations
# Characters
ord(c) - ord('a') # char to 0-25 index
chr(idx + ord('a')) # 0-25 index to char
''.join(sorted(word)) # anagram key
# Strings
''.join(parts) # build string from list (O(n))
s.split() # split on any whitespace
s.strip() # remove leading/trailing whitespace
s[::-1] # reverse string
# Type checking
isinstance(x, (int, float)) # check multiple types
hasattr(obj, '__iter__') # check if iterable
callable(obj) # check if callable
# f-strings
f"{val:.2f}" # 2 decimal places
f"{val:,}" # thousand separators
f"{val:>10}" # right-align in width 10
f"{val:.1%}" # percentage
f"{val=}" # debug: prints "val=<value>"
# Walrus operator
while (line := f.readline()): # read and check in one step
[y for x in lst if (y := f(x)) > threshold] # compute once, filter and use
Section 5
5.1 heapq (Min-Heap)
š Theory ā How Binary Heaps Work
Min-heap property: every parent ⤠both children
Stored as a FLAT ARRAY (not tree with pointers):
Index: 0 1 2 3 4 5 6
Value: [ 1, 3, 2, 7, 5, 4, 6 ]
Tree view:
1 ā index 0
/ \
3 2 ā index 1, 2
/ \ / \
7 5 4 6 ā index 3,4,5,6
Navigation formulas:
Parent of i: (i - 1) // 2
Left child of i: 2*i + 1
Right child of i: 2*i + 2
heappush: add to end, bubble UP ā O(log n)
heappop: remove root, put last element at root, bubble DOWN ā O(log n)
heapify: build heap from list ā O(n) [NOT O(n log n)!]
ā±ļø Time & Space Complexity
| Operation | Function | Time | Notes |
|---|---|---|---|
| Push | heappush(h, x) | O(log n) | Bubble up |
| Pop min | heappop(h) | O(log n) | Bubble down |
| Peek min | h[0] | O(1) | Direct access |
| Heapify | heapify(lst) | O(n) | NOT O(n log n)! |
| Push+Pop | heappushpop(h, x) | O(log n) | Atomic ā more efficient |
| Pop+Push | heapreplace(h, x) | O(log n) | Atomic ā more efficient |
| K largest | nlargest(k, lst) | O(n log k) | Better than sort for small k |
| K smallest | nsmallest(k, lst) | O(n log k) | Better than sort for small k |
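The navigation formulas and the min-heap invariant from the table above can be checked directly on a heapified list — a small verification sketch:

```python
# Verify parent/child index formulas and the heap property on a real heap
import heapq

heap = [5, 3, 8, 1, 2, 7, 6]
heapq.heapify(heap)                         # O(n) in-place

n = len(heap)
for i in range(n):
    left, right = 2 * i + 1, 2 * i + 2      # child formulas
    if left < n:
        assert heap[i] <= heap[left]        # parent <= left child
    if right < n:
        assert heap[i] <= heap[right]       # parent <= right child
    if i > 0:
        assert heap[(i - 1) // 2] <= heap[i]  # parent formula holds too
assert heap[0] == min(heap)                 # root is always the minimum
```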
import heapq
# ================================================================
# Basic operations
# ================================================================
heap = []
heapq.heappush(heap, 5) # [5]
heapq.heappush(heap, 1) # [1, 5]
heapq.heappush(heap, 3) # [1, 5, 3]
heapq.heappush(heap, 2) # [1, 2, 3, 5]
heap[0] # 1 ā peek at MIN (no removal) O(1)
heapq.heappop(heap) # 1 ā remove and return MIN O(log n)
# heap is now [2, 5, 3]
# Build heap from existing list ā O(n)
nums = [5, 3, 8, 1, 2, 7]
heapq.heapify(nums) # [1, 2, 7, 3, 5, 8] → valid heap
# ā ļø heapify modifies list IN PLACE!
# ================================================================
# MAX-HEAP trick ā negate all values
# ================================================================
# Python only has min-heap. Simulate max-heap by pushing negative values.
max_heap = []
for val in [5, 3, 8, 1, 2]:
heapq.heappush(max_heap, -val) # push negative
largest = -heapq.heappop(max_heap) # negate on pop ā 8
# Or negate all at once:
nums = [5, 3, 8, 1, 2]
max_heap = [-x for x in nums]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap) # 8
# ================================================================
# Tuple trick ā priority queues
# ================================================================
# Tuples compared element by element: first element = priority
task_queue = []
heapq.heappush(task_queue, (3, "low priority task"))
heapq.heappush(task_queue, (1, "high priority task"))
heapq.heappush(task_queue, (2, "medium priority task"))
priority, task = heapq.heappop(task_queue) # (1, "high priority task")
# ā ļø INTERVIEW TRAP: equal priorities compare second element!
# If second element is non-comparable (custom object), it CRASHES!
# ✅ FIX: add a unique tiebreaker counter
import itertools
counter = itertools.count()
heapq.heappush(task_queue, (1, next(counter), "task A"))
heapq.heappush(task_queue, (1, next(counter), "task B"))
# Counter ensures FIFO ordering within same priority level
# ================================================================
# heappushpop vs heapreplace
# ================================================================
heap = [1, 3, 5]
# heappushpop: push item, then pop smallest ā more efficient than separate ops
result = heapq.heappushpop(heap, 2) # pushes 2, pops 1 ā returns 1
# ✅ If item <= heap[0]: returns item immediately (heap unchanged)
# Otherwise: replaces heap[0], sifts down
# heapreplace: pop smallest first, then push ā heap must NOT be empty
result = heapq.heapreplace(heap, 0) # pops 2, pushes 0 ā returns 2
# ✅ Faster than heappop + heappush (one sift instead of two)
# ================================================================
# nlargest and nsmallest
# ================================================================
nums = [10, 4, 6, 8, 2, 9, 1, 7, 3, 5]
heapq.nlargest(3, nums) # [10, 9, 8]
heapq.nsmallest(3, nums) # [1, 2, 3]
# With key function:
people = [("Alice", 25), ("Bob", 30), ("Charlie", 20)]
heapq.nlargest(2, people, key=lambda x: x[1])
# [('Bob', 30), ('Alice', 25)]
# When to use nlargest/nsmallest vs sort:
# k << n ā nlargest/nsmallest wins: O(n log k) vs O(n log n)
# k ≈ n → just sort
🎯 DSA Use Cases
# ================================================================
# K Largest Elements ā maintain min-heap of size k
# ================================================================
def k_largest(nums: list, k: int) -> list:
"""
Find k largest elements using min-heap of size k.
Min-heap of size k: heap[0] = smallest of the k largest
When we see element > heap[0], it beats the current smallest of k-largest
"""
heap = nums[:k]
heapq.heapify(heap) # O(k)
for num in nums[k:]: # O((n-k) log k)
if num > heap[0]:
heapq.heapreplace(heap, num) # atomic pop+push
return sorted(heap, reverse=True) # return sorted descending
# Time: O(n log k), Space: O(k)
# Better than O(n log n) when k << n
# ================================================================
# Merge K Sorted Lists
# ================================================================
def merge_k_sorted(lists: list) -> list:
"""
Merge k sorted lists into one sorted list.
Strategy: use heap to always pick smallest among current heads.
"""
heap = []
result = []
# Initialize: push first element from each list
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0], i, 0)) # (value, list_idx, elem_idx)
while heap:
val, list_i, elem_i = heapq.heappop(heap)
result.append(val)
# Push next element from same list
next_i = elem_i + 1
if next_i < len(lists[list_i]):
next_val = lists[list_i][next_i]
heapq.heappush(heap, (next_val, list_i, next_i))
return result
# Time: O(N log k) where N=total elements, k=number of lists
# ================================================================
# Dijkstra's Shortest Path
# ================================================================
def dijkstra(graph: dict, start: int) -> dict:
"""
graph: {node: [(neighbor, weight), ...]}
Returns: shortest distances from start to all nodes
"""
dist = {start: 0}
heap = [(0, start)] # (distance, node)
while heap:
d, u = heapq.heappop(heap)
if d > dist.get(u, float('inf')):
continue # outdated entry ā skip
for v, weight in graph.get(u, []):
new_dist = d + weight
if new_dist < dist.get(v, float('inf')):
dist[v] = new_dist
heapq.heappush(heap, (new_dist, v))
return dist
# Time: O((V + E) log V)
# ================================================================
# Find Median from Data Stream
# ================================================================
class MedianFinder:
"""
Use two heaps:
- max_heap: lower half (negate for max-heap simulation)
- min_heap: upper half
- len(max_heap) >= len(min_heap)
- max_heap[0] <= min_heap[0] always
"""
def __init__(self):
self.max_heap = [] # lower half (max-heap via negation)
self.min_heap = [] # upper half (min-heap)
def add_num(self, num: int) -> None:
# Push to max_heap first
heapq.heappush(self.max_heap, -num)
# Balance: max_heap top must be <= min_heap top
if self.min_heap and -self.max_heap[0] > self.min_heap[0]:
val = -heapq.heappop(self.max_heap)
heapq.heappush(self.min_heap, val)
# Balance sizes: max_heap can have at most 1 more element
if len(self.max_heap) > len(self.min_heap) + 1:
val = -heapq.heappop(self.max_heap)
heapq.heappush(self.min_heap, val)
elif len(self.min_heap) > len(self.max_heap):
val = heapq.heappop(self.min_heap)
heapq.heappush(self.max_heap, -val)
def find_median(self) -> float:
if len(self.max_heap) > len(self.min_heap):
return -self.max_heap[0] # odd total ā max_heap has extra
return (-self.max_heap[0] + self.min_heap[0]) / 2.0
5.2 collections Module (Deep Dive)
from collections import Counter, defaultdict, deque, OrderedDict, namedtuple, ChainMap
# ================================================================
# Counter ā additional tricks beyond Section 1
# ================================================================
c = Counter(['a', 'b', 'a', 'c', 'b', 'a'])
# Counter({'a': 3, 'b': 2, 'c': 1})
# elements() ā iterate with multiplicity
list(c.elements()) # ['a','a','a','b','b','c']
# subtract (in-place, allows negative counts)
c.subtract({'a': 2, 'b': 1})
# Counter({'a': 1, 'b': 1, 'c': 1})
# total() ā sum of all counts (Python 3.10+)
c.total() # 3
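Counter also supports multiset arithmetic with +, -, & and |; results silently drop entries whose count falls to zero or below. A quick sketch:

```python
# Counter multiset arithmetic (entries with count <= 0 are dropped)
from collections import Counter

x = Counter('aab')    # Counter({'a': 2, 'b': 1})
y = Counter('abc')    # Counter({'a': 1, 'b': 1, 'c': 1})
x + y                 # Counter({'a': 3, 'b': 2, 'c': 1}) → add counts
x - y                 # Counter({'a': 1}) → subtract, drop non-positive
x & y                 # Counter({'a': 1, 'b': 1}) → intersection: min counts
x | y                 # Counter({'a': 2, 'b': 1, 'c': 1}) → union: max counts
```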
# Sliding window character count (without Counter overhead):
from collections import Counter
def find_anagram_positions(s: str, p: str) -> list:
"""Find all start indices of p's anagrams in s."""
result = []
need = Counter(p)
window = Counter(s[:len(p)])
if window == need: result.append(0)
for i in range(len(p), len(s)):
window[s[i]] += 1 # add new char
old = s[i - len(p)]
window[old] -= 1 # remove old char
if window[old] == 0:
del window[old] # keep window clean
if window == need:
result.append(i - len(p) + 1)
return result
# ================================================================
# deque ā double-ended queue
# ================================================================
# Internal: doubly-linked list of fixed-size blocks
# ✅ O(1) at BOTH ends, ❌ O(n) random access by index
from collections import deque
d = deque([1, 2, 3])
# O(1) operations at both ends:
d.append(4) # right: [1,2,3,4]
d.appendleft(0) # left: [0,1,2,3,4]
d.pop() # right: returns 4
d.popleft() # left: returns 0
d.extend([5, 6]) # extend right
d.extendleft([9, 8]) # extend left (reversed!): [8,9,1,2,3,5,6]
d.rotate(2) # rotate right by 2
d.rotate(-1) # rotate left by 1
# Bounded deque ā acts as circular buffer / sliding window:
d = deque(maxlen=3)
d.append(1) # deque([1])
d.append(2) # deque([1, 2])
d.append(3) # deque([1, 2, 3])
d.append(4) # deque([2, 3, 4]) ā 1 auto-dropped!
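The maxlen-bounded deque above is exactly what a streaming moving average needs — a small sketch (moving_avg is an illustrative helper, not a stdlib function):

```python
# Moving average over the last k readings using a bounded deque
from collections import deque

def moving_avg(stream, k):
    window = deque(maxlen=k)        # old values fall off automatically
    for value in stream:
        window.append(value)
        yield sum(window) / len(window)

list(moving_avg([10, 20, 30, 40], k=2))   # [10.0, 15.0, 25.0, 35.0]
```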
# BFS template using deque:
def bfs(graph, start):
from collections import deque
visited = {start}
queue = deque([start]) # O(1) popleft!
result = []
while queue:
node = queue.popleft() # O(1) ā key advantage over list.pop(0)
result.append(node)
for nb in graph[node]:
if nb not in visited:
visited.add(nb)
queue.append(nb)
return result
# ================================================================
# ChainMap ā layered dict lookups
# ================================================================
from collections import ChainMap
defaults = {'color': 'red', 'size': 'medium', 'debug': False}
user_config = {'color': 'blue'}
cli_args = {'debug': True}
# Search order: cli_args ā user_config ā defaults
config = ChainMap(cli_args, user_config, defaults)
config['color'] # 'blue' ā from user_config
config['size'] # 'medium' ā from defaults
config['debug'] # True ā from cli_args
# Writing only affects FIRST map:
config['new_key'] = 'value' # goes into cli_args
# defaults and user_config are unchanged
# Access individual layers:
config.maps # [cli_args, user_config, defaults]
config.parents # ChainMap without first layer
config.new_child({'temp': 'override'})  # add new layer at front
5.3 functools Module
import functools
# ================================================================
# lru_cache / cache ā covered in Section 3.5
# ================================================================
# ================================================================
# partial ā freeze some arguments
# ================================================================
from functools import partial
def power(base: float, exponent: float) -> float:
return base ** exponent
# Create specialized functions by fixing arguments:
square = partial(power, exponent=2) # fix exponent=2
cube = partial(power, exponent=3)
square_root = partial(power, exponent=0.5)
square(5) # 25.0
cube(3) # 27.0
square_root(16) # 4.0
# Practical use: pre-configure functions
import operator
add_10 = partial(operator.add, 10)
add_10(5) # 15
add_10(100) # 110
# With positional args:
def log(level: str, message: str) -> None:
print(f"[{level}] {message}")
log_error = partial(log, 'ERROR')
log_warning = partial(log, 'WARNING')
log_info = partial(log, 'INFO')
log_error("Database connection failed") # [ERROR] Database connection failed
# ================================================================
# reduce ā fold a sequence
# ================================================================
from functools import reduce
# reduce(func, iterable, initial)
# Applies func cumulatively: func(func(func(init, a), b), c) ...
reduce(lambda acc, x: acc + x, [1,2,3,4,5]) # 15 ā sum
reduce(lambda acc, x: acc * x, [1,2,3,4,5]) # 120 ā product
reduce(lambda acc, x: acc + x, [1,2,3,4,5], 10) # 25 ā start=10
# Find maximum without max():
reduce(lambda a, b: a if a > b else b, [3,1,4,1,5,9,2,6]) # 9
# Flatten list:
reduce(lambda acc, x: acc + x, [[1,2],[3,4],[5]], [])
# [1,2,3,4,5]
# ================================================================
# cmp_to_key ā custom comparator for sorting
# ================================================================
from functools import cmp_to_key
# Use when key= function isn't enough (need to compare TWO elements)
def largest_number(nums: list) -> str:
"""Arrange numbers to form largest possible number."""
strs = list(map(str, nums))
def compare(a, b):
if a + b > b + a: return -1 # a before b
if a + b < b + a: return 1 # b before a
return 0
strs.sort(key=cmp_to_key(compare))
result = ''.join(strs)
return '0' if result[0] == '0' else result # edge case: "0000"
largest_number([10, 2]) # "210"
largest_number([3, 30, 34, 5, 9]) # "9534330"
largest_number([0, 0])             # "0"
5.4 bisect Module (Binary Search on Sorted List)
import bisect
# Works on SORTED lists only!
arr = [1, 3, 5, 7, 9, 11]
# ================================================================
# bisect_left ā find leftmost insertion point
# ================================================================
bisect.bisect_left(arr, 5) # 2 ā insert BEFORE existing 5
bisect.bisect_left(arr, 6) # 3 ā between 5 and 7
bisect.bisect_left(arr, 0) # 0 ā before everything
bisect.bisect_left(arr, 12) # 6 ā after everything
# ================================================================
# bisect_right (= bisect) ā find rightmost insertion point
# ================================================================
bisect.bisect_right(arr, 5) # 3 ā insert AFTER existing 5
bisect.bisect(arr, 5) # 3 ā alias for bisect_right
# ================================================================
# insort ā insert maintaining sorted order
# ================================================================
bisect.insort(arr, 6) # arr = [1,3,5,6,7,9,11]
bisect.insort_left(arr, 6) # insert before existing 6
# Finding position: O(log n), Inserting: O(n) due to shifting
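A common insort use, as an illustrative sketch: keeping a small collection sorted as values arrive (fine for modest sizes, since each insert still pays the O(n) shift):

```python
import bisect

scores = []
for s in [70, 95, 60, 88]:
    bisect.insort(scores, s)  # O(log n) search + O(n) shift per insert
print(scores)      # [60, 70, 88, 95] — always sorted
print(scores[-1])  # 95 — current maximum in O(1)
```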
# ================================================================
# KEY USE CASES
# ================================================================
# 1. Check if element exists in sorted list:
def binary_search(arr: list, target: int) -> int:
"""Returns index of target or -1 if not found."""
i = bisect.bisect_left(arr, target)
if i < len(arr) and arr[i] == target:
return i # found at index i
return -1 # not found
# 2. Count elements in range [lo, hi]:
def count_in_range(arr: list, lo: int, hi: int) -> int:
"""Count elements where lo <= elem <= hi. O(log n)"""
left = bisect.bisect_left(arr, lo) # first position >= lo
right = bisect.bisect_right(arr, hi) # first position > hi
return right - left
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
count_in_range(arr, 3, 7) # 5 ā elements: 3,4,5,6,7
# 3. Find k-th largest using sorted list + bisect:
def find_rank(sorted_arr: list, value: int) -> int:
"""How many elements are <= value?"""
return bisect.bisect_right(sorted_arr, value)
# 4. Grade classification:
def grade(score: int) -> str:
breakpoints = [60, 70, 80, 90]
grades = ['F', 'D', 'C', 'B', 'A']
return grades[bisect.bisect(breakpoints, score)]
grade(85) # 'B'
grade(92) # 'A'
grade(55) # 'F'
# 5. Longest Increasing Subsequence in O(n log n):
def lis_length(nums: list) -> int:
"""
tails[i] = smallest tail of all LIS of length i+1
Binary search tells us where current number fits.
"""
tails = []
for num in nums:
pos = bisect.bisect_left(tails, num) # where does num fit?
if pos == len(tails):
tails.append(num) # extends longest LIS
else:
tails[pos] = num # replaces to keep tails small
return len(tails)
# Time: O(n log n), Space: O(n)
lis_length([10, 9, 2, 5, 3, 7, 101, 18]) # 4 ([2,3,7,101] or [2,5,7,101])
5.5 math Module
import math
# ================================================================
# Infinity ā use in DP and graph algorithms
# ================================================================
math.inf # float('inf')
-math.inf # float('-inf')
math.isinf(math.inf) # True
math.isnan(float('nan')) # True
# ================================================================
# Rounding
# ================================================================
math.floor(3.9) # 3 ā always rounds DOWN
math.floor(-3.1) # -4 ā toward negative infinity!
math.ceil(3.1) # 4 ā always rounds UP
math.ceil(-3.9) # -3 ā toward positive infinity!
round(2.5) # 2 ā banker's rounding (half to even)
round(3.5) # 4 ā banker's rounding
# Integer ceiling division (no floats, no math.ceil needed):
# For positive b: ceil(a / b) == (a + b - 1) // b
math.ceil(7 / 2) # 4
(7 + 2 - 1) // 2 # 4 — same result, integer arithmetic only
# ================================================================
# Combinatorics
# ================================================================
math.factorial(5) # 120
math.comb(5, 2) # 10 ā C(5,2) = 5!/(2!Ć3!)
math.perm(5, 2) # 20 ā P(5,2) = 5!/3!
# ================================================================
# Number theory
# ================================================================
math.gcd(48, 18) # 6
math.gcd(48, 18, 12) # 6 ā multiple args (Python 3.9+)
math.lcm(4, 6) # 12 ā LCM (Python 3.9+)
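Before Python 3.9 (or for a whole list), gcd can be folded pairwise with functools.reduce, since gcd is associative:

```python
import math
from functools import reduce

nums = [48, 18, 12]
print(reduce(math.gcd, nums))  # 6 — gcd(gcd(48, 18), 12)
```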
# Check if perfect square:
def is_perfect_square(n: int) -> bool:
root = math.isqrt(n)
return root * root == n
is_perfect_square(25) # True
is_perfect_square(26) # False
# ================================================================
# Logarithms and powers
# ================================================================
math.log(math.e) # 1.0 ā natural log
math.log(100, 10) # 2.0 ā log base 10
math.log2(1024) # 10.0 ā log base 2
math.log10(1000) # 3.0
math.sqrt(144) # 12.0 ā square root (float)
math.isqrt(144) # 12 ā integer square root (int, Python 3.8+)
math.pow(2, 10) # 1024.0 ā float result (use ** for int)
2 ** 10 # 1024 ā int result
# ================================================================
# Trigonometry (occasionally needed)
# ================================================================
math.pi # 3.141592653589793
math.e # 2.718281828459045
math.sin(math.pi / 2) # 1.0
math.cos(0) # 1.0
math.degrees(math.pi) # 180.0
math.radians(180) # 3.14159...
5.6 sys Module
import sys
# ================================================================
# Recursion limit ā critical for DSA!
# ================================================================
sys.getrecursionlimit() # 1000 (default ā often too low!)
sys.setrecursionlimit(10**6) # increase for deep recursion
# ⚠️ Common mistake: hitting the recursion limit in DFS/backtracking.
# ✅ Fix: add these two lines at the top of competitive programming solutions:
import sys
sys.setrecursionlimit(10**6)
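An alternative to raising the limit is rewriting the recursion with an explicit stack, which has no depth limit at all. A minimal iterative DFS sketch (graph as an adjacency dict):

```python
def dfs_iterative(graph: dict, start) -> list:
    """DFS with an explicit stack — immune to the recursion limit."""
    visited, order = set(), []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # push neighbors; reversed() preserves left-to-right visit order
        stack.extend(reversed(graph.get(node, [])))
    return order

g = {1: [2, 3], 2: [4], 3: [], 4: []}
print(dfs_iterative(g, 1))  # [1, 2, 4, 3]
```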
# ================================================================
# Infinity alternative
# ================================================================
sys.maxsize # 9223372036854775807 (2^63 - 1)
# Useful as "infinity" when you need an integer (not float)
# Note: Python ints have arbitrary precision ā no true integer overflow
# ================================================================
# Fast input for competitive programming
# ================================================================
# Standard input() is SLOW for bulk reads (extra per-call processing)
# For reading millions of values, use sys.stdin:
# Replace input() globally:
input = sys.stdin.readline # much faster!
# ā ļø readline() includes trailing '\n' ā need .strip()
n = int(sys.stdin.readline().strip())
nums = list(map(int, sys.stdin.readline().split()))
# Even faster ā read all at once:
data = sys.stdin.read().split() # entire input as list of strings
idx = 0
n = int(data[idx]); idx += 1
nums = [int(data[idx + i]) for i in range(n)]; idx += n  # advance cursor
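The read-all pattern can be tried without a terminal by swapping sys.stdin for an io.StringIO — the StringIO here is only a stand-in for real standard input:

```python
import io
import sys

sys.stdin = io.StringIO("3\n10 20 30\n")  # simulate piped input
data = sys.stdin.read().split()           # ['3', '10', '20', '30']
idx = 0
n = int(data[idx]); idx += 1
nums = [int(data[idx + i]) for i in range(n)]
print(n, nums)  # 3 [10, 20, 30]
```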
# ================================================================
# Object memory size
# ================================================================
sys.getsizeof([1, 2, 3]) # bytes used by list object
sys.getsizeof("hello") # bytes used by string
sys.getsizeof(1000) # bytes used by integer
# ================================================================
# Platform and version info
# ================================================================
sys.version # Python version string
sys.platform # 'linux', 'darwin', 'win32'
sys.argv # command-line arguments list
5.7 re Module (Regex Basics)
import re
# ================================================================
# PATTERN REFERENCE
# ================================================================
# \d digit [0-9]
# \D non-digit
# \w word char [a-zA-Z0-9_]
# \W non-word char
# \s whitespace [\t\n\r\f\v ]
# \S non-whitespace
# . any char except newline
# * 0 or more (greedy)
# + 1 or more (greedy)
# ? 0 or 1 (greedy)
# *? 0 or more (non-greedy/lazy)
# {n} exactly n
# {n,m} n to m
# ^ start of string (or line with re.MULTILINE)
# $ end of string
# [abc] character class ā a, b, or c
# [^abc] negated class ā NOT a, b, or c
# (...) capture group
# (?:...) non-capturing group
# a|b alternation ā a OR b
# ================================================================
# findall ā find all matches
# ================================================================
re.findall(r'\d+', 'Order 123, Item 456') # ['123', '456']
re.findall(r'[a-z]+', 'Hello World 123') # ['ello', 'orld']
re.findall(r'\b\w+\b', 'Hello World!') # ['Hello', 'World']
re.findall(r'(\w+)@(\w+)\.(\w+)', 'test@gmail.com')
# [('test', 'gmail', 'com')] ā groups captured
# ================================================================
# search ā find FIRST match anywhere in string
# ================================================================
match = re.search(r'\d+', 'abc 123 def 456')
if match:
match.group() # '123' ā matched text
match.start() # 4 ā start index
match.end() # 7 ā end index
match.span() # (4, 7) ā (start, end)
# ================================================================
# match ā match at START of string ONLY
# ================================================================
re.match(r'\d+', '123 abc') # Match object ā starts with digits
re.match(r'\d+', 'abc 123') # None ā doesn't start with digits
# ================================================================
# fullmatch ā entire string must match
# ================================================================
re.fullmatch(r'\d+', '12345') # Match ā entire string is digits
re.fullmatch(r'\d+', '123abc') # None ā not entirely digits
# ================================================================
# sub ā replace matches
# ================================================================
re.sub(r'\d+', 'NUM', 'Order 123, Item 456')
# 'Order NUM, Item NUM'
re.sub(r'\s+', ' ', 'Hello   World \t !').strip()
# 'Hello World !' — collapse runs of whitespace into single spaces
# Back-references in replacement:
re.sub(r'(\w+) (\w+)', r'\2 \1', 'Hello World')
# 'World Hello' ā swap words
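re.sub also accepts a function as the replacement: it receives each match object and returns the replacement string. A small sketch doubling every number in a string:

```python
import re

text = "a1 b2 c10"
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)
print(doubled)  # 'a2 b4 c20'
```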
# ================================================================
# split ā split by pattern
# ================================================================
re.split(r'[,;\s]+', 'a, b; c d') # ['a', 'b', 'c', 'd']
re.split(r'(\s+)', 'hello world') # ['hello', ' ', 'world'] ā keep separator
# ================================================================
# Compile for reuse ā faster when using same pattern many times
# ================================================================
email_pattern = re.compile(r'^[\w.+-]+@[\w-]+\.[\w.-]+$')
emails = ['user@example.com', 'bad-email', 'also@valid.org']
valid = [e for e in emails if email_pattern.match(e)]
# ['user@example.com', 'also@valid.org']
# ================================================================
# Flags
# ================================================================
re.findall(r'hello', 'Hello HELLO hello', re.IGNORECASE)
# ['Hello', 'HELLO', 'hello']
re.findall(r'^\d+', 'line1\nline2\n3line', re.MULTILINE)
# ['3'] ā ^ matches start of each line
# ================================================================
# Common interview patterns
# ================================================================
# Extract all numbers (including decimals and negatives):
re.findall(r'-?\d+\.?\d*', 'temp is -3.5 and pressure is 101.3')
# ['-3.5', '101.3']
# Validate IP address (simplified):
re.fullmatch(r'(\d{1,3}\.){3}\d{1,3}', '192.168.1.1')
# Extract words (only alphabetic):
re.findall(r'[a-zA-Z]+', 'Hello, World! 123')
# ['Hello', 'World']
# Remove HTML tags:
re.sub(r'<[^>]+>', '', '<b>Hello</b> <i>World</i>')
# 'Hello World'
# Check for palindrome (via regex ā just for fun):
# Better done with s == s[::-1], not regex
⚠️ Module Edge Cases & Interview Traps
# ================================================================
# heapq TRAPS
# ================================================================
# TRAP 1: heapq has NO decrease-key operation
# For Dijkstra's, push a NEW (dist, node) entry instead, and on pop
# skip entries that are stale (node already settled at a shorter distance):
#     d, u = heapq.heappop(heap)
#     if d > dist.get(u, float('inf')):
#         continue  # skip stale entry
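To make the stale-entry check concrete, here is a minimal Dijkstra sketch with lazy deletion (graph as {node: [(neighbor, weight), ...]} — names are illustrative):

```python
import heapq

def dijkstra(graph: dict, start) -> dict:
    """Shortest distances from start; stale heap entries are skipped on pop."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                      # stale entry — u already settled closer
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd              # "decrease-key" = just push a new entry
                heapq.heappush(heap, (nd, v))
    return dist

g = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
print(dijkstra(g, 'a'))  # {'a': 0, 'b': 1, 'c': 3}
```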
# TRAP 2: heapify modifies list in-place (don't do it on important data)
original = [5, 3, 1]
heapq.heapify(original) # original IS NOW modified!
# TRAP 3: heappop on empty heap ā IndexError
if heap:
val = heapq.heappop(heap) # always check!
# ================================================================
# bisect TRAPS
# ================================================================
# TRAP 1: bisect only works on SORTED lists
arr = [3, 1, 4, 1, 5] # unsorted!
bisect.bisect_left(arr, 4) # meaningless result
# TRAP 2: bisect_left vs bisect_right for duplicates
arr = [1, 2, 2, 2, 3]
bisect.bisect_left(arr, 2) # 1 ā leftmost position for 2
bisect.bisect_right(arr, 2) # 4 ā rightmost position for 2
# ================================================================
# sys TRAPS
# ================================================================
# TRAP 1: sys.setrecursionlimit too high can cause segfault
sys.setrecursionlimit(10**6) # safe for most cases
# sys.setrecursionlimit(10**9) # ā ļø may crash Python!
# TRAP 2: sys.stdin.readline includes newline
line = sys.stdin.readline()  # "hello\n"
n = int(line)                # ✅ int() tolerates the trailing \n
name = line.strip()          # ✅ strings need .strip()
# ================================================================
# re TRAPS
# ================================================================
# TRAP 1: . does NOT match newline by default
re.findall(r'.+', 'hello\nworld') # ['hello', 'world'] ā 2 matches!
re.findall(r'.+', 'hello\nworld', re.DOTALL) # ['hello\nworld'] ā 1 match
# TRAP 2: * is greedy ā matches as MUCH as possible
re.findall(r'<.*>', '<a>hello</a>') # ['<a>hello</a>'] ā greedy!
re.findall(r'<.*?>', '<a>hello</a>') # ['<a>', '</a>'] ā lazy!
# TRAP 3: in Python 3, \d matches ANY Unicode decimal digit by default
# (e.g. Arabic-Indic digits) — pass the re.ASCII flag to restrict
# \d, \w, \s to ASCII only
📦 Quick Cheatsheet — External Modules
# āā heapq āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import heapq
heapq.heappush(h, val) # O(log n)
heapq.heappop(h) # O(log n) ā smallest
h[0] # O(1) peek
heapq.heapify(lst) # O(n) in-place
heapq.nlargest(k, lst) # O(n log k)
heapq.nsmallest(k, lst) # O(n log k)
heapq.heappush(h, -val) # max-heap trick
heapq.heappush(h, (priority, counter, item)) # priority queue
# āā collections āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
from collections import Counter, defaultdict, deque, OrderedDict
Counter(lst).most_common(k) # top k frequent
Counter(s1) == Counter(s2) # anagram check
defaultdict(list) # auto [] for missing keys
defaultdict(int) # auto 0 for missing keys
deque(maxlen=k) # circular buffer / sliding window
d.popleft() # O(1) ā use instead of list.pop(0)
# āā functools āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
from functools import lru_cache, cache, partial, reduce, cmp_to_key
@cache # unlimited memoization
@lru_cache(maxsize=128) # bounded memoization
partial(func, fixed_arg) # freeze an argument
reduce(lambda acc, x: acc*x, lst) # fold/accumulate
sorted(lst, key=cmp_to_key(cmp_fn)) # custom comparator
# āā bisect āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import bisect
bisect.bisect_left(arr, x) # leftmost insertion point O(log n)
bisect.bisect_right(arr, x) # rightmost insertion point O(log n)
bisect.insort(arr, x) # insert maintaining order O(n)
# count in [lo, hi]:
bisect.bisect_right(arr,hi) - bisect.bisect_left(arr,lo)
# āā math āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import math
math.inf, -math.inf # infinity
math.gcd(a, b) # GCD
math.lcm(a, b) # LCM (Python 3.9+)
math.isqrt(n) # integer square root
math.comb(n, r) # C(n, r)
math.log2(n), math.log(n, base) # logarithms
math.isclose(a, b) # float comparison
# āā sys āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import sys
sys.setrecursionlimit(10**6) # increase recursion depth
input = sys.stdin.readline # faster input
sys.maxsize # largest int (~2^63)
# āā re āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import re
re.findall(r'\d+', s) # find all matches
re.sub(r'\s+', ' ', s) # replace matches
re.split(r'[,;]+', s) # split by pattern
re.search(r'pattern', s) # first match anywhere
re.match(r'pattern', s) # match at start only
re.compile(r'pattern') # pre-compile for reuse
SECTION 6 — DSA PROBLEM-SOLVING PATTERNS AND TECHNIQUES
This section is the heart of interview preparation. Recognizing patterns is more important than memorizing solutions. Each pattern below includes:
- When to recognize it (trigger words/problem characteristics)
- Template code (the skeleton you'll reuse)
- Example problems with full solutions
6.1 Two Pointers
Theory
Two pointers technique uses two indices to traverse data structure(s), typically to reduce time complexity from O(n²) to O(n).
When to Recognize
- Problem involves sorted array or linked list
- Keywords: "pairs", "triplets", "palindrome", "remove duplicates", "merge"
- Need to compare elements at different positions
Two main variants:
Pattern 6.1.A ā Opposite Ends (Converging Pointers)
# === Template: Two Sum II (sorted array) ===
def two_sum_sorted(nums, target):
"""
Find two numbers that add up to target in sorted array.
Return indices (1-indexed).
"""
left, right = 0, len(nums) - 1
while left < right: # pointers converge
current_sum = nums[left] + nums[right]
if current_sum == target:
return [left + 1, right + 1] # found answer
elif current_sum < target:
left += 1 # need larger sum
else:
right -= 1 # need smaller sum
return [] # no solution
# Time: O(n), Space: O(1)
Example 1: Valid Palindrome
def is_palindrome(s: str) -> bool:
"""
Check if string is palindrome (ignore case, non-alphanumeric).
Example: "A man, a plan, a canal: Panama" ā True
"""
left, right = 0, len(s) - 1
while left < right:
# Skip non-alphanumeric from left
while left < right and not s[left].isalnum():
left += 1
# Skip non-alphanumeric from right
while left < right and not s[right].isalnum():
right -= 1
# Compare characters (case-insensitive)
if s[left].lower() != s[right].lower():
return False
left += 1
right -= 1
return True
# Time: O(n), Space: O(1)
# Test
is_palindrome("race a car") # False
is_palindrome("A man, a plan") # False (not a palindrome)
Example 2: Container With Most Water
def max_area(height: list) -> int:
"""
Find two lines that form container with most water.
height[i] = height of line at position i.
"""
left, right = 0, len(height) - 1
max_water = 0
while left < right:
# Width between lines
width = right - left
# Height limited by shorter line
h = min(height[left], height[right])
# Calculate area
max_water = max(max_water, width * h)
# Move pointer at shorter line (greedy choice)
if height[left] < height[right]:
left += 1
else:
right -= 1
return max_water
# Time: O(n), Space: O(1)
# Why greedy works: moving the taller pointer can only decrease area
# (width decreases, height stays the same or gets smaller)
Example 3: Three Sum
def three_sum(nums: list) -> list:
"""
Find all unique triplets [a, b, c] that sum to 0.
Example: [-1, 0, 1, 2, -1, -4] ā [[-1, -1, 2], [-1, 0, 1]]
"""
nums.sort() # O(n log n)
result = []
for i in range(len(nums) - 2):
# Skip duplicates for first element
if i > 0 and nums[i] == nums[i - 1]:
continue
# Two-pointer for remaining two elements
left, right = i + 1, len(nums) - 1
target = -nums[i] # need b + c = -a
while left < right:
two_sum = nums[left] + nums[right]
if two_sum == target:
result.append([nums[i], nums[left], nums[right]])
# Skip duplicates for second element
while left < right and nums[left] == nums[left + 1]:
left += 1
# Skip duplicates for third element
while left < right and nums[right] == nums[right - 1]:
right -= 1
left += 1
right -= 1
elif two_sum < target:
left += 1
else:
right -= 1
return result
# Time: O(n²), Space: O(1) excluding output
Pattern 6.1.B — Same Direction (Fast & Slow Pointers)
# === Template: Remove Duplicates ===
def remove_duplicates(nums: list) -> int:
"""
Remove duplicates from sorted array in-place.
Return new length.
"""
if not nums:
return 0
slow = 0 # write position
for fast in range(1, len(nums)): # read position
if nums[fast] != nums[slow]: # found new unique
slow += 1 # advance write position
nums[slow] = nums[fast] # write unique element
return slow + 1 # length of unique portion
# Time: O(n), Space: O(1)
Example 1: Move Zeroes
def move_zeroes(nums: list) -> None:
"""
Move all zeroes to end while maintaining order of non-zeroes.
Example: [0,1,0,3,12] ā [1,3,12,0,0]
"""
slow = 0 # position to write next non-zero
for fast in range(len(nums)): # scan all elements
if nums[fast] != 0: # found non-zero
nums[slow], nums[fast] = nums[fast], nums[slow] # swap
slow += 1 # advance write position
# Time: O(n), Space: O(1)
# Alternative without swap:
# 1. Copy non-zeroes to front
# 2. Fill rest with zeroes
Example 2: Floyd's Cycle Detection (Fast & Slow)
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
def has_cycle(head: ListNode) -> bool:
"""
Detect if linked list has a cycle.
"""
if not head or not head.next:
return False
slow = head # moves 1 step
fast = head # moves 2 steps
while fast and fast.next:
slow = slow.next # 1 step
fast = fast.next.next # 2 steps
if slow == fast: # they met ā cycle exists
return True
return False # fast reached end ā no cycle
# Time: O(n), Space: O(1)
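A quick sanity check for cycle detection, repeating the definitions so the snippet runs on its own:

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def has_cycle(head) -> bool:
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False

# Build 1 -> 2 -> 3 -> (back to 2)
a, b, c = ListNode(1), ListNode(2), ListNode(3)
a.next, b.next, c.next = b, c, b
print(has_cycle(a))            # True
print(has_cycle(ListNode(1)))  # False — single node, no cycle
```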
def detect_cycle_start(head: ListNode) -> ListNode:
"""
Find the node where cycle begins.
"""
if not head or not head.next:
return None
# Phase 1: Detect cycle
slow = fast = head
has_cycle = False
while fast and fast.next:
slow = slow.next
fast = fast.next.next
if slow == fast:
has_cycle = True
break
if not has_cycle:
return None
# Phase 2: Find start of cycle
# Move one pointer to head, keep other at meeting point
# Move both at same speed ā they meet at cycle start
slow = head
while slow != fast:
slow = slow.next
fast = fast.next
return slow # cycle start node
# Time: O(n), Space: O(1)
📦 Quick Cheatsheet — Two Pointers
# Opposite ends (converging)
left, right = 0, len(arr) - 1
while left < right:
    if condition:
        ...  # process / record answer
elif need_smaller:
right -= 1
else:
left += 1
# Same direction (fast & slow)
slow = 0
for fast in range(len(arr)):
if arr[fast] meets_condition:
arr[slow] = arr[fast]
slow += 1
# Cycle detection
slow = fast = head
while fast and fast.next:
slow = slow.next
fast = fast.next.next
if slow == fast:
        return True
6.2 Sliding Window
Theory
Maintain a window (subarray/substring) that expands/contracts to satisfy constraints. Avoids recalculating from scratch each time.
When to Recognize
- "Contiguous subarray/substring"
- "Maximum/minimum sum/length with constraint"
- "All subarrays/substrings with property X"
Pattern 6.2.A ā Fixed-Size Window
# === Template ===
def max_sum_subarray(nums: list, k: int) -> int:
"""
Find maximum sum of subarray of size k.
"""
if len(nums) < k:
return 0
# Initialize first window
window_sum = sum(nums[:k])
max_sum = window_sum
# Slide window
for i in range(k, len(nums)):
window_sum += nums[i] # add new element entering
window_sum -= nums[i - k] # remove element leaving
max_sum = max(max_sum, window_sum)
return max_sum
# Time: O(n), Space: O(1)
Example: Maximum Average Subarray
def find_max_average(nums: list, k: int) -> float:
"""
Find maximum average of contiguous subarray of size k.
"""
window_sum = sum(nums[:k])
max_sum = window_sum
for i in range(k, len(nums)):
window_sum += nums[i] - nums[i - k]
max_sum = max(max_sum, window_sum)
return max_sum / k
# Time: O(n), Space: O(1)
Pattern 6.2.B — Variable-Size Window
# === Template: Minimum Window ===
def min_subarray_len(target: int, nums: list) -> int:
"""
Find minimum length of subarray with sum >= target.
Return 0 if no such subarray.
"""
left = 0
current_sum = 0
min_length = float('inf')
for right in range(len(nums)): # expand window
current_sum += nums[right]
# Shrink window while condition satisfied
while current_sum >= target:
min_length = min(min_length, right - left + 1)
current_sum -= nums[left] # remove leftmost
left += 1 # contract window
return min_length if min_length != float('inf') else 0
# Time: O(n) ā each element visited at most twice
# Space: O(1)
Example 1: Longest Substring Without Repeating Characters
def length_of_longest_substring(s: str) -> int:
"""
Find length of longest substring without repeating characters.
Example: "abcabcbb" ā 3 ("abc")
"""
char_index = {} # char ā last seen index
left = 0
max_length = 0
for right, char in enumerate(s):
# If char seen before and is in current window
if char in char_index and char_index[char] >= left:
left = char_index[char] + 1 # shrink window from left
char_index[char] = right # update last seen index
max_length = max(max_length, right - left + 1)
return max_length
# Time: O(n), Space: O(min(n, m)) where m = charset size
# Alternative with set:
def length_of_longest_substring_v2(s: str) -> int:
chars = set()
left = 0
max_length = 0
for right in range(len(s)):
# Shrink until no duplicates
while s[right] in chars:
chars.remove(s[left])
left += 1
chars.add(s[right])
max_length = max(max_length, right - left + 1)
    return max_length
Example 2: Minimum Window Substring
from collections import Counter
def min_window(s: str, t: str) -> str:
"""
Find minimum window in s that contains all characters of t.
Example: s = "ADOBECODEBANC", t = "ABC" ā "BANC"
"""
if not s or not t:
return ""
# Count characters in t
required = Counter(t)
needed = len(required) # distinct chars needed
left = 0
formed = 0 # how many distinct chars satisfied
window_counts = {}
# Result: (window_length, left, right)
ans = float('inf'), None, None
for right in range(len(s)):
char = s[right]
window_counts[char] = window_counts.get(char, 0) + 1
# Check if frequency matches requirement
if char in required and window_counts[char] == required[char]:
formed += 1
# Try to contract window while it's valid
while left <= right and formed == needed:
char = s[left]
# Update result if this window is smaller
if right - left + 1 < ans[0]:
ans = (right - left + 1, left, right)
# Remove leftmost character
window_counts[char] -= 1
if char in required and window_counts[char] < required[char]:
formed -= 1
left += 1
return "" if ans[0] == float('inf') else s[ans[1]:ans[2] + 1]
# Time: O(n + m), Space: O(m) where n=len(s), m=len(t)
Example 3: Longest Repeating Character Replacement
def character_replacement(s: str, k: int) -> int:
"""
Replace at most k characters to get longest substring of same char.
Example: s = "AABABBA", k = 1 ā 4 ("AABA")
"""
count = {}
left = 0
max_count = 0 # most frequent char in window
max_length = 0
for right in range(len(s)):
count[s[right]] = count.get(s[right], 0) + 1
max_count = max(max_count, count[s[right]])
# Window size - most frequent count = chars to replace
# If > k, window is invalid
while right - left + 1 - max_count > k:
count[s[left]] -= 1
left += 1
max_length = max(max_length, right - left + 1)
return max_length
# Time: O(n), Space: O(26) = O(1) for uppercase letters
📦 Quick Cheatsheet — Sliding Window
# Fixed size
window_sum = sum(nums[:k])
for i in range(k, len(nums)):
window_sum += nums[i] - nums[i-k]
# Variable size (expand right, contract left)
left = 0
for right in range(len(arr)):
# add arr[right] to window
while window_invalid:
# remove arr[left] from window
left += 1
# update result
# With hashmap (character frequency)
from collections import Counter
required = Counter(target)
window = {}
formed = 0 # how many requirements satisfied
6.3 Binary Search
Theory
Binary search eliminates half of search space each iteration. Works on sorted data or monotonic functions.
When to Recognize
- Array is sorted
- Keywords: "find target", "search in rotated", "first/last occurrence"
- "Minimized maximum" or "Maximized minimum" ā binary search on answer space
Pattern 6.3.A ā Classical Binary Search
# === Template: Find Target ===
def binary_search(nums: list, target: int) -> int:
"""
Find target in sorted array. Return index or -1.
"""
left, right = 0, len(nums) - 1
while left <= right: # <= because right is inclusive
mid = left + (right - left) // 2 # avoid overflow
if nums[mid] == target:
return mid
elif nums[mid] < target:
left = mid + 1 # search right half
else:
right = mid - 1 # search left half
return -1 # not found
# Time: O(log n), Space: O(1)
Example 1: Find First and Last Position
def search_range(nums: list, target: int) -> list:
"""
Find starting and ending position of target in sorted array.
Example: [5,7,7,8,8,10], target=8 ā [3, 4]
"""
def find_left(nums, target):
"""Find leftmost (first) occurrence."""
left, right = 0, len(nums) - 1
result = -1
while left <= right:
mid = left + (right - left) // 2
if nums[mid] == target:
result = mid # found, but keep searching left
right = mid - 1
elif nums[mid] < target:
left = mid + 1
else:
right = mid - 1
return result
def find_right(nums, target):
"""Find rightmost (last) occurrence."""
left, right = 0, len(nums) - 1
result = -1
while left <= right:
mid = left + (right - left) // 2
if nums[mid] == target:
result = mid # found, but keep searching right
left = mid + 1
elif nums[mid] < target:
left = mid + 1
else:
right = mid - 1
return result
return [find_left(nums, target), find_right(nums, target)]
# Time: O(log n), Space: O(1)
# Alternative using bisect:
import bisect
def search_range_bisect(nums, target):
left = bisect.bisect_left(nums, target)
right = bisect.bisect_right(nums, target) - 1
if left < len(nums) and nums[left] == target:
return [left, right]
    return [-1, -1]
Pattern 6.3.B — Binary Search on Answer Space
Key Insight: When problem asks "minimize the maximum" or "maximize the minimum", often you can binary search on the answer.
# === Template: Capacity To Ship Packages ===
def ship_within_days(weights: list, days: int) -> int:
"""
Find minimum ship capacity to ship all packages within 'days' days.
Packages must be shipped in given order.
Example: weights = [1,2,3,4,5,6,7,8,9,10], days = 5 ā 15
"""
def can_ship(capacity):
"""Check if we can ship all with this capacity in 'days' days."""
current_load = 0
days_needed = 1
for weight in weights:
if current_load + weight > capacity:
days_needed += 1 # need another day
current_load = weight # start new day
else:
current_load += weight
return days_needed <= days
# Binary search on answer (capacity)
left = max(weights) # min capacity (largest package)
right = sum(weights) # max capacity (all in one day)
while left < right:
mid = left + (right - left) // 2
if can_ship(mid):
right = mid # try smaller capacity
else:
left = mid + 1 # need larger capacity
return left
# Time: O(n log S) where S = sum(weights)
# Space: O(1)
Example: Koko Eating Bananas
import math
def min_eating_speed(piles: list, h: int) -> int:
"""
Koko can eat k bananas per hour. Find minimum k to eat all piles in h hours.
Example: piles = [3,6,7,11], h = 8 ā 4
(Hour 1: 3, Hour 2-3: 6, Hour 4-5: 7, Hour 6-8: 11)
"""
def can_finish(k):
"""Check if Koko can finish with speed k in h hours."""
hours_needed = 0
for pile in piles:
hours_needed += math.ceil(pile / k) # or: (pile + k - 1) // k
return hours_needed <= h
left, right = 1, max(piles) # min speed, max speed
while left < right:
mid = left + (right - left) // 2
if can_finish(mid):
right = mid # try slower speed
else:
left = mid + 1 # need faster speed
return left
# Time: O(n log m) where m = max(piles)
📦 Quick Cheatsheet — Binary Search
# Classic search
left, right = 0, len(arr) - 1
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
# Find first occurrence
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target:
result = mid
right = mid - 1 # keep searching left
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
# Binary search on answer
def is_feasible(mid):
# check if mid satisfies condition
pass
left, right = min_answer, max_answer
while left < right:
mid = left + (right - left) // 2
if is_feasible(mid):
right = mid
else:
left = mid + 1
return left
Interview Trap — Binary Search Edge Cases
# TRAP 1: Infinite loop with 'left < right' and 'left = mid'
# ❌ WRONG:
while left < right:
mid = left + (right - left) // 2
if condition:
left = mid # BUG! If left = mid and right = left + 1, infinite loop
else:
right = mid - 1
# ✅ FIX: Use (left + right + 1) // 2 when updating left = mid
while left < right:
mid = left + (right - left + 1) // 2 # round up
if condition:
left = mid
else:
right = mid - 1
# TRAP 2: Overflow in (left + right) // 2
# In Python: not an issue (arbitrary precision integers)
# In C++/Java: use left + (right - left) // 2
# TRAP 3: Off-by-one in range
# If searching [left, right), use 'while left < right' and 'right = mid' (NOT mid - 1)
# If searching [left, right], use 'while left <= right' and 'right = mid - 1'
6.4 BFS / DFS Templates (Graph, Tree, Grid)
Theory
BFS and DFS are the two fundamental graph traversal algorithms. Every graph/tree problem is either one of these or a combination.
BFS (Breadth-First Search):
- Explores level by level
- Uses a QUEUE (deque)
- Finds SHORTEST PATH in unweighted graphs
- Good for: shortest path, level-order traversal, nearest neighbor
DFS (Depth-First Search):
- Explores as deep as possible before backtracking
- Uses a STACK (recursion or explicit stack)
- Good for: cycle detection, topological sort, connected components,
path existence, backtracking
Pattern 6.4.A — BFS Template
from collections import deque
# === Template: Generic BFS ===
def bfs(graph, start):
"""
BFS traversal of graph represented as adjacency list.
Returns distances from start to all reachable nodes.
"""
visited = {start} # track visited nodes
queue = deque([start]) # BFS queue (FIFO)
distance = {start: 0} # distance from start
while queue:
node = queue.popleft() # O(1) — why we use deque
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor)
distance[neighbor] = distance[node] + 1 # one level deeper
queue.append(neighbor)
return distance
# Time: O(V + E), Space: O(V)
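A quick sanity check of the template on a tiny made-up graph (the bfs function is repeated inline so the snippet runs on its own):

```python
from collections import deque

def bfs(graph, start):
    """Same generic BFS as above, inlined for a standalone demo."""
    visited = {start}
    queue = deque([start])
    distance = {start: 0}
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance

# A -> B -> D and A -> C: D is two edges from A
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
bfs(graph, 'A')  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```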
# === Template: BFS Level-by-Level (when level info matters) ===
def bfs_level(graph, start):
"""Returns list of levels, each level is a list of nodes."""
visited = {start}
queue = deque([start])
levels = []
while queue:
level_size = len(queue) # snapshot of current level size
current_level = []
for _ in range(level_size): # process ALL nodes at this level
node = queue.popleft()
current_level.append(node)
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
levels.append(current_level)
return levels
Example 1: Binary Tree Level Order Traversal
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def level_order(root: TreeNode) -> list:
"""
Return list of lists ā each inner list = one level of tree.
Example:
Tree: 3
/ \
9 20
/ \
15 7
Output: [[3], [9, 20], [15, 7]]
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue) # how many nodes at this level
current_level = []
for _ in range(level_size):
node = queue.popleft()
current_level.append(node.val)
# Add children for next level
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(current_level)
return result
# Time: O(n), Space: O(n) — at most n/2 nodes in queue at once
Example 2: Shortest Path in Grid (01 Matrix)
def update_matrix(mat: list) -> list:
"""
Find distance of each cell to nearest 0 in binary matrix.
Example:
mat = [[0,0,0],
[0,1,0],
[1,1,1]]
Output: [[0,0,0],
[0,1,0],
[1,2,1]]
"""
rows, cols = len(mat), len(mat[0])
dist = [[float('inf')] * cols for _ in range(rows)]
queue = deque()
# Initialize: all 0-cells have distance 0, add to queue
for r in range(rows):
for c in range(cols):
if mat[r][c] == 0:
dist[r][c] = 0
queue.append((r, c)) # start BFS from all 0s simultaneously
# Multi-source BFS
directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
while queue:
r, c = queue.popleft()
for dr, dc in directions:
nr, nc = r + dr, c + dc
# Check bounds and if we found a shorter path
if 0 <= nr < rows and 0 <= nc < cols:
if dist[r][c] + 1 < dist[nr][nc]:
dist[nr][nc] = dist[r][c] + 1
queue.append((nr, nc))
return dist
# Time: O(rows * cols), Space: O(rows * cols)
Example 3: Word Ladder (BFS for shortest transformation)
def ladder_length(begin_word: str, end_word: str, word_list: list) -> int:
"""
Find length of shortest transformation sequence from begin to end.
Each step changes exactly one letter and each word must be in word_list.
Example: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"] → 5
"""
word_set = set(word_list) # O(1) lookups
if end_word not in word_set:
return 0
queue = deque([(begin_word, 1)]) # (current_word, steps)
visited = {begin_word}
while queue:
word, steps = queue.popleft()
# Try changing each character
for i in range(len(word)):
for char in 'abcdefghijklmnopqrstuvwxyz':
new_word = word[:i] + char + word[i+1:]
if new_word == end_word:
return steps + 1 # found!
if new_word in word_set and new_word not in visited:
visited.add(new_word)
queue.append((new_word, steps + 1))
return 0 # no path found
# Time: O(M² × N) where M = word length, N = number of words
Pattern 6.4.B — DFS Template
# === Template 1: Recursive DFS (most common in interviews) ===
def dfs_recursive(graph, node, visited=None):
"""DFS using recursion (call stack is the implicit stack)."""
if visited is None:
visited = set()
visited.add(node) # mark as visited
print(node) # process node
for neighbor in graph[node]:
if neighbor not in visited:
dfs_recursive(graph, neighbor, visited)
return visited
# Time: O(V + E), Space: O(V) for recursion stack
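For contrast with BFS, a standalone demo of the recursive template that records visit order on the same style of adjacency list (the sample graph is made up):

```python
def dfs_order(graph, start):
    """Recursive DFS, returning nodes in the order visited."""
    order, visited = [], set()
    def dfs(node):
        visited.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor)
    dfs(start)
    return order

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
dfs_order(graph, 'A')  # ['A', 'B', 'D', 'C'] -- dives down to D before touching C
```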
# === Template 2: Iterative DFS (explicit stack) ===
def dfs_iterative(graph, start):
"""DFS using explicit stack ā avoids recursion limit."""
visited = set()
stack = [start]
result = []
while stack:
node = stack.pop() # LIFO — use stack
if node in visited:
continue # already processed
visited.add(node)
result.append(node)
# Push neighbors (in reverse order to process in original order)
for neighbor in reversed(graph[node]):
if neighbor not in visited:
stack.append(neighbor)
return result
Example 1: Number of Islands (Grid DFS)
def num_islands(grid: list) -> int:
"""
Count number of islands (connected regions of '1's).
Example:
grid = [["1","1","0","0"],
["1","1","0","0"],
["0","0","1","0"],
["0","0","0","1"]]
Output: 3
"""
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
count = 0
def dfs(r, c):
"""Flood-fill: mark connected land as visited."""
# Base cases: out of bounds or water or visited
if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != '1':
return
grid[r][c] = '#' # mark as visited (modify in place)
# Explore all 4 directions
dfs(r + 1, c)
dfs(r - 1, c)
dfs(r, c + 1)
dfs(r, c - 1)
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1': # found unvisited land
count += 1
dfs(r, c) # sink the whole island
return count
# Time: O(rows * cols), Space: O(rows * cols) for recursion
# ✅ BFS version (avoids recursion limit for huge grids):
def num_islands_bfs(grid):
rows, cols = len(grid), len(grid[0])
count = 0
def bfs(r, c):
queue = deque([(r, c)])
grid[r][c] = '#'
while queue:
r, c = queue.popleft()
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
grid[nr][nc] = '#'
queue.append((nr, nc))
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1':
count += 1
bfs(r, c)
return count
Example 2: Clone Graph
class Node:
def __init__(self, val=0, neighbors=None):
self.val = val
self.neighbors = neighbors or []
def clone_graph(node: 'Node') -> 'Node':
"""
Deep copy a connected undirected graph.
Each node has a value and a list of neighbors.
"""
if not node:
return None
cloned = {} # original node → cloned node
def dfs(node):
if node in cloned:
return cloned[node] # already cloned, return it
clone = Node(node.val) # create clone with same value
cloned[node] = clone # register BEFORE processing neighbors
# (prevents infinite loop with cycles!)
for neighbor in node.neighbors:
clone.neighbors.append(dfs(neighbor)) # clone all neighbors
return clone
return dfs(node)
# Time: O(V + E), Space: O(V)
Example 3: Course Schedule (Cycle Detection with DFS)
def can_finish(num_courses: int, prerequisites: list) -> bool:
"""
Determine if all courses can be finished given prerequisites.
prerequisites[i] = [a, b] means "must take b before a"
Equivalent to: Does the directed graph have a cycle?
"""
# Build adjacency list
graph = [[] for _ in range(num_courses)]
for course, prereq in prerequisites:
graph[course].append(prereq)
# State: 0 = unvisited, 1 = visiting (in stack), 2 = done
state = [0] * num_courses
def has_cycle(course):
if state[course] == 1: # in current path → cycle!
return True
if state[course] == 2: # already fully processed
return False
state[course] = 1 # mark as "in current path"
for prereq in graph[course]:
if has_cycle(prereq):
return True
state[course] = 2 # mark as done
return False
for course in range(num_courses):
if has_cycle(course):
return False
return True
# Time: O(V + E), Space: O(V + E)
📦 Quick Cheatsheet — BFS / DFS
# BFS (shortest path, level order)
from collections import deque
queue = deque([start])
visited = {start}
while queue:
node = queue.popleft()
for neighbor in graph[node]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
# DFS recursive
def dfs(node, visited):
visited.add(node)
for neighbor in graph[node]:
if neighbor not in visited:
dfs(neighbor, visited)
# Grid DFS (flood fill)
dirs = [(0,1),(0,-1),(1,0),(-1,0)]
def dfs(r, c):
if r<0 or r>=R or c<0 or c>=C or grid[r][c]!='1':
return
grid[r][c] = '#'
for dr,dc in dirs:
dfs(r+dr, c+dc)
# Cycle detection in directed graph
# state: 0=unvisited, 1=in stack, 2=done
6.5 Backtracking Template
Theory
Backtracking is a controlled brute force. Explore all possibilities, but prune paths that can't lead to a valid solution.
Pattern:
1. Make a choice (add to current path)
2. Explore (recurse deeper)
3. Undo the choice (remove from path → BACKTRACK)
Recognize when:
- Problem asks for ALL solutions (not just one optimal)
- Keywords: "all permutations", "all subsets", "all combinations"
- Decision tree structure (choose or not choose)
# === Universal Backtracking Template ===
def backtrack(result, current_path, choices, start=0):
"""
result: accumulate all valid solutions
current_path: choices made so far
choices: available options
start: where to begin in choices (for combinations, avoids duplicates)
"""
# Base case: current path is a valid solution
if is_valid_solution(current_path):
result.append(list(current_path)) # COPY the path!
return
for i in range(start, len(choices)):
# Pruning: skip invalid choices early
if should_prune(i, current_path):
continue
current_path.append(choices[i]) # Make choice
backtrack(result, current_path, choices, i + 1) # Explore
current_path.pop() # Undo choice (BACKTRACK)
Example 1: All Subsets (Power Set)
def subsets(nums: list) -> list:
"""
Generate all possible subsets.
Example: [1,2,3] → [[],[1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]]
"""
result = []
def backtrack(start, current):
result.append(list(current)) # every state is a valid subset
for i in range(start, len(nums)):
current.append(nums[i]) # choose nums[i]
backtrack(i + 1, current) # explore (next elements only)
current.pop() # unchoose (backtrack)
backtrack(0, [])
return result
# Time: O(n × 2ⁿ), Space: O(n × 2ⁿ)
Example 2: All Permutations
def permutations(nums: list) -> list:
"""
Generate all permutations.
Example: [1,2,3] → [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
"""
result = []
def backtrack(current, remaining):
if not remaining: # used all numbers
result.append(list(current))
return
for i in range(len(remaining)):
current.append(remaining[i]) # choose element i
# remaining without element i
backtrack(current, remaining[:i] + remaining[i+1:])
current.pop() # unchoose
backtrack([], nums)
return result
# Time: O(n × n!), Space: O(n × n!)
# More efficient version using swap:
def permutations_v2(nums):
result = []
def backtrack(start):
if start == len(nums):
result.append(list(nums))
return
for i in range(start, len(nums)):
nums[start], nums[i] = nums[i], nums[start] # swap
backtrack(start + 1)
nums[start], nums[i] = nums[i], nums[start] # swap back
backtrack(0)
return result
Example 3: Combination Sum
def combination_sum(candidates: list, target: int) -> list:
"""
Find all combinations that sum to target.
Same number can be used multiple times.
Example: candidates = [2,3,6,7], target = 7 → [[2,2,3],[7]]
"""
result = []
candidates.sort() # sort for pruning
def backtrack(start, current, remaining):
if remaining == 0: # found valid combination
result.append(list(current))
return
for i in range(start, len(candidates)):
# Pruning: if current number > remaining, all future too
if candidates[i] > remaining:
break # stop early!
current.append(candidates[i])
# Use i (not i+1) because we can reuse same element
backtrack(i, current, remaining - candidates[i])
current.pop()
backtrack(0, [], target)
return result
# Time: O(N^(T/M)) where T=target, M=min(candidates)
Example 4: N-Queens
def solve_n_queens(n: int) -> list:
"""
Place n queens on n×n board so no two queens attack each other.
Return all distinct solutions.
"""
result = []
cols = set() # columns occupied
pos_diag = set() # positive diagonals (r + c = const)
neg_diag = set() # negative diagonals (r - c = const)
board = [['.'] * n for _ in range(n)]
def backtrack(row):
if row == n: # placed all queens!
snapshot = [''.join(r) for r in board]
result.append(snapshot)
return
for col in range(n):
# Check if this position is under attack
if col in cols or (row + col) in pos_diag or (row - col) in neg_diag:
continue # skip ā queen would be attacked
# Place queen
board[row][col] = 'Q'
cols.add(col)
pos_diag.add(row + col)
neg_diag.add(row - col)
backtrack(row + 1) # place queen in next row
# Remove queen (backtrack)
board[row][col] = '.'
cols.remove(col)
pos_diag.remove(row + col)
neg_diag.remove(row - col)
backtrack(0)
return result
# Time: O(n!), Space: O(n²)
Example 5: Sudoku Solver
def solve_sudoku(board: list) -> None:
"""
Solve sudoku puzzle in-place.
'.' represents empty cells.
"""
def is_valid(board, row, col, num):
"""Check if placing num at (row, col) is valid."""
# Check row
if num in board[row]:
return False
# Check column
if num in [board[r][col] for r in range(9)]:
return False
# Check 3×3 box
box_r, box_c = (row // 3) * 3, (col // 3) * 3
for r in range(box_r, box_r + 3):
for c in range(box_c, box_c + 3):
if board[r][c] == num:
return False
return True
def solve(board):
for r in range(9):
for c in range(9):
if board[r][c] == '.': # found empty cell
for num in '123456789':
if is_valid(board, r, c, num):
board[r][c] = num # place number
if solve(board): # recurse
return True
board[r][c] = '.' # backtrack
return False # no valid number found
return True # all cells filled
solve(board)
📦 Quick Cheatsheet — Backtracking
def backtrack(start, current):
if base_case:
result.append(list(current)) # COPY!
return
for i in range(start, len(choices)):
if pruning_condition:
continue # or break if sorted
current.append(choices[i]) # choose
backtrack(i + 1, current) # explore
current.pop() # undo
# Permutations: start=0, no reuse
# Combinations: start=i+1, no reuse
# Combination sum with reuse: start=i, same element again
6.6 Dynamic Programming
Theory
Dynamic programming = memoization + recursion (top-down) or tabulation (bottom-up). Both avoid recomputing subproblems.
Recognize DP when:
1. Optimal substructure: optimal solution uses optimal solutions of subproblems
2. Overlapping subproblems: same subproblem appears multiple times
Common DP patterns:
- 1D DP: Fibonacci, climbing stairs, house robber
- 2D DP: Grid paths, edit distance, LCS, knapsack
- Interval DP: Matrix chain, burst balloons
- String DP: Palindrome, regex matching
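To make the two properties concrete, here is a small sketch (assuming n = 20) that counts how often plain recursion revisits the same subproblem, versus @functools.cache computing each one exactly once:

```python
import functools

def fib_naive(n, counter):
    """Plain recursion: the same subproblems are recomputed exponentially often."""
    counter[0] += 1
    if n <= 1:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)

@functools.cache
def fib_cached(n):
    return n if n <= 1 else fib_cached(n - 1) + fib_cached(n - 2)

counter = [0]
fib_naive(20, counter)          # 6765
counter[0]                      # 21891 calls for only 21 distinct subproblems
fib_cached(20)                  # 6765
fib_cached.cache_info().misses  # 21 -- each subproblem computed exactly once
```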
Pattern 6.6.A — Top-Down (Memoization)
# === Template: Top-Down with memo dict ===
def solve_top_down(n, memo=None):
if memo is None:
memo = {} # fresh dict each call (NOT mutable default!)
if n in memo:
return memo[n] # return cached result
if n <= 1: # base case
return n
memo[n] = solve_top_down(n-1, memo) + solve_top_down(n-2, memo) # recurrence
return memo[n] # cache and return
# Better: use @functools.cache (cleaner, same effect)
import functools
@functools.cache
def solve(n):
if n <= 1: # base case
return n
return solve(n-1) + solve(n-2) # recurrence relation
Example 1: Climbing Stairs
@functools.cache
def climb_stairs(n: int) -> int:
"""
Climb n stairs taking 1 or 2 steps. Count distinct ways.
Example: n=4 → 5 ways
Think: to reach stair n, you came from n-1 (1 step) or n-2 (2 steps).
ways(n) = ways(n-1) + ways(n-2) → Fibonacci!
"""
if n <= 2:
return n
return climb_stairs(n - 1) + climb_stairs(n - 2)
# Time: O(n), Space: O(n) for memo + call stack
Example 2: Word Break
def word_break(s: str, word_dict: list) -> bool:
"""
Determine if s can be segmented into words from word_dict.
Example: s = "leetcode", wordDict = ["leet","code"] → True
"""
word_set = set(word_dict)
n = len(s)
memo = {}
def can_break(start):
if start == n: # reached end → success
return True
if start in memo:
return memo[start]
for end in range(start + 1, n + 1):
# If current slice is a word AND rest can be broken
if s[start:end] in word_set and can_break(end):
memo[start] = True
return True
memo[start] = False
return False
return can_break(0)
# Time: O(n²), Space: O(n)
Pattern 6.6.B — Bottom-Up (Tabulation)
# === Template: Bottom-Up DP ===
def solve_bottom_up(n):
dp = [0] * (n + 1) # dp[i] = answer for subproblem i
dp[0] = base_case_0 # base cases
dp[1] = base_case_1
for i in range(2, n + 1):
dp[i] = ...  # recurrence relation using dp[i-1], dp[i-2], etc.
return dp[n]
Example 1: House Robber
def house_robber(nums: list) -> int:
"""
Rob houses without robbing adjacent ones.
Maximize total money robbed.
Example: [2,7,9,3,1] → 12 (rob 2, 9, 1)
dp[i] = max money from first i houses
dp[i] = max(dp[i-1], dp[i-2] + nums[i])
= max(skip this house, rob this house)
"""
if not nums:
return 0
if len(nums) == 1:
return nums[0]
# Space optimization: only need previous 2 values
prev2 = nums[0] # dp[i-2]
prev1 = max(nums[0], nums[1]) # dp[i-1]
for i in range(2, len(nums)):
current = max(prev1, prev2 + nums[i]) # rob or skip
prev2, prev1 = prev1, current
return prev1
# Time: O(n), Space: O(1)
Example 2: Longest Increasing Subsequence (LIS)
def length_of_lis(nums: list) -> int:
"""
Find length of longest strictly increasing subsequence.
Example: [10,9,2,5,3,7,101,18] → 4 ([2,3,7,101])
"""
if not nums:
return 0
n = len(nums)
# dp[i] = length of LIS ending at index i
dp = [1] * n # every element is LIS of length 1
for i in range(1, n):
for j in range(i): # look at all previous elements
if nums[j] < nums[i]: # can extend subsequence
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
# Time: O(n²), Space: O(n)
# O(n log n) solution using binary search + patience sorting:
import bisect
def length_of_lis_fast(nums: list) -> int:
"""
Maintain 'tails' array where tails[i] = smallest tail of all
increasing subsequences of length i+1.
"""
tails = [] # patience sort "piles"
for num in nums:
# Binary search: find leftmost tail >= num
pos = bisect.bisect_left(tails, num)
if pos == len(tails):
tails.append(num) # extend longest subsequence
else:
tails[pos] = num # replace to get smaller tail
return len(tails)
# Time: O(n log n), Space: O(n)
Example 3: Coin Change
def coin_change(coins: list, amount: int) -> int:
"""
Find minimum coins needed to make up amount.
Return -1 if impossible.
Example: coins = [1,5,11], amount = 15 → 3 (5+5+5)
dp[i] = minimum coins to make amount i
dp[i] = min(dp[i - coin] + 1) for each coin <= i
"""
# Initialize with "impossible" value
dp = [float('inf')] * (amount + 1)
dp[0] = 0 # base case: 0 coins to make 0
for i in range(1, amount + 1):
for coin in coins:
if coin <= i and dp[i - coin] != float('inf'):
dp[i] = min(dp[i], dp[i - coin] + 1)
return dp[amount] if dp[amount] != float('inf') else -1
# Time: O(amount × len(coins)), Space: O(amount)
Example 4: Longest Common Subsequence (LCS)
def lcs(text1: str, text2: str) -> int:
"""
Find length of longest common subsequence.
Example: text1 = "abcde", text2 = "ace" → 3 ("ace")
dp[i][j] = LCS of text1[:i] and text2[:j]
If text1[i-1] == text2[j-1]: dp[i][j] = dp[i-1][j-1] + 1
Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
"""
m, n = len(text1), len(text2)
# (m+1) Ć (n+1) table, initialized to 0
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]: # characters match
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
return dp[m][n]
# Time: O(m × n), Space: O(m × n)
Example 5: 0/1 Knapsack
def knapsack(weights: list, values: list, capacity: int) -> int:
"""
Classic 0/1 knapsack — can take each item at most once.
Maximize total value without exceeding capacity.
dp[i][w] = max value using items 0..i with capacity w
Choice for item i:
- Skip item i: dp[i-1][w]
- Take item i: dp[i-1][w - weights[i]] + values[i] (if fits)
"""
n = len(weights)
dp = [[0] * (capacity + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
for w in range(capacity + 1):
# Option 1: skip item i
dp[i][w] = dp[i - 1][w]
# Option 2: take item i (if it fits)
if weights[i - 1] <= w:
dp[i][w] = max(dp[i][w],
dp[i - 1][w - weights[i - 1]] + values[i - 1])
return dp[n][capacity]
# Time: O(n × capacity), Space: O(n × capacity)
# Space-optimized version (1D DP):
def knapsack_1d(weights, values, capacity):
dp = [0] * (capacity + 1)
for i in range(len(weights)):
# Iterate BACKWARDS to avoid using same item twice
for w in range(capacity, weights[i] - 1, -1):
dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
return dp[capacity]
# Time: O(n × capacity), Space: O(capacity)
📦 Quick Cheatsheet — Dynamic Programming
# Top-down (memoization)
@functools.cache
def dp(state):
if base_case: return base_value
return recurrence(dp(smaller_state))
# Bottom-up (tabulation)
dp = [base] * (n + 1)
for i in range(start, n + 1):
dp[i] = min/max(dp[i-1], dp[i-2] + something)
# Common DP formulas:
# Fibonacci: dp[i] = dp[i-1] + dp[i-2]
# House Robber: dp[i] = max(dp[i-1], dp[i-2] + nums[i])
# Coin Change: dp[i] = min(dp[i-coin] + 1) for each coin
# LCS: dp[i][j] = dp[i-1][j-1]+1 if match, else max(dp[i-1][j], dp[i][j-1])
# Knapsack: dp[w] = max(dp[w], dp[w-wt]+val) — iterate w BACKWARDS
6.7 Monotonic Stack and Monotonic Queue
Theory
Monotonic Stack: stack where elements are always in sorted order
- Monotonic Increasing: top is largest
- Monotonic Decreasing: top is smallest
Use when: "next greater/smaller element", "largest rectangle", "span of stock price"
Monotonic Deque: deque where elements are sorted
- Used for sliding window maximum/minimum in O(1)
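Before the templates, a tiny trace (the helper is illustrative, not a standard API) showing the invariant a decreasing stack maintains — after every push, the stack is non-increasing from bottom to top:

```python
def trace_decreasing_stack(arr):
    """Snapshot a non-increasing stack after each element of arr is processed."""
    stack, snapshots = [], []
    for val in arr:
        while stack and stack[-1] < val:  # pop strictly smaller tops
            stack.pop()
        stack.append(val)
        snapshots.append(list(stack))     # copy the state after this step
    return snapshots

trace_decreasing_stack([2, 1, 2, 4, 3])
# [[2], [2, 1], [2, 2], [4], [4, 3]] -- every popped value just met its next greater
```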
Pattern 6.7.A — Monotonic Stack
# === Template: Next Greater Element ===
def next_greater_element(nums: list) -> list:
"""
For each element, find the next greater element to its right.
Return -1 if none exists.
Example: [2,1,2,4,3] → [4,2,4,-1,-1]
"""
n = len(nums)
result = [-1] * n
stack = [] # stack of indices (decreasing values)
for i in range(n):
# While stack has elements smaller than nums[i]
while stack and nums[stack[-1]] < nums[i]:
idx = stack.pop()
result[idx] = nums[i] # nums[i] is next greater for idx
stack.append(i)
# Remaining elements in stack have no next greater → result stays -1
return result
# Time: O(n) — each element pushed/popped at most once
# Space: O(n)
Example 1: Largest Rectangle in Histogram
def largest_rectangle_area(heights: list) -> int:
"""
Find largest rectangle in histogram.
Example: heights = [2,1,5,6,2,3] → 10 (width=2, height=5)
Key insight: for each bar, find how far left/right it can extend
while remaining the shortest bar.
Monotonic increasing stack helps find these boundaries.
"""
stack = [] # stack of indices (increasing heights)
max_area = 0
heights = heights + [0] # sentinel to flush remaining bars
for i, h in enumerate(heights):
while stack and heights[stack[-1]] > h:
height = heights[stack.pop()] # height of the rectangle
# Width: from stack top to current position
width = i if not stack else i - stack[-1] - 1
max_area = max(max_area, height * width)
stack.append(i)
return max_area
# Time: O(n), Space: O(n)
Example 2: Trapping Rain Water
def trap(height: list) -> int:
"""
Compute water trapped between bars.
Example: [0,1,0,2,1,0,1,3,2,1,2,1] → 6
"""
if not height:
return 0
# ✅ Two-pointer approach (O(1) space):
left, right = 0, len(height) - 1
left_max = right_max = 0
water = 0
while left < right:
if height[left] < height[right]:
if height[left] >= left_max:
left_max = height[left] # update max from left
else:
water += left_max - height[left] # trapped here
left += 1
else:
if height[right] >= right_max:
right_max = height[right]
else:
water += right_max - height[right]
right -= 1
return water
# Time: O(n), Space: O(1)
# ✅ Stack approach (also O(n)):
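A sketch of that stack approach, using the same input as the docstring above (each pop is a "valley floor" whose trapped water is bounded by the new stack top and the current bar):

```python
def trap_stack(height):
    """Trapping rain water with a decreasing stack of indices. O(n) time/space."""
    stack, water = [], 0
    for i, h in enumerate(height):
        while stack and height[stack[-1]] < h:
            bottom = stack.pop()              # valley floor
            if not stack:
                break                         # no left wall -> nothing held here
            width = i - stack[-1] - 1
            depth = min(h, height[stack[-1]]) - height[bottom]
            water += width * depth
        stack.append(i)
    return water

trap_stack([0,1,0,2,1,0,1,3,2,1,2,1])  # 6, matching the two-pointer version
```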
# Useful to understand the monotonic stack pattern
Pattern 6.7.B — Monotonic Deque (Sliding Window Max)
from collections import deque
def sliding_window_maximum(nums: list, k: int) -> list:
"""
Find maximum in every sliding window of size k.
Example: nums=[1,3,-1,-3,5,3,6,7], k=3 → [3,3,5,5,6,7]
Monotonic decreasing deque: front = index of max in current window
"""
result = []
dq = deque() # stores INDICES (decreasing values)
for i in range(len(nums)):
# Remove indices outside window
while dq and dq[0] < i - k + 1:
dq.popleft()
# Remove indices whose values are smaller than nums[i]
# (they can never be the max while nums[i] is in window)
while dq and nums[dq[-1]] < nums[i]:
dq.pop()
dq.append(i)
# Start recording results once first window is complete
if i >= k - 1:
result.append(nums[dq[0]]) # front is always the max
return result
# Time: O(n) — each element added/removed at most once
# Space: O(k)
📦 Quick Cheatsheet — Monotonic Stack/Queue
# Monotonic increasing stack (find next SMALLER)
stack = []
for i, val in enumerate(arr):
while stack and arr[stack[-1]] > val:
idx = stack.pop()
# arr[idx]'s next smaller is val
stack.append(i)
# Monotonic decreasing stack (find next GREATER)
stack = []
for i, val in enumerate(arr):
while stack and arr[stack[-1]] < val:
idx = stack.pop()
# arr[idx]'s next greater is val
stack.append(i)
# Sliding window max (monotonic deque)
dq = deque()
for i in range(len(nums)):
while dq and dq[0] < i - k + 1: dq.popleft() # out of window
while dq and nums[dq[-1]] < nums[i]: dq.pop() # smaller, useless
dq.append(i)
if i >= k - 1: result.append(nums[dq[0]])
6.8 Union-Find (Disjoint Set Union)
Theory
Union-Find manages a collection of disjoint sets.
- find(x): which group does x belong to?
- union(x, y): merge groups of x and y
Two optimizations:
1. Path compression: make every node point directly to root (flattens tree)
2. Union by rank/size: always attach smaller tree under larger (keeps tree flat)
With both: nearly O(1) per operation (amortized O(α(n)) where α = inverse Ackermann)
# === Complete Union-Find Implementation ===
class UnionFind:
def __init__(self, n):
self.parent = list(range(n)) # each element is its own root
self.rank = [0] * n # tree height (for union by rank)
self.size = [1] * n # component size
self.num_components = n # number of distinct components
def find(self, x):
"""Find root of x with path compression."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
"""Merge components of x and y. Returns True if they were separate."""
rx, ry = self.find(x), self.find(y)
if rx == ry:
return False # already in same component
# Union by rank: attach smaller rank tree under larger rank
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx # ensure rx has higher rank
self.parent[ry] = rx # attach ry under rx
self.size[rx] += self.size[ry]
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1 # only increase when ranks equal
self.num_components -= 1
return True
def connected(self, x, y):
"""Check if x and y are in same component."""
return self.find(x) == self.find(y)
def component_size(self, x):
"""Return size of component containing x."""
return self.size[self.find(x)]
Example 1: Number of Connected Components
def count_components(n: int, edges: list) -> int:
"""
Count connected components in undirected graph.
"""
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.num_components
# Time: O(E à α(N)) ā O(E), Space: O(N)Example 2: Redundant Connection (Cycle Detection)
def find_redundant_connection(edges: list) -> list:
"""
Find the edge that creates a cycle in an undirected graph.
Example: edges = [[1,2],[1,3],[2,3]] → [2,3]
"""
n = len(edges)
uf = UnionFind(n + 1) # nodes are 1-indexed
for u, v in edges:
if not uf.union(u, v): # union returns False = already connected
return [u, v] # this edge creates a cycle!
return []
Example 3: Accounts Merge
def accounts_merge(accounts: list) -> list:
"""
Merge accounts sharing common emails.
accounts[i] = [name, email1, email2, ...]
"""
email_to_id = {} # email → first seen index
email_to_name = {} # email → account name
total_emails = sum(len(acct) - 1 for acct in accounts) # upper bound on unique emails
uf = UnionFind(total_emails)
node_id = 0
for account in accounts:
name = account[0]
for i in range(1, len(account)):
email = account[i]
if email not in email_to_id:
email_to_id[email] = node_id
node_id += 1
email_to_name[email] = name
# Union current email with first email of this account
uf.union(email_to_id[account[1]], email_to_id[email])
# Group emails by root
from collections import defaultdict
groups = defaultdict(list)
for email, idx in email_to_id.items():
root = uf.find(idx)
groups[root].append(email)
result = []
for root, emails in groups.items():
name = email_to_name[emails[0]]
result.append([name] + sorted(emails))
return result
6.9 Trie (Prefix Tree)
Theory
Trie: tree where each node represents a character.
- Root = empty string
- Each path from root = a prefix
- Marked nodes = complete words
Operations: insert, search, startsWith — all O(L) where L = word length
Use when:
- Autocomplete / prefix search
- Word dictionary with prefix lookup
- Longest common prefix
# === Trie Implementation ===
class TrieNode:
def __init__(self):
self.children = {} # char → TrieNode
self.is_end = False # marks complete word
self.count = 0 # words ending here (optional)
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word: str) -> None:
"""Insert word into trie. O(L)"""
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char] # move to next node
node.is_end = True # mark end of word
node.count += 1
def search(self, word: str) -> bool:
"""Check if exact word exists. O(L)"""
node = self._find_prefix(word)
return node is not None and node.is_end
def starts_with(self, prefix: str) -> bool:
"""Check if any word starts with prefix. O(L)"""
return self._find_prefix(prefix) is not None
def _find_prefix(self, prefix: str):
"""Navigate to end of prefix. Returns node or None."""
node = self.root
for char in prefix:
if char not in node.children:
return None
node = node.children[char]
return node
def get_words_with_prefix(self, prefix: str) -> list:
"""Return all words starting with prefix (for autocomplete)."""
node = self._find_prefix(prefix)
if not node:
return []
result = []
def dfs(node, current):
if node.is_end:
result.append(current)
for char, child in node.children.items():
dfs(child, current + char)
dfs(node, prefix)
return result
Example: Word Search II (Backtracking + Trie)
def find_words(board: list, words: list) -> list:
"""
Find all words from list that exist in the board.
Words can be formed from adjacent cells (no cell reused).
"""
# Build trie from word list
trie = Trie()
for word in words:
trie.insert(word)
rows, cols = len(board), len(board[0])
found = set()
def dfs(r, c, node, path):
char = board[r][c]
if char not in node.children:
return # no word starts with this path
next_node = node.children[char]
path += char
if next_node.is_end:
found.add(path) # found a word!
# Don't return — might find longer words
board[r][c] = '#' # mark visited
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] != '#':
dfs(nr, nc, next_node, path)
board[r][c] = char # restore (backtrack)
for r in range(rows):
for c in range(cols):
dfs(r, c, trie.root, "")
return list(found)
# Time: O(M × 4 × 3^(L-1)) where M = number of cells, L = max word length
6.10 Prefix Sum and Difference Array
Theory
Prefix Sum: precompute running totals for O(1) range sum queries.
prefix[i] = sum of first i elements
range_sum(l, r) = prefix[r+1] - prefix[l]
Difference Array: efficient for range update queries.
diff[i] = arr[i] - arr[i-1]
Add k to range [l, r]: diff[l] += k, diff[r+1] -= k
Reconstruct: prefix sum of diff array
# === Prefix Sum ===
def build_prefix_sum(nums: list) -> list:
"""Build prefix sum array. prefix[i] = sum(nums[0:i])"""
prefix = [0] * (len(nums) + 1)
for i, val in enumerate(nums):
prefix[i + 1] = prefix[i] + val
return prefix
def range_sum(prefix: list, l: int, r: int) -> int:
"""Sum of nums[l..r] inclusive. O(1)"""
return prefix[r + 1] - prefix[l]
# Example
nums = [3, 1, 4, 1, 5, 9, 2, 6]
prefix = build_prefix_sum(nums) # [0, 3, 4, 8, 9, 14, 23, 25, 31]
range_sum(prefix, 2, 5) # 4+1+5+9 = 19
# === 2D Prefix Sum (matrix) ===
def build_2d_prefix(matrix: list) -> list:
"""For O(1) rectangle sum queries."""
rows, cols = len(matrix), len(matrix[0])
prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
for r in range(rows):
for c in range(cols):
prefix[r+1][c+1] = (matrix[r][c]
+ prefix[r][c+1] # top
+ prefix[r+1][c] # left
- prefix[r][c]) # double-counted corner
return prefix
def rect_sum(prefix, r1, c1, r2, c2):
"""Sum of rectangle (r1,c1) to (r2,c2)."""
return (prefix[r2+1][c2+1]
- prefix[r1][c2+1]
- prefix[r2+1][c1]
+ prefix[r1][c1])
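Sanity-checking the inclusion-exclusion identity on a tiny matrix makes it easier to trust under interview pressure; a minimal self-contained check:

```python
matrix = [[1, 2], [3, 4]]
rows, cols = len(matrix), len(matrix[0])
prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
for r in range(rows):
    for c in range(cols):
        prefix[r + 1][c + 1] = (matrix[r][c] + prefix[r][c + 1]
                                + prefix[r + 1][c] - prefix[r][c])
# Whole matrix = rect (0,0)..(1,1):
full = prefix[2][2] - prefix[0][2] - prefix[2][0] + prefix[0][0]   # 10 = 1+2+3+4
# Left column only = rect (0,0)..(1,0):
col0 = prefix[2][1] - prefix[0][1] - prefix[2][0] + prefix[0][0]   # 4 = 1+3
```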
# === Difference Array ===
def range_add(diff: list, l: int, r: int, k: int) -> None:
    """Record "add k to elements l..r" in O(1) via the difference array."""
    diff[l] += k
    if r + 1 < len(diff):
        diff[r + 1] -= k
def apply_diff(diff: list, n: int) -> list:
    """Reconstruct the first n array values as a prefix sum of diff."""
    result = []
    running = 0
    for d in diff[:n]:
        running += d
        result.append(running)
    return result
# Example:
n = 6
diff = [0] * (n + 1)            # one extra slot so diff[r + 1] never overflows
range_add(diff, 1, 4, 3)        # add 3 to indices 1-4
range_add(diff, 2, 5, -1)       # subtract 1 from indices 2-5
apply_diff(diff, n)             # [0, 3, 2, 2, 2, -1]
Example: Subarray Sum Equals K (Prefix Sum + Dict)
def subarray_sum(nums: list, k: int) -> int:
"""
Count subarrays summing to k.
Key insight: if prefix[j] - prefix[i] = k,
then subarray nums[i..j-1] sums to k.
"""
count = 0
prefix_sum = 0
freq = {0: 1} # empty prefix seen once
for num in nums:
prefix_sum += num
# How many prefixes had value (prefix_sum - k)?
# Each of those creates a valid subarray ending here
count += freq.get(prefix_sum - k, 0)
freq[prefix_sum] = freq.get(prefix_sum, 0) + 1
return count
# Time: O(n), Space: O(n)
6.11 Sorting Tricks for Interviews
from functools import cmp_to_key
# === Multi-key sorting ===
students = [("Alice", 90, 20), ("Bob", 85, 22), ("Charlie", 90, 19)]
# Sort by score DESC, then age ASC
sorted(students, key=lambda s: (-s[1], s[2]))
# [('Charlie', 90, 19), ('Alice', 90, 20), ('Bob', 85, 22)]
# === Largest Number (custom comparator) ===
def largest_number(nums: list) -> str:
"""
Arrange numbers to form largest possible number.
    Example: [10,2] → "210", [3,30,34,5,9] → "9534330"
"""
strs = list(map(str, nums))
def compare(a, b):
if a + b > b + a: return -1 # a should come first
if a + b < b + a: return 1 # b should come first
return 0
strs.sort(key=cmp_to_key(compare))
result = ''.join(strs)
    return '0' if result[0] == '0' else result # edge: all zeros → "0"
# === Counting Sort (O(n) when range is small) ===
def counting_sort(nums: list, max_val: int) -> list:
count = [0] * (max_val + 1)
for n in nums:
count[n] += 1
result = []
for val, cnt in enumerate(count):
result.extend([val] * cnt)
return result
# === Bucket Sort ===
def bucket_sort(nums: list) -> list:
"""For uniformly distributed floats in [0,1)."""
n = len(nums)
buckets = [[] for _ in range(n)]
for num in nums:
buckets[int(num * n)].append(num)
result = []
for bucket in buckets:
result.extend(sorted(bucket))
return result
# === Anagram Grouping ===
from collections import defaultdict
def group_anagrams(words: list) -> list:
groups = defaultdict(list)
for word in words:
key = ''.join(sorted(word)) # anagram key
groups[key].append(word)
    return list(groups.values())
6.12 String Manipulation Patterns
# === Palindrome variations ===
def is_palindrome(s):
s = ''.join(c.lower() for c in s if c.isalnum())
return s == s[::-1] # O(n)
def is_palindrome_range(s, l, r):
"""Check if s[l:r+1] is a palindrome ā used in expand-around-center."""
while l < r:
if s[l] != s[r]: return False
l += 1; r -= 1
return True
# Longest Palindromic Substring:
def longest_palindrome(s: str) -> str:
"""Expand around center ā O(n²) time, O(1) space."""
if not s: return ""
start, end = 0, 0
for i in range(len(s)):
# Two cases: odd-length and even-length palindromes
for l, r in [(i, i), (i, i + 1)]: # center at char vs between chars
while l >= 0 and r < len(s) and s[l] == s[r]:
if r - l > end - start:
start, end = l, r
l -= 1; r += 1
return s[start:end + 1]
# === Building strings efficiently ===
# ❌ BAD: O(n²) -- each += copies the whole string built so far
result = ""
for char in iterable:
    result += char
# ✅ GOOD: O(n) -- collect parts, join once
parts = []
for char in iterable:
    parts.append(char)
result = ''.join(parts)
# === Character frequency patterns ===
from collections import Counter
def is_anagram(s: str, t: str) -> bool:
return Counter(s) == Counter(t) # O(n)
def find_all_anagrams(s: str, p: str) -> list:
"""Find all starting indices of p's anagrams in s."""
if len(p) > len(s): return []
p_count = Counter(p)
s_count = Counter(s[:len(p)])
result = []
if s_count == p_count: result.append(0)
for i in range(len(p), len(s)):
# Add new character
s_count[s[i]] += 1
# Remove old character
old = s[i - len(p)]
s_count[old] -= 1
if s_count[old] == 0:
del s_count[old]
if s_count == p_count:
result.append(i - len(p) + 1)
    return result
6.13 Bit Manipulation Tricks
# === Essential bit operations ===
x & (x - 1) # clear lowest set bit (e.g., 1100 & 1011 = 1000)
x & (-x) # isolate lowest set bit (e.g., 1100 & 0100 = 0100)
x | (1 << k) # set bit k
x & ~(1 << k) # clear bit k
x ^ (1 << k) # toggle bit k
(x >> k) & 1 # check if bit k is set
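These are easier to remember with concrete values; a quick self-contained check using x = 0b1100 (12):

```python
x = 0b1100                      # 12
low_cleared = x & (x - 1)       # 0b1000 = 8  (lowest set bit cleared)
low_isolated = x & (-x)         # 0b0100 = 4  (lowest set bit isolated)
bit1_set = x | (1 << 1)         # 0b1110 = 14 (bit 1 set)
bit2 = (x >> 2) & 1             # 1           (bit 2 is set)
```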
# === Count set bits ===
bin(n).count('1') # Pythonic way
n.bit_count()     # Python 3.10+: dedicated built-in
# or:
def count_bits(n):
count = 0
while n:
n &= n - 1 # clear lowest set bit each iteration
count += 1
return count
# === Power of 2 check ===
def is_power_of_two(n):
return n > 0 and (n & (n - 1)) == 0
# === XOR tricks ===
# Single Number: find element appearing once (all others appear twice)
def single_number(nums: list) -> int:
result = 0
for num in nums:
result ^= num # XOR cancels pairs: a ^ a = 0, a ^ 0 = a
return result
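The cancellation claim is easy to verify by hand with a small input:

```python
nums = [4, 1, 2, 1, 2]
acc = 0
for n in nums:
    acc ^= n
# acc == 4: the pairs 1^1 and 2^2 cancel to 0, leaving the singleton
```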
# Swap without temp:
a ^= b
b ^= a
a ^= b
# === Generate all subsets with bitmask ===
def all_subsets_bitmask(nums):
n = len(nums)
result = []
for mask in range(1 << n): # 0 to 2^n - 1
subset = []
for i in range(n):
            if mask & (1 << i): # bit i is set → include nums[i]
subset.append(nums[i])
result.append(subset)
    return result
📦 Quick Cheatsheet — Section 6 Complete
# Two Pointers
left, right = 0, len(arr)-1 # converging
slow = 0; for fast in ...: if condition: slow += 1 # fast/slow
# Sliding Window
for right in ...: add; while window invalid: left += 1
# Binary Search
mid = left + (right - left) // 2
# BFS
queue = deque([start]); visited = {start}
while queue: node = queue.popleft()
# DFS recursive
def dfs(node): visited.add(node); [dfs(n) for n in graph[node] if n not in visited]
# Backtracking
def bt(start, path):
result.append(path[:]) # copy current path
for i in range(start, len(choices)):
path.append(choices[i]) # choose
bt(i + 1, path) # explore
path.pop() # undo (backtrack)
# DP
dp[i] = max/min(dp[i-1], dp[i-2] + something)
# Union-Find
uf = UnionFind(n); uf.union(u, v); uf.find(x)
# Prefix Sum
prefix[i+1] = prefix[i] + nums[i]
range_sum = prefix[r+1] - prefix[l]
# Bit tricks
x & (x-1) # clear lowest bit
x & (-x) # isolate lowest bit
a ^ a == 0 # XOR cancels pairs
SECTION 7 — REAL BACKEND PRODUCTION SCENARIOS
These are patterns you will encounter in system design rounds, coding rounds with real-world context, and senior engineer interviews. Each scenario shows the Python approach, the reasoning, and production-grade code.
7.1 LRU Cache Implementation
Problem
Design a cache that evicts the Least Recently Used item when it reaches capacity. Used in: database query caches, CDN edge caches, API response caches.
Theory
Requirements:
- get(key): O(1)
- put(key, value): O(1)
- Evict LRU item when full
Data structures needed:
- HashMap: O(1) key lookup
- Doubly Linked List: O(1) move-to-front and evict-from-back
OR
- OrderedDict: Python's built-in that combines both!
from collections import OrderedDict
# ============================================================
# APPROACH 1: OrderedDict (Pythonic ā use in interviews)
# ============================================================
class LRUCache:
"""
LRU Cache using Python's OrderedDict.
OrderedDict remembers insertion order + supports O(1) move_to_end.
"""
def __init__(self, capacity: int):
        self.cache = OrderedDict() # key → value, ordered by use
self.capacity = capacity
def get(self, key: int) -> int:
if key not in self.cache:
return -1 # cache miss
# Mark as recently used (move to end = most recent)
self.cache.move_to_end(key)
return self.cache[key]
def put(self, key: int, value: int) -> None:
if key in self.cache:
self.cache.move_to_end(key) # update position
self.cache[key] = value # insert/update value
if len(self.cache) > self.capacity:
# Remove least recently used (first item = oldest)
self.cache.popitem(last=False)
# Usage:
cache = LRUCache(2)
cache.put(1, 1) # cache: {1:1}
cache.put(2, 2) # cache: {1:1, 2:2}
cache.get(1) # 1, cache: {2:2, 1:1} (1 moved to end)
cache.put(3, 3) # evicts 2, cache: {1:1, 3:3}
cache.get(2) # -1 (not found)
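Worth knowing for the same interview: when the thing being cached is a function's results, the stdlib already ships an LRU via functools.lru_cache. A quick sketch of its eviction behavior:

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=2)
def square(n):
    calls.append(n)       # records only actual executions (cache misses)
    return n * n

square(2); square(3)      # both computed
square(2)                 # cache hit: 2 becomes most recently used
square(4)                 # cache full: evicts 3 (least recently used)
square(3)                 # recomputed -- it was evicted
# calls == [2, 3, 4, 3]
```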
# ============================================================
# APPROACH 2: Manual Doubly Linked List + HashMap
# (Shows deep understanding ā impressive in interviews)
# ============================================================
class DLLNode:
"""Doubly Linked List node."""
def __init__(self, key=0, val=0):
self.key = key
self.val = val
self.prev = None
self.next = None
class LRUCacheManual:
"""
Manual implementation with DLL + HashMap.
DLL Layout:
    HEAD → [LRU] → ... → [MRU] → TAIL
- HEAD.next = least recently used
- TAIL.prev = most recently used
- Use dummy HEAD and TAIL to simplify edge cases
"""
def __init__(self, capacity: int):
self.capacity = capacity
        self.cache = {} # key → DLLNode
# Dummy sentinel nodes (avoid edge case checks)
self.head = DLLNode() # LRU end
self.tail = DLLNode() # MRU end
self.head.next = self.tail
self.tail.prev = self.head
def _remove(self, node: DLLNode):
"""Remove node from its current position in DLL."""
prev, next_ = node.prev, node.next
prev.next = next_
next_.prev = prev
def _add_to_tail(self, node: DLLNode):
"""Add node right before tail (most recently used position)."""
prev = self.tail.prev
prev.next = node
node.prev = prev
node.next = self.tail
self.tail.prev = node
def get(self, key: int) -> int:
if key not in self.cache:
return -1
node = self.cache[key]
self._remove(node) # remove from current position
self._add_to_tail(node) # move to MRU position
return node.val
def put(self, key: int, value: int) -> None:
if key in self.cache:
node = self.cache[key]
node.val = value
self._remove(node)
self._add_to_tail(node)
else:
node = DLLNode(key, value)
self.cache[key] = node
self._add_to_tail(node)
if len(self.cache) > self.capacity:
# Evict LRU (node after dummy head)
lru = self.head.next
self._remove(lru)
                del self.cache[lru.key]
7.2 Rate Limiter
Problem
Limit users to N requests per time window. Used in: API gateways, authentication endpoints, payment APIs.
import time
from collections import defaultdict, deque
# ============================================================
# APPROACH 1: Sliding Window Rate Limiter
# ============================================================
class SlidingWindowRateLimiter:
"""
Allow at most max_requests per window_seconds.
Uses a deque to store timestamps of recent requests.
Production use: Redis sorted sets replace the deque.
"""
def __init__(self, max_requests: int, window_seconds: int):
self.max_requests = max_requests
self.window = window_seconds
        # user_id → deque of request timestamps
self.requests = defaultdict(deque)
def is_allowed(self, user_id: str) -> bool:
"""
Returns True if request is allowed, False if rate limited.
Time: O(1) amortized
"""
now = time.time()
window_start = now - self.window
user_requests = self.requests[user_id]
# Remove timestamps outside the current window
while user_requests and user_requests[0] < window_start:
user_requests.popleft() # O(1) deque removal
if len(user_requests) < self.max_requests:
user_requests.append(now) # record this request
return True # allowed
return False # rate limited
# Usage:
limiter = SlidingWindowRateLimiter(max_requests=5, window_seconds=60)
user = "user_123"
for i in range(7):
allowed = limiter.is_allowed(user)
print(f"Request {i+1}: {'ā
Allowed' if allowed else 'ā Blocked'}")
# Requests 1-5: Allowed, Requests 6-7: Blocked
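One production caveat worth raising (an aside, not a change to the limiter above): time.time() follows the wall clock, which can jump if the system clock is adjusted; time.monotonic() is the safer choice for measuring windows and elapsed time:

```python
import time

start = time.monotonic()
time.sleep(0.01)
elapsed = time.monotonic() - start   # unaffected by wall-clock adjustments
```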
# ============================================================
# APPROACH 2: Token Bucket Rate Limiter
# ============================================================
class TokenBucketRateLimiter:
"""
Tokens refill at a constant rate. Allows bursts up to bucket capacity.
    More sophisticated: handles traffic bursts gracefully.
"""
def __init__(self, rate: float, capacity: int):
self.rate = rate # tokens added per second
self.capacity = capacity # max tokens in bucket
        self.tokens = defaultdict(lambda: capacity) # user → current tokens
        self.last_refill = defaultdict(float) # user → last refill time
def _refill(self, user_id: str) -> None:
"""Refill tokens based on time elapsed since last request."""
now = time.time()
elapsed = now - self.last_refill[user_id]
# Add tokens proportional to elapsed time
new_tokens = elapsed * self.rate
self.tokens[user_id] = min(
self.capacity,
self.tokens[user_id] + new_tokens
)
self.last_refill[user_id] = now
def is_allowed(self, user_id: str, cost: int = 1) -> bool:
"""Returns True if request is allowed. Each request costs tokens."""
self._refill(user_id)
if self.tokens[user_id] >= cost:
self.tokens[user_id] -= cost
return True
        return False
7.3 Caching with TTL (Time-To-Live)
Problem
Cache results but expire them after a time period. Used in: API response caching, session storage, computed result caching.
import time
import functools
from threading import Lock
# ============================================================
# APPROACH 1: Simple Dict-Based TTL Cache
# ============================================================
class TTLCache:
"""
Cache with time-to-live expiry per key.
Thread-safe with Lock for production use.
"""
def __init__(self, default_ttl: float = 300):
        self.store = {} # key → (value, expiry_timestamp)
self.default_ttl = default_ttl
self.lock = Lock() # thread safety
def get(self, key: str):
"""Get value if not expired. Returns None if missing/expired."""
with self.lock:
if key not in self.store:
return None
value, expiry = self.store[key]
if time.time() > expiry: # expired!
del self.store[key] # lazy deletion
return None
return value
def set(self, key: str, value, ttl: float = None) -> None:
"""Set value with TTL seconds until expiry."""
with self.lock:
ttl = ttl or self.default_ttl
expiry = time.time() + ttl
self.store[key] = (value, expiry)
def delete(self, key: str) -> bool:
"""Manually delete a key. Returns True if existed."""
with self.lock:
if key in self.store:
del self.store[key]
return True
return False
def cleanup_expired(self) -> int:
"""Remove all expired keys. Call periodically."""
with self.lock:
now = time.time()
expired_keys = [
k for k, (_, exp) in self.store.items()
if now > exp
]
for key in expired_keys:
del self.store[key]
return len(expired_keys) # number of keys removed
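Stripped to its essence, the lazy-expiry idea is just a (value, expiry) tuple per key; a minimal sketch without locking (helper names here are illustrative, not part of the class above):

```python
import time

store = {}

def set_key(key, value, ttl):
    store[key] = (value, time.time() + ttl)

def get_key(key):
    """Return value or None; expired entries are deleted on read."""
    if key not in store:
        return None
    value, expiry = store[key]
    if time.time() > expiry:
        del store[key]        # lazy deletion
        return None
    return value

set_key("session", "abc", ttl=30)
get_key("session")            # "abc" while fresh; None once 30s have passed
```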
# ============================================================
# APPROACH 2: TTL Cache as Decorator
# ============================================================
def ttl_cache(ttl_seconds: float = 300):
"""
Decorator that caches function results with TTL.
Cache key = function arguments (must be hashable).
"""
def decorator(func):
        cache = {} # args → (result, expiry)
@functools.wraps(func)
def wrapper(*args, **kwargs):
# Create hashable cache key from args
key = (args, tuple(sorted(kwargs.items())))
now = time.time()
# Check cache hit
if key in cache:
result, expiry = cache[key]
if now < expiry: # not expired
return result # cache hit!
            # Cache miss: compute result
result = func(*args, **kwargs)
cache[key] = (result, now + ttl_seconds)
return result
# Expose cache management methods
def invalidate(*args, **kwargs):
"""Manually invalidate specific cached result."""
key = (args, tuple(sorted(kwargs.items())))
cache.pop(key, None)
wrapper.cache = cache
wrapper.invalidate = invalidate
return wrapper
return decorator
# Usage:
@ttl_cache(ttl_seconds=60)
def fetch_user_from_db(user_id: int) -> dict:
"""Expensive DB call ā cached for 60 seconds."""
print(f"Fetching user {user_id} from DB...")
return {"id": user_id, "name": f"User_{user_id}"}
fetch_user_from_db(42) # DB call made
fetch_user_from_db(42) # Cache hit!
fetch_user_from_db.invalidate(42) # Bust cache for user 42
fetch_user_from_db(42) # DB call made again
7.4 Processing Large Files with Generators
Problem
Process a multi-GB log file without loading it into memory. Used in: log analysis, ETL pipelines, data streaming.
import json
from pathlib import Path
from typing import Generator, Iterator
# ============================================================
# Pattern: Generator Pipeline
# ============================================================
def read_lines(filepath: str) -> Generator[str, None, None]:
"""
Stage 1: Read file line by line.
    Memory usage: O(1); only one line in memory at a time.
"""
with open(filepath, 'r', encoding='utf-8') as f:
for line in f:
yield line.rstrip('\n') # yield one line, never load all
def parse_log_line(lines: Iterator[str]) -> Generator[dict, None, None]:
"""
Stage 2: Parse each log line into structured dict.
Skips malformed lines gracefully.
"""
for line in lines:
if not line.strip():
continue # skip empty lines
try:
# Example log format: JSON per line
yield json.loads(line)
except json.JSONDecodeError:
# Could log this to error handler
continue # skip malformed lines
def filter_by_level(
records: Iterator[dict],
level: str
) -> Generator[dict, None, None]:
"""Stage 3: Filter by log level."""
for record in records:
if record.get('level', '').upper() == level.upper():
yield record
def enrich_record(records: Iterator[dict]) -> Generator[dict, None, None]:
"""Stage 4: Add computed fields."""
for record in records:
# Add any enrichment
record['timestamp_epoch'] = record.get('timestamp', 0)
record['is_critical'] = record.get('level') in ('ERROR', 'CRITICAL')
yield record
def batch_records(
records: Iterator[dict],
batch_size: int = 1000
) -> Generator[list, None, None]:
"""Stage 5: Group records into batches for bulk DB inserts."""
batch = []
for record in records:
batch.append(record)
if len(batch) >= batch_size:
yield batch
batch = [] # reset batch
if batch: # yield remaining
yield batch
# ============================================================
# The Pipeline: NO intermediate lists created!
# ============================================================
def process_large_log(filepath: str, level: str = 'ERROR') -> int:
"""
Process a multi-GB log file using generator pipeline.
Total memory: O(batch_size) regardless of file size.
"""
total_processed = 0
    # Build pipeline: no execution yet!
lines = read_lines(filepath)
records = parse_log_line(lines)
filtered = filter_by_level(records, level)
enriched = enrich_record(filtered)
batches = batch_records(enriched, batch_size=500)
for batch in batches:
# Simulate bulk insert to database
insert_to_db(batch)
total_processed += len(batch)
print(f"Processed {total_processed} records...")
return total_processed
def insert_to_db(batch: list) -> None:
"""Simulated bulk DB insert."""
pass # In production: db.bulk_insert(batch)
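The "no execution yet" claim above is the key property of this pattern; a tiny self-contained demonstration of generator laziness:

```python
events = []

def double_stage(xs):
    events.append("started")       # runs only when iteration begins
    for x in xs:
        yield x * 2

pipeline = double_stage(range(3))  # generator created, body NOT started
assert events == []                # building the pipeline did no work
result = list(pipeline)            # consuming it finally runs the stage
# result == [0, 2, 4]
```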
# ============================================================
# Reading CSV with generator (alternative to pandas for huge files)
# ============================================================
import csv
def read_csv_chunked(
filepath: str,
chunk_size: int = 10000
) -> Generator[list, None, None]:
"""
Read CSV in chunks. Each chunk is a list of rows.
Memory efficient alternative to pandas read_csv for huge files.
"""
with open(filepath, 'r', newline='') as f:
reader = csv.DictReader(f)
chunk = []
for row in reader:
chunk.append(dict(row))
if len(chunk) >= chunk_size:
yield chunk
chunk = []
if chunk:
yield chunk # yield last partial chunk
# Usage:
for chunk in read_csv_chunked('sales_data.csv', chunk_size=5000):
# Process 5000 rows at a time
    totals = sum(float(row['amount']) for row in chunk)
7.5 Priority Task Queue
Problem
Execute tasks in priority order. Used in: job schedulers, event processing, async task managers (Celery-like).
import heapq
import time
import itertools
from dataclasses import dataclass, field
from typing import Any, Callable
from enum import IntEnum
class Priority(IntEnum):
CRITICAL = 0 # processed first
HIGH = 1
MEDIUM = 2
LOW = 3
@dataclass(order=True)
class Task:
"""
Task with priority ordering.
@dataclass(order=True) enables < > comparison based on fields in order.
"""
priority: Priority
sequence: int # tiebreaker (FIFO within same priority)
name: str = field(compare=False) # not used for comparison
func: Callable = field(compare=False)
args: tuple = field(default_factory=tuple, compare=False)
kwargs: dict = field(default_factory=dict, compare=False)
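To make the order=True / compare=False interaction concrete, a standalone sketch (the Job class here is illustrative, not part of the queue):

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int
    seq: int
    name: str = field(compare=False)   # excluded from ordering

a = Job(1, 0, "first-high")
b = Job(2, 1, "low")
c = Job(1, 2, "later-high")
order = sorted([b, c, a])
# order == [a, c, b]: compared field by field (priority, then seq); name ignored
```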
class PriorityTaskQueue:
"""
Thread-safe priority task queue using heapq.
Tasks at same priority are processed FIFO (using sequence counter).
"""
def __init__(self):
self._heap = []
self._counter = itertools.count() # unique sequence number
self._stats = {
'submitted': 0,
'completed': 0,
'failed': 0
}
def submit(
self,
func: Callable,
*args,
priority: Priority = Priority.MEDIUM,
name: str = "",
**kwargs
) -> None:
"""Add task to queue."""
task = Task(
priority=priority,
sequence=next(self._counter), # ensures FIFO within same priority
name=name or func.__name__,
func=func,
args=args,
kwargs=kwargs
)
heapq.heappush(self._heap, task)
self._stats['submitted'] += 1
def execute_next(self) -> Any:
"""Execute highest priority task. Returns result."""
if not self._heap:
raise IndexError("Task queue is empty")
task = heapq.heappop(self._heap)
try:
result = task.func(*task.args, **task.kwargs)
self._stats['completed'] += 1
return result
except Exception as e:
self._stats['failed'] += 1
print(f"Task '{task.name}' failed: {e}")
raise
def execute_all(self) -> list:
"""Execute all tasks in priority order."""
results = []
while self._heap:
try:
results.append(self.execute_next())
except Exception:
continue # log and keep going
return results
    def peek(self):
        """View highest priority task without removing it (None if empty)."""
if not self._heap:
return None
return self._heap[0]
def size(self) -> int:
return len(self._heap)
def stats(self) -> dict:
return {**self._stats, 'pending': self.size()}
# Usage:
queue = PriorityTaskQueue()
def send_email(to: str, subject: str):
print(f"Sending email to {to}: {subject}")
return True
def generate_report(report_id: int):
print(f"Generating report {report_id}")
return f"report_{report_id}.pdf"
def cleanup_temp_files():
print("Cleaning temp files")
# Submit tasks with different priorities
queue.submit(send_email, "ceo@company.com", "System Down!",
priority=Priority.CRITICAL, name="alert_email")
queue.submit(generate_report, 42,
priority=Priority.LOW, name="monthly_report")
queue.submit(send_email, "user@example.com", "Welcome!",
priority=Priority.HIGH, name="welcome_email")
queue.submit(cleanup_temp_files,
priority=Priority.LOW, name="cleanup")
# Execute all in priority order:
queue.execute_all()
# Output order: alert_email → welcome_email → monthly_report → cleanup
7.6 Retry Logic with Exponential Backoff
Problem
Automatically retry failed operations with increasing delays. Used in: HTTP clients, database connections, external API calls.
import time
import random
import functools
from typing import Type, Tuple
class RetryExhausted(Exception):
"""Raised when all retry attempts fail."""
pass
def retry(
max_attempts: int = 3,
base_delay: float = 1.0,
max_delay: float = 60.0,
backoff_factor: float = 2.0,
jitter: bool = True,
exceptions: Tuple[Type[Exception], ...] = (Exception,)
):
"""
Decorator: retry function with exponential backoff.
Delay schedule: base_delay * backoff_factor^(attempt - 1)
With jitter: adds random ±20% to prevent thundering herd.
Args:
max_attempts: total tries (including first)
base_delay: initial wait in seconds
max_delay: maximum wait between retries
backoff_factor: multiply delay by this each retry
jitter: add randomness to delays
exceptions: only retry on these exception types
"""
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
last_exception = None
for attempt in range(1, max_attempts + 1):
try:
return func(*args, **kwargs) # try the call
except exceptions as e:
last_exception = e
if attempt == max_attempts:
break # no more retries
# Calculate delay with exponential backoff
delay = min(
base_delay * (backoff_factor ** (attempt - 1)),
max_delay
)
# Add jitter: ±20% of delay (prevents thundering herd)
if jitter:
delay *= (0.8 + random.random() * 0.4)
print(
f"Attempt {attempt}/{max_attempts} failed: {e}. "
f"Retrying in {delay:.2f}s..."
)
time.sleep(delay)
raise RetryExhausted(
f"All {max_attempts} attempts failed. "
f"Last error: {last_exception}"
) from last_exception
return wrapper
return decorator
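The delay schedule formula is worth computing by hand once; a sketch of the first five delays without jitter, using sample parameters:

```python
base_delay, backoff_factor, max_delay = 0.5, 2.0, 4.0
delays = [
    min(base_delay * backoff_factor ** (attempt - 1), max_delay)
    for attempt in range(1, 6)
]
# delays == [0.5, 1.0, 2.0, 4.0, 4.0] -- exponential growth, capped at max_delay
```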
# Usage:
@retry(max_attempts=5, base_delay=0.5, exceptions=(ConnectionError, TimeoutError))
def call_external_api(endpoint: str) -> dict:
"""Calls external API with automatic retry."""
import random
if random.random() < 0.7: # 70% chance of failure (for demo)
raise ConnectionError("Connection timeout")
return {"status": "success", "data": "..."}
# ============================================================
# Attempt-loop version for more control
# (Note: a @contextmanager cannot implement retry. A `with`
# block's body runs exactly once, and a generator-based context
# manager that yields again after an exception is thrown into it
# raises RuntimeError. Iterate over attempts instead.)
# ============================================================
def retry_attempts(max_attempts: int = 3, delay: float = 1.0):
    """
    Yield attempt numbers, sleeping with exponential backoff between them.
    Usage:
        for attempt in retry_attempts(3, 1.0):
            try:
                do_work()
                break                  # success: stop retrying
            except ConnectionError:
                pass                   # next attempt after backoff sleep
        else:
            raise RetryExhausted("all attempts failed")
    """
    for attempt in range(max_attempts):
        yield attempt
        time.sleep(delay * (2 ** attempt))  # skipped when the caller breaks
7.7 Pagination with Generators and islice
Problem
Return results in pages rather than all at once. Used in: REST APIs, database cursors, search results.
import itertools
from typing import Iterator, TypeVar
T = TypeVar('T')
# ============================================================
# Pattern 1: Offset-based Pagination
# ============================================================
def paginate_list(
data: list,
page: int,
page_size: int
) -> dict:
"""
Offset-based pagination on in-memory list.
page is 1-indexed.
"""
total = len(data)
start = (page - 1) * page_size
end = start + page_size
items = data[start:end]
return {
"items": items,
"page": page,
"page_size": page_size,
"total": total,
"total_pages": (total + page_size - 1) // page_size,
"has_next": end < total,
"has_prev": page > 1
}
# ============================================================
# Pattern 2: Generator-based Pagination (memory efficient)
# ============================================================
def chunked(iterable: Iterator[T], size: int) -> Iterator[list]:
"""
Split any iterable into chunks of given size.
Works with generators ā no need to load all data!
Used for: batch DB inserts, API batch requests, processing large datasets
"""
iterator = iter(iterable)
while True:
chunk = list(itertools.islice(iterator, size))
if not chunk:
break
yield chunk
# Usage with generator (memory efficient):
def fetch_all_users_from_db() -> Iterator[dict]:
"""Simulates DB cursor ā yields one row at a time."""
for i in range(1, 10001): # 10,000 users
yield {"id": i, "name": f"User_{i}", "email": f"user{i}@example.com"}
# Process 10,000 users 100 at a time, in constant memory!
for page_num, batch in enumerate(chunked(fetch_all_users_from_db(), 100), start=1):
print(f"Processing page {page_num}: {len(batch)} users")
# send_to_email_service(batch)
# ============================================================
# Pattern 3: Cursor-based Pagination (for APIs, scalable)
# ============================================================
def cursor_paginate(
data: list,
cursor: int,
limit: int
) -> dict:
"""
    Cursor-based pagination: more stable than offset-based.
cursor = ID of last seen item.
Advantages over offset:
- Stable when data is inserted/deleted between pages
- More efficient for DB (index seek vs full scan)
"""
# Find items after cursor
if cursor == 0:
items = data[:limit]
else:
# Find cursor position
cursor_pos = next(
(i for i, item in enumerate(data) if item['id'] == cursor),
-1
)
items = data[cursor_pos + 1: cursor_pos + 1 + limit]
next_cursor = items[-1]['id'] if len(items) == limit else None
return {
"items": items,
"next_cursor": next_cursor, # None = no more pages
"has_more": next_cursor is not None
}
# Usage:
data = [{"id": i, "value": f"item_{i}"} for i in range(1, 101)]
page1 = cursor_paginate(data, cursor=0, limit=10) # first page
page2 = cursor_paginate(data, cursor=page1["next_cursor"], limit=10)
7.8 Event System with Callbacks
Problem
Implement a pub-sub (publish-subscribe) event system. Used in: microservices, UI frameworks, plugin systems.
from collections import defaultdict
from typing import Callable, Any
import functools
class EventEmitter:
"""
Simple synchronous event emitter (pub-sub pattern).
Similar to Node.js EventEmitter or Python's blinker library.
Used in: plugin systems, state management, decoupled components.
"""
def __init__(self):
        # event_name → list of (priority, callback) pairs
self._listeners = defaultdict(list)
self._once_listeners = defaultdict(set) # callbacks to fire only once
def on(self, event: str, callback: Callable, priority: int = 0) -> 'EventEmitter':
"""Register callback for event. Returns self for chaining."""
self._listeners[event].append((priority, callback))
# Sort by priority (lower number = higher priority)
self._listeners[event].sort(key=lambda x: x[0])
return self
def once(self, event: str, callback: Callable) -> 'EventEmitter':
"""Register callback that fires only once."""
self._once_listeners[event].add(callback)
return self.on(event, callback)
def off(self, event: str, callback: Callable = None) -> 'EventEmitter':
"""Remove callback(s) for event."""
if callback is None:
self._listeners[event] = [] # remove all listeners for event
else:
self._listeners[event] = [
(p, cb) for p, cb in self._listeners[event]
if cb != callback
]
return self
def emit(self, event: str, *args, **kwargs) -> int:
"""
Fire all listeners for event with given args.
Returns number of listeners called.
"""
listeners = self._listeners.get(event, [])
called = 0
for priority, callback in listeners:
try:
callback(*args, **kwargs)
called += 1
except Exception as e:
print(f"Listener error on '{event}': {e}")
# Remove once-listeners after firing
if event in self._once_listeners:
for once_cb in self._once_listeners[event]:
self.off(event, once_cb)
del self._once_listeners[event]
return called
def listeners(self, event: str) -> list:
"""Get all callbacks for an event."""
return [cb for _, cb in self._listeners.get(event, [])]
# ============================================================
# Usage Example: Order Processing System
# ============================================================
emitter = EventEmitter()
# Listeners (decoupled modules don't know about each other)
def send_confirmation_email(order_id: str, user_email: str):
print(f"Email sent to {user_email} for order {order_id}")
def update_inventory(order_id: str, items: list):
print(f"Inventory updated for order {order_id}: {items}")
def notify_warehouse(order_id: str, **kwargs):
print(f"Warehouse notified for order {order_id}")
def log_order(order_id: str, **kwargs):
print(f"Order logged: {order_id}")
# Register listeners
emitter.on('order.created', log_order, priority=0) # fires first
emitter.on('order.created', send_confirmation_email, priority=1)
emitter.on('order.created', update_inventory, priority=2)
emitter.on('order.created', notify_warehouse, priority=3)
# Fire once on first order (analytics)
emitter.once('order.created', lambda **kw: print("First order analytics!"))
# Emit event → all listeners called in priority order
emitter.emit(
'order.created',
order_id='ORD-001',
user_email='alice@example.com',
items=['item_1', 'item_2']
)
7.9 Frequency Analysis with Counter
Problem
Analyze frequency distributions in data streams. Used in: analytics dashboards, recommendation systems, anomaly detection.
from collections import Counter
import heapq
from typing import List, Dict
class FrequencyAnalyzer:
"""
Real-time frequency tracker for streaming data.
Used for: user behavior analytics, search queries, error tracking.
"""
def __init__(self, top_k: int = 10):
self.counter = Counter()
self.top_k = top_k
self.total = 0
def record(self, item: str, count: int = 1) -> None:
"""Record occurrence of item."""
self.counter[item] += count
self.total += count
def record_batch(self, items: list) -> None:
"""Record multiple items at once."""
self.counter.update(items)
self.total += len(items)
def top_items(self, k: int = None) -> List[tuple]:
"""Get top k most frequent items with percentages."""
k = k or self.top_k
top = self.counter.most_common(k)
return [(item, count, count/self.total*100) for item, count in top]
def frequency(self, item: str) -> float:
"""Get frequency (0.0 to 1.0) of an item."""
return self.counter[item] / self.total if self.total > 0 else 0.0
def merge(self, other: 'FrequencyAnalyzer') -> None:
"""Merge another analyzer's data into this one."""
self.counter += other.counter
self.total += other.total
def percentile(self, p: float) -> int:
"""
Find count value at given percentile.
e.g., percentile(0.9) = count exceeded by top 10% of items
"""
counts = sorted(self.counter.values(), reverse=True)
idx = int(len(counts) * (1 - p))
return counts[min(idx, len(counts) - 1)]
def anomalies(self, threshold_multiplier: float = 3.0) -> list:
"""
Find items with unusually high frequency.
Returns items where frequency > threshold_multiplier × average.
"""
if not self.counter:
return []
avg = self.total / len(self.counter)
return [
(item, count)
for item, count in self.counter.items()
if count > avg * threshold_multiplier
]
# Usage:
analyzer = FrequencyAnalyzer(top_k=5)
# Simulate streaming search queries
queries = ["python", "java", "python", "javascript",
"python", "java", "rust", "python", "go", "java"]
analyzer.record_batch(queries)
print("Top search queries:")
for item, count, pct in analyzer.top_items():
print(f" {item:15} {count:4} queries ({pct:.1f}%)")
print(f"\nPython frequency: {analyzer.frequency('python'):.1%}")
print(f"Anomalies: {analyzer.anomalies(threshold_multiplier=2.0)}")
7.10 Config Management with ChainMap
Problem
Manage layered configuration (defaults → config file → env vars → CLI args). Used in: application configuration, multi-tenant systems, feature flags.
import os
from collections import ChainMap
from typing import Any
class ConfigManager:
"""
Layered configuration system using ChainMap.
Priority (highest to lowest):
1. CLI arguments (runtime overrides)
2. Environment variables
3. Config file values
4. Default values
ChainMap searches layers in order → first match wins.
"""
def __init__(self, defaults: dict = None):
self.defaults = defaults or {}
self.file_config = {}
self.env_config = {}
self.runtime_config = {}
self._build_config()
def _build_config(self):
"""Rebuild ChainMap from all layers."""
self.config = ChainMap(
self.runtime_config, # highest priority
self.env_config,
self.file_config,
self.defaults # lowest priority
)
def load_from_env(self, prefix: str = "APP_") -> None:
"""Load environment variables with given prefix (values remain strings; cast as needed)."""
self.env_config = {
key[len(prefix):].lower(): value
for key, value in os.environ.items()
if key.startswith(prefix)
}
self._build_config()
def load_from_dict(self, config_dict: dict) -> None:
"""Load from config file (parsed as dict)."""
self.file_config = config_dict
self._build_config()
def set(self, key: str, value: Any) -> None:
"""Set runtime override (highest priority)."""
self.runtime_config[key] = value
def get(self, key: str, default: Any = None) -> Any:
"""Get config value searching all layers."""
return self.config.get(key, default)
def __getitem__(self, key: str) -> Any:
return self.config[key]
def show_layers(self) -> dict:
"""Debug: show which layer each key comes from."""
result = {}
for key in self.config:
for i, layer in enumerate(self.config.maps):
if key in layer:
layer_names = ['runtime', 'env', 'file', 'defaults']
result[key] = {
'value': layer[key],
'source': layer_names[i]
}
break
return result
# Usage:
defaults = {
'host': 'localhost',
'port': 8080,
'debug': False,
'db_pool_size': 5,
'cache_ttl': 300
}
config = ConfigManager(defaults=defaults)
config.load_from_dict({'port': 9000, 'cache_ttl': 600}) # from config.yaml
config.load_from_env(prefix="APP_") # from env vars
config.set('debug', True) # CLI override
print(config.get('host')) # 'localhost' (from defaults)
print(config.get('port')) # 9000 (from file, overrides default)
print(config.get('debug')) # True (from runtime, highest priority)
print(config.show_layers()) # shows source of each config key
📦 Quick Cheatsheet — Backend Patterns
# LRU Cache
from collections import OrderedDict
cache = OrderedDict()
cache.move_to_end(key) # mark as recently used
cache.popitem(last=False) # evict least recently used
# Rate Limiter (sliding window)
requests = defaultdict(deque)
while requests[user] and requests[user][0] < now - window: requests[user].popleft()
if len(requests[user]) < limit: requests[user].append(now)
# TTL Cache
cache[key] = (value, time.time() + ttl)
if time.time() > expiry: del cache[key] # lazy expiry
# Generator pipeline
lines = read_lines(file) # lazy
records = parse(lines) # lazy
filtered = filter_fn(records) # lazy
for batch in chunked(filtered, 1000): # batched
process(batch)
# Priority Queue
heapq.heappush(heap, (priority, counter, task))
# Retry with backoff
delay = min(base * (factor ** attempt), max_delay)
if jitter: delay *= (0.8 + random.random() * 0.4)
# ChainMap layered config
config = ChainMap(runtime, env, file, defaults)
config['key'] # first match wins
SECTION 8 — INTERVIEW QUICK REFERENCE
8.1 One-Liner Cheatsheet
# ============================================================
# FREQUENCY & COUNTING
# ============================================================
from collections import Counter, defaultdict
Counter("abracadabra") # char frequency
Counter(lst).most_common(k) # top k elements
freq = defaultdict(int) # auto-init to 0
freq[item] += 1 # increment count
{k: v for k, v in Counter(s).items() if v > 1} # duplicates
# ============================================================
# LIST MANIPULATIONS
# ============================================================
lst[::-1] # reverse
sorted(lst, key=lambda x: (-x[1], x[0])) # multi-key sort
list(set(lst)) # remove duplicates (order lost)
[x for x in lst if x > 0] # filter positives
[abs(x) for x in lst] # map abs
# Flatten nested list:
[x for sub in nested for x in sub]
# Partition list:
evens, odds = [x for x in lst if x%2==0], [x for x in lst if x%2!=0]
# Running max:
import itertools
list(itertools.accumulate(lst, max))
# ============================================================
# STRING TRICKS
# ============================================================
''.join(sorted(word)) # anagram key
s[::-1] == s # palindrome check
''.join(reversed(s)) # reverse string
s.count('a') # count char
sum(c.isdigit() for c in s) # count digits
' '.join(s.split()) # normalize whitespace
# Char frequency for anagram:
Counter(s1) == Counter(s2)
# All unique chars?
len(set(s)) == len(s)
# ============================================================
# DICTIONARY
# ============================================================
d.get(k, 0) # safe get with default
d.setdefault(k, []).append(v) # group items
{v: k for k, v in d.items()} # invert dict
sorted(d, key=d.get, reverse=True) # sort keys by value
{k: v for k, v in d.items() if v > 0} # filter dict
# Merge dicts:
{**d1, **d2} # d2 wins conflicts (3.9+: d1 | d2)
# ============================================================
# MATH & NUMBERS
# ============================================================
import math, functools
pow(base, exp, mod) # fast modular exponentiation
math.gcd(a, b) # greatest common divisor
math.isqrt(n) # integer square root
x & (x - 1) == 0 # is power of 2
bin(n).count('1') # count set bits
functools.reduce(lambda a,b: a*b, lst) # product of list
# Digit sum:
sum(int(d) for d in str(n))
# ============================================================
# MATRIX OPERATIONS
# ============================================================
list(zip(*matrix)) # transpose
[[row[i] for row in m] for i in range(len(m[0]))] # transpose manual
# Flatten matrix:
[cell for row in matrix for cell in row]
# Create nĆm zero matrix:
[[0]*m for _ in range(n)]
# ============================================================
# HEAP (PRIORITY QUEUE)
# ============================================================
import heapq
heapq.nlargest(k, lst) # k largest
heapq.nsmallest(k, lst) # k smallest
heapq.heappush(h, -val) # max-heap trick
# K most frequent:
Counter(lst).most_common(k)
# ============================================================
# BINARY SEARCH
# ============================================================
import bisect
bisect.bisect_left(arr, target) # leftmost insertion point
bisect.bisect_right(arr, target) - 1 # rightmost occurrence
bisect.bisect_right(arr, hi) - bisect.bisect_left(arr, lo) # count in range
# ============================================================
# GRAPH & TREE
# ============================================================
from collections import defaultdict, deque
graph = defaultdict(list)
graph[u].append(v) # build adjacency list
# BFS shortest path:
dist = {start: 0}
q = deque([start])
while q:
n = q.popleft()
for nb in graph[n]:
if nb not in dist:
dist[nb] = dist[n] + 1
q.append(nb)
# DFS with visited:
visited = set()
def dfs(node): visited.add(node); [dfs(nb) for nb in graph[node] if nb not in visited]
# ============================================================
# ITERTOOLS PATTERNS
# ============================================================
import itertools
list(itertools.combinations(lst, r)) # C(n,r)
list(itertools.permutations(lst, r)) # P(n,r)
list(itertools.product(a, b)) # cartesian product
list(itertools.chain(*nested)) # flatten one level
list(itertools.accumulate(lst)) # prefix sums
list(itertools.islice(gen, k)) # first k from generator
8.2 Common Python Interview Gotchas
# ============================================================
# GOTCHA 1: Mutable Default Arguments
# ============================================================
# ❌ BUG: default list is shared across ALL calls!
def append_to(element, to=[]):
to.append(element)
return to
append_to(1) # [1]
append_to(2) # [1, 2] ← NOT [2]!
# ✅ FIX:
def append_to(element, to=None):
if to is None:
to = []
to.append(element)
return to
# ============================================================
# GOTCHA 2: Late Binding Closures
# ============================================================
# ❌ BUG: all lambdas share same variable 'i'
funcs = [lambda: i for i in range(5)]
[f() for f in funcs] # [4, 4, 4, 4, 4] ← all 4!
# ✅ FIX: capture current value of i
funcs = [lambda i=i: i for i in range(5)]
[f() for f in funcs] # [0, 1, 2, 3, 4]
# ============================================================
# GOTCHA 3: is vs == for None, True, False
# ============================================================
x = None
x == None # True (value comparison)
x is None # True (identity — PREFERRED for None)
# Why 'is None' is better:
class Tricky:
def __eq__(self, other): return True # overrides ==!
obj = Tricky()
obj == None # True ← WRONG
obj is None # False ← CORRECT
# Rule: Always use 'is' for None, True, False comparisons
# ============================================================
# GOTCHA 4: Integer Caching (-5 to 256)
# ============================================================
a = 256
b = 256
a is b # True — CPython caches small ints (-5 to 256)
a = 257
b = 257
a is b # False in a REPL — different objects! (a script may constant-fold these)
# Rule: NEVER use 'is' to compare integers (or strings)
# Always use == for value comparisons
# ============================================================
# GOTCHA 5: List Multiplication Shallow Copy
# ============================================================
# ❌ All rows are the SAME list object!
grid = [[0] * 3] * 3
grid[0][0] = 1
print(grid) # [[1,0,0], [1,0,0], [1,0,0]] ← all rows changed!
# ✅ Each row is independent:
grid = [[0] * 3 for _ in range(3)]
grid[0][0] = 1
print(grid) # [[1,0,0], [0,0,0], [0,0,0]] ← only first row changed
# ============================================================
# GOTCHA 6: Modifying Container While Iterating
# ============================================================
# ❌ DANGEROUS:
lst = [1, 2, 3, 4, 5]
for x in lst:
if x % 2 == 0:
lst.remove(x) # skips elements!
# Result: [1, 3, 5] — looks right here but is unreliable
# ❌ ALSO DANGEROUS with dict:
d = {'a': 1, 'b': 2, 'c': 3}
for k in d:
if d[k] > 1:
del d[k] # RuntimeError!
# ✅ FIX: iterate over a copy
for x in lst[:]: # copy of list
if x % 2 == 0: lst.remove(x)
for k in list(d.keys()): # copy of keys
if d[k] > 1: del d[k]
# ✅ BETTER: comprehension
lst = [x for x in lst if x % 2 != 0]
d = {k: v for k, v in d.items() if v <= 1}
# ============================================================
# GOTCHA 7: Shallow vs Deep Copy
# ============================================================
import copy
original = [[1, 2], [3, 4]]
shallow = original.copy() # or original[:]
shallow[0].append(99)
print(original) # [[1, 2, 99], [3, 4]] ← modified!
deep = copy.deepcopy(original)
deep[0].append(100)
print(original) # unchanged — deep copy is independent
# ============================================================
# GOTCHA 8: Global Interpreter Lock (GIL)
# ============================================================
# The GIL means only ONE Python thread executes Python code at a time.
#
# Implications:
# - threading doesn't speed up CPU-bound work
# - threading IS useful for I/O-bound work (GIL released during I/O)
# - multiprocessing bypasses GIL (separate processes)
# - asyncio for high-concurrency I/O (single thread, cooperative)
#
# Backend rule of thumb:
# - CPU-bound → multiprocessing or C extensions
# - I/O-bound → asyncio or threading
import threading
import multiprocessing
# ❌ Threading won't speed up CPU computation (GIL):
def cpu_task(n): return sum(range(n))
# ✅ Use multiprocessing for CPU work:
with multiprocessing.Pool() as pool:
results = pool.map(cpu_task, [10**7] * 4)
# ✅ Threading works for I/O:
def io_task(): import time; time.sleep(1) # simulates network call
# ============================================================
# GOTCHA 9: Float Comparison
# ============================================================
0.1 + 0.2 == 0.3 # False! (floating point)
0.1 + 0.2 # 0.30000000000000004
import math
math.isclose(0.1 + 0.2, 0.3) # True — the correct way
# In competitive programming:
abs((0.1 + 0.2) - 0.3) < 1e-9 # True
# ============================================================
# GOTCHA 10: String Immutability
# ============================================================
# ❌ O(n²) — creates a new string object each iteration!
result = ""
for char in "hello" * 1000:
result += char
# ✅ O(n) — collect then join
parts = []
for char in "hello" * 1000:
parts.append(char)
result = ''.join(parts)
# ============================================================
# GOTCHA 11: sort() vs sorted()
# ============================================================
lst = [3, 1, 2]
lst.sort() # IN PLACE, returns None
new = sorted(lst) # returns NEW list, original unchanged
# Common bug:
lst = sorted([3, 1, 2]) # ✅ correct
lst = [3, 1, 2].sort() # ❌ lst = None!
# ============================================================
# GOTCHA 12: enumerate and zip edge cases
# ============================================================
# zip stops at shortest:
list(zip([1,2,3], [10,20])) # [(1,10), (2,20)] ← 3 dropped!
# Use zip_longest to keep all:
from itertools import zip_longest
list(zip_longest([1,2,3], [10,20], fillvalue=0)) # [(1,10),(2,20),(3,0)]
8.3 Time & Space Complexity Master Table
PYTHON DATA STRUCTURE COMPLEXITY

| Structure | Operation | Time | Space | Notes |
|---|---|---|---|---|
| LIST [] | Index access | O(1) | O(1) | |
| | Index assign | O(1) | O(1) | |
| | Append | O(1)* | O(1) | *amortized |
| | Pop (last) | O(1) | O(1) | |
| | Pop (index i) | O(n) | O(1) | shifts elements |
| | Insert at i | O(n) | O(1) | shifts elements |
| | Remove by value | O(n) | O(1) | search + shift |
| | Search (in) | O(n) | O(1) | linear scan |
| | Sort | O(n log n) | O(n) | Timsort |
| | Reverse | O(n) | O(1) | in-place |
| | Slice | O(k) | O(k) | k = slice size |
| | len() | O(1) | O(1) | |
| DICT {} | Get d[k] | O(1) avg | | O(n) worst |
| | Set d[k]=v | O(1) avg | | O(n) worst; may resize |
| | Delete del d[k] | O(1) avg | | |
| | k in d | O(1) avg | | |
| | Iteration | O(n) | | |
| | Copy | O(n) | O(n) | |
| | .keys()/.values() | O(1) | | returns a view |
| SET {} | Add | O(1) avg | | |
| | Remove | O(1) avg | | KeyError if missing |
| | Discard | O(1) avg | | silent if missing |
| | x in s | O(1) avg | | |
| | Union s1\|s2 | O(n+m) | | |
| | Intersection s1&s2 | O(min(n,m)) | | |
| | Difference s1-s2 | O(n) | | |
| DEQUE | append | O(1) | | right end |
| | appendleft | O(1) | | left end |
| | pop | O(1) | | right end |
| | popleft | O(1) | | left end |
| | Access d[i] | O(n) | | NOT O(1)! |
| | rotate(k) | O(k) | | |
| HEAPQ (min-heap) | heappush | O(log n) | | |
| | heappop | O(log n) | | |
| | heapify | O(n) | | NOT O(n log n)! |
| | heap[0] peek | O(1) | | |
| | nlargest/nsmallest | O(n log k) | | |
| BISECT | bisect_left/right | O(log n) | | binary search |
| | insort | O(n) | | search O(log n), insert O(n) shifts |
| TUPLE () | Index access | O(1) | | same as list |
| | Search (in) | O(n) | | |
| | Slice | O(k) | | |
| | len() | O(1) | | |
| TRIE | insert | O(L) | | L = word length |
| | search | O(L) | | |
| | prefix search | O(L) | | |
| UNION-FIND (DSU) | find | O(α(n)) ≈ O(1) | | with path compression |
| | union | O(α(n)) ≈ O(1) | | with union by rank |
| GRAPHS | BFS | O(V+E) | O(V) | |
| | DFS | O(V+E) | O(V) | |
| | Dijkstra | O((V+E) log V) | O(V) | with heap |
| | Topological sort | O(V+E) | O(V) | |
8.4 Python-Specific Interview Questions
# ============================================================
# Q1: How does Python's garbage collection work?
# ============================================================
"""
Python uses TWO mechanisms:
1. REFERENCE COUNTING (primary):
- Every object has a reference count (ob_refcnt)
- Incremented when: assigned to variable, passed to function, put in container
- Decremented when: variable goes out of scope, del called, container changes
- When count hits 0 → object immediately deallocated
- Problem: can't handle REFERENCE CYCLES (a → b → a)
2. CYCLIC GARBAGE COLLECTOR (secondary):
- Runs periodically to find unreachable reference cycles
- Uses generational collection (3 generations)
- Generation 0: young objects → collected most frequently
- Generation 1: survived one collection
- Generation 2: long-lived objects → collected rarely
- Tracks only container objects (lists, dicts, etc.)
Generations: based on the "weak generational hypothesis" —
most objects die young.
"""
import gc
gc.collect() # manually trigger GC
gc.disable() # disable GC (when using ctypes, etc.)
gc.get_count() # (gen0_count, gen1_count, gen2_count)
# Reference cycle example:
import gc
class Node:
pass
a = Node()
b = Node()
a.ref = b # a references b
b.ref = a # b references a (cycle!)
del a, b # refcounts > 0, not collected by refcount!
gc.collect() # cyclic GC finds and collects these
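The split between the two mechanisms is easy to observe with `weakref` (CPython-specific behavior; `gc.disable()` keeps the cyclic collector out of the way during the demo):

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # isolate pure refcounting from the cyclic collector

# 1) Pure refcounting: freed the instant the last reference disappears
a = Node()
wr_a = weakref.ref(a)      # weak refs don't increase the refcount
del a
assert wr_a() is None      # already gone — no collector involved

# 2) A reference cycle: refcounts never reach zero on their own
b, c = Node(), Node()
b.ref, c.ref = c, b        # b → c → b
wr_b = weakref.ref(b)
del b, c
assert wr_b() is not None  # still alive: the cycle keeps refcounts at 1

gc.collect()               # the cyclic collector finds unreachable cycles
assert wr_b() is None
gc.enable()
```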
# ============================================================
# Q2: What is the GIL and when does it matter?
# ============================================================
"""
GIL = Global Interpreter Lock
A mutex that protects Python's object memory model.
Only ONE thread can execute Python bytecode at a time.
WHY it exists:
- CPython uses reference counting for memory management
- Without GIL, concurrent threads could corrupt refcounts
- GIL simplifies CPython internals significantly
WHEN it matters:
- CPU-bound multithreading → GIL prevents parallelism
- I/O-bound multithreading → GIL is released during I/O waits → fine!
- C extensions can release GIL (numpy, pandas do this)
SOLUTIONS:
- multiprocessing: separate processes, each has own GIL
- asyncio: cooperative concurrency (single thread, no GIL issue)
- Cython / C extensions: can release GIL manually
- PyPy: has different GIL behavior
- Python 3.13+: experimental "no-GIL" mode (PEP 703)
"""
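A minimal sketch of the I/O-bound case: four sleeps overlap when threaded, because a sleeping thread releases the GIL (`time.sleep` stands in for a network call here):

```python
import threading
import time

def io_task():
    time.sleep(0.2)  # stands in for a network/disk wait (GIL released)

# Sequential: 4 × 0.2s ≈ 0.8s of wall time
start = time.perf_counter()
for _ in range(4):
    io_task()
sequential = time.perf_counter() - start

# Threaded: the four waits overlap, so wall time ≈ 0.2s
start = time.perf_counter()
threads = [threading.Thread(target=io_task) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
assert threaded < sequential  # threads DO help for I/O-bound work
```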
# ============================================================
# Q3: How are dicts implemented in CPython?
# ============================================================
"""
CPython dict (Python 3.6+): "compact dict"
Two arrays:
1. indices array (sparse): [index_into_entries, ...]
- Indexed by: hash(key) % table_size
- Contains: index into entries array (or EMPTY/DELETED)
2. entries array (dense): [(hash, key, value), ...]
- Compact → no holes
- This is why dict maintains insertion order!
Lookup algorithm:
1. hash = hash(key)
2. slot = hash % table_size
3. Check indices[slot]:
- EMPTY → key not found
- index → check entries[index].key == key
- DELETED (dummy) → probe next slot (perturbation-based probing)
4. If key matches → return entries[index].value
5. If collision → probe: slot = (5*slot + 1 + perturb) % size; perturb >>= 5
Load factor: resizes when 2/3 full
Resize factor: ~2x current size
Why insertion order is preserved (3.7+):
- entries array is appended-to (dense)
- Iterating dict iterates entries in append order
"""
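The dense entries array is directly observable: iteration follows append order, overwrites keep a key's slot, and delete-plus-reinsert moves the key to the end:

```python
d = {}
for k in ["banana", "apple", "cherry"]:
    d[k] = len(k)
assert list(d) == ["banana", "apple", "cherry"]  # append order, not sorted

d["banana"] = 99                 # overwrite: key keeps its original entry slot
assert list(d)[0] == "banana"

del d["banana"]                  # delete + reinsert appends a fresh entry...
d["banana"] = 99
assert list(d) == ["apple", "cherry", "banana"]  # ...so the key moves to the end
```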
# ============================================================
# Q4: __repr__ vs __str__
# ============================================================
class Temperature:
def __init__(self, celsius: float):
self.celsius = celsius
def __str__(self) -> str:
"""Human-readable: used by print(), str()"""
return f"{self.celsius}°C"
def __repr__(self) -> str:
"""Developer-readable: used by repr(), interactive shell, debugging"""
return f"Temperature(celsius={self.celsius})"
t = Temperature(25.5)
print(str(t)) # "25.5°C"
print(repr(t)) # "Temperature(celsius=25.5)"
print(t) # "25.5°C" (calls __str__)
# In containers (list, dict), __repr__ is always used:
print([t]) # [Temperature(celsius=25.5)]
# Rule: __repr__ should ideally be: eval(repr(obj)) == obj
# __str__ is for end users, __repr__ is for developers
# ============================================================
# Q5: How does @property work internally?
# ============================================================
"""
@property uses the descriptor protocol.
A descriptor is any object that defines __get__, __set__, or __delete__.
property is a built-in descriptor class.
@property creates a property object with:
- fget: the getter function
- fset: the setter function (added by @x.setter)
- fdel: the deleter function (added by @x.deleter)
Attribute lookup order (simplified):
1. Data descriptors from type (have __set__) → property is here
2. Instance __dict__
3. Non-data descriptors from type (only __get__)
property IS a data descriptor (defines both __get__ and __set__),
so it takes priority over the instance dict → that's why you can't
accidentally shadow a property with an instance attribute.
"""
# Manual equivalent of @property:
class Circle:
def __init__(self, radius):
self._radius = radius
# Manual: property(fget, fset, fdel, doc)
def _get_radius(self): return self._radius
def _set_radius(self, v):
if v < 0: raise ValueError("Negative radius")
self._radius = v
radius = property(_get_radius, _set_radius, doc="Circle radius")
# Decorator equivalent (cleaner):
class Circle:
def __init__(self, radius):
self._radius = radius
@property
def radius(self): # creates property object, assigns to 'radius'
return self._radius
@radius.setter # calls radius.setter(func), returns new property
def radius(self, value):
if value < 0: raise ValueError()
self._radius = value
# ============================================================
# Q6: What is __slots__?
# ============================================================
"""
By default, Python objects store attributes in a __dict__ (a hash table).
This is flexible but uses significant memory.
__slots__ replaces per-instance __dict__ with a fixed-size array of slots.
Benefits:
- Less memory (especially with millions of objects)
- Slightly faster attribute access
- Prevents adding new attributes not in __slots__
Drawbacks:
- Can't add arbitrary attributes
- Multiple inheritance becomes complex
- Weakrefs need explicit __weakref__ slot
"""
class Point:
__slots__ = ('x', 'y') # only x and y attributes allowed
def __init__(self, x, y):
self.x = x
self.y = y
import sys
class RegularPoint:
def __init__(self, x, y):
self.x, self.y = x, y
p1 = Point(1, 2)
p2 = RegularPoint(1, 2)
sys.getsizeof(p1) # ~56 bytes (slots)
sys.getsizeof(p2) # ~48 bytes + __dict__ (~232 bytes)
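The "prevents adding new attributes" point in practice: a slotted instance has no `__dict__` to fall back on, so any attribute outside `__slots__` raises:

```python
class SlottedPoint:
    __slots__ = ('x', 'y')          # fixed attribute set, no per-instance __dict__
    def __init__(self, x, y):
        self.x, self.y = x, y

p = SlottedPoint(1, 2)
assert not hasattr(p, '__dict__')   # attributes live in fixed slots instead

try:
    p.z = 3                         # 'z' is not in __slots__
    raised = False
except AttributeError:
    raised = True
assert raised                       # slots block arbitrary new attributes
```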
# ============================================================
# Q7: What is the difference between deepcopy and pickle?
# ============================================================
"""
deepcopy:
- Creates a deep copy within the same Python process
- Handles circular references
- Can copy any object (including those without __reduce__)
- Fast (pure Python memory copy)
pickle:
- Serializes object to bytes (for storage or network transfer)
- Can be sent across processes/machines
- Produces bytes/file representation
- Security risk: never unpickle untrusted data!
- Slower than deepcopy
Use deepcopy when: you need independent copy in same process
Use pickle when: you need persistence or inter-process communication
"""
SECTION 9 — DAY BEFORE INTERVIEW REVISION
The 20% That Covers 80% of Interviews
====================================================================
🎯 THE GOLDEN CHEATSHEET — READ THIS THE NIGHT BEFORE YOUR INTERVIEW
====================================================================
🔴 TIER 1 — Know These Cold (asked in virtually every interview)
Data Structures You Must Implement From Memory
# ─────────────────────────────────────────
# 1. TWO SUM — Hash Map Pattern
# ─────────────────────────────────────────
def two_sum(nums, target):
seen = {} # val ā index
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = i
# Time: O(n), Space: O(n)
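In the spirit of the workflow at the top of the handbook (type it, run it, test edge cases), here is the pattern with a few quick asserts; the function is repeated so the snippet runs standalone:

```python
def two_sum(nums, target):
    seen = {}  # val → index
    for i, num in enumerate(nums):
        if target - num in seen:
            return [seen[target - num], i]
        seen[num] = i

assert two_sum([2, 7, 11, 15], 9) == [0, 1]
assert two_sum([3, 2, 4], 6) == [1, 2]
assert two_sum([3, 3], 6) == [0, 1]  # duplicate values: first index is stored
```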
# ─────────────────────────────────────────
# 2. SLIDING WINDOW — Variable Size
# ─────────────────────────────────────────
def longest_no_repeat(s):
char_idx = {}
left = max_len = 0
for right, ch in enumerate(s):
if ch in char_idx and char_idx[ch] >= left:
left = char_idx[ch] + 1
char_idx[ch] = right
max_len = max(max_len, right - left + 1)
return max_len
# Time: O(n), Space: O(min(n, alphabet size))
# ─────────────────────────────────────────
# 3. BINARY SEARCH — Classic
# ─────────────────────────────────────────
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target: return mid
elif arr[mid] < target: left = mid + 1
else: right = mid - 1
return -1
# ─────────────────────────────────────────
# 4. BFS — Shortest Path
# ─────────────────────────────────────────
from collections import deque
def bfs(graph, start, end):
queue = deque([(start, [start])])
visited = {start}
while queue:
node, path = queue.popleft()
if node == end: return path
for nb in graph[node]:
if nb not in visited:
visited.add(nb)
queue.append((nb, path + [nb]))
return []
# ─────────────────────────────────────────
# 5. DFS — Graph with Cycle Detection
# ─────────────────────────────────────────
def dfs(graph, node, visited=None):
if visited is None: visited = set()
visited.add(node)
for nb in graph[node]:
if nb not in visited:
dfs(graph, nb, visited)
return visited
# ─────────────────────────────────────────
# 6. BACKTRACKING — Subsets
# ─────────────────────────────────────────
def subsets(nums):
result = []
def bt(start, curr):
result.append(curr[:]) # copy!
for i in range(start, len(nums)):
curr.append(nums[i])
bt(i + 1, curr)
curr.pop()
bt(0, [])
return result
# ─────────────────────────────────────────
# 7. DYNAMIC PROGRAMMING — Coin Change
# ─────────────────────────────────────────
def coin_change(coins, amount):
dp = [float('inf')] * (amount + 1)
dp[0] = 0
for i in range(1, amount + 1):
for coin in coins:
if coin <= i:
dp[i] = min(dp[i], dp[i - coin] + 1)
return dp[amount] if dp[amount] != float('inf') else -1
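Worth verifying by hand, including the unreachable-amount edge case; repeated here as a standalone snippet:

```python
def coin_change(coins, amount):
    # dp[i] = minimum coins needed to make amount i
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    for i in range(1, amount + 1):
        for coin in coins:
            if coin <= i:
                dp[i] = min(dp[i], dp[i - coin] + 1)
    return dp[amount] if dp[amount] != float('inf') else -1

assert coin_change([1, 2, 5], 11) == 3   # 5 + 5 + 1
assert coin_change([2], 3) == -1         # odd amount unreachable with only 2s
assert coin_change([1], 0) == 0          # zero amount needs zero coins
```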
# ─────────────────────────────────────────
# 8. LRU CACHE
# ─────────────────────────────────────────
from collections import OrderedDict
class LRUCache:
def __init__(self, cap):
self.cache = OrderedDict()
self.cap = cap
def get(self, key):
if key not in self.cache: return -1
self.cache.move_to_end(key)
return self.cache[key]
def put(self, key, val):
if key in self.cache: self.cache.move_to_end(key)
self.cache[key] = val
if len(self.cache) > self.cap:
self.cache.popitem(last=False)
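A standalone round-trip check of the eviction order (class repeated so this runs on its own):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, cap):
        self.cache = OrderedDict()
        self.cap = cap
    def get(self, key):
        if key not in self.cache: return -1
        self.cache.move_to_end(key)          # mark as most recently used
        return self.cache[key]
    def put(self, key, val):
        if key in self.cache: self.cache.move_to_end(key)
        self.cache[key] = val
        if len(self.cache) > self.cap:
            self.cache.popitem(last=False)   # evict least recently used

c = LRUCache(2)
c.put(1, 1); c.put(2, 2)
assert c.get(1) == 1      # touches key 1 → key 2 becomes LRU
c.put(3, 3)               # evicts key 2
assert c.get(2) == -1
assert c.get(3) == 3
```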
# ─────────────────────────────────────────
# 9. MERGE INTERVALS
# ─────────────────────────────────────────
def merge(intervals):
if not intervals: return []
intervals.sort(key=lambda x: x[0])
merged = [intervals[0]]
for start, end in intervals[1:]:
if start <= merged[-1][1]: # overlapping
merged[-1][1] = max(merged[-1][1], end)
else:
merged.append([start, end])
return merged
# ─────────────────────────────────────────
# 10. KADANE'S ALGORITHM — Max Subarray
# ─────────────────────────────────────────
def max_subarray(nums):
max_sum = curr = nums[0]
for n in nums[1:]:
curr = max(n, curr + n)
max_sum = max(max_sum, curr)
return max_sum
# ─────────────────────────────────────────
# 11. LINKED LIST — Reverse
# ─────────────────────────────────────────
def reverse_list(head):
prev, curr = None, head
while curr:
next_node = curr.next
curr.next = prev
prev = curr
curr = next_node
return prev
# ─────────────────────────────────────────
# 12. TREE — DFS (Inorder, Preorder, Postorder)
# ─────────────────────────────────────────
def inorder(root): # left → node → right
if not root: return []
return inorder(root.left) + [root.val] + inorder(root.right)
def preorder(root): # node → left → right
if not root: return []
return [root.val] + preorder(root.left) + preorder(root.right)
def max_depth(root):
if not root: return 0
return 1 + max(max_depth(root.left), max_depth(root.right))
🟡 TIER 2 — High Probability Patterns
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# HEAP ā K Largest Elements
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
import heapq
def k_largest(nums, k):
return heapq.nlargest(k, nums) # O(n log k)
# Manual: maintain min-heap of size k
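The manual variant hinted at in the comment above ā a minimal sketch that keeps a min-heap capped at size k, so the heap root is always the k-th largest seen so far:

```python
import heapq

def k_largest_manual(nums, k):
    heap = []                          # min-heap of the k largest seen so far
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)        # drop the smallest of the k+1
    return sorted(heap, reverse=True)  # O(n log k) time, O(k) space

assert k_largest_manual([3, 1, 5, 12, 2, 11], 3) == [12, 11, 5]
```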
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# UNION FIND ā Connected Components
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
class UF:
def __init__(self, n):
self.p = list(range(n))
def find(self, x):
if self.p[x] != x: self.p[x] = self.find(self.p[x])
return self.p[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py: return False
self.p[px] = py
return True
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# TRIE ā Insert and Search
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
class Trie:
def __init__(self): self.root = TrieNode()
def insert(self, word):
node = self.root
for c in word:
node = node.children.setdefault(c, TrieNode())
node.is_end = True
def search(self, word):
node = self.root
for c in word:
if c not in node.children: return False
node = node.children[c]
return node.is_end
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# MONOTONIC STACK ā Next Greater Element
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
def next_greater(nums):
result = [-1] * len(nums)
stack = []
for i, val in enumerate(nums):
while stack and nums[stack[-1]] < val:
result[stack.pop()] = val
stack.append(i)
return result
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# PREFIX SUM ā Subarray Sum = K
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
def subarray_sum_k(nums, k):
count = prefix = 0
freq = {0: 1}
for num in nums:
prefix += num
count += freq.get(prefix - k, 0)
freq[prefix] = freq.get(prefix, 0) + 1
return count
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# FAST & SLOW POINTERS ā Cycle Detection
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
def has_cycle(head):
slow = fast = head
while fast and fast.next:
slow, fast = slow.next, fast.next.next
if slow == fast: return True
return False
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
# TOPOLOGICAL SORT ā Kahn's Algorithm
# āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
from collections import deque
def topo_sort(n, edges):
graph = [[] for _ in range(n)]
indegree = [0] * n
for u, v in edges:
graph[u].append(v)
indegree[v] += 1
queue = deque(i for i in range(n) if indegree[i] == 0)
order = []
while queue:
node = queue.popleft()
order.append(node)
for nb in graph[node]:
indegree[nb] -= 1
if indegree[nb] == 0:
queue.append(nb)
    return order if len(order) == n else []  # empty = cycle exists

š¢ TIER 3 ā Know the Concepts, Not Always Code-From-Scratch
ā Dijkstra's Algorithm ā heapq + dist dict
ā Dynamic Programming patterns (knapsack, LCS, edit distance)
ā Segment Tree ā range queries
ā Binary Indexed Tree (Fenwick Tree) ā prefix sums with updates
ā Matrix chain multiplication ā interval DP
ā String matching ā KMP, Rabin-Karp
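Dijkstra is the one from this tier most worth having in muscle memory. A minimal heapq sketch, assuming the graph is a dict mapping node ā list of (neighbor, weight) pairs (an assumed input format, not the only one):

```python
import heapq

def dijkstra(graph, src):
    # graph: {node: [(neighbor, weight), ...]}
    dist = {src: 0}
    pq = [(0, src)]                              # (distance, node)
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float('inf')):
            continue                             # stale entry, skip
        for nb, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nb, float('inf')):
                dist[nb] = nd
                heapq.heappush(pq, (nd, nb))
    return dist                                  # O((V + E) log V)

g = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
assert dijkstra(g, 'a') == {'a': 0, 'b': 1, 'c': 3}
```

Unreachable nodes simply never appear in `dist`, which doubles as the visited set.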
š§ Decision Flowchart ā Which Algorithm to Use?
Problem arrives
ā
āā Array/String + "find pair/subarray" ā TWO POINTERS or SLIDING WINDOW
ā
āā "Sorted array" + "find target" ā BINARY SEARCH
ā
āā "All combinations/permutations/subsets" ā BACKTRACKING
ā
āā "Shortest path" ā BFS (unweighted) or DIJKSTRA (weighted)
ā
āā "Connected components" / "cycles" ā DFS or UNION-FIND
ā
āā "Optimal substructure" / "overlapping subproblems" ā DP
ā
āā "K largest/smallest" ā HEAP (min-heap of size k)
ā
āā "Prefix/word search" ā TRIE
ā
āā "Next greater/smaller element" ā MONOTONIC STACK
ā
āā "Range sum query" ā PREFIX SUM (static) or SEGMENT TREE (updates)
ā
āā "Scheduling/priority" ā PRIORITY QUEUE (heapq)
ā
āā "Frequency/counting" ā COUNTER or HASH MAP
ā
āā "Detect cycle in linked list" ā FAST & SLOW POINTERS
ā” Critical Python Syntax ā Read 10 Minutes Before Interview
# MUST KNOW IMPORTS
from collections import Counter, defaultdict, deque, OrderedDict
from itertools import combinations, permutations, product, accumulate
import heapq, bisect, math, functools, sys
# CRITICAL ONE-LINERS
sorted(w, key=lambda x: (-x[1], x[0])) # sort by value desc, key asc
Counter(lst).most_common(k) # top k frequent
''.join(sorted(word)) # anagram key
[x for sub in nested for x in sub] # flatten
list(zip(*matrix)) # transpose
heapq.nlargest(k, nums) # k largest
bisect.bisect_left(arr, target) # binary search position
sys.setrecursionlimit(10**6) # avoid recursion limit
float('inf'), float('-inf') # infinity
@functools.cache # memoize (Python 3.9+)
mid = left + (right - left) // 2 # safe midpoint
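A few of the one-liners above verified on tiny inputs ā worth running once so the output shapes stick:

```python
from collections import Counter

matrix = [[1, 2], [3, 4]]
assert list(zip(*matrix)) == [(1, 3), (2, 4)]       # transpose: rows become column tuples

nested = [[1, 2], [3], [4, 5]]
assert [x for sub in nested for x in sub] == [1, 2, 3, 4, 5]  # flatten one level

assert ''.join(sorted("eat")) == ''.join(sorted("tea"))       # same anagram key: "aet"

assert Counter("aabbbc").most_common(1) == [('b', 3)]         # top-1 frequent
```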
# CRITICAL GOTCHAS TO AVOID TODAY
# 1. result.append(current) ā BUG: reference!
# result.append(current[:]) or result.append(list(current))
#
# 2. [[0]*n]*m ā BUG: shared rows!
# [[0]*n for _ in range(m)]
#
# 3. def f(lst=[]) ā BUG: shared default!
# def f(lst=None): if lst is None: lst = []
#
# 4. lst.sort() returns None ā not the sorted list!
# use sorted(lst) to get new list
#
# 5. for k in d: del d[k] ā RuntimeError!
# for k in list(d.keys()): ...
#
# 6. deque[i] is O(n) for middle indexes (only the ends are fast)!
#    use list for random access

š¤ Behavioral Framing for Technical Answers
When asked "How would you solve X?":
1. CLARIFY: "Can I assume the input is sorted? What are the constraints?"
2. BRUTE FORCE: "Naive approach is... O(n²). Let me optimize."
3. PATTERN: "I see this is a sliding window / DP / graph problem because..."
4. CODE: Write cleanly with variable names that communicate intent
5. TRACE: "Let me trace through with the example: [2,7,11,15], target=9..."
6. COMPLEXITY: "Time: O(n), Space: O(n) because the hash map stores n elements"
7. EDGE CASES: "Edge cases: empty input, single element, all same, overflow..."
Red flags interviewers watch for:
ā Jumping to code without understanding problem
ā Silent coding (narrate your thought process!)
ā Forgetting edge cases
ā Not knowing complexity of your own solution
ā Giving up when stuck (ask for hint, don't freeze)
Green flags that impress:
ā "There's a brute force O(n²) but we can optimize to O(n) using..."
ā "Let me check edge cases: empty array, negative numbers, overflow..."
ā "The space complexity is O(n) ā if memory is a concern, we could..."
ā Mentioning tradeoffs between solutions
š Final Checklist ā Night Before
ā” Can I write Two Sum from memory in < 2 minutes?
ā” Can I trace through BFS/DFS on a small graph?
ā” Do I know the difference between heappush and heapreplace?
ā” Can I explain why [[]] * n is a bug?
ā” Do I know when to use deque vs list?
ā” Can I implement LRU Cache from scratch?
ā” Do I know the time complexity of:
- list.append() ā O(1) amortized
- list.pop(0) ā O(n) ā use deque!
- dict lookup ā O(1) average
- set lookup ā O(1) average
- heappush ā O(log n)
- sorted() ā O(n log n)
ā” Have I reviewed mutable default argument gotcha?
ā” Do I know when to use @functools.cache?
ā” Can I write the Union-Find class from memory?
ā” Do I know sys.setrecursionlimit(10**6)?
ā” Am I ready to narrate my thought process out loud?
š Final Words
The best interview performers share these traits:
1. They think out loud ā always
2. They start simple, then optimize
3. They know their data structures cold
4. They don't panic when stuck ā they systematically try patterns
5. They write clean, readable code ā not clever, unreadable tricks
6. They test with the example, then with edge cases
Pattern recognition > memorization.
Clarity > cleverness.
Communication > perfect code.
You have prepared thoroughly.
Trust the process. You've got this. š
š HANDBOOK COMPLETE
This handbook now covers:
- ā Section 1: Python Data Structures (List, Tuple, Set, Dict)
- ā Section 2: Iterables, Iterators, Generators, Itertools
- ā Section 3: Decorators (all forms + real use cases)
- ā Section 4: Built-ins and Python Tricks
- ā Section 5: External Modules (heapq, collections, bisect, math, etc.)
- ā Section 6: All 13 DSA Patterns with full code
- ā Section 7: 10 Real Backend Production Scenarios
- ā Section 8: Quick Reference, Gotchas, Complexity Table, Internals Q&A
- ā Section 9: Day-Before Revision (Tier 1/2/3, Decision Tree, Checklist)
Good luck with your interviews! ššŖ