Data Structures & Algorithms
Every important DSA topic in exact order — from Big-O basics to dynamic programming, graphs, and advanced data structures. Each topic maps to real interview problems. Built for Java backend engineers targeting product company interviews.
- Big-O — describes upper bound growth rate as input size n→∞. Drop constants and lower-order terms: O(2n + 5) = O(n) INTERVIEW
- Common complexities (fast → slow): O(1) → O(log n) → O(n) → O(n log n) → O(n²) → O(2ⁿ) → O(n!)
- O(1) — array index access, hash map lookup, push/pop stack
- O(log n) — binary search, balanced BST operations. Input halved each step
- O(n) — linear scan, traversing array or linked list
- O(n log n) — merge sort, heap sort, most efficient comparison sorts
- O(n²) — nested loops, bubble sort, selection sort, comparing all pairs
- O(2ⁿ) — subsets, Fibonacci naive recursion, exponential backtracking
- Space complexity: auxiliary space used by algorithm (not counting input). Recursion stack depth counts as space INTERVIEW
- Best vs Worst vs Average: Quick Sort is O(n log n) best and average, O(n²) worst. Interviewers want worst-case unless stated otherwise
- Amortized complexity: dynamic array append is O(1) amortized even though occasional resize is O(n). Average cost per operation over a sequence
- How to derive: count loops (each nested loop multiplies), count recursive calls (draw recursion tree), count space on call stack
"What is the time and space complexity of your solution?" — Always answer both. For recursive solutions: time = nodes in recursion tree × work per node. Space = max depth of call stack (if no memoization).
- Array internals: contiguous memory. Access O(1), insert middle O(n), append amortized O(1)
- Prefix sum array: precompute prefix[i] = sum of arr[0..i]. Range sum query [l,r] in O(1): prefix[r] - prefix[l-1] INTERVIEW
- 2D prefix sum: extend to grids. Submatrix sum in O(1) after O(n²) preprocessing
- Difference array: range update in O(1), read in O(n). Add val to arr[l..r] without looping
- Kadane's Algorithm — maximum subarray sum in O(n): track current sum and global max. Reset current to 0 when negative INTERVIEW
- Dutch National Flag (3-way partition): sort array of 0s, 1s, 2s in O(n) one pass. Three pointers: lo, mid, hi
- Rotate array: reverse entire array, reverse first k, reverse rest. O(n) time O(1) space
- In-place vs out-of-place: always clarify — can you modify input? Do you need O(1) extra space?
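Kadane's algorithm from the bullets above fits in a few lines. A minimal Java sketch (class and method names are illustrative; this uses the max(x, cur + x) form, which is equivalent to resetting when the running sum goes negative):

```java
// Kadane's algorithm: maximum subarray sum in O(n) time, O(1) space.
class Kadane {
    static int maxSubArray(int[] nums) {
        int best = nums[0]; // global maximum seen so far
        int cur = nums[0];  // best sum of a subarray ending at the current index
        for (int i = 1; i < nums.length; i++) {
            // either extend the previous subarray or start fresh at nums[i]
            cur = Math.max(nums[i], cur + nums[i]);
            best = Math.max(best, cur);
        }
        return best;
    }
}
```

Note it also handles all-negative arrays correctly (returns the largest single element), which the reset-to-zero variant gets wrong unless handled separately.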
- String immutability in Java: String is immutable. Concatenation in loop = O(n²). Always use StringBuilder for building strings iteratively INTERVIEW
- char[] tricks: convert to array for in-place modification: s.toCharArray()
- Frequency count with int[26]: int[26] array for lowercase letters — faster than HashMap for character frequency
- Anagram check: sort both O(n log n), or frequency count comparison O(n). Two strings are anagrams iff same character frequencies
- Palindrome check: two pointers from both ends. O(n) time O(1) space
- String comparison: always use .equals() not == in Java. == compares references
- StringBuilder methods: append(), insert(), delete(), reverse(), toString(). append() is amortized O(1); insert/delete in the middle are O(n)
- String.valueOf(), Integer.parseInt(), Character.isDigit/isLetter/toLowerCase — utility methods
- Longest common prefix: sort array, compare first and last strings only
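The int[26] frequency-count idea above, applied to the anagram check — a minimal sketch (class name illustrative; assumes lowercase a-z input as the bullets do):

```java
// Anagram check via frequency counting: O(n) time, O(1) extra space.
class Anagram {
    static boolean isAnagram(String s, String t) {
        if (s.length() != t.length()) return false;
        int[] freq = new int[26]; // lowercase a-z only (assumption)
        for (char c : s.toCharArray()) freq[c - 'a']++;
        for (char c : t.toCharArray()) {
            // going negative means t has a char s doesn't have enough of
            if (--freq[c - 'a'] < 0) return false;
        }
        return true; // equal lengths + no deficit => identical frequencies
    }
}
```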
- Recursion structure: base case (stops recursion) + recursive case (smaller subproblem). Missing base case = StackOverflowError
- Call stack: each recursive call adds a stack frame. Stack depth = space complexity. Java's default thread stack (~512 KB–1 MB) typically overflows around 10,000+ frames for simple recursion
- Recursion tree: visualize by drawing each call as a node. Useful to derive time complexity (count nodes × work per node)
- Fibonacci naive: O(2ⁿ) — same subproblems computed repeatedly. This is WHY memoization exists INTERVIEW
- Memoization: store results in HashMap or array. Fibonacci becomes O(n) with memo. Top-down DP = recursion + memoization
- Tail recursion: recursive call is last operation. Can be optimized to loop by compiler (Java JVM typically does NOT optimize this)
- Think recursively: "If I had the answer for n-1, how do I get the answer for n?" — Inductive thinking
- Template: solve base case → trust recursive call → combine results
"What is the time complexity of this recursive solution?" — Draw recursion tree. Nodes = number of calls, each edge = one recursive call. Total work = sum of work at each node. For balanced trees with halving: O(log n) depth × O(n) work per level = O(n log n).
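Fibonacci is the canonical demonstration of the memoization bullet: same code shape, but O(2ⁿ) drops to O(n). A minimal sketch (class name illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Naive Fibonacci recomputes subproblems -> O(2^n).
// Caching each result once (memoization, i.e. top-down DP) makes it O(n).
class Fib {
    static final Map<Integer, Long> memo = new HashMap<>();

    static long fib(int n) {
        if (n <= 1) return n;                       // base cases stop recursion
        if (memo.containsKey(n)) return memo.get(n); // already solved
        long result = fib(n - 1) + fib(n - 2);
        memo.put(n, result);
        return result;
    }
}
```

Without the memo, fib(50) would take minutes; with it, it is 50 cache fills.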
- Node structure: class ListNode { int val; ListNode next; }
- Singly linked list: each node points to next. O(1) insert at head, O(n) search/delete arbitrary node
- Doubly linked list: each node has next + prev. O(1) delete with reference to node. Used in LRU Cache
- Reverse singly linked list: three pointers (prev, curr, next). Iterative: O(n) time, O(1) space. Recursive: O(n) time, O(n) stack space INTERVIEW
- Dummy/sentinel node: create a dummy head to simplify edge cases (empty list, inserting at head). Return dummy.next
- Common errors: dereferencing null (check curr != null before curr.next), losing reference to next before updating pointers
- Linked list vs array: LL = O(1) insert/delete at known position, O(n) access by index. Array = O(1) access, O(n) insert/delete middle INTERVIEW
- Java's LinkedList — implements both List and Deque. Use as queue or stack
- Floyd's Cycle Detection (Tortoise & Hare): slow moves 1 step, fast moves 2. If cycle: they meet. If no cycle: fast reaches null INTERVIEW
- Cycle entry point: after meeting, reset slow to head, move both 1 step. They meet at cycle start
- Middle of linked list: slow/fast pointers. When fast reaches end, slow is at middle
- Check palindrome linked list: find middle, reverse second half, compare both halves
- Intersection of two lists: two pointers, each traverses its own list then switches to the other's head. They meet at the intersection or at null after at most m+n steps
- Reorder list (L0→Ln→L1→Ln-1): find middle, reverse second half, merge two halves alternating
"Detect a cycle in a linked list." — Floyd's algorithm. Fast/slow pointers. Space O(1) vs HashSet O(n). Follow-up: find cycle entry point. Then: what's the cycle length? (count steps after meeting point)
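Floyd's algorithm from the Q&A above, as a minimal sketch (class names illustrative; ListNode matches the node structure shown earlier):

```java
// Floyd's tortoise-and-hare cycle detection: O(n) time, O(1) space.
class CycleDetect {
    static class ListNode {
        int val; ListNode next;
        ListNode(int v) { val = v; }
    }

    static boolean hasCycle(ListNode head) {
        ListNode slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;              // moves 1 step
            fast = fast.next.next;         // moves 2 steps
            if (slow == fast) return true; // they can only meet inside a cycle
        }
        return false;                      // fast fell off the end: no cycle
    }
}
```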
- Stack (LIFO): push, pop, peek — all O(1). Use: undo operations, call stack simulation, DFS iterative, expression evaluation
- Queue (FIFO): enqueue (offer), dequeue (poll), peek — all O(1). Use: BFS, task scheduling, sliding window
- Java stack: use Deque<Integer> stack = new ArrayDeque<>() — NOT the legacy Stack class (synchronized, slow)
- Java queue: Queue<Integer> q = new LinkedList<>() or new ArrayDeque<>()
- Deque (double-ended queue): addFirst/addLast, removeFirst/removeLast. Supports stack and queue operations
- Valid parentheses: push opening brackets, pop on closing, check match. Empty at end = valid INTERVIEW
- Min stack: maintain auxiliary stack tracking minimum so far. O(1) getMin
- Implement queue using 2 stacks: classic interview problem. Amortized O(1) per operation
- Stack for DFS, Queue for BFS — fundamental pattern for tree/graph traversal INTERVIEW
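Valid parentheses from the bullets above, as a minimal sketch using ArrayDeque rather than the legacy Stack class (class name illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Valid parentheses: push the EXPECTED closer for each opener,
// then every closer must match the top of the stack.
class Parens {
    static boolean isValid(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '(': stack.push(')'); break;
                case '[': stack.push(']'); break;
                case '{': stack.push('}'); break;
                default: // a closer: must match the most recent opener
                    if (stack.isEmpty() || stack.pop() != c) return false;
            }
        }
        return stack.isEmpty(); // leftover openers mean unclosed brackets
    }
}
```

Pushing the expected closer (instead of the opener) avoids a separate matching table.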
- Hash function: maps key → bucket index. Good hash function distributes keys uniformly
- Collision resolution: Separate chaining (linked list per bucket) or Open addressing (probe next slot) INTERVIEW
- Load factor: n/capacity. When > 0.75 (Java default), resize and rehash. Resize = O(n) but amortized O(1) per insertion
- HashMap complexity: O(1) average get/put/remove. O(n) worst case (all keys collide to same bucket). Java 8+: a bucket converts to a red-black tree at 8 entries (once table capacity ≥ 64) → O(log n) worst case
- equals() + hashCode() contract: MUST override both. If a.equals(b) then a.hashCode() == b.hashCode(). Using custom objects as keys: override both INTERVIEW
- Common patterns: frequency count (count chars/words), two-sum (store complements), grouping (group by property)
- getOrDefault(key, 0): avoid NPE when incrementing counters
- LinkedHashMap: HashMap + insertion order. Use for LRU cache
- TreeMap: sorted by key, O(log n). floorKey(), ceilingKey(), firstKey(), lastKey()
"How does a HashMap work internally?" — Array of buckets, hash(key) % capacity gives index, linked list/tree for collisions. Also: "When would you use HashMap vs TreeMap?" — HashMap = O(1) unordered. TreeMap = O(log n) sorted order (range queries, ordered iteration).
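The "store complements" two-sum pattern listed above, as a minimal sketch (class name illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Two Sum via complements: one pass, O(n) time, O(n) space.
class TwoSum {
    static int[] twoSum(int[] nums, int target) {
        Map<Integer, Integer> seen = new HashMap<>(); // value -> index
        for (int i = 0; i < nums.length; i++) {
            int need = target - nums[i];
            if (seen.containsKey(need)) {
                return new int[]{seen.get(need), i}; // complement seen earlier
            }
            seen.put(nums[i], i);
        }
        return new int[]{-1, -1}; // no pair found
    }
}
```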
| Algorithm | Best | Average | Worst | Space | Stable |
|---|---|---|---|---|---|
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | ✓ Yes |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | ✗ No |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | ✗ No |
| Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | ✓ Yes |
| Counting Sort | O(n+k) | O(n+k) | O(n+k) | O(k) | ✓ Yes |
- Merge Sort — divide into halves, sort each, merge. Guaranteed O(n log n). Best for linked lists. Used in Java's Arrays.sort for objects (TimSort = merge + insertion) INTERVIEW
- Quick Sort — pick pivot, partition around it (smaller left, larger right), recurse. O(n log n) average, O(n²) worst (sorted array, bad pivot). Java Arrays.sort for primitives uses dual-pivot quicksort
- Quick Select — find kth smallest in O(n) average using partition logic. No full sort needed
- Counting Sort — count frequency of each value. O(n+k) where k = range. Works only on integers with known bounded range
- Radix Sort — sort digit by digit from least to most significant. O(d×n) where d = digits. For integers and strings
- Custom comparator in Java: Arrays.sort(arr, (a, b) -> Integer.compare(b, a)) (descending). Prefer Integer.compare over b - a, which can overflow. Comparators work on object arrays (Integer[]), not int[]. For objects: compare by specific field
- Java's Arrays.sort: O(n log n) guaranteed. Use it in interviews unless specifically implementing a sorting algorithm
- When to choose: stability needed → Merge. In-place & general → Quick. Guaranteed worst case → Merge/Heap. Bounded integers → Counting/Radix
- Classic template: lo=0, hi=n-1. while(lo<=hi): mid=(lo+hi)/2. if arr[mid]==target return mid. else update lo or hi
- Overflow-safe mid: mid = lo + (hi - lo) / 2 — avoids integer overflow when lo+hi > Integer.MAX_VALUE INTERVIEW
- Find first/leftmost occurrence: when found, set hi = mid - 1 (keep searching left). Return lo
- Find last/rightmost occurrence: when found, set lo = mid + 1 (keep searching right). Return hi
- Search in rotated sorted array: one half is always sorted. Check which half is sorted, determine if target is in that half, update bounds accordingly INTERVIEW
- Search in 2D matrix: treat as 1D array. mid_row = mid/cols, mid_col = mid%cols
- Binary search on answer: "What's the minimum/maximum X such that condition holds?" — Search on X, check condition. Works when condition is monotone (if X works, X+1 also works) INTERVIEW
- Examples of search on answer: Koko Eating Bananas, Ship Packages in D Days, Minimum Capacity, Painter's Partition
- Find peak element: if arr[mid] < arr[mid+1], peak is on right. O(log n)
"Binary search on answer" is a very common medium/hard pattern. Key insight: if you can write a function isValid(x) that is monotone (all valid x form a contiguous range), you can binary search on x. Example: "Can we ship all packages in D days with capacity x?" — increase x if yes, decrease if no.
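The leftmost-occurrence variant above combined with the overflow-safe mid, as a minimal sketch (class name illustrative; this version records the match instead of returning lo, which is equivalent):

```java
// Binary search for the FIRST occurrence of target in a sorted array.
// On a match, keep searching the left half (hi = mid - 1).
class BinarySearchFirst {
    static int firstOccurrence(int[] arr, int target) {
        int lo = 0, hi = arr.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // overflow-safe mid
            if (arr[mid] == target) {
                ans = mid;                // candidate; earlier one may exist
                hi = mid - 1;
            } else if (arr[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return ans; // -1 if target absent
    }
}
```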
- Two pointers (opposite ends): lo at start, hi at end. Move based on condition. Requires sorted array or specific structure
- Two Sum on sorted array: if arr[lo]+arr[hi] < target: lo++. If > target: hi--. If == return. O(n)
- Container with Most Water: move pointer with smaller height inward. Greedy argument: moving larger height can only decrease width and can't help
- 3Sum: sort array, fix first element, two pointers for remaining two. Skip duplicates. O(n²) INTERVIEW
- Trapping Rain Water: two pointers. Track max left and max right heights. Water at i = min(maxL, maxR) - height[i]
- Two pointers (same direction): both start at beginning, one leads. Remove duplicates in sorted array: slow tracks insertion point, fast scans
- Remove element in-place: slow/fast pointers. Fast scans, slow writes non-target values
- Fixed-size window: maintain window of exactly k elements. Add new element, remove oldest element each step. O(n)
- Variable-size window (shrink when invalid): expand right pointer, shrink left when window condition violated. Template: while window invalid: remove arr[left], left++ INTERVIEW
- Longest substring without repeating chars: HashSet tracks current window. When duplicate: remove from left until no duplicate
- Minimum window substring: expand right to include all required chars, then shrink left while valid. Track count of satisfied requirements
- Sliding window maximum: use monotonic deque. O(n). Remove elements outside window from front, remove smaller elements from back
- Subarray sum equals k: prefix sum + HashMap. prefixSum[i] - k == prefixSum[j] means subarray [j+1,i] sums to k
- When to use sliding window: problem asks for optimal contiguous subarray/substring. Condition on window can be maintained incrementally
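The variable-size window template applied to longest substring without repeating characters — a minimal sketch (class name illustrative):

```java
import java.util.HashSet;
import java.util.Set;

// Variable-size sliding window: expand right each step,
// shrink left while the window is invalid (contains a duplicate).
class LongestUnique {
    static int lengthOfLongestSubstring(String s) {
        Set<Character> window = new HashSet<>();
        int left = 0, best = 0;
        for (int right = 0; right < s.length(); right++) {
            // shrink until the incoming char is no longer a duplicate
            while (window.contains(s.charAt(right))) {
                window.remove(s.charAt(left++));
            }
            window.add(s.charAt(right));
            best = Math.max(best, right - left + 1);
        }
        return best;
    }
}
```

Each character enters and leaves the window at most once, so the nested loop is still O(n) overall.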
- Inorder (Left → Root → Right): BST inorder gives sorted order. Most common traversal INTERVIEW
- Preorder (Root → Left → Right): used for tree serialization, copying tree structure
- Postorder (Left → Right → Root): used when children must be processed before parent (delete tree, subtree problems)
- Iterative inorder: stack + curr pointer. Push current, go left until null, process top, go right. Essential to know INTERVIEW
- Iterative preorder: push root, pop and process, push right then left (right first so left processed first)
- Iterative postorder: reverse of modified preorder (Root→Right→Left reversed = Left→Right→Root)
- Morris traversal: O(1) space inorder using threaded binary tree. Advanced — know concept
- Level order (BFS): use Queue. Process level by level. Track level size to group nodes by level
- Height of tree: 1 + max(height(left), height(right)). Base: null → -1 (height in edges) or 0 (height in nodes)
- Diameter of binary tree: at each node, diameter through that node = height(left) + height(right) + 2 (using the null → -1 convention). Global max across all nodes INTERVIEW
- Path sum problems: "does path from root to leaf sum to target?" Pass remaining sum down. At leaf: check if remaining == 0
- Max path sum (any path): at each node, max contribution = node.val + max(0, maxPath(left)) + max(0, maxPath(right)). Return node.val + max(0, max child contribution)
- LCA (Lowest Common Ancestor) in BST: if both p and q > node → recurse right. If both < node → recurse left. Else: current node is LCA. (General binary tree: recurse both sides; if both return non-null, current node is LCA) INTERVIEW
- Balanced tree check: height function returns -1 if unbalanced. O(n) with this combined approach
- Serialize/Deserialize: preorder traversal with null markers. Deserialize by consuming from queue. INTERVIEW
- Symmetric tree: check left subtree is mirror of right. Helper: isMirror(left, right)
- BST property: all left subtree values < root, all right subtree values > root. Inorder gives sorted sequence INTERVIEW
- Search: if target < node.val: go left. If >: go right. O(h) where h = height. O(log n) balanced, O(n) worst
- Insert: find correct leaf position using BST property. Create new node there
- Delete: three cases: leaf (just remove), one child (replace with child), two children (replace with inorder successor — smallest in right subtree) INTERVIEW
- Validate BST: pass min/max bounds down. At each node: val must be in (min, max). Common mistake: only checking parent not full subtree constraint
- Kth smallest in BST: inorder traversal, count nodes. Can also track subtree sizes for O(h) query
- Inorder successor: if right subtree exists: leftmost node in right subtree. Else: lowest ancestor for which this node is in left subtree
- Balanced BST (AVL, Red-Black): know conceptually — height O(log n) guaranteed. Java's TreeMap uses Red-Black tree
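Validating a BST with min/max bounds passed down, as described above — a minimal sketch (class names illustrative; long bounds let node values use the full int range):

```java
// Validate BST: each node must lie strictly inside the (min, max) bounds
// inherited from its ancestors — checking only the parent is the classic bug.
class ValidateBST {
    static class TreeNode {
        int val; TreeNode left, right;
        TreeNode(int v) { val = v; }
    }

    static boolean isValidBST(TreeNode root) {
        return check(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    static boolean check(TreeNode node, long min, long max) {
        if (node == null) return true;
        if (node.val <= min || node.val >= max) return false; // violates an ancestor
        return check(node.left, min, node.val)   // left subtree: values < node.val
            && check(node.right, node.val, max); // right subtree: values > node.val
    }
}
```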
- Binary heap property: parent ≤ children (min-heap) or parent ≥ children (max-heap). Complete binary tree stored as array
- Array representation: parent of i = (i-1)/2. Left child = 2i+1. Right child = 2i+2
- Insert (bubble up): add at end, swap with parent until heap property restored. O(log n)
- Extract min/max (sift down): replace root with last element, sift down. O(log n)
- Heapify (build heap from array): O(n) — not O(n log n). Start from last non-leaf, sift down each INTERVIEW
- Java PriorityQueue: min-heap by default. new PriorityQueue<>(Collections.reverseOrder()) for max-heap. poll() = extract min. peek() = view min
- Top K largest: min-heap of size k. Add each element; when size exceeds k, poll the min. Result: k largest elements remain in heap
- Top K smallest: max-heap of size k. If < heap.peek(): poll and add
- Merge K sorted lists: push first node of each list to min-heap. Poll min, push its next. O(N log k) INTERVIEW
- Find median from data stream: two heaps — max-heap for lower half, min-heap for upper half. Balance sizes. Median = peek of larger or average of both tops
"Why is building a heap O(n) and not O(n log n)?" — Sift-down is O(h) per node. Nodes near leaves have small height. Most nodes are at bottom levels. Sum of h × (n/2^h) for all levels converges to O(n).
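The top-K-largest pattern with a bounded min-heap, as a minimal sketch (class name illustrative):

```java
import java.util.PriorityQueue;

// Top K largest via a min-heap capped at size k: O(n log k) time, O(k) space.
class TopK {
    static int[] topKLargest(int[] nums, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap by default
        for (int x : nums) {
            heap.offer(x);
            if (heap.size() > k) heap.poll(); // evict the smallest of k+1
        }
        int[] result = new int[k];
        for (int i = 0; i < k; i++) result[i] = heap.poll(); // ascending order
        return result;
    }
}
```

The heap's minimum is the k-th largest overall, which is why evicting the min keeps exactly the k largest.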
- Trie node: class TrieNode { TrieNode[] children = new TrieNode[26]; boolean isEnd; }
- Insert: for each char, create child node if not exists, move to child. Mark isEnd at last char. O(L) where L = word length
- Search: traverse char by char. If any node missing: return false. Check isEnd at last node. O(L)
- StartsWith (prefix search): same as search but don't check isEnd. O(L)
- Why trie over hashmap: prefix operations, lexicographic ordering, space-efficient for common prefixes INTERVIEW
- Word Search II (find all words in board): trie + DFS backtracking. Prune by checking if prefix exists in trie
- Add and Search words with wildcards: DFS through trie. When '.': explore all children
- Space optimization: use HashMap instead of array for sparse tries (non-lowercase-only)
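The node structure, insert, search, and startsWith bullets above combine into one small class — a minimal sketch (assumes lowercase a-z, like the bullets):

```java
// Trie supporting insert, exact search, and prefix search — each O(L).
class Trie {
    static class TrieNode {
        TrieNode[] children = new TrieNode[26]; // lowercase a-z (assumption)
        boolean isEnd;                          // marks end of a complete word
    }

    final TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.children[i] == null) node.children[i] = new TrieNode();
            node = node.children[i];
        }
        node.isEnd = true;
    }

    boolean search(String word) {          // exact word: must end on isEnd
        TrieNode n = walk(word);
        return n != null && n.isEnd;
    }

    boolean startsWith(String prefix) {    // prefix: existence is enough
        return walk(prefix) != null;
    }

    private TrieNode walk(String s) {      // follow chars; null if path breaks
        TrieNode node = root;
        for (char c : s.toCharArray()) {
            node = node.children[c - 'a'];
            if (node == null) return null;
        }
        return node;
    }
}
```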
- Adjacency list: Map<Integer, List<Integer>> or List<List<Integer>>. Space O(V+E). Efficient for sparse graphs (most interview problems) INTERVIEW
- Adjacency matrix: int[n][n]. Space O(V²). Edge check O(1) but slow traversal. Use for dense graphs
- Edge list: list of (u, v, weight) pairs. Used in Kruskal's MST algorithm
- Directed vs undirected: directed = edges have direction. Undirected: add edge in both adjacency lists
- Weighted graphs: Map<Integer, List<int[]>> where int[] = {neighbor, weight}
- Cycle detection in undirected: DFS with visited set. If visiting a neighbor that's already visited (and not parent): cycle exists
- Cycle detection in directed: DFS with 3 states: WHITE (unvisited), GRAY (in current path), BLACK (fully processed). Gray → Gray = cycle INTERVIEW
- Bipartite graph check: 2-color with BFS/DFS. If adjacent nodes have same color: not bipartite
- Connected components: DFS/BFS from each unvisited node. Count how many times you start fresh
- BFS template: add source to queue + visited. While queue not empty: poll, process, add unvisited neighbors to queue + visited
- Shortest path in unweighted graph: BFS gives shortest path. Level = distance. INTERVIEW
- Grid BFS: 4-directional or 8-directional. Use int[][] dirs = {{0,1},{1,0},{0,-1},{-1,0}}. Check bounds + not visited
- Number of Islands: BFS/DFS from each unvisited '1', mark all connected '1's visited. Count starts
- Multi-source BFS: add all sources to queue simultaneously. All at distance 0. BFS expands outward. Rotting Oranges, Distance to Nearest 0 INTERVIEW
- Bidirectional BFS: BFS from both source and target simultaneously. Meets in middle. Reduces O(b^d) to O(b^(d/2)). Word Ladder
- BFS level tracking: use queue size at start of each level iteration, or add sentinel null node
- DFS template: if visited: return. Mark visited. Process. Recurse on neighbors
- DFS vs BFS: DFS = deep first, uses stack (recursion or explicit). BFS = wide first, uses queue. BFS for shortest path. DFS for exhaustive search, backtracking INTERVIEW
- Topological sort (DFS): post-order DFS — add node to result AFTER all neighbors processed. Reverse = topo order. Works on DAG only
- Topological sort (Kahn's BFS): compute in-degrees. Add all 0 in-degree nodes to queue. Process: poll, decrement neighbors' in-degree, add newly zero in-degree. If result size != V: cycle exists INTERVIEW
- Course Schedule: courses = nodes, prerequisites = edges. Detect cycle in directed graph. Topological sort if no cycle
- All paths in DAG: DFS with path tracking. Backtrack when done
- Clone graph: DFS with HashMap (original node → cloned node) to handle cycles
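Kahn's BFS topological sort from the bullets above, as a minimal sketch (class and method names illustrative; an empty result signals a cycle, per the size != V check):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Kahn's algorithm: repeatedly remove 0 in-degree nodes.
// If not all V nodes get removed, the graph has a cycle.
class TopoSort {
    static List<Integer> topoOrder(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        int[] indegree = new int[n];
        for (int[] e : edges) {      // directed edge e[0] -> e[1]
            adj.get(e[0]).add(e[1]);
            indegree[e[1]]++;
        }
        Queue<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < n; i++) if (indegree[i] == 0) queue.offer(i);
        List<Integer> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            int u = queue.poll();
            order.add(u);
            for (int v : adj.get(u)) {
                if (--indegree[v] == 0) queue.offer(v); // all prereqs processed
            }
        }
        return order.size() == n ? order : List.of(); // cycle -> empty list
    }
}
```

This is exactly the Course Schedule solution: courses as nodes, prerequisites as edges.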
- Dijkstra's algorithm: single source, non-negative weights. Priority queue (min-heap) based. O((V+E) log V). Process node when first popped (lowest distance) INTERVIEW
- Dijkstra template: dist[src]=0, all others ∞. PQ with (dist, node). Pop min, skip if already processed, relax edges, push updates
- Why Dijkstra fails with negative edges: greedy assumption breaks — a shorter path through negative edge might exist after a node is "finalized"
- Bellman-Ford: single source, handles negative weights. O(VE). Relax all edges V-1 times. If can relax on Vth iteration: negative cycle exists INTERVIEW
- Floyd-Warshall: all-pairs shortest paths. O(V³) DP. dist[i][j] = min(dist[i][j], dist[i][k]+dist[k][j]) for all k
- Network delay time: Dijkstra from source, return max distance to all nodes. If any unreachable: -1
- Cheapest flights within K stops: modified Dijkstra/Bellman-Ford with stop count constraint
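The Dijkstra template above, as a minimal sketch (class and method names illustrative; the adjacency list uses the {neighbor, weight} int[] representation from the graph-representation bullets):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Dijkstra with a min-heap: non-negative weights only, O((V+E) log V).
class Dijkstra {
    static int[] shortestPaths(List<List<int[]>> adj, int src) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE); // "infinity" for unreachable nodes
        dist[src] = 0;
        // heap entries are {distance, node}, ordered by distance
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        pq.offer(new int[]{0, src});
        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int d = top[0], u = top[1];
            if (d > dist[u]) continue;        // stale entry: u already finalized
            for (int[] edge : adj.get(u)) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {  // relax edge u -> v
                    dist[v] = dist[u] + w;
                    pq.offer(new int[]{dist[v], v});
                }
            }
        }
        return dist;
    }
}
```

The stale-entry check replaces a decrease-key operation, which Java's PriorityQueue does not offer.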
- Minimum Spanning Tree (MST): connect all V vertices with V-1 edges, minimum total weight. Undirected connected graph only INTERVIEW
- Kruskal's algorithm: sort edges by weight, add if doesn't create cycle (use Union-Find). O(E log E)
- Prim's algorithm: start from any node, greedily add minimum weight edge connecting tree to non-tree vertex. Priority queue. O(E log V)
- Min Cost to Connect All Points: Prim's on grid where edge weight = Manhattan distance
- Bridge in graph: edge whose removal disconnects the graph. Tarjan's algorithm with low-link values. O(V+E)
- Articulation point: vertex whose removal disconnects graph. Also Tarjan's algorithm
- Strongly Connected Components (SCC): in directed graph, all nodes in SCC can reach each other. Kosaraju's (2 DFS) or Tarjan's (1 DFS)
- D&C pattern: divide into smaller subproblems, solve recursively, combine results. Recurrence: T(n) = 2T(n/2) + O(n) → O(n log n)
- Master theorem: T(n) = aT(n/b) + f(n). Quick way to derive complexity of divide-and-conquer recurrences INTERVIEW
- Count inversions: modified merge sort. During merge, if arr[i] > arr[j]: all remaining left elements form inversions with arr[j]. Add (mid - i + 1) to count
- Merge K sorted arrays: D&C — merge pairs, then merge results. O(N log k)
- Closest pair of points: O(n log n) D&C. Split by median x, recurse each half, then check strip near split line
- Greedy strategy: at each step, make locally optimal choice. Requires proof that greedy doesn't miss a better solution (exchange argument)
- Activity selection / Interval scheduling: sort by end time, greedily pick non-overlapping intervals. Classic greedy INTERVIEW
- Meeting rooms II (min rooms): sort by start, use min-heap of end times. If new meeting starts after heap.peek(): reuse room (poll, add new end). Else: new room
- Jump Game: track max reachable index. If current index > max reachable: can't proceed. O(n)
- Jump Game II (min jumps): track current range end and next range end. When reach current end: increment jumps, update current to next range
- Gas Station: if total gas ≥ total cost: solution exists. Find start by tracking running sum; when negative: reset and try next station
- Task Scheduler: most frequent task determines minimum time. Formula: (maxFreq-1) * (n+1) + countMaxFreq. Compare with task count
- Greedy fails when: future choices affect current optimality (use DP instead). Coin change with arbitrary denominations
- DP conditions: overlapping subproblems (same subproblem computed multiple times) + optimal substructure (optimal solution uses optimal subsolutions) INTERVIEW
- Top-down (memoization): recursion + cache. More intuitive to write. Space = O(n) call stack + O(n) memo
- Bottom-up (tabulation): fill DP table iteratively. Often more space-efficient. No recursion overhead
- How to approach DP: (1) Define state: dp[i] means... (2) Recurrence: dp[i] = f(dp[i-1], ...) (3) Base cases (4) Order of computation (5) Answer location
- Coin Change (min coins): dp[amount] = min(dp[amount - coin] + 1) for each coin. Bottom-up. O(amount × coins)
- Coin Change (ways): dp[amount] += dp[amount - coin]. Order matters: outer loop = coins (each coin used multiple times, unbounded knapsack)
- House Robber: dp[i] = max(dp[i-2] + nums[i], dp[i-1]). Optimize to O(1) space with two variables INTERVIEW
- Climbing Stairs: dp[i] = dp[i-1] + dp[i-2]. Fibonacci pattern
- LIS (Longest Increasing Subsequence): O(n²) DP: dp[i] = max(dp[j]+1) for j<i where nums[j]<nums[i]. O(n log n) patience sorting with binary search
- Longest Palindromic Substring: expand around center O(n²). Or DP table dp[i][j] = palindrome from i to j
- Word Break: dp[i] = any dp[j] where s[j..i] is a word. O(n² × max word length)
- 0/1 Knapsack: dp[i][w] = max value using first i items, capacity w. Either include item i: dp[i-1][w-wt[i]] + val[i], or exclude: dp[i-1][w]. O(n×W) INTERVIEW
- Unbounded Knapsack: can use same item multiple times. dp[w] = max(dp[w], dp[w-wt[i]] + val[i]). Inner loop: iterate capacity forward (not backward)
- Subset Sum: special case of 0/1 knapsack. dp[i][s] = can we make sum s using first i elements?
- Partition Equal Subset Sum: subset sum where target = totalSum/2
- LCS (Longest Common Subsequence): dp[i][j] = LCS of s1[0..i] and s2[0..j]. If chars match: dp[i-1][j-1]+1. Else: max(dp[i-1][j], dp[i][j-1]) INTERVIEW
- Edit Distance (Levenshtein): dp[i][j] = min ops to convert s1[0..i] to s2[0..j]. Match: dp[i-1][j-1]. Else: 1 + min(insert, delete, replace)
- Distinct Subsequences: dp[i][j] = # ways s[0..i] contains t[0..j] as subsequence
- Wildcard Matching / Regex: 2D DP handling * and ? patterns
- Unique Paths: dp[i][j] = dp[i-1][j] + dp[i][j-1]. Space optimize to 1D array
- Minimum Path Sum: same structure, take min instead of sum
- Maximal Square: dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1 if cell is '1'
- Dungeon Game: solve backwards from bottom-right to top-left. dp[i][j] = min health needed to survive from this cell
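House Robber with the dp array rolled into two variables, as the O(1)-space bullet above suggests — a minimal sketch (class name illustrative):

```java
// House Robber: dp[i] = max(dp[i-2] + nums[i], dp[i-1]),
// with the table collapsed to two rolling variables.
class HouseRobber {
    static int rob(int[] nums) {
        int prev2 = 0; // best loot up to house i-2
        int prev1 = 0; // best loot up to house i-1
        for (int x : nums) {
            int cur = Math.max(prev2 + x, prev1); // rob house i, or skip it
            prev2 = prev1;
            prev1 = cur;
        }
        return prev1;
    }
}
```

The same two-variable rolling trick applies to Climbing Stairs and any dp[i] = f(dp[i-1], dp[i-2]) recurrence.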
- Backtracking template: void bt(state, choices): if done: add to results. for each choice: make choice → recurse → undo choice INTERVIEW
- Subsets: at each index, either include or exclude. 2ⁿ subsets. O(n × 2ⁿ)
- Permutations: at each position, try all unused elements. n! permutations. O(n × n!)
- Combinations: choose k from n, only forward index (no repeats in set). C(n,k) combinations
- Handle duplicates: sort array first. Skip duplicate choices at same level: if i > start && nums[i] == nums[i-1]: continue
- Pruning: stop early when current state can't lead to valid solution. E.g. remaining sum < 0 in combination sum
- N-Queens: place queen row by row. Check column, diagonal, anti-diagonal conflicts. O(n!)
- Sudoku Solver: try 1-9 for empty cells, check row/col/box constraints, backtrack if no valid digit
- Word Search (grid backtracking): DFS from each cell, mark visited with temp value, restore on backtrack
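The choose → recurse → undo template applied to Subsets, as a minimal sketch (class and method names illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Subsets via backtracking: every partial path is itself a valid subset.
class Subsets {
    static List<List<Integer>> subsets(int[] nums) {
        List<List<Integer>> results = new ArrayList<>();
        backtrack(nums, 0, new ArrayList<>(), results);
        return results;
    }

    static void backtrack(int[] nums, int start, List<Integer> path,
                          List<List<Integer>> results) {
        results.add(new ArrayList<>(path)); // copy: path keeps mutating
        for (int i = start; i < nums.length; i++) {
            path.add(nums[i]);                    // make choice
            backtrack(nums, i + 1, path, results); // recurse (forward index only)
            path.remove(path.size() - 1);         // undo choice
        }
    }
}
```

Forgetting the defensive copy of path is the most common bug in this template: all result entries would alias the same (finally empty) list.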
- Operators: & (AND), | (OR), ^ (XOR), ~ (NOT), << (left shift = ×2), >> (right shift = ÷2)
- XOR properties: a^a=0, a^0=a, XOR is commutative+associative. XOR all array elements = single non-duplicate element INTERVIEW
- Check bit i: (n >> i) & 1
- Set bit i: n | (1 << i)
- Clear bit i: n & ~(1 << i)
- Count set bits (popcount): Integer.bitCount(n). Or: n &= (n-1) repeatedly (clears lowest set bit) until n=0, count iterations
- Check power of 2: n > 0 && (n & (n-1)) == 0. Power of 2 has exactly one set bit INTERVIEW
- Two numbers missing/appearing: use XOR to cancel duplicates, then split into two groups
- Bitmask DP: represent subset as integer. dp[mask] = result for subset represented by mask. For n ≤ 20
- Purpose: range queries (sum, min, max) + point updates in O(log n). Better than prefix sum when array is mutable
- Structure: binary tree. Leaf = array element. Internal node = aggregate of range. Size = 4×n array
- Build: O(n). Query: O(log n). Update: O(log n)
- Lazy propagation: range updates (add val to all elements in range) in O(log n). Push down pending updates when needed
- When to use: range sum/min/max with point updates, range updates
- Purpose: prefix sum queries + point updates in O(log n). Simpler than segment tree, less memory
- Key operation: i & (-i) = lowest set bit of i. Used to navigate the tree
- Update: add to BIT[i], then BIT[i + (i & -i)], repeat
- Query prefix sum [1..i]: sum BIT[i] + BIT[i - (i & -i)] + ..., repeat
- Count inversions: process elements right to left, query prefix sum of smaller elements, update
- Purpose: efficiently track connected components. Support union (merge two sets) and find (which set does element belong to?) INTERVIEW
- Find with path compression: on find, make all nodes on path point directly to root. Nearly O(1) amortized
- Union by rank: attach smaller tree under larger tree. Keeps tree flat. O(α(n)) amortized per operation (inverse Ackermann — practically constant)
- Cycle detection: before union(u,v), if find(u) == find(v): adding this edge creates cycle
- Connected components count: start with n components, decrement on each union. Track with components variable
- Number of Islands (Union-Find approach): union adjacent land cells. Count distinct sets
- Accounts Merge: union emails belonging to same account (same person). Group by representative
- Redundant Connection: find edge that creates first cycle using union-find. If both endpoints already in same component: redundant
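Union-Find with both optimizations from the bullets above — path compression and union by rank — as a minimal sketch (class name illustrative; union returning false doubles as the cycle check):

```java
// Union-Find (Disjoint Set Union): near-O(1) amortized find/union.
class UnionFind {
    int[] parent, rank;
    int components;

    UnionFind(int n) {
        parent = new int[n];
        rank = new int[n];
        components = n;
        for (int i = 0; i < n; i++) parent[i] = i; // each node is its own set
    }

    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]); // path compression
        return parent[x];
    }

    boolean union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false; // already connected: this edge would form a cycle
        if (rank[ra] < rank[rb]) { int t = ra; ra = rb; rb = t; } // ra = taller tree
        parent[rb] = ra;            // attach smaller tree under larger (union by rank)
        if (rank[ra] == rank[rb]) rank[ra]++;
        components--;
        return true;
    }
}
```

Redundant Connection is then one loop: the first edge for which union returns false.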
- AVL Tree: self-balancing BST. Balance factor = |height(left) - height(right)| ≤ 1. Rotations (LL, LR, RL, RR) maintain balance. O(log n) all ops. Stricter balance than Red-Black INTERVIEW
- Red-Black Tree: 5 properties ensure O(log n) height. Fewer rotations than AVL (faster insert/delete). Java's TreeMap/TreeSet use Red-Black tree
- B-Tree: generalized BST where each node can have many keys and children. Minimizes disk I/O. Used in databases and filesystems. PostgreSQL B-tree index
- B+ Tree: all data in leaves, internal nodes only have keys. Linked list of leaves for range scans. Most DB indexes use B+ trees INTERVIEW
- Skip List: layered linked list with O(log n) expected search. Used in Redis sorted sets. Simpler to implement than balanced BST
- When asked about TreeMap: O(log n) get/put/remove. floorKey(k), ceilingKey(k), headMap(k), tailMap(k) for range operations
- In practice: you won't implement these in interviews. Know: what they guarantee, what Java uses internally, when to use TreeMap vs HashMap
- Sorted array / Search for value: Binary Search
- Contiguous subarray / substring: Sliding Window or Prefix Sum
- Pair in sorted array (sum, difference): Two Pointers
- Top K / Kth largest/smallest: Heap (PriorityQueue) or Quick Select
- Tree — order/path/LCA: DFS recursion or BFS level order
- Shortest path in unweighted graph: BFS
- Shortest path in weighted graph: Dijkstra (positive weights) or Bellman-Ford (negative)
- Ordering with dependencies: Topological Sort
- Connected components / cycle detection: Union-Find or DFS
- Count/enumerate all combinations/permutations: Backtracking
- Optimize count/sum with overlapping subproblems: Dynamic Programming
- Nearest larger/smaller element: Monotonic Stack
- Interval problems (merge/overlap): Sort by start time + greedy/heap
- Prefix / autocomplete / word dict: Trie
- Frequent/duplicate items across stream: HashMap frequency count
- Parentheses / expression evaluation: Stack
- O(n log n) expected: sorting involved, or heap, or binary search combined with linear scan
- O(1) space constraint: two pointers, in-place algorithms, no recursion (or tail-recursive)
- n ≤ 20: bitmask DP or backtracking O(2ⁿ)
- n ≤ 1000: O(n²) DP acceptable
- n ≤ 10⁵: O(n log n) required
- n ≤ 10⁶: O(n) or O(n log n) max
- Euclidean GCD: gcd(a, b) = gcd(b, a%b) until b=0. O(log min(a,b)). LCM = a×b/gcd(a,b) — compute as a/gcd(a,b)×b to avoid overflow
- Sieve of Eratosthenes: find all primes up to n in O(n log log n). Mark multiples of each prime as composite
- Modular arithmetic: (a+b)%m = ((a%m)+(b%m))%m and (a*b)%m = ((a%m)*(b%m))%m. Important for large number problems
- Fast Power (Binary Exponentiation): pow(base, exp, mod) in O(log exp). Square base, halve exp. Handle odd exp separately
- Overflow prevention: use long in Java. When multiplying two ints: cast to long first
- Factorial and combinations modulo prime: precompute factorials and modular inverses using Fermat's little theorem
- Base conversion: n to base b: repeatedly divide, collect remainders in reverse
- Integer square root: binary search or Newton's method. Check mid*mid <= n
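The number-theory bullets above can be sketched in one utility class (class and method names are mine; the algorithms are the standard ones named in the bullets):

```java
public class MathUtils {
    // Euclidean GCD: gcd(a, b) = gcd(b, a % b) until b == 0. O(log min(a, b))
    static long gcd(long a, long b) {
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a;
    }

    // LCM: divide before multiplying to reduce overflow risk
    static long lcm(long a, long b) {
        return a / gcd(a, b) * b;
    }

    // Binary exponentiation: O(log exp). Square the base, halve the exponent
    static long power(long base, long exp, long mod) {
        long result = 1 % mod;
        base %= mod;
        while (exp > 0) {
            if ((exp & 1) == 1) result = result * base % mod; // odd exp
            base = base * base % mod;
            exp >>= 1;
        }
        return result;
    }

    // Sieve of Eratosthenes: composite[i] == false means i is prime (for i >= 2)
    static boolean[] sieve(int n) {
        boolean[] composite = new boolean[n + 1];
        for (int p = 2; (long) p * p <= n; p++)
            if (!composite[p])
                for (int m = p * p; m <= n; m += p) composite[m] = true;
        return composite;
    }
}
```

For example, `gcd(12, 18)` is 6, `lcm(4, 6)` is 12, and `power(2, 10, 1_000_000_007L)` is 1024.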
- Naive pattern matching: O(n×m). Compare pattern at each position in text
- KMP (Knuth-Morris-Pratt): O(n+m). Precompute failure function (LPS array) for pattern. On mismatch, don't restart from beginning of pattern INTERVIEW
- LPS (Longest Proper Prefix which is also Suffix): lps[i] = length of longest prefix of pattern[0..i] that is also a suffix
- Rabin-Karp (Rolling Hash): hash the pattern, slide window over text comparing hashes. O(n+m) average. Handle hash collisions with exact comparison
- Rolling hash formula: hash = (hash - text[i] × base^(m-1)) × base + text[i+m], all taken mod a large prime. O(1) per slide
- Manacher's Algorithm: find all palindromic substrings in O(n). Use previously computed palindromes to expand
- In practice: Java has str.indexOf(pattern) — use it in interviews. Implement KMP only when explicitly asked
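When KMP is explicitly asked for, it reduces to the two routines described above: building the LPS array, then scanning the text without ever moving backwards. A minimal sketch (class name `KMP` is mine):

```java
public class KMP {
    // Build LPS array: lps[i] = length of the longest proper prefix of
    // pattern[0..i] that is also a suffix of pattern[0..i]
    static int[] buildLps(String p) {
        int[] lps = new int[p.length()];
        int len = 0;
        for (int i = 1; i < p.length(); ) {
            if (p.charAt(i) == p.charAt(len)) {
                lps[i++] = ++len;
            } else if (len > 0) {
                len = lps[len - 1]; // fall back, don't restart from 0
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    // Index of the first occurrence of pattern in text, or -1. O(n + m)
    static int search(String text, String pattern) {
        if (pattern.isEmpty()) return 0;
        int[] lps = buildLps(pattern);
        int j = 0; // number of pattern chars matched so far
        for (int i = 0; i < text.length(); i++) {
            while (j > 0 && text.charAt(i) != pattern.charAt(j)) j = lps[j - 1];
            if (text.charAt(i) == pattern.charAt(j)) j++;
            if (j == pattern.length()) return i - j + 1;
        }
        return -1;
    }
}
```

`search("ababcabcabababd", "ababd")` returns 10 — on each mismatch, `j` falls back via the LPS array instead of restarting at the beginning of the pattern.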
- Monotonic stack: stack where elements are in monotone order (increasing or decreasing). Pop when invariant violated INTERVIEW
- Next Greater Element: maintain decreasing stack. When element is larger than top: top's NGE is current element. Pop and record. O(n)
- Daily Temperatures: same pattern. Stack stores indices. When warmer day found: pop indices with cooler temperature, compute days waited
- Largest Rectangle in Histogram: stack stores indices of bars in increasing height order. When a bar is shorter than the top: pop, compute area with the popped bar as height. Width = current index - stack.peek() - 1 (the whole prefix if the stack is empty). O(n)
- Maximal Rectangle (matrix): convert to histogram row by row, apply histogram algorithm on each row
- Monotonic deque (sliding window max): maintain decreasing deque. Front = maximum of current window. Remove expired elements from front, smaller elements from back
- Sum of Subarray Minimums: for each element, find number of subarrays where it's minimum using monotonic stack. Multiply contribution
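The Daily Temperatures pattern above, as a minimal sketch (class and method names are mine):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MonotonicStack {
    // Daily Temperatures (LC 739): answer[i] = days until a warmer temperature.
    // Stack holds indices whose temperatures form a decreasing run. O(n)
    static int[] daysUntilWarmer(int[] temps) {
        int[] answer = new int[temps.length];
        Deque<Integer> stack = new ArrayDeque<>(); // indices, temps decreasing
        for (int i = 0; i < temps.length; i++) {
            // current day is warmer: it resolves every cooler day on the stack
            while (!stack.isEmpty() && temps[stack.peek()] < temps[i]) {
                int cooler = stack.pop();
                answer[cooler] = i - cooler;
            }
            stack.push(i);
        }
        return answer; // days with no warmer future day keep the default 0
    }
}
```

For `[73, 74, 75, 71, 69, 72, 76, 73]` this yields `[1, 1, 4, 2, 1, 1, 0, 0]`. Next Greater Element is the same loop, recording `temps[i]` instead of the index distance.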
- Merge Intervals: sort by start time. Merge if next.start <= prev.end: update end = max(prev.end, next.end). O(n log n) INTERVIEW
- Insert Interval: find where new interval fits. Merge all overlapping. Three phases: add non-overlapping left, merge overlapping, add non-overlapping right
- Two intervals overlap when: a.start <= b.end AND b.start <= a.end. Non-overlapping: a.end < b.start OR b.end < a.start
- Sweep line: add events (start: +1, end: -1). Sort events by time. Track running sum. Useful for "max overlapping at any point"
- Meeting rooms I: sort by start. If any consecutive pair overlaps: can't attend all. Return false
- Meeting rooms II: min rooms = max simultaneous overlapping meetings. Sweep line or min-heap of end times
- Employee free time: merge all intervals across employees, find gaps between merged intervals
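The Merge Intervals recipe above (sort by start, extend while overlapping), as a minimal sketch (class name is mine):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeIntervals {
    // Merge Intervals (LC 56): sort by start, then extend the last merged
    // interval while the next one overlaps (next.start <= prev.end). O(n log n)
    static int[][] merge(int[][] intervals) {
        Arrays.sort(intervals, (a, b) -> Integer.compare(a[0], b[0]));
        List<int[]> merged = new ArrayList<>();
        for (int[] interval : intervals) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && interval[0] <= last[1]) {
                last[1] = Math.max(last[1], interval[1]); // overlap: extend end
            } else {
                merged.add(interval); // no overlap: start a new interval
            }
        }
        return merged.toArray(new int[0][]);
    }
}
```

`merge({{1,3},{2,6},{8,10},{15,18}})` yields `{{1,6},{8,10},{15,18}}` — [1,3] and [2,6] overlap and merge into [1,6].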
- Two Sum (LC 1), Best Time to Buy Stock (LC 121), Contains Duplicate (LC 217), Product of Array Except Self (LC 238), Maximum Subarray (LC 53), Maximum Product Subarray (LC 152), Find Min in Rotated Sorted Array (LC 153), Search Rotated Sorted Array (LC 33), 3Sum (LC 15), Container with Most Water (LC 11)
- Valid Anagram (LC 242), Group Anagrams (LC 49), Longest Substring Without Repeating (LC 3), Minimum Window Substring (LC 76), Valid Parentheses (LC 20), Longest Palindromic Substring (LC 5), Palindromic Substrings (LC 647)
- Max Depth (LC 104), Same Tree (LC 100), Invert Binary Tree (LC 226), Binary Tree Max Path Sum (LC 124), Serialize/Deserialize (LC 297), Subtree of Another Tree (LC 572), Construct from Preorder/Inorder (LC 105), Validate BST (LC 98), Kth Smallest in BST (LC 230), LCA of BST (LC 235)
- Climbing Stairs (LC 70), Coin Change (LC 322), Longest Increasing Subsequence (LC 300), Word Break (LC 139), Combination Sum IV (LC 377), House Robber (LC 198, 213), Decode Ways (LC 91), Unique Paths (LC 62), Longest Common Subsequence (LC 1143), Edit Distance (LC 72), Burst Balloons (LC 312)
- Clone Graph (LC 133), Course Schedule (LC 207), Number of Islands (LC 200), Rotting Oranges (LC 994), Pacific Atlantic Water Flow (LC 417), Word Ladder (LC 127), Number of Connected Components (LC 323), Graph Valid Tree (LC 261), Alien Dictionary (LC 269)
- HashMap / HashSet: database indexes (hash index), HTTP routing tables, DNS cache, compiler symbol tables, deduplication
- B+ Tree: PostgreSQL/MySQL indexes, filesystem metadata (NTFS, ext4), key-value stores (RocksDB, InnoDB)
- Heap / Priority Queue: OS scheduler (CPU task scheduling), Dijkstra in routing protocols (OSPF), event simulation, Huffman encoding, A* pathfinding
- Skip List: Redis Sorted Sets, LevelDB memtable, concurrent data structures
- Trie: autocomplete (Google search, IDE), IP routing (longest prefix match), spell checkers, T9 text prediction
- LRU Cache (HashMap + Doubly Linked List): CPU cache replacement, browser cache, Redis maxmemory eviction policy, OS page replacement
- Bloom Filter: check if item definitely NOT in set. CDN negative cache, DB query optimization, weak password check
- Consistent Hashing: distributing data across servers in DynamoDB, Cassandra, Redis Cluster, load balancers
- Queue (BFS pattern): web crawler URL frontier, task queue (RabbitMQ, SQS), BFS in recommendation systems
- Graph algorithms: Google PageRank (power iteration), social network friend suggestions (BFS), navigation (Dijkstra), dependency resolution (topological sort in package managers)
- Segment Tree / BIT: range query in databases, analytics systems, leaderboard scoring
- Suffix Array / Suffix Tree: full-text search engine indexing, bioinformatics sequence matching
"Design an LRU Cache" — HashMap (O(1) lookup) + Doubly Linked List (O(1) move to front/remove). HashMap maps key → node in DLL. DLL head = most recent, tail = least recent. On get: move to head. On put: add to head, if over capacity: remove tail. Also know: Java's LinkedHashMap with accessOrder=true is a built-in LRU.
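The LinkedHashMap shortcut mentioned above can be sketched in a few lines (class name `LruCache` is mine; `accessOrder=true` and `removeEldestEntry` are standard `java.util.LinkedHashMap` features):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache on top of LinkedHashMap: accessOrder=true moves each accessed
// entry to the end of the iteration order, so the eldest entry is the LRU one.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // third argument: accessOrder = true
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the LRU entry when over capacity
    }
}
```

With capacity 2: put(1), put(2), get(1), put(3) evicts key 2, since get(1) made key 1 the most recently used. In interviews, still be ready to explain the HashMap + doubly linked list design underneath.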
- U — Understand: repeat problem in own words. Ask clarifying questions: input constraints? Sorted? Duplicates? What to return if no solution? Edge cases?
- M — Match: identify pattern. Does it look like a known problem? Which data structure fits?
- P — Plan: talk through approach BEFORE coding. State time/space complexity upfront. Get buy-in from interviewer
- I — Implement: write clean code while narrating. Use meaningful variable names. Helper functions are fine
- R — Review: trace through with example. Catch bugs before interviewer does
- E — Evaluate: state complexity again. Discuss tradeoffs. Can it be optimized?
- Think out loud: interviewers want to see thought process. Silence is bad. Say "I'm thinking about using a HashMap because..."
- Start with brute force: always mention brute force first, then optimize. Shows you can solve it, then shows you can think deeper
- Ask before coding: "Does this approach make sense before I start implementing?"
- Handle unknowns: "I'm not immediately sure of the optimal solution, but here's what I can see..."
- Edge cases to always check: empty input, single element, all same values, negative numbers, overflow, already sorted/reversed
- Use ArrayDeque, not Stack class — Deque<Integer> stack = new ArrayDeque<>()
- Use Long for large multiplications — (long) a * b before modulo
- Integer.MAX_VALUE/MIN_VALUE for initialization. Be careful of overflow when adding 1 to MAX_VALUE
- Arrays.sort() for primitives, Collections.sort() for objects
- List.of(), Map.of() for immutable collections in tests/initialization
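Two of the tips above in a minimal sketch (class name `JavaTips` is mine):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class JavaTips {
    public static void main(String[] args) {
        // ArrayDeque as a stack, preferred over the legacy Stack class
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // 2

        // Cast to long BEFORE multiplying, so the multiply happens in long
        int a = 100_000, b = 100_000;
        long product = (long) a * b; // 10_000_000_000, would overflow as int*int
        System.out.println(product);
    }
}
```

`(long) (a * b)` would be wrong: the multiplication would already have overflowed in int before the cast.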
Week 1–4: 2 easy + 1 medium per day. Week 5–8: 1 medium + attempt 1 hard per day. Week 9–12: timed mock interviews (25 min per problem). Week 13–16: company-specific questions (Glassdoor/Leetcode company tags). Review every problem you couldn't solve within 30 minutes.