< Randomized Optimization for | Contents | Experiment Support Vector Machine Classifier on iris dataset >
Randomized Optimization¶
Which RO algorithms work best depends highly on the structure of the problem. In this notebook, we will compare how two commonly used RO algs perofrm on two problems with different structures.
Note: GA's and SA's are both commonly used but there is not state of the art algorithm for all optimization problems. Good RO outcomes require understanding the structure of the problem and the algorithms strengths and weaknesses.....alternatively the structure of the problem can be inferred from the algorithm's performance :D
Randomized Optimization Algorithms on Toy Problems¶
This notebook explores the performance of different randomized optimization (RO) algorithms on two classic toy problems: the Traveling Salesman Problem (TSP) and the Knapsack Problem.
Problem Overview¶
TSP is a classic combinatorial optimization problem where the goal is to find the shortest possible route that visits each city exactly once and returns to the origin city.
The Knapsack Problem involves selecting a subset of items with given weights and values to maximize total value without exceeding a weight limit.
Problem | Type | Input | Output | Challenge |
---|---|---|---|---|
TSP | Permutation problem | Distance matrix, cities | Best route (permutation) | Finding optimal cyclic permutation |
Knapsack | Subset selection | Weights, values, max capacity | Selected items (0/1 vector) | Best value under weight constraint |
RO Algorithm Strengths and Weaknesses¶
Algorithm | Strengths | Weaknesses |
---|---|---|
Genetic Algorithm | Good for large search spaces & multimodal | Slower, parameter tuning needed |
Simulated Annealing | Simple, good global exploration | May be slow to converge |
Genetic Algorithm (GA)¶
Algorithm:
- Evolve a population of solutions using selection, crossover (recombination), and mutation.
- Keep fitter individuals more likely to survive to the next generation.
What it does:
- Searches large, complex, or multimodal spaces by mimicking natural evolution.
- Maintains diversity and explores a wide variety of solutions.
Strengths:
- Handles large, noisy, nonlinear, or discrete search spaces well.
- Good at escaping local optima due to population-based search.
- Flexible: works with both permutation (e.g. TSP) and binary/vector spaces (e.g. Knapsack).
Weaknesses:
- Slower due to population overhead.
- Requires careful tuning (population size, mutation rate, crossover type).
- Can converge prematurely if diversity is lost.
Good for:
- TSP, Knapsack, scheduling, neural architecture search, multimodal fitness landscapes.
Less good for:
- Small search spaces, problems with fast greedy solutions, when runtime is limited.
Simulated Annealing (SA)¶
Algorithm:
- Start with an initial solution.
- At each step, randomly perturb the solution.
- Accept worse solutions with a decreasing probability (based on temperature).
- Gradually "cool down" the system to settle into a minimum.
What it does:
- Emulates annealing in metallurgy — controlled cooling to minimize energy.
- Allows occasional uphill moves to escape local optima early in the process.
Strengths:
- Simple and memory-efficient.
- Effective at escaping local minima in early stages.
- No need to maintain a population.
Weaknesses:
- Sensitive to cooling schedule.
- May converge slowly or poorly if not tuned.
- No memory of past states, so can revisit bad solutions.
Good for:
- Permutation-based problems like TSP.
- Continuous or combinatorial optimization with many local minima.
Less good for:
- Highly rugged or deceptive search spaces.
- Problems with tight time/accuracy constraints.
Expectations¶
How do we expect these RO algs to perform on TSP and Knapsack?
Aspect | TSP (e.g. 10-50 cities) | Knapsack (e.g. 20-50 items) |
---|---|---|
Genetic Algorithm | Learns good tours with crossover/mutation | Good balance between value/weight |
Simulated Annealing | Can explore far and settle gradually | Can get stuck if cooling is too fast |
Runtime (small scale) | GA > SA | GA > SA |
Setup Code¶
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import random
import time
import tracemalloc
# --------------------------
# TSP Problem Setup
# --------------------------
def euclidean_distance(p1, p2):
return np.linalg.norm(np.array(p1) - np.array(p2))
def total_tsp_distance(route, coords):
return sum(
euclidean_distance(coords[route[i]], coords[route[(i + 1) % len(route)]])
for i in range(len(route))
)
def generate_tsp_coordinates(n_cities, seed=42):
np.random.seed(seed)
return np.random.rand(n_cities, 2) * 100
# --------------------------
# Knapsack Problem Setup
# --------------------------
def generate_knapsack_items(
n_items=50, weight_range=(1, 30), value_range=(10, 100), capacity_ratio=0.4, seed=42
):
np.random.seed(seed)
weights = np.random.randint(*weight_range, size=n_items)
values = np.random.randint(*value_range, size=n_items)
capacity = int(capacity_ratio * np.sum(weights))
return weights, values, capacity
def knapsack_value(selection, weights, values, capacity):
total_weight = np.dot(selection, weights)
if total_weight > capacity:
return 0 # Penalty for infeasibility
return np.dot(selection, values)
Genetic Algorithm for TSP¶
Note: GAs require you to define an initial population (set of candidate solutions...random permutations) to evolve, fitness function (eg. total distance), selection strategy, cross over function (eg. ordered for TSP), mutation function (eg. random swap of cities), etc.
def ga_tsp(coords, n_generations=300, pop_size=100, mutation_rate=0.01):
def init_population():
# initialize population with random routes (permutations of city indices)
return [np.random.permutation(len(coords)) for _ in range(pop_size)]
def crossover(p1, p2):
start, end = sorted(random.sample(range(len(p1)), 2))
child = [-1] * len(p1)
child[start:end] = p1[start:end]
pointer = 0
for gene in p2:
if gene not in child:
while child[pointer] != -1:
pointer += 1
child[pointer] = gene
return np.array(child)
def mutate(route):
if random.random() < mutation_rate:
a, b = random.sample(range(len(route)), 2)
route[a], route[b] = route[b], route[a]
population = init_population() # initial random population...routes
best_route = None
best_distance = float("inf") # init as large as possible
history = [] # to track best distance over generations
for _ in range(n_generations):
# sort population by fitness (distance)
population = sorted(population, key=lambda r: total_tsp_distance(r, coords))
# update best route if current best in population is better
if total_tsp_distance(population[0], coords) < best_distance:
best_route = population[0]
best_distance = total_tsp_distance(best_route, coords)
# track best distance over generations
history.append(best_distance)
# start next generation with top 10 routes (elitism)
next_generation = population[:10] # Elitism
# fill in the rest of the population with crossover and mutation
while len(next_generation) < pop_size:
# randomly select 2 parents from top 50 (selection pressure)
parents = random.choices(population[:50], k=2)
# create a child via crossover
child = crossover(parents[0], parents[1])
# possibly mutate the child
mutate(child)
# add the child to the next generation
next_generation.append(child)
# replace old population with the new one
population = next_generation
return best_route, best_distance, history
Simulated Annealing for TSP¶
# --------------------------
# Simulated Annealing for TSP
# --------------------------
def sa_tsp(coords, T_init=1000, alpha=0.995, n_iter=10000):
# swap 2 cities in a route
def swap(route):
a, b = random.sample(range(len(route)), 2)
route[a], route[b] = route[b], route[a]
return route
# init: start with a random route and call that the "best"
current = np.random.permutation(len(coords))
best = current.copy()
current_distance = total_tsp_distance(current, coords)
best_distance = current_distance
# set initial temperature and history (best distances over time)
T = T_init
history = [current_distance]
for _ in range(n_iter):
candidate = swap(current.copy()) # generate a neighbor by swapping two cities
candidate_distance = total_tsp_distance(
candidate, coords
) # eval neighbors distance
delta = (
candidate_distance - current_distance
) # how much better/worse is the candidate?
if delta < 0 or np.random.rand() < np.exp(
-delta / T
): # accept if better or with some probability
current = candidate
current_distance = candidate_distance
if candidate_distance < best_distance:
best = candidate
best_distance = candidate_distance
history.append(
best_distance
) # record best and gradually cool the system (low acceptance probability of worse solutions)
T *= alpha
return best, best_distance, history
Genetic Algorithm for Knapsack¶
# --------------------------
# Genetic Algorithm
# --------------------------
def ga_knapsack(
weights, values, capacity, n_generations=200, pop_size=100, mutation_rate=0.05
):
n_items = len(weights)
def init_population():
return [np.random.randint(0, 2, size=n_items) for _ in range(pop_size)]
def mutate(solution):
if random.random() < mutation_rate:
idx = random.randint(0, n_items - 1)
solution[idx] = 1 - solution[idx]
def crossover(p1, p2):
point = random.randint(1, n_items - 1)
return np.concatenate((p1[:point], p2[point:]))
population = init_population()
best_solution = None
best_value = 0
history = []
for _ in range(n_generations):
population = sorted(
population,
key=lambda x: knapsack_value(x, weights, values, capacity),
reverse=True,
)
if knapsack_value(population[0], weights, values, capacity) > best_value:
best_solution = population[0]
best_value = knapsack_value(best_solution, weights, values, capacity)
history.append(best_value)
next_generation = population[:10] # Elitism
while len(next_generation) < pop_size:
parents = random.choices(population[:50], k=2)
child = crossover(parents[0], parents[1])
mutate(child)
next_generation.append(child)
population = next_generation
return best_solution, best_value, history
Simulated Annealing for Knapsack¶
# --------------------------
# Simulated Annealing
# --------------------------
def sa_knapsack(weights, values, capacity, T_init=100, alpha=0.98, n_iter=3000):
n_items = len(weights)
def flip_bit(solution):
idx = random.randint(0, n_items - 1)
solution[idx] = 1 - solution[idx]
return solution
current = np.random.randint(0, 2, size=n_items)
best = current.copy()
current_value = knapsack_value(current, weights, values, capacity)
best_value = current_value
T = T_init
history = [current_value]
for _ in range(n_iter):
candidate = flip_bit(current.copy())
candidate_value = knapsack_value(candidate, weights, values, capacity)
delta = candidate_value - current_value
if delta > 0 or np.random.rand() < np.exp(delta / T):
current = candidate
current_value = candidate_value
if candidate_value > best_value:
best = candidate
best_value = candidate_value
history.append(best_value)
T *= alpha
return best, best_value, history
Run Experiments¶
Measure runtime, memory and best score for each algorithm on both problems.
TSP¶
# generate coordinates for TSP
coords = generate_tsp_coordinates(n_cities=25)
print("Coordinates for TSP:", coords)
# --- Measure GA performance ---
tracemalloc.start()
start_time = time.perf_counter()
ga_route, ga_dist, ga_history = ga_tsp(coords)
end_time = time.perf_counter()
ga_time = end_time - start_time
ga_mem_current, ga_mem_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(
f"GA TSP: Best Distance = {ga_dist}, Time = {ga_time:.2f}s, Memory = {ga_mem_current / 1024:.2f}KB (Peak: {ga_mem_peak / 1024:.2f}KB)"
)
# --- Measure SA performance ---
tracemalloc.start()
start_time = time.perf_counter()
sa_route, sa_dist, sa_history = sa_tsp(coords)
end_time = time.perf_counter()
sa_time = end_time - start_time
sa_mem_current, sa_mem_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(
f"SA TSP: Best Distance = {sa_dist}, Time = {sa_time:.2f}s, Memory = {sa_mem_current / 1024:.2f}KB (Peak: {sa_mem_peak / 1024:.2f}KB)"
)
# --------------------------
# Plot Convergence
# --------------------------
plt.figure(figsize=(10, 5))
plt.plot(ga_history, label="Genetic Algorithm")
plt.plot(sa_history, label="Simulated Annealing")
plt.xlabel("Iteration")
plt.ylabel("Distance")
plt.title("TSP: Convergence Comparison")
plt.legend()
plt.grid(True)
plt.show()
# Plot runtime and memory comparison
algos = ["Genetic Algorithm", "Simulated Annealing"]
runtimes = [ga_time, sa_time]
mem_current = [ga_mem_current / 1024, sa_mem_current / 1024] # in KB
mem_peak = [ga_mem_peak / 1024, sa_mem_peak / 1024] # in KB
fig, ax1 = plt.subplots(figsize=(8, 4))
# Runtime bars (left y-axis)
ax1.bar(algos, runtimes, color="skyblue")
ax1.set_ylabel("Runtime (s)", color="skyblue")
ax1.tick_params(axis="y", labelcolor="skyblue")
ax1.set_title("Algorithm Resource Comparison")
# Memory lines (right y-axis)
ax2 = ax1.twinx()
ax2.plot(algos, mem_peak, color="orange", marker="o", label="Peak Memory (KB)")
ax2.plot(
algos,
mem_current,
color="green",
marker="s",
linestyle="--",
label="Current Memory (KB)",
)
ax2.set_ylabel("Memory (KB)", color="gray")
ax2.tick_params(axis="y", labelcolor="gray")
# Combine legends from both axes
lines_1, labels_1 = ax1.get_legend_handles_labels()
lines_2, labels_2 = ax2.get_legend_handles_labels()
ax2.legend(lines_1 + lines_2, labels_1 + labels_2, loc="upper left")
plt.grid(True, axis="y", linestyle="--", alpha=0.7)
plt.tight_layout()
plt.show()
So, GA found a better solution but took longer to run.
It also used less memory in our implementation...but wait???....isn't SA is supposed to be more memory efficient? Yes, but in our implementation we store the entire history. A better implementation may cap history length or optimize swap, etc
Knapsack¶
Goal: maximize value without exceeding weight limit.
weights, values, capacity = generate_knapsack_items()
# --- Measure GA performance ---
tracemalloc.start()
start_ga = time.perf_counter()
ga_sol, ga_val, ga_hist = ga_knapsack(weights, values, capacity)
end_ga = time.perf_counter()
ga_time = end_ga - start_ga
ga_mem_current, ga_mem_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(
f"GA Knapsack: Best Value = {ga_val}, Time = {ga_time:.2f}s, Memory = {ga_mem_current / 1024:.2f}KB (Peak: {ga_mem_peak / 1024:.2f}KB)"
)
# --- Measure SA performance ---
tracemalloc.start()
start_sa = time.perf_counter()
sa_sol, sa_val, sa_hist = sa_knapsack(weights, values, capacity)
end_sa = time.perf_counter()
sa_time = end_sa - start_sa
sa_mem_current, sa_mem_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(
f"SA Knapsack: Best Value = {sa_val}, Time = {sa_time:.2f}s, Memory = {sa_mem_current / 1024:.2f}KB (Peak: {sa_mem_peak / 1024:.2f}KB)"
)
So GA found a better solution but took longer to run.
# --------------------------
# Plot Convergence
# --------------------------
plt.figure(figsize=(10, 5))
plt.plot(ga_hist, label="Genetic Algorithm")
plt.plot(sa_hist, label="Simulated Annealing")
plt.xlabel("Iteration")
plt.ylabel("Value")
plt.title("Knapsack: Convergence Comparison")
plt.legend()
plt.grid(True)
plt.show()
# --------------------------
# Combined Runtime & Memory Plot
# --------------------------
algos = ["Genetic Algorithm", "Simulated Annealing"]
runtimes = [ga_time, sa_time]
mem_current = [ga_mem_current / 1024, sa_mem_current / 1024]
mem_peak = [ga_mem_peak / 1024, sa_mem_peak / 1024]
fig, ax1 = plt.subplots(figsize=(8, 4))
ax1.bar(algos, runtimes, color="skyblue")
ax1.set_ylabel("Runtime (s)", color="skyblue")
ax1.tick_params(axis="y", labelcolor="skyblue")
ax1.set_title("Knapsack: Resource Comparison")
ax2 = ax1.twinx()
ax2.plot(algos, mem_peak, color="orange", marker="o", label="Peak Memory (KB)")
ax2.plot(
algos,
mem_current,
color="green",
marker="s",
linestyle="--",
label="Current Memory (KB)",
)
ax2.set_ylabel("Memory (KB)", color="gray")
ax2.tick_params(axis="y", labelcolor="gray")
lines_1, labels_1 = ax1.get_legend_handles_labels()
lines_2, labels_2 = ax2.get_legend_handles_labels()
ax2.legend(lines_1 + lines_2, labels_1 + labels_2, loc="upper left")
plt.grid(True, axis="y", linestyle="--", alpha=0.7)
plt.tight_layout()
plt.show()
Again, our implementation of SA stored the entire history, which is not memory efficient.
Summary: Experimental Results vs Expectations¶
Overall, our expectations aligned fairly well with results....Genetic Algorithm is slower but more robust for complex or multimodal spaces. Simulated Annealing is faster but seems more sensitive to parameter tuning and problem structure....gets stuck. We didn't expect SA to have such poor memory but that was due to our implementation choice more than a feature of the algorithm itself.
Aspect | TSP Results | Knapsack Results |
---|---|---|
Genetic Algorithm | Found a better route than Simulated Annealing, with longer runtime | Achieved a significantly higher total value than Simulated Annealing, with longer runtime |
Simulated Annealing | Produced a reasonable solution quickly but did not reach the best quality | Produced a lower-value solution quickly; less effective on the problem landscape |
Runtime | Matched expectation: Genetic Algorithm was slower than Simulated Annealing | Matched expectation: Genetic Algorithm was slower than Simulated Annealing |
Memory Usage | Simulated Annealing had higher peak memory due to route copying and history | Simulated Annealing had higher peak memory due to repeated bit flips and state storage |
Algorithm Behavior | Simulated Annealing explored well early, but Genetic Algorithm exploited better over time | Genetic Algorithm balanced exploration and exploitation more effectively; Simulated Annealing likely cooled too quickly |
Possible Improvements¶
- Tune cooling schedule in Simulated Annealing to maintain exploration longer?
- Try adaptive mutation rates in Genetic Algorithm to preserve diversity?
- Cap memory usage in Simulated Annealing by limiting history length?
< Randomized Optimization for | Contents | Experiment Support Vector Machine Classifier on iris dataset >