And indeed this is a form of the knapsack problem. From the Wikipedia article on the same:
The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items.

This is the sort of problem for which exact answers generally cannot be found in a reasonable amount of time (although this may be possible for smaller instances). There are numerous inexact ways to attack such a problem. One of them is the family of so-called genetic algorithms. Again, from the Wikipedia article:
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection.

In other words, one creates a cruel, merciless struggle for dominance in the digital world, culling the solutions that can't compete effectively while letting those that can thrive and multiply. A very pleasant way of doing things, no doubt. (A less brutal approach to evolving solutions to problems is particle swarm optimization, in which no one ever dies. However, I have not used it because, in its basic form, it is not as well suited to combinatorial problems as GAs are.)
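In skeleton form, the struggle looks something like the following. This is a minimal, self-contained sketch (not the DEAP code used below, and all names and parameters here are illustrative): it evolves bitstrings toward all ones, the classic "OneMax" toy problem, using tournament selection, one-point crossover and bit-flip mutation.

```python
import random

# Toy GA for OneMax: maximize the number of ones in a bitstring.
# Illustrative parameters only; not the code used in this post.
random.seed(0)

LENGTH, POP_SIZE, NGEN = 20, 50, 40

def fitness(ind):
    return sum(ind)  # number of ones in the bitstring

def tournament(pop, k=3):
    # Pick k individuals at random; the fittest wins.
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    # One-point crossover: splice a prefix of one parent onto a suffix
    # of the other.
    point = random.randrange(1, LENGTH)
    return a[:point] + b[point:]

def mutate(ind, indpb=0.05):
    # Flip each bit independently with probability indpb.
    return [bit ^ 1 if random.random() < indpb else bit for bit in ind]

pop = [[random.randint(0, 1) for _ in range(LENGTH)]
       for _ in range(POP_SIZE)]
for _ in range(NGEN):
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(POP_SIZE)]

best = max(pop, key=fitness)
print(fitness(best))  # close to LENGTH after a few dozen generations
```

Every real GA library, DEAP included, is an elaboration of this loop: better selection schemes, more crossover and mutation operators, and bookkeeping such as halls of fame.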
I used the Python library Distributed Evolutionary Algorithms in Python, or DEAP for short, to take on this problem. I don't want to be too verbose about what I did, and DEAP has pretty good documentation (even if it is written by Québécois authors less than fully comfortable with English), but I will explain a few of the particulars of this problem instance. An important assumption here, not shared by all instances of the knapsack problem, is that more than one of the same item can be ordered. As a consequence, I evolved candidate solutions consisting of lists of integers as long as the appetizer menu, where each integer ranged from zero to seven, seven being the maximum number of the cheapest item on the list (mixed fruit) that can be bought within the budget allotted. (Seven orders of mixed fruit come to exactly $15.05, something no doubt intended by Randall Munroe.) Additionally, when evaluating the fitness of candidate solutions, I am assuming the limit of $15.05 absolutely cannot be exceeded, and so "punish" solutions that go even a little over this limit by putting excess spending first among the two criteria by which candidates are evaluated. (When doing multi-objective optimization with DEAP, it is very important (!!!) to put the objectives in the order of their importance, because fitness values are compared in a lexicographic fashion. The weights assigned to them barely count for anything beyond their signs.)
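To see why the ordering matters so much, note that DEAP multiplies each objective by its weight and then, as I understand its fitness comparison, compares the resulting tuples the way Python compares any tuples: element by element from the left. A tiny illustration with plain tuples, no DEAP required:

```python
# DEAP-style weighted fitness comparison, mimicked with plain tuples.
# With weights (-1, 1), overspending comes first and dominates.
weights = (-1, 1)

def weighted(values):
    # Multiply each objective by its weight, as DEAP does internally.
    return tuple(w * v for w, v in zip(weights, values))

# Candidate A: no overspending, modest total price.
# Candidate B: overspends by one cent, huge total price.
a = weighted((0, 10.00))    # -> (0, 10.0)
b = weighted((0.01, 99.0))  # -> (-0.01, 99.0)

# A wins despite its much smaller second objective, because tuples are
# compared lexicographically and 0 > -0.01 decides the contest.
print(a > b)  # True
```

This is why a candidate that overspends by a single cent loses to any candidate that stays within budget, no matter how much the latter leaves on the table.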
import json
import random

from deap import algorithms
from deap import base
from deap import creator
from deap import tools


class Evaluator(object):
    def __init__(self, items, goal):
        self.items = items
        self.goal = goal

    def __call__(self, candidate):
        total_price = 0
        for i in range(len(candidate)):
            total_price += candidate[i] * self.items[i]['price']
        if total_price > self.goal:
            overspending = total_price - self.goal
        else:
            overspending = 0
        return (overspending, total_price)


with open('items.json') as f:
    items = json.load(f)
with open('goal.json') as f:
    goal = json.load(f)

evaluator = Evaluator(items, goal)
upper_limit = 7  # seven of the cheapest item hit the budget exactly

# Objectives are decreasing overspending and increasing total price, in
# that order
creator.create('FitnessMulti', base.Fitness, weights=(-1, 1))
creator.create('Individual', list, fitness=creator.FitnessMulti)

toolbox = base.Toolbox()
toolbox.register('random_integer', lambda: random.randint(0, upper_limit))
toolbox.register('individual', tools.initRepeat, creator.Individual,
                 toolbox.random_integer, n=len(items))
toolbox.register('population', tools.initRepeat, list, toolbox.individual)
toolbox.register('evaluate', evaluator)
toolbox.register('mate', tools.cxTwoPoint)
toolbox.register('mutate', tools.mutUniformInt, low=0, up=upper_limit,
                 indpb=0.05)
toolbox.register('select', tools.selTournament, tournsize=3)

# random.seed(42)
NGEN = 1000
CXPB = 0.5
MUTPB = 0.2

pop = toolbox.population(n=100)
hof = tools.ParetoFront()
pop, log = algorithms.eaSimple(pop, toolbox, cxpb=CXPB, mutpb=MUTPB,
                               ngen=NGEN, halloffame=hof, verbose=True)

total_price = 0
best_solution = hof[0]
print('Items ordered:')
for i in range(len(best_solution)):
    if best_solution[i] != 0:
        print('%dx %s' % (best_solution[i], items[i]['name']))
        total_price += best_solution[i] * items[i]['price']
print()
print('Total price: $%.2f' % total_price)
These are the source files on Pastebin:
How did it do? Here are some solutions offered by the process:
- 3x mixed fruit, 1x French fries, 1x sampler plate; total price: $15.00
- 1x side salad, 2x sampler plate; total price: $14.95
- 1x mixed fruit, 2x hot wings, 1x sampler plate; total price: $15.05
- 3x French fries, 2x side salad; total price: $14.95
- 2x mixed fruit, 3x hot wings; total price: $14.95
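Because this search space is small, the GA's results can be checked by exhaustive enumeration. Here is a sketch using the menu prices from the comic, expressed in cents to sidestep floating-point rounding (the item names and the structure of the loop are mine, not from the script above):

```python
from itertools import product

# Brute-force check of the appetizer problem. Prices are the xkcd menu,
# in cents; the budget is $15.05.
prices = {'mixed fruit': 215, 'french fries': 275, 'side salad': 335,
          'hot wings': 355, 'mozzarella sticks': 420,
          'sampler plate': 580}
goal = 1505

best_total, best_orders = 0, []
# At most seven of the cheapest item fit the budget, so 0..7 of each
# item covers every feasible order: 8**6 = 262,144 combinations.
for counts in product(range(8), repeat=len(prices)):
    total = sum(c * p for c, p in zip(counts, prices.values()))
    if total > goal:
        continue
    if total > best_total:
        best_total, best_orders = total, [counts]
    elif total == best_total:
        best_orders.append(counts)

print(best_total)        # 1505: exact change is achievable
print(len(best_orders))  # 2: seven mixed fruit, or one mixed fruit
                         # plus two hot wings plus one sampler plate
```

The enumeration confirms the comic's premise: exactly two orders hit $15.05 on the nose, and the GA's $15.05 solution above is one of them.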
But how about something a little more difficult? I don't know exactly how many appetizer orders at or under the limit of $15.05 are possible. There might be a way of counting them with dynamic programming or something else that isn't immediately obvious to me. But given that at most seven of the cheapest item can be bought, an easy upper bound for the number of possible orders is \(8^6\): anywhere from zero to seven inclusive of each appetizer, raised to the power of six, the number of appetizers on the list. This comes to 262,144, which is also the number of candidates the genetic algorithm described above can, in principle, entertain (though it will quickly reject many of these possibilities). The number is "only" on the order of a few hundred thousand. What about a bigger search space? One John Burkhardt, formerly of Florida State University, has uploaded a number of datasets fit for use in various computational problems, among them a few for the binary knapsack problem. There, it is not possible to order as many of each item as the spending limit allows: an item is either included or it is not. But thanks to the power of exponentiation, such problems can still have very large search spaces. Since we're doing well today, let's consider the last dataset on the list, which has 24 items. Its search space contains \(2^{24}\), or well over 16 million, possible candidates. The candidate solutions will have to be lists of Boolean values in this case, with operators appropriate to their kind. With that said, here's how it goes:
import json
import random

from deap import algorithms
from deap import base
from deap import creator
from deap import tools


class Evaluator(object):
    def __init__(self, items, capacity):
        self.items = items
        self.capacity = capacity

    def __call__(self, candidate):
        total_weight = 0
        total_profit = 0
        for i in range(len(candidate)):
            if candidate[i]:
                total_weight += self.items[i]['weight']
                total_profit += self.items[i]['profit']
        if total_weight > self.capacity:
            overload = total_weight - self.capacity
        else:
            overload = 0
        return (overload, total_profit)


with open('items.json') as f:
    items = json.load(f)
with open('capacity.json') as f:
    capacity = json.load(f)

evaluator = Evaluator(items, capacity)

# Objectives are decreasing overload and increasing total profit, in
# that order
creator.create('FitnessMulti', base.Fitness, weights=(-1, 1))
creator.create('Individual', list, fitness=creator.FitnessMulti)

toolbox = base.Toolbox()
toolbox.register('random_bit', lambda: random.choice([False, True]))
toolbox.register('individual', tools.initRepeat, creator.Individual,
                 toolbox.random_bit, n=len(items))
toolbox.register('population', tools.initRepeat, list, toolbox.individual)
toolbox.register('evaluate', evaluator)
toolbox.register('mate', tools.cxTwoPoint)
toolbox.register('mutate', tools.mutFlipBit, indpb=0.05)
toolbox.register('select', tools.selNSGA2)

# random.seed(42)
NGEN = 500
CXPB = 0.5
MUTPB = 0.2

pop = toolbox.population(n=100)
hof = tools.ParetoFront()
pop, log = algorithms.eaSimple(pop, toolbox, cxpb=CXPB, mutpb=MUTPB,
                               ngen=NGEN, halloffame=hof, verbose=True)

total_profit = 0
total_weight = 0
best_solution = hof[0]
print('Items selected:')
for i in range(len(best_solution)):
    if best_solution[i]:
        print(i + 1)
        total_profit += items[i]['profit']
        total_weight += items[i]['weight']
print()
print('Total weight: %d' % total_weight)
print('Capacity: %d' % capacity)
print('Total profit: %d' % total_profit)
Again, the source files, slightly different to reflect the terminology of the problem description:
How did it do? Here are some runs of the solver:
- Items selected: 2, 4, 5, 6, 7, 9, 10, 11, 16, 17, 20, 22, 24; total weight: 6,396,544; total profit: 13,494,864
- Items selected: 2, 4, 5, 6, 10, 11, 13, 16, 17, 20, 22, 23, 24; total weight: 6,374,246; total profit: 13,464,598
- Items selected: 1, 2, 4, 5, 7, 9, 11, 13, 16, 17, 20, 23, 24; total weight: 6,402,050; total profit: 13,461,886
- Items selected: 1, 2, 4, 5, 6, 7, 10, 11, 13, 16, 17, 20, 22; total weight: 6,397,001; total profit: 13,524,340
- Items selected: 1, 2, 4, 5, 7, 10, 11, 13, 16, 22, 23, 24; total weight: 6,392,842; total profit: 13,521,334
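One cheap way to sanity-check numbers like these is a greedy heuristic: sort the items by profit per unit of weight and take each one that still fits. It is not optimal in general, but it gives a quick baseline to compare the GA's output against. A sketch with made-up items (the names and numbers here are invented for illustration, not taken from the dataset above):

```python
# Greedy baseline for the 0/1 knapsack: take items in decreasing
# profit-per-weight order while they still fit. Not optimal in
# general; items here are made up for illustration.

def greedy_knapsack(items, capacity):
    chosen, weight, profit = [], 0, 0
    ranked = sorted(items, key=lambda it: it['profit'] / it['weight'],
                    reverse=True)
    for it in ranked:
        if weight + it['weight'] <= capacity:
            chosen.append(it['name'])
            weight += it['weight']
            profit += it['profit']
    return chosen, weight, profit

items = [
    {'name': 'a', 'weight': 4, 'profit': 10},
    {'name': 'b', 'weight': 3, 'profit': 9},
    {'name': 'c', 'weight': 5, 'profit': 8},
    {'name': 'd', 'weight': 2, 'profit': 2},
]
print(greedy_knapsack(items, capacity=9))
# (['b', 'a', 'd'], 9, 21)
```

On this toy instance the greedy answer happens to be optimal; on harder instances it can be beaten, which is exactly the gap a GA is hoping to close.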
For some difficult computational problems, evolutionary approaches are clearly very capable. For your patience in getting through all of this, here is an even more vivid demonstration of what they can do: