n items of lengths l1, l2...
ln and a total length of t, find all subsets of items
which add up to t.
Even though the description sounds fairly simple, the problem is
surprisingly tough. In fact, it is NP-complete,
which is computer science gobbledigook for "d*mn tough, man!".
In practice, it means that for small values of n, it's easy to find
an algorithm which tests all permutations and lists all subsets in
a reasonable amount of time. However, as the size of the array of item lengths
grows, the computation time may grow exponentially, and chances are you'll
never see the day when it ends
(cf. Deep Thought).
The following simple backtracking
algorithm is of that nature: It performs reasonably well for small arrays, but will
quickly wear your patience for larger arrays. To see how the algorithm behaves,
I created a version of the code which only counts the number of found solutions;
then I ran it in CLISP, compiled and interpreted:
| Array size | Run time (compiled) | Run time (interpreted) ![]() |
|---|---|---|
| 27 | 80s | 780s |
| 20 | 0.7s | 6s |
| 25 | 18s | 210s |
| 15 | 0.05s | 0.35s |
| 10 | 0.002s | 0.02s |
| 29 | 270s | |
| 30 | 570s |
(let ((solutions 0)
flags
numbers)
(defun found-solution()
"Called whenever the algorithm has found a solution"
(let ((total 0))
(format t " ")
(dotimes (i (length numbers))
(when (aref flags i)
(incf total (aref numbers i))
(format t "~A " (aref numbers i)))
)
(format t " => ~A~%" total)
(incf solutions)))
(defun find-solutions(k target-sum callback)
"Core backtracking algorithm"
(when (zerop target-sum)
(funcall callback)
(return-from find-solutions))
(unless (= k (length numbers))
(let ((nk (aref numbers k)))
(when (>= target-sum nk)
;; try subtracting numbers[k] from target-sum
(setf (aref flags k) t)
(find-solutions (+ 1 k) (- target-sum nk) callback)
(setf (aref flags k) nil)))
;; recurse without subtracting first
(find-solutions (+ 1 k) target-sum callback)))
(defun find-subset-sum(target-sum)
"Set up and run backtracking algorithm based on 'numbers' array"
(setf flags (make-array (list (length numbers))))
(setf solutions 0)
(find-solutions 0 target-sum #'found-solution)
(format t "Found ~A different solutions.~%" solutions))
(defun subset-sum-test(size)
"Test subset sum algorithm using random numbers"
(let* ((total 0) target-sum)
;; init numbers array with random values up to 1000
(setf numbers (make-array (list size)))
(dotimes (i size)
(setf (aref numbers i) (random 1000))
(incf total (aref numbers i)))
(setf target-sum (floor (/ total 2))) ;; random target sum
(format t "Now listing all subsets which sum up to ~A:~%" target-sum)
(find-subset-sum target-sum)))
)
The core backtracking algorithm is in find-solutions. It will recursively
exhaust all subsets. When it finds a subset which adds up to target-sum,
it will call the callback function - this function can either simply
increase a solution counter, report the current solution to the user, or
store it somewhere for later retrieval.
In the test example above, the callback function is print-solution which
increments a solution counter and prints the current solution.
To test the code, run subset-sum-test, providing an array size. This function
will create an array of numbers of that size and initialize it with random
values; it will also pick a random target-sum. In a real application,
you would replace subset-sum-test with a function which gets the array
data from somewhere (for example, from a tools database as in the customer's
case), and lets the user pick a target-sum.
-- ClausBrod - 01 Mar 2006
The aforementioned customer would actually have preferred a solution in CoCreate
Drafting's macro language. However, this macro language isn't really
a full-blown programming language (even though it is perfectly adequate
for almost all customization purposes). For instance, its macros cannot return
values, the language doesn't have an array type, and the macro expansion stack
(i.e. the depth of the macro call tree) has a fixed limit - which pretty much
rules out non-trivial amounts of recursion.
While I was considering my options, I also fooled around with VBA,
which resulted in the code presented below. I'm not at all
proficient in VBA, so I'm sure the implementation is lacking, but
anyway - maybe someone out there finds it useful nonetheless
Dim solutions As Long
Dim flags() As Boolean
Dim numbers() As Long
Sub findSolutions(k As Long, targetSum As Long)
If targetSum = 0 Then
' we found a solution
solutions = solutions + 1
Exit Sub
End If
If k <= UBound(numbers) Then
If (targetSum >= numbers(k)) Then
flags(k) = True
' try first by subtracting numbers[k] from targetSum
Call findSolutions(k + 1, targetSum - numbers(k))
flags(k) = False
End If
' now try without subtracting
Call findSolutions(k + 1, targetSum)
End If
End Sub
Sub subsetsum()
Dim targetSum As Long
Dim i As Long
Dim arraySize As Long
arraySize = 25
ReDim numbers(0 To arraySize - 1)
ReDim flags(0 To arraySize - 1)
' initialize numbers array with random entries
Randomize
For i = 0 To arraySize - 1
numbers(i) = Int(1000 * Rnd + 1)
flags(i) = False
targetSum = targetSum + numbers(i)
Next
targetSum = Int(targetSum / 2)
solutions = 0
Call findSolutions(0, targetSum)
MsgBox "Found " + Str(solutions) + " solutions."
End Sub
Let's see - we recurse one level for each entry in the array, so with a maximum
array size of 30 (the customer said he was considering a table of 20-30 values),
the recursion depth should never exceed 30. That's not a lot, in fact,
so would this still exceed the macro stack thresholds in CoCreate Drafting?
Oh, and by the way, the recursive function doesn't even try to return values,
so the lack of return values in the macro language isn't a real obstacle
in this case! I couldn't resist and just had to try to translate the algorithm
into CoCreate Drafting macro language:
{ Description: Subset sum algorithm }
{ Author: Claus Brod }
{ Language: CoCreate Drafting macros }
{ (C) Copyright 2006 Claus Brod, all rights reserved }
DEFINE Found_solution
LOCAL I
LOCAL Solution
LOCAL Total
LOCAL Nk
{ display current solution }
LET Subset_sum_solutions (Subset_sum_solutions+1)
LET I 1
LET Solution ''
LET Total 0
WHILE (I <= Subset_sum_arraysize)
IF (READ_LTAB 'Flags' I 1)
LET Nk (READ_LTAB 'Numbers' I 1)
LET Total (Total + Nk)
LET Solution (Solution + ' ' + STR(Nk))
END_IF
LET I (I+1)
END_WHILE
DISPLAY_NO_WAIT (Solution + ' sum up to ' + STR(Total))
END_DEFINE
DEFINE Find_solutions
PARAMETER K
PARAMETER Target_sum
LOCAL Nk
IF (Target_sum = 0)
{ we found a solution, display it }
Found_solution
ELSE_IF (K <= Subset_sum_arraysize)
LET Nk (READ_LTAB 'Numbers' K 1)
{ The following optimization only works if we can assume a sorted array }
IF ((Nk * (Subset_sum_arraysize-K+1)) >= Target_sum)
IF (Target_sum >= Nk)
{ try first by subtracting Numbers[k] from Target }
WRITE_LTAB 'Flags' K 1 1
Find_solutions (K+1) (Target_sum-Nk)
WRITE_LTAB 'Flags' K 1 0
END_IF
{ now try without subtracting }
Find_solutions (K+1) Target_sum
END_IF
END_IF
END_DEFINE
DEFINE Subset_sum
PARAMETER Subset_sum_arraysize
LOCAL Target
LOCAL Random
LOCAL I
LOCAL Subset_sum_solutions
LOCAL Start_time
{ Allocate Numbers and Flags arrays }
CREATE_LTAB Subset_sum_arraysize 1 'Numbers'
CREATE_LTAB Subset_sum_arraysize 1 'Flags'
LET Target 0
LET I 1
WHILE (I <= Subset_sum_arraysize)
LET Random (INT(1000 * RND + 1))
LET Target (Target + Random)
WRITE_LTAB 'Numbers' I 1 Random
WRITE_LTAB 'Flags' I 1 0
LET I (I+1)
END_WHILE
LET Target (INT (Target/2))
DISPLAY ('Array size is ' + STR(Subset_sum_arraysize) + ', target sum is ' + STR(Target))
{ Sorting in reverse order speeds up the recursion }
SORT_LTAB 'Numbers' REVERSE_SORT 1 CONFIRM
LET Start_time (TIME)
LET Subset_sum_solutions 0
Find_solutions 1 Target
DISPLAY ('Found ' + STR(Subset_sum_solutions) + ' solutions in ' + STR(TIME-Start_time) + ' seconds.')
END_DEFINE
Because CoCreate Drafting's macro language doesn't have arrays, they have to be emulated
using logical tables, or ltabs. The solutions which are found are displayed
in CoCreate Drafting's prompt line, which is certainly not the most ideal place, but it's
sufficient to verify the algorithm is actually doing anything .-)
What you see above, is a tuned version of the macro code I came up with initially.
The "literal" translation from the original Lisp version took ages to
complete even for small arrays; for instance, searching for solutions in an array
of 20 numbers took 95 seconds (using OSDD 2005). However, there are two
simple optimizations which can be applied to the algorithm.
First, the input array can be sorted in reverse order, i.e.
largest numbers first. This makes it more likely that we
can prune the recursion tree early. This optimization
itself improved runtimes by only 5% or so, but more importantly,
it paved the way for another optimization.
Since we know that the numbers in the array are
monotonically decreasing, we can now predict in many cases that
there is no chance of possibly reaching the target sum anyway, and
therefore abort the recursion early. Example for an array of size 20:
Target_sum has already been
reduced to 500 earlier in the recursion, i.e. the remaining
entries in the array somehow have to sum up to 500 to meet
the required subset sum.
Numbers array will in fact contain floating-point
values rather than integers. The algorithm doesn't change, but
whenever you work with floating-point values, it's good to follow
a few basic guidelines like the ones outlined here.
In the case of the above macro code, instead of a comparison like
IF (Target = 0), you'd probably want to write something
like IF (ABS(Target) < Epsilon) where Epsilon is a small value
chosen to meet a user-defined tolerance (for example 0.001).
-- ClausBrod - 16 Mar 2006
This is taking me to places I didn't anticipate.
Over at codecomments.com,
they are discussing solutions to the subset sum problem in Haskell,
Prolog and Caml, if anyone is interested. (I sure was, and even read
a tutorial on Haskell to help me understand what these guys are
talking about
)
-- ClausBrod - 01 Apr 2006
I started to learn some Ruby, so here's a naïve implementation of
the algorithm in yet another language
(See also this blog entry.)
$solutions = 0
$numbers = []
$flags = []
def find_solutions(k, target_sum)
if target_sum == 0
# found a solution!
(0..$numbers.length).each { |i| if ($flags[i]) then print $numbers[i], " "; end }
print "\n"
$solutions = $solutions + 1
else
if k < $numbers.length
if target_sum >= $numbers[k]
$flags[k] = true
find_solutions k+1, target_sum-$numbers[k]
$flags[k] = false
end
find_solutions k+1, target_sum
end
end
end
def find_subset_sum(target_sum)
print "\nNow listing all subsets which sum up to ", target_sum, ":\n"
$solutions = 0
(0..$numbers.length()).each { |i| $flags[i] = false }
find_solutions 0, target_sum
print "Found ", $solutions, " different solutions.\n"
end
def subset_sum_test(size)
total = 0
target_sum = 0
(0..size).each { |i| $numbers[i] = rand(1000); total += $numbers[i]; print $numbers[i], " " }
target_sum = total/2
find_subset_sum target_sum
end
subset_sum_test 25
-- ClausBrod - 17 Apr 2006
The other day, I experimented with Python and thought I'd start with a quasi-verbatim
translation of the subset sum code. Here is the result. Apologies for the non-idiomatic
and naïve implementation.
import random import array import sys numbers = array.array('i') flags = array.array('c') solutions = 0 def find_solutions(k, target_sum): global solutions if target_sum == 0: print " Solution:", for i in range(0, len(numbers)): if flags[i] != 0: print numbers[i], print solutions = solutions + 1 else: if k < len(numbers): if (numbers[k] * (len(numbers)-k+1)) >= target_sum: if target_sum >= numbers[k]: flags[k] = 1 find_solutions(k+1, target_sum - numbers[k]) flags[k] = 0 find_solutions(k+1, target_sum) def find_subset_sum(target_sum): global solutions global flags print "Subsets which sum up to %s:" % target_sum flags = [0] * len(numbers) find_solutions(0, target_sum) print "Found", solutions, "different solutions" def subset_sum_test(size): global numbers total = 0 print "Random values:\n ", for i in range(0, size): numbers.append(random.randint(0, 1000)) total = total + numbers[i] print numbers[i], print numbers = sorted(numbers, reverse = True) target_sum = total/2 find_subset_sum(target_sum) subset_sum_test(15 if len(sys.argv) < 2 else int(sys.argv[1]))See also A Subset Of Python. -- ClausBrod - 30 Jun 2013 And here is an implementation in C#:
using System; namespace SubsetSum { class SubsetSum { private int[] numbers; private bool[] flags; private int findSolutions(int k, int targetSum, int solutions=0) { if (targetSum == 0) { Console.Write(" Solution: "); for (int i=0; i<numbers.Length; i++) { if (flags[i]) { Console.Write("{0} ", numbers[i]); } } Console.WriteLine(); solutions++; } else { if (k < numbers.Length) { if ((numbers[k] * (numbers.Length - k + 1)) >= targetSum) { if (targetSum >= numbers[k]) { flags[k] = true; solutions = findSolutions(k + 1, targetSum - numbers[k], solutions); flags[k] = false; } solutions = findSolutions(k + 1, targetSum, solutions); } } } return solutions; } public void solve() { Array.Sort(numbers, (x, y) => y - x); // sort in reverse order Array.Clear(flags, 0, flags.Length); int total = 0; Array.ForEach(numbers, (int n) => total += n); int solutions = findSolutions(0, total / 2); Console.WriteLine("Found {0} different solutions.", solutions); } SubsetSum(int size) { numbers = new int[size]; Random r = new Random(); for (int i = 0; i < size; i++) { numbers[i] = r.Next(1000); Console.Write("{0} ", numbers[i]); } Console.WriteLine(); flags = new bool[size]; } public static void Main(string[] args) { int size = args.Length > 1 ? int.Parse(args[1]) : 15; new SubsetSum(size).solve(); } } }A naïve implementation in Delphi:
program subsetsum; {$APPTYPE CONSOLE} {$R *.res} uses System.Generics.Collections, System.Generics.Defaults, System.SysUtils; type TSubsetSum = class private FNumbers: TArray<Integer>; FFlags: TArray<Boolean>; function FindSolutions( aK: Integer; aTargetSum: Integer; aSolutions: Integer = 0 ): Integer; public procedure Solve( ); constructor Create( aSize: Integer ); end; var vSize: Integer; vSubsetSum: TSubsetSum; constructor TSubsetSum.Create( aSize: Integer ); var i: Integer; begin SetLength( FNumbers, aSize ); SetLength( FFlags, aSize ); Randomize; for i := 0 to aSize - 1 do begin FNumbers[i] := Random( 1000 ); Write( FNumbers[i].ToString + ' ' ); end; writeln; end; function TSubsetSum.!FindSolutions( aK, aTargetSum, aSolutions: Integer ): Integer; begin if ( aTargetSum = 0 ) then begin write( ' Solution: ' ); for var i := 0 to Length( FNumbers ) - 1 do if FFlags[i] then write( FNumbers[i].ToString + ' ' ); writeln; inc( aSolutions ); end else begin if ( aK < Length( FNumbers ) ) then if ( ( FNumbers[aK] * ( Length( FNumbers ) - aK + 1 ) ) >= aTargetSum ) then begin if ( aTargetSum >= FNumbers[aK] ) then begin FFlags[aK] := True; aSolutions := FindSolutions( aK + 1, aTargetSum - FNumbers[aK], aSolutions ); FFlags[aK] := False; end; aSolutions := FindSolutions( aK + 1, aTargetSum, aSolutions ); end; end; Result := aSolutions; end; procedure TSubsetSum.Solve; var vTotal: Integer; vSolutions: Integer; begin TArray.Sort<Integer>( FNumbers, TComparer<Integer>.Construct( function( const Left, Right: Integer ): Integer begin Result := Right - Left; end ) ); vTotal := 0; for var vNumber: Integer in FNumbers do vTotal := vTotal + vNumber; vSolutions := FindSolutions( 0, vTotal div 2 ); writeln( Format( 'Found %d different solutions.', [vSolutions] ) ); end; begin vSize := 15; if ( ParamCount > 1 ) then vSize := ParamStr( 0 ).ToInteger; vSubsetSum := TSubsetSum.Create( vSize ); vSubsetSum.Solve( ); vSubsetSum.Free; end.-- ClausBrod - 19 Jan 2021
CoCreateModeling