Towards an Initial Understanding of Linear Programming
|
Links and notes regarding explanations of linear programming in general, the simplex algorithm and the dual simplex algorithm.
Robin Whittle rw@firstpr.com.au 2016-09-25
Brief description of Linear Programming, why it is useful for many
problems, and how it compares to other optimisation techniques which
can handle more complex non-linear relationships
Linear Programming is a mathematical
optimisation technique with many
important practical applications. An LP solver finds a set of
values for a
large number of variables in a way which maximises (or minimises) the
value of a single outcome which depends upon the values of all these
variables. The dependencies must be linear, such as the
profitability of a manufacturing operation going up by $4 a kilogram of
one ingredient and down by $2 a kilogram of some other
ingredient (two variables). The most complex part of LP is the
constraints which
limit the values of these variables in potentially very complex and
inter-dependent ways. For instance, the value of variable A
multiplied by 2.7 plus the value of variable C multiplied by -3.457
must equal 64. Constraints can involve equality, greater than or
less than relations. Each variable can have bounds, such as
minimum and maximum values, which are really just simple constraints.
Linear programming only works with some kinds of problem. Other
kinds, which involve non-linear relationships between the values of the
variables and the outcome are of great interest since they are directly
applicable to a greater range of real-world problems. However,
the computational techniques for solving these other kinds of problems
(such as mixed integer programming, in which variables can only have
integer values) are much less able to solve really large problems in
any reasonable time. A primary attraction of linear programming
is that extraordinarily large and complex LP problems can be solved by
computers in reasonable times - seconds to hours or perhaps days,
rather than weeks or millenia.
Introduction to this page
Many people without a strong
mathematical background - myself included
- need to know what Linear Program can and can't do, and have some
insights into
how an LP solver works. However, the detailed theoretical basis
of LP in general, and the inner workings of LP solvers are fully
understood only by those who devote a large part of their intensely
mathematical minds
to this arcane field. I think these people are generally not
inclined or able to write and illustrate an introduction to LP for
people (such as me) who initially know nothing of the field, and who
are not trained in the relevant branches of mathematics.
For example, the Wikipedia
page uses two diagrams and launches quickly into algebra and terms such as
convex polytope,
affine function,
vector,
matrix
etc. Interested folks in business, engineering or science who are wishing to maximise profits or
minimise average transit time in a freight delivery network don't necessarily know where to start with these
mathematical descriptions.
As part of my C++ programming work for the mining consulting company
Whittle Consulting, building on the work of my father Jeff
http://www.jeff.whittlefamily.id.au
I need to understand linear programming from the point of view of a
programmer who uses a linear programming solver - specifically
Cplex and
CLP.
This does not mean I need to know how to write an LP solver - but I
need to know quite a lot about how they work, since our software makes
very intensive use of linear programming.
I need to know the fundamentals of linear programming, and
have some idea of how the simplex algorithm works - for instance with
CLP using the dual simplex algorithm.
There are plenty of books and websites about linear programming, and the
most detailed material is necessarily written by and for people with extensive
mathematical training.
With rare exception, I think the most advanced specialists in any field
lack the motivation and/or the understanding of what other people
don't
know to give an introduction to their field which is elegant and
suitable for newcomers. Generally, I think the best
explanations come from a suitably motivated outsider to the field, with good writing
- and probably illustration - skills, who
fully appreciates what the
reader probably does not know, and crafts an elegant explanatory path
with examples.
The purpose of this page is to highlight web sites and books which I
think are
unusually helpful regarding understanding at least the basics of linear
programming, for people who have only basic mathematical training - and
not necessarily any computer programming experience. I was
inspired to do this by James Jones' explanation of LP and the simplex
method - after I discovered that no websites apparently link to this
helpful material.
Below I also have some notes on CLP.
Contents
#01jj |
01 James Jones' Linear Programming explanation
|
#02psds |
02 Primal simplex and dual simplex
|
#03clp |
03 CLP
|
#01jj
01 - James Jones' Linear Programming explanation
I found this page and the pages it
links to - and some of the videos linked to from those pages - to be a
really clear and well-illustrated explanation of the
fundamental concepts of linear programming. Give me
well-written sentences and diagrams any day in preference to equations
with Greek and other characters, if these can be avoided!
Professor Jones leads us through a graphic representation of a
maximisation LP problem, and then to a tabular format for understanding
and solving a two-variable LP problem with pen and paper, without the need to create or think in terms of diagrams.
Diagrams are fine for LP problems involving two variables. They
are challenging for problems involving three variables, and generally
impossible for depicting or helping us think about four or more
variables. LP solvers regularly deal with problems involving half
a million variables and a similar number of constraints, but two
variable examples are good enough for understanding LP principles.
He then extends the tabular concepts into an explanation of the simplex method - an
algorithm which works with any number of variables and which is one of
the algorithms used by LP solver software.
Starting with the above page, I printed, read and annotated these pages as well, which are linked to from the above page:
I had some difficulty understanding the pivot mechanism and Prof. Jones
pointed me to another page which helps, but which is not linked to
directly from the above:
In the simplex.html page, I could not understand the transformation of all the numbers from the tableau before the heading "
Pivot!"
to the numbers in the tableau following this heading. Reading pivot.pdf, I hit the
same problem with the numbers in the top right tableau.
The key to understanding this is at the bottom left of pivot.pdf, "How
to perform the actual pivot". The tableau in this section
illustrates the procedure for setting the new value of the element
which is in column z and row 2. This is initially -1 (this
tableau has the same numbers as the starting tableau at the top
left). The procedure is to take the product of the top left
element in the rectangle formed between the pivot element and the
element to be changed (row 1, column x = 1) and the value of the
element to be changed (-1) and to subtract from this the product of the
values at the other two corners of the rectangle (-2 and 3). The
red text under "New values for Row 2" explains this.
After this element has been changed, and the one for row 2 column x,
then the same game is played for the RHS column, which was initially
5. The same game is also played on the sum element for row 2 (the far
right element, equal initially to 5), though I think it could also
have its new value determined simply by summing the updated values to
the left of it.
I understand that this is the use of "row operations" in Gauss-Jordan
reduction, and I am well out of my comfort zone and into the matrix
algebra wilds beyond. My broad and tentative understanding of
these row operations and the pivot operation as a whole, when used as
part of the simplex method of solving an LP problem, is that it is a
way of
maintaining the integrity of all the information in the LP when we
require that all the elements but one in the pivot column be converted
into zeros, with the remaining element converted to 1. I
understand the pivot operation as a way of transforming the
representation of the LP problem from one tableau (one set of numbers
in this matrix) into another, which corresponds exactly to moving our
potential solution point from one intersection of two lines (for a
two-variable LP problem, as illustrated here) to another intersection,
and that the choice of pivot column and row, when made correctly,
corresponds to moving the potential solution point to the most
immediately profitable such point (in a maximisation LP).
If I really wanted to understand exactly how these "row operations"
worked, I could probably search some more pages and figure it out, but
it seems that depicting an understanding of why and how it works is
non-trivial. This is one such potential explanation, which avoids
algebra and concentrates on verbal/textual explanations and graphics:
In this case, it involves 3D animations. With more effort and an understanding of reduced row-echelon form (
https://en.wikipedia.org/wiki/Row_echelon_form) I imagine I could understand it.
So for now, the exact principles the pivot operation are
beyond my understanding, but I have a general idea of what they do and
I understand they are not contentious. (This means I have no
reason to suspect they are invalid, which is not something I always
think about scientific theories I don't clearly understand.)
In simplex.html, in the section before the heading "Repeating the
Process", we have navigated our potential solution point from (x
1 = 0, x
2
= 0) to a point on the right edge of the feasible region, and the
algorithm for choosing which point to move to was explained.
While the diagrams are vital to understanding the process, they are not
needed for implementing the choice of pivot and the pivot itself.
In the fully transformed tableau before this heading, we changed slack
variable s
3 from being basic (important) to being non-basic:
in this case to being set to 0. The statement to this effect ". .
. not cleared out, so they are non-basic" only made sense to me in the
context of the diagram in that the new potential solution point (this
is my term - we are moving it around until we can't find a point in the
x
1 and x
2 dimensions with the highest possible objective function value) is at the intersection of the line at which variable x
2 = 0 and the line at which s
3 = 0.
By setting these two variables to 0, they cannot contribute to the
three constraints (rows 1, 2 and 3). Therefore, by setting them
to zero and observing that:
- The other three variables x1, s1 and s2 all have a single '1' in their column, each for a different row, and that:
- Each of these constraint rows is an equation.
we can see that at the new location of the potential solution point,
each of these three variables must be equal to a particular item on the
RHS of those rows:
x1 (row 3) = 8
s1 (row 1) = 1
s2 (row 2) = 8
What was
not clear to me was that although we had transformed the original objective function (-40 for x
1 and -30 for x
2,
plus zeros for the three slack variables, as per the first tableau in
simplex.html) into something else (0, -10/3, 0, 0 and 43/3), in order
to find the value of the objective function at our new potential
solution point, we need to plug the above three values (x
1 = 8, s
1 = 1 and s
2 = 8) into the original objective function, for which only x
1
= 8 is relevant, since it is the only one of these three variables with
a non-zero coefficient. Its coefficient is 40, so the value at
our new potential solution point is 8 * 40 = 320.
I glanced at the example with three decision variables:
https://people.richland.edu/james/ictcm/2006/3dsimplex.html
but was happy with the understanding in principle which I gained from
the previous material, which works with two decision variables, and so
is very amenable to graphics in a two-dimensional plane.
This explanation does not go into the distinction between primal
simplex and dual simplex algorithms, or mention any of the other
algorithms for solving LP problems (such as "barrier" AKA "interior
point").
However, it gave me a good enough understanding of the simplex
algorithm for my current needs, and I think it would be difficult for
any explanation to achieve so much with fewer words and diagrams.
#02psds
02 - Primal simplex and dual simplex
The material from James Jones satisfied my needs regarding the simplex algorithm AKA "primal simplex".
"Dual" in this context means an alternative LP problem which is
mathematically derived from the original (primal) LP problem
(maximisation in this explanation) in such a way that, solving the dual
problem produces the same results as solving the primal problem, by way
of this dual problem solution providing an upper bound to the value of the objective
function in the feasible range of the primal problem. If this is done properly, then
the bound is a precise match, so it is not necessary to solve the
primal problem. I don't yet understand the motivations for
preferring to generate and solve the dual version, but I understand they
are powerful enough to make this dual simplex algorithm a widely
favored approach to solving LP problems.
However, it seems that "dual simplex" does not mean generating a dual
version of the problem and then solving it with the simplex
method. More below . . .
The transformation from a simple maximisation LP problem into its dual
form is explained in the first two pages of a chapter from book, which
is available here (Preview Chapter 5:
Duality Theory:
The choice of multiplying the two constraints by 2 and 3, in the
initial part of this explanation, is partially arbitrary. The 3 is
a minimum value and looks to me like a good choice, due to the need to
multiply up the coefficient of 1 for x
3 to at least equal
the coefficient of 3 for this variable in the objective function.
The coefficient for this variable in the first constraint is 0.
The choice of 2 seems to be higher than is really required, so the
upper bound produced by these choices is sure to exceed the true
optimised value of the objective function for the primal (original) LP
problem.
To make a dual version of the problem, which is a minimisation problem, rather than using arbitrary
choices such as 2 and 3, we create a set of variables, one for each constraint in the primal problem, and
then a set constraints, one for each decision variable in the
primal problem.
I found a 2001 edition of this book at Archive.org:
and had a quick look at the rest of Chapter 5. The values (of the
objective function) for various solutions to the primal problem (one or
more of which are optimal) forms a set of possible values which are
less than or equal to the set of upper bounds for these, which result
from the set of solutions to the dual problem (one or more of which may
be optimal). The optimal solutions for each problem produce the
same value (where the two sets touch). So if we can find a
feasible solution for one problem (we don't yet know it to be optimal)
which is the same (has the same values for all decision variables, and
so the same value for the objective function) a feasible solution to
the other problem,
then we know that both of them are optimal solutions. In other
words, the set of primal feasible solutions are in the primal feasible
area, and this area touches at one point (a single vertex) or perhaps
along a line between two vertexes in these diagrams, the feasible area
of the dual problem, which is otherwise outside the feasible region of
the primal problem. That point (or line) of touching is where the
one or more optimal solutions are found for both problems.
According to section 6 of this Chapter 5, the "Dual Simplex Method"
means "simply as a new way of picking the entering and leaving
variables in a sequence of primal dictionaries". Using the
terminology of James Jones' explanation, I understand this means
"simply a new way of choosing how to pivot using a series of tableaux
derived from the original (primal) LP problem". So I understand
the dual simplex algorithm does a series of operations on the tableaux
of the primal problem, to achieve much the same as would be achieved by
a conventional simplex solution of the actual, and therefore
specifically formulated, dual version of the problem.
This chapter describes a two phase approach to implementing a dual simplex LP solver.
It
is interesting to note how we detect infeasibility with this new Phase
I algorithm. The modified [dual]
problem is guaranteed always to be dual
feasible. It is easy to see that the primal problem is infeasible
if and only if the modified problem is dual unbounded (which the dual
simplex method will detect just as the primal simplex method detects
primal unboundedness).
CLP's dual simplex algorithm is apparently single phase. From the
help-text of CLP, the source code of which is, at least in part, here:
The
dual algorithm in Clp is a single phase algorithm as opposed to a two
phase algorithm where you first get feasible then optimal. If a
problem has both upper and lower bounds then it is trivial to get dual
feasible by setting non basic variables to correct bound. If the
gap between the upper and lower bounds of a variable is more than the
value of dualBound Clp introduces fake bounds so that it can make the
problem dual feasible. This has the same effect as a composite
objective function in the primal algorithm.
The first 200 lines of this CLP source file gives more details of CLP's dual simplex algorithm:
I only partially understand this at present. I wouldn't have been
able to be understand any of it without the clear understanding of
simplex which I gained from James Jones' explanation. My
impression from reading LP material in general, in text books and
websites, is that it would have taken me very much longer to understand
the simplex algorithm without his explanation.
#03clp
03 - CLP
CLP is an open-source LP solver,
written in C++, which began as an IBM internal project. Of the
various open-source LP solvers, it has a good reputation for
performance. A valuable review of open source LP solvers is:
There are two home pages, the second which does not link to the first:
The main documentation is at:
http://www.coin-or.org/Clp/userguide/index.html
If odd characters appear, this is due to the HTML file using
Western character encoding without specifying this in the HTML file,
and the web server now - differently from when the document was written
- being configured to send the document in an HTTP envelope (the
protocol which delivers the HTML file to the browser) specifying the
default character coding of the server's operating system, which is now
UTF-8. To fix this in the browser, use something like View >
Text encoding > Western.
The
CLP mailing list is at:
In December 2015 I wrote to the list pointing to a single file Mbox
format copy of the mailing list messages from 2002 to 2015-12-01.
Since many questions have been answered on the list, it is handy to be
able to search the whole set of messages as a text file, or via an
email client such as Thunderbird, which interprets the file as
individual emails.
Here is another file with messages from 2002 to 2016-09-25: My
2015-12-01 message explains how I filled my Thunderbird mailbox with
messages from the gz archives. With Thunderbird on Windows,
if you copy the uncompressed file, renamed to "CLP" to somewhere such
as:
C:\Users\XYZ\AppData\Roaming\Thunderbird\Profiles\wwxxyyzz.default\Mail\Local
Folders\ then you should be able to read the messages
individually. Alternatively you can rename it as a CLP.txt file
and open it with an editor, or web browser. (
Google
Chrome's text search and scroll-bar highlighting of matches is superior
to that of Firefox, but Chrome has no File > Open command, so use file:///D:/CLP.txt in the address bar.)
CLP's source code and classes are searchable at:
One body of text which I found helpful is the help text which can be
displayed on the console of the standalone CLP executable
program. This text is not available anywhere I know of as a
document. I collected all the help text I think the program can
produce and then edited it into a formatted Word file. This is
for CLP version 1.16.9, which is not the current version, but you may
still find it useful. It helped me understand some settings of
the solver (in the library). Note that the standalone program has
perturb(ation) on by default while for the library solver, it is off by default.
I think that even the most experienced programmers who are already
fully knowledgeable about linear programming would find it a challenge
to find out what they need to know from the CLP documentation and
source code.
One thing which confused me was the term "ABC", as if it were a
project, or a variant of CLP which could be enabled with appropriate
pre-processor macros. This term is all through the CLP project
and I could not understand what it meant. Eventually, I decided I
could safely ignore anything concerning "ABC" after reading this
message from John Forrest (the originator of CLP). I understand
that ABC is a variant of CLP with some operations done in parallel, on
multiple cores of the CPU(s).
If you have any questions about CLP, don't ask me! Please join
and write to the mailing list instead. You will most likely have
your question answered within a day or two by experts - and this way
many other people who read the list now and in the future will benefit
from your query.
As John Forrest wrote in the above-mentioned message:
If you need more information, do not hesitate to ask awkward questions.
Update history
- 2016-09-25 Page established.
../ to the main First Principles site.
.