Lecture 27: Non-Linear Programming#
Note
Until now, we have studied linear optimisation problems in Transportation Engineering, wherein both the objective function and the constraints are linear functions of the decision variables. However, most practical applications involve a non-linear objective function or non-linear constraints, thus necessitating non-linear optimisation. In this final module of the course, we will focus on optimisation techniques that address some of the non-linear optimisation problems in Transportation Engineering.
Introduction#
A general non-linear optimisation problem with an objective function \(f\), a vector of decision variables \(\mathbf{x} = (x_1, x_2, ..., x_n) \in \mathbb{R}^n\), and a vector of inequality constraints \(\mathbf{g}\), can be expressed as follows,
Objective:
\[\min_{\mathbf{x}} \ f(\mathbf{x})\]
Subject to:
\[g_j(\mathbf{x}) \geq 0 \quad \forall \ j = 1, 2, ..., m\]
Example#
Consider a highway management firm that operates and maintains the expressway connecting Chennai with Bangalore. The firm wants to set a toll price \(p_1\) for private vehicles and \(p_2\) for commercial vehicles to collect toll revenue on this highway. However, the National Highways Authority of India (NHAI) wants to facilitate sufficient flow between Chennai and Bangalore, requiring that at least 1000 private vehicles and 1500 commercial vehicles use the expressway during the peak hour. Note, the peak-hour expressway traffic for private and commercial vehicles depends on the respective toll prices, and is given by \(Q_1(p_1)\) and \(Q_2(p_2)\), respectively. In addition, the NHAI wants to ensure sufficient tax collection, requiring each toll price to be at least ₹150. Considering these regulations, what toll prices should the firm set so as to maximise the toll revenue?
Objective:
\[\max_{p_1, p_2} \ p_1Q_1(p_1) + p_2Q_2(p_2)\]
Subject to:
\[Q_1(p_1) \geq 1000; \quad Q_2(p_2) \geq 1500; \quad p_1 \geq 150; \quad p_2 \geq 150\]
Note, an inequality constraint \(g(\mathbf{x}) \leq 0\) can be re-written in the standard form as \(-g(\mathbf{x}) \geq 0\), while an equality constraint \(g(\mathbf{x}) = 0\) can be re-written in the standard form as the pair \(g(\mathbf{x}) \geq 0\); \(-g(\mathbf{x}) \geq 0\).
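To make the toll-setting formulation concrete, here is a minimal numerical sketch using `scipy.optimize.minimize`, with the maximisation recast as minimisation of the negative revenue. The linear demand functions \(Q_1\) and \(Q_2\) below are hypothetical placeholders assumed purely for illustration; the lecture does not specify their form.

```python
# A minimal numerical sketch of the toll-pricing example using SciPy.
# The demand functions Q1 and Q2 are hypothetical (linear) placeholders,
# assumed purely for illustration.
from scipy.optimize import minimize

def Q1(p1):  # assumed peak-hour private-vehicle demand at toll p1
    return 2000.0 - 4.0 * p1

def Q2(p2):  # assumed peak-hour commercial-vehicle demand at toll p2
    return 2500.0 - 3.0 * p2

def negative_revenue(p):  # minimise the negative of toll revenue
    p1, p2 = p
    return -(p1 * Q1(p1) + p2 * Q2(p2))

constraints = [
    {"type": "ineq", "fun": lambda p: Q1(p[0]) - 1000.0},  # Q1(p1) >= 1000
    {"type": "ineq", "fun": lambda p: Q2(p[1]) - 1500.0},  # Q2(p2) >= 1500
]
bounds = [(150.0, None), (150.0, None)]  # p1 >= 150, p2 >= 150

result = minimize(negative_revenue, x0=[150.0, 150.0],
                  bounds=bounds, constraints=constraints)
print(result.x, -result.fun)  # optimal toll prices and maximum revenue
```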
Optimisation#
Unconstrained Optimisation#
Conceptualisation#
For a twice continuously differentiable function \(f\) (first and second partial derivatives of \(f\) exist and are continuous at all points) with variables \((x_1, x_2, ..., x_n) \in \mathbb{R}^n\), the optimal solution is given by solving the system of equations resulting from,
\[\nabla f(\mathbf{x}) = 0\]
Here, \(\nabla\) (Gradient Vector) renders the first derivatives of the function with respect to the variables.
The resulting optimal solution can be appropriately characterised as either a minimum or a maximum based on the solution resulting from,
\[\det(H - \lambda I) = 0\]
Where, \(H\) (Hessian Matrix) is the second derivative of the function with respect to the variables, given by,
\[H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}\]
And, \(\lambda = \begin{bmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_n \end{bmatrix}\) is the vector of eigenvalues associated with this Hessian Matrix.
Specifically, if \(\lambda_i > 0 \ \forall \ i\) then the optimal solution is a local minimum; else if \(\lambda_i < 0 \ \forall \ i\) then the optimal solution is a local maximum; else if some \(\lambda_i > 0\) while other \(\lambda_i < 0\) then the optimal solution is a saddle point; else the test is inconclusive and we need even higher-order derivatives to understand the behaviour of the function at this point.
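As a quick illustration of this recipe, here is a small SymPy sketch that solves \(\nabla f = 0\) and classifies each stationary point by the eigenvalues of the Hessian. The function used is an arbitrary illustrative choice, not from the lecture.

```python
# A small SymPy sketch of the recipe above: solve grad f = 0, then classify
# each stationary point by the eigenvalues of the Hessian matrix.
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
f = x1**3 - 3*x1 + x2**2                          # illustrative function

grad = [sp.diff(f, v) for v in (x1, x2)]          # gradient vector
H = sp.hessian(f, (x1, x2))                       # Hessian matrix

for point in sp.solve(grad, (x1, x2), dict=True): # stationary points
    eigs = list(H.subs(point).eigenvals().keys()) # eigenvalues of H at point
    if all(e > 0 for e in eigs):
        label = "local minimum"
    elif all(e < 0 for e in eigs):
        label = "local maximum"
    elif any(e > 0 for e in eigs) and any(e < 0 for e in eigs):
        label = "saddle point"
    else:
        label = "inconclusive"
    print(point, eigs, label)
```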
Example#
Optimise: \(f(x_1, x_2) = x_1^2x_2 + x_1x_2^3 - x_1x_2\)
We begin by computing the Gradient Vector and the Hessian Matrix for \(f\),
\[\nabla f = \begin{bmatrix} 2x_1x_2 + x_2^3 - x_2 \\ x_1^2 + 3x_1x_2^2 - x_1 \end{bmatrix}; \quad H = \begin{bmatrix} 2x_2 & 2x_1 + 3x_2^2 - 1 \\ 2x_1 + 3x_2^2 - 1 & 6x_1x_2 \end{bmatrix}\]
The optimal solution is given by solving the system of equations resulting from \(\nabla f = 0\), rendering,
| \(x_1\) | \(x_2\) |
|---|---|
| 0 | 0 |
| 0 | 1 |
| 0 | -1 |
| 1 | 0 |
| 2/5 | 1/\(\sqrt{5}\) |
| 2/5 | -1/\(\sqrt{5}\) |
To determine the character of these stationary points, we will evaluate the Hessian Matrix at each of them, rendering,
| \(x_1\) | \(x_2\) | \(H\) |
|---|---|---|
| 0 | 0 | \(\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}\) |
| 0 | 1 | \(\begin{bmatrix} 2 & 2 \\ 2 & 0 \end{bmatrix}\) |
| 0 | -1 | \(\begin{bmatrix} -2 & 2 \\ 2 & 0 \end{bmatrix}\) |
| 1 | 0 | \(\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\) |
| 2/5 | 1/\(\sqrt{5}\) | \(\begin{bmatrix} 2/\sqrt{5} & 2/5 \\ 2/5 & 12/(5\sqrt{5}) \end{bmatrix}\) |
| 2/5 | -1/\(\sqrt{5}\) | \(\begin{bmatrix} -2/\sqrt{5} & 2/5 \\ 2/5 & -12/(5\sqrt{5}) \end{bmatrix}\) |
Now, we shall compute the vector \(\lambda = \begin{bmatrix} \lambda_1 & \lambda_2 \end{bmatrix}\) of eigenvalues for each Hessian Matrix, such that \(\det(H - \lambda I) = 0\), rendering,
| \(x_1\) | \(x_2\) | \(\lambda\) |
|---|---|---|
| 0 | 0 | \(\begin{bmatrix} 1 & -1 \end{bmatrix}\) |
| 0 | 1 | \(\begin{bmatrix} 1 + \sqrt{5} & 1 - \sqrt{5} \end{bmatrix}\) |
| 0 | -1 | \(\begin{bmatrix} -1 + \sqrt{5} & -1 - \sqrt{5} \end{bmatrix}\) |
| 1 | 0 | \(\begin{bmatrix} 1 & -1 \end{bmatrix}\) |
| 2/5 | 1/\(\sqrt{5}\) | \(\begin{bmatrix} \dfrac{11 + \sqrt{21}}{5\sqrt{5}} & \dfrac{11 - \sqrt{21}}{5\sqrt{5}} \end{bmatrix}\) |
| 2/5 | -1/\(\sqrt{5}\) | \(\begin{bmatrix} \dfrac{-11 + \sqrt{21}}{5\sqrt{5}} & \dfrac{-11 - \sqrt{21}}{5\sqrt{5}} \end{bmatrix}\) |
Consequently, we can infer that,
| \(x_1\) | \(x_2\) | Inference |
|---|---|---|
| 0 | 0 | Saddle Point |
| 0 | 1 | Saddle Point |
| 0 | -1 | Saddle Point |
| 1 | 0 | Saddle Point |
| 2/5 | 1/\(\sqrt{5}\) | Local Minimum |
| 2/5 | -1/\(\sqrt{5}\) | Local Maximum |
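As a quick cross-check of the table above, the following short sketch evaluates the Hessian of \(f\) numerically at each stationary point and prints its eigenvalues; the classification then follows from the signs.

```python
# Numerical cross-check: eigenvalues of the Hessian of
# f(x1, x2) = x1**2*x2 + x1*x2**3 - x1*x2 at each stationary point.
import numpy as np

def hessian(x1, x2):
    # Second partial derivatives of f
    return np.array([[2*x2,               2*x1 + 3*x2**2 - 1],
                     [2*x1 + 3*x2**2 - 1, 6*x1*x2           ]])

points = [(0, 0), (0, 1), (0, -1), (1, 0),
          (2/5, 1/np.sqrt(5)), (2/5, -1/np.sqrt(5))]

for p in points:
    eigs = np.linalg.eigvalsh(hessian(*p))  # symmetric matrix -> real eigenvalues
    print(p, np.round(eigs, 3))
```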
Constrained Optimisation#
Conceptualisation#
Here, we have a general non-linear optimisation problem with an objective function \(f\), a vector of decision variables \(\mathbf{x} = (x_1, x_2, ..., x_i, ..., x_n) \in \mathbb{R}^n\), and a vector of inequality constraints \(\mathbf{g}\), expressed as,
Objective:
\[\min_{\mathbf{x}} \ f(\mathbf{x})\]
Subject to:
\[g_j(\mathbf{x}) \geq 0 \quad \forall \ j = 1, 2, ..., m\]
We shall transform this constrained optimisation problem into an unconstrained optimisation problem by introducing the constraints into the objective function using Lagrange Multipliers (\(\gamma\)) as follows,
Objective:
\[\min_{\mathbf{x}, \boldsymbol{\gamma}} \ L(\mathbf{x}, \boldsymbol{\gamma}) = f(\mathbf{x}) - \sum_{j=1}^{m} \gamma_j g_j(\mathbf{x})\]
This transformation of a constrained optimisation into an unconstrained optimisation is referred to as Lagrange Transformation. Note, the Lagrange Multipliers indicate how much the objective function would improve if a constraint were relaxed by one unit, i.e., the Shadow Price (recall Duality principles from Lecture 13).
Assuming \(L\) to be a twice continuously differentiable function with variables \((x_1, x_2, ..., x_i, ..., x_n) \in \mathbb{R}^n\), the optimal solution is given by solving the system of equations resulting from \(\nabla L = 0\).
Specifically,
\[\frac{\partial L}{\partial x_i} = 0 \quad \forall \ i; \qquad \gamma_j \frac{\partial L}{\partial \gamma_j} = 0 \quad \forall \ j\]
Rendering,
\[\frac{\partial f}{\partial x_i} - \sum_{j=1}^{m} \gamma_j \frac{\partial g_j}{\partial x_i} = 0 \quad \forall \ i; \qquad \gamma_j g_j(\mathbf{x}) = 0 \quad \forall \ j; \qquad g_j(\mathbf{x}) \geq 0, \ \gamma_j \geq 0 \quad \forall \ j\]
These equations together are referred to as Karush-Kuhn-Tucker (KKT) conditions.
The resulting optimal solution can be appropriately characterised as either a minimum or a maximum based on the solution resulting from \(\det(H - \lambda I) = 0\), where \(\lambda = \begin{bmatrix} \lambda_1 & \lambda_2 & ... & \lambda_n \end{bmatrix}\) is the vector of eigenvalues associated with the Hessian Matrix (\(H\)). Specifically, if \(\lambda_i > 0 \ \forall \ i\) then the optimal solution is a local minimum; else if \(\lambda_i < 0 \ \forall \ i\) then the optimal solution is a local maximum; else if some \(\lambda_i > 0\) while other \(\lambda_i < 0\) then the optimal solution is a saddle point; else the test is inconclusive and we need even higher-order derivatives to understand the behaviour of the function at this point.
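To see these conditions in action, here is a compact SymPy sketch that forms the Lagrangian and solves the stationarity and complementary slackness equations for a small illustrative problem (not from the lecture), then screens each candidate for primal and dual feasibility.

```python
# A compact SymPy sketch of the Lagrangian/KKT recipe above, on a small
# illustrative problem:
#   minimise  f(x1, x2) = (x1 - 2)**2 + (x2 - 2)**2
#   subject to g(x1, x2) = 1 - x1 - x2 >= 0
import sympy as sp

x1, x2, gam = sp.symbols("x1 x2 gamma", real=True)
f = (x1 - 2)**2 + (x2 - 2)**2
g = 1 - x1 - x2

L = f - gam * g                                    # Lagrangian

kkt = [sp.diff(L, x1), sp.diff(L, x2),             # stationarity: dL/dx_i = 0
       gam * g]                                    # complementary slackness

for sol in sp.solve(kkt, (x1, x2, gam), dict=True):
    feasible = g.subs(sol) >= 0 and sol[gam] >= 0  # primal and dual feasibility
    print(sol, "feasible KKT point" if feasible else "rejected")
```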
Example#
Objective:
Subject to:
We begin by transforming the above maximisation problem into a minimisation problem,
Subject to:
We shall now transform this constrained optimisation into an unconstrained optimisation by introducing constraints into the objective function using Lagrange Multipliers,
Consequently, the Karush-Kuhn-Tucker (KKT) conditions render,
Solving this set of equations renders the optimal solution to the optimisation problem.
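As a sketch of what this solution step could look like, the following applies the same procedure symbolically to the minimisation form of the toll example, reusing the hypothetical linear demand functions assumed earlier (\(Q_1(p_1) = 2000 - 4p_1\), \(Q_2(p_2) = 2500 - 3p_2\)); these demands are illustrative assumptions, not given in the lecture.

```python
# A symbolic KKT sketch for the toll example under the assumed linear demands.
import sympy as sp

p1, p2 = sp.symbols("p1 p2", real=True)
gam = list(sp.symbols("gamma1:5", real=True))          # Lagrange multipliers

f = -(p1 * (2000 - 4*p1) + p2 * (2500 - 3*p2))         # minimise negative revenue
cons = [(2000 - 4*p1) - 1000,                          # Q1(p1) >= 1000
        (2500 - 3*p2) - 1500,                          # Q2(p2) >= 1500
        p1 - 150,                                      # p1 >= 150
        p2 - 150]                                      # p2 >= 150

L = f - sum(gj * cj for gj, cj in zip(gam, cons))      # Lagrangian

kkt = [sp.diff(L, p1), sp.diff(L, p2)]                 # stationarity
kkt += [gj * cj for gj, cj in zip(gam, cons)]          # complementary slackness

for sol in sp.solve(kkt, [p1, p2] + gam, dict=True):
    primal = all(cj.subs(sol) >= 0 for cj in cons)     # constraints satisfied
    dual = all(sol[gj] >= 0 for gj in gam)             # multipliers non-negative
    if primal and dual:
        print(sol)                                     # optimal toll prices
```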
Note
For simpler sets of KKT equations, analytical approaches that derive closed-form solutions are sufficient. However, more complex scenarios necessitate the use of sophisticated solution techniques such as the interior-point method, penalty and barrier methods, the branch and bound method, the cutting plane method, etc. Nonetheless, these solution techniques scale poorly, with the computational overhead of finding an optimal solution growing exponentially as problem complexity and size increase. To cope with this challenge, we shall explore metaheuristics in the last module of this course. These metaheuristics are high-level algorithmic frameworks designed to provide near-optimal solutions for complex optimisation problems within reasonable computational limits.