Quickstart

This page presents a quickstart guide for solving a nonlinear problem with MadNLP.

As a demonstration, we show how to implement the HS15 nonlinear problem from the Hock & Schittkowski collection, first by using a nonlinear modeler and then by specifying the derivatives manually.

The HS15 problem is defined as:

\[\begin{aligned} \min_{x_1, x_2} \; & 100 \times (x_2 - x_1^2)^2 +(1 - x_1)^2 \\ \text{subject to} \quad & x_1 \times x_2 \geq 1 \\ & x_1 + x_2^2 \geq 0 \\ & x_1 \leq 0.5 \end{aligned} \]

Despite its small dimension, HS15 remains challenging to solve, as the problem is both nonlinear and nonconvex. Note that HS15 comprises one bound constraint ($x_1 \leq 0.5$) and two generic constraints.

Using a nonlinear modeler: JuMP.jl

The easiest way to implement a nonlinear problem is to use a nonlinear modeler such as JuMP. With JuMP, the user only has to specify the structure of the problem; the computation of the first- and second-order derivatives is handled automatically.

Using JuMP's syntax, the HS15 problem translates to

using JuMP
model = Model()
@variable(model, x1 <= 0.5)
@variable(model, x2)

@objective(model, Min, 100.0 * (x2 - x1^2)^2 + (1.0 - x1)^2)
@constraint(model, x1 * x2 >= 1.0)
@constraint(model, x1 + x2^2 >= 0.0)

println(model)
Min (100.0 * ((-x1² + x2) ^ 2.0)) + (x1² - 2 x1 + 1)
Subject to
 x1*x2 ≥ 1
 x2² + x1 ≥ 0
 x1 ≤ 0.5

Then, solving HS15 with MadNLP directly translates to

using MadNLP
JuMP.set_optimizer(model, MadNLP.Optimizer)
JuMP.optimize!(model)
This is MadNLP version v0.8.0, running with umfpack

Number of nonzeros in constraint Jacobian............:        4
Number of nonzeros in Lagrangian Hessian.............:        5

Total number of variables............................:        2
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        1
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        2
        inequality constraints with only lower bounds:        2
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0000000e+00 1.01e+00 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  9.9758855e-01 1.00e+00 4.61e+01  -1.0 1.01e+00    -  4.29e-01 9.80e-03h  1
   2  9.9664309e-01 1.00e+00 5.00e+02  -1.0 4.81e+00    -  1.00e+00 9.93e-05h  1
   3  1.3615174e+00 9.99e-01 4.41e+02  -1.0 5.73e+02    -  9.98e-05 4.71e-04H  1
   4  1.3742697e+00 9.99e-01 3.59e+02  -1.0 3.90e+01    -  2.30e-02 2.68e-05h  1
   5  1.4692139e+00 9.99e-01 4.94e+02  -1.0 5.07e+01    -  2.76e-04 1.46e-04h  1
   6  3.1727722e+00 9.97e-01 3.76e+02  -1.0 8.08e+01    -  1.88e-06 9.77e-04h  11
   7  3.1726497e+00 9.97e-01 2.12e+02  -1.0 9.98e-01    -  1.00e+00 7.94e-04h  1
   8  8.2350196e+00 9.85e-01 4.29e+02  -1.0 1.51e+01    -  1.44e-03 7.81e-03h  8
   9  8.2918294e+00 9.84e-01 4.71e+02  -1.0 3.94e+00    -  2.51e-01 2.49e-04h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  4.0282504e+01 8.72e-01 4.57e+02  -1.0 4.94e+00    -  1.00e+00 6.25e-02h  5
  11  2.8603735e+02 2.66e-01 4.85e+02  -1.0 1.16e+00    -  5.54e-01 5.00e-01h  2
  12  3.9918132e+02 6.89e-03 2.90e+02  -1.0 1.90e-01    -  8.75e-01 1.00e+00h  1
  13  3.9783737e+02 3.06e-04 2.86e+02  -1.0 4.68e-02    -  2.50e-02 1.00e+00h  1
  14  3.5241265e+02 2.33e-02 2.70e+01  -1.0 4.59e-01    -  1.00e+00 1.00e+00h  1
  15  3.5876922e+02 7.37e-03 4.63e+00  -1.0 2.66e-01    -  6.99e-01 1.00e+00h  1
  16  3.6046938e+02 7.34e-05 8.17e-03  -1.0 3.14e-02    -  1.00e+00 1.00e+00h  1
  17  3.6038250e+02 2.71e-07 8.49e-05  -2.5 1.42e-03    -  1.00e+00 1.00e+00h  1
  18  3.6037976e+02 2.63e-10 7.95e-08  -5.7 4.63e-05    -  1.00e+00 1.00e+00h  1
  19  3.6037976e+02 2.22e-16 5.96e-14  -8.6 3.06e-08    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 19

                                   (scaled)                 (unscaled)
Objective...............:   3.6037976240508465e+02    3.6037976240508465e+02
Dual infeasibility......:   5.9563010060664213e-14    5.9563010060664213e-14
Constraint violation....:   2.2204460492503131e-16    2.2204460492503131e-16
Complementarity.........:   1.5755252497616872e-09    1.5755252497616872e-09
Overall NLP error.......:   1.5755252497616872e-09    1.5755252497616872e-09

Number of objective function evaluations             = 47
Number of objective gradient evaluations             = 20
Number of constraint evaluations                     = 47
Number of constraint Jacobian evaluations            = 20
Number of Lagrangian Hessian evaluations             = 19
Total wall-clock secs in solver (w/o fun. eval./lin. alg.)  =  8.822
Total wall-clock secs in linear solver                      =  0.017
Total wall-clock secs in NLP function evaluations           =  0.000
Total wall-clock secs                                       =  8.839

EXIT: Optimal Solution Found (tol = 1.0e-08).
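Once the solver has returned, the solution can be queried directly through JuMP's standard accessors:

JuMP.termination_status(model)    # expected: LOCALLY_SOLVED
JuMP.objective_value(model)       # ≈ 360.37976
JuMP.value(x1), JuMP.value(x2)    # optimal primal solution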

Under the hood, JuMP builds a nonlinear model with a sparse AD backend to evaluate the first- and second-order derivatives of the objective and the constraints. Internally, MadNLP takes the callbacks generated by JuMP as input and wraps them inside a MadNLP.MOIModel.

Specifying the derivatives by hand: NLPModels.jl

Alternatively, we can compute the derivatives manually and directly define an NLPModel associated with our problem. This second option, although more involved, gives us more flexibility and avoids the overhead of a modeling layer.

We define a new NLPModel structure simply as:

using NLPModels

struct HS15Model <: NLPModels.AbstractNLPModel{Float64,Vector{Float64}}
    meta::NLPModels.NLPModelMeta{Float64, Vector{Float64}}
    counters::NLPModels.Counters
end

function HS15Model(x0)
    return HS15Model(
        NLPModels.NLPModelMeta(
            2,     #nvar
            ncon = 2,
            nnzj = 4,
            nnzh = 3,
            x0 = x0,
            y0 = zeros(2),
            lvar = [-Inf, -Inf],
            uvar = [0.5, Inf],
            lcon = [1.0, 0.0],
            ucon = [Inf, Inf],
            minimize = true
        ),
        NLPModels.Counters()
    )
end
Main.HS15Model

This structure takes as input the initial position x0 and generates an AbstractNLPModel. NLPModelMeta stores the information about the structure of the problem (the variables' and constraints' lower and upper bounds, the number of variables, the number of constraints, ...). Counters is a utility recording the number of times each callback is called.
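For instance, we can instantiate a new model and check its metadata:

nlp = HS15Model(zeros(2))
nlp.meta.nvar, nlp.meta.ncon   # (2, 2)
nlp.meta.uvar                  # [0.5, Inf]: upper bounds on the variables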

The objective function takes as input a HS15Model instance and a vector with dimension 2 storing the current values for $x_1$ and $x_2$:

function NLPModels.obj(nlp::HS15Model, x::AbstractVector)
    return 100.0 * (x[2] - x[1]^2)^2 + (1.0 - x[1])^2
end
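For instance, evaluating the objective at the point $(0, 1)$ with the nlp instance created above gives $100 \times 1^2 + 1^2 = 101$:

NLPModels.obj(nlp, [0.0, 1.0])   # 101.0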

The corresponding gradient writes (note that we update the values of the gradient g in place):

function NLPModels.grad!(nlp::HS15Model, x::AbstractVector, g::AbstractVector)
    z = x[2] - x[1]^2
    g[1] = -400.0 * z * x[1] - 2.0 * (1.0 - x[1])
    g[2] = 200.0 * z
    return g
end
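As a quick sanity check, we can compare the analytical gradient with a forward finite-difference approximation (the test point and the step h below are arbitrary):

x = [0.1, 2.0]
g = zeros(2)
NLPModels.grad!(nlp, x, g)
h = 1e-6
fd = [(NLPModels.obj(nlp, x + h * e) - NLPModels.obj(nlp, x)) / h for e in ([1.0, 0.0], [0.0, 1.0])]
maximum(abs.(g .- fd))   # small, of the order of h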

Similarly, we define the constraints

function NLPModels.cons!(nlp::HS15Model, x::AbstractVector, c::AbstractVector)
    c[1] = x[1] * x[2]
    c[2] = x[1] + x[2]^2
    return c
end

and the associated Jacobian

function NLPModels.jac_structure!(nlp::HS15Model, I::AbstractVector{T}, J::AbstractVector{T}) where T
    copyto!(I, [1, 1, 2, 2])
    copyto!(J, [1, 2, 1, 2])
end

function NLPModels.jac_coord!(nlp::HS15Model, x::AbstractVector, J::AbstractVector)
    J[1] = x[2]    # (1, 1)
    J[2] = x[1]    # (1, 2)
    J[3] = 1.0     # (2, 1)
    J[4] = 2*x[2]  # (2, 2)
    return J
end
Note

As the Jacobian is sparse, we have to provide its sparsity structure.
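For illustration, the two callbacks can be combined to evaluate the constraints and assemble the Jacobian as a sparse matrix, using the standard library SparseArrays (the evaluation point is arbitrary):

using SparseArrays

c = zeros(2)
NLPModels.cons!(nlp, [0.1, 2.0], c)        # constraint values
rows, cols = zeros(Int, 4), zeros(Int, 4)
NLPModels.jac_structure!(nlp, rows, cols)  # fill the sparsity pattern
vals = zeros(4)
NLPModels.jac_coord!(nlp, [0.1, 2.0], vals)
sparse(rows, cols, vals, 2, 2)             # 2×2 constraint Jacobian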

It remains to implement the Hessian of the Lagrangian for a HS15Model. The Lagrangian of the problem writes

\[L(x_1, x_2, y_1, y_2) = 100 \times (x_2 - x_1^2)^2 +(1 - x_1)^2 + y_1 \times (x_1 \times x_2) + y_2 \times (x_1 + x_2^2)\]

and we aim at evaluating its second-order derivative $\nabla^2_{xx}L(x_1, x_2, y_1, y_2)$.
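Differentiating the Lagrangian twice with respect to $x$ (with a unit objective weight), we obtain

\[\nabla^2_{xx}L(x_1, x_2, y_1, y_2) = \begin{pmatrix} 1200 x_1^2 - 400 x_2 + 2 & -400 x_1 + y_1 \\ -400 x_1 + y_1 & 200 + 2 y_2 \end{pmatrix}\]

so the Hessian has three structurally distinct nonzero entries, the matrix being symmetric.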

We first have to define the sparsity structure of the Hessian, which is sparse as well. The Hessian is a symmetric matrix, and by convention we pass only its lower-triangular part to the solver. Hence, we define the sparsity structure as

function NLPModels.hess_structure!(nlp::HS15Model, I::AbstractVector{T}, J::AbstractVector{T}) where T
    copyto!(I, [1, 2, 2])
    copyto!(J, [1, 1, 2])
end

Now that the sparsity structure is defined, we can evaluate the coefficients of the Hessian:

function NLPModels.hess_coord!(nlp::HS15Model, x, y, H::AbstractVector; obj_weight=1.0)
    # Objective
    H[1] = obj_weight * (-400.0 * x[2] + 1200.0 * x[1]^2 + 2.0)
    H[2] = obj_weight * (-400.0 * x[1])
    H[3] = obj_weight * 200.0
    # First constraint
    H[2] += y[1] * 1.0
    # Second constraint
    H[3] += y[2] * 2.0
    return H
end
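Since only the lower-triangular coefficients are stored, the full Hessian can be reconstructed for inspection by mirroring the coordinates (the evaluation point and the multipliers below are arbitrary):

rows, cols = zeros(Int, 3), zeros(Int, 3)
NLPModels.hess_structure!(nlp, rows, cols)
vals = zeros(3)
NLPModels.hess_coord!(nlp, [0.1, 2.0], [1.0, 1.0], vals)
H = zeros(2, 2)
for (i, j, v) in zip(rows, cols, vals)
    H[i, j] = v
    H[j, i] = v   # mirror the lower-triangular entry
end
H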

Once the problem is specified in NLPModels' syntax, we can create a new MadNLP instance and solve it:

x0 = zeros(2) # initial position
nlp = HS15Model(x0)
solver = MadNLP.MadNLPSolver(nlp; print_level=MadNLP.INFO)
results = MadNLP.solve!(solver)
"Execution stats: Optimal Solution Found (tol = 1.0e-08)."

MadNLP converges in 19 iterations to a (local) optimal solution. It returns a MadNLPExecutionStats object storing all the results. We can query the primal and the dual solutions respectively by

results.solution
2-element Vector{Float64}:
 -0.7921232178470455
 -1.2624298435831807

and

results.multipliers
2-element Vector{Float64}:
 -477.17046873911977
   -3.126200044003132e-9
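
Beyond the primal and dual solutions, MadNLPExecutionStats stores additional information about the run; for instance, the final objective and the multipliers associated with the variables' bounds:

results.objective        # final objective value
results.iter             # number of iterations
results.multipliers_L    # multipliers of the variables' lower bounds
results.multipliers_U    # multipliers of the variables' upper bounds (here, x1 ≤ 0.5)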