Convex Analysis and Optimization - Assignment 2 | 711 611, Assignments of Operational Research

Material Type: Assignment; Class: 711 - SEL TOPICS OPER RES; Subject: OPERATIONS RESEARCH; University: Rutgers University; Term: Spring 2009;

Uploaded on 09/17/2009

Special Topics in Operations Research 16:711:611
Convex Analysis and Optimization
Spring 2009 Rutgers University Prof. Eckstein
Solutions to Homework 1


  1. We note that the set $\lambda C$ is convex (this is part of the proposition, but not explicitly required by the problem). To see this, pick $z^1, z^2 \in \lambda C$ and $\alpha \in (0,1)$. By construction, $z^1 = \lambda x^1$ and $z^2 = \lambda x^2$ for some $x^1, x^2 \in C$. Then

$$\alpha z^1 + (1-\alpha) z^2 = \alpha\lambda x^1 + (1-\alpha)\lambda x^2 = \lambda\bigl(\alpha x^1 + (1-\alpha) x^2\bigr).$$

Since $\alpha x^1 + (1-\alpha) x^2 \in C$ by convexity of $C$, the point $\alpha z^1 + (1-\alpha) z^2$ is of the form $\lambda x$ for $x \in C$, and is thus contained in $\lambda C$. Since $z^1$, $z^2$, and $\alpha$ were arbitrary, $\lambda C$ is convex.

Now consider any $\lambda_1, \lambda_2 > 0$. We start by showing that if $z^1 \in \lambda_1 C$ and $z^2 \in \lambda_2 C$, then we must have $z^1 + z^2 \in (\lambda_1 + \lambda_2) C$. By construction, we have $z^1 = \lambda_1 x^1$ and $z^2 = \lambda_2 x^2$, where $x^1, x^2 \in C$. We note that by convexity of $C$,

$$\frac{\lambda_1}{\lambda_1 + \lambda_2}\, x^1 + \frac{\lambda_2}{\lambda_1 + \lambda_2}\, x^2 \in C.$$

If we multiply this vector by $(\lambda_1 + \lambda_2)$, we obtain $\lambda_1 x^1 + \lambda_2 x^2 = z^1 + z^2$, so $z^1 + z^2 \in (\lambda_1 + \lambda_2) C$. Since $z^1 \in \lambda_1 C$ and $z^2 \in \lambda_2 C$ were arbitrary, we have $\lambda_1 C + \lambda_2 C \subseteq (\lambda_1 + \lambda_2) C$.

On the other hand, if we pick any point $z \in (\lambda_1 + \lambda_2) C$, it must be of the form $z = (\lambda_1 + \lambda_2) x$ for some $x \in C$, and can be written $z = \lambda_1 x + \lambda_2 x$, meaning it is in $\lambda_1 C + \lambda_2 C$. Thus $(\lambda_1 + \lambda_2) C \subseteq \lambda_1 C + \lambda_2 C$, and in view of the opposite inclusion proved above, $\lambda_1 C + \lambda_2 C = (\lambda_1 + \lambda_2) C$.

Counterexamples for nonconvex $C$ are very simple. In $\mathbb{R}^1$, consider $C = \{0, 1\}$, $\lambda_1 = 1$, and $\lambda_2 = 2$. Then $\lambda_1 C + \lambda_2 C = \{0, 1, 2, 3\}$, but $(\lambda_1 + \lambda_2) C = \{0, 3\}$. (Notice that the proof of $(\lambda_1 + \lambda_2) C \subseteq \lambda_1 C + \lambda_2 C$ above did not use convexity, so that inclusion remains true.)
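The counterexample above is small enough to check by brute force. The following is a minimal sketch in Python; the helper names `scale` and `minkowski_sum` are hypothetical, chosen here only for illustration, and finite sets of integers stand in for subsets of the real line.

```python
def scale(lam, C):
    """Return the scaled set lam*C = {lam*x : x in C}."""
    return {lam * x for x in C}


def minkowski_sum(A, B):
    """Return the set sum A + B = {a + b : a in A, b in B}."""
    return {a + b for a in A for b in B}


# The nonconvex set and multipliers from the counterexample.
C = {0, 1}
lam1, lam2 = 1, 2

lhs = minkowski_sum(scale(lam1, C), scale(lam2, C))  # lam1*C + lam2*C
rhs = scale(lam1 + lam2, C)                          # (lam1 + lam2)*C

print(sorted(lhs))  # [0, 1, 2, 3]
print(sorted(rhs))  # [0, 3]
print(rhs <= lhs)   # True: this inclusion did not need convexity
```

The two sets differ, yet `rhs` is still a subset of `lhs`, matching the remark that only one of the two inclusions relies on convexity.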

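  2. First, it is clear that any function of the form $f(x) = \langle a, x \rangle + b$ obeys (1) for any $\alpha \in \mathbb{R}$, since we have

$$\begin{aligned}
\alpha f(x) + (1-\alpha) f(y) &= \alpha(\langle a, x \rangle + b) + (1-\alpha)(\langle a, y \rangle + b) \\
&= \alpha \langle a, x \rangle + (1-\alpha) \langle a, y \rangle + \alpha b + (1-\alpha) b \\
&= \langle a, \alpha x + (1-\alpha) y \rangle + b \\
&= f(\alpha x + (1-\alpha) y).
\end{aligned}$$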
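As a numerical sanity check of the calculation above, the sketch below evaluates both sides of the identity for several values of α, including values outside [0, 1]. The vector `a`, the constant `b`, the dimension, and the helper names `inner` and `combo` are all arbitrary choices made for illustration; none of them come from the assignment.

```python
import random


def inner(a, x):
    """Euclidean inner product <a, x> of two equal-length tuples."""
    return sum(ai * xi for ai, xi in zip(a, x))


def f(x, a=(2.0, -1.0, 0.5), b=3.0):
    """An affine function f(x) = <a, x> + b (a and b chosen arbitrarily)."""
    return inner(a, x) + b


def combo(alpha, x, y):
    """Return alpha*x + (1 - alpha)*y componentwise."""
    return tuple(alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y))


random.seed(0)
x = tuple(random.uniform(-5, 5) for _ in range(3))
y = tuple(random.uniform(-5, 5) for _ in range(3))

# Largest discrepancy between the two sides over a few values of alpha.
max_gap = 0.0
for alpha in (-2.5, -1.0, 0.0, 0.3, 1.0, 4.0):
    lhs = alpha * f(x) + (1 - alpha) * f(y)
    rhs = f(combo(alpha, x, y))
    max_gap = max(max_gap, abs(lhs - rhs))

print(max_gap)  # expected to be at the level of floating-point rounding
```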

Conversely, consider any function $f$ obeying (1). Note that I did not specify the range of $\alpha$ for which (1) holds. It turns out that even if we suppose that (1) holds only for $\alpha \in [0,1]$, we can immediately deduce that it holds for all $\alpha \in \mathbb{R}$. Suppose we have $z = \alpha x + (1-\alpha) y$ for $\alpha > 1$. We can rearrange this equation into $\alpha x = z + (\alpha - 1) y$ and divide by $\alpha$ to obtain

$$x = \frac{1}{\alpha}\, z + \frac{\alpha - 1}{\alpha}\, y.$$

From (1) with the substitution $\alpha \leftarrow 1/\alpha \in [0,1]$, we then obtain

$$f(x) = \frac{1}{\alpha}\, f(z) + \frac{\alpha - 1}{\alpha}\, f(y),$$

which we can algebraically manipulate into $f(z) = \alpha f(x) + (1-\alpha) f(y)$, even though $\alpha > 1$. A similar technique applies if $\alpha < 0$: we write

$$y = \frac{1}{1-\alpha}\, z + \frac{-\alpha}{1-\alpha}\, x,$$

apply (1), and then apply a reverse series of algebraic manipulations. Thus, we may consider (1) to hold for $\alpha \in \mathbb{R}$.

With this in mind, set $g(x) = f(x) - f(0)$. We show that $g : \mathbb{R}^n \to \mathbb{R}$ must be a linear form. For any $\lambda \in \mathbb{R}$, we have

$$\begin{aligned}
g(\lambda x) &= f(\lambda x) - f(0) \\
&= f(\lambda x + (1-\lambda) 0) - f(0) \\
&= \lambda f(x) + (1-\lambda) f(0) - f(0) && \text{[by (1)]} \\
&= \lambda f(x) - \lambda f(0) \\
&= \lambda g(x).
\end{aligned}$$

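Now take any $x, y \in \mathbb{R}^n$. Since $g$ differs from $f$ only by the constant $f(0)$, $g$ also obeys (1). We then observe that

$$\begin{aligned}
g(x + y) &= g\bigl(2\bigl(\tfrac{1}{2} x + \tfrac{1}{2} y\bigr)\bigr) \\
&= 2\, g\bigl(\tfrac{1}{2} x + \tfrac{1}{2} y\bigr) && \text{[since } g(\lambda x) = \lambda g(x)\text{]} \\
&= 2 \bigl(\tfrac{1}{2} g(x) + \tfrac{1}{2} g(y)\bigr) && \text{[by (1)]} \\
&= g(x) + g(y).
\end{aligned}$$

So $g$ is a linear functional. In $\mathbb{R}^n$, this means that we must have $g(x) = \langle a, x \rangle$ for some $a \in \mathbb{R}^n$.¹ Setting $b = f(0)$, we obtain from $g(x) = f(x) - f(0)$ that $f(x) = g(x) + f(0) = \langle a, x \rangle + b$.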
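Returning to the extension step earlier in this solution, the "reverse series of algebraic manipulations" for the case $\alpha < 0$ can be spelled out. This is only a sketch, assuming (as the usage throughout this solution suggests) that (1) is the identity $f(\alpha x + (1-\alpha) y) = \alpha f(x) + (1-\alpha) f(y)$; the symbol $\mu$ is introduced here purely for illustration. With $z = \alpha x + (1-\alpha) y$ and $\alpha < 0$, put $\mu = 1/(1-\alpha)$, so that $\mu \in (0,1)$ and $1 - \mu = -\alpha/(1-\alpha)$. Then $y = \mu z + (1-\mu) x$, and applying (1) with the weight $\mu \in [0,1]$ gives

$$\begin{aligned}
f(y) &= \frac{1}{1-\alpha}\, f(z) + \frac{-\alpha}{1-\alpha}\, f(x), \\
(1-\alpha) f(y) &= f(z) - \alpha f(x), \\
f(z) &= \alpha f(x) + (1-\alpha) f(y).
\end{aligned}$$

The second line multiplies through by $1 - \alpha > 0$, and the third solves for $f(z)$, which is (1) for the negative value of $\alpha$.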

  3. Let $Y$ be the set of all convex combinations of points from $X$. As in class, the convex hull $\operatorname{conv}(X)$ is the intersection of all convex sets containing $X$. First, we show that $Y$ must be convex. Take any $y, y' \in Y$ and $\alpha \in (0,1)$. By construction, we have

$$y = \sum_{i=1}^{m} \beta_i x^i, \qquad y' = \sum_{i=1}^{m'} \beta'_i {x'}^i,$$

where $\beta_1, \ldots, \beta_m, \beta'_1, \ldots, \beta'_{m'} \ge 0$, $x^1, \ldots, x^m, {x'}^1, \ldots, {x'}^{m'} \in X$, $\sum_{i=1}^{m} \beta_i = 1$, and $\sum_{i=1}^{m'} \beta'_i = 1$. We then write

$$\alpha y + (1-\alpha) y' = \alpha \beta_1 x^1 + \cdots + \alpha \beta_m x^m + (1-\alpha) \beta'_1 {x'}^1 + \cdots + (1-\alpha) \beta'_{m'} {x'}^{m'}.$$

¹ For those of you familiar with infinite-dimensional spaces, this result is also true for continuous linear functionals in any Hilbert space, by the famous Riesz representation theorem. It may fail in more exotic infinite-dimensional spaces.

(b) From this point, we can proceed much as in question 3. Let $Z$ denote the set of all affine combinations of elements of $Y$. Consider any affine set $X$ containing $Y$. From part (a), $X$ contains all affine combinations of its elements, and in particular all affine combinations from $Y$. Therefore, $X \supseteq Z$. Furthermore, since the affine set $X \supseteq Y$ was arbitrary, $Z$ is contained in all affine sets containing $Y$, and we have $Z \subseteq \operatorname{aff}(Y)$.

To complete the proof, we will show that $Z$ is an affine set. This, along with the obvious fact that $Z \supseteq Y$, establishes that $Z \supseteq \operatorname{aff}(Y)$, since $\operatorname{aff}(Y)$ is the intersection of all affine sets containing $Y$. In view of the opposite inclusion above, we then conclude $Z = \operatorname{aff}(Y)$.

To show that $Z$ is affine, we consider any affine combination $w = \alpha_1 z^1 + \cdots + \alpha_m z^m$ of points $z^1, \ldots, z^m \in Z$, where $\alpha_1 + \cdots + \alpha_m = 1$. If we can show that any such $w$ is in $Z$, then part (a) will assert that $Z$ is affine. Now, for all $i = 1, \ldots, m$, we have from the construction of $Z$ that

$$z^i = \sum_{j=1}^{n_i} \beta_{ij}\, y^{ij}, \qquad y^{i1}, \ldots, y^{i n_i} \in Y, \qquad \sum_{j=1}^{n_i} \beta_{ij} = 1.$$

Thus, we can write

$$w = \sum_{i=1}^{m} \alpha_i \sum_{j=1}^{n_i} \beta_{ij}\, y^{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} (\alpha_i \beta_{ij})\, y^{ij}.$$

Noting that

$$\sum_{i=1}^{m} \sum_{j=1}^{n_i} \alpha_i \beta_{ij} = \sum_{i=1}^{m} \alpha_i \Bigl( \sum_{j=1}^{n_i} \beta_{ij} \Bigr) = \sum_{i=1}^{m} \alpha_i \cdot 1 = \sum_{i=1}^{m} \alpha_i = 1,$$

it is clear that $w$ is an affine combination of the points $y^{ij} \in Y$, and is hence a member of $Z$.
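The weight bookkeeping in this argument can be illustrated on a small instance. In the sketch below, the points, the weights `betas` and `alphas`, and the helper `affine_combination` are all made-up illustrative values in the plane, not data from the assignment. Some weights are deliberately negative, since affine combinations only require the weights to sum to 1.

```python
def affine_combination(weights, points):
    """Return sum_k weights[k] * points[k]; the weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-12, "weights must sum to 1"
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points))
                 for k in range(dim))


# Hypothetical points y^{ij} in R^2, grouped by the outer index i.
ys = [
    [(0.0, 0.0), (1.0, 0.0)],               # y^{11}, y^{12}
    [(0.0, 1.0), (2.0, 2.0), (-1.0, 3.0)],  # y^{21}, y^{22}, y^{23}
]
betas = [
    [1.5, -0.5],         # beta_{1j}, sums to 1
    [0.25, 1.25, -0.5],  # beta_{2j}, sums to 1
]
alphas = [3.0, -2.0]     # alpha_i, sums to 1

# Nested form: first build each z^i, then combine them into w.
zs = [affine_combination(b, y) for b, y in zip(betas, ys)]
w_nested = affine_combination(alphas, zs)

# Flattened form: a single affine combination with weights alpha_i * beta_ij.
flat_weights = [a * b for a, row in zip(alphas, betas) for b in row]
flat_points = [p for row in ys for p in row]
w_flat = affine_combination(flat_weights, flat_points)

print(sum(flat_weights))
print(w_nested, w_flat)
```

Both forms produce the same point, and the flattened weights sum to 1, which is exactly what places $w$ in $Z$.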

  1. As suggested in the hint, consider the function $f : (0, \infty) \to \mathbb{R}$ given by $f(x) = -\log x$ (I will use "log" to stand for the natural logarithm). From elementary calculus, we find that $f''(x) = 1/x^2$, which is positive for all $x > 0$. Using Proposition 1.2.6 with $n = 1$ and $C = (0, \infty)$, we conclude that $f$ is strictly convex over $(0, \infty)$. Jensen's inequality, formula (1.7) from the text, with $n = 1$ and $X = (0, \infty)$, tells us that

$$f\Bigl( \sum_{i=1}^{m} \alpha_i x_i \Bigr) \le \sum_{i=1}^{m} \alpha_i f(x_i).$$

Substituting the definition of $f$ and multiplying by $-1$, we obtain

$$\log\Bigl( \sum_{i=1}^{m} \alpha_i x_i \Bigr) \ge \sum_{i=1}^{m} \alpha_i \log x_i.$$

Applying the monotonically increasing function $e^x$ to both sides of this inequality produces

$$\sum_{i=1}^{m} \alpha_i x_i \ge \prod_{i=1}^{m} x_i^{\alpha_i},$$

which is equivalent to the desired result. In the construction of Jensen's inequality, it can also be seen that if $f$ is strictly convex, then the inequality will be strict unless $x_1 = \cdots = x_m$.
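The final inequality can be checked numerically on sample data. The sketch below is an illustration only: the sample size, the random weights, and the helper names `weighted_am` and `weighted_gm` are assumptions made here. The geometric mean is computed as the exponential of the weighted sum of logarithms, mirroring the last step of the derivation.

```python
import math
import random


def weighted_am(alphas, xs):
    """Weighted arithmetic mean: sum_i alpha_i * x_i."""
    return sum(a * x for a, x in zip(alphas, xs))


def weighted_gm(alphas, xs):
    """Weighted geometric mean: prod_i x_i**alpha_i, computed via logs."""
    return math.exp(sum(a * math.log(x) for a, x in zip(alphas, xs)))


random.seed(1)
m = 5

# Positive weights normalized to sum to 1, and positive sample points.
raw = [random.random() + 0.1 for _ in range(m)]
alphas = [r / sum(raw) for r in raw]
xs = [random.uniform(0.1, 10.0) for _ in range(m)]

am, gm = weighted_am(alphas, xs), weighted_gm(alphas, xs)
print(am, gm, am >= gm)

# When all points coincide, the two means agree up to rounding.
equal_xs = [2.5] * m
gap_equal = weighted_am(alphas, equal_xs) - weighted_gm(alphas, equal_xs)
print(gap_equal)
```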