CS 6550 Homework 5: Design and Analysis of Algorithms - Problems and Solutions, Exercises of Advanced Algorithms

The instructions and problems for Homework 5 of CS 6550: Design and Analysis of Algorithms, taught by Professor Jamie Morgenstern. The homework covers counters, data streaming algorithms, and matrix-valued Chernoff bounds. Students are expected to solve problems on maintaining approximate counters, designing streaming algorithms, and proving Chernoff bounds for symmetric matrices.

Typology: Exercises

2021/2022

Uploaded on 01/21/2022
Design and Analysis of Algorithms, CS 6550 (Professor Jamie Morgenstern): Homework #5
Homework Out: April 18
Due Date: April 25, midnight
Reminder: this homework is for extra credit! No late days. The HW contains some exercises (fairly simple problems to check you are on board with the concepts; don't submit your solutions) and problems (for which you should submit your solutions, and which will be graded). Some problems have sub-parts that are exercises. For this problem set, it's OK to work with others. (Groups of 2, maybe 3 max.) That being said, please think about the problems yourself before talking to others. Please cite all sources you use and people you work with. The expectation is that you try to solve these problems yourself, rather than looking online explicitly for answers. Submissions are due at the beginning of class on the due date. Please check Piazza for details on submitting your LaTeXed solutions.
Problems
1. (A Counter, and the Median-of-Means Estimator.) Here is a way of maintaining an approximate counter. (Call this the basic counter.) Start with X ← 0. When an element arrives, increment X by 1 with probability 2^{−X}. When queried, return N := 2^X − 1.

(a) Suppose the actual count is n; show that E[N] = n and Var(N) = n(n − 1)/2.

Since its variance is large, average k independent basic counters N_1, N_2, ..., N_k, and output the sample average N̂ := (1/k) Σ_i N_i. Call this the k-mean counter.
(b) (Do not submit) Show that

P[N̂ ∉ (1 ± ε)n] ≤ 1/(2ε²k).

Hence using k = 1/(2ε²δ) counters can make the failure probability at most δ. (Said in other words, your error is less than εn with confidence 1 − δ.) Here's a way to use only K = O((1/ε²) log(1/δ)) counters to get the same answer (and the approach is useful in many different contexts beyond this one):

Take a collection of ℓ = 10 log(1/δ) independent k_0-mean counters, where k_0 = 4/ε². Output the median M of these ℓ counters.

(c) Prove that P[M ∉ (1 ± ε)n] ≤ δ. Hint: what must happen for the median to be too high? What is the chance of that?
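To build intuition for the basic counter and the estimators above, here is a small simulation (not part of the assignment; `morris_count`, `k_mean_counter`, and `median_of_means` are illustrative names, and the constants are chosen only for the demo):

```python
import random

def morris_count(n):
    """Run the basic counter on a stream of n arrivals:
    X starts at 0, each arrival increments X with probability 2^-X,
    and a query returns N = 2^X - 1."""
    x = 0
    for _ in range(n):
        if random.random() < 2.0 ** (-x):
            x += 1
    return 2 ** x - 1

def k_mean_counter(n, k):
    """The k-mean counter: average of k independent basic counters."""
    return sum(morris_count(n) for _ in range(k)) / k

def median_of_means(n, k0, ell):
    """Median of ell independent k0-mean counters, as in part (c)."""
    ests = sorted(k_mean_counter(n, k0) for _ in range(ell))
    return ests[ell // 2]

random.seed(0)
n = 1000
avg = sum(morris_count(n) for _ in range(2000)) / 2000
print(avg)                                 # close to n, since E[N] = n
print(median_of_means(n, k0=100, ell=10))  # concentrated around n
```

Averaging many runs recovers n because the estimator is unbiased; the median-of-means output is far less volatile than any single basic counter, whose standard deviation is on the order of n itself.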
2. (I Stream, You Stream.) In the data streaming model, suppose we denote the frequency vector by x = (x_1, x_2, ..., x_D) ∈ Z^D_{≥0}, where x_i counts the number of occurrences of element i ∈ [D] seen so far. We want a streaming algorithm that stores information about the stream so that when it is eventually queried with some index q ∈ [D], it returns a value x̂_q that is ≈ x_q with probability at least 1 − δ. We don't want to store x explicitly; we want to use less space.

Consider the following algorithm:

Keep a global hash function g : [D] → [d], and also d counters C_1, C_2, ..., C_d (initially zero), each with its own hash function h_i : [D] → {−1, +1}. If you see element e ∈ [D], first hash it using the global hash function g to get the bucket number g(e), and then update

C_{g(e)} ← C_{g(e)} + h_{g(e)}(e).

When faced with the query q, output A(q) := h_{g(q)}(q) · C_{g(q)}.

Assume that g and the h_i's are all independently picked, and each hash function is itself 2-universal.

(a) Show that E[A(q)] = x_q.

(b) Show that the variance of the estimate is Var(A(q)) = (1/d)(F_2 − x_q²) ≤ F_2/d.

(c) Show that if we set d = 1/(ε²δ), our estimate A(q) satisfies

P[A(q) ∈ x_q ± ε√F_2] ≥ 1 − δ.

Recall that F_2 := Σ_i x_i² = ‖x‖_2², and hence the error term is ε‖x‖_2.
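A minimal sketch of the algorithm above (illustrative only; `StreamSketch` is a made-up name, and for simplicity the hash functions are fully random lookup tables rather than 2-universal families of the form ((a·x + b) mod p) mod d):

```python
import random

class StreamSketch:
    """d signed counters behind a global bucket hash g and one sign
    hash h_i per bucket, as in the problem statement."""

    def __init__(self, d, universe, seed=0):
        rng = random.Random(seed)
        # g : [D] -> [d], stored as a lookup table
        self.g = [rng.randrange(d) for _ in range(universe)]
        # h_i : [D] -> {-1, +1}, one table per counter
        self.h = [[rng.choice((-1, 1)) for _ in range(universe)]
                  for _ in range(d)]
        self.C = [0] * d

    def update(self, e):
        b = self.g[e]
        self.C[b] += self.h[b][e]        # C_{g(e)} += h_{g(e)}(e)

    def query(self, q):
        b = self.g[q]
        return self.h[b][q] * self.C[b]  # A(q) = h_{g(q)}(q) * C_{g(q)}

stream = [0, 1, 1, 2, 2, 2, 5, 5]        # true frequencies: 1, 2, 3, 0, 0, 2, 0, 0
sk = StreamSketch(d=32, universe=8)
for e in stream:
    sk.update(e)
print([sk.query(q) for q in range(8)])   # unbiased estimates of the frequencies
```

With d much larger than the number of distinct elements, collisions are rare and the estimates are usually exact; shrinking d trades space for the F_2/d variance from part (b).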

(d) Finally, consider an extension of this idea: maintain t independent copies of the above data structure. On a query for q, if the answers from the individual copies of the data structure are A_1(q), A_2(q), ..., A_t(q), return the median M(q) of these t answers. Show that with t = c_1 log(1/δ) and d = c_2/ε², where c_1, c_2 are constants you can choose, you get M(q) ∈ x_q ± ε‖x‖_2 with probability 1 − δ.
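The median trick behind part (d) can be seen numerically with a small self-contained simulation (illustrative; `median_boost` is a made-up name): each copy is "good" independently with some probability p > 1/2, and the median can only fail if at least half the copies fail, an event whose probability decays exponentially in t.

```python
import random

def median_boost(p_good, t, trials=10000, seed=0):
    """Empirical failure probability of a median over t independent
    answers, each correct with probability p_good > 1/2: the median
    is correct whenever a strict majority of the answers are."""
    rng = random.Random(seed)
    fails = 0
    for _ in range(trials):
        good = sum(rng.random() < p_good for _ in range(t))
        if good <= t // 2:   # no strict majority of good answers
            fails += 1
    return fails / trials

print(median_boost(0.75, 1))    # about 0.25: one copy fails a quarter of the time
print(median_boost(0.75, 21))   # much smaller: failure decays exponentially in t
```

This is why t = O(log(1/δ)) copies suffice: once each copy is good with constant probability (via Chebyshev and the choice of d), a Chernoff bound on the majority drives the failure probability down to δ.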

3. (Chernoff meets Matrices.) In Lecture 18 we mentioned a very general theorem about matrix-valued Chernoff bounds for symmetric matrices. In this problem we'll take the first steps towards it. Assume eigenvalues are numbered so that λ_1 ≥ λ_2 ≥ ... We'll prove:

Theorem 1. Let X_1, X_2, ..., X_n be independent symmetric d × d matrices, and S_n = Σ_i X_i. Then, for any t ≥ 0 and ℓ ∈ R,

P[λ_1(S_n) ≥ ℓ] ≤ d · e^{−tℓ} · Π_{i=1}^n λ_1(E[e^{tX_i}]),

P[λ_d(S_n) ≤ −ℓ] ≤ d · e^{−tℓ} · Π_{i=1}^n λ_1(E[e^{−tX_i}]).

Recall that tr(A) = Σ_i a_ii. You may use the following without proof:

i. tr(A) = Σ_i λ_i(A).
ii. λ_i(e^A) = e^{λ_i(A)}.
iii. The Golden-Thompson inequality: tr(e^{A+B}) ≤ tr(e^A · e^B).
iv. For PSD matrices A and B, tr(AB) ≤ tr(A) · tr(B).
v. Expectations and trace commute: E[tr(A)] = tr(E[A]).

(a) Show that for any t ≥ 0,

P[λ_1(S_n) ≥ ℓ] ≤ P[tr(e^{tS_n}) ≥ e^{tℓ}] ≤ e^{−tℓ} · E[tr(e^{tS_n})].

(b) Show that E_{X_1,...,X_n}[tr(e^{tS_n})] ≤ E_{X_1,...,X_{n−1}}[tr(e^{tS_{n−1}})] · λ_1(E[e^{tX_n}]). Hint: why can you use (iv) above even if X_n isn't PSD?

(c) Use the previous two parts to prove the first statement of the theorem.

(d) Use the same arguments on (−S_n) = Σ_i(−X_i) to prove the other part.
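Not part of the assignment, but the concentration phenomenon the theorem captures is easy to observe numerically (a hedged NumPy sketch; the distribution of the X_i is chosen arbitrarily for the demo): the top eigenvalue of a sum of n independent mean-zero symmetric matrices grows like √n rather than n, so large-deviation events such as λ_1(S_n) ≥ n are vanishingly rare.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials = 5, 100, 500

tops = []
for _ in range(trials):
    S = np.zeros((d, d))
    for _ in range(n):
        A = rng.uniform(-1.0, 1.0, size=(d, d))
        S += (A + A.T) / 2                  # symmetrize: each X_i is symmetric, mean zero
    tops.append(np.linalg.eigvalsh(S)[-1])  # eigvalsh returns eigenvalues in ascending order

tops = np.array(tops)
print(tops.mean())         # on the order of sqrt(n), far below n
print((tops >= n).mean())  # empirical P[lambda_1(S_n) >= n]: essentially zero
```

The theorem turns this empirical observation into an exponential tail bound, with the dimension factor d coming from bounding a trace by d times the top eigenvalue.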