The instructions and problems for Homework 5 of the CS 6550: Design and Analysis of Algorithms course, taught by Professor Jamie Morgenstern. The homework covers counters, data-streaming algorithms, and matrix-valued Chernoff bounds: students are expected to maintain approximate counters, design streaming algorithms, and prove Chernoff bounds for symmetric matrices.
Design and Analysis of Algorithms, CS 6550 (Professor Jamie Morgenstern): Homework #5
Homework Out: April 18. Due Date: April 25, midnight.

Reminder: this homework is for extra credit! No late days. The HW contains some exercises (fairly simple problems to check you are on board with the concepts; don't submit your solutions) and problems (for which you should submit your solutions, and which will be graded). Some problems have sub-parts that are exercises. For this problem set, it's OK to work with others (groups of 2, maybe 3 max). That being said, please think about the problems yourself before talking to others. Please cite all sources you use and people you work with. The expectation is that you try to solve these problems yourself, rather than looking online explicitly for answers. Submissions are due at the beginning of class on the due date. Please check the Piazza for details on submitting your LaTeXed solutions.
Problems
(a) Suppose the actual count is $n$. Show that $E[N] = n$ and $\mathrm{Var}(N) = \frac{n(n-1)}{2}$. Since its variance is large, average $k$ independent basic counters $N_1, N_2, \ldots, N_k$ and output the sample average $\hat{N} := \frac{1}{k} \sum_i N_i$. Call this the $k$-mean counter.

(b) (Do not submit) Show that $P[\hat{N} \notin (1 \pm \epsilon)n] \le \frac{1}{2\epsilon^2 k}$. Hence using $k = \frac{1}{2\epsilon^2 \delta}$ counters can make the failure probability at most $\delta$. (Said in other words, your error is less than $\epsilon n$ with confidence $1 - \delta$.) Here's a way to use only $K = O(\frac{1}{\epsilon^2} \log \frac{1}{\delta})$ counters to get the same answer (and the approach is useful in many different contexts beyond this one): take a collection of $\ell = 10 \log \frac{1}{\delta}$ independent $k_0$-mean counters, where $k_0 = \frac{4}{\epsilon^2}$. Output the median $M$ of these $\ell$ counters.

(c) Prove that $P[M \notin (1 \pm \epsilon)n] \le \delta$. Hint: what must happen for the median to be too high? What is the chance of that?
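To make the averaging and median tricks concrete, here is a minimal Python sketch. It assumes the basic counter is the standard Morris-style counter (keep a value $X$, increment it with probability $2^{-X}$, estimate $N = 2^X - 1$), which matches the mean and variance above but is not spelled out in this preview; all function names and demo parameters are illustrative.

```python
import random
import statistics

def morris_count(n, rng):
    """Feed n increments to a Morris-style counter and return the estimate
    N = 2^X - 1.  X is incremented with probability 2^-X, so E[N] = n."""
    x = 0
    for _ in range(n):
        if rng.random() < 2.0 ** (-x):
            x += 1
    return 2 ** x - 1

def k_mean_counter(n, k, rng):
    """Average k independent basic counters: the variance shrinks by a factor k."""
    return sum(morris_count(n, rng) for _ in range(k)) / k

def median_of_means(n, k0, copies, rng):
    """The median trick: median of `copies` independent k0-mean counters."""
    return statistics.median(k_mean_counter(n, k0, rng) for _ in range(copies))

rng = random.Random(0)
n = 1000
est = median_of_means(n, k0=100, copies=9, rng=rng)
print(est)  # close to n with high probability
```

Each $k_0$-mean counter is within $\epsilon n$ with constant probability, so the median of $\ell$ independent copies fails only if about half of them fail simultaneously, which is exponentially unlikely in $\ell$.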
$C_{g(e)} \leftarrow C_{g(e)} + h_{g(e)}(e)$. When faced with the query $q$, output $A(q) := h_{g(q)}(q) \cdot C_{g(q)}$. Assume that $g$ and the $h_i$'s are all independently picked, and each hash function is itself 2-universal.
(a) Show that $E[A(q)] = x_q$.

(b) Show that the variance of the estimate satisfies $\mathrm{Var}(A(q)) = \frac{1}{d}(F_2 - x_q^2) \le F_2/d$.

(c) Show that if we set $d = \frac{1}{\epsilon^2 \delta}$, our estimate $A(q)$ satisfies
$$P[A(q) \in x_q \pm \epsilon \sqrt{F_2}] \ge 1 - \delta.$$
Recall that $F_2 := \sum_i x_i^2 = \|x\|_2^2$, and hence the error term is $\epsilon \|x\|_2$.
(d) Finally, consider an extension of this idea: maintain $t$ independent copies of the above data structure. On a query for $q$, if the answers from the individual copies of the data structure are $A_1(q), A_2(q), \ldots, A_t(q)$, return the median $M(q)$ of these $t$ answers. Show that with $t = c_1 \log \frac{1}{\delta}$ and $d = \frac{c_2}{\epsilon^2}$, where $c_1, c_2$ are constants you can choose, you get $M(q) \in x_q \pm \epsilon \|x\|_2$ with probability $1 - \delta$.
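A minimal Python sketch of this scheme, for intuition only: it memoizes truly random hash values in dictionaries instead of using genuine 2-universal families, and stores one random sign per element per copy (which is what $h_{g(e)}(e)$ evaluates to for a fixed element). All class and function names are mine.

```python
import random
import statistics

class SketchCopy:
    """One row of d counters with a bucket hash g and sign hash h.
    Truly random (memoized) hashes stand in for 2-universal families."""
    def __init__(self, d, rng):
        self.d = d
        self.rng = rng
        self.C = [0] * d
        self.g = {}   # element -> bucket index in [d]
        self.h = {}   # element -> sign in {-1, +1}

    def _bucket(self, e):
        if e not in self.g:
            self.g[e] = self.rng.randrange(self.d)
        return self.g[e]

    def _sign(self, e):
        if e not in self.h:
            self.h[e] = self.rng.choice([-1, 1])
        return self.h[e]

    def update(self, e, delta=1):
        # C_{g(e)} <- C_{g(e)} + h(e) * delta
        self.C[self._bucket(e)] += self._sign(e) * delta

    def estimate(self, q):
        # A(q) := h(q) * C_{g(q)}
        return self._sign(q) * self.C[self._bucket(q)]

def median_estimate(copies, q):
    """Median over t independent copies drives the failure probability to delta."""
    return statistics.median(c.estimate(q) for c in copies)

rng = random.Random(0)
copies = [SketchCopy(d=256, rng=rng) for _ in range(7)]
stream = ["a"] * 100 + ["b", "c", "d", "e"] * 10
for e in stream:
    for c in copies:
        c.update(e)
print(median_estimate(copies, "a"))  # close to x_a = 100
```

Each copy's error is the signed mass of the other elements colliding in $q$'s bucket, which has mean zero and variance at most $F_2/d$; the median over the copies then converts the constant-probability Chebyshev guarantee into a $1-\delta$ guarantee.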
Theorem 1. Let $X_1, X_2, \ldots, X_n$ be independent symmetric $d \times d$ matrices, and $S_n = \sum_i X_i$. Then, for any $t \ge 0$ and $\ell \in \mathbb{R}$,
$$P[\lambda_1(S_n) \ge \ell] \le d \cdot e^{-t\ell} \cdot \prod_{i=1}^{n} \lambda_1(E[e^{tX_i}]),$$
$$P[\lambda_d(S_n) \le -\ell] \le d \cdot e^{-t\ell} \cdot \prod_{i=1}^{n} \lambda_1(E[e^{-tX_i}]).$$
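As a sanity check on the statement (not part of the assignment), note that specializing to $d = 1$, where each $X_i$ is a scalar and $\lambda_1$ is the identity, recovers the familiar scalar Chernoff/MGF bound:

```latex
% Scalar case d = 1: \lambda_1(S_n) = S_n and \lambda_1(E[e^{tX_i}]) = E[e^{tX_i}],
% so the first inequality of Theorem 1 reads
P[S_n \ge \ell] \;\le\; e^{-t\ell} \cdot \prod_{i=1}^{n} E[e^{tX_i}],
% i.e. Markov's inequality applied to e^{tS_n}, using independence to factor the MGF.
```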
Recall that $\mathrm{tr}(A) = \sum_{i=1}^{d} a_{ii}$. You may use the following without proof:
i. $\mathrm{tr}(A) = \sum_i \lambda_i(A)$.
ii. $\lambda_i(e^A) = e^{\lambda_i(A)}$.
iii. The Golden-Thompson inequality: $\mathrm{tr}(e^{A+B}) \le \mathrm{tr}(e^A \cdot e^B)$.
iv. For PSD matrices $A$ and $B$, $\mathrm{tr}(AB) \le \mathrm{tr}(A) \cdot \mathrm{tr}(B)$.
v. Expectations and trace commute: $E[\mathrm{tr}(A)] = \mathrm{tr}(E[A])$.
(a) Show that for any $t \ge 0$,
$$P[\lambda_1(S_n) \ge \ell] \le P[\mathrm{tr}(e^{tS_n}) \ge e^{t\ell}] \le e^{-t\ell} \cdot E[\mathrm{tr}(e^{tS_n})].$$
(b) Show that $E_{X_1,\ldots,X_n}[\mathrm{tr}(e^{tS_n})] \le E_{X_1,\ldots,X_{n-1}}[\mathrm{tr}(e^{tS_{n-1}})] \cdot \lambda_1(E[e^{tX_n}])$. Hint: why can you use (iv) above even if $X_n$ isn't PSD?
(c) Use the previous two parts to prove the first statement of the theorem.
(d) Use the same arguments on $-S_n = \sum_i (-X_i)$ to prove the other part.
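The theorem can also be checked numerically. The sketch below (an illustrative construction of mine, not part of the assignment) takes $X_i = \mathrm{diag}(\pm 1, \pm 1)$ with independent uniform signs, so $\lambda_1(S_n)$ is just the larger of the two coordinate sums and $\lambda_1(E[e^{tX_i}]) = \cosh t$, and compares a Monte Carlo estimate of the tail against the theorem's bound.

```python
import math
import random

def empirical_tail(n, ell, trials, rng):
    """Monte Carlo estimate of P[lambda_1(S_n) >= ell] where S_n is a sum of n
    independent diag(+-1, +-1) matrices: lambda_1 is the max coordinate sum."""
    hits = 0
    for _ in range(trials):
        s1 = sum(rng.choice([-1, 1]) for _ in range(n))
        s2 = sum(rng.choice([-1, 1]) for _ in range(n))
        if max(s1, s2) >= ell:
            hits += 1
    return hits / trials

def chernoff_bound(n, ell, t):
    """Theorem 1's bound d * e^{-t*ell} * prod_i lambda_1(E[e^{t X_i}]) with
    d = 2 and lambda_1(E[e^{t X_i}]) = cosh(t) for diag(+-1, +-1) signs."""
    return 2 * math.exp(-t * ell) * math.cosh(t) ** n

n, ell = 100, 30
t = math.atanh(ell / n)            # this t minimizes the bound over t >= 0
rng = random.Random(0)
emp = empirical_tail(n, ell, trials=10000, rng=rng)
bound = chernoff_bound(n, ell, t)
print(emp, bound)  # the empirical tail should sit below the bound
```

Minimizing over $t$ gives $\tanh t = \ell/n$ here, and the empirical tail (roughly the Gaussian tail at three standard deviations, doubled by the union over the two coordinates) lands comfortably under the bound.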