Terminology of Order Statistics in Fundamental of Algorithm | CS 231, Study notes of Algorithms and Programming

Material Type: Notes; Professor: Shull; Class: Fundamental Algorithms; Subject: Computer Science; University: Wellesley College; Term: Fall 1996;

Uploaded on 08/18/2009

Wellesley College CS231 Algorithms October 11, 1996
Handout #18
ORDER STATISTICS
Reading: CLR Chapter 10
---------------------------------------------------------------------------------------------
Terminology
Let S be a set of n elements (necessarily distinct).
The ith order statistic of S is the ith smallest element (i.e., the element that is larger than
exactly i - 1 other elements in the set). Such an element is said to have rank i.
The minimum of S is the first order statistic (element with rank 1).
The maximum of S is the nth order statistic (element with rank n).
The median(s) of S is (are) the element(s) with rank ⌊(n+1)/2⌋ and ⌈(n+1)/2⌉.
Example: B = {43, 5, 17, 91, 2, 42, 19, 72, 37, 3}
Minimum of B:
Maximum of B:
Medians of B:
Rank of 17:
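
The definitions above can be checked mechanically. The following Python sketch computes the order statistics of B by sorting; the helper names `order_statistic` and `rank` are chosen here for illustration and do not come from the handout.

```python
import math

def order_statistic(s, i):
    """Return the i-th smallest element of s (ranks are 1-indexed)."""
    return sorted(s)[i - 1]

def rank(s, x):
    """Return the rank of x in s: one more than the number of smaller elements."""
    return 1 + sum(1 for y in s if y < x)

B = [43, 5, 17, 91, 2, 42, 19, 72, 37, 3]
n = len(B)

# Ranks of the lower and upper medians: floor((n+1)/2) and ceil((n+1)/2).
lower, upper = (n + 1) // 2, math.ceil((n + 1) / 2)

print("minimum:", order_statistic(B, 1))
print("maximum:", order_statistic(B, n))
print("medians:", order_statistic(B, lower), order_statistic(B, upper))
print("rank of 17:", rank(B, 17))
```

Running it is one way to check answers to the four items in the example.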

The Selection Problem

Specification:

Select(A, i)
    Return the ith order statistic of an array A of n (distinct) elements.

Trivial algorithm:

Select(A, i)
    Sort(A)
    return A[i]

Can obviously be done in Θ(n lg(n)) time.
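
A minimal Python sketch of the trivial algorithm follows. The name `select_by_sorting` is an illustrative choice, and Python's built-in `sorted` stands in for Sort; ranks are 1-indexed to match the specification.

```python
def select_by_sorting(a, i):
    """Return the i-th order statistic (1-indexed) of the elements in a."""
    if not 1 <= i <= len(a):
        raise IndexError("rank i must satisfy 1 <= i <= n")
    # Sorting dominates the cost: O(n lg n) comparisons in the worst case.
    return sorted(a)[i - 1]
```

Because `sorted` returns a new list, the caller's array is left unchanged, unlike the in-place Sort(A) in the pseudocode.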

The main question of today's lecture: Can we do better?

Important special cases:

  • Can find minimum or maximum with n - 1 comparisons.
  • Can find minimum and maximum together with about 3n/2 comparisons, by processing the elements in pairs.
  • For any fixed k ≥ 1, can select the kth or (n - k)th element in Θ(n) time by k applications of minimum/maximum + deletion.
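
The second special case can be sketched in Python as follows. This is one standard way to reach roughly 3n/2 comparisons (compare the two elements of each pair first, then compare the smaller against the running minimum and the larger against the running maximum); the function name `min_and_max` and the returned comparison counter are additions made here for illustration.

```python
def min_and_max(a):
    """Return (minimum, maximum, comparisons) for a non-empty list a."""
    n = len(a)
    if n == 0:
        raise ValueError("a must be non-empty")
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element seeds both running values.
        lo = hi = a[0]
        start = 1
    else:
        # Even length: one comparison orders the first pair.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    # Each remaining pair costs three comparisons instead of four.
    for j in range(start, n, 2):
        x, y = a[j], a[j + 1]
        small, large = (x, y) if x < y else (y, x)
        if small < lo:
            lo = small
        if large > hi:
            hi = large
        comparisons += 3
    return lo, hi, comparisons
```

Finding the minimum and the maximum independently would take 2(n - 1) comparisons; pairing saves roughly a quarter of them.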

Selection in Worst Case Linear Time

Presentation in CLR is confusing. I prefer the following explanation.

We will consider the median-of-median-of-c algorithm, where c is an integer constant ≥ 1.

Median-of-Median-of-c(c, A, i)

  1. View input array A[1..n] as a two-dimensional array B[1..c, 1..n/c]. (If c does not evenly divide n, we can always pad the array with extra large elements that won't affect the result.) Sort each column using any method (even quicksort, whose worst case is quadratic). Since each column contains only c elements and c is a constant, sorting all n/c columns takes linear time overall, not quadratic. After this step, the entries of row B[(c+1)/2, 1..n/c] are the medians of their respective columns.
  2. Use Median-of-Median-of-c(c, B[(c+1)/2, 1..n/c], (n/c+1)/2) to find the median of medians mm. At least about 1/4 of the elements are ≥ mm, and at least about 1/4 are ≤ mm, so neither side of mm can hold more than about 3/4 of the elements. (Can see this by imagining that the columns are sorted from left to right by their medians. Important: we only imagine this sorting, we don't actually do it!)
  3. Partition A around mm, guaranteeing that mm ends up in the high partition. Let k be the number of elements in the low partition and (n - k) the number of elements in the high partition.
  4. If i ≤ k, use Median-of-Median-of-c with index i on the low partition; if i > k, use Median-of-Median-of-c with index i - k on the high partition.
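
The four steps can be sketched in Python as below. Several details are assumptions of this sketch rather than part of the handout: inputs of at most 2c elements are sorted directly as a base case, the last column is allowed to be short instead of being padded, each column's lower median is used, c must be at least 2 so that the list of medians is shorter than the input, and the elements must be distinct (as the handout assumes) for the recursion to make progress.

```python
def median_of_median_of_c(c, a, i):
    """Return the i-th smallest (1-indexed) element of the distinct values in a."""
    n = len(a)
    if c < 2:
        raise ValueError("this sketch needs c >= 2 so the list of medians shrinks")
    if not 1 <= i <= n:
        raise IndexError("rank i must satisfy 1 <= i <= n")

    # Base case (an assumption of this sketch): sort small inputs directly.
    if n <= 2 * c:
        return sorted(a)[i - 1]

    # Step 1: split into columns of c elements and sort each column.
    # The last column may be short; no padding is used here.
    columns = [sorted(a[j:j + c]) for j in range(0, n, c)]
    medians = [col[(len(col) - 1) // 2] for col in columns]  # lower medians

    # Step 2: recursively find the (lower) median of the column medians.
    mm = median_of_median_of_c(c, medians, (len(medians) + 1) // 2)

    # Step 3: partition around mm, placing mm in the high partition.
    low = [x for x in a if x < mm]
    high = [x for x in a if x >= mm]
    k = len(low)

    # Step 4: recurse into the partition that contains rank i.
    if i <= k:
        return median_of_median_of_c(c, low, i)
    return median_of_median_of_c(c, high, i - k)
```

With n > 2c there are at least three columns, so at least one column median is smaller than mm and the low partition is non-empty; since mm itself lands in the high partition, both recursive calls in step 4 receive strictly fewer than n elements.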

Analysis: