The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
For example, for arr = [2,3,4], the median is 3, and for arr = [2,3], the median is (2 + 3) / 2 = 2.5.
Implement the MedianFinder class:
- MedianFinder() initializes the MedianFinder object.
- void addNum(int num) adds the integer num from the data stream to the data structure.
- double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
This solution uses two heaps (`small` and `large`) to keep track of the data stream and efficiently find the median.
- Heap Structure:
  - We use two heaps:
    - `small`: A max-heap that stores the smaller half of the elements, with the largest of them at the root. Python's `heapq` module only provides a min-heap, so we simulate a max-heap by negating the elements before inserting them.
    - `large`: A min-heap that stores the larger half of the elements, with the smallest of them at the root.
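To make the negation trick concrete, here is a minimal sketch of a max-heap built on `heapq`. The sample values and the variable names `largest`, `popped`, and `next_largest` are illustrative only and are not part of the solution itself.

```python
import heapq

# heapq is a min-heap, so storing negated values puts the largest
# original value at index 0.
small = []
for value in (3, 1, 4):
    heapq.heappush(small, -value)

largest = -small[0]              # peek at the maximum without removing it
popped = -heapq.heappop(small)   # remove the maximum, negating it back
next_largest = -small[0]         # the new maximum after the pop
```

Every read from `small` needs the same negation, which is why `-1 * self.small[0]` appears wherever the top of `small` is inspected.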
- Adding Numbers:
  - When a new number (`num`) is added (`addNum` function):
    - We negate `num` and push it onto the `small` heap (`heapq.heappush(self.small, -1 * num)`).
- Balancing the Heaps:
  - After adding, we need to ensure both heaps have roughly the same number of elements to accurately calculate the median. Because every new number goes to `small` first, we also need to confirm that no element in `small` is greater than the smallest element in `large`. Here's how we achieve balance:
  - Order Check:
    - If both heaps are non-empty and the largest element of `small` is greater than the smallest element of `large` (`-1 * self.small[0] > self.large[0]`), we pop it from `small`, negate it back, and push it onto `large`. Without this step, adding 1, 2, 3 would leave 3 at the top of `small`, and `findMedian` would return 3 instead of 2.
  - Case 1: Small Heap Overflows:
    - If the `small` heap has more elements than the `large` heap by more than 1 (`len(self.small) > len(self.large) + 1`), we:
      - Pop the largest element (`val`) from `small`. Because values are stored negated, this is the most negative stored value, so negate it back to its original value.
      - Push this element (`val`) onto the `large` heap (min-heap). This keeps the two sizes within one of each other.
  - Case 2: Large Heap Overflows:
    - Similarly, if the `large` heap has more elements than the `small` heap by more than 1 (`len(self.large) > len(self.small) + 1`), we:
      - Pop the smallest element (`val`) from `large` (min-heap).
      - Negate `val` before pushing it onto the `small` heap, since `small` stores negated values to behave as a max-heap. This keeps the smaller half from falling too far behind.
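The insertion and balancing steps can be sketched as a single function. This is a minimal sketch, not a definitive implementation: the standalone function name `add_num` and passing the heaps as plain lists are assumptions made to keep the example self-contained (in the class they would be `self.small` and `self.large`). It includes an ordering check (moving the top of `small` to `large` when it exceeds the top of `large`), which is needed for correctness whenever every number is pushed to `small` first.

```python
import heapq

def add_num(small, large, num):
    """Insert num into the two heaps (hypothetical helper for illustration).

    small holds negated values (simulated max-heap); large is a min-heap.
    """
    # Every new number goes to the lower half first.
    heapq.heappush(small, -num)

    # Ordering check: the largest of the lower half must not exceed
    # the smallest of the upper half.
    if small and large and -small[0] > large[0]:
        heapq.heappush(large, -heapq.heappop(small))

    # Size checks: the heaps may differ in size by at most one.
    if len(small) > len(large) + 1:
        heapq.heappush(large, -heapq.heappop(small))
    if len(large) > len(small) + 1:
        heapq.heappush(small, -heapq.heappop(large))

small, large = [], []
for n in (1, 2, 3):
    add_num(small, large, n)
# small now holds [1] (stored as -1) and large holds [2, 3]
```

With the stream 1, 2, 3, the number 3 first lands in `small`, the ordering check moves it to `large`, and the sizes end up 1 and 2, so no further move is needed.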
- Finding the Median:
  - The `findMedian` function calculates the median based on the current heap sizes:
    - If `small` has more elements: The median is the (negated) maximum element from `small` (`-1 * self.small[0]`).
    - If `large` has more elements: The median is the minimum element from `large` (`self.large[0]`).
    - If both heaps have the same size: The median is the average of the maximum element from `small` (negated) and the minimum element from `large` (`((-1 * self.small[0]) + self.large[0]) / 2`).
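The three cases can be sketched as a small function over the two heaps. The name `find_median` and the list arguments are assumptions for illustration (in the class these would be `self.small` and `self.large`), and the sketch assumes at least one number has already been added.

```python
def find_median(small, large):
    """Return the median given the two heaps (hypothetical helper).

    small holds negated values (simulated max-heap); large is a min-heap.
    Assumes the heaps are balanced and at least one of them is non-empty.
    """
    if len(small) > len(large):
        return float(-small[0])        # odd count, extra element in small
    if len(large) > len(small):
        return float(large[0])         # odd count, extra element in large
    return (-small[0] + large[0]) / 2  # even count: mean of the two roots

print(find_median([-1], [2]))      # 1.5
print(find_median([-2, -1], [3]))  # 2.0
```

The two calls correspond to the states in Example 1 after adding 1, 2 and after adding 1, 2, 3. Only the roots of the heaps are read, so no element is removed.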
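Efficiency:
- `addNum` runs in O(log n) time, since it performs a constant number of heap pushes and pops, and `findMedian` runs in O(1) time, since it only reads the roots of the two heaps. The two heaps together store all n elements, so the space usage is O(n).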