Question: https://codility.com/demo/take-sample-test/double_median

Question Name: nu2011 or DoubleMedian

For the given queries, we need to find the median in two sorted arrays of different lengths. Then we need to find out the median of the query results. That is, we need to find the median in an unsorted array. Both of these sub-questions are classic. And you could find tons of articles via Google. For the second sub question, typically we have two methods. One solution is to sort the array and return the element in the middle position. This method could guarantee the time complexity be O(NlogN) in any case. And this is what the official golden solution used.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | def findMedian(A, B, AFrom, ATo, BFrom, BTo): # A classic question: find median in two sorted array # For more details, please Google it. result = 0 while True: # AMid and BMid are current medians for A and B # sub-arrays respectively AMid = A[(AFrom+ATo)/2] BMid = B[(BFrom+BTo)/2] if AFrom == ATo: # Only one element in A's sub-array. No need # for further iteration if AMid < B[(BFrom+BTo)/2]: result = B[(BFrom+BTo)/2] elif AMid > B[(BFrom+BTo)/2+1]: result = B[(BFrom+BTo)/2+1] else: result = AMid break elif BFrom == BTo: # Only one element in B's sub-array. No need # for further iteration if BMid < A[(AFrom+ATo)/2]: result = A[(AFrom+ATo)/2] elif BMid > A[(AFrom+ATo)/2+1]: result = A[(AFrom+ATo)/2+1] else: result = BMid break elif AMid == BMid: # Median is found result = AMid break elif AMid > BMid: # Median must be in the left part of A's # sub-array OR the right part of B's sub- # array reduced = (min(ATo-AFrom, BTo-BFrom)+1)/2 ATo -= reduced BFrom += reduced else: # Median must be in the right part of A's # sub-array OR the left part of B's sub- # array reduced = (min(ATo-AFrom, BTo-BFrom)+1)/2 AFrom += reduced BTo -= reduced return result def solution(A, B, P, Q, R, S): questionCount = len(P) answer = [0] * questionCount for i in xrange(questionCount): # Find the median for each query answer[i] = findMedian(A, B, P[i], Q[i], R[i], S[i]) answer.sort() return answer[questionCount/2] |

The other method is quick-select, which is similar with quicksort sorting algorithm. There are many variants of quick-select. And we are using a simple one: quick-select with shuffle. The expected time complexity is O(N). And in the worst case, the time complexity is O(N^2). In practice, it should work quite quickly.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | import random def findTwoArrayMedian(A, B, AFrom, ATo, BFrom, BTo): # A classic question: find median in two sorted array # For more details, please Google it. result = 0 while True: # AMid and BMid are current medians for A and B # sub-arrays respectively AMid = A[(AFrom+ATo)/2] BMid = B[(BFrom+BTo)/2] if AFrom == ATo: # Only one element in A's sub-array. No need # for further iteration if AMid < B[(BFrom+BTo)/2]: result = B[(BFrom+BTo)/2] elif AMid > B[(BFrom+BTo)/2+1]: result = B[(BFrom+BTo)/2+1] else: result = AMid break elif BFrom == BTo: # Only one element in B's sub-array. No need # for further iteration if BMid < A[(AFrom+ATo)/2]: result = A[(AFrom+ATo)/2] elif BMid > A[(AFrom+ATo)/2+1]: result = A[(AFrom+ATo)/2+1] else: result = BMid break elif AMid == BMid: # Median is found result = AMid break elif AMid > BMid: # Median must be in the left part of A's # sub-array OR the right part of B's sub- # array reduced = (min(ATo-AFrom, BTo-BFrom)+1)/2 ATo -= reduced BFrom += reduced else: # Median must be in the right part of A's # sub-array OR the left part of B's sub- # array reduced = (min(ATo-AFrom, BTo-BFrom)+1)/2 AFrom += reduced BTo -= reduced return result def findSingleArrayMedian(content): # Find the median in the unsorted array "content" # Methond: quick-select with shuffle begin = 0 end = len(content) target = end/2 random.shuffle(content) # Similar with 3-way quick-sort while True: current = content[begin] i, lt, gt = begin, begin, end - 1 # After one while loop, all the elements, from (include) # begin to (exclude) lt, are less than chosen pivot # "current". All the elements, from (include) lt to # (include) gt, are same as the pivot. All the elements, # from (exclude) gt to end, are greater than pivot. while i <= gt: if content[i] < current: content[i], content[lt] = content[lt], content[i] i += 1 lt += 1 elif content[i] > current: content[i], content[gt] = content[gt], content[i] gt -= 1 else: i += 1 if target >= lt and target <= gt: # The median equals to the chosen pivot return content[target] elif target < lt: end = lt else: begin = gt + 1 def solution(A, B, P, Q, R, S): questionCount = len(P) answer = [0] * questionCount for i in xrange(questionCount): # Find the median for each query answer[i] = findTwoArrayMedian(A, B, P[i], Q[i], R[i], S[i]) # Find the median of the query results return findSingleArrayMedian(answer) |