Question: https://codility.com/demo/take-sample-test/count_distinct_slices

Question Name: Count-Distinct-Slices or CountDistinctSlices

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | def solution(M, A): accessed = [-1] * (M + 1) # -1: not accessed before # Non-negative: the previous occurrence position front, back = 0, 0 result = 0 for front in xrange(len(A)): if accessed[A[front]] == -1: # Met with a new unique item accessed[A[front]] = front else: # Met with a duplicate item # Compute the number of distinct slices between newBack-1 and back # position. newBack = accessed[A[front]] + 1 result += (newBack - back) * (front - back + front - newBack + 1) / 2 if result >= 1000000000: return 1000000000 # Restore and set the accessed array for index in xrange(back, newBack): accessed[A[index]] = -1 accessed[A[front]] = front back = newBack # Process the last slices result += (front - back + 1) * (front - back + 2) / 2 return min(result, 1000000000) |

Hi,

I was trying to solve this problem in Java, and couldn’t figure out where I went wrong. Perhaps, could you take a look at it and point out the mistake?

Your method to update the begin index is incorrect.

I will explain a little bit of code which was cryptic to me:

result += (newBack – back) * (front – back + front – newBack + 1) / 2

The above line of code is equal to the below ones:

result += (front – back) * (front – back + 1) / 2

result -= (front – newBack) * (front – newBack + 1) / 2

Thanks for improving the post with more details!

Hey Sheng ,

Can you please explain following line

‘ result += (newBack – back) * (front – back + front – newBack + 1) / 2 ‘

Almost every solution uses this but I havent found why this formula is used .

Have you used any theorem to derive this ?

Just like Mr. Botond Orban said, you can deduce the formula from his equation set:

(front – back) * (front – back + 1) / 2 – (front – newBack) * (front – newBack + 1) / 2

= (newBack – back) * (front – back + front – newBack + 1) / 2

PS: the formula: 1+2+3+4+…+n-1+n = n*(n+1)/2

Thanks for your explanation! Your first comment needs my approval. Otherwise, it will show automatically.

Please refer to comment from @Botond Orban:

result += (newBack – back) * (front – back + front – newBack + 1) / 2 is the same as:

result += (front – back) * (front – back + 1) / 2 – (front – newBack) * (front – newBack + 1) / 2.

And

(front – back) * (front – back + 1) / 2 is the number of slices in A[back : front + 1],

(front – newBack) * (front – newBack + 1) / 2. is for A[newBack : front + 1]

Ok, I got 90 and always failed the range check. I didn’t figure out the reason until 2016! 🙂

For whatever ever reason, I thought the int_max is something starting with 2 and with 11 digits, but it only has 10 digits. So, after this great discovery, I finally got my first successful submit in 2016!

BTW, the cleanup of the lookup table may not be necessary.

Well, the reason I came back here is because I fall into the same trap again 🙁

First of all, I overlooked the integer overflow here again. Then, when I tried to fix it, I put 1e10 instead of 1e9, which took me 4 hours to figure that out!

“got 705082704 expected 1000000000”

Sad, old enough to count 0s wrong…

I present here a slight different solution than above using array, which is almost as same as Sheng’s python solution.

Sometimes, I used len(“1000000000”) in Python to count the 0s 🙂

Well, I have tried this one with C#. Apparently, everything is correct, my code is correct and efficient, but for particularly one of the tests of performance, it says the following:

WRONG ANSWER

got 705082704 expected 1000000000

So, what have I done? I submitted a search on Google with the two following terms:

codility 705082704

Then I found your post. I don’t understand this, because I am 99.99% that my code is correct. I am testing millions (literally) of cases and my code seems to be correct. I would understand a mistake if the result 705,082,704 was bigger than 1,000,000,000 (because when the result is bigger than 1,000,000,000 it should return 1,000,000,000), but IT ISN’T, 705,082,704 is lower than 1,000,000,000. I don’t understand it…

another one that’s 100/100 but doesn’t use the n(n+1)/2 formula, as it instead adds up the new slices on each loop iteration, making the code longer but the math simpler

It becomes a bit hard to read code when you name your variables this way. Here is another implementation that doesn’t use n(n + 1)/2 formula, and uses a single loop:

Hi

Here is my solution… it was easier for me to understand… basically I am counting the eligible slice ending at a given index.

Codility 100%, no math, one plain cycle solution.

Hope code is tagged correctly now 🙂

It is correctly tagged now. Thanks very much for sharing!

this is my sample code

Thanks for sharing. However, it does not satisfy the requirement: expected worst-case time complexity is O(N).

I think the codility guideline is missing the slice (1,4) and not sure why they didn’t include

slice (2, 4) when in fact they did include slice (3, 4)

With input [3, 4, 5, 5, 2], slice (1, 4) is [4, 5, 5, 2] and slice (2, 4) is [5, 5, 2]. Both slices contain the duplicated number

`5.`

Glad to see that you are actively participating. Really thanks for taking out time to reply. I do hope to get done with these demo tests so that I can come back to discuss the new unsolved questions.

You are very welcome! I have much less time on this site. But I am trying to keep it alive 🙂

Simple but efficient

This is my C++ solution:

https://app.codility.com/demo/results/training4UV3KH-Q5M/