Solution to Count-Distinct-Slices by codility

April 17, 2014 Sheng 25

Question: https://codility.com/demo/take-sample-test/count_distinct_slices

Question Name: Count-Distinct-Slices or CountDistinctSlices

def solution(M, A):
    accessed = [-1] * (M + 1)   # -1: not accessed before
                                # Non-negative: the previous occurrence position
    front, back = 0, 0
    result = 0
    for front in xrange(len(A)):
        if accessed[A[front]] == -1:
            # Met with a new unique item
            accessed[A[front]] = front
        else:
            # Met with a duplicate item
            # Compute the number of distinct slices between newBack-1 and back
            # position.
            newBack = accessed[A[front]] + 1
            result += (newBack - back) * (front - back + front - newBack + 1) / 2
            if result >= 1000000000: return 1000000000
            # Restore and set the accessed array
            for index in xrange(back, newBack):
                accessed[A[index]] = -1
            accessed[A[front]] = front
            back = newBack
    # Process the last slices
    result += (front - back + 1) * (front - back + 2) / 2
    return min(result, 1000000000)

25 Replies to “Solution to Count-Distinct-Slices by codility”

Hi,
I was trying to solve this problem in Java and couldn’t figure out where I went wrong. Could you take a look at it and point out the mistake?

public static int distinctSlices(int M, int[] A) {
    // write your code in Java SE 8
    long slices=0;
    int begin=0,end=0;
    HashSet<Integer> set = new HashSet<Integer>();
    while(end<=A.length-1)
    {
        while(set.contains(A[end])==false)
        {
            set.add(A[end]);
            end++;
            if(end==A.length)
            {
                break;
            }
        }
        slices += ((end-begin)*(end-begin+1))/2;
        if(slices>1000000000)
        {
            return 1000000000;
        }
        begin=end;
        set.clear();
    }
    return (int)slices;
}

Sheng says:
March 18, 2015 at 11:45 pm
Your method to update the begin index is incorrect.
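
To make the point about the begin index concrete, here is a minimal Python sketch. `flawed_count` is a port of the approach in the question (the brace structure of the posted Java is partly garbled, so the port is an assumption about what was intended), `brute_force_count` is a hypothetical reference counter, and the input `[1, 2, 3, 2]` is chosen only for illustration. The 1,000,000,000 cap is left out for brevity.

```python
def flawed_count(A):
    """Port of the questioned approach: the window restarts at `end` on a duplicate."""
    slices = 0
    begin = end = 0
    seen = set()
    while end < len(A):
        # Grow the window while the next item is new.
        while end < len(A) and A[end] not in seen:
            seen.add(A[end])
            end += 1
        n = end - begin
        slices += n * (n + 1) // 2
        begin = end      # questionable step: every item before `end` is dropped
        seen.clear()
    return slices


def brute_force_count(A):
    """Count slices (P, Q) without a repeated value by direct enumeration, O(N^2)."""
    total = 0
    for p in range(len(A)):
        seen = set()
        for q in range(p, len(A)):
            if A[q] in seen:
                break
            seen.add(A[q])
            total += 1
    return total


example = [1, 2, 3, 2]   # illustrative input, not from the original post
print(flawed_count(example), brute_force_count(example))
```

On this input the port undercounts by one: the slice (2, 3), which is `[3, 2]`, straddles the restart point and is never counted. Moving `begin` to just past the earlier copy of the duplicate, rather than to `end`, keeps such slices in play.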

Botond Orban says:
March 31, 2015 at 10:54 am
I will explain a little bit of code which was cryptic to me:
result += (newBack - back) * (front - back + front - newBack + 1) / 2
The above line of code is equal to the below ones:
result += (front - back) * (front - back + 1) / 2
result -= (front - newBack) * (front - newBack + 1) / 2
- Sheng says:
  April 5, 2015 at 8:04 pm
  Thanks for improving the post with more details!
Nishant Sthalekar says:
June 23, 2015 at 6:27 pm
Hey Sheng,
Can you please explain the following line:
result += (newBack - back) * (front - back + front - newBack + 1) / 2
Almost every solution uses this, but I haven’t found why this formula is used.
Have you used any theorem to derive this?
- Hong Bo Niu says:
  July 6, 2015 at 8:15 am
  Just like Mr. Botond Orban said, you can deduce the formula from his equation set:
  (front - back) * (front - back + 1) / 2 - (front - newBack) * (front - newBack + 1) / 2
  = (newBack - back) * (front - back + front - newBack + 1) / 2
  PS: the formula: 1 + 2 + 3 + 4 + … + (n-1) + n = n*(n+1)/2
  - Sheng says:
    July 23, 2015 at 11:26 pm
    Thanks for your explanation! Only your first comment needs my approval; later ones will show up automatically.
- Sheng says:
  July 16, 2015 at 11:42 pm
  Please refer to the comment from @Botond Orban:
  result += (newBack - back) * (front - back + front - newBack + 1) / 2 is the same as:
  result += (front - back) * (front - back + 1) / 2 - (front - newBack) * (front - newBack + 1) / 2.
  Here,
  (front - back) * (front - back + 1) / 2 is the number of slices in A[back : front], and
  (front - newBack) * (front - newBack + 1) / 2 is the number of slices in A[newBack : front].
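
The identity discussed in this thread can be checked numerically. The sketch below is only a sanity check, not part of the original solution. It reads the window as the `front - back` distinct items just before the duplicate at `front`, and the helper names (`slices_in_window`, `combined`, `difference`, `enumerated`) are made up for illustration.

```python
def slices_in_window(length):
    """Number of non-empty contiguous slices in a window of `length` items."""
    return length * (length + 1) // 2


def combined(back, new_back, front):
    """The single-line form used in the post."""
    return (new_back - back) * (front - back + front - new_back + 1) // 2


def difference(back, new_back, front):
    """Slices of the old window minus slices of the part that stays in play."""
    return slices_in_window(front - back) - slices_in_window(front - new_back)


def enumerated(back, new_back, front):
    """Directly count slices that start in [back, new_back) and end before front."""
    return sum(1 for start in range(back, new_back) for end in range(start, front))


# All three agree for every back <= new_back <= front in a small range.
for back in range(6):
    for new_back in range(back, 8):
        for front in range(new_back, 10):
            expected = enumerated(back, new_back, front)
            assert combined(back, new_back, front) == expected
            assert difference(back, new_back, front) == expected
```

In other words, each duplicate retires exactly the slices that begin at one of the dropped positions `back .. newBack - 1`; slices that begin at `newBack` or later are counted on a later step.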

Ok, I got 90 and always failed the range check. I didn’t figure out the reason until 2016! 🙂
For whatever reason, I thought INT_MAX was something starting with 2 and having 11 digits, but it only has 10 digits. So, after this great discovery, I finally got my first successful submission in 2016!
BTW, the cleanup of the lookup table may not be necessary.

#include <cassert>
#include <vector>
#include <unordered_map>
#include <algorithm>
using namespace std;

int solution(int M, vector<int> &A) {
    int len = A.size();
    assert(len > 0 && M >= 0 && M < 100001);
    unordered_map<int, int> map;   // value -> index of its latest occurrence
    unordered_map<int, int>::const_iterator valueItor;
    long long count = 0, i = 0, j = 0, k;
    while (j < len)
    {
        valueItor = map.find(A[j]);
        // Compare against map.end() directly: a cached end iterator
        // may be invalidated when an insertion triggers a rehash.
        if (valueItor == map.end() || valueItor->second < i)
        {
            map[A[j]] = j;
            ++j;
        }
        else
        {
            count += (j - i)*(j - i + 1) / 2;
            k = j - valueItor->second - 1;
            count -= k*(k + 1) / 2;
            i = valueItor->second + 1;
            if (count >= 1000000000ll)return 1000000000;
        }
    }
    count += (j - i)*(j - i + 1) / 2;
    return (int)std::min(count, 1000000000ll);
}

Well, the reason I came back here is that I fell into the same trap again 🙁
First of all, I overlooked the integer overflow here again. Then, when I tried to fix it, I put 1e10 instead of 1e9, and it took me 4 hours to figure that out!
“got 705082704 expected 1000000000”
Sad, old enough to count 0s wrong…
Here is a slightly different solution from the one above, using an array; it is almost the same as Sheng’s Python solution.

int solutionCountDistinctSlices(int M, vector<int> &A)
{
    int len = A.size(), lasti = 0;
    long long count = 0, i = 0, offset = 0;
    vector<int> memo = vector<int>(M + 1, -1);
    for (i = 0; i < len; ++i)
    {
        if (memo[A[i]] < lasti)
            memo[A[i]] = i;
        else
        {
            offset = i - memo[A[i]] - 1;
            count += (i - lasti + 1) * (i - lasti) / 2 - (offset + 1) * offset / 2;
            if (count > 1000000000ll)return 1e9;
            lasti = memo[A[i]] + 1;
            memo[A[i]] = i;
        }
    }
    return std::min(1000000000LL, count + (i - lasti + 1) * (i - lasti) / 2);
}

Sheng says:
March 14, 2016 at 9:54 pm
Sometimes, I used len(“1000000000”) in Python to count the 0s 🙂
Carlos Martínez Trueba says:
June 21, 2023 at 4:45 pm
Well, I have tried this one with C#. Apparently, everything is correct, my code is correct and efficient, but for one of the performance tests in particular, it says the following:
WRONG ANSWER
got 705082704 expected 1000000000
So, what have I done? I submitted a search on Google with the two following terms:
codility 705082704
Then I found your post. I don’t understand this, because I am 99.99% sure that my code is correct. I am testing millions (literally) of cases and my code seems to be correct. I would understand a mistake if the result 705,082,704 was bigger than 1,000,000,000 (because when the result is bigger than 1,000,000,000 it should return 1,000,000,000), but IT ISN’T, 705,082,704 is lower than 1,000,000,000. I don’t understand it…
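
One plausible explanation for the value 705082704, offered as an assumption since the failing code is not shown: the test has 100,000 pairwise distinct elements, and the count (or the product n * (n + 1) inside the formula) is computed in a signed 32-bit integer. Python integers do not overflow, so the sketch below simulates the wraparound with a hypothetical helper `wrap_int32`.

```python
def wrap_int32(x):
    """Reduce x to the value a signed 32-bit integer would hold after overflow."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


n = 100000                          # assumed: 100,000 distinct values
true_count = n * (n + 1) // 2       # every slice is distinct

# Case 1: the running total itself is stored in a 32-bit int.
wrapped_total = wrap_int32(true_count)

# Case 2: n * (n + 1) is multiplied in 32-bit ints, then halved.
wrapped_formula = wrap_int32(n * (n + 1)) // 2

print(true_count, wrapped_total, wrapped_formula)
```

Under that assumption, both cases give 705082704. The overflowed value happens to land below 1,000,000,000, so a cap check applied after the arithmetic never fires. Doing the arithmetic in a 64-bit type (`long` in C#, `long long` in C++) and capping afterwards avoids this.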

Another one that’s 100/100 but doesn’t use the n(n+1)/2 formula; instead it adds up the new slices on each loop iteration, making the code longer but the math simpler.

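def solution(M, A):
    mm = [0]*(M+1)   # mm[v] == 1 while value v is inside the current window
    nb = 0           # index where the current window starts
    nf = 0
    ns = 0           # running number of distinct slices
    mr = False
    ln = 0
    for ii in range(len(A)):
        if mm[A[ii]]==0:
            mm[A[ii]] = 1
            ln+=1
            ns+=ii-nb+1
            if ns>1000000000:
                return 1000000000
        else:
            mh = A[ii]
            nf = ii
            # drop items from the window start up to the earlier copy of mh
            while A[nb]!=mh:
                mm[A[nb]] = 0
                nb+=1
            # step past the earlier copy itself (this line seems to have been
            # lost from the posted code; without it the count is off)
            nb+=1
            ns+=ii-nb+1
            #print nb
    return min(ns, 1000000000)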

It becomes a bit hard to read code when you name your variables this way. Here is another implementation that doesn’t use n(n + 1)/2 formula, and uses a single loop:

int solution(int M, vector<int> &A) {
    vector<bool> slice(M + 1, false);
    long long int res = 0;
    int front = 0, back = 0;
    while (front != A.size()) {
        if (back != A.size() && slice[A[back]] == false) {
            slice[A[back]] = true;
            ++back;
        }
        else {
            res += (back - front);
            if(res > 1000000000) {
                return 1000000000;
            }
            slice[A[front]] = false;
            ++front;
        }
    }
    return res;
}

Hi
Here is my solution… it was easier for me to understand… basically I am counting the eligible slices ending at a given index.

def solution(M, A):
    # write your code in Python 2.7
    # M = max(A)
    counter = [0]*(1+M)
    N = len(A)
    back = 0
    counter[A[0]] = 1
    result = 1
    for front in range(1, N):
        # print counter
        if counter[A[front]] != 0:
            while back < front and counter[A[front]] != 0:
                counter[A[back]] -= 1
                back += 1
        # print back, front
        result += (front - back + 1)
        # print result
        if result > 1000000000:
            return 1000000000
        counter[A[front]] += 1
    return result

Codility 100%, no math, one plain cycle solution.
Hope code is tagged correctly now 🙂

public static int solution(int[] A)
{
    // array to remember last positions of values
    int vMax = A[0];
    for (int i = 1; i < A.Length; ++i) vMax = Math.Max(vMax, A[i]);
    int[] vLastPos = new int[vMax + 1];
    for (int i = 0; i < vLastPos.Length; ++i) vLastPos[i] = -1;

    // each element adds the same number of slices as is the length of current distinct slice
    int vSlices = 0, vNewStart = -1;
    for (int i = 0; i < A.Length; ++i)
    {
        int vVal = A[i];
        int vPrevPos = vLastPos[vVal];
        vSlices += i - Math.Max(vPrevPos, vNewStart);
        if (vSlices > 1000000000) return 1000000000;
        if (vPrevPos != -1) vNewStart = Math.Max(vNewStart, vPrevPos); // actual start of distinct slice
        vLastPos[vVal] = i;
    }
    return vSlices;
}

Sheng says:
May 23, 2016 at 9:47 pm
It is correctly tagged now. Thanks very much for sharing!

this is my sample code

# you can write to stdout for debugging purposes, e.g.
# print "this is a debug message"
def solution(M, A):
    # write your code in Python 2.7
    len_A=len(A)
    counter=0
    for P in range (0,len_A):
        for Q in range (0 , len_A):
            PQ=sum(A[P:Q])
            if PQ<=M and 0<=P<=Q:
                counter+=1
    return counter

Sheng says:
July 10, 2016 at 9:53 pm
Thanks for sharing. However, it does not satisfy the requirement: expected worst-case time complexity is O(N).

Doctor Tex says:
April 30, 2019 at 7:58 am
I think the Codility guideline is missing the slice (1, 4), and I’m not sure why they didn’t include slice (2, 4) when in fact they did include slice (3, 4).
- Sheng says:
  May 4, 2019 at 11:42 pm
  With input [3, 4, 5, 5, 2], slice (1, 4) is [4, 5, 5, 2] and slice (2, 4) is [5, 5, 2]. Both slices contain the duplicated number 5.
  - Doctor Tex says:
    May 6, 2019 at 4:19 pm
    Glad to see that you are actively participating. Thanks a lot for taking the time to reply. I do hope to get done with these demo tests so that I can come back to discuss the new unsolved questions.
    - Sheng says:
      May 17, 2019 at 10:38 am
      You are very welcome! I have much less time for this site now, but I am trying to keep it alive 🙂
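
For the example input [3, 4, 5, 5, 2] discussed in this thread, the distinct slices can be listed by brute force. This is a minimal sketch for checking small cases by hand, not an O(N) solution; the function name `distinct_slices` is made up for illustration.

```python
def distinct_slices(A):
    """Return every (P, Q) with P <= Q such that A[P..Q] has no repeated value."""
    found = []
    for p in range(len(A)):
        seen = set()
        for q in range(p, len(A)):
            if A[q] in seen:
                break            # any longer slice from p also contains the repeat
            seen.add(A[q])
            found.append((p, q))
    return found


example = [3, 4, 5, 5, 2]
slices = distinct_slices(example)
print(len(slices), slices)
```

This lists 9 slices. (3, 4) is among them because `[5, 2]` has no repeat, while (1, 4) and (2, 4) are absent because both span the two 5s.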

Simple but efficient

def solution(M, A):
    # write your code in Python 3.6
    visited = [0]*(M+1)
    front = 0
    N = len(A)
    total = 0
    for back in range(N):
        while(front < N and visited[A[front]] != 1):
            visited[A[front]] = 1
            front += 1
        total += front - back
        visited[A[back]] = 0
        if(total > 1000000000):
            return 1000000000
    return total

This is my C++ solution:
https://app.codility.com/demo/results/training4UV3KH-Q5M/

#include <algorithm>
#include <unordered_map>
#include <vector>

int solution(int M, std::vector<int> &A) {
    const int MAX_RESULT = 1'000'000'000;
    std::unordered_map<int, int> index_lookup;   // value -> index of its latest occurrence
    int result = 0;
    int begin = -1;   // index just before the start of the current distinct window
    int end = 0;
    while (end < A.size()) {
        int current = A[end];
        int previous_position = -1;
        if(index_lookup.find(current) != index_lookup.end()) {
            previous_position = index_lookup[current];
        }
        begin = std::max(begin, previous_position);
        result += end - begin;   // number of distinct slices ending at `end`
        if (result > MAX_RESULT) {
            return MAX_RESULT;
        }
        index_lookup[current] = end;
        end++;
    }
    return result;
}
