Time and Space Complexity of Count-Min Sketch Data Structure
Now let’s look at the time and space consumed with this type of approach (wrt to above Java-Guava Implementation)
Number of UUIDs |
Multiset Insertion Time(ms) |
CMS Insertion Time(ms) |
---|---|---|
10 |
<25 |
35 |
100 |
<25 |
30 |
1,000 |
30 |
69 |
10,000 |
257 |
246 |
100,000 |
1200 |
970 |
1,000,000 |
4244 |
4419 |
Let’s have a look at the memory (space) consumed:
Number of UUIDs |
Multiset JVM heap used(MB) |
CMS JVM heap used(MB) |
---|---|---|
10 |
<2 |
N/A |
100 |
<2 |
N/A |
1,000 |
3 |
N/A |
10,000 |
9 |
N/A |
100,000 |
39 |
N/A |
1,000,000 |
234 |
N/A |
Count-Min Sketch Data Structure with Implementation
The Count-Min Sketch is a probabilistic data structure and is defined as a simple technique to summarize large amounts of frequency data. Count-min sketch algorithm talks about keeping track of the count of things. i.e, How many times an element is present in the set.
Contact Us