> ## Documentation Index
> Fetch the complete documentation index at: https://hanzla.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# 01 contains duplicate

# Contains Duplicate

🔗 **NeetCode Problem**:\
[https://neetcode.io/problems/duplicate-integer/question?list=neetcode150](https://neetcode.io/problems/duplicate-integer/question?list=neetcode150)

🔗 **LeetCode Problem**:\
[https://leetcode.com/problems/contains-duplicate](https://leetcode.com/problems/contains-duplicate)

***

## Problem Statement

Given an integer array `nums`, return `true` if any value appears **at least twice** in the array, and return `false` if every element is **distinct**.

***

## Approach 1: Brute Force (Nested Loop)

### Idea

Compare every element with every other element to check if a duplicate exists.

### Pseudocode

```
for i from 0 to n-1:
    for j from i+1 to n-1:
        if nums[i] == nums[j]:
            return true
return false
```

### C++ Implementation

```cpp theme={null}
class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        for (int i = 0; i < nums.size(); i++) {
            for (int j = i + 1; j < nums.size(); j++) {
                if (nums[i] == nums[j])
                    return true;
            }
        }
        return false;
    }
};
```

### Complexity Analysis

* **Time Complexity**: **O(n²)** → Nested loops over the array
* **Space Complexity**: **O(1)** → No extra space needed

### Why it's not optimal?

For large arrays, comparing every pair becomes extremely slow. If the array has 10,000 elements, we'd do \~50 million comparisons in the worst case.

***

## Optimized Approach: Using Hash Map (Unordered Map)

### Idea

Use a hash map to store elements as we iterate.\
If an element already exists in the map, it means a duplicate is found.

### Key Insight

Hash map lookup is **O(1)** on average, so we only need a single pass through the array.

### C++ Implementation

```cpp theme={null}
class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        unordered_map<int, int> mp;

        for (int i = 0; i < nums.size(); i++) {
            if (mp.find(nums[i]) != mp.end())
                return true;

            mp[nums[i]] = i;
        }
        return false;
    }
};
```

### Complexity Analysis

* **Time Complexity**: **O(n)** → Single pass through the array with O(1) lookups
* **Space Complexity**: **O(n)** → Hash map stores up to n elements

### Why it's better?

Single pass with constant-time lookups beats nested loops. Early termination when duplicate is found.

***

## Alternative Optimized Approach: Using Hash Set

### Idea

Similar to hash map, but we only need to store the elements themselves, not their indices.

### C++ Implementation

```cpp theme={null}
class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        unordered_set<int> seen;

        for (int num : nums) {
            if (seen.find(num) != seen.end())
                return true;

            seen.insert(num);
        }
        return false;
    }
};
```

### Complexity Analysis

* **Time Complexity**: **O(n)** → Single pass with O(1) lookups
* **Space Complexity**: **O(n)** → Hash set stores up to n elements

### Advantage over Hash Map

Hash set uses slightly less memory since we don't store indices, only the elements themselves.

***

## Optimized Approach: Using Sorting

### Idea

Sort the array, then check if any adjacent elements are equal.

### C++ Implementation

```cpp theme={null}
class Solution {
public:
    bool containsDuplicate(vector<int>& nums) {
        sort(nums.begin(), nums.end());

        for (int i = 0; i < nums.size() - 1; i++) {
            if (nums[i] == nums[i + 1])
                return true;
        }
        return false;
    }
};
```

### Complexity Analysis

* **Time Complexity**: **O(n log n)** → Sorting dominates
* **Space Complexity**: **O(1)** → In-place sorting (depending on sort implementation)

### When to use?

When space is critical and you can afford sorting overhead. Not as good as hash-based approaches for time.

***

## Comparison of All Approaches

| Approach        | Time Complexity | Space Complexity | Pros                           | Cons                       |
| --------------- | --------------- | ---------------- | ------------------------------ | -------------------------- |
| **Brute Force** | O(n²)           | O(1)             | No extra space                 | Very slow for large arrays |
| **Hash Map**    | O(n)            | O(n)             | Fast, stores indices if needed | Uses extra memory          |
| **Hash Set**    | O(n)            | O(n)             | Fast, cleaner code             | Uses extra memory          |
| **Sorting**     | O(n log n)      | O(1)             | Space-efficient                | Slower than hash-based     |

***

## Why Hash Set is Most Optimal

1. **Single pass**: O(n) time complexity
2. **Fast lookups**: O(1) average case for hash operations
3. **Early termination**: Returns true immediately when duplicate is found
4. **Memory efficient**: Stores only values, not indices like hash map
5. **Simple code**: Clean and easy to understand

***

## Example Walkthroughs

### Example 1: `nums = [1, 2, 3, 1]`

**Using Hash Set:**

```
seen = {}

i=0: num=1
  - 1 not in seen
  - seen = {1}

i=1: num=2
  - 2 not in seen
  - seen = {1, 2}

i=2: num=3
  - 3 not in seen
  - seen = {1, 2, 3}

i=3: num=1
  - 1 IS in seen ✓
  - Return true
```

Duplicate found! Time: O(4) = O(n)

### Example 2: `nums = [1, 2, 3, 4]`

**Using Hash Set:**

```
seen = {}

i=0: num=1 → seen = {1}
i=1: num=2 → seen = {1, 2}
i=2: num=3 → seen = {1, 2, 3}
i=3: num=4 → seen = {1, 2, 3, 4}

Loop ends, no duplicate found
Return false
```

No duplicates. Time: O(4) = O(n)

### Example 3: `nums = [99, 99]`

**Using Hash Set:**

```
seen = {}

i=0: num=99
  - 99 not in seen
  - seen = {99}

i=1: num=99
  - 99 IS in seen ✓
  - Return true (immediately!)
```

Duplicate found on second element! Very efficient.

***

## Key Takeaways

* **Avoid brute force** for this problem (O(n²) is too slow)
* **Hash Set is optimal** for general use (O(n) time, O(n) space)
* **Sorting is an alternative** if space is strictly limited
* **Early termination** makes hash-based solutions even faster in practice
* **Hash Map** only needed if you must track indices, otherwise use Hash Set
