How to Fix Unable to Allocate Array with Shape and Data Type

When working with NumPy, we may encounter the “Unable to allocate array with shape and data type” error, which is a kind of MemoryError. It typically occurs when attempting to create a NumPy array that requires more memory than is available, for example when the system simply does not have enough free RAM.

In this tutorial, let’s walk through how to fix the “Unable to Allocate Array with Shape and Data Type” error.

What is MemoryError: Unable to Allocate Array with Shape and Data Type?

When we create a NumPy array, the library needs to allocate memory to store the array data. The amount of memory required depends on the shape and data type of the array.
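To see how much memory a given array will need before creating it, we can multiply the number of elements by the size of one element. A minimal sketch:

Python3
import numpy as np

shape = (10000, 10000)
dtype = np.dtype(np.float64)

# Required bytes = number of elements * bytes per element
required_bytes = np.prod(shape) * dtype.itemsize
print(required_bytes)                   # 800000000 (~0.75 GiB)
print(required_bytes / 1024**3, "GiB")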

The memory error in Python typically occurs when there is not enough available memory to perform a specific operation, such as allocating memory for an array with a given shape and data type. This error commonly arises when working with large datasets or performing memory-intensive computations.
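If an allocation might fail, the MemoryError can also be caught explicitly so the program degrades gracefully instead of crashing. A small sketch:

Python3
import numpy as np

try:
    arr = np.zeros((8000000000000,), dtype=np.int64)
except MemoryError as e:
    print("Allocation failed:", e)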

Error message:

MemoryError: Unable to allocate array with shape and data type

Why does Unable to Allocate Array with Shape and Data Type Occur in Python?

Common reasons why “Unable to Allocate Array with Shape and Data Type” errors occur in Python are:

  • Insufficient Available Memory
  • Memory Fragmentation

Insufficient Available Memory

This error occurs when trying to create a large array or data structure that requires more memory than is available on the system.

Python3
import numpy as np
# Trying to allocate a very large array that exceeds available memory
arr = np.zeros((8000000000000,), dtype=np.int64)  # This array is too large to fit in memory

Output:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-25-2d6cd11b6ca3> in <cell line: 3>()
      1 import numpy as np
      2 # Trying to allocate a very large array that exceeds available memory
----> 3 arr = np.zeros((8000000000000,), dtype=np.int64)  # This array is too large to fit in memory

MemoryError: Unable to allocate 58.2 TiB for an array with shape (8000000000000,) and data type int64

Memory Fragmentation

Continuous allocation and deallocation of memory can lead to fragmentation, where there might be enough total memory available, but it’s not contiguous, making it impossible to allocate a large block of memory.

Python3
import numpy as np
# Trying to allocate several large arrays that collectively exceed available memory
arr1 = np.zeros((500000000000000,), dtype=np.int64)  # Array 1
arr2 = np.zeros((500000000,), dtype=np.int64)  # Array 2
arr3 = np.zeros((500000000,), dtype=np.int64)  # Array 3
arr1

Output:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-21-e8df972484de> in <cell line: 3>()
      1 import numpy as np
      2 # Trying to allocate several large arrays that collectively exceed available memory
----> 3 arr1 = np.zeros((500000000000000,), dtype=np.int64)  # Array 1
      4 arr2 = np.zeros((500000000,), dtype=np.int64)  # Array 2
      5 arr3 = np.zeros((500000000,), dtype=np.int64)  # Array 3

MemoryError: Unable to allocate 3.55 PiB for an array with shape (500000000000000,) and data type int64

Solutions to Fix Unable to Allocate Array with Shape and Data Type

To fix the “Unable to allocate array with shape and data type” error, we can try the following steps:

  1. Reduce the size of the array
  2. Increase available memory: If reducing the size of the array is not feasible:
    • Close unused applications
    • Upgrade RAM
    • Use 64-bit Python: For larger memory needs, switch to a 64-bit version of Python (a quick check is sketched after this list).
  3. Use Memory Mapping
  4. Optimize the code
  5. Streamlining Data Processing
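As a quick check for the 64-bit Python point above, the snippet below verifies the interpreter is 64-bit and reports the currently available memory. This is a minimal sketch, assuming the third-party psutil package is installed:

Python3
import sys
import psutil  # assumption: installed via 'pip install psutil'

# A 64-bit interpreter reports sys.maxsize > 2**32
print("64-bit Python:", sys.maxsize > 2**32)

# A new array must fit into the *available* (not total) memory
print("Available memory (GiB):",
      psutil.virtual_memory().available / 1024**3)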

Method 1: Reduce the size of the array

If you can get away with it, consider using a smaller array with fewer elements. Try switching to a data type with a smaller memory footprint (e.g., from float64 to float32).

Python3
import numpy as np
# Smaller data type (float32)
array = np.ones((100000, 10000), dtype=np.float32)  # float32 uses 4 bytes per element, half of float64
print(array.nbytes)

Output:

4000000000
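
If the data already exists as float64, it can be downcast with astype, halving the memory footprint at the cost of some precision. For example:

Python3
import numpy as np

arr64 = np.random.rand(10000, 1000)   # float64 by default
arr32 = arr64.astype(np.float32)      # half the bytes per element

print(arr64.nbytes)  # 80000000
print(arr32.nbytes)  # 40000000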

Method 2: Use Memory Mapping

Explore alternative data storage options such as memory-mapped files or disk-based files. These act like extensions of the computer’s memory, allowing you to work with data larger than your RAM capacity.

For situations where memory-mapped files aren’t suitable, disk-based arrays can be employed. These arrays reside on your hard drive, enabling you to manipulate large datasets that wouldn’t fit entirely in memory.

In the below code:

  • np.memmap is used to create a memory-mapped array in NumPy.
  • Memory-mapped arrays allow for efficient reading and writing of large datasets by mapping a file directly to memory.

Python3
import numpy as np
# Creates a disk-backed array (a sparse file on most filesystems)
arr = np.memmap('large_array.dat', dtype=np.float32, mode='w+',
                shape=(1000000, 1000000))
print(arr.nbytes)

Output:

4000000000000
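
Continuing the example above, the memory-mapped array can be written in small slices and flushed to disk, so only a tiny window of the data is ever resident in RAM:

Python3
arr[0, :1000] = np.arange(1000, dtype=np.float32)  # write one small slice
arr.flush()  # push pending changes to 'large_array.dat'

# The same file can later be reopened read-only
readback = np.memmap('large_array.dat', dtype=np.float32,
                     mode='r', shape=(1000000, 1000000))
print(readback[0, :5])  # [0. 1. 2. 3. 4.]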

Method 3: Optimize the code

If we’re working with large arrays, make sure our code is optimized for memory usage.

  • Avoid unnecessary copies of arrays.
  • Use views instead of creating new arrays, and use functions that operate in-place whenever possible.

For example, in the code below, instead of creating a new array, we create a view that represents the first 500 columns of the original array. Views share the same underlying data and don’t consume additional memory.

The inplace_operation function demonstrates performing an in-place operation on the array. In this case, it squares each element of the array.

Python3
import numpy as np

def inplace_operation(array):  # Perform an in-place operation on a NumPy array
    array **= 2  # Example: square each element in place

large_array = np.random.rand(10000, 10000)  # Create a large array

view_of_array = large_array[:, :500]  # A view: shares data with large_array

inplace_operation(view_of_array)  # Apply the in-place operation to the view
# 'large_array' has now been modified in place without creating a new array
print("Memory usage of large_array:", large_array.nbytes)  # Display memory usage

Output:

Memory usage of large_array: 800000000
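
To confirm that a slice really is a view over the same buffer rather than a copy, np.shares_memory can be applied to the arrays from the example above:

Python3
print(np.shares_memory(large_array, view_of_array))  # True: no extra memory was allocated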

Method 4: Streamlining Data Processing

Process the data in chunks instead of loading and handling it all at once.

Python3
import numpy as np
def process(chunk):
    print("Processing chunk:", chunk)

data = np.zeros((10000, 10000))
chunk_size = 100
for start in range(0, data.shape[0], chunk_size): # Iterate over the data in chunks and call the process function
    process(data[start : start + chunk_size])

Output:

Processing chunk: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
Processing chunk: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
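
The same chunked pattern works for data stored on disk. Below is a minimal sketch, assuming the array was previously saved to 'data.npy' with np.save; np.load with mmap_mode='r' maps the file without reading it all into RAM:

Python3
import numpy as np

# Assumption: 'data.npy' was created earlier with np.save('data.npy', data)
data = np.load('data.npy', mmap_mode='r')  # maps the file; nothing is loaded yet

chunk_size = 100
for start in range(0, data.shape[0], chunk_size):
    chunk = np.asarray(data[start : start + chunk_size])  # only this slice is read into RAM
    # ... process the chunk here ...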


