Chapter 5: Array Printing (arrayprint
)
In the previous chapter, Chapter 4: Numeric Types (numerictypes
), we explored the different kinds of data NumPy can store in its arrays, like int32
, float64
, and more. Now that we know about the arrays (ndarray
), their data types (dtype
), the functions that operate on them (ufunc
), and the specific number types (numerictypes
), a practical question arises: How do we actually look at these arrays, especially if they are very large?
What Problem Does arrayprint
Solve? Making Arrays Readable
Imagine you have a NumPy array representing a large image, maybe with millions of pixel values. Or perhaps you have simulation data with thousands of temperature readings.
import numpy as np
# Imagine this is a huge array, maybe thousands of numbers
large_array = np.arange(2000)
# If Python just tried to print every single number...
# it would flood your screen and be impossible to read!
# print(list(large_array)) # <-- Don't run this! It would be too long.
If NumPy just dumped all the numbers onto your screen whenever you tried to display a large array, it would be overwhelming and useless. We need a way to show the array’s contents in a concise, human-friendly format. How can we get a sense of the array’s data without printing every single element?
This is the job of NumPy’s array printing mechanism, often referred to internally by the name of its main Python module, arrayprint
.
What is Array Printing (arrayprint
)?
arrayprint
is NumPy’s “pretty printer” for ndarray
objects. It’s responsible for converting a NumPy array into a nicely formatted string representation that’s easy to read and understand when you display it (e.g., in your Python console, Jupyter notebook, or using the print()
function).
Think of it like getting a summary report instead of the raw database dump. arrayprint
intelligently decides how to show the array, considering things like:
- Summarization: For large arrays, it shows only the beginning and end elements, using ellipsis (
...
) to indicate the omitted parts. - Precision: It controls how many decimal places are shown for floating-point numbers.
- Line Wrapping: It breaks long rows of data into multiple lines to fit within a certain width.
- Special Values: It uses consistent strings for “Not a Number” (
nan
) and infinity (inf
). - Customization: It allows you to change these settings to suit your needs.
Let’s see it in action with our large_array
:
import numpy as np
large_array = np.arange(2000)
# Let NumPy's array printing handle it
print(large_array)
Output:
[ 0 1 2 ... 1997 1998 1999]
Instead of 2000 numbers flooding the screen, NumPy smartly printed only the first three and the last three, with ...
in between. This gives us a good idea of the array’s contents (a sequence starting from 0) without being overwhelming.
Key Features and Options
arrayprint
has several options you can control to change how arrays are displayed.
1. Summarization (threshold
and edgeitems
)
threshold
: The total number of array elements that triggers summarization. If the array’ssize
is greater thanthreshold
, the array gets summarized. (Default: 1000)edgeitems
: When summarizing, this is the number of items shown at the beginning and end of each dimension. (Default: 3)
Let’s try printing a smaller array and then changing the threshold:
import numpy as np
# An array with 10 elements
arr = np.arange(10)
print("Original:")
print(arr)
# Temporarily set the threshold lower (e.g., 5)
# We use np.printoptions as a context manager for temporary settings
with np.printoptions(threshold=5):
print("\nWith threshold=5:")
print(arr)
# Change edgeitems too
with np.printoptions(threshold=5, edgeitems=2):
print("\nWith threshold=5, edgeitems=2:")
print(arr)
Output:
Original:
[0 1 2 3 4 5 6 7 8 9]
With threshold=5:
[0 1 2 ... 7 8 9]
With threshold=5, edgeitems=2:
[0 1 ... 8 9]
You can see how lowering the threshold
caused the array (size 10) to be summarized, and edgeitems
controlled how many elements were shown at the ends.
2. Floating-Point Precision (precision
and suppress
)
precision
: Controls the number of digits displayed after the decimal point for floats. (Default: 8)suppress
: IfTrue
, prevents NumPy from using scientific notation for very small numbers and prints them as zero if they are smaller than the current precision. (Default: False)
import numpy as np
# An array with floating-point numbers
float_arr = np.array([0.123456789, 1.5e-10, 2.987])
print("Default precision:")
print(float_arr)
# Set precision to 3
with np.printoptions(precision=3):
print("\nWith precision=3:")
print(float_arr)
# Set precision to 3 and suppress small numbers
with np.printoptions(precision=3, suppress=True):
print("\nWith precision=3, suppress=True:")
print(float_arr)
Output:
Default precision:
[1.23456789e-01 1.50000000e-10 2.98700000e+00]
With precision=3:
[1.235e-01 1.500e-10 2.987e+00]
With precision=3, suppress=True:
[0.123 0. 2.987]
Notice how precision
changed the rounding, and suppress=True
made the very small number (1.5e-10
) display as 0.
and switched from scientific notation to fixed-point for the others. There’s also a floatmode
option for more fine-grained control over float formatting (e.g., ‘fixed’, ‘unique’).
3. Line Width (linewidth
)
linewidth
: The maximum number of characters allowed per line before wrapping. (Default: 75)
import numpy as np
# A 2D array
arr2d = np.arange(12).reshape(3, 4) * 0.1
print("Default linewidth:")
print(arr2d)
# Set a narrow linewidth
with np.printoptions(linewidth=30):
print("\nWith linewidth=30:")
print(arr2d)
Output:
Default linewidth:
[[0. 0.1 0.2 0.3]
[0.4 0.5 0.6 0.7]
[0.8 0.9 1. 1.1]]
With linewidth=30:
[[0. 0.1 0.2 0.3]
[0.4 0.5 0.6 0.7]
[0.8 0.9 1. 1.1]]
(Note: The output might not actually wrap here because the lines are short. If the array was wider, you’d see the rows break across multiple lines with the narrower linewidth
setting.)
4. Other Options
nanstr
: String representation for Not a Number. (Default: ‘nan’)infstr
: String representation for Infinity. (Default: ‘inf’)sign
: Control sign display for floats (‘-‘, ‘+’, or ‘ ‘).formatter
: A dictionary to provide completely custom formatting functions for specific data types (like bool, int, float, datetime, etc.). This is more advanced.
Using and Customizing Array Printing
You usually interact with array printing implicitly just by displaying an array:
import numpy as np
arr = np.linspace(0, 1, 5)
# These both use NumPy's array printing behind the scenes
print(arr) # Calls __str__ -> array_str -> array2string
arr # In interactive sessions, calls __repr__ -> array_repr -> array2string
To customize the output, you can use:
np.set_printoptions(...)
: Sets options globally (for your entire Python session).np.get_printoptions()
: Returns a dictionary of the current settings.np.printoptions(...)
: A context manager to set options temporarily within awith
block (as used in the examples above). This is often the preferred way to avoid changing settings permanently.np.array2string(...)
: A function to get the string representation directly, allowing you to override options just for that one call.
import numpy as np
import sys # Needed for sys.maxsize
arr = np.random.rand(10, 10) * 1000
# --- Global Setting ---
print("--- Setting threshold globally ---")
original_options = np.get_printoptions() # Store original settings
np.set_printoptions(threshold=50)
print(arr)
np.set_printoptions(**original_options) # Restore original settings
# --- Temporary Setting (Context Manager) ---
print("\n--- Setting precision temporarily ---")
with np.printoptions(precision=2, suppress=True):
print(arr)
print("\n--- Back to default precision ---")
print(arr) # Options are automatically restored outside the 'with' block
# --- Direct Call with Overrides ---
print("\n--- Using array2string with summarization off ---")
# Use sys.maxsize to effectively disable summarization
arr_string = np.array2string(arr, threshold=sys.maxsize, precision=1)
# print(arr_string) # This might still be very long! Let's just print the first few lines
print('\n'.join(arr_string.splitlines()[:5]) + '\n...')
Output (will vary due to random numbers):
--- Setting threshold globally ---
[[992.84337197 931.73648142 119.68616987 ... 305.61919366 516.97897205
707.69140878]
[507.45895986 253.00740626 739.97091378 ... 755.69943511 813.11931119
19.84654589]
[941.25264871 689.43209981 820.11954711 ... 709.83933545 192.49837505
609.30358618]
...
[498.86686503 872.79555956 401.19333028 ... 552.97492858 303.59379464
308.61881807]
[797.51920685 427.86020151 783.2019203 ... 511.63382762 322.52764881
778.22766019]
[ 54.84391309 938.24403397 796.7431406 ... 495.90873227 267.16620292
409.51491904]]
--- Setting precision temporarily ---
[[992.84 931.74 119.69 ... 305.62 516.98 707.69]
[507.46 253.01 739.97 ... 755.7 813.12 19.85]
[941.25 689.43 820.12 ... 709.84 192.5 609.3 ]
...
[498.87 872.8 401.19 ... 552.97 303.59 308.62]
[797.52 427.86 783.2 ... 511.63 322.53 778.23]
[ 54.84 938.24 796.74 ... 495.91 267.17 409.51]]
--- Back to default precision ---
[[992.84337197 931.73648142 119.68616987 ... 305.61919366 516.97897205
707.69140878]
[507.45895986 253.00740626 739.97091378 ... 755.69943511 813.11931119
19.84654589]
[941.25264871 689.43209981 820.11954711 ... 709.83933545 192.49837505
609.30358618]
...
[498.86686503 872.79555956 401.19333028 ... 552.97492858 303.59379464
308.61881807]
[797.51920685 427.86020151 783.2019203 ... 511.63382762 322.52764881
778.22766019]
[ 54.84391309 938.24403397 796.7431406 ... 495.90873227 267.16620292
409.51491904]]
--- Using array2string with summarization off ---
[[992.8 931.7 119.7 922. 912.2 156.5 459.4 305.6 517. 707.7]
[507.5 253. 740. 640.3 420.3 652.1 197. 755.7 813.1 19.8]
[941.3 689.4 820.1 125.8 598.2 219.3 466.7 709.8 192.5 609.3]
[ 32. 855.2 362.1 434.9 133.5 148.1 522.6 725.1 395.5 377.9]
[332.7 782.2 587.3 320.3 905.5 412.8 378. 911.9 972.1 400.2]
...
A Glimpse Under the Hood
What happens when you call print(my_array)
?
- Python calls the
__str__
method of thendarray
object. - NumPy’s
ndarray.__str__
method typically calls the internal function_array_str_implementation
. _array_str_implementation
checks for simple cases (like 0-dimensional arrays) and then calls the main workhorse:array2string
.array2string
(defined innumpy/core/arrayprint.py
) takes the array and any specified options (likeprecision
,threshold
, etc.). It also reads the current default print options (managed bynumpy/core/printoptions.py
using context variables).- It determines if the array needs summarization based on its
size
and thethreshold
option. - It figures out the correct formatting function for the array’s
dtype
(e.g.,IntegerFormat
,FloatingFormat
,DatetimeFormat
). These formatters handle details like precision, sign, and scientific notation for individual elements.FloatingFormat
, for example, might use the efficientdragon4
algorithm (implemented in C) to convert floats to strings accurately. - It recursively processes the array’s dimensions:
- For each element (or summarized chunk), it calls the chosen formatting function to get its string representation.
- It arranges these strings, adding separators (like spaces or commas) and brackets (
[
]
). - It checks the
linewidth
and inserts line breaks and indentation as needed. - If summarizing, it inserts the ellipsis (
...
) string (summary_insert
).
- Finally,
array2string
returns the complete, formatted string representation of the array.
sequenceDiagram
participant User
participant Python as print() / REPL
participant NDArray as my_array object
participant ArrayPrint as numpy.core.arrayprint module
participant PrintOpts as numpy.core.printoptions module
User->>Python: print(my_array) or my_array
Python->>NDArray: call __str__ or __repr__
NDArray->>ArrayPrint: call array_str or array_repr
ArrayPrint->>ArrayPrint: call array2string(my_array, ...)
ArrayPrint->>PrintOpts: Get current print options (threshold, precision, etc.)
ArrayPrint->>ArrayPrint: Check size vs threshold -> Summarize?
ArrayPrint->>ArrayPrint: Select Formatter based on my_array.dtype
loop For each element/chunk
ArrayPrint->>ArrayPrint: Format element using Formatter
end
ArrayPrint->>ArrayPrint: Arrange strings, add brackets, wrap lines
ArrayPrint-->>NDArray: Return formatted string
NDArray-->>Python: Return formatted string
Python-->>User: Display formatted string
The core logic resides in numpy/core/arrayprint.py
. This file contains array2string
, array_repr
, array_str
, and various formatter classes (FloatingFormat
, IntegerFormat
, BoolFormat
, ComplexFloatingFormat
, DatetimeFormat
, TimedeltaFormat
, StructuredVoidFormat
, etc.). The global print options themselves are managed using Python’s contextvars
in numpy/core/printoptions.py
, allowing settings to be changed globally or temporarily within a context.
Conclusion
You’ve now learned how NumPy takes potentially huge and complex arrays and turns them into readable string representations using its arrayprint
mechanism. Key takeaways:
arrayprint
is NumPy’s “pretty printer” for arrays.- It uses summarization (
threshold
,edgeitems
) for large arrays. - It controls formatting (like
precision
,suppress
for floats) and layout (linewidth
). - You can customize printing globally (
set_printoptions
), temporarily (printoptions
context manager), or for single calls (array2string
). - The core logic resides in
numpy/core/arrayprint.py
, using formatters tailored to different dtypes and reading options fromnumpy/core/printoptions.py
.
Understanding array printing helps you effectively inspect and share your NumPy data.
Next, we’ll start looking at the specific C and Python modules that form the core of NumPy’s implementation, beginning with the central Chapter 6: multiarray Module.
Generated by AI Codebase Knowledge Builder