Interpretation

ROOT was designed for C++, so ROOT data have an unambiguous C++ interpretation. Their Python interpretation, however, is a matter of choice. For instance, you may want a branch to be read as a new Numpy array, or as a user-provided array in shared memory, with or without byte-swapping, type conversion, or reshaping; or as an array of unequal-length arrays, an array of classes defined by the ROOT streamers, an array of custom classes, a Numpy record array, and so on. The uproot Interpretation mechanism provides this flexibility without limiting the selective reading methods.

If no interpretation is specified, uproot.interpret is automatically called to provide a reasonable default. You can also call this function yourself with custom arguments, optionally modify its output, and pass the result through the branches or interpretation arguments of the array-producing functions of TTreeMethods and TBranchMethods.

uproot.interp.interp.Interpretation

class uproot.interp.interp.Interpretation

Interface for interpretations.

Interpretations do not need to inherit from this class, but they do need to satisfy the interface described below.

Arrays and other collections are filled from ROOT in two stages: raw bytes from each basket are interpreted as a “source” and sources are copied into a branch-wide collection called the “destination” (often swapping bytes from big-endian to native-endian in the process). Public functions return a finalized destination. The distinction between source and destination (a) compactifies disparate baskets into a contiguous collection and (b) allows the output data to differ from the bytes on disk (byte swapping and other conversions).
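The two stages can be illustrated with plain Numpy. This is only a sketch of the idea, not uproot's code: the two baskets and their values are made up, and the variable names are hypothetical.

```python
import numpy as np

# Two made-up baskets holding big-endian 32-bit integers, as raw bytes.
basket1 = np.array([1, 2, 3], dtype=">i4").tobytes()
basket2 = np.array([4, 5], dtype=">i4").tobytes()

# Stage 1: interpret each basket's raw bytes as a "source" (a zero-copy view).
sources = [np.frombuffer(raw, dtype=">i4") for raw in (basket1, basket2)]

# Stage 2: copy every source into one contiguous, native-endian "destination".
destination = np.empty(sum(len(s) for s in sources), dtype=np.int32)
start = 0
for source in sources:
    stop = start + len(source)
    destination[start:stop] = source  # the assignment converts the byte order
    start = stop

print(destination.tolist())
```

The destination is a single contiguous array even though the input arrived in separate baskets, and its byte order differs from the bytes on disk.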

Interpretations must implement the following methods:

identifier
(property) a unique identifier for this interpretation, used as part of the cache key so that stale interpretations are not counted as cache hits.
empty(self)
return a zero-entry container (for special cases that can skip complex logic by returning an empty set).
compatible(self, other)
return True if and only if self and other interpretations would return equivalent results, such as different source interpretations that fill the same destination.
numitems(self, numbytes, numentries)
calculate the number of “items” (whatever that means for a given interpretation, but always greater than or equal to the number of entries), knowing only the number of bytes (numbytes) and the number of entries (numentries).
source_numitems(self, source)
calculate the number of “items” given a source instance.
fromroot(self, data, offsets, local_entrystart, local_entrystop)
produce a source from one basket data array (dtype numpy.uint8) and its corresponding offsets array (dtype numpy.int32 or None if not present) that has n + 1 elements for n entries: offsets[0] == 0 and offsets[-1] == numentries. The local_entrystart and local_entrystop are entry start (inclusive) and stop (exclusive), in which the first entry in the basket is number zero (hence “local”). The result of this operation may be a zero-copy cast of the basket data.
destination(self, numitems, numentries)
create or otherwise produce an unfilled destination object, knowing only the number of items (numitems) and number of entries (numentries).
fill(self, source, destination, itemstart, itemstop, entrystart, entrystop)
copy data from one basket source (in its entirety) to part of the destination (usually a small slice). The items range from itemstart (inclusive) to itemstop (exclusive) and the entries range from entrystart (inclusive) to entrystop (exclusive). This function returns nothing; it is the only function in this interface called for its side-effects (the rest may be pure functions).
clip(self, destination, itemstart, itemstop, entrystart, entrystop)
return a slice of the destination from itemstart (inclusive) to itemstop (exclusive) and from entrystart (inclusive) to entrystop (exclusive). This is to trim memory allocated but not used, for instance if the entry range does not align with basket boundaries.
finalize(self, destination)
possibly post-process a destination to make it ready for consumption. This is needed if a different form must be used for filling than should be provided to the user. For instance, offsets of a jagged array cannot be computed when filling sections of it in parallel (sizes can), but the user should receive a jagged array based on offsets for random access.
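As a rough illustration of the interface, here is a toy class that satisfies it for big-endian 32-bit integers with one item per entry. Everything in it is an assumption made for the example: the class name ToyInt32, the driver function read_all, the helper as_bytes, and the order in which the driver calls the methods are hypothetical and are not taken from uproot.

```python
import numpy as np

class ToyInt32(object):
    """Hypothetical interpretation: big-endian int32 scalars, one item per entry."""
    fromdtype = np.dtype(">i4")
    todtype = np.dtype("i4")  # native-endian

    @property
    def identifier(self):
        return "ToyInt32({0},{1})".format(self.fromdtype.str, self.todtype.str)

    def empty(self):
        return np.empty(0, dtype=self.todtype)

    def compatible(self, other):
        return isinstance(other, ToyInt32) and self.todtype == other.todtype

    def numitems(self, numbytes, numentries):
        return numbytes // self.fromdtype.itemsize

    def source_numitems(self, source):
        return len(source)

    def fromroot(self, data, offsets, local_entrystart, local_entrystop):
        # Zero-copy cast of the uint8 basket data, then select the local entries.
        return data.view(self.fromdtype)[local_entrystart:local_entrystop]

    def destination(self, numitems, numentries):
        return np.empty(numitems, dtype=self.todtype)

    def fill(self, source, destination, itemstart, itemstop, entrystart, entrystop):
        # Called for its side effect; the assignment also swaps bytes.
        destination[itemstart:itemstop] = source

    def clip(self, destination, itemstart, itemstop, entrystart, entrystop):
        return destination[itemstart:itemstop]

    def finalize(self, destination):
        return destination

def read_all(interp, baskets):
    """Hypothetical driver; baskets is a list of (uint8 array, numentries)."""
    if not baskets:
        return interp.empty()
    numentries = sum(n for _, n in baskets)
    numitems = sum(interp.numitems(len(data), n) for data, n in baskets)
    dest = interp.destination(numitems, numentries)
    start = 0
    for data, n in baskets:
        source = interp.fromroot(data, None, 0, n)
        stop = start + interp.source_numitems(source)
        interp.fill(source, dest, start, stop, start, stop)
        start = stop
    return interp.finalize(interp.clip(dest, 0, start, 0, start))

def as_bytes(values):
    """Hypothetical helper: encode integers as big-endian basket bytes."""
    return np.frombuffer(np.array(values, dtype=">i4").tobytes(), dtype=np.uint8)

baskets = [(as_bytes([10, 20, 30]), 3), (as_bytes([40, 50]), 2)]
result = read_all(ToyInt32(), baskets)
print(result.tolist())
```

Because every entry is a single scalar here, item ranges and entry ranges coincide. An interpretation for variable-length data would have to track the two separately.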

uproot.interpret

uproot.interp.auto.interpret(branch, swapbytes=True, cntvers=False, tobject=True)

Generate a default interpretation of a branch.

The array-producing methods call this function with default options on each branch when no interpretation is given. You can override the default either by calling this function explicitly with different parameters or by modifying its result.

Parameters:
  • branch (TBranchMethods) – branch to interpret.
  • classes (None or dict of str → ROOTStreamedObject) – class definitions associated with each class name, usually generated by ROOT file streamers. If None (default), use the class definitions generated from the file from which this branch was read.
  • swapbytes (bool) – if True, generate an interpretation that converts ROOT’s big-endian numbers into the machine-native endianness (usually little-endian).
Returns:

the interpretation.

Return type:

Interpretation

uproot.interp.asdtype

class uproot.interp.numerical.asdtype(fromdtype, todtype=None)

Interpret branch data as a new Numpy array with given dtypes and dimensions.

This interpretation directs branch-reading functions to allocate new Numpy arrays and fill them with the branch contents. See asarray to fill an existing array, rather than filling a new array.

In this interpretation, “items” (for numitems, itemstart, itemstop, etc.) has the same meaning as in Numpy: an item is a single scalar value. For example, 100 entries of 2×2 matrices (todims == (2, 2)) contain 400 items.

Parameters:
  • fromdtype (numpy.dtype) – the source type; the meaning associated with bytes in the ROOT file. Should be big-endian (e.g. ">i4" for 32-bit integers and ">f8" for 64-bit floats).
  • todtype (None or numpy.dtype) – the destination type; the conversion performed if different from the source type. If None (default), the destination type will be a native-endian variant of the source type, so that a byte-swap is performed.
  • fromdims (tuple of ints) – Numpy shape of each source entry. The Numpy shape of the whole source array is (numentries,) + fromdims. Default is () (scalar).
  • todims (None or tuple of ints) – Numpy shape of each destination entry. The Numpy shape of the whole destination array is (numentries,) + todims. If None (default), todims will be equal to fromdims. Making them different allows you to reshape arrays while reading.
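The roles of the four parameters can be mimicked with plain Numpy operations. The sketch below does not call uproot. It only mirrors what the parameter descriptions say, using made-up data and local variables named after the parameters.

```python
import numpy as np

numentries = 100
fromdtype = np.dtype(">f8")            # big-endian 64-bit floats, as on disk
fromdims = (2, 2)                      # each source entry is a 2x2 matrix
todtype = fromdtype.newbyteorder("=")  # native-endian variant of the source type
todims = (4,)                          # flatten each entry while "reading"

# Made-up bytes standing in for the branch contents.
raw = np.arange(numentries * 4).astype(fromdtype).tobytes()

# Source array: shape is (numentries,) + fromdims.
source = np.frombuffer(raw, dtype=fromdtype).reshape((numentries,) + fromdims)

# Destination array: converted type, shape is (numentries,) + todims.
destination = source.astype(todtype).reshape((numentries,) + todims)

print(source.shape, destination.shape, source.size)
```

Here source.size counts items in the Numpy sense: 100 entries of 2×2 matrices give 400 scalar values, regardless of how each entry is shaped in the destination.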

Notes

Methods implementing the Interpretation interface are not documented here.

asdtype.to(todtype=None, todims=None)

Create a new asdtype interpretation from this one.

Parameters:
  • todtype (None or numpy.dtype) – if not None, change the destination type.
  • todims (None or tuple of ints) – if not None, change the destination dimensions.
Returns:

new interpretation.

Return type:

asdtype

asdtype.toarray(array)

Create an asarray interpretation from this one.

Parameters:
  • array (numpy.ndarray) – the array to fill, instead of allocating a new one.
Returns:

new interpretation.

Return type:

asarray

uproot.interp.asarray

class uproot.interp.numerical.asarray(fromdtype, toarray)

Interpret branch as array data that should overwrite an existing array.

This interpretation directs branch-reading functions to fill the given Numpy array with branch contents. See asdtype to allocate a new array, rather than filling an existing array.

In this interpretation, “items” (for numitems, itemstart, itemstop, etc.) has the same meaning as in Numpy: an item is a single scalar value. For example, 100 entries of 2×2 matrices (todims == (2, 2)) contain 400 items.

Parameters:
  • fromdtype (numpy.dtype) – the source type; the meaning associated with bytes in the ROOT file. Should be big-endian (e.g. ">i4" for 32-bit integers and ">f8" for 64-bit floats).
  • toarray (numpy.ndarray) – array to be filled; must be at least as large as the branch data.
  • fromdims (tuple of ints) – Numpy shape of each source entry. The Numpy shape of the whole source array is (numentries,) + fromdims. Default is () (scalar).
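The idea of overwriting an existing array that is at least as large as the branch data can be sketched without uproot. The data are made up, and returning a trimmed view of the filled region is an assumption of this example rather than documented behavior.

```python
import numpy as np

# Preallocated array, deliberately larger than the data to be read.
toarray = np.zeros(8, dtype=np.int32)

# Made-up big-endian branch contents with five entries.
raw = np.array([7, 8, 9, 10, 11], dtype=">i4").tobytes()
source = np.frombuffer(raw, dtype=">i4")

# Fill the front of the existing array; no new array is allocated for the result.
toarray[:len(source)] = source
filled = toarray[:len(source)]  # view of the part that was actually written

print(filled.tolist(), np.shares_memory(filled, toarray))
```

The destination type and shape come from the existing array itself, which is why only the source type needs to be stated.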

Notes

Methods implementing the Interpretation interface are not documented here.

This class has todtype and todims parameters like asdtype, but they are derived from the toarray attribute.

uproot.interp.asjagged

class uproot.interp.jagged.asjagged(content, skipbytes=0)

Interpret branch as a jagged array (array of non-uniformly sized arrays).

This interpretation directs branch-reading to fill contiguous arrays and present them to the user in a JaggedArray interface. Such an object behaves as though it were an array of non-uniformly sized arrays, but it is more memory and cache-line efficient because the underlying data are contiguous.

In this interpretation, “items” (for numitems, itemstart, itemstop, etc.) are the items of the inner array (however that is defined), and “entries” are elements of the outer array. The outer array is always one-dimensional.

Parameters:
  • asdtype (asdtype) – interpretation for the inner arrays.

Notes

Methods implementing the Interpretation interface are not documented here.

asjagged.to(todtype=None, todims=None, skipbytes=None)

Create a new asjagged interpretation from this one.

Parameters:
  • todtype (None or numpy.dtype) – if not None, change the destination type of inner arrays.
  • todims (None or tuple of ints) – if not None, change the destination dimensions of inner arrays.
Returns:

new interpretation.

Return type:

asjagged

uproot.interp.jagged.JaggedArray

uproot.interp.asstrings

uproot.interp.strings.Strings