uproot.writing.writable.WritableNTuple

Defined in uproot.writing.writable on line 2010.

class uproot.writing.writable.WritableNTuple(path, file, cascading)

Parameters:

path (tuple of str) – Path of directory names to this RNTuple.
file (uproot.WritableFile) – Handle to the file in which this RNTuple can be found.
cascading (uproot.writing._cascadentuple.NTuple) – The low-level directory object.

Represents a writable RNTuple from a ROOT file.

Assigning data to a directory creates an RNTuple object by default starting in Uproot v5.7.0. This creates the RNTuple object with all of its metadata and fills it with the contents of the arrays in one step. To separate the process of creating the RNTuple metadata from filling the first cluster, use the uproot.writing.writable.WritableDirectory.mkrntuple method:

my_directory.mkrntuple("tuple6", {"branch1": numpy_dtype, "branch2": awkward_type})

The numpy_dtype is any data that NumPy recognizes as a np.dtype, and the awkward_type is an ak.types.Type from ak.type or a string in that form, such as "var * float64" for variable-length doubles.

An RNTuple can be extended using the extend method:

my_directory["tuple6"].extend({"branch1": another_numpy_array,
                              "branch2": another_awkward_array})

Be sure to make these extensions as large as is feasible within memory constraints, because a ROOT file full of small clusters is bloated (larger than it needs to be) and slow to read (especially for Uproot, but also for ROOT).

For instance, if you want to write a million events and have enough memory available to do that 100 thousand events at a time (total of 10 clusters), then do so. Filling the RNTuple a hundred events at a time (total of 10000 clusters) would be considerably slower for writing and reading, and the file would be much larger than it could otherwise be, even with compression.
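The arithmetic above can be sketched in plain Python. This is only an illustration: batch_ranges is a hypothetical helper, not part of Uproot, and the sketch assumes one extend call per range, so the number of ranges equals the number of clusters written.

```python
def batch_ranges(num_entries, batch_size):
    """Yield (start, stop) index pairs that cover num_entries in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, num_entries, batch_size):
        # The last range is shortened if batch_size does not divide evenly.
        yield start, min(start + batch_size, num_entries)


# Assumption: each range becomes one extend call, hence one cluster.
large_batches = list(batch_ranges(1_000_000, 100_000))
small_batches = list(batch_ranges(1_000_000, 100))

print(len(large_batches))  # 10
print(len(small_batches))  # 10000
```

Each (start, stop) pair would be used to slice every array before passing the slices to extend, so that all arrays in one call keep the same length.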

path

WritableNTuple.path: Path of directory names to this RNTuple as a tuple of strings.

object_path

WritableNTuple.object_path: Path of directory names to this RNTuple as a single string, delimited by slashes.

file_path

WritableNTuple.file_path: Filesystem path of the open file, or None if using a file-like object.

file

WritableNTuple.file: Handle to the uproot.WritableDirectory in which this directory can be found.

close

WritableNTuple.close()

Explicitly close the file.

(Files can also be closed with the Python with statement, as context managers.)

After closing, objects cannot be read from or written to the file.

closed

WritableNTuple.closed

True if the file has been closed; False otherwise.

The file may have been closed explicitly with close or implicitly in the Python with statement, as a context manager.

After closing, objects cannot be read from or written to the file.

compression

WritableNTuple.compression

Compression algorithm and level (uproot.compression.Compression or None) for new blobs added to the RNTuple.

This property can be changed and doesn’t have to be the same as the compression of the file, which allows you to write different objects with different compression settings.

The following are equivalent:

my_directory["tree"]["branch1"].compression = uproot.ZLIB(1)
my_directory["tree"]["branch2"].compression = uproot.LZMA(9)

and

my_directory["tree"].compression = {"branch1": uproot.ZLIB(1),
                                    "branch2": uproot.LZMA(9)}

num_entries

WritableNTuple.num_entries: The number of entries accumulated so far.

extend

WritableNTuple.extend(data)

Parameters: data (dict of str → arrays) – More array data to add to the RNTuple.

This method adds data to an existing RNTuple, whether it was created through assignment or uproot.writing.writable.WritableDirectory.mkrntuple.

The data must be a dict, but the values of the dict can be any of the array/DataFrame types described in uproot.WritableTree. However, these types must be compatible with the types established when the RNTuple was created, the dict must contain a key for every field (branch) of the RNTuple, and the arrays must all have the same length (in their first dimension).

For example,

my_directory.mkrntuple("ntuple6", {"branch1": numpy_dtype, "branch2": awkward_type})

my_directory["ntuple6"].extend({"branch1": another_numpy_array,
                                "branch2": another_awkward_array})

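The requirements on data (a dict, a key for every field, equal lengths in the first dimension) can be checked before calling extend. The following is a minimal sketch of such a pre-flight check, not Uproot's own validation: check_extend_payload and expected_fields are hypothetical names, and type compatibility is not checked here.

```python
def check_extend_payload(data, expected_fields):
    """Return the common length of the arrays in data, or raise on a mismatch."""
    if not isinstance(data, dict):
        raise TypeError("data must be a dict of field name -> array")

    # Every field established at creation time needs a key in the dict.
    missing = sorted(set(expected_fields) - set(data))
    if missing:
        raise ValueError(f"missing fields: {missing}")

    # All arrays must agree on the length of their first dimension.
    lengths = {name: len(array) for name, array in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"arrays have different lengths: {lengths}")

    return next(iter(lengths.values()), 0)


payload = {"branch1": [1.1, 2.2, 3.3], "branch2": [[1], [], [2, 3]]}
print(check_extend_payload(payload, ["branch1", "branch2"]))  # 3
```

Plain lists stand in for arrays here; anything that supports len, including NumPy and Awkward Arrays, would work the same way.
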
Warning

Be sure that each call to extend includes at least 100 kB per branch/array. (NumPy and Awkward Arrays have an nbytes property; you want at least 100000 per array.) If you ask Uproot to write very small clusters, it will spend more time on per-cluster overhead than on actually writing data. The absolute worst case is one entry per extend call. See #428 (comment).
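The 100 kB guideline can be turned into a quick check before each extend call. This is a sketch under stated assumptions: undersized_arrays and MIN_BYTES_PER_ARRAY are hypothetical names, and the only thing required of each value is an nbytes attribute, which NumPy and Awkward Arrays provide. The example uses memoryview objects as stand-ins so that it runs without third-party packages.

```python
MIN_BYTES_PER_ARRAY = 100_000  # the 100 kB guideline from the warning


def undersized_arrays(data, min_bytes=MIN_BYTES_PER_ARRAY):
    """Return the sorted names of arrays whose nbytes is below min_bytes."""
    return sorted(
        name for name, array in data.items() if array.nbytes < min_bytes
    )


# Stand-ins for real arrays: memoryview also exposes nbytes.
payload = {
    "branch1": memoryview(bytes(200_000)),  # large enough
    "branch2": memoryview(bytes(800)),      # far too small
}

print(undersized_arrays(payload))  # ['branch2']
```

If the returned list is not empty, it is usually better to keep accumulating data in memory and call extend later with a larger batch.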