uproot.dask
Defined in uproot._dask on line 10.
- uproot._dask.dask(files, *, filter_name=<function no_filter>, filter_typename=<function no_filter>, filter_branch=<function no_filter>, recursive=True, full_paths=False, step_size='100 MB', library='ak', ak_add_doc=False, custom_classes=None, allow_missing=False, open_files=True, form_mapping=None, **options)
- Parameters:
files – See below.
filter_name (None, glob string, regex string in "/pattern/i" syntax, function of str → bool, or iterable of the above) – A filter to select TBranches by name.
filter_typename (None, glob string, regex string in "/pattern/i" syntax, function of str → bool, or iterable of the above) – A filter to select TBranches by type.
filter_branch (None or function of uproot.TBranch → bool, uproot.interpretation.Interpretation, or None) – A filter to select TBranches using the full uproot.TBranch object. If the function returns False or None, the TBranch is excluded; if the function returns True, it is included with its standard interpretation; if it returns an uproot.interpretation.Interpretation, this interpretation overrules the standard one.
recursive (bool) – If True, include all subbranches of branches as separate fields; otherwise, only search one level deep.
full_paths (bool) – If True, include the full path to each subbranch with slashes (/); otherwise, use the descendant's name as the field name.
step_size (int or str) – If an integer, the maximum number of entries to include in each chunk; if a string, the maximum memory_size to include in each chunk. The string must be a number followed by a memory unit, such as "100 MB".
library (str or uproot.interpretation.library.Library) – The library that is used to represent arrays. If library='np', it returns a dict of dask arrays, and if library='ak', it returns a single dask-awkward array. library='pd' has not been implemented yet and will raise a NotImplementedError.
ak_add_doc (bool) – If True and library="ak", add the TBranch title to the Awkward __doc__ parameter of the array.
custom_classes (None or dict) – If a dict, override the classes from the uproot.ReadOnlyFile or uproot.classes.
allow_missing (bool) – If True, skip over any files that do not contain the specified TTree.
open_files (bool) – If True (default), the function opens the files to read file metadata, i.e. only the main data read is delayed until the compute call on the dask collections. If False, opening the files and reading the metadata are also delayed until the compute call. In this case, branch names are inferred by opening only the first file.
form_mapping (Callable[awkward.forms.Form] -> awkward.forms.Form | None) – If not None and library="ak", apply this remapping function to the Awkward form of the input data. The form keys of the desired form should be available data in the input form.
options – See below.
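The filter_name and filter_typename parameters accept several kinds of specification (None, glob string, "/pattern/i" regex string, function, or an iterable of these). The following is a minimal standard-library sketch of how such a specification could be turned into a single predicate. It is an illustration only, not uproot's implementation: make_name_filter and select are hypothetical helpers, and the choices to match globs case-sensitively and to search regexes anywhere in the name are assumptions.

```python
import fnmatch
import re

# Matches strings written in the "/pattern/flags" style, e.g. "/^jet_/i".
_REGEX_SPEC = re.compile(r"^/(.*)/([a-zA-Z]*)$")


def make_name_filter(spec):
    """Turn a filter specification into a function of str -> bool (hypothetical helper)."""
    if spec is None:
        # No filter: every name is accepted.
        return lambda name: True
    if callable(spec):
        # A user-supplied function of str -> bool.
        return lambda name: bool(spec(name))
    if isinstance(spec, str):
        m = _REGEX_SPEC.match(spec)
        if m is not None:
            # Regex string; an "i" flag requests case-insensitive matching.
            flags = re.IGNORECASE if "i" in m.group(2) else 0
            pattern = re.compile(m.group(1), flags)
            return lambda name: pattern.search(name) is not None
        # Anything else is treated as a glob pattern.
        return lambda name: fnmatch.fnmatchcase(name, spec)
    # An iterable of the above: accept a name if any member accepts it.
    members = [make_name_filter(item) for item in spec]
    return lambda name: any(member(name) for member in members)


def select(names, spec):
    """Return the names accepted by the filter specification, in order."""
    keep = make_name_filter(spec)
    return [name for name in names if keep(name)]
```

With branch names such as "Muon_pt", "Muon_eta", "Jet_pt", and "nMuon", the specification "Muon_*" would keep the two Muon_ branches under these assumptions, and "/^jet_/i" would keep only "Jet_pt".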
Returns dask equivalents of the backends supported by uproot. If library='np', the function returns a Python dict of dask arrays. If library='ak', the function returns a single dask-awkward array.
For example:
>>> uproot.dask(root_file)
dask.awkward<from-uproot, npartitions=1>
>>> uproot.dask(root_file, library='np')
{'Type': dask.array<Type-from-uproot, shape=(2304,), dtype=object, chunksize=(2304,), chunktype=numpy.ndarray>, ...}
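The chunking visible in the example output (one chunk of 2304 entries) is governed by step_size. When step_size is an integer, it is the maximum number of entries per chunk. The sketch below shows one way such entry ranges could be computed. It is a plain-Python illustration under that assumption; entry_ranges is a hypothetical helper, and the library may place chunk boundaries differently.

```python
from typing import List, Tuple


def entry_ranges(num_entries: int, step_size: int) -> List[Tuple[int, int]]:
    """Split [0, num_entries) into half-open ranges of at most step_size entries."""
    if step_size <= 0:
        raise ValueError("step_size must be a positive integer")
    return [
        (start, min(start + step_size, num_entries))
        for start in range(0, num_entries, step_size)
    ]


# 2304 entries with at most 1000 entries per chunk give three ranges;
# the last one is shorter than the others.
print(entry_ranges(2304, 1000))
```

A step_size at least as large as the number of entries yields a single range, which corresponds to a single chunk as in the example output.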
This function (naturally) depends on Dask. To use it with library="np":
# with pip
pip install "dask[complete]"
# or with conda
conda install dask
To use it with library='ak':
pip install dask-awkward # not on conda-forge yet
Allowed types for the files parameter:
str/bytes: relative or absolute filesystem path or URL, without any colons other than Windows drive letter or URL schema. Examples: "rel/file.root", "C:\abs\file.root", "http://where/what.root"
str/bytes: same with an object-within-ROOT path, separated by a colon. Example: "rel/file.root:tdirectory/ttree"
pathlib.Path: always interpreted as a filesystem path or URL only (no object-within-ROOT path), regardless of whether there are any colons. Examples: Path("rel:/file.root"), Path("/abs/path:stuff.root")
glob syntax in str/bytes and pathlib.Path. Examples: Path("rel/*.root"), "/abs/*.root:tdirectory/ttree"
dict: keys are filesystem paths, values are objects-within-ROOT paths. Example: {"/data_v1/*.root": "ttree_v1", "/data_v2/*.root": "ttree_v2"}
already-open TTree objects.
iterables of the above.
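The colon rule for str inputs can be made concrete with a small sketch: a colon separates the file path from the object-within-ROOT path unless it belongs to a Windows drive letter or a URL schema. The function below is a hypothetical illustration of that rule using only string operations. It is not uproot's parser, and it does not handle other colons such as URL port numbers, which the rule above excludes.

```python
from typing import Optional, Tuple


def split_file_and_object(spec: str) -> Tuple[str, Optional[str]]:
    """Split "file:object" into (file, object); object is None if absent (hypothetical helper)."""
    idx = spec.rfind(":")
    if idx == -1:
        # No colon at all: the whole string is a file path.
        return spec, None
    head, tail = spec[:idx], spec[idx + 1:]
    if tail.startswith("//"):
        # The last colon belongs to a URL schema such as "http://".
        return spec, None
    if idx == 1 and head.isalpha() and tail[:1] in ("\\", "/"):
        # The last colon belongs to a Windows drive letter such as "C:\".
        return spec, None
    return head, tail


print(split_file_and_object("rel/file.root:tdirectory/ttree"))
print(split_file_and_object("C:\\abs\\file.root"))
print(split_file_and_object("http://where/what.root"))
```

Splitting at the last colon means a Windows path or URL followed by an object path still separates at the intended position.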
Options (type; default):
file_handler (uproot.source.chunk.Source class; uproot.MemmapSource)
xrootd_handler (uproot.source.chunk.Source class; uproot.XRootDSource)
http_handler (uproot.source.chunk.Source class; uproot.HTTPSource)
object_handler (uproot.source.chunk.Source class; uproot.ObjectSource)
timeout (float for HTTP, int for XRootD; 30)
max_num_elements (None or int; None)
num_workers (int; 1)
num_fallback_workers (int; 10)
begin_chunk_size (memory_size; 512)
minimal_ttree_metadata (bool; True)
Other file entry points:
uproot.open: opens one file to read any of its objects.
uproot.iterate: iterates through chunks of contiguous entries in TTrees.
uproot.concatenate: returns a single concatenated array from TTrees.
uproot.dask (this function): returns an unevaluated Dask array from TTrees.