Architecture
Note
This page describes technical details about EXtra-data. You shouldn’t need this information to use it.
Objects
There are three classes making up the core API of EXtra-data:
DataCollection is what you get from opening a run or file: data for several sources over some range of pulse trains (i.e. time). It has methods to select a subset of that data.
SourceData comes from run[source], representing one source, such as a motor or a detector module. Each source has a set of keys.
KeyData comes from run[source, key], representing data for a single source & key. This has a dtype and a shape like a NumPy array, but the data is not in memory. It has methods to load the data as a NumPy array, an Xarray DataArray, or a Dask array.
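The relationship between the three objects can be illustrated with a toy model. This is only a sketch: the classes ToyDataCollection, ToySourceData and ToyKeyData, the source names and the load() method are hypothetical stand-ins invented for illustration, not the EXtra-data implementation. They mimic only how run[source] and run[source, key] narrow a collection down step by step.

```python
from dataclasses import dataclass

# Toy stand-ins, NOT the real EXtra-data classes. They only mimic how
# indexing a collection narrows it to one source, then to one key.


@dataclass
class ToyKeyData:
    source: str
    key: str
    values: list  # the real KeyData leaves data on disk; this toy holds a list

    @property
    def shape(self):
        return (len(self.values),)

    def load(self):
        """Return a copy of the data (stand-in for loading an array)."""
        return list(self.values)


@dataclass
class ToySourceData:
    source: str
    data: dict  # key name -> list of values

    def keys(self):
        return set(self.data)

    def __getitem__(self, key):
        return ToyKeyData(self.source, key, self.data[key])


@dataclass
class ToyDataCollection:
    sources: dict  # source name -> {key name -> list of values}

    def __getitem__(self, item):
        # run[source, key] gives one key; run[source] gives one source
        if isinstance(item, tuple):
            source, key = item
            return self[source][key]
        return ToySourceData(item, self.sources[item])

    def select(self, names):
        """Return a new collection restricted to the given source names."""
        return ToyDataCollection({s: self.sources[s] for s in names})


# Made-up source and key names, for illustration only
run = ToyDataCollection({
    "MOTOR_A": {"position.value": [1.0, 1.5, 2.0]},
    "DET_MODULE_0": {"image.data": [10, 20, 30]},
})
src = run["MOTOR_A"]
kd = run["MOTOR_A", "position.value"]
subset = run.select(["MOTOR_A"])
```

In this sketch, selecting a subset returns another collection object rather than modifying the original, and only load() copies the values out of the key object.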
Component classes for multi-module detectors build on top of this core to work more conveniently with major data sources. There are more component classes in the EXtra package.
FileAccess is a lower-level class to manage access to a single
EuXFEL format HDF5 file, including caching index information.
There should only be one FileAccess object per file on disk, even if
multiple DataCollection, SourceData and KeyData objects refer to it.
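
One common way to guarantee a single object per file is a registry keyed by the file's absolute path. The sketch below shows that general pattern under stated assumptions: SharedFileHandle, its index_cache attribute and the file names are hypothetical, and this is not necessarily how FileAccess itself is implemented.

```python
import os
import weakref


class SharedFileHandle:
    """Hypothetical stand-in for a per-file access object."""

    # Weak references let an entry disappear once nothing else uses the object
    _registry = weakref.WeakValueDictionary()

    def __new__(cls, path):
        key = os.path.abspath(path)  # normalise so equivalent paths match
        inst = cls._registry.get(key)
        if inst is None:
            inst = super().__new__(cls)
            inst.path = key
            inst.index_cache = {}  # assumed place for cached index information
            cls._registry[key] = inst
        return inst


a = SharedFileHandle("run/file.h5")
b = SharedFileHandle("./run/file.h5")  # same file, spelled differently
c = SharedFileHandle("other.h5")

a.index_cache["train_ids"] = [1, 2, 3]  # visible through b as well
```

Because a and b are the same object, index information cached through one is available to every higher-level object that refers to that file.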
Modules
cli contains command-line interfaces.
components provides interfaces that bring together data from several similar sources, i.e. multi-module detectors where each module is a separate source.
exceptions defines some custom error classes.
export sends data from files over ZMQ in the Karabo Bridge format.
file_access contains FileAccess (described above), along with machinery to keep the number of open files under a limit.
keydata contains KeyData (described above).
locality can check whether files are available on disk or on tape in a dCache filesystem.
lsxfel is the lsxfel command.
reader contains DataCollection (described above), and functions to open a run or a file.
read_machinery is a collection of pieces that support reader.
run_files_map manages caching metadata about the files of a run in a JSON file, to speed up opening the run.
sourcedata contains SourceData (described above).
stacking has functions for stacking multiple arrays into one, another option for working with multi-module detector data.
utils is miscellaneous pieces that don’t fit anywhere else.
validation checks if files & runs have the expected format, for the extra-data-validate command.
writer writes data in EuXFEL format files, for write() and write_virtual().
write_cxi makes CXI format HDF5 files using virtual datasets to expose multi-module detector data. Used by write_virtual_cxi() and the extra-data-make-virtual-cxi command.
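
As an illustration of what keeping the number of open files under a limit can involve, here is a minimal sketch that closes the least recently used file whenever the limit would be exceeded. OpenFileLimiter, its maxfiles parameter and the temporary file names are assumptions made for this example. It uses plain binary files rather than HDF5 files, and it is not taken from the file_access module.

```python
import os
import tempfile
from collections import OrderedDict


class OpenFileLimiter:
    """Hypothetical helper: keep at most `maxfiles` file objects open."""

    def __init__(self, maxfiles):
        self.maxfiles = maxfiles
        self._open = OrderedDict()  # path -> file object, least recent first

    def get(self, path):
        f = self._open.pop(path, None)
        if f is None:
            # Make room by closing the least recently used files
            while len(self._open) >= self.maxfiles:
                _, oldest = self._open.popitem(last=False)
                oldest.close()
            f = open(path, "rb")
        self._open[path] = f  # re-insert as the most recently used entry
        return f

    def close_all(self):
        while self._open:
            _, f = self._open.popitem()
            f.close()


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        p = os.path.join(tmp, f"file{i}.bin")
        with open(p, "wb") as fh:
            fh.write(b"x")
        paths.append(p)

    limiter = OpenFileLimiter(maxfiles=2)
    handles = [limiter.get(p) for p in paths]
    # Opening the third file forced the first one to be closed
    closed_flags = [h.closed for h in handles]
    limiter.close_all()
```

A caller in this scheme should ask the limiter for the file each time it needs it, rather than holding on to a handle that may since have been closed.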