EXtra-data
EXtra-data is a Python library for accessing saved data produced at European XFEL.
Installation
EXtra-data is available in our Python environment on the Maxwell cluster:
module load exfel exfel-python
You can also install it from PyPI to use in other environments with Python 3:
pip install extra_data
This installs the extra_data package and its most commonly used dependencies. Some large dependencies, and those needed only for specific features (e.g. Dask Array, karabo-bridge-serve-files), are not installed by default. To install them together with extra_data, ask pip for the relevant extras:
pip install "extra_data[bridge]" # install dependencies for karabo-bridge-like data streaming
pip install "extra_data[complete]" # install dependencies for all features
If you get a permissions error, add the --user flag to that command.
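To confirm that the package is visible to the Python interpreter you intend to use, a small check like the following may help. It is only a sketch: the function name `extra_data_status` is made up for illustration, and it relies solely on the standard library, so it runs whether or not the installation succeeded.

```python
import importlib.util
from importlib import metadata


def extra_data_status():
    """Hypothetical helper: return the installed version string, or None."""
    # find_spec() returns None when the module cannot be imported.
    if importlib.util.find_spec("extra_data") is None:
        return None
    try:
        return metadata.version("extra_data")
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata found under this name.
        return "unknown"


status = extra_data_status()
print("extra_data not found" if status is None else f"extra_data {status}")
```

If this prints "not found" after a successful pip run, pip probably installed into a different environment from the interpreter running the check.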
Quickstart
Open a run on the Maxwell cluster:
from extra_data import open_run
run = open_run(proposal=700000, run=1)
You can also specify a run directory, or open an individual file - see Opening files for details. The same methods to access data work with any of these options.
Load data as a NumPy array for a given source & key:
arr = run["SA3_XTD10_PES/ADC/1:network", "digitizers.channel_4_A.raw.samples"].ndarray()
You can load only a region of interest, get a labelled array with train IDs, or load 1D data as columns in a pandas dataframe. See Reading data to analyse in memory (example) and Getting data by source & key (reference) for more information.
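As a rough sketch of those three options, the helper below takes an already opened run and returns a region of interest, a labelled array, and a dataframe. The function name `load_three_ways` and its parameters are made up for illustration; `ndarray(roi=...)`, `xarray()` and `get_dataframe(fields=...)` are the extra_data calls. It assumes `array_key` holds array data per train and `scalar_key` holds a single value per train, since the dataframe form is meant for 1D data.

```python
def load_three_ways(run, source, array_key, scalar_key, n_samples=1000):
    """Hypothetical helper: load data from one source in three forms."""
    key_data = run[source, array_key]

    # Region of interest: keep the first n_samples of each entry.
    # The roi applies to the data within each entry, not to the train axis.
    roi_array = key_data.ndarray(roi=(slice(0, n_samples),))

    # Labelled array whose first dimension is labelled with train IDs.
    labelled = key_data.xarray()

    # Dataframe indexed by train ID, one column per selected key.
    frame = run.get_dataframe(fields=[(source, scalar_key)])

    return roi_array, labelled, frame
```

Passing the run in as an argument keeps the helper independent of how the data was opened: a proposal and run number, a run directory, or a single file.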
For data that’s too big to fit in memory at once, you can read one pulse train at a time:
for train_id, data in run.select("*/DET/*", "image.data").trains():
    mod0 = data["FXE_DET_LPD1M-1/DET/0CH0:xtdf"]["image.data"]
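The loop above only fetches the data. A typical next step is to reduce each train to something small so that memory use stays flat. The following is a minimal sketch under that assumption; `max_per_train` is a hypothetical name, and `source` must be an exact source name (not a glob pattern) because it is also used to look up the per-train dictionary.

```python
def max_per_train(run, source, key):
    """Hypothetical helper: keep one number per train instead of the full array."""
    maxima = {}
    for train_id, data in run.select(source, key).trains():
        # A source may have no data for a given train, so look it up defensively.
        source_data = data.get(source, {})
        if key not in source_data:
            continue
        # Only the reduced value is kept; the array is dropped on the next iteration.
        maxima[train_id] = source_data[key].max()
    return maxima
```

The result maps train IDs to values, so trains without data for that source are simply absent rather than filled with a placeholder.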
Other options to work with large data volumes include breaking the run into smaller parts with split_trains() before loading data, and automatic chunking with the Dask framework and dask_array().
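To illustrate the first of these options, the sketch below computes a mean over a whole run while holding only one part in memory at a time. The name `mean_in_parts` is hypothetical, and it assumes the installed version provides `split_trains(parts=...)` on the object returned by `run[source, key]`.

```python
def mean_in_parts(key_data, parts=10):
    """Hypothetical helper: mean of all values, loaded one chunk at a time."""
    total = 0.0
    count = 0
    for chunk in key_data.split_trains(parts=parts):
        arr = chunk.ndarray()  # loads only this chunk's trains
        total += float(arr.sum())
        count += arr.size
    # Avoid dividing by zero if the selection contains no data at all.
    return total / count if count else float("nan")
```

With a real run this might be called as `mean_in_parts(run[source, key], parts=20)`. Accumulating a sum and a count, rather than averaging the per-chunk means, keeps the result correct when the parts differ in size.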
Documentation contents
Tutorials and Examples
Reference
Development
See also