EXtra-data is a Python library for accessing saved data produced at European XFEL.
EXtra-data is available on our Anaconda installation on the Maxwell cluster:
module load exfel exfel_anaconda3
You can also install it from PyPI to use in other environments with Python 3.6 or later:
pip install extra_data
If you get a permissions error, add the --user flag to that command.
Open a run on the Maxwell cluster:
from extra_data import open_run
run = open_run(proposal=700000, run=1)
You can also specify a run directory, or open an individual file - see Opening files for details. The same methods to access data work with any of these options.
Load data as a NumPy array for a given source & key:
arr = run["SA3_XTD10_PES/ADC/1:network", "digitizers.channel_4_A.raw.samples"].ndarray()
You can load only a region of interest, get a labelled array with train IDs, or load 1D data as columns in a pandas dataframe. See Reading data to analyse in memory (example) and Getting data by source & key (reference) for more information.
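The three options mentioned above can be sketched in one small helper. This is an illustration, not the documented recipe: the function name load_three_ways and its parameters are hypothetical, and scalar_key is assumed to name a key holding a single value per train, since the dataframe route is meant for 1D data. The roi argument applies to the dimensions after the first (train) axis.

```python
def load_three_ways(run, source, array_key, scalar_key, n_samples=1000):
    """Load data from an opened run in three forms (hypothetical helper).

    run        -- an object returned by open_run() or similar
    array_key  -- key holding an array for each train
    scalar_key -- assumed key holding one value per train
    """
    key_data = run[source, array_key]

    # Region of interest: keep only the first n_samples values of each train.
    head = key_data.ndarray(roi=slice(0, n_samples))

    # Labelled array: same data, with train IDs attached as a coordinate.
    labelled = key_data.xarray()

    # One value per train, loaded as a column of a pandas dataframe.
    frame = run.get_dataframe(fields=[(source, scalar_key)])

    return head, labelled, frame
```

Passing slice(0, n_samples) is equivalent to the np.s_[:n_samples] spelling and avoids needing NumPy just to build the slice.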
For data that’s too big to fit in memory at once, you can read one pulse train at a time:
for train_id, data in run.select("*/DET/*", "image.data").trains():
    mod0 = data["FXE_DET_LPD1M-1/DET/0CH0:xtdf"]["image.data"]
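One common use of the train-by-train loop is to reduce data as it streams past, so that only one train is held in memory at a time. The following is a minimal sketch of that idea, assuming every train holds data of the same shape for the chosen key; mean_over_trains is a hypothetical helper, not part of EXtra-data, and it accepts any iterable of (train_id, data) pairs shaped like those the loop above receives.

```python
def mean_over_trains(train_iter, source, key):
    """Average data[source][key] over trains without loading them all at once.

    train_iter yields (train_id, data) pairs, where data is a nested dict
    indexed as data[source][key] (the shape used in the loop above).
    """
    total = 0.0  # float start value, so integer arrays are promoted when added
    count = 0
    for train_id, data in train_iter:
        # Defensive check (an assumption): skip trains lacking this source or key.
        if source not in data or key not in data[source]:
            continue
        total = total + data[source][key]
        count += 1
    if count == 0:
        raise ValueError("no train contained the requested source and key")
    return total / count

# Possible usage with a run opened as shown earlier:
# avg = mean_over_trains(
#     run.select("*/DET/*", "image.data").trains(),
#     "FXE_DET_LPD1M-1/DET/0CH0:xtdf", "image.data",
# )
```

Starting the accumulator as a float keeps the sum from overflowing when the detector data is stored as small integers.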
- Reading data to analyse in memory
- Inspecting available data
- Reading data train by train
- Aligning data from different sources
- Averaging detector data with Dask
- Parallel processing with a virtual HDF5 dataset
- Accessing LPD data
- Combining data from separate but concurrent runs
- Reading data files
- Multi-module detector data
- Streaming data over ZeroMQ
- Checking data files
- Command line tools
- Data files format
- Performance notes