Inspecting available data
The .info() method provides an overview of the data in an opened run or file:
[1]:
from extra_data import RunDirectory
run = RunDirectory("/gpfs/exfel/exp/XMPL/201750/p700000/raw/r0010")
run.info()
# of trains: 579
Duration: 0:00:57.9
First train ID: 507096934
Last train ID: 507097512
16 detector modules (SPB_DET_AGIPD1M-1)
e.g. module SPB_DET_AGIPD1M-1 0 : 512 x 128 pixels
SPB_DET_AGIPD1M-1/DET/0CH0:xtdf
250 frames per train, up to 144750 frames total
2 instrument sources (excluding detectors):
- SA1_XTD2_XGM/XGM/DOOCS:output
- SPB_XTD9_XGM/XGM/DOOCS:output
18 control sources:
- ACC_SYS_DOOCS/CTRL/BEAMCONDITIONS
- SA1_XTD2_ATT/MDL/MAIN
- SA1_XTD2_MIRR-1/MOTOR/HMRY
- SA1_XTD2_XGM/XGM/DOOCS
- SPB_IRU_AGIPD1M/MOTOR/Z_STEPPER
- SPB_IRU_AGIPD1M/PSC/HV
- SPB_IRU_AGIPD1M/TSENS/H1_T_EXTHOUS
- SPB_IRU_AGIPD1M/TSENS/H2_T_EXTHOUS
- SPB_IRU_AGIPD1M/TSENS/Q1_T_BLOCK
- SPB_IRU_AGIPD1M/TSENS/Q2_T_BLOCK
- SPB_IRU_AGIPD1M/TSENS/Q3_T_BLOCK
- SPB_IRU_AGIPD1M/TSENS/Q4_T_BLOCK
- SPB_IRU_AGIPD1M1/CTRL/MC1
- SPB_IRU_AGIPD1M1/CTRL/MC2
- SPB_IRU_VAC/GAUGE/GAUGE_FR_6
- SPB_RR_SYS/MDL/BUNCH_PATTERN
- SPB_RR_SYS/TSYS/X2TIMER2
- SPB_XTD9_XGM/XGM/DOOCS
The lsxfel command can give similar information at the command line.
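The summary figures printed by .info() are consistent with each other, and it can help to see how. The sketch below recomputes them from the first and last train IDs shown above. It assumes that train IDs are consecutive and that trains arrive at 10 Hz; the variable names are illustrative and not part of extra_data.

```python
from datetime import timedelta

# Values copied from the .info() output above
first_train_id = 507096934
last_train_id = 507097512
frames_per_train = 250

# Assumption: 10 trains per second
TRAIN_RATE_HZ = 10

# Both end points are included in the run, hence the +1
n_trains = last_train_id - first_train_id + 1
duration = timedelta(seconds=n_trains / TRAIN_RATE_HZ)

# Upper bound only: a module may have recorded no frames for some trains
max_frames = n_trains * frames_per_train

print(n_trains)    # 579
print(duration)
print(max_frames)  # 144750
```

The frame total is reported as "up to" because it assumes every train holds a full set of frames.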
The train IDs included in the run are available as a simple list:
[2]:
print(run.train_ids[:10])
[507096934, 507096935, 507096936, 507096937, 507096938, 507096939, 507096940, 507096941, 507096942, 507096943]
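Because this is an ordinary Python list, it can be checked with plain Python. As a minimal sketch, the hypothetical helper below reports train IDs missing between the first and last entry. It assumes the list is sorted in ascending order, and the sample list with a gap is made up for illustration.

```python
def find_missing_train_ids(train_ids):
    """Return the train IDs absent between the first and last ID in a sorted list."""
    if not train_ids:
        return []
    present = set(train_ids)
    return [tid for tid in range(train_ids[0], train_ids[-1] + 1)
            if tid not in present]

# The first ten train IDs printed above are consecutive
first_ten = list(range(507096934, 507096944))
print(find_missing_train_ids(first_ten))  # []

# Hypothetical list with train 507096937 left out
with_gap = [507096934, 507096935, 507096936, 507096938, 507096939]
print(find_missing_train_ids(with_gap))   # [507096937]
```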
And the source names are available as a set:
[3]:
run.all_sources
[3]:
frozenset({'ACC_SYS_DOOCS/CTRL/BEAMCONDITIONS',
'SA1_XTD2_ATT/MDL/MAIN',
'SA1_XTD2_MIRR-1/MOTOR/HMRY',
'SA1_XTD2_XGM/XGM/DOOCS',
'SA1_XTD2_XGM/XGM/DOOCS:output',
'SPB_DET_AGIPD1M-1/DET/0CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/10CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/11CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/12CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/13CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/14CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/15CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/1CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/2CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/3CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/4CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/5CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/6CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/7CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/8CH0:xtdf',
'SPB_DET_AGIPD1M-1/DET/9CH0:xtdf',
'SPB_IRU_AGIPD1M/MOTOR/Z_STEPPER',
'SPB_IRU_AGIPD1M/PSC/HV',
'SPB_IRU_AGIPD1M/TSENS/H1_T_EXTHOUS',
'SPB_IRU_AGIPD1M/TSENS/H2_T_EXTHOUS',
'SPB_IRU_AGIPD1M/TSENS/Q1_T_BLOCK',
'SPB_IRU_AGIPD1M/TSENS/Q2_T_BLOCK',
'SPB_IRU_AGIPD1M/TSENS/Q3_T_BLOCK',
'SPB_IRU_AGIPD1M/TSENS/Q4_T_BLOCK',
'SPB_IRU_AGIPD1M1/CTRL/MC1',
'SPB_IRU_AGIPD1M1/CTRL/MC2',
'SPB_IRU_VAC/GAUGE/GAUGE_FR_6',
'SPB_RR_SYS/MDL/BUNCH_PATTERN',
'SPB_RR_SYS/TSYS/X2TIMER2',
'SPB_XTD9_XGM/XGM/DOOCS',
'SPB_XTD9_XGM/XGM/DOOCS:output'})
Control and instrument sources are also available separately, as run.control_sources and run.instrument_sources, but for data analysis this distinction is often not important.
[4]:
assert run.all_sources == (run.control_sources | run.instrument_sources)
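Since source names are plain strings in a set, standard string tools can narrow them down. The sketch below uses the standard library's fnmatch module on a hand-copied subset of the names listed above, standing in for run.all_sources. It also relies on an observation from this listing rather than a documented rule: the instrument sources here are the ones whose names contain a colon.

```python
from fnmatch import fnmatchcase

# A subset of the source names shown above, standing in for run.all_sources
sources = frozenset({
    'SA1_XTD2_ATT/MDL/MAIN',
    'SA1_XTD2_XGM/XGM/DOOCS',
    'SA1_XTD2_XGM/XGM/DOOCS:output',
    'SPB_DET_AGIPD1M-1/DET/0CH0:xtdf',
    'SPB_XTD9_XGM/XGM/DOOCS',
    'SPB_XTD9_XGM/XGM/DOOCS:output',
})

# Glob-style match: every source with XGM as the middle part of its name
xgm_sources = sorted(s for s in sources if fnmatchcase(s, '*/XGM/*'))

# Observation from this listing: instrument source names contain a colon
with_colon = sorted(s for s in sources if ':' in s)

print(xgm_sources)
print(with_colon)
```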
Within each source, the data is organised under keys. You can look at one source and use the .keys() method to see its keys:
[5]:
run['SA1_XTD2_XGM/XGM/DOOCS:output'].keys()
[5]:
{'data.intensityAUXSa1TD',
'data.intensityAUXSa3TD',
'data.intensityAUXTD',
'data.intensitySa1SigmaTD',
'data.intensitySa1TD',
'data.intensitySa3SigmaTD',
'data.intensitySa3TD',
'data.intensitySigmaTD',
'data.intensityTD',
'data.trainId',
'data.xSa1SigmaTD',
'data.xSa1TD',
'data.xSa3SigmaTD',
'data.xSa3TD',
'data.xSigmaTD',
'data.xTD',
'data.ySa1SigmaTD',
'data.ySa1TD',
'data.ySa3SigmaTD',
'data.ySa3TD',
'data.ySigmaTD',
'data.yTD'}
Instrument sources may have multiple entries recorded for each train, and may be missing data for some trains. You can see how many entries there are for each train with .data_counts(). For example, for this AGIPD detector module, the counts are the number of frames in each train:
[6]:
run['SPB_DET_AGIPD1M-1/DET/11CH0:xtdf', 'image.data'].data_counts()
[6]:
507096934 0
507096935 0
507096936 0
507096937 0
507096938 0
...
507097185 250
507097186 250
507097187 250
507097188 250
507097189 250
Length: 256, dtype: uint64
This method returns a pandas Series. The index (the numbers shown on the left) consists of train IDs.
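Two questions these counts commonly answer are how many frames there are in total and which trains have none. To stay runnable without the run files, the sketch below uses a plain dict mapping train ID to count, holding a few of the values shown above, in place of the real Series. The comments give the equivalent pandas expressions for a Series held in a hypothetical variable named counts.

```python
# A few train ID -> frame count pairs copied from the output above
data_counts = {
    507096934: 0,
    507096935: 0,
    507096936: 0,
    507097187: 250,
    507097188: 250,
    507097189: 250,
}

# Total number of frames; with a pandas Series: counts.sum()
total_frames = sum(data_counts.values())

# Trains with no frames; with a pandas Series: counts[counts == 0].index
empty_trains = [tid for tid, n in data_counts.items() if n == 0]

print(total_frames)  # 750
print(empty_trains)  # [507096934, 507096935, 507096936]
```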