Larch: X-ray Data Analysis

Tutorial: Reading and Writing Data

Larch has several built-in functions for reading scientific data, and the intention is that the range of supported file types will grow. In addition, many Python modules for reading standard types of image data can be used.

Simple ASCII Column Files

A simple way to store small amounts of numerical data, and one that is widely used in the XAFS community, is a plaintext (ASCII encoded) data file: whitespace-delimited numbers laid out as a table with a fixed number of columns, and with rows separated by newlines. Typically a comment character such as “#” marks lines of header information. For instance:

# room temperature FeO.
# data from 20-BM, 2001, as part of NXS school
#------------------------
#   energy     xmu       i0
  6911.7671  -0.35992590E-01  280101.00
  6916.8730  -0.39081634E-01  278863.00
  6921.7030  -0.42193483E-01  278149.00
  6926.8344  -0.45165576E-01  277292.00
  6931.7399  -0.47365589E-01  265707.00

This file and others like it can be read with the builtin read_ascii() function.

_io.read_ascii(filename, commentchar='#;*%', labels=None)

opens and reads a plaintext data file, returning a new group containing the data.

Parameters:
  • filename (string) – name of file to read.
  • commentchar (string) – string of valid comment characters
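  • labels (string or False) – string to split for column labels. By default the labels are taken from the file; with False, the data is returned as a single 2-D array.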

Some examples of read_ascii():

larch> g = read_ascii('mydata.dat')
larch> show(g)
== Group ascii_file mydata.dat: 6 symbols ==
  attributes: <Group header attributes from mydata.dat>
  column_labels: ['energy', 'xmu', 'i0']
  energy: array<shape=(412,), type=dtype('float64')>
  filename: 'mydata.dat'
  i0: array<shape=(412,), type=dtype('float64')>
  xmu: array<shape=(412,), type=dtype('float64')>
larch>

which reads the data file and sets array names according to the column labels in the file. To choose the array names yourself, pass them with the labels argument:

larch> g = read_ascii('mydata.dat', labels='e mutrans monitor')
larch> show(g)
== Group ascii_file mydata.dat: 6 symbols ==
  attributes: <Group header attributes from mydata.dat>
  column_labels: ['e', 'mutrans', 'monitor']
  e: array<shape=(412,), type=dtype('float64')>
  filename: 'mydata.dat'
  monitor: array<shape=(412,), type=dtype('float64')>
  mutrans: array<shape=(412,), type=dtype('float64')>
larch>

and to get the data as a 2-D array:

larch> g = read_ascii('mydata.dat', labels=False)
larch> show(g)
== Group ascii_file mydata.dat: 4 symbols ==
  attributes: <Group header attributes from mydata.dat>
  column_labels: []
  data: array<shape=(3, 412), type=dtype('float64')>
  filename: 'mydata.dat'
larch>

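
The parsing that read_ascii() performs can be approximated in plain Python. The sketch below is not Larch's implementation: read_columns() is a hypothetical helper, it returns plain lists rather than a group of arrays, and it assumes that the last comment line before the data holds the column labels, as in the example file above.

```python
# A few lines of the example file shown earlier in this tutorial.
SAMPLE = """\
# room temperature FeO.
# data from 20-BM, 2001, as part of NXS school
#------------------------
#   energy     xmu       i0
  6911.7671  -0.35992590E-01  280101.00
  6916.8730  -0.39081634E-01  278863.00
  6921.7030  -0.42193483E-01  278149.00
"""


def read_columns(text, commentchars="#;*%", labels=None):
    """Hypothetical helper: split column text into header lines and named columns."""
    header, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line[0] in commentchars:
            # Comment line: keep its text as header information.
            header.append(line[1:].strip())
        else:
            # Data line: whitespace-delimited numbers.
            rows.append([float(word) for word in line.split()])
    if not rows:
        raise ValueError("no numerical data found")
    ncols = len(rows[0])
    if any(len(row) != ncols for row in rows):
        raise ValueError("rows do not all have the same number of columns")
    if labels is None:
        # Assumption: the last comment line names the columns.
        names = header[-1].split() if header else []
    else:
        names = labels.split()
    if len(names) != ncols:
        # Fall back to generic names when the labels do not match the data.
        names = ["col%d" % (i + 1) for i in range(ncols)]
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    return header, columns


header, columns = read_columns(SAMPLE)
print(list(columns))         # ['energy', 'xmu', 'i0']
print(columns["energy"][0])  # 6911.7671
```

Passing labels='e mutrans monitor' to this helper renames the columns in the same way as the explicit example above.
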
_io.write_ascii(filename, *args, commentchar='#', label=None, header=None)

opens and writes arrays, scalars, and text to an ASCII file.

Parameters:
  • commentchar – character for comment (‘#’)
  • label – array label line (autogenerated)
  • header – array of strings for header

_io.write_group(filename, group, scalars=None, arrays=None, arrays_like=None, commentchar='#')

writes data from a specified group to an ASCII data file.
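
To make the file layout concrete, here is a minimal sketch in plain Python of writing a column file like the example at the top of this page. It is not the code behind write_ascii() or write_group(): write_columns() is a hypothetical helper, and the separator line and the "%.8g" number format are assumptions chosen to resemble that example.

```python
import io


def write_columns(stream, arrays, labels, header=(), commentchar="#"):
    """Hypothetical helper: write equal-length arrays as whitespace-delimited columns."""
    if len(arrays) != len(labels):
        raise ValueError("need exactly one label per array")
    if len({len(array) for array in arrays}) > 1:
        raise ValueError("all arrays must have the same length")
    # Header lines, each prefixed with the comment character.
    for line in header:
        stream.write("%s %s\n" % (commentchar, line))
    # Separator line, then the label line that names the columns.
    stream.write("%s%s\n" % (commentchar, "-" * 24))
    stream.write("%s %s\n" % (commentchar, " ".join(labels)))
    # One row per data point, one column per array.
    for row in zip(*arrays):
        stream.write(" ".join("%.8g" % value for value in row) + "\n")


energy = [6911.7671, 6916.8730, 6921.7030]
xmu = [-0.35992590e-01, -0.39081634e-01, -0.42193483e-01]

# An in-memory buffer stands in for a file opened with open(filename, "w").
buffer = io.StringIO()
write_columns(buffer, [energy, xmu], ["energy", "xmu"],
              header=["room temperature FeO."])
text = buffer.getvalue()
print(text)
```

Because the label line is the last comment line before the data, a file written this way follows the same convention as the example file that read_ascii() reads above.
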

Using HDF5 Files

HDF5 is an increasingly popular format for scientific data, as it can efficiently hold very large arrays in a hierarchical layout that also carries “metadata” about the data, and it can be explored with a variety of tools.

An example using h5group() shows that one can browse through the data hierarchy of the HDF5 file and pick out the needed data:

larch> g = h5group('test.h5')
larch> show(g)
== Group test.h5: 3 symbols ==
  attrs: {u'Collection Time': ': Sat Feb 4 13:29:00 2012', u'Version': '1.0.0',
          u'Beamline': 'GSECARS, 13-IDC / APS', u'Title': 'Epics Scan Data'}
  data: <Group test.h5/data>
  h5_file: <HDF5 file "test.h5" (mode r)>
larch> show(g.data)
== Group test.h5/data: 5 symbols ==
  attrs: {u'scan_prefix': '13IDC:', u'start_time': ': Sat Feb 4 13:29:00 2012',
        u'correct_deadtime': 'True', u'dimension': 2,
        u'stop_time': ': Sat Feb 4 13:44:52 2009'}
  environ: <Group test.h5/data/environ>
  full_xrf: <Group test.h5/data/full_xrf>
  merged_xrf: <Group test.h5/data/merged_xrf>
  scan: <Group test.h5/data/scan>


larch> g.data.scan.sums
<HDF5 dataset "det": shape (15, 26, 26), type "<f8">

larch> imshow(g.data.scan.sums[8:,:,:])

This interface is general-purpose but somewhat low-level. As HDF5 formats and schemas become standardized, better interfaces can easily be made on top of this approach.
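
The same kind of browsing can be done from ordinary Python with the h5py module. The sketch below assumes that h5py and NumPy are installed. It does not use the test.h5 file from the session above: it writes a small file of its own (sketch.h5, a hypothetical name) whose data/scan/sums layout and zero-filled array only mimic that session, and describe() is a hypothetical helper.

```python
import os
import tempfile

import h5py
import numpy as np


def describe(node, indent=0):
    """Hypothetical helper: one line per group or dataset below node."""
    lines = []
    for name, item in node.items():
        pad = "  " * indent
        if isinstance(item, h5py.Group):
            # Groups are containers: list them, then descend into them.
            lines.append("%s%s/" % (pad, name))
            lines.extend(describe(item, indent + 1))
        else:
            # Datasets hold the arrays: report shape and data type.
            lines.append("%s%s  shape=%s dtype=%s"
                         % (pad, name, item.shape, item.dtype))
    return lines


# Build a small stand-in file (placeholder data, not a real scan).
path = os.path.join(tempfile.mkdtemp(), "sketch.h5")
with h5py.File(path, "w") as h5:
    h5.attrs["Title"] = "Epics Scan Data"
    scan = h5.create_group("data/scan")  # intermediate groups are created too
    scan.create_dataset("sums", data=np.zeros((15, 26, 26)))

# Reopen read-only, walk the hierarchy, and pull out part of one array.
with h5py.File(path, "r") as h5:
    tree = describe(h5)
    sums = h5["data/scan/sums"]
    tail = sums[8:, :, :]  # slicing a dataset returns a NumPy array

print("\n".join(tree))
print(tail.shape)
```

Slicing the dataset, rather than converting it to an array first, reads only the requested portion, which is what makes very large arrays practical to work with.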