Data Containers

The AstroData package is built around the concept of data containers. These are objects that contain the data for a single observation, and determine the structure of these data in memory. We have extended the Astropy NDData class to provide the core functionality of these containers, and added a number of mixins to provide additional functionality.

Specifically, we extend NDData with the following:

  • astrodata.NDAstroData - the main data container class

  • astrodata.AstroDataMixin - a mixin class that adds additional functionality to NDData, such as the ability to access image planes and tables stored in the meta dict as attributes of the object

  • astrodata.NDArithmeticMixin - a mixin class that adds arithmetic functionality

  • astrodata.NDSlicingMixin - a mixin class that adds slicing functionality

NDAstroData class

Our main data container is NDAstroData. Fundamentally, it is a subclass of astropy.nddata.NDData, combined with a number of mixins that add functionality:

class NDAstroData(AstroDataMixin, NDArithmeticMixin, NDSlicingMixin, NDData):
    ...

With these mixins, NDAstroData can be used with the ease and efficiency of an ordinary array, with extra features such as uncertainty propagation and efficient slicing using typical array syntax.
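As a rough illustration of what uncertainty propagation alongside array syntax means, the sketch below adds two images and their variances with plain NumPy. It assumes uncorrelated errors and does not call NDAstroData itself; all names and values are made up for the example.

```python
import numpy as np

# Two hypothetical 2x2 images with per-pixel variances.
a = np.array([[10.0, 20.0], [30.0, 40.0]])
var_a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[1.0, 1.0], [2.0, 2.0]])
var_b = np.full((2, 2), 0.5)

# For uncorrelated errors, the variance of a sum is the sum of the variances.
total = a + b
var_total = var_a + var_b

# Slicing data and variance with the same index keeps them aligned.
corner = total[:1, :1]
var_corner = var_total[:1, :1]
```

A container that carries its uncertainty with it performs this bookkeeping on behalf of the user.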

Upon initialization (see AstroData’s __init__() method), the AstroData class will attempt to open the file in memory-mapping mode, which is the default mode for opening FITS files in Astropy. This means that the data is not loaded into memory until it is accessed, and is discarded from memory when it is no longer needed. This is particularly important for large data sets common in astronomy.
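The general idea of memory mapping can be sketched with numpy.memmap. This is not how AstroData opens files; the raw binary file, its name, and its shape are assumptions chosen to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

# Write a small array to a temporary raw binary file, as a stand-in for a
# large image on disk (real FITS files also carry headers).
path = os.path.join(tempfile.mkdtemp(), "image.dat")
np.arange(16, dtype=np.float32).reshape(4, 4).tofile(path)

# Map the file instead of reading it; data is fetched from disk on access.
mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(4, 4))

# Copy only a 2x2 section into an ordinary in-memory array.
section = np.array(mapped[1:3, 1:3])
```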

Much of NDAstroData acts to mimic the behavior of NDData and astropy.io.fits objects, but is designed to be extensible to other formats and means of storing, accessing, and manipulating data.

Slicing

NDAstroData objects can already be sliced in the same way as NDData objects, using normal Python array syntax:

>>> ad = astrodata.from_file(some_fits_file)
>>> ad.shape
[(2048, 2048)]

# Access a 100x100 section (indices 100-199 in both dimensions) of the first image plane.
>>> ad.data[0][100:200, 100:200].shape
(100, 100)

It’s also useful to access specific “windows” in the data, which is implemented in NDAstroData such that only the data necessary to access a window is loaded into memory.

The astrodata.AstroData.window() property returns an instance of NDWindowing, which only references the AstroData object being windowed (i.e., it contains no direct references to the data). Slicing that instance produces an NDWindowingAstroData, which holds references to the memory-mapped data requested by the window.
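The two-step pattern described here, a lightweight windowing object that yields a data-holding object only when sliced, might look roughly like the following. The classes Container, Windowing, and WindowedView are hypothetical stand-ins written for this sketch, not the astrodata implementation.

```python
import numpy as np


class WindowedView:
    """Hypothetical result of a window: holds only the requested section."""

    def __init__(self, data, section):
        self.section = section
        # Basic slicing of a NumPy (or memory-mapped) array gives a view,
        # so nothing outside the section is copied here.
        self.data = data[section]


class Windowing:
    """Hypothetical windowing object: keeps a reference to the target
    object only, and touches its data only when sliced."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, section):
        return WindowedView(self._target.data, section)


class Container:
    """Hypothetical container exposing a window property."""

    def __init__(self, data):
        self.data = data

    @property
    def window(self):
        return Windowing(self)


image = Container(np.arange(100).reshape(10, 10))
win = image.window[2:4, 5:8]
```

Here win.data is a view onto the original array, so building the window copies no pixels.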

The base NDAstroData class provides the memory-mapping functionality built upon by NDWindowingAstroData, with other important behaviors added by the other mixins.

One addition is the variance property, which allows direct access and setting of the data’s uncertainty, without the user needing to explicitly wrap it as an NDUncertainty object. Internally, the variance is stored as an ADVarianceUncertainty object, which is subclassed from Astropy’s standard VarianceUncertainty class with the addition of a check for negative values whenever the array is accessed.
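A minimal sketch of such a convenience property is shown below. CheckedVariance and Container are hypothetical names used only for this illustration; they imitate the described behavior and are not the ADVarianceUncertainty or NDAstroData code.

```python
import numpy as np


class CheckedVariance:
    """Hypothetical stand-in for an uncertainty object that rejects
    negative values when its array is read."""

    def __init__(self, array):
        self._array = np.asarray(array, dtype=float)

    @property
    def array(self):
        if np.any(self._array < 0):
            raise ValueError("negative variance found")
        return self._array


class Container:
    """Hypothetical container with a variance convenience property."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.uncertainty = None

    @property
    def variance(self):
        # Unwrap the stored object so callers see a plain array.
        return None if self.uncertainty is None else self.uncertainty.array

    @variance.setter
    def variance(self, value):
        # Wrap plain arrays on assignment; None clears the uncertainty.
        self.uncertainty = None if value is None else CheckedVariance(value)


nd = Container([1.0, 2.0, 3.0])
nd.variance = [0.1, 0.2, 0.3]
```

The caller reads and assigns plain arrays, while the wrapping and the negative-value check stay inside the container.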

AstroDataMixin also changes the default method of combining the mask attributes during arithmetic operations from logical_or to bitwise_or, since the individual bits in the mask have separate meanings.
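The difference between the two ways of combining masks can be seen with plain NumPy. The bit values below are invented for the example and do not correspond to any particular mask definition.

```python
import numpy as np

# Hypothetical bit meanings, chosen only for this example.
BAD_PIXEL = 1
SATURATED = 4

mask_a = np.array([0, BAD_PIXEL, BAD_PIXEL, 0], dtype=np.uint16)
mask_b = np.array([0, 0, SATURATED, SATURATED], dtype=np.uint16)

# logical_or collapses every nonzero value to True, losing the flags.
logical = np.logical_or(mask_a, mask_b)

# bitwise_or keeps each flag bit from both inputs.
bitwise = np.bitwise_or(mask_a, mask_b)
```

With logical_or the third pixel is simply True; with bitwise_or it is 5, which still records that the pixel was both bad and saturated.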

The way slicing affects the wcs is also changed, because DRAGONS regularly relies on gWCS objects being callable, and the standard slicing method breaks this.

Finally, the additional image planes and tables stored in the meta dict are exposed as attributes of the NDAstroData object, and any image planes that have the same shape as the parent NDAstroData object will be handled by NDWindowingAstroData. Sections are ignored when accessing tables or image planes with a different shape.
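Exposing dictionary entries as attributes is commonly done with __getattr__. The sketch below shows one way this could work; the Container class, the flat layout of the meta dict, and the OBJMASK entry are assumptions made for the illustration.

```python
class Container:
    """Hypothetical container that exposes entries of its meta dict
    as attributes."""

    def __init__(self, data, meta=None):
        self.data = data
        self.meta = {} if meta is None else dict(meta)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # Read "meta" through __dict__ to avoid recursion if it is unset.
        meta = self.__dict__.get("meta", {})
        if name in meta:
            return meta[name]
        raise AttributeError(name)


nd = Container(data=[[1, 2], [3, 4]],
               meta={"OBJMASK": [[0, 1], [0, 0]]})
objmask = nd.OBJMASK
```

Regular attributes such as data are found first, so only names missing from the object fall through to the meta dict.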

Note

We expect to make changes to NDAstroData in future releases. In particular, we plan to make use of the unit attribute provided by the NDData class and increase the use of memory-mapping by default. These changes mostly represent increased functionality and we anticipate a high (and possibly full) degree of backward compatibility.