Tags
We described in the previous section how to generate
tags for an AstroData derivative. In this section we’ll describe the algorithm
that generates the complete tag set out of the individual TagSet
instances.
The algorithm collects all the tags in a list and then decides whether to apply
them or not following certain rules, but let’s talk about TagSet first.
TagSet is actually a standard named tuple customized to generate default
values (None) for its missing members. Its signature is:
TagSet(add=None, remove=None, blocked_by=None, blocks=None,
if_present=None)
The most common TagSet is an additive one: TagSet(['FOO', 'BAR']).
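As a rough illustration of what “a named tuple with None defaults” means, here is a minimal sketch. It is not the library's implementation: the stand-in TagSet below is an assumption, built with the standard collections.namedtuple and its defaults argument (Python 3.7 or later), and it only mirrors the signature shown above.

```python
from collections import namedtuple

# Hypothetical stand-in for the real class: a named tuple whose five
# fields all default to None when they are not given.
TagSet = namedtuple(
    "TagSet",
    ["add", "remove", "blocked_by", "blocks", "if_present"],
    defaults=(None,) * 5,
)

# The first positional argument is `add`, so this is a purely additive set.
additive = TagSet(['FOO', 'BAR'])

print(additive.add)      # the tags to add
print(additive.remove)   # missing members fall back to None
print(additive)          # full tuple, with all five fields
```

Because the first field is add, the short form TagSet(['FOO', 'BAR']) reads naturally for the common additive case.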
If all you need is to add tags, then you’re done here. But the real power of
our tag generating system is that you can specify some conditions to apply a
certain TagSet, or put restrictions on others. The different arguments to
TagSet all expect a list, and work in the following way:
add: if this TagSet is selected, then add all these members to the tag set.
remove: if this TagSet is selected, then prevent all these members from joining the tag set.
blocked_by: if any of the tags listed in here exist in the tag set, then discard this TagSet altogether.
blocks: discard from the list of unprocessed ones any TagSet that would add any of the tags listed here.
if_present: process this TagSet only if all the tags listed in here exist in the tag set at this point.
Note that blocked_by and blocks look like two sides of the same coin.
This is intentional: which one to use is up to the programmer, depending on
what will reduce the amount of typing and/or make the logic easier (sometimes one
wants to block a bunch of other tags from a single one; sometimes one wants a
tag to be blocked by a bunch of others). Furthermore, while blocks and
blocked_by prevent the entire TagSet from being added if it contains a
tag affected by these, remove only affects the specific tag.
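To make the “two sides of the same coin” remark concrete, the sketch below writes the same restriction both ways. The TagSet here is a hypothetical stand-in (a plain named tuple with None defaults), and the PROCESSED/RAW pair in the last line is an invented example used only to show remove; none of it is taken from the library.

```python
from collections import namedtuple

# Hypothetical stand-in for the real class, mirroring its signature.
TagSet = namedtuple(
    "TagSet",
    ["add", "remove", "blocked_by", "blocks", "if_present"],
    defaults=(None,) * 5,
)

# Style A: one tag set blocks a bunch of others.
bias = TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT'])

# Style B: each affected tag set declares what blocks it.
image = TagSet(['IMAGE'], blocked_by=['BIAS'])
spect = TagSet(['SPECT'], blocked_by=['BIAS'])

# remove is narrower: it only keeps the listed tag out of the result
# (invented tags, for illustration only).
processed = TagSet(['PROCESSED'], remove=['RAW'])

print(bias.blocks, image.blocked_by, spect.blocked_by, processed.remove)
```

Style A needs one declaration where style B needs two, which is the kind of typing trade-off described above.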
Now, the algorithm works like this:
1. Collect all the TagSet instances generated by the methods in the instance that are decorated using astro_data_tag.
2. Then we sort them out:
   - Those that subtract tags from the tag set go first (the ones with non-empty remove or blocks), allowing them to act early on.
   - Those with non-empty blocked_by are moved to the end of the list, to ensure that other tags can be generated before them.
   - Those with non-empty if_present are moved behind those with blocked_by.
3. Now that we’ve sorted the tag sets, process them sequentially and for each one:
   - If it requires other tags to be present, make sure that this is the case. If the requirements are not met, drop the tag set. Otherwise…
   - Figure out if anything is blocking the tag set. This will be the case if any of the tags to be added is in the “blocked” list, or if any of the tags added by previous tag sets is in the blocked_by list of the one being processed. Then…
   - If all the previous hurdles have been passed, apply the changes declared by this tag set (add, remove, and/or block others).
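The sorting and processing steps can be sketched in a few lines of Python. This is a simplified reading of the description above, not the library's code: the TagSet stand-in, the function name resolve_tags, and the exact sort keys are all assumptions, and the collection step (step 1) is replaced by passing the list in directly.

```python
from collections import namedtuple

# Hypothetical stand-in for the real class, mirroring its signature.
TagSet = namedtuple(
    "TagSet",
    ["add", "remove", "blocked_by", "blocks", "if_present"],
    defaults=(None,) * 5,
)

def resolve_tags(tagsets):
    """Hypothetical helper: combine TagSet instances into one set of tags."""
    # Step 2: three stable sorts, each one preserving the previous order.
    ordered = sorted(tagsets, key=lambda ts: not (ts.remove or ts.blocks))
    ordered = sorted(ordered, key=lambda ts: bool(ts.blocked_by))
    ordered = sorted(ordered, key=lambda ts: bool(ts.if_present))

    # Step 3: process sequentially, tracking three running sets.
    tags, removals, blocked = set(), set(), set()
    for ts in ordered:
        add = set(ts.add or ())
        # Required tags must all be present already.
        if ts.if_present and not set(ts.if_present) <= tags:
            continue
        # Blocked if it would add a blocked tag, or a blocker is present.
        if add & blocked or set(ts.blocked_by or ()) & tags:
            continue
        # All hurdles passed: apply removals, additions, and blocks.
        removals |= set(ts.remove or ())
        tags |= add - removals
        blocked |= set(ts.blocks or ())
    return tags

example = [
    TagSet(['GMOS']),
    TagSet(['GCAL_IR_OFF', 'LAMPOFF'], blocked_by=['PROCESSED']),
    TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT']),
    TagSet(['IMAGE']),
]
result = resolve_tags(example)
print(sorted(result))
```

In this sketch, removals only keep out tags added from that point on, which is one reason the tag sets with non-empty remove are promoted to the front.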
Note that Python’s sort algorithm is stable. This means that if two elements are indistinguishable from the point of view of the sorting algorithm, they are guaranteed to stay in the same relative position. To better understand how this affects our tags, and the algorithm itself, let’s follow up with an example taken from real code (the Gemini-generic and GMOS modules):
# Simple tagset, with only a constant, additive content
@astro_data_tag
def _tag_instrument(self):
    return TagSet(['GMOS'])

# Simple tagset, also with additive content. This one will
# check if the frame fits the requirements to be classified
# as "GMOS imaging". It returns a value conditionally:
# if this is not imaging, then it will return None, which
# means the algorithm will ignore the value
@astro_data_tag
def _tag_image(self):
    if self.phu.get('GRATING') == 'MIRROR':
        return TagSet(['IMAGE'])

# This is a slightly more complex TagSet (but fairly simple, anyway),
# inherited by all Gemini instruments.
@astro_data_tag
def _type_gcal_lamp(self):
    if self.phu.get('GCALLAMP') == 'IRhigh':
        shut = self.phu.get('GCALSHUT')
        if shut == 'OPEN':
            return TagSet(['GCAL_IR_ON', 'LAMPON'],
                          blocked_by=['PROCESSED'])
        elif shut == 'CLOSED':
            return TagSet(['GCAL_IR_OFF', 'LAMPOFF'],
                          blocked_by=['PROCESSED'])

# This tagset is only active when we detect that the frame is
# a bias. In that case we want to prevent the frame from being
# classified as "imaging" or "spectroscopy", which depend on the
# configuration of the instrument
@astro_data_tag
def _tag_bias(self):
    if self.phu.get('OBSTYPE') == 'BIAS':
        return TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT'])
These four simple tag methods will serve to illustrate the algorithm. Let’s pretend
that the requirements for all four of them are somehow met, meaning that we get four
TagSet instances in our list, in some arbitrary order. After step 1 in the algorithm,
then, we may have collected the following list:
[ TagSet(['GMOS']),
TagSet(['GCAL_IR_OFF', 'LAMPOFF'], blocked_by=['PROCESSED']),
TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT']),
TagSet(['IMAGE']) ]
The algorithm then proceeds to sort them. First, it will promote the TagSet
with non-empty blocks or remove:
[ TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT']),
TagSet(['GMOS']),
TagSet(['GCAL_IR_OFF', 'LAMPOFF'], blocked_by=['PROCESSED']),
TagSet(['IMAGE']) ]
Note that the other three TagSet instances stay in exactly the same order. Now the
algorithm will sort the list again, moving the ones with non-empty
blocked_by to the end:
[ TagSet(['BIAS', 'CAL'], blocks=['IMAGE', 'SPECT']),
TagSet(['GMOS']),
TagSet(['IMAGE']),
TagSet(['GCAL_IR_OFF', 'LAMPOFF'], blocked_by=['PROCESSED']) ]
Note that at each step, all the instances (except the ones “being moved”) have
kept the same position relative to each other. This is where the “stability” of
the sorting comes into play, ensuring that each step does not affect the previous
one. Finally, there are no if_present entries in our example, so no more instances are
moved around.
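The effect of stability can be checked with a toy version of the two sorting passes. The sketch below reduces each tag set to a label plus two boolean flags, a simplification chosen for illustration and not how the library represents them. Python's built-in sorted is guaranteed to be stable.

```python
# Each item: (label, subtracts, has_blocked_by). The flags are a
# simplified stand-in for "non-empty remove/blocks" and "non-empty blocked_by".
collected = [
    ("GMOS", False, False),
    ("GCAL_IR_OFF", False, True),
    ("BIAS", True, False),
    ("IMAGE", False, False),
]

# Pass 1: items that subtract tags go first (False sorts before True).
pass1 = sorted(collected, key=lambda item: not item[1])

# Pass 2: items with blocked_by go last; ties keep their pass-1 order.
pass2 = sorted(pass1, key=lambda item: item[2])

labels1 = [item[0] for item in pass1]
labels2 = [item[0] for item in pass2]
print(labels1)
print(labels2)
```

BIAS stays in front after the second pass only because the sort is stable: its key in pass 2 ties with GMOS and IMAGE, so the order established by pass 1 is kept.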
Now the algorithm prepares three empty sets (tags, removals, and blocked),
and starts iterating over the TagSet list.
- For the first TagSet there are no blocks or removals in place yet, so we just add its contents to the current sets: tags = {'BIAS', 'CAL'}, blocked = {'IMAGE', 'SPECT'}.
- Then comes TagSet(['GMOS']). Again, there are no removals in place, and GMOS is not in the set of blocked tags. Thus, we just add it to the current tag set: tags = {'BIAS', 'CAL', 'GMOS'}.
- When processing TagSet(['IMAGE']), the algorithm observes that IMAGE is in the blocked set, and stops processing this tag set.
- Finally, neither GCAL_IR_OFF nor LAMPOFF is in blocked, and PROCESSED is not in tags, meaning that we can add this tag set to the final one.
Our result will look something like: {'BIAS', 'CAL', 'GMOS', 'GCAL_IR_OFF', 'LAMPOFF'}
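The walk-through can be replayed as a short trace. This sketch hard-codes the already-sorted example as (add, blocked_by, blocks) triples and leaves out removals and if_present, since the example has none; the variable names are assumptions made for illustration.

```python
# The sorted example list, reduced to (add, blocked_by, blocks) sets.
ordered = [
    ({'BIAS', 'CAL'}, set(), {'IMAGE', 'SPECT'}),
    ({'GMOS'}, set(), set()),
    ({'IMAGE'}, set(), set()),
    ({'GCAL_IR_OFF', 'LAMPOFF'}, {'PROCESSED'}, set()),
]

tags, blocked = set(), set()
decisions = []  # one (tag names, outcome) entry per tag set
for add, blocked_by, blocks in ordered:
    # Skip if a tag to add is blocked, or a blocking tag is already present.
    if add & blocked or blocked_by & tags:
        decisions.append((sorted(add), 'skipped'))
        continue
    tags |= add
    blocked |= blocks
    decisions.append((sorted(add), 'applied'))

for names, outcome in decisions:
    print(names, outcome)
print(sorted(tags))
```

Only the IMAGE tag set is skipped, which matches the result shown above.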