Future GoodRunList generation in pass3 #29

@mjlarson

The NuSources and LowEnAstro working groups need to generate good run lists ("GRLs") in order to correctly handle livetime and transients. These are more complex than the GRLs that are currently available via I3Live's snapshots tool.

Now that the filter database has a read-only user (thanks @gh999ic, @mcpreston), I've written scripts to collate all of this information. For pass3, though, it would be useful to build the databases so that these tasks are simpler and we don't need to piece together (or guess at...) so much information from different sources. I'm collecting the issues here in one place so we can consider them when building the pass3 databases:

Level2 GRL text files:

  • The L2 GRL text files in /data/exp/ are known to be static and don't reflect the current snapshot
  • The L2 GRL text files are sometimes actively wrong: Run 121864 (February 10, 2013) lists 86 active strings in the text file (created April 12, 2018) while I3Live shows string 31 dropped.
  • Users need to know that pre-2017 level2 is different from pre-2017 level2pass2 and post-2017 level2 (the latter two are the same thing, despite the different names).
  • These never contained information on gaps: to get those, users usually had to combine txt and tar files in /data/exp/.
  • These never contained information on missing files. Users needed to trawl through L2 i3 files on disk to calculate the start/stop times of the missing files.
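As a rough illustration of the bookkeeping the last bullet describes, the sketch below finds the time windows of a run that no file on disk covers. It assumes the start/stop time of each L2 file has already been extracted by some other means; the function name, the `tolerance` parameter, and the example times are all made up for illustration and are not taken from any existing script.

```python
from datetime import datetime, timedelta


def find_uncovered_windows(run_start, run_stop, file_intervals,
                           tolerance=timedelta(seconds=1)):
    """Return (start, stop) windows inside the run that no file covers.

    file_intervals is a list of (start, stop) datetimes, one per file.
    Windows shorter than `tolerance` are ignored (hypothetical threshold).
    """
    windows = []
    cursor = run_start  # latest time covered so far
    for start, stop in sorted(file_intervals):
        if start - cursor > tolerance:
            windows.append((cursor, start))
        cursor = max(cursor, stop)
    if run_stop - cursor > tolerance:
        windows.append((cursor, run_stop))
    return windows


# Made-up example: a 30-minute run with the middle 10-minute file absent.
t0 = datetime(2020, 1, 1, 0, 0, 0)
files = [(t0, t0 + timedelta(minutes=10)),
         (t0 + timedelta(minutes=20), t0 + timedelta(minutes=30))]
uncovered = find_uncovered_windows(t0, t0 + timedelta(minutes=30), files)
```

This alone cannot tell a gap from a missing file; that distinction still has to come from the filtering database.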

I3Live database:

  • No information about active DOMs/strings is available in the snapshot. These are needed for veto-style selections since missing strings can break the veto.
  • The number of active strings/DOMs is at a separate URL from good_i3, good_it, grl_start, and grl_stop, so we need to query the server twice.
  • In old runs, configured_doms and grl_stop are sometimes not set, leading to awkward workarounds.
  • Overall run information is available, but access via the json/web interface tends to be slow: retrieving all of the relevant information takes ~0.7 seconds per run.
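To make the two-request problem concrete, here is a minimal sketch of merging the two per-run records and flagging unset fields instead of guessing at them. It makes no network calls and assumes both responses have already been parsed into dicts. The field names good_i3, good_it, grl_start, grl_stop, and configured_doms come from the list above; the `run` and `active_strings` keys, the function name, and the example values are placeholders, not the actual I3Live schema.

```python
# Fields named in this issue that a GRL entry needs.
REQUIRED = ("good_i3", "good_it", "grl_start", "grl_stop", "configured_doms")


def merge_run_records(grl_record, detector_record):
    """Combine two per-run dicts and list required fields that are unset."""
    if grl_record.get("run") != detector_record.get("run"):
        raise ValueError("records belong to different runs")
    merged = {**grl_record, **detector_record}
    # Treat absent keys, None, and empty strings as "not set".
    merged["missing_fields"] = [
        key for key in REQUIRED if merged.get(key) in (None, "")
    ]
    return merged


# Placeholder records (not real I3Live output) for an "old run" case.
grl = {"run": 1, "good_i3": True, "good_it": False,
       "grl_start": "2013-01-01 00:00:00", "grl_stop": None}
detector = {"run": 1, "active_strings": 85}
merged = merge_run_records(grl, detector)
```

With a single pass3 table holding all of these columns, the merge step and the second request would both go away.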

Filtering database:

  • Information is spread out across five tables (gaps, gaps_pass2, sub_runs, sub_runs_pass2, missing_files_pass2)
  • The _pass2 tables only apply to pre-2017 runs, but using them correctly relies on user knowledge.
  • No absolute times are given, only the livetime per subrun file, so users must manually calculate the times of gaps and missing files.
  • Not all missing files are in the missing_files_pass2 table. In some cases, this is because a later good start time was set and the early subrun files were dropped. In other cases, I can't find any explanation (e.g., run 125347 is missing subrun file 254). For these cases, the user has to guess at the livetime of the missing file.
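The manual calculation in the last two bullets looks roughly like the sketch below: accumulate per-subrun livetimes from the good start time to get absolute windows, and flag holes in the subrun numbering. It assumes the livetimes have already been read from the database into a dict and that subruns are back-to-back with no gaps between them, which is a simplification. The function name and the example numbers are made up.

```python
from datetime import datetime, timedelta


def subrun_windows(good_start, livetimes):
    """Estimate (start, stop) per subrun from livetimes in seconds.

    livetimes maps subrun number -> livetime. Returns (windows, missing),
    where missing lists subrun numbers absent from the numbering.
    Windows after a missing subrun are only the earliest possible times,
    because the duration of the missing file is unknown.
    """
    if not livetimes:
        return {}, []
    windows, missing = {}, []
    cursor = good_start
    for number in range(min(livetimes), max(livetimes) + 1):
        if number not in livetimes:
            missing.append(number)  # duration unknown, cursor not advanced
            continue
        stop = cursor + timedelta(seconds=livetimes[number])
        windows[number] = (cursor, stop)
        cursor = stop
    return windows, missing


# Made-up example: subrun 2 is absent from the table.
start = datetime(2020, 1, 1, 0, 0, 0)
windows, missing = subrun_windows(start, {0: 600.0, 1: 600.0, 3: 300.0})
```

Storing absolute start/stop times per subrun in the pass3 tables would make this reconstruction, and the guess it contains, unnecessary.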
