r2r_ctd.reporting

Classes and functions that iterate through stations doing QA tests, aggregating the results, and reporting those results in the form of an XML element tree.

Because we are making XML, the code in here is a little verbose. The ResultAggregator is what iterates through stations, performs or asks for the QA results for each station, and constructs the final QA “certificate”. The builder pattern from lxml is used here so that the code looks similar to the XML it generates. If you are reading the code yourself, start with ResultAggregator.certificate() and follow it from there.
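The lxml builder pattern can be sketched as follows. The namespace URI and element names here are placeholders for illustration only, not the module's real r2r namespace:

```python
from lxml import etree
from lxml.builder import ElementMaker

# Placeholder namespace; the real module configures the actual r2r
# namespace on its own ElementMaker instance named E.
NS = "http://example.org/r2r"
E = ElementMaker(namespace=NS, nsmap={"r2r": NS})

Certificate = E.certificate  # element factories, one per XML element
Rating = E.rating

# The nested calls read like the XML they produce:
doc = Certificate(Rating("G", description="Green"))
print(etree.tostring(doc, pretty_print=True).decode())
```

Each attribute access on an ElementMaker yields a factory for an element of that name, which is why the module can define module-level names like Certificate and Rating once and then compose them.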

Attributes

E

lxml element maker with the r2r namespace configured; a number of element factories follow, representing the different XML elements that will be constructed

Certificate

Rating

Tests

Test

TestResult

Bounds

Bound

Infos

Info

Update

Process

Time

Reference

ALL

Literal 100 representing 100 percent

A_FEW

Literal 50 representing 50 percent, defines the cutoff between "yellow" and "red" ratings

Classes

ResultAggregator

Dataclass which iterates through all the stations and their tests, aggregates their results, and generates the "info blocks".

Functions

overall_rating(→ lxml.etree._Element)

Given a rating code string, wrap it in a Rating element with the correct description attribute set

file_presence(→ lxml.etree._Element)

Constructs the XML element representing the "Presence of All Raw Files" test result

valid_checksum(→ lxml.etree._Element)

Constructs the XML element representing the "Valid Checksum for All Files in Manifest" test result

lon_lat_range(→ lxml.etree._Element)

Constructs the XML element representing the "Lon/Lat within NAV Ranges" test result

date_range(→ lxml.etree._Element)

Constructs the XML element representing the "Dates within NAV Ranges" test result

get_update_record(→ lxml.etree._Element)

get_new_references(→ list[lxml.etree._Element])

Return a list of new Reference xml elements

write_xml_qa_report(breakout, certificate)

Module Contents

r2r_ctd.reporting.E

lxml element maker with the r2r namespace configured; a number of element factories follow, representing the different XML elements that will be constructed

r2r_ctd.reporting.Certificate
r2r_ctd.reporting.Rating
r2r_ctd.reporting.Tests
r2r_ctd.reporting.Test
r2r_ctd.reporting.TestResult
r2r_ctd.reporting.Bounds
r2r_ctd.reporting.Bound
r2r_ctd.reporting.Infos
r2r_ctd.reporting.Info
r2r_ctd.reporting.Update
r2r_ctd.reporting.Process
r2r_ctd.reporting.Time
r2r_ctd.reporting.Reference
r2r_ctd.reporting.ALL = 100

Literal 100 representing 100 percent

r2r_ctd.reporting.A_FEW = 50

Literal 50 representing 50 percent, defines the cutoff between “yellow” and “red” ratings
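A minimal sketch of how these cutoffs can map a pass percentage onto a rating code. This is illustrative only: whether the cutoff value itself rates yellow or red is a guess, and the module's real ratings also involve the N and X codes for missing nav or bounds:

```python
ALL = 100   # every station passed
A_FEW = 50  # cutoff between "yellow" and "red"

def pct_to_rating(pct: int) -> str:
    """Illustrative mapping from pass percentage to a rating code."""
    if pct == ALL:
        return "G"
    if pct >= A_FEW:  # assumption: the cutoff itself still rates yellow
        return "Y"
    return "R"

print(pct_to_rating(100), pct_to_rating(75), pct_to_rating(25))  # G Y R
```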

r2r_ctd.reporting.overall_rating(rating: Literal['G', 'R', 'Y', 'N', 'X']) → lxml.etree._Element

Given a rating code string, wrap it in a Rating element with the correct description attribute set

r2r_ctd.reporting.file_presence(rating: Literal['G', 'R'], test_result: str | int) → lxml.etree._Element

Constructs the XML element representing the “Presence of All Raw Files” test result

Parameters:

test_result – Should be a string or int in the interval [0, 100] representing the percentage of files that passed this test.

r2r_ctd.reporting.valid_checksum(rating: Literal['G', 'R']) → lxml.etree._Element

Constructs the XML element representing the “Valid Checksum for All Files in Manifest” test result

Note that this check is pass/fail

r2r_ctd.reporting.lon_lat_range(rating: Literal['G', 'R', 'Y', 'N', 'X'], test_result: str | int) → lxml.etree._Element

Constructs the XML element representing the "Lon/Lat within NAV Ranges" test result

Parameters:

test_result – Should be a string or int in the interval [0, 100] representing the percentage of files that passed this test.

r2r_ctd.reporting.date_range(rating: Literal['G', 'R', 'Y', 'N', 'X'], test_result: str | int) → lxml.etree._Element

Constructs the XML element representing the “Dates within NAV Ranges” test result

Parameters:

test_result – Should be a string or int in the interval [0, 100] representing the percentage of files that passed this test.

class r2r_ctd.reporting.ResultAggregator

Dataclass which iterates through all the stations and their tests, aggregates their results, and generates the “info blocks”.

It is structured in the same order that the results appear in the XML. Some ratings require extra information, e.g. the geographic bounds test needs to know if any of the stations are missing nav entirely or if the bounding box itself is missing.

breakout: r2r_ctd.breakout.Breakout
property presence_of_all_files: int

Iterate through the stations and count how many have all_three_files()

property presence_of_all_files_rating: Literal['G', 'R']

Pass/fail result string of presence_of_all_files() where 100 is a pass

property valid_checksum_rating: Literal['G', 'R']

Pass/fail result string of manifest_ok()

property lon_lat_nav_valid: int

Iterate through the stations and count how many are lon_lat_valid()

property lon_lat_nav_range: int

Iterate through the stations and count how many are lon_lat_in() the bbox()

property lon_lat_nav_ranges_rating: Literal['G', 'Y', 'R', 'N', 'X']

Calculate the rating string for the nav bounds test; this also needs to check whether all of the stations are missing nav or the breakout is missing bounds.

property time_valid: int

Iterate through the stations and count how many are time_valid()

property time_range: int

Iterate through the stations and count how many are time_in() the temporal_bounds()

property time_rating: Literal['G', 'Y', 'R', 'N', 'X']

Calculate the rating string for the temporal bounds test; this also needs to check whether all of the stations are missing time or the breakout is missing bounds.

property rating

Aggregates the ratings from all the *_rating properties.

Takes the “worst” rating as the overall rating in the following order:

  • R (red)

  • N (grey)

  • X (black)

  • Y (yellow)

  • G (green)

Red taking precedence over black and yellow is my interpretation of the rating string in the original XML:

GREEN (G) if all tests GREEN, RED (R) if at least one test RED, else YELLOW (Y); Gray(N) if no navigation was included in the distribution; X if one or more tests could not be run.
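The worst-rating selection above can be sketched as a scan down the precedence list:

```python
# Precedence list, worst first, in the order given above.
PRECEDENCE = ("R", "N", "X", "Y", "G")

def worst_rating(ratings: list[str]) -> str:
    """Return the first (i.e. worst) code present among the ratings."""
    for code in PRECEDENCE:
        if code in ratings:
            return code
    return "G"

print(worst_rating(["G", "Y", "G"]))       # Y
print(worst_rating(["G", "X", "Y", "R"]))  # R
```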

property info_total_raw_files

Info Element with the length of r2r_ctd.breakout.Breakout.hex_paths

property info_number_bottles

Info Element with the number of casts that have bottles fired

WHOI Divergence

The original WHOI code simply checks if there is a .bl file and says the cast has bottles fired if this file is present.

This is incorrect: a .bl file can exist without any bottle fire records, so you need to check for actual fire records in the file. This code does that check.

In the example breakout 138036 there are three casts but only one fired bottles; the QA report in that breakout incorrectly says 0 casts with bottles fired. My understanding is the current WHOI code would report 3 for this breakout. I don’t know why it says 0, but both are incorrect.
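The stricter check can be sketched as counting fire records rather than testing file existence. The .bl layout assumed here (header lines, then one comma-separated record per fire beginning with the bottle number) is an approximation of Sea-Bird's format, not a verified spec:

```python
def count_bottle_fires(bl_text: str) -> int:
    """Count bottle fire records in the text of a .bl file (sketch)."""
    fires = 0
    for line in bl_text.splitlines():
        first_field = line.split(",")[0].strip()
        if first_field.isdigit():  # assumed: fire records start with a bottle number
            fires += 1
    return fires

empty_bl = "RESET Apr 10 2018 12:00:00\n"
fired_bl = empty_bl + "1, 1, Apr 10 2018 12:05:00, 100, 110\n"
print(count_bottle_fires(empty_bl), count_bottle_fires(fired_bl))  # 0 1
```

With this check, a cast whose .bl file exists but contains only header lines is correctly reported as having fired no bottles.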

property info_model_number

Info Element with the CTD model number (e.g. SBE911)

See r2r_ctd.derived.get_model()

property info_number_casts_with_nav_all_scans

Info Element with the number of casts that have the string “Store Lat/Lon Data = Append to Every Scan” in the header file
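This is a plain substring check against each cast's header file. A sketch with hypothetical header contents:

```python
# The string the property searches for, per the description above.
NAV_LINE = "Store Lat/Lon Data = Append to Every Scan"

headers = {  # hypothetical header-file contents keyed by station name
    "sta01": "** Ship: Example\n* Store Lat/Lon Data = Append to Every Scan\n",
    "sta02": "** Ship: Example\n",
}
casts_with_nav_all_scans = sum(NAV_LINE in text for text in headers.values())
print(casts_with_nav_all_scans)  # 1
```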

property info_casts_without_all_raw

Info Element with a space separated list of station names that did not have all_three_files()

property info_casts_with_hex_bad_format

Always reports OK

WHOI Divergence

In the original WHOI code, this works the same way as info_casts_with_xmlcon_bad_format().

I would like to implement this in the same way that the bad xmlcon report does, but need to actually make or find some bad data.

property info_casts_with_xmlcon_bad_format

Report the casts which could not have a conreport generated

WHOI Divergence

The original code would run the file command and check that none of “data”, “object”, or “executable” appeared in its output. Instead, this will just check which casts ConReport.exe failed on, i.e. let Seabird software figure out if the xmlcon is a bad format.

The original documentation also says it is looking for ASCII, but the code does not appear to do any encoding checks; such checks would likely fail anyway since Seabird files, being produced on Windows, are usually CP437-encoded.
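For example, CP437 maps byte 0xF8 to the degree sign, so a header containing it is not valid ASCII even though the file is perfectly normal Seabird output:

```python
raw = b"Temperature \xf8C"   # 0xF8 is the degree sign in CP437, invalid ASCII
print(raw.decode("cp437"))   # Temperature °C
```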

property info_casts_with_dock_deck_test_in_file_name

Info Element with a space separated list of station names that look like “deck tests”

See r2r_ctd.checks.is_deck_test()

property info_casts_with_temp_sensor_sn_problems

List of casts where the serial number in the data header does not match any serial numbers in the xmlcon.

WHOI Divergence

There can be more than one temperature sensor and they will have different serial numbers. The datafile would only have one serial number, but the xmlcon will list all the sensors. The original WHOI code would only check one of the serial numbers found in the xmlcon.

I’m not sure of the exact cause, but there are a lot of false positives in the breakouts I was given to test with. For example, 156405 reports that all the casts have SN problems, but a manual examination, and this code, show that there are no mismatches.

This code checks the serial number in the header file against all the serial numbers in the xmlcon of the same instrument type (e.g. all temperature sensor serial numbers)
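Collecting all serial numbers of one sensor type can be sketched with the standard library. The xmlcon snippet below follows the usual Sea-Bird element layout, but both it and the header serial number are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical xmlcon fragment with two temperature sensors.
XMLCON = """\
<SBE_InstrumentConfiguration>
  <Instrument>
    <SensorArray>
      <Sensor><TemperatureSensor><SerialNumber>4567</SerialNumber></TemperatureSensor></Sensor>
      <Sensor><TemperatureSensor><SerialNumber>8910</SerialNumber></TemperatureSensor></Sensor>
    </SensorArray>
  </Instrument>
</SBE_InstrumentConfiguration>"""

def sensor_serials(xmlcon: str, sensor_tag: str) -> set[str]:
    """Collect every serial number for one sensor type from an xmlcon."""
    root = ET.fromstring(xmlcon)
    return {
        sn.text.strip()
        for sensor in root.iter(sensor_tag)
        for sn in sensor.iter("SerialNumber")
    }

serials = sensor_serials(XMLCON, "TemperatureSensor")
header_sn = "8910"  # serial number parsed from the data header (hypothetical)
print(header_sn in serials)  # True
```

Checking membership in the full set is what avoids the false positives: a header that names the second temperature sensor still matches, where a check against only the first xmlcon serial number would not.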

property info_casts_with_cond_sensor_sn_problems

List of casts where the serial number in the data header does not match any serial numbers in the xmlcon.

WHOI Divergence

See the info_casts_with_temp_sensor_sn_problems divergence note

property info_casts_with_bad_nav

Info Element with a space separated list of station names that aren’t lon_lat_valid()

property info_casts_failed_nav_bounds

Info Element with a space separated list of station names that are lon_lat_valid() but aren’t in bbox()

gen_geoCSV()

Generates the “geoCSV” file

The header was taken verbatim from the WHOI code, and could probably use some cleanup. Of particular note, the field type, unit, etc. metadata in the header does not match the number of columns in the actual “data” part.

The original WHOI code also doesn’t calculate the dp_flag and just sets it to a hard-coded 0. A bit mask might be better, because there can be multiple problems with each cast.
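The suggested bit mask could be sketched with enum.IntFlag, where each bit records one kind of problem so several can be encoded at once. The flag names here are illustrative, not taken from the WHOI code:

```python
from enum import IntFlag

class DPFlag(IntFlag):
    OK = 0            # no problems
    BAD_NAV = 1       # nav invalid or missing
    OUT_OF_BBOX = 2   # nav valid but outside the cruise bounds
    BAD_TIME = 4      # time invalid or outside the temporal bounds

# A cast with two independent problems combines the bits:
flag = DPFlag.BAD_NAV | DPFlag.BAD_TIME
print(int(flag), DPFlag.BAD_NAV in flag)  # 5 True
```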

property certificate

The Certificate Element with all the above test results

r2r_ctd.reporting.get_update_record() → lxml.etree._Element
r2r_ctd.reporting.get_new_references(breakout: r2r_ctd.breakout.Breakout) → list[lxml.etree._Element]

Return a list of new Reference xml elements

This crawls the output directories to check what was actually created to build its list
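Crawling an output tree for everything that was produced can be sketched with pathlib; the directory layout and file names here are hypothetical, and the real function wraps each discovered path in a Reference element:

```python
import tempfile
from pathlib import Path

def produced_files(root: Path) -> list[str]:
    """Return the sorted names of every file under root (sketch)."""
    return sorted(p.name for p in root.rglob("*") if p.is_file())

with tempfile.TemporaryDirectory() as d:
    out = Path(d)
    (out / "proc").mkdir()
    (out / "proc" / "station01.cnv").write_text("")
    (out / "qa.geoCSV").write_text("")
    names = produced_files(out)

print(names)  # ['qa.geoCSV', 'station01.cnv']
```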

r2r_ctd.reporting.write_xml_qa_report(breakout: r2r_ctd.breakout.Breakout, certificate: lxml.etree._Element)