r2r_ctd.reporting
=================

.. py:module:: r2r_ctd.reporting

.. autoapi-nested-parse::

   Classes and functions that iterate through stations running QA tests, aggregating the results, and reporting those results in the form of an XML element tree.

   Because we are making XML, the code in here is a little verbose.
   The :py:class:`ResultAggregator` is what iterates through stations, performs or asks for the QA results for each station, and constructs the final QA "certificate".
   The lxml builder pattern is used here so that the code looks similar to the XML it generates.
   If you are looking at the code yourself, start with :py:meth:`ResultAggregator.certificate` and follow it from there.
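
   As a rough illustration of that idea, the sketch below imitates the builder style using only the standard library.
   The ``make`` helper is a hypothetical stand-in for lxml's element maker, and the tag and attribute names are made up for the example; they are not taken from the r2r schema.

   .. code-block:: python

      import xml.etree.ElementTree as ET


      def make(tag):
          """Return a factory that builds <tag> elements (hypothetical helper)."""

          def factory(*children, **attrs):
              element = ET.Element(tag, attrs)
              for child in children:
                  if isinstance(child, str):
                      # Strings become the text content of the element
                      element.text = (element.text or "") + child
                  else:
                      # Anything else is assumed to be a child element
                      element.append(child)
              return element

          return factory


      # Illustrative names only, not the real r2r element names
      Certificate = make("certificate")
      Rating = make("rating")
      Tests = make("tests")
      Test = make("test")

      # The nesting of the calls mirrors the nesting of the XML
      doc = Certificate(
          Rating("G", description="hypothetical description"),
          Tests(
              Test(name="Example Test"),
          ),
      )
      xml_text = ET.tostring(doc, encoding="unicode")

   Because each factory call nests inside its parent call, the indentation of the Python source follows the indentation of the resulting XML.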


Attributes
----------

.. autoapisummary::

   r2r_ctd.reporting.E
   r2r_ctd.reporting.Certificate
   r2r_ctd.reporting.Rating
   r2r_ctd.reporting.Tests
   r2r_ctd.reporting.Test
   r2r_ctd.reporting.TestResult
   r2r_ctd.reporting.Bounds
   r2r_ctd.reporting.Bound
   r2r_ctd.reporting.Infos
   r2r_ctd.reporting.Info
   r2r_ctd.reporting.Update
   r2r_ctd.reporting.Process
   r2r_ctd.reporting.Time
   r2r_ctd.reporting.Reference
   r2r_ctd.reporting.ALL
   r2r_ctd.reporting.A_FEW
   r2r_ctd.reporting.RATING_CSS_MAP


Classes
-------

.. autoapisummary::

   r2r_ctd.reporting.ResultAggregator


Functions
---------

.. autoapisummary::

   r2r_ctd.reporting.overall_rating
   r2r_ctd.reporting.file_presence
   r2r_ctd.reporting.valid_checksum
   r2r_ctd.reporting.lon_lat_range
   r2r_ctd.reporting.date_range
   r2r_ctd.reporting.boolean_span_formatter
   r2r_ctd.reporting.get_update_record
   r2r_ctd.reporting.get_new_references
   r2r_ctd.reporting.write_xml_qa_report


Module Contents
---------------

.. py:data:: E

   lxml element maker with the r2r namespace configured. The names that follow represent the different XML elements that will be constructed.


.. py:data:: Certificate

.. py:data:: Rating

.. py:data:: Tests

.. py:data:: Test

.. py:data:: TestResult

.. py:data:: Bounds

.. py:data:: Bound

.. py:data:: Infos

.. py:data:: Info

.. py:data:: Update

.. py:data:: Process

.. py:data:: Time

.. py:data:: Reference

.. py:data:: ALL
   :value: 100


   Literal 100 representing 100 percent


.. py:data:: A_FEW
   :value: 50


   Literal 50 representing 50 percent; defines the cutoff between "yellow" and "red" ratings
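
   A minimal sketch of how these two constants could turn a pass percentage into a letter code.
   The function name, the green-at-100 rule, and the treatment of exactly 50 as yellow are assumptions made for illustration, not confirmed behaviour of this module.

   .. code-block:: python

      ALL = 100
      A_FEW = 50


      def percent_to_rating(percent: int) -> str:
          """Map a pass percentage to a letter code (assumed thresholds)."""
          if percent >= ALL:
              return "G"
          if percent >= A_FEW:  # assumption: exactly 50 still counts as yellow
              return "Y"
          return "R"


      ratings = [percent_to_rating(p) for p in (100, 75, 50, 49)]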


.. py:function:: overall_rating(rating: Literal['G', 'R', 'Y', 'N', 'X']) -> lxml.etree._Element

   Given a string code rating, wrap it in a :py:obj:`Rating` with the correct description attribute set


.. py:function:: file_presence(rating: Literal['G', 'R'], test_result: str | int) -> lxml.etree._Element

   Constructs the XML element representing the "Presence of All Raw Files" test result

   :param test_result: Should be a string or int in the closed interval [0, 100] representing the percentage of files that passed this test.


.. py:function:: valid_checksum(rating: Literal['G', 'R']) -> lxml.etree._Element

   Constructs the XML element representing the "Valid Checksum for All Files in Manifest" test result

   Note that this check is pass/fail


.. py:function:: lon_lat_range(rating: Literal['G', 'R', 'Y', 'N', 'X'], test_result: str | int) -> lxml.etree._Element

   Constructs the XML element representing the lon/lat nav range test result

   :param test_result: Should be a string or int in the closed interval [0, 100] representing the percentage of files that passed this test.


.. py:function:: date_range(rating: Literal['G', 'R', 'Y', 'N', 'X'], test_result: str | int) -> lxml.etree._Element

   Constructs the XML element representing the "Dates within NAV Ranges" test result

   :param test_result: Should be a string or int in the closed interval [0, 100] representing the percentage of files that passed this test.


.. py:function:: boolean_span_formatter(tf: bool) -> str

   Format a boolean with an HTML span element that colors it green/red for true/false


.. py:data:: RATING_CSS_MAP

   Mapping between the QA letter codes and CSS color names


.. py:class:: ResultAggregator

   Dataclass which iterates through all the stations and their tests, aggregates their results, and generates the "info blocks".

   It is structured in the same order that the results appear in the XML.
   Some ratings require extra information, e.g. the geographic bounds test needs to know if any of the stations are missing nav entirely or if the bounding box itself is missing.


   .. py:attribute:: breakout
      :type:  r2r_ctd.breakout.Breakout


   .. py:method:: geo_breakout_feature()

      If the breakout has a valid bounding box, generate the GeoJSON feature to plot on a map


   .. py:method:: geo_station_feature()

      Generate the GeoJSON feature collection with a feature for each station that has lon/lat coordinates to plot on a map
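
      A minimal sketch of the kind of GeoJSON these two methods describe, following the standard GeoJSON layout.
      The function names, the ``name`` property, and the sample stations are hypothetical; the real methods read their values from the breakout and its stations.

      .. code-block:: python

         import json


         def station_feature(name, lon, lat):
             """Point feature for one station (GeoJSON order is lon, lat)."""
             return {
                 "type": "Feature",
                 "geometry": {"type": "Point", "coordinates": [lon, lat]},
                 "properties": {"name": name},
             }


         def bbox_feature(west, south, east, north):
             """Polygon feature for a bounding box, as a closed counterclockwise ring."""
             ring = [
                 [west, south],
                 [east, south],
                 [east, north],
                 [west, north],
                 [west, south],
             ]
             return {
                 "type": "Feature",
                 "geometry": {"type": "Polygon", "coordinates": [ring]},
                 "properties": {},
             }


         # Made-up stations; the second has no coordinates and is skipped
         stations = [("cast001", -70.5, 41.5), ("cast002", None, None)]
         collection = {
             "type": "FeatureCollection",
             "features": [
                 station_feature(name, lon, lat)
                 for name, lon, lat in stations
                 if lon is not None and lat is not None
             ],
         }
         geojson_text = json.dumps(collection)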


   .. py:property:: presence_of_all_files
      :type: int


      Iterate through the stations and count how many have :py:meth:`~r2r_ctd.accessors.R2RAccessor.all_three_files`


   .. py:property:: presence_of_all_files_rating
      :type: Literal['G', 'R']


      Pass/fail result string of :py:meth:`presence_of_all_files` where 100 is a pass


   .. py:property:: valid_checksum_rating
      :type: Literal['G', 'R']


      Pass/fail result string of :py:meth:`~r2r_ctd.breakout.Breakout.manifest_ok`


   .. py:property:: lon_lat_nav_valid
      :type: int


      Iterate through the stations and count how many are :py:meth:`~r2r_ctd.accessors.R2RAccessor.lon_lat_valid`


   .. py:property:: lon_lat_nav_range
      :type: int


      Iterate through the stations and count how many are :py:meth:`~r2r_ctd.accessors.R2RAccessor.lon_lat_in` the :py:meth:`~r2r_ctd.breakout.Breakout.bbox`


   .. py:property:: lon_lat_nav_ranges_rating
      :type: Literal['G', 'Y', 'R', 'N', 'X']


      Calculate the rating string for the nav bounds test. This also needs to check if all of the stations are missing nav or if the breakout is missing bounds.


   .. py:property:: time_valid
      :type: int


      Iterate through the stations and count how many are :py:meth:`~r2r_ctd.accessors.R2RAccessor.time_valid`


   .. py:property:: time_range
      :type: int


      Iterate through the stations and count how many are :py:meth:`~r2r_ctd.accessors.R2RAccessor.time_in` the :py:meth:`~r2r_ctd.breakout.Breakout.temporal_bounds`


   .. py:property:: time_rating
      :type: Literal['G', 'Y', 'R', 'N', 'X']


      Calculate the rating string for the temporal bounds test. This also needs to check if all of the stations are missing time or if the breakout is missing bounds.


   .. py:property:: rating

      Aggregates the ratings from all the ``*_rating`` properties.

      Takes the "worst" rating as the overall rating in the following order:

      * R (Red)
      * N (Grey)
      * X (Black)
      * Y (Yellow)
      * G (Green)

      The precedence of red over black and yellow comes from my interpretation of the rating string in the original XML:

        GREEN (G) if all tests GREEN, RED (R) if at least one test RED, else YELLOW (Y); Gray(N) if no navigation was included in the distribution; X if one or more tests could not be run.
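
      The "worst rating wins" rule above can be sketched as a lookup into the precedence order.
      The names ``PRECEDENCE`` and ``worst_rating`` are hypothetical and only illustrate the ordering; they are not the property's actual implementation.

      .. code-block:: python

         # Worst first, matching the order listed above
         PRECEDENCE = ["R", "N", "X", "Y", "G"]


         def worst_rating(ratings):
             """Return the rating that appears earliest in PRECEDENCE."""
             return min(ratings, key=PRECEDENCE.index)


         overall = worst_rating(["G", "Y", "G", "X"])

      With this ordering a single red result always decides the overall rating, and green is only returned when every individual rating is green.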


   .. py:property:: info_total_raw_files

      Info Element with the length of :py:class:`r2r_ctd.breakout.Breakout.hex_paths`


   .. py:property:: info_number_bottles

      Info Element with the number of casts that have bottles fired

      .. admonition:: WHOI Divergence
          :class: warning

          The original WHOI code simply checks if there is a .bl file and says the cast has bottles fired
          if this file is present.

          This is incorrect: you need to check whether there are any actual bottle fire records in that file.
          This code does that check.

          In the example breakout 138036 there are three casts but only one fired bottles. The QA report in that breakout
          incorrectly says 0 casts with bottles fired.
          My understanding is that the current WHOI code would report 3 for this breakout. I don't know why it says 0,
          but both are incorrect.
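
      A minimal sketch of the difference between "a .bl file exists" and "the .bl file contains fire records".
      It assumes each fire record is a comma separated line that starts with two integers (firing sequence and bottle position); that layout, the function name, and the sample text are assumptions for illustration and may not match how this module parses the file.

      .. code-block:: python

         def count_bottle_fires(bl_text: str) -> int:
             """Count lines that look like bottle fire records (assumed layout)."""
             count = 0
             for line in bl_text.splitlines():
                 fields = [field.strip() for field in line.split(",")]
                 # Assumed record: sequence, position, date/time, begin scan, end scan
                 if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
                     count += 1
             return count


         # Made-up file contents: a header only, and a header plus two fire records
         empty_bl = "RESET time Jan 01 2020 00:00:00\n"
         fired_bl = (
             empty_bl
             + "1, 1, Jan 01 2020 01:02:03, 100, 135\n"
             + "2, 2, Jan 01 2020 01:05:00, 400, 435\n"
         )

         cast_has_bottles = [count_bottle_fires(text) > 0 for text in (empty_bl, fired_bl)]

      A presence-only check would count both of these casts as having fired bottles, while the record check counts only the second.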


   .. py:property:: info_model_number

      Info Element with the CTD model number (e.g. SBE911)

      See :py:func:`r2r_ctd.derived.get_model`


   .. py:property:: info_number_casts_with_nav_all_scans

      Info Element with the number of casts that have the string "Store Lat/Lon Data = Append to Every Scan" in the header file


   .. py:property:: info_casts_without_all_raw

      Info Element with a space separated list of station names that did not have :py:meth:`~r2r_ctd.accessors.R2RAccessor.all_three_files`


   .. py:property:: info_casts_with_hex_bad_format

      Always reports OK

      .. admonition:: WHOI Divergence
          :class: warning

          In the original WHOI code, this works the same way as :py:meth:`.info_casts_with_xmlcon_bad_format`.

          I would like to implement this in the same way that the bad xmlcon report does, but need to actually make or find some bad data.


   .. py:property:: info_casts_with_xmlcon_bad_format

      Report the casts which could not have a conreport generated

      .. admonition:: WHOI Divergence
          :class: warning

          The original code would run the ``file`` command and check that none of
          "data", "object", or "executable" appeared in the output of the command.
          Instead, this just checks which casts ConReport.exe failed on,
          i.e. it lets the Seabird software figure out if the xmlcon is in a bad format.

          The original documentation also says it is looking for ASCII, but the code does not appear
          to do any encoding checks. Such checks would likely be invalid anyway, since Seabird files, being created on Windows,
          are usually `CP437 <https://en.wikipedia.org/wiki/Code_page_437>`_.
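
      To illustrate why a strict ASCII check would be misleading, the sketch below decodes a made-up byte string that contains the CP437 degree sign.
      The sample bytes are invented for the example and are not taken from a real Seabird file.

      .. code-block:: python

         # Made-up bytes; 0xF8 is the degree sign in CP437
         raw = b"Temperature [\xf8C]"

         try:
             raw.decode("ascii")
             is_ascii = True
         except UnicodeDecodeError:
             # Any byte above 0x7F makes a strict ASCII decode fail
             is_ascii = False

         # The same bytes decode cleanly with Python's built-in cp437 codec
         text = raw.decode("cp437")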


   .. py:property:: info_casts_with_dock_deck_test_in_file_name

      Info Element with a space separated list of station names that look like "deck tests"

      See :py:func:`r2r_ctd.checks.is_deck_test`


   .. py:property:: info_casts_with_temp_sensor_sn_problems

      List of casts where the serial number in the data header does not
      match any serial numbers in the xmlcon.

      .. admonition:: WHOI Divergence
          :class: warning

          There can be more than one temperature sensor and they will have different serial numbers.
          The datafile would only have one serial number, but the xmlcon will list all the sensors.
          The original WHOI code would only check one of the serial numbers found in the xmlcon.

          I'm not sure of the exact cause, but there are a lot of false problems in the breakouts I was given
          to test with. For example, 156405 reports that all the casts have SN problems, but both a manual examination
          and this code show that there are no mismatches.

          This code checks the serial number in the header file against all the serial numbers in the xmlcon of the
          same instrument type (e.g. all temperature sensor serial numbers).
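
      The comparison described above amounts to a membership test rather than a single equality check.
      A minimal sketch, where the function name and the sample serial numbers are hypothetical:

      .. code-block:: python

         def has_sn_problem(header_sn: str, xmlcon_sns: list[str]) -> bool:
             """True when the header serial number matches none of the xmlcon ones."""
             candidates = {sn.strip() for sn in xmlcon_sns}
             return header_sn.strip() not in candidates


         # Made-up serial numbers for two temperature sensors in one xmlcon
         xmlcon_temperature_sns = ["1234", "4321"]

         matches_secondary = has_sn_problem("4321", xmlcon_temperature_sns)
         matches_nothing = has_sn_problem("9999", xmlcon_temperature_sns)

      Comparing only against the first serial number in the list would flag the ``"4321"`` cast as a problem even though it matches the second sensor.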


   .. py:property:: info_casts_with_cond_sensor_sn_problems

      List of casts where the serial number in the data header does not
      match any serial numbers in the xmlcon.

      .. admonition:: WHOI Divergence
          :class: warning

          See the :py:meth:`.info_casts_with_temp_sensor_sn_problems` divergence note.


   .. py:property:: info_casts_with_bad_nav

      Info Element with a space separated list of station names that aren't :py:meth:`~r2r_ctd.accessors.R2RAccessor.lon_lat_valid`


   .. py:property:: info_casts_failed_nav_bounds

      Info Element with a space separated list of station names that are :py:meth:`~r2r_ctd.accessors.R2RAccessor.lon_lat_valid` but aren't in :py:meth:`~r2r_ctd.breakout.Breakout.bbox`


   .. py:method:: gen_geoCSV()

      Generates the "geoCSV" file

      The header was taken verbatim from the WHOI code and could probably use some cleanup.
      Of particular note is that the metadata in the header (field types, units, etc.) does not
      match the number of columns in the actual "data" part.

      The original WHOI code also doesn't calculate the dp_flag and just sets it to a hard-coded 0.
      A bit mask might be better because there can be multiple problems with each cast.
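
      A minimal sketch of what such a bit mask could look like using :py:class:`enum.IntFlag`.
      The class name and the individual flags are hypothetical; neither this module nor the WHOI code defines them.

      .. code-block:: python

         from enum import IntFlag


         class CastProblem(IntFlag):
             """Hypothetical per-cast problem flags, one bit each."""

             NONE = 0
             MISSING_RAW_FILE = 1
             BAD_NAV = 2
             OUTSIDE_NAV_BOUNDS = 4
             SENSOR_SN_MISMATCH = 8


         # A cast with two independent problems combines both bits
         dp_flag = CastProblem.BAD_NAV | CastProblem.SENSOR_SN_MISMATCH

         # The integer written to the file, and a test for one specific problem
         dp_flag_value = int(dp_flag)
         has_bad_nav = CastProblem.BAD_NAV in dp_flag

      A single integer column can then carry any combination of problems, and 0 keeps its current meaning of "no problems".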


   .. py:property:: certificate

      The Certificate Element with all the above test results


.. py:function:: get_update_record() -> lxml.etree._Element

.. py:function:: get_new_references(breakout: r2r_ctd.breakout.Breakout) -> list[lxml.etree._Element]

   Return a list of new Reference xml elements

   This crawls the output directories to check what was actually created to build its list


.. py:function:: write_xml_qa_report(breakout: r2r_ctd.breakout.Breakout, certificate: lxml.etree._Element)