MOMA/XINA Data Mining
Telemetry archives (also called "TIDs") are imported into XINA by the process described here. "TID" is shorthand for "Telemetry Identifier", a text string that uniquely identifies the experiment contained in the telemetry archive.
The XINA Data Mining process uses a shared network drive between the mining server (mine699.gsfc.nasa.gov) and the IOC server (MOMAIOC, DRAMSIOC, etc.). Each mission uses a mission-specific folder on this shared drive to perform the data mining processes described below (e.g., /mine699/missions/moma/).
This flow chart shows the major steps taken during the data mining process:
- The user commits a TID to Subversion (SVN).
- On the IOC server (MOMAIOC/DRAMSIOC/etc.):
- An SVN post-commit hook calls the "svnlook" utility to generate a list of all revision changes in standard format (see the sketch after this list)
- "svnlook" output is saved to
/mine699/missions/*/rev/{rev#}.rev
- The XINA Commit Watch utility running on MINE699 detects the new rev file. Each mission has a separate running instance of this utility.
- Runs command(s) as specified in configuration file:
/mine699/missions/*/config.watch.json
- Standard setup runs two commands: XINA Commit and XINA Import
- Once the commands complete successfully, the rev file is moved to the rev archive:
/mine699/missions/*/rev/archive/
- The XINA Commit utility is run by XINA Commit Watch (above)
- Runs command(s) as specified in configuration file:
/mine699/missions/*/config.json
- Reads in rev file generated by svnlook
- For each file listed in the rev file, the file is checked out from that revision into a local directory:
/mine699/missions/*/local/{file}
- An additional directory is provided for temporary storage for mining processes if required:
/mine699/missions/*/temp
- Files can only be processed individually
- After single file mining operations are finished, the local and temp directories are emptied
- The XINA Import utility is run by XINA Commit Watch (above)
- Requires the XINA Tunnel utility to be running in order to connect to the XINA server (this is always kept running in the background on mine699)
- Reads and imports files in alphanumeric order from
/mine699/missions/*/import/{timestamp}_{rev#}
- Moves the files as they are completed to the current revision archive directory:
/mine699/missions/*/archive/
- Because the import process relies on operations happening sequentially, any error halts the import process until it is repaired
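As an illustration of the first step above, a post-commit hook along the following lines could produce the rev files. This is only a sketch, not the hook actually installed on the IOC servers; the use of the `svnlook changed` subcommand and the "moma" mission folder are assumptions made for the example.

```
#!/bin/sh
# Illustrative post-commit hook sketch. Subversion passes the repository
# path and the committed revision number as arguments.
REPOS="$1"
REV="$2"

# Write the list of changes for this revision to the mission's rev directory
# on the shared mine699 drive, where XINA Commit Watch will pick it up.
# The real hook may use a different svnlook subcommand or output format.
/usr/bin/svnlook changed -r "$REV" "$REPOS" > "/mine699/missions/moma/rev/${REV}.rev"
```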
Handling Import Errors
XINA Commit Watch will not move on to the next rev file until the current file has been completed successfully.
- If any operation in XINA Commit or XINA Import fails, the process returns an error code
- When XINA Commit Watch receives an error code from any sub-process, it writes an empty file into the mission revision directory:
/mine699/missions/*/rev/rev.lock
- As long as a rev.lock file is present, XINA Commit Watch will not attempt to process any rev files
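The lock-file behavior described above amounts to the gating logic sketched below. This is for illustration only, not the actual XINA Commit Watch implementation; the "moma" mission name and the run_xina_commit/run_xina_import functions are placeholders standing in for the real per-mission configuration.

```
#!/bin/sh
# Sketch of the rev.lock gating described above (not the real implementation).
MISSION_DIR="/mine699/missions/moma"

# Placeholders standing in for the commands configured in config.watch.json.
run_xina_commit() { echo "XINA Commit for $1"; }
run_xina_import() { echo "XINA Import for $1"; }

# While a rev.lock file from an earlier failure exists, do nothing.
if [ -e "$MISSION_DIR/rev/rev.lock" ]; then
    echo "rev.lock present; skipping rev file processing" >&2
    exit 0
fi

for REV_FILE in "$MISSION_DIR"/rev/*.rev; do
    [ -e "$REV_FILE" ] || continue
    # Any failure writes the lock file and stops further processing until
    # the problem is repaired manually.
    if ! run_xina_commit "$REV_FILE" || ! run_xina_import "$REV_FILE"; then
        touch "$MISSION_DIR/rev/rev.lock"
        exit 1
    fi
    # On success, move the rev file to the rev archive before continuing.
    mv "$REV_FILE" "$MISSION_DIR/rev/archive/"
done
```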
Debugging
Log output from XINA Commit Watch, XINA Commit, and XINA Import is stored in ```/mine699/missions/*/log```.
- Errors generated during the "commit phase" are generally easier to fix, and may indicate a problem with a configuration file or script. After a commit phase bug is fixed, the rev.lock file can simply be deleted, and XINA Commit Watch will perform a clean attempt to re-process the entire revision.
- Errors generated during the "import phase" can be more complicated to fix, because some operations may have already been imported before the failure and running them again may cause new errors. It is necessary to check the archive directory for completed operations before attempting to re-process the revision. Depending on the error (for example, a typo in a field name), it may be preferable to correct the mined files, import them manually, transfer the rev file to the rev archive manually, and then remove the rev.lock file.
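For the import-phase case, a manual recovery might look roughly like the following. The mission name and rev file name are examples only, and the manual import itself is done with the XINA Import utility, whose invocation depends on the local setup.

```
# Illustrative import-phase recovery; adjust the mission and rev file names.
MISSION_DIR="/mine699/missions/moma"

# 1. Check which operations were already completed before the failure.
ls "$MISSION_DIR"/archive/

# 2. Correct the mined files as needed and import them manually with the
#    XINA Import utility, then move the rev file to the rev archive.
mv "$MISSION_DIR"/rev/1234.rev "$MISSION_DIR"/rev/archive/

# 3. Remove the lock file so XINA Commit Watch resumes processing new revs.
rm "$MISSION_DIR"/rev/rev.lock
```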
Preload
For mining previously committed data, each mission has a directory ```/mine699/missions/*/preload```.
- Full checkouts of each mission are kept in ```/mine699/missions/*/data```. This checkout is not related to the XINA Commit Watch process.
- Scripts are provided in each preload directory to create new task directories:
setup.sh task_name
- Generated directories require a config.json file
- Scripts are also provided to process and import the backlog:
- commit.sh, commit_bg.sh - run across all files (in foreground or background)
- commit_from.sh, commit_from_bg.sh - run starting at a TID (in foreground or background)
- import.sh, import_bg.sh - import all generated files (in foreground or background)
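A typical backlog run might look like the following. The mission name, task name, and starting TID are placeholders, and this sketch assumes the processing scripts are run from within the generated task directory.

```
# Example preload workflow (mission, task, and TID names are illustrative).
cd /mine699/missions/moma/preload
./setup.sh backlog_2023

# Add the required config.json to the generated task directory, then run the
# backlog in the background:
cd backlog_2023
./commit_bg.sh            # or ./commit_from_bg.sh SOME_TID to start at a TID
./import_bg.sh            # import all generated files
```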