MOMA/XINA Data Mining
Telemetry archives (also called "TIDs") are imported into XINA by the process described here. "TID" is shorthand for "Telemetry Identifier", a text string that uniquely describes the experiment contained in the telemetry archive.
The XINA Data Mining process uses a shared network drive from the mining server (mine699.gsfc.nasa.gov) to the IOC server (e.g., MOMAIOC/DRAMSIOC/etc). Each mission uses a mission-specific folder on this shared drive to perform the data mining processes described below (e.g., /mine699/missions/moma/).
This flow chart shows the major steps taken during the data mining process:
- The user commits a TID to Subversion (SVN). For MOMA, the SVN repo "momadata" resides on the MOMAIOC server (momaioc.gsfc.nasa.gov).
- On the IOC server (MOMAIOC/DRAMSIOC/etc.):
- An SVN post-commit hook calls the "svnlook" utility to generate a list of all revision changes in standard format
- The "svnlook" output is saved as a "rev" file on the shared network drive:
/mine699/missions/*/rev/{rev#}.rev
- For MOMA, a cronjob on MINE699 (running /mine699/missions/moma/watch.sh) monitors the "rev" directory
- The XINA Commit Watch utility running on MINE699 detects the new rev file. Each mission has its own running instance of this utility.
- Runs command(s) as specified in configuration file:
/mine699/missions/*/config.watch.json
- Standard setup runs two commands: XINA Commit and XINA Import
- Once completed successfully, the rev file is moved to the rev archive:
/mine699/missions/*/rev/archive/
- The XINA Commit utility is run by XINA Commit Watch (above)
- Runs command(s) as specified in configuration file:
/mine699/missions/*/config.json
- Reads in rev file generated by svnlook
- For each file listed in the rev file, the file is checked out from that revision into a local directory:
/mine699/missions/*/local/{file}
- An additional directory is provided for temporary storage for mining processes if required:
/mine699/missions/*/temp
- Files can only be processed individually
- After single file mining operations are finished, the local and temp directories are emptied
- The XINA Import utility is run by XINA Commit Watch (above)
- Requires the XINA Tunnel utility to be running in order to connect to the XINA server (this is always kept running in the background on mine699)
- Reads and imports files in alphanumeric order from
/mine699/missions/*/import/{timestamp}_{rev#}
- Moves the files as they are completed to the current revision archive directory:
/mine699/mission/*/archive/
- Because the import process relies on operations happening sequentially, any errors shut down the import process until they are repaired
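The stages above can be sketched in shell. This is only an illustration under stated assumptions, not the actual XINA utilities (which are Java programs): the function names `write_rev_file`, `list_changed_files`, and `import_in_order` are hypothetical, and the sketch assumes the rev file holds the output of `svnlook changed`, which the text does not state explicitly.

```shell
#!/bin/sh
# Hypothetical sketch of three pipeline stages; not the real XINA code.

# Post-commit hook side: save the svnlook listing as {rev#}.rev.
# Assumption: the "standard format" is the output of `svnlook changed`.
write_rev_file() (
    repos=$1 rev=$2 rev_dir=$3
    partial="$rev_dir/.$rev.tmp"
    # Write under a temporary name first so a watcher never sees a partial file.
    if svnlook changed -r "$rev" "$repos" > "$partial"; then
        mv "$partial" "$rev_dir/$rev.rev"
    else
        rm -f "$partial"
        exit 1
    fi
)

# Commit side: print the path of every added or modified file in a rev file.
# `svnlook changed` lines carry status flags in the first columns and the
# path from column 5 on. Deleted entries (D) and directories (trailing /)
# are skipped.
list_changed_files() (
    while IFS= read -r line; do
        path=${line#????}            # drop the four status columns
        case $line in D*) continue ;; esac
        case $path in ''|*/) continue ;; esac
        printf '%s\n' "$path"
    done < "$1"
)

# Import side: pass files to an importer command in glob (sorted) order,
# moving each one to the archive as it completes; stop at the first failure.
import_in_order() (
    import_dir=$1 archive_dir=$2
    shift 2                          # remaining arguments: the importer command
    for f in "$import_dir"/*; do
        [ -f "$f" ] || continue
        "$@" "$f" || exit 1          # this file and all later ones stay in place
        mv "$f" "$archive_dir"/ || exit 1
    done
)
```

In a real hook, Subversion passes the repository path and revision number as the first two arguments, so a call such as `write_rev_file "$1" "$2" /mine699/missions/moma/rev` would be the natural entry point. Stopping at the first failed file mirrors the sequential-import rule: everything already moved to the archive is done, and everything still in the import directory is pending.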
Handling Import Errors
- XINA Commit Watch will not move on to the next rev file until the current file has been completed successfully
- If any operation in XINA Commit or XINA Import fails, the process returns an error code
- When XINA Commit Watch reads an error code from any sub-process it writes an empty file into the mission revision directory:
/mine699/mission/*/rev/rev.lock
- As long as a rev.lock file is present XINA Commit Watch will not attempt to process any rev files
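The locking rules above can be sketched as a small shell function. This is a hypothetical illustration, not the real XINA Commit Watch: the name `process_pending_revs`, the exit codes, and the choice to process rev files in ascending numeric order are all assumptions. The handler command stands in for whatever config.watch.json specifies.

```shell
#!/bin/sh
# Hypothetical sketch of the rev.lock behaviour; not the real XINA Commit Watch.
# Exit status: 0 = all pending revs processed, 1 = handler failed (lock
# written), 2 = a lock was already present and nothing was attempted.
process_pending_revs() (
    rev_dir=$1
    shift                            # remaining arguments: the handler command
    # A lock left by an earlier failure blocks all processing.
    if [ -e "$rev_dir/rev.lock" ]; then
        exit 2
    fi
    mkdir -p "$rev_dir/archive"
    # Numeric sort so that 9.rev is handled before 10.rev (assumed ordering).
    for name in $(ls "$rev_dir" | grep -E '^[0-9]+\.rev$' | sort -n); do
        if "$@" "$rev_dir/$name"; then
            mv "$rev_dir/$name" "$rev_dir/archive/"
        else
            : > "$rev_dir/rev.lock"  # empty lock file; stays until removed by hand
            exit 1
        fi
    done
)
```

Because the failed rev file is left in place, deleting rev.lock after a fix lets the next run retry that same revision before moving on to later ones.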
Debugging
Log output from XINA Commit Watch, XINA Commit, and XINA Import is stored in ```/mine699/mission/*/log```.
- Errors generated during the "commit phase" are generally easier to fix, and may indicate a problem with a configuration file or script. After a commit phase bug is fixed, the rev.lock file can simply be deleted. XINA Commit Watch will then make a clean attempt to re-process the entire revision.
- Errors generated during the "import phase" can be more complicated to fix because some operations may have already been imported before the failure, and running them again may cause new errors. It is necessary to check the archive directory for files that have already been imported.
Preload
- Full checkouts of each mission are kept in:
/mine699/mission/*/data
- This checkout is not related to the XINA Commit Watch process.
- Scripts are provided in each preload directory to create new task directories: setup.sh task_name
- Generated directories require a config.json file
- Scripts are provided to process and import the backlog:
- commit.sh, commit_bg.sh - run across all files (in foreground or background)
- commit_from.sh, commit_from_bg.sh - run starting at a TID (in foreground or background)
- import.sh, import_bg.sh - import all generated files (in foreground or background)
MINE699: The "tm.meta" file needs to have been added or updated for processing to continue. The "conditions" for how to handle a commit are defined in /mine699/missions/moma/config.watch.json.
MINE699: xina_commit_watch is started for MOMA with:
java -jar /mine699/app/x3/xina_commit_watch.jar \
-archive /mine699/mission/moma/archive \
-config /mine699/mission/moma/config.watch.json \
-data /mine699/mission/moma/data \
-import /mine699/mission/moma/import \
-java /mine699/env/jre1.8.0_45/bin/java \
-label MOMA \
-mailto "nick.dobson@gmail.com;eric.i.lyness@nasa.gov" \
-mailhost mailhost.gsfc.nasa.gov \
-python2 /mine699/env/python2.7/bin/python2.7 \
-python3 /mine699/env/python3.4/bin/python3.4 \
-rev /mine699/mission/moma/rev \
-svn /usr/bin/svn \
-svnlook /usr/bin/svnlook \
-timeout 43200 >> /mine699/mission/moma/log/watch.log 2>&1 &
MINE699: xina_commit_watch will call /mine699/missions/moma/commit.sh
/mine699/env/jre1.8.0_45/bin/java -jar /mine699/app/xina_commit_6.9.0.jar \
-mode remote \
-config /mine699/mission/moma/config.json \
-import $1 \
-java /mine699/env/jre1.8.0_45/bin/java \
-local /mine699/mission/moma/local \
-python2 /mine699/env/python2.7/bin/python2.7 \
-python3 /mine699/env/python3.4/bin/python3.4 \
-repo momadata \
-rev $2 \
-svn /usr/bin/svn \
-svnlook /mine699/mission/moma/rev/$2.rev \
-temp /mine699/mission/moma/temp \
-url "svn://momaioc/" >> /mine699/mission/moma/log/commit.log 2>&1
MINE699: xina_commit_watch will call /mine699/missions/moma/import.sh
/mine699/env/jre1.8.0_45/bin/java -jar /mine699/app/x3/xina_import.jar \
-port 42000 \
-fdelay 1000 \
-dir $1 \
-movejson $2 \
-movefile $2 >> /mine699/mission/moma/log/import.log 2>&1