= Translation Statistics =
=== By TC ===
== Description ==
This page is a user's guide for [[transStats.py]], a Python script meant for use with [[batchCorrelation]] to produce statistics about a region of interest in a shape model. The script expects input in the form of [[locations.txt]], which is one of the outputs of the batch correlation process.
== Usage ==
The syntax for this script is simple, as it has one specialized purpose. For general usage directions, run:
{{{
python transStats.py -h
}}}
This prints the usage header:
{{{
# USAGE: python transStats.py [-option] infile
#
# -h Use this to print usage directions
# to StdOut. (Prints this header)
#
# -v Use this to only output current
# version number and exit. By default
# the version will be sent to StdOut
# at the begining of each use.
#
###################################################
}}}
A typical invocation of this script is as follows:
{{{
python transStats.py locations.txt
}}}
If you do not specify an input file or any other command line argument, you will be prompted for a filename:
{{{
Input filename:
}}}
== Output ==
Once a valid input file is given, the calculated statistics are printed to the screen. On every successful invocation, the version number and input file are also printed for traceability.
{{{
transStats.py version: 1.1.1
Input file: locations.txt
Maximum: 42.3588285211
Minimum: 41.3077804618
Median: 41.748964396
Average: 41.7606542577 dx: -41.4010377 dy: 5.43761855
Standard Deviation: 0.280988725228
Marginal Bound from Average: +/- 5
Pass: 80.00% Marginal: 0.00% Fail: 20.00%
Correlation < 0.5: 4
0.5-0.6: 5
0.6-0.7: 7
0.7-0.8: 4
0.8-0.9: 0
> 0.9: 0
Pass:
EE0115
EE0116
EE0117
EE0118
EE0119
EE0127
EE0128
EE0129
EE0130
EE0131
EE0139
EE0140
EE0141
EE0142
EE0143
EE0162
EE0163
EE0164
EE0165
EE0166
Marginal:
Fail:
EE0151
EE0152
EE0153
EE0154
EE0155
}}}
The first step is the removal of outliers. Because of the nature of cross-correlation and shape models, maplets often correlate to the wrong location. This can happen for many reasons, but it generally means the topography in that area is weak and needs further processing. As of version 1.1.1 of [[transStats.py]], a maplet is classified as an outlier if its translation differs from the average translation of all the maplets by 100 pixels or more. This is a somewhat arbitrary criterion, but we felt it was safe to assume that a maplet correlating that far from the average translation is a fail. So far this criterion has been robust and has not excluded any good data.
The outliers are placed into the "Fail" category, and a new average is calculated. All of the statistics listed before the pass/marginal/fail percentages are calculated '''AFTER''' the removal of the outliers. Ideally the maximum, average, and median translation magnitudes are very close together, which indicates good agreement in the data.
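The outlier step and the summary statistics described above can be sketched roughly as follows. This is a minimal illustration and not the actual source of [[transStats.py]]: the function names, the dictionary of per-maplet (dx, dy) translations, the comparison of translation magnitudes against their mean, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
import statistics

OUTLIER_BOUND = 100.0  # pixels; the bound quoted for version 1.1.1


def split_outliers(translations, bound=OUTLIER_BOUND):
    """Split maplets into kept magnitudes and outlier names.

    translations: hypothetical dict mapping maplet name -> (dx, dy) in pixels.
    Assumption: a maplet is an outlier when its translation magnitude is
    `bound` or more pixels away from the mean magnitude of all maplets.
    """
    mags = {name: math.hypot(dx, dy) for name, (dx, dy) in translations.items()}
    mean_all = statistics.mean(list(mags.values()))
    kept = {n: m for n, m in mags.items() if abs(m - mean_all) < bound}
    outliers = sorted(n for n in mags if n not in kept)
    return kept, outliers


def summarize(kept):
    """Summary statistics computed only from the maplets that were kept."""
    values = list(kept.values())
    return {
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "average": statistics.mean(values),
        # Population standard deviation is an assumption of this sketch.
        "stdev": statistics.pstdev(values),
    }
```

Note that a mean-based bound like this only behaves well when outliers are a small fraction of the maplets, since a large outlier also shifts the mean it is compared against.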
After the statistics are calculated, the remaining data is split into pass and marginal categories. As of version 1.1.1, a maplet is classified as marginal if its translation is 5 or more pixels from the new average that was calculated after the removal of outliers. Again, this bound is largely arbitrary, but for our applications so far 5 pixels has worked well, as these maplets should probably still be looked at further. Future versions may support user-specified fail and/or marginal criteria.
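As a rough sketch of the pass/marginal/fail split, assuming the outliers have already been identified and the remaining maplets are held as a dictionary of translation magnitudes: the function names and data layout below are hypothetical and are not taken from [[transStats.py]] itself.

```python
import statistics

MARGINAL_BOUND = 5.0  # pixels; the bound quoted for version 1.1.1


def classify(kept, outliers, bound=MARGINAL_BOUND):
    """Sort maplets into Pass, Marginal and Fail lists.

    kept: hypothetical dict of maplet name -> translation magnitude,
          with outliers already removed.
    outliers: names of the maplets removed as outliers (these are the fails).
    """
    average = statistics.mean(list(kept.values()))
    result = {"Pass": [], "Marginal": [], "Fail": sorted(outliers)}
    for name, mag in sorted(kept.items()):
        key = "Marginal" if abs(mag - average) >= bound else "Pass"
        result[key].append(name)
    return result


def percentages(result):
    """Percentage of all maplets that fall in each category."""
    total = sum(len(names) for names in result.values())
    return {key: 100.0 * len(names) / total for key, names in result.items()}
```

In the sample output above, 20 of the 25 listed maplets pass and 5 fail, which matches the printed 80.00% and 20.00% when the percentages are taken over all maplets, outliers included.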
The last step is the calculation of a correlation histogram that shows how many maplets fall into each correlation score bin. For our purposes, anything 0.6 and above is very good, as we have seen that two resamplings of the same truth area give a correlation score of 0.833.
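The binning can be illustrated with a short sketch. The bin labels follow the sample output above; the function name and the handling of scores that fall exactly on a bin edge (counted in the higher bin here) are assumptions, since this page does not say how [[transStats.py]] treats them.

```python
from bisect import bisect_right

BIN_EDGES = [0.5, 0.6, 0.7, 0.8, 0.9]
BIN_LABELS = ["< 0.5", "0.5-0.6", "0.6-0.7", "0.7-0.8", "0.8-0.9", "> 0.9"]


def correlation_histogram(scores):
    """Count how many correlation scores fall into each bin.

    Assumption: a score equal to a bin edge is counted in the higher bin,
    so 0.5 lands in "0.5-0.6" and 0.9 lands in "> 0.9".
    """
    counts = [0] * len(BIN_LABELS)
    for score in scores:
        # bisect_right returns the number of edges <= score,
        # which is the index of the bin the score belongs to.
        counts[bisect_right(BIN_EDGES, score)] += 1
    return dict(zip(BIN_LABELS, counts))
```

For example, the 0.833 score mentioned above for two resamplings of the same truth area would be counted in the "0.8-0.9" bin.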