Remove questionable claim, reorder simulation experiments
50
readme.md
@@ -331,7 +331,6 @@ Options when making a Sample Plate file:
* Standard deviation size
* Exponential
* Lambda value
* *(Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was very roughly exponential with a lambda ~0.6. (Howie, et al. 2015) The actual distribution was certainly quite different.)*
* Total number of wells on the plate
* Well populations random or fixed
* If random, minimum and maximum population sizes
@@ -474,28 +473,7 @@ Several BiGpairSEQ simulations were performed on a home computer with the follow
* 2TB PCIe 3.0 SSD
* Linux Mint 21 (5.15 kernel)

### Simulation 1
This simulation was an attempt to replicate the conditions of experiment 1 from the 2015 pairSEQ paper: a matching was found for a
96-well sample plate with 4,000 T cells/well comprising ~11,900 TCRAs and TCRBs, taken from a sample of 8,400,000
distinct cells with an exponential frequency distribution (lambda 0.6). The sequence dropout rate was 10%, as the analysis
from the original paper concluded that most TCR sequences "have less than a 10% chance of going unobserved." (Howie, et al. 2015)

The original paper does not contain (or the author of this document failed to identify) information on sequencing depth,
read error probability, or the probabilities of different kinds of read error collisions. As the pre-filtering of BiGpairSEQ
has successfully filtered out all such errors for any reasonable error rates the author has yet tested, this simulation was
done without any sequencing errors, to reduce the processing time.

With min/max occupancy thresholds of 3 and 95 wells respectively for matching, BiGpairSEQ identified:
* 8,495 correct pairings
* 5 incorrect pairings

for an overall pairing accuracy of 99.94%.

The total simulation time (excluding file I/O) was 28m52s. The total elapsed time with file I/O was 41m23s.
Calculation of p-values was enabled for this simulation, increasing the overall processing time.


## BEHAVIOR WITH RANDOMIZED WELL POPULATIONS (old results, need updating for new version of the simulator (though resilience to varying well populations is unchanged))
### SAMPLE PLATES WITH VARYING NUMBERS OF CELLS PER WELL (old results, need updating for new version of the simulator (though resilience to varying well populations is unchanged))

A series of BiGpairSEQ simulations were conducted using a cell sample file of 3.5 million unique T cells. From these cells,
10 sample plate files were created. All of these sample plates had 96 wells, used an exponential distribution with a lambda of 0.6, and
@@ -540,6 +518,32 @@ The average results for the randomized plates are closest to the constant plate
This and several other tests indicate that BiGpairSEQ treats a sample plate with a highly variable number of T cells/well
roughly as though it had a constant well population equal to the plate's average well population.

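As a rough illustration of the comparison described above, the sketch below draws a random population for each well between a minimum and a maximum, then reports the plate's average, which is the value a constant-population plate would use for comparison. Uniform sampling, the function name, and the example bounds are assumptions made for illustration only; they are not taken from BiGpairSEQ itself.

```python
import random
from statistics import mean


def random_well_populations(num_wells, min_pop, max_pop, seed=None):
    """Draw one population size per well, uniformly from [min_pop, max_pop] inclusive."""
    rng = random.Random(seed)
    return [rng.randint(min_pop, max_pop) for _ in range(num_wells)]


# Hypothetical example: 96 wells, with bounds chosen only for illustration.
populations = random_well_populations(96, 1000, 7000, seed=0)
average_population = mean(populations)
print(f"average well population: {average_population:.0f}")
```

A plate with every well fixed at `average_population` would be the constant-population counterpart of this randomized plate.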
### EXPERIMENTS FROM THE 2015 pairSEQ PAPER
#### Experiment 1
This simulation was an attempt to replicate the conditions of experiment 1 from the 2015 pairSEQ paper: a matching was found for a
96-well sample plate with 4,000 T cells/well comprising ~11,900 TCRAs and TCRBs, taken from a sample of 8,400,000
distinct cells with an exponential frequency distribution (lambda 0.6). The sequence dropout rate was 10%, as the analysis
from the original paper concluded that most TCR sequences "have less than a 10% chance of going unobserved." (Howie, et al. 2015)

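The 10% sequence dropout rate can be pictured as an independent coin flip for each sequence in each well. The following is a minimal sketch under that assumption; how BiGpairSEQ applies dropout internally is not shown in this document, and the function name and placeholder sequence labels are hypothetical.

```python
import random


def apply_dropout(well, dropout_rate, rng):
    """Return the sequences that survive; each is dropped with probability dropout_rate."""
    return [seq for seq in well if rng.random() >= dropout_rate]


rng = random.Random(42)
well = [f"TCRA_{i}" for i in range(10_000)]  # placeholder labels, not real sequences
observed = apply_dropout(well, 0.10, rng)

# The surviving fraction should sit near 0.90 for a 10% dropout rate.
print(len(observed) / len(well))
```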
The original paper does not contain (or the author of this document failed to identify) information on sequencing depth,
read error probability, or the probabilities of different kinds of read error collisions. As the pre-filtering of BiGpairSEQ
has successfully filtered out all such errors for any reasonable error rates the author has yet tested, this simulation was
done without any sequencing errors, to reduce the processing time.

With min/max occupancy thresholds of 3 and 95 wells respectively for matching, BiGpairSEQ identified:
* 8,495 correct pairings
* 5 incorrect pairings

for an overall pairing accuracy of 99.94%.

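The occupancy thresholds and the accuracy figure can be sketched as follows. This is an illustration only: treating both thresholds as inclusive is an assumption, the function names and example sequences are hypothetical, and accuracy is computed as correct / (correct + incorrect) from the two counts listed above.

```python
def within_occupancy(occupancy, min_wells=3, max_wells=95):
    """True if a sequence's well count lies within the thresholds (assumed inclusive)."""
    return min_wells <= occupancy <= max_wells


def pairing_accuracy(correct, incorrect):
    """Fraction of reported pairings that are correct."""
    return correct / (correct + incorrect)


# Well counts for a few hypothetical sequences on a 96-well plate.
occupancies = {"seq_a": 2, "seq_b": 3, "seq_c": 50, "seq_d": 96}
eligible = [name for name, n in occupancies.items() if within_occupancy(n)]
print(eligible)  # seq_b and seq_c pass; seq_a and seq_d fall outside the thresholds

accuracy = pairing_accuracy(8495, 5)
print(f"{accuracy:.4%}")
```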
The total simulation time (excluding file I/O) was 28m52s. The total elapsed time with file I/O was 41m23s.
Calculation of p-values was enabled for this simulation, increasing the overall processing time.

Note that the frequency distribution of T cell clones in this simulation is only roughly that of

#### Experiment 2


## TODO

* ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE