Add data on randomized well population behavior

2022-03-02 22:54:17 -06:00
parent 03e8d31210
commit df047267ee
1 changed file with 5 additions and 0 deletions
--- a/readme.md
+++ b/readme.md
@@ -264,6 +264,7 @@ Example output:
 P-values are calculated *after* BiGpairSEQ matching is completed, for purposes of comparison only, 
 using the (2021 corrected) formula from the original pairSEQ paper. (Howie, et al. 2015)

+
 ## PERFORMANCE

 On a home computer with a Ryzen 5600X CPU, 64GB of 3200MHz DDR4 RAM (half of which was allocated to the Java Virtual Machine), and a PCIe 3.0 SSD, running Linux Mint 20.3 Edge (5.13 kernel), 
@@ -279,6 +280,9 @@ Since this implementation of BiGpairSEQ writes intermediate results to disk (to
 with different filtering options), the actual elapsed time was greater. File I/O time was not measured, but took 
 slightly less time than the simulation itself. Real elapsed time from start to finish was under 30 minutes.

+As mentioned in the theory section, performance could be improved by implementing a more efficient algorithm for finding
+the maximum weighted matching.
+
 ## BEHAVIOR WITH RANDOMIZED WELL POPULATIONS

 A series of BiGpairSEQ simulations were conducted using a cell sample file of 3.5 million unique T cells. From these cells,
@@ -294,6 +298,7 @@ The well populations of the plates were:
 * Five sample plates with each individual well's population randomized, from 1000 to 5000 T cells. (Average population ~3000 T cells/well.)

 All BiGpairSEQ simulations were run with a low overlap threshold of 3 and a high overlap threshold of 94.
+No optional filters were used, so pairing was attempted for all sequences with overlaps within the threshold values.

 Constant well population plate results: