From 3a47efd361aa377d66e65544d8f755e320b81d9f Mon Sep 17 00:00:00 2001
From: eugenefischer <66030419+eugenefischer@users.noreply.github.com>
Date: Wed, 28 Sep 2022 03:01:03 -0500
Subject: [PATCH] Update TODO

---
 readme.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/readme.md b/readme.md
index bbc6f90..4d4f616 100644
--- a/readme.md
+++ b/readme.md
@@ -200,6 +200,10 @@ then use it for multiple different BiGpairSEQ simulations.
 Options for creating a Graph/Data file:
 * The Cell Sample file to use
 * The Sample Plate file to use. (This must have been generated from the selected Cell Sample file.)
+* Whether to simulate sequence read depth. If simulated:
+  * The read depth (number of times each sequence is read)
+  * The read error rate (probability that a sequence is misread)
+  * The error collision rate (probability that two misreads produce the same spurious sequence)
 
 These files do not have a human-readable structure, and are not portable to other programs.
 
@@ -265,9 +269,7 @@ P-values are calculated *after* BiGpairSEQ matching is completed, for purposes o
 using the (2021 corrected) formula from the original pairSEQ paper. (Howie, et al. 2015)
 
 
-## PERFORMANCE
-
-(NOTE: These results are from an older, less efficient version of the simulator, and need to be updated.)
+## PERFORMANCE (old results; need updating to reflect current, improved simulator performance)
 
 On a home computer with a Ryzen 5600X CPU, 64GB of 3200MHz DDR4 RAM (half of which was allocated to the Java Virtual Machine), and a PCIe 3.0 SSD, running Linux Mint 20.3 Edge (5.13 kernel), 
 the author ran a BiGpairSEQ simulation of a 96-well sample plate with 30,000 T cells/well comprising ~11,800 alphas and betas,
@@ -357,8 +359,9 @@ roughly as though it had a constant well population equal to the plate's average
 * ~~Implement simulation of read depth, and of read errors. Pre-filter graph for difference in read count to eliminate spurious sequences.~~ DONE
   * Pre-filtering based on comparing (read depth) * (occupancy) to (read count) for each sequence works extremely well
 * ~~Add read depth simulation options to CLI~~ DONE
+* ~~Update graphml output to reflect current Vertex class attributes~~ DONE
+  * Individual well data from the SequenceRecords could be included, if there's ever a reason for it
 * Update matching metadata output options in CLI
-* Update graphml output to reflect current Vertex class attributes
 * Update performance data in this readme
 * Re-implement CDR1 matching method
 * Refactor simulator code to collect all needed data in a single scan of the plate