diff --git a/readme.md b/readme.md
index d8d7b6d..bbc6f90 100644
--- a/readme.md
+++ b/readme.md
@@ -342,24 +342,25 @@ roughly as though it had a constant well population equal to the plate's average
 * ~~Add controllable heap-type parameter?~~
   * Parameter implemented. Fibonacci heap the current default.
 * ~~Implement sample plates with random numbers of T cells per well.~~ DONE
-  * Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well population sizes on a sample plate; pairSEQ is not.
+  * Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well population sizes on a sample plate; pairSEQ is not due to nature of probability calculations.
   * preliminary data suggests that BiGpairSEQ behaves roughly as though the whole plate had whatever the *average* well concentration is, but that's still speculative.
-* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
+* ~~See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.~~
   * ~~Problem is variable number of cells in a well~~
   * ~~Apache Commons CSV library writes entries a row at a time~~
-  * _Got this working, but at the cost of a profoundly strange bug in graph occupancy filtering. Have reverted the repo until I can figure out what caused that. Given how easily Thingiverse transposes CSV matrices in R, might not even be worth fixing.
+  * Got this working, but at the cost of a profoundly strange bug in graph occupancy filtering. Have reverted the repo until I can figure out what caused that. Given how easily Thingiverse transposes CSV matrices in R, might not even be worth fixing.
 * ~~Enable GraphML output in addition to serialized object binaries, for data portability~~ DONE
   * ~~Have a branch where this is implemented, but there's a bug that broke matching. Don't currently have time to fix.~~
 * ~~Re-implement command line arguments, to enable scripting and statistical simulation studies~~ DONE
 * ~~Implement custom Vertex class to simplify code and make it easier to implement different MWM algorithms~~ DONE
   * Advantage: would eliminate the need to use maps to associate vertices with sequences, which would make the code easier to understand.
   * This also seems to be faster when using the same algorithm than the version with lots of maps, which is a nice bonus!
-* Re-implement CDR1 matching method
 * ~~Implement simulation of read depth, and of read errors. Pre-filter graph for difference in read count to eliminate spurious sequences.~~ DONE
   * Pre-filtering based on comparing (read depth) * (occupancy) to (read count) for each sequence works extremely well
   * ~~Add read depth simulation options to CLI~~ DONE
 * Update matching metadata output options in CLI
+* Update graphml output to reflect current Vertex class attributes
 * Update performance data in this readme
+* Re-implement CDR1 matching method
 * Refactor simulator code to collect all needed data in a single scan of the plate
   * Currently it scans once for the vertices and then again for the edge weights. This made simulating read depth awkward, and incompatible with caching of plate files.
   * This would be a fairly major rewrite of the simulator code, but could make things faster, and would definitely make them cleaner.