update readme and default heap type

eugenefischer
2025-04-09 11:18:21 -05:00
parent 0071cafbbd
commit fbc0496675
2 changed files with 24 additions and 24 deletions

@@ -1,31 +1,31 @@
# BiGpairSEQ SIMULATOR
## CONTENTS
1. ABOUT
2. THEORY
3. THE BiGpairSEQ ALGORITHM
4. USAGE
1. RUNNING THE PROGRAM
2. COMMAND LINE OPTIONS
3. INTERACTIVE INTERFACE
4. INPUT/OUTPUT
1. [ABOUT](#about)
2. [THEORY](#theory)
3. [THE BiGpairSEQ ALGORITHM](#the-bigpairseq-algorithm)
4. [USAGE](#usage)
1. [RUNNING THE PROGRAM](#running-the-program)
2. [COMMAND LINE OPTIONS](#command-line-options)
3. [INTERACTIVE INTERFACE](#interactive-interface)
4. [INPUT/OUTPUT](#inputoutput)
1. Cell Sample Files
2. Sample Plate Files
3. Graph/Data Files
4. Matching Results Files
5. RESULTS
5. [RESULTS](#results)
1. SAMPLE PLATES WITH VARYING NUMBERS OF CELLS PER WELL
2. SIMULATING EXPERIMENTS FROM pairSEQ PAPER
6. TODO
7. CITATIONS
8. ACKNOWLEDGEMENTS
9. AUTHOR
10. DISCLOSURE
6. [TODO](#todo)
7. [CITATIONS](#citations)
8. [ACKNOWLEDGEMENTS](#acknowledgements)
9. [AUTHOR](#author)
10. [DISCLOSURE](#disclosure)
## ABOUT
This program simulates BiGpairSEQ (Bipartite Graph pairSEQ), a graph theory-based adaptation
of the pairSEQ algorithm (Howie, et al. 2015) for pairing T cell receptor sequences.
of the pairSEQ algorithm ([Howie, et al. 2015](#citations)) for pairing T cell receptor sequences.
## THEORY
@@ -51,17 +51,17 @@ matching (MWM) on a bipartite graph--the subset of vertex-disjoint edges whose w
This is a well-studied combinatorial optimization problem, with many known algorithms that produce
provably-optimal solutions. The most theoretically efficient algorithm known to the author for maximum weight matching of a bipartite
graph with strictly integral weights is from Duan and Su (2012). For a graph with m edges, n vertices per side,
graph with strictly integral weights is from [Duan and Su (2012)](#citations). For a graph with m edges, n vertices per side,
and maximum integer edge weight N, their algorithm runs in **O(m sqrt(n) log(N))** time. As the graph representation of
a pairSEQ experiment is bipartite with integer weights, this algorithm seems ideal for BiGpairSEQ. Unfortunately, it's a
fairly new algorithm, and not yet implemented by the graph theory library used in this simulator (JGraphT), nor has the author had
time to implement it himself.
a pairSEQ experiment is bipartite with integer weights, this algorithm seems ideal for BiGpairSEQ. Unfortunately, it is not
implemented by the graph theory library used in this simulator (JGraphT), and the author has not yet had time to write a
full, optimized implementation himself for testing.
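
To give a rough feel for how the two bounds quoted in this section scale, the sketch below evaluates their growth terms for one hypothetical graph. The sizes `n`, `m`, and `max_weight` are illustrative assumptions, not measurements from the simulator, and big-O notation hides constant factors, so the ratio says nothing certain about real running times.

```python
import math


def duan_su_growth(m, n, max_weight):
    """Growth term m * sqrt(n) * log(N) from the Duan and Su bound."""
    return m * math.sqrt(n) * math.log2(max_weight)


def fredman_tarjan_growth(m, n):
    """Growth term n * (n * log(n) + m) from the Fibonacci heap bound."""
    return n * (n * math.log2(n) + m)


# Hypothetical graph: 1,000 vertices per side, 50,000 edges,
# maximum integer edge weight 96 (all three values are assumptions).
n, m, max_weight = 1_000, 50_000, 96

fast = duan_su_growth(m, n, max_weight)
slow = fredman_tarjan_growth(m, n)
print(f"m sqrt(n) log(N)  ~ {fast:.3g}")
print(f"n (n log(n) + m)  ~ {slow:.3g}")
print(f"ratio             ~ {slow / fast:.1f}")
```

The base of the logarithm only changes the result by a constant factor, so base 2 is used throughout for convenience.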
So this program instead uses the [Fibonacci heap](https://en.wikipedia.org/wiki/Fibonacci_heap) based algorithm of Fredman and Tarjan (1987) (essentially
[the Hungarian algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) augmented with a more efficeint priority queue) which has a worst-case
runtime of **O(n (n log(n) + m))**. The algorithm is implemented as described in Melhorn and Näher (1999). (The simulator
allows the substitution of a [pairing heap](https://en.wikipedia.org/wiki/Pairing_heap) for a Fibonacci heap, though the relative performance difference of the two
has not yet been thoroughly tested.)
[the Hungarian algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) augmented with a more efficient priority queue) which has a worst-case
runtime of **O(n (n log(n) + m))**. The algorithm is implemented as described in [Mehlhorn and Näher (1999)](#citations). (The simulator can use either a
Fibonacci heap or a [pairing heap](https://en.wikipedia.org/wiki/Pairing_heap) as desired. By default, a pairing heap is used,
as pairing heaps often perform better in practice.)
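
For readers unfamiliar with pairing heaps, here is a minimal sketch of the data structure in Python. It is not the implementation used by the simulator or by JGraphT; the names `PairingHeapNode`, `meld`, `insert`, and `delete_min` are chosen here for illustration, and the decrease-key operation that a matching algorithm would also need is omitted for brevity.

```python
class PairingHeapNode:
    """One node of a min-ordered pairing heap (illustrative only)."""

    def __init__(self, key):
        self.key = key
        self.children = []


def meld(a, b):
    """Merge two heaps: the root with the larger key becomes a child of the other."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a


def insert(heap, key):
    """Insert a key by melding a single-node heap into the existing heap."""
    return meld(heap, PairingHeapNode(key))


def delete_min(heap):
    """Remove the root; return (min_key, new_heap) using two-pass pairing."""
    children = heap.children
    # First pass: meld the children in pairs, left to right.
    paired = []
    for i in range(0, len(children), 2):
        right = children[i + 1] if i + 1 < len(children) else None
        paired.append(meld(children[i], right))
    # Second pass: meld the resulting trees right to left.
    new_heap = None
    for tree in reversed(paired):
        new_heap = meld(tree, new_heap)
    return heap.key, new_heap


# Usage: insert a few keys, then drain the heap in ascending order.
heap = None
for key in [5, 3, 8, 1, 4]:
    heap = insert(heap, key)

drained = []
while heap is not None:
    smallest, heap = delete_min(heap)
    drained.append(smallest)
print(drained)
```

Insertion and melding are constant-time pointer updates, and all restructuring work is deferred to `delete_min`. That simplicity is one commonly cited reason pairing heaps can do well in practice despite weaker proven bounds than Fibonacci heaps.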
One possible advantage of this less efficient algorithm is that the Hungarian algorithm and its variations work with both the balanced and the unbalanced assignment problem
(that is, cases where both sides of the bipartite graph have the same number of vertices and those in which they don't).
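
To make the unbalanced case concrete, the sketch below finds a maximum weight matching on a tiny bipartite graph with two vertices on one side and three on the other. It uses exhaustive search rather than the Fredman and Tarjan algorithm described above, so it is only practical for toy inputs; the function name and the weight matrix are hypothetical, and a weight of 0 is assumed to mean "no edge".

```python
from itertools import permutations


def max_weight_matching_bruteforce(weights):
    """Exhaustively search for a maximum weight bipartite matching.

    weights[i][j] is the integer weight of the edge between left vertex i
    and right vertex j, with 0 meaning no edge. Assumes the left side is
    no larger than the right side. Runtime is factorial: toy inputs only.
    """
    n_left, n_right = len(weights), len(weights[0])
    best_total, best_pairs = 0, []
    # Try every way of giving each left vertex a distinct right vertex.
    for assignment in permutations(range(n_right), n_left):
        # Keep only the pairs that correspond to real edges.
        pairs = [(i, j) for i, j in enumerate(assignment) if weights[i][j] > 0]
        total = sum(weights[i][j] for i, j in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


# Unbalanced example: 2 left vertices, 3 right vertices.
weights = [
    [5, 4, 0],
    [6, 0, 1],
]
total, pairs = max_weight_matching_bruteforce(weights)
print(total, pairs)  # prints: 10 [(0, 1), (1, 0)]
```

Note that one right-side vertex is necessarily left unmatched here. The chosen edges are vertex-disjoint, and no other vertex-disjoint set of edges has a larger total weight.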