diff --git a/readme.md b/readme.md
index ff06da2..629b75e 100644
--- a/readme.md
+++ b/readme.md
@@ -1,14 +1,45 @@
 # BiGpairSEQ SIMULATOR
 
+
 ## ABOUT
 
-This program simulates BiGpairSEQ, a graph theory based adaptation
+This program simulates BiGpairSEQ (Bipartite Graph pairSEQ), a graph theory-based adaptation
 of the pairSEQ algorithm (Howie et al. 2015) for pairing T cell receptor sequences.
 
+## THEORY
+
+Unlike pairSEQ, which calculates p-values for every TCR alpha/beta overlap and compares
+against a null distribution, BiGpairSEQ does not do any statistical calculations
+directly.
+
+BiGpairSEQ creates a [simple bipartite weighted graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
+The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that share a well
+are connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
+(Sequences in all wells are filtered out prior to creating the graph, as there is no signal in their occupancy pattern.)
+The problem of pairing TCRA/TCRB sequences thus reduces to the "assignment problem" of finding a maximum weight
+matching on a bipartite graph--the subset of vertex-disjoint edges whose weights sum to the maximum possible value.
+
+This is a well-studied combinatorial optimization problem, with many known solutions.
+The best currently-known algorithm for bipartite graphs with integer weights--which is what BiGpairSEQ uses--
+is from Duan and Su (2012). For a graph with m edges, n vertices per side, and maximum integer edge weight N, 
+their algorithm runs in **O(m sqrt(n) log(N))** time. This is the best known efficiency for finding a maximum weight
+matching on a bipartite graph, and the integer edge weight requirement makes it ideal for BiGpairSEQ.
+
+Unfortunately, it's a fairly new algorithm, and the integer edge weight requirement makes it less generically useful.
+It is not implemented by the graph theory library used in this simulator. So this program
+instead uses the Fibonacci heap-based algorithm of Fredman and Tarjan (1987), which has a worst-case
+runtime of **O(n (n log(n) + m))**. The algorithm is implemented as described in Melhorn and Näher (1999).
+
+The current version of the program uses a pairing heap instead of a Fibonacci heap for its priority queue,
+which has lower theoretical efficiency but also lower complexity overhead, and is often equivalently performant
+in practice.
+
 ## USAGE
 
-Released as an executable .jar file with interactive, command line UI.
-Requires Java11 or higher (openjdk-17 recommended).
+### RUNNING THE PROGRAM
+
+BiGpairSEQ_Sim is an executable .jar file. Requires Java 11 or higher. [OpenJDK 17](https://jdk.java.net/17/)
+recommended.
 
 Run with the command:
 
@@ -20,7 +51,24 @@ For example, to run the program with 32 gigabytes of memory, use the command:
 
 `java -Xmx32G -jar BiGpairSEQ_Sim.jar`
 
-## OUTPUT
+Once running, BiGpairSEQ_Sim has an interactive, menu-driven CLI for generating files and simulating TCR pairing. The
+main menu looks like this:
+
+```
+--------BiGPairSEQ SIMULATOR--------
+ALPHA/BETA T-CELL RECEPTOR MATCHING
+  USING WEIGHTED BIPARTITE GRAPHS
+------------------------------------
+Please select an option:
+1) Generate a population of distinct cells
+2) Generate a sample plate of T cells
+3) Generate CDR3 alpha/beta occupancy data and overlap graph
+4) Simulate bipartite graph CDR3 alpha/beta matching (BiGpairSEQ)
+9) About/Acknowledgments
+0) Exit
+```
+
+### OUTPUT
 
 To run the simulation, the program reads and writes 4 kinds of files:
 * Cell Sample files in CSV format
@@ -28,7 +76,10 @@ To run the simulation, the program reads and writes 4 kinds of files:
 * Graph and Data files in binary object serialization format
 * Matching Results files in CSV format
 
-### -- Cell Sample Files --
+When entering filenames, it is not necessary to include the file extension (.csv or .ser). When reading or
+writing files, the program will automatically add the correct extension to any filename without one.
+
+#### Cell Sample Files
 Cell Sample files consist of any number of distinct "T cells." Every cell contains 
 four sequences: Alpha CDR3, Beta CDR, Alpha CDR1, Beta CDR1. The sequences are represented by
 random integers. CDR3 Alpha and Beta sequences are all unique. CDR1 Alpha and Beta sequences
@@ -55,7 +106,7 @@ Structure:
 
 **NOTE:** Matching of CDR1s is currently awaiting re-implementation.
 
-### -- Sample Plate Files --
+#### Sample Plate Files
 Sample Plate files consist of any number of "wells" containing any number of T cells (as 
 described above). The wells are filled randomly from a Cell Sample file, according to a selected
 frequency distribution. Additionally, every individual sequence within each cell may, with some
@@ -90,7 +141,13 @@ Dropouts are represented by replacing sequences with the value `-1`. Comments ar
 Structure:
 
 ---
-
+```
+# Cell source file name:
+# Each row represents one well on the plate
+# Plate size:
+# Concentrations:
+# Lambda: 
+```
 | Well 1, cell 1 | Well 1, cell 2 | Well 1, cell 3| ... |
 |---|---|---|---|
 | **Well 2, cell 1** | **Well 2, cell 2** | **Well 2, cell 3**| ... |
@@ -99,7 +156,7 @@ Structure:
 
 ---
 
-### -- Graph and Data Files --
+#### Graph and Data Files
 Graph and Data files are serialized binaries of a Java object containing the graph representation of a
 Sample Plate and necessary metadata for matching and results output. Making them requires a Cell Sample file (to construct a list of correct sequence pairs for checking
 the accuracy of BiGpairSEQ simulations) and a Sample Plate file (to construct the associated
@@ -115,13 +172,11 @@ Options for creating a Graph and Data file:
 * The Cell Sample file to use
 * The Sample Plate file (generated from the given Cell Sample file) to use.
 
-### -- Matching Results Files --
+#### Matching Results Files 
 Matching results files consist of the results of a BiGpairSEQ matching simulation.
 Files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
 Metadata about the matching simulation is included as comments. Comments are preceded by `#`.
 
-
-
 Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
 * The minimum number of alpha/beta overlap wells to attempt to match
   * (must be >= 1)
@@ -136,19 +191,20 @@ Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
 Sample File Structure:
 
 ---
-
-    # T cell counts in sample plate wells: 5000
-    # Total alphas found: 3387
-    # Total betas found: 3396
-    # High overlap threshold: 94
-    # Low overlap threshold: 3
-    # Minimum overlap percent: 0
-    # Maximum occupancy difference: 50
-    # Pairing attempt rate: 0.488
-    # Correct pairings: 1650
-    # Incorrect pairings: 4
-    # Pairing error rate: 0.00242
-    # Simulation time: 19 seconds
+```
+# T cell counts in sample plate wells: 5000
+# Total alphas found: 3387
+# Total betas found: 3396
+# High overlap threshold: 94
+# Low overlap threshold: 3
+# Minimum overlap percent: 0
+# Maximum occupancy difference: 50
+# Pairing attempt rate: 0.488
+# Correct pairings: 1650
+# Incorrect pairings: 4
+# Pairing error rate: 0.00242
+# Simulation time: 19 seconds
+```
 
 | Alpha | Alpha well count | Beta | Beta well count | Overlap count | Matched Correctly? | P-value |
 |---|---|---|---|---|---|---|
@@ -159,35 +215,9 @@ Sample File Structure:
 ---
 
 **NOTE: The p-values in the output are not used for matching**—they aren't part of the BiGpairSEQ algorithm at all. 
-P-values are calculated *after* BiGpairSEQ matching is completed for purposes of comparison, 
+P-values are calculated *after* BiGpairSEQ matching is completed, for purposes of comparison, 
 using the (2021 corrected) formula from the original pairSEQ paper. (Howie, et al. 2015)
 
-## THEORY
-
-Unlike pairSEQ, which calculates p-values for every TCR alpha/beta overlap and compares
-against a null distribution, BiGpairSEQ does not do any statistical calculations
-directly. Instead, BiGpairSEQ creates a simple bipartite weighted graph representing the sample plate.
-The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that share a well
-are connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
-(Sequences in all wells are filtered out prior to creating the graph, as there is no signal in their occupancy
-The problem of pairing TCRA/TCRB sequences thus reduces to the "assignment problem" of finding a maximum weight
-matching on a bipartite graph--the subset of vertex-disjoint edges whose weights sum to the maximum possible value.
-
-This is a very well-studied combinatorial optimization problem, with many known solutions.
-The best currently-known algorithm for bipartite graphs with integer weights is from Duan and Su (2012).
-For a graph with m edges, n vertices per side, and maximum integer edge weight N, their algorithm runs
-in O(m sqrt(n) log(N)) time. With its best-known efficiency and requirement of integer weights, this
-algorithm is ideal for BiGpairSEQ.
-
-Unfortunately, the qualities that make it ideal for BiGpairSEQ make it less generically useful,
-and it is not implemented by the graph theory library used in this simulator. So this program
-instead uses the Fibonacci heap-based algorithm of Fredman and Tarjan (1984), which has a worst-case
-runtime of O(n (n log(n) + m)). The algorithm is implemented as described in Melhorn and Näher (1999).
-
-The current version of the program uses a pairing heap instead of a Fibonacci heap for its priority queue, 
-which has lower theoretical efficiency but also lower complexity overhead, and is often equivalently performant
-in practice.
-
 ## TODO
 
 * Try invoking GC at end of workloads to reduce paging to disk
@@ -203,27 +233,28 @@ in practice.
   * in theory Fibonacci heap should be more efficient, but complexity overhead may eliminate theoretical advantage
   * Add controllable heap-type parameter?
 * Implement sample plates with random numbers of T cells per well
-  * Possible BiGpairSEQ advantage: BiGpairSEQ is resilient to variations in well populations; pairSEQ is not.
+  * Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well populations; pairSEQ is not.
     * preliminary data suggests that BiGpairSEQ behaves roughly as though the whole plate had whatever the *average* well concentration is, but that's still speculative.
 * See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows
   * Problem is variable number of cells in a well
   * Apache Commons CSV library writes entries a row at a time
-
-## EXTERNAL LIBRARIES USED
-* [JGraphT](https://jgrapht.org) -- Graph theory data structures and algorithms
-* [JHeaps](https://www.jheaps.org) -- For pairing heap priority queue used in maximum weight matching algorithm
-* [Apache Commons CSV](https://commons.apache.org/proper/commons-csv/) -- For CSV file output
-* [Apache Commons CLI](https://commons.apache.org/proper/commons-cli/) -- To enable command line arguments for scripting. (**Awaiting reimplementation**.)
-
+    * Can possibly sort the wells by length first, then construct entries 
+  
 ## CITATIONS
 * Howie, B., Sherwood, A. M., et al. ["High-throughput pairing of T cell receptor alpha and beta sequences."](https://pubmed.ncbi.nlm.nih.gov/26290413/) Sci. Transl. Med. 7, 301ra131 (2015)
 * Duan, R., Su H. ["A Scaling Algorithm for Maximum Weight Matching in Bipartite Graphs."](https://web.eecs.umich.edu/~pettie/matching/Duan-Su-scaling-bipartite-matching.pdf) Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, p. 1413-1424. (2012)
 * K. Melhorn, St. Näher. [The LEDA Platform of Combinatorial and Geometric Computing.](https://people.mpi-inf.mpg.de/~mehlhorn/LEDAbook.html) Cambridge University Press. Chapter 7, Graph Algorithms; p. 132-162 (1999)
 * M. Fredman, R. Tarjan. ["Fibonacci heaps and their uses in improved network optimization algorithms."](https://www.cl.cam.ac.uk/teaching/1011/AlgorithII/1987-FredmanTar-fibonacci.pdf) J. ACM, 34(3):596–615 (1987))
 
+## EXTERNAL LIBRARIES USED
+* [JGraphT](https://jgrapht.org) -- Graph theory data structures and algorithms
+* [JHeaps](https://www.jheaps.org) -- For pairing heap priority queue used in maximum weight matching algorithm
+* [Apache Commons CSV](https://commons.apache.org/proper/commons-csv/) -- For CSV file output
+* [Apache Commons CLI](https://commons.apache.org/proper/commons-cli/) -- To enable command line arguments for scripting. (**Awaiting re-implementation**.)
+
 ## ACKNOWLEDGEMENTS
-BiGpairSEQ was conceived in collaboration with Dr. Alice MacQueen, who brought the original pairSEQ paper to the author's attention
-and explained all the biology terms he didn't know.
+BiGpairSEQ was conceived in collaboration with Dr. Alice MacQueen, who brought the original 
+pairSEQ paper to the author's attention and explained all the biology terms he didn't know.
 
 ## AUTHOR
 Eugene Fischer, 2021. UI improvements and documentation, 2022.
\ No newline at end of file