Compare commits
33 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 6f5afbc6ec | |
| | fb4d22e7a4 | |
| | e10350c214 | |
| | b1155f8100 | |
| | 12b003a69f | |
| | 32c5bcaaff | |
| | 2485ac4cf6 | |
| | 05556bce0c | |
| | a822f69ea4 | |
| | 3d1f8668ee | |
| | 40c743308b | |
| | 5246cc4a0c | |
| | a5f7c0641d | |
| | 8ebfc1469f | |
| | b53f5f1cc0 | |
| | 974d2d650c | |
| | 6b5837e6ce | |
| | b4cc240048 | |
| | ff72c9b359 | |
| | 88eb8aca50 | |
| | 98bf452891 | |
| | c2db4f87c1 | |
| | 8935407ade | |
| | 9fcc20343d | |
| | e4d094d796 | |
| | f385ebc31f | |
| | 8745550e11 | |
| | 41805135b3 | |
| | 373a5e02f9 | |
| | 7f18311054 | |
| | bcb816c3e6 | |
| | dad0fd35fd | |
| | 35d580cfcf | |
107
readme.md
@@ -12,7 +12,7 @@ Unlike pairSEQ, which calculates p-values for every TCR alpha/beta overlap and c
 against a null distribution, BiGpairSEQ does not do any statistical calculations
 directly.

-BiGpairSEQ creates a [weightd bipartite graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
+BiGpairSEQ creates a [weighted bipartite graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
 The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that share a well
 are connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
 (Sequences present in *all* wells are filtered out prior to creating the graph, as there is no signal in their occupancy pattern.)
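The graph construction described in the README hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not the BiGpairSEQ_Sim implementation: the class `OccupancyGraphSketch`, its method names, and the representation of a plate as one set of alpha ids and one set of beta ids per well are all hypothetical. Edge weights are stored in a plain map keyed by the (alpha, beta) pair rather than in a graph library.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names are hypothetical and not taken from the repository.
class OccupancyGraphSketch {

    // Counts how many wells each sequence id appears in (its occupancy).
    static Map<Integer, Integer> occupancy(List<Set<Integer>> wells) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Set<Integer> well : wells) {
            for (Integer seq : well) {
                counts.merge(seq, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Edge weight for an (alpha, beta) pair = number of wells containing both.
    // Sequences present in every well are skipped, since they carry no signal.
    static Map<List<Integer>, Integer> edgeWeights(List<Set<Integer>> alphaWells,
                                                   List<Set<Integer>> betaWells) {
        int numWells = alphaWells.size();
        Map<Integer, Integer> alphaOcc = occupancy(alphaWells);
        Map<Integer, Integer> betaOcc = occupancy(betaWells);
        Map<List<Integer>, Integer> weights = new HashMap<>();
        for (int w = 0; w < numWells; w++) {
            for (Integer a : alphaWells.get(w)) {
                if (alphaOcc.get(a) == numWells) continue; // saturated alpha
                for (Integer b : betaWells.get(w)) {
                    if (betaOcc.get(b) == numWells) continue; // saturated beta
                    weights.merge(List.of(a, b), 1, Integer::sum);
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Three wells; alpha 9 appears in all of them and is filtered out.
        List<Set<Integer>> alphaWells = List.of(Set.of(1, 9), Set.of(1, 2, 9), Set.of(2, 9));
        List<Set<Integer>> betaWells = List.of(Set.of(101), Set.of(101, 102), Set.of(102));
        System.out.println(edgeWeights(alphaWells, betaWells));
    }
}
```

A maximum weight matching over these edges would then pick the pairings; that step is left out here.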
@@ -48,8 +48,13 @@ For example, to run the program with 32 gigabytes of memory, use the command:

 `java -Xmx32G -jar BiGpairSEQ_Sim.jar`

-Once running, BiGpairSEQ_Sim has an interactive, menu-driven CLI for generating files and simulating TCR pairing. The
-main menu looks like this:
+There are a number of command line options, to allow the program to be used in shell scripts. For a full list,
+use the -help flag:
+
+`java -jar BiGpairSEQ_Sim.jar -help`
+
+If no command line arguments are given, BiGpairSEQ_Sim will launch with an interactive, menu-driven CLI for
+generating files and simulating TCR pairing. The main menu looks like this:

 ```
 --------BiGPairSEQ SIMULATOR--------
@@ -66,6 +71,19 @@ Please select an option:
 0) Exit
 ```

+By default, the Options menu looks like this:
+```
+--------------OPTIONS---------------
+1) Turn on cell sample file caching
+2) Turn on plate file caching
+3) Turn on graph/data file caching
+4) Turn off serialized binary graph output
+5) Turn on GraphML graph output
+6) Maximum weight matching algorithm options
+0) Return to main menu
+```
+
+
 ### INPUT/OUTPUT

 To run the simulation, the program reads and writes 4 kinds of files:
@@ -75,21 +93,25 @@ To run the simulation, the program reads and writes 4 kinds of files:
 * Matching Results files in CSV format

 These files are often generated in sequence. When entering filenames, it is not necessary to include the file extension
-(.csv or .ser). When reading or writing files, the program will automatically add the correct extension to any filename without one.
+(.csv or .ser). When reading or writing files, the program will automatically add the correct extension to any filename
+without one.

 To save file I/O time, the most recent instance of each of these four
-files either generated or read from disk can be cached in program memory. This is could be important for Graph/Data files,
-which can be several gigabytes in size. Since some simulations may require running multiple,
-differently-configured BiGpairSEQ matchings on the same graph, keeping the most recent graph cached may reduce execution time.
-(The manipulation necessary to re-use a graph incurs its own performance overhead, though, which may scale with graph
-size faster than file I/O does. If so, caching is best for smaller graphs.)
-
-When caching is active, subsequent uses of the same data file won't need to be read in again until another file of that type is used or generated,
+files either generated or read from disk can be cached in program memory. When caching is active, subsequent uses of the
+same data file won't need to be read in again until another file of that type is used or generated,
 or caching is turned off for that file type. The program checks whether it needs to update its cached data by comparing
 filenames as entered by the user. On encountering a new filename, the program flushes its cache and reads in the new file.

+(Note that cached Graph/Data files must be transformed back into their original state after a matching experiment, which
+may take some time. Whether file I/O or graph transformation takes longer for graph/data files is likely to be
+device-specific.)
+
 The program's caching behavior can be controlled in the Options menu. By default, all caching is OFF.

+The program can optionally output Graph/Data files in .GraphML format (.graphml) for data portability. This can be
+turned on in the Options menu. By default, GraphML output is OFF.
+
+---
 #### Cell Sample Files
 Cell Sample files consist of any number of distinct "T cells." Every cell contains
 four sequences: Alpha CDR3, Beta CDR3, Alpha CDR1, Beta CDR1. The sequences are represented by
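The caching rule stated in the hunk above (keep only the most recent file of a given type, compare filenames as entered, flush on a new filename, OFF by default) could look roughly like this. It is a hedged sketch, not the program's code: `FilenameKeyedCache`, its methods, and the `loader` callback standing in for real file I/O are all assumptions made for illustration.

```java
import java.util.function.Function;

// Illustrative single-slot cache; names are hypothetical and not taken from the repository.
class FilenameKeyedCache<T> {
    private boolean enabled = false;   // caching is OFF by default, as in the README
    private String cachedName = null;  // filename exactly as the user entered it
    private T cachedData = null;
    private int loads = 0;             // number of times the loader actually ran

    void setEnabled(boolean on) {
        enabled = on;
        if (!on) {                     // turning caching off drops the cached data
            cachedName = null;
            cachedData = null;
        }
    }

    // Returns cached data when the filename matches; otherwise loads and,
    // if caching is on, replaces whatever was cached before.
    T get(String filename, Function<String, T> loader) {
        if (enabled && filename.equals(cachedName)) {
            return cachedData;
        }
        T data = loader.apply(filename);
        loads++;
        if (enabled) {
            cachedName = filename;
            cachedData = data;
        }
        return data;
    }

    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        FilenameKeyedCache<String> cache = new FilenameKeyedCache<>();
        cache.setEnabled(true);
        Function<String, String> loader = name -> "contents of " + name; // stand-in for disk I/O
        cache.get("plate1.csv", loader);
        cache.get("plate1.csv", loader); // served from the cache
        cache.get("plate2.csv", loader); // new filename: cache is replaced
        System.out.println("loader ran " + cache.loads() + " times");
    }
}
```

Because the comparison is on the filename string, two different spellings of the same path would count as different files in this sketch.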
@@ -107,7 +129,6 @@ Comments are preceded by `#`

 Structure:

----
 # Sample contains 1 unique CDR1 for every 4 unique CDR3s.
 | Alpha CDR3 | Beta CDR3 | Alpha CDR1 | Beta CDR1 |
 |---|---|---|---|
@@ -131,11 +152,14 @@ Options when making a Sample Plate file:
 * Standard deviation size
 * Exponential
 * Lambda value
-* *(Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was exponential with a lambda of approximately 0.6. (Howie, et al. 2015))*
+* *(Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was approximately exponential with a lambda ~0.6. (Howie, et al. 2015))*
 * Total number of wells on the plate
-* Number of sections on plate
-* Number of T cells per well
-* per section, if more than one section
+* Well populations random or fixed
+* If random, minimum and maximum population sizes
+* If fixed
+* Number of sections on plate
+* Number of T cells per well
+* per section, if more than one section
 * Dropout rate

 Files are in CSV format. There are no header labels. Every row represents a well.
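For readers unfamiliar with the exponential option above, here is a minimal sketch of drawing from an exponential distribution with lambda 0.6 by inverse-transform sampling. How BiGpairSEQ_Sim actually turns a draw into a choice of cell is not shown in this diff, so the `sampleCellIndex` mapping (floor the draw, redraw if it falls past the end of the cell list) is purely an assumption, as are the class and method names.

```java
import java.util.Random;

// Illustrative sketch only; names and the index mapping are hypothetical.
class ExponentialSamplingSketch {

    // Inverse-transform sampling: if U is uniform on [0, 1),
    // then -ln(1 - U) / lambda is exponentially distributed with rate lambda.
    static double sampleExponential(Random rand, double lambda) {
        return -Math.log(1.0 - rand.nextDouble()) / lambda;
    }

    // Assumed mapping from a draw to a cell index: floor the value and
    // redraw whenever it lands outside [0, numCells).
    static int sampleCellIndex(Random rand, double lambda, int numCells) {
        int index;
        do {
            index = (int) Math.floor(sampleExponential(rand, lambda));
        } while (index >= numCells);
        return index;
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        double lambda = 0.6; // value quoted in the README for the pairSEQ experiment
        int[] counts = new int[10];
        for (int i = 0; i < 10_000; i++) {
            counts[sampleCellIndex(rand, lambda, counts.length)]++;
        }
        // Low indices should be drawn far more often than high ones.
        for (int i = 0; i < counts.length; i++) {
            System.out.println("cell " + i + ": " + counts[i]);
        }
    }
}
```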
@@ -149,7 +173,6 @@ Dropout sequences are replaced with the value `-1`. Comments are preceded by `#`

 Structure:

----
 ```
 # Cell source file name:
 # Each row represents one well on the plate
@@ -178,14 +201,19 @@ Options for creating a Graph/Data file:
 * The Cell Sample file to use
 * The Sample Plate file to use. (This must have been generated from the selected Cell Sample file.)

-These files do not have a human-readable structure, and are not portable to other programs. (Export of graphs in a
-portable data format may be implemented in the future. The tricky part is encoding the necessary metadata.)
+These files do not have a human-readable structure, and are not portable to other programs.

+(For portability to other software, turn on GraphML output in the Options menu. This will produce a .graphml file
+for the weighted graph, with vertex attributes sequence, type, and occupancy data.)

+---

 #### Matching Results Files
-Matching results files consist of the results of a BiGpairSEQ matching simulation. Making them requires a Graph and
-Data file. Matching results files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
+Matching results files consist of the results of a BiGpairSEQ matching simulation. Making them requires a serialized
+binary Graph/Data file (.ser). (Because .graphML files are larger than .ser files, BiGpairSEQ_Sim supports .graphML
+output only. Graph/data input must use a serialized binary.)
+
+Matching results files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
 Metadata about the matching simulation is included as comments. Comments are preceded by `#`.

 Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
@@ -200,7 +228,6 @@ Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:

 Example output:

----
 ```
 # Source Sample Plate file: 4MilCellsPlate.csv
 # Source Graph and Data file: 4MilCellsPlateGraph.ser
@@ -251,29 +278,31 @@ slightly less time than the simulation itself. Real elapsed time from start to f
 ## TODO

 * ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE
-* Hold graph data in memory until another graph is read-in? ~~ABANDONED~~ ~~UNABANDONED~~ DONE
+* ~~Hold graph data in memory until another graph is read-in? ABANDONED UNABANDONED~~ DONE
 * ~~*No, this won't work, because BiGpairSEQ simulations alter the underlying graph based on filtering constraints. Changes would cascade with multiple experiments.*~~
 * Might have figured out a way to do it, by taking edges out and then putting them back into the graph. This may actually be possible.
 * It is possible, though the modifications to the graph incur their own performance penalties. Need testing to see which option is best.
-* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
-* ~~Problem is variable number of cells in a well~~
-* ~~Apache Commons CSV library writes entries a row at a time~~
-* _Got this working, but at the cost of a profoundly strange bug in graph occupancy filtering. Have reverted the repo until I can figure out what caused that. Given how easily Thingiverse transposes CSV matrices in R, might not even be worth fixing._
-* Re-implement command line arguments, to enable scripting and statistical simulation studies
-* Implement sample plates with random numbers of T cells per well.
-* Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well population sizes on a sample plate; pairSEQ is not.
-* preliminary data suggests that BiGpairSEQ behaves roughly as though the whole plate had whatever the *average* well concentration is, but that's still speculative.
-* Enable GraphML output in addition to serialized object binaries, for data portability
-* Custom vertex type with attribute for sequence occupancy?
-* Re-implement CDR1 matching method
-* Implement Duan and Su's maximum weight matching algorithm
-* Add controllable algorithm-type parameter?
 * ~~Test whether pairing heap (currently used) or Fibonacci heap is more efficient for priority queue in current matching algorithm~~ DONE
 * ~~in theory Fibonacci heap should be more efficient, but complexity overhead may eliminate theoretical advantage~~
 * ~~Add controllable heap-type parameter?~~
-* Parameter implemented. For large graphs, Fibonacci heap wins. Now the new default.
-
-
+* Parameter implemented. Fibonacci heap the current default.
+* ~~Implement sample plates with random numbers of T cells per well.~~ DONE
+* Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well population sizes on a sample plate; pairSEQ is not.
+* preliminary data suggests that BiGpairSEQ behaves roughly as though the whole plate had whatever the *average* well concentration is, but that's still speculative.
+* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
+* ~~Problem is variable number of cells in a well~~
+* ~~Apache Commons CSV library writes entries a row at a time~~
+* _Got this working, but at the cost of a profoundly strange bug in graph occupancy filtering. Have reverted the repo until I can figure out what caused that. Given how easily Thingiverse transposes CSV matrices in R, might not even be worth fixing.
+* ~~Enable GraphML output in addition to serialized object binaries, for data portability~~ DONE
+* ~~Custom vertex type with attribute for sequence occupancy?~~ ABANDONED
+* Have a branch where this is implemented, but there's a bug that broke matching. Don't currently have time to fix.
+* ~~Re-implement command line arguments, to enable scripting and statistical simulation studies~~ DONE
+* Re-implement CDR1 matching method
+* Implement Duan and Su's maximum weight matching algorithm
+* Add controllable algorithm-type parameter?
+* This would be fun and valuable, but probably take more time than I have for a hobby project.
+* Implement Vose's alias method for arbitrary statistical distributions of cells
+

 ## CITATIONS
 * Howie, B., Sherwood, A. M., et al. ["High-throughput pairing of T cell receptor alpha and beta sequences."](https://pubmed.ncbi.nlm.nih.gov/26290413/) Sci. Transl. Med. 7, 301ra131 (2015)
@@ -1,6 +1,6 @@
 import java.util.Random;

-//main class. For choosing interface type and caching file data
+//main class. For choosing interface type and holding settings
 public class BiGpairSEQ {

     private static final Random rand = new Random();
@@ -14,6 +14,8 @@ public class BiGpairSEQ {
     private static boolean cachePlate = false;
     private static boolean cacheGraph = false;
     private static String priorityQueueHeapType = "FIBONACCI";
+    private static boolean outputBinary = true;
+    private static boolean outputGraphML = false;

     public static void main(String[] args) {
         if (args.length == 0) {
@@ -21,8 +23,8 @@ public class BiGpairSEQ {
         }
         else {
             //This will be uncommented when command line arguments are re-implemented.
-            //CommandLineInterface.startCLI(args);
-            System.out.println("Command line arguments are still being re-implemented.");
+            CommandLineInterface.startCLI(args);
+            //System.out.println("Command line arguments are still being re-implemented.");
         }
     }

@@ -164,4 +166,11 @@ public class BiGpairSEQ {
     public static void setFibonacciHeap() {
         priorityQueueHeapType = "FIBONACCI";
     }
+
+    public static boolean outputBinary() {return outputBinary;}
+    public static void setOutputBinary(boolean b) {outputBinary = b;}
+
+    public static boolean outputGraphML() {return outputGraphML;}
+    public static void setOutputGraphML(boolean b) {outputGraphML = b;}
+
 }
@@ -1,10 +1,37 @@
+import java.util.ArrayList;
+import java.util.Collections;
 import java.util.List;
+import java.util.stream.IntStream;

 public class CellSample {

     private List<Integer[]> cells;
     private Integer cdr1Freq;

+    public CellSample(Integer numDistinctCells, Integer cdr1Freq){
+        this.cdr1Freq = cdr1Freq;
+        List<Integer> numbersCDR3 = new ArrayList<>();
+        List<Integer> numbersCDR1 = new ArrayList<>();
+        Integer numDistCDR3s = 2 * numDistinctCells + 1;
+        IntStream.range(1, numDistCDR3s + 1).forEach(i -> numbersCDR3.add(i));
+        IntStream.range(numDistCDR3s + 1, numDistCDR3s + 1 + (numDistCDR3s / cdr1Freq) + 1).forEach(i -> numbersCDR1.add(i));
+        Collections.shuffle(numbersCDR3);
+        Collections.shuffle(numbersCDR1);
+
+        //Each cell represented by 4 values
+        //two CDR3s, and two CDR1s. First two values are CDR3s (alpha, beta), second two are CDR1s (alpha, beta)
+        List<Integer[]> distinctCells = new ArrayList<>();
+        for(int i = 0; i < numbersCDR3.size() - 1; i = i + 2){
+            Integer tmpCDR3a = numbersCDR3.get(i);
+            Integer tmpCDR3b = numbersCDR3.get(i+1);
+            Integer tmpCDR1a = numbersCDR1.get(i % numbersCDR1.size());
+            Integer tmpCDR1b = numbersCDR1.get((i+1) % numbersCDR1.size());
+            Integer[] tmp = {tmpCDR3a, tmpCDR3b, tmpCDR1a, tmpCDR1b};
+            distinctCells.add(tmp);
+        }
+        this.cells = distinctCells;
+    }
+
     public CellSample(List<Integer[]> cells, Integer cdr1Freq){
         this.cells = cells;
         this.cdr1Freq = cdr1Freq;
@@ -1,5 +1,9 @@
 import org.apache.commons.cli.*;

+import java.io.IOException;
+import java.util.Arrays;
+import java.util.stream.Stream;
+
 /*
  * Class for parsing options passed to program from command line
  *
@@ -29,6 +33,8 @@ import org.apache.commons.cli.*;
  * cellfile : name of the cell sample file to use as input
  * platefile : name of the sample plate file to use as input
  * output : name of the output file
+ * graphml : output a graphml file
+ * binary : output a serialized binary object file
  *
  * Match flags:
  * graphFile : name of graph and data file to use as input
@@ -43,286 +49,379 @@ import org.apache.commons.cli.*;
 public class CommandLineInterface {

     public static void startCLI(String[] args) {
-        //These command line options are a big mess
-        //Really, I don't think command line tools are expected to work in this many different modes
-        //making cells, making plates, and matching are the sort of thing that UNIX philosophy would say
-        //should be three separate programs.
-        //There might be a way to do it with option parameters?
-
-        //main options set
-        Options mainOptions = new Options();
-        Option makeCells = Option.builder("cells")
-                .longOpt("make-cells")
-                .desc("Makes a file of distinct cells")
-                .build();
-        Option makePlate = Option.builder("plates")
-                .longOpt("make-plates")
-                .desc("Makes a sample plate file")
-                .build();
-        Option makeGraph = Option.builder("graph")
-                .longOpt("make-graph")
-                .desc("Makes a graph and data file")
-                .build();
-        Option matchCDR3 = Option.builder("match")
-                .longOpt("match-cdr3")
-                .desc("Match CDR3s. Requires a cell sample file and any number of plate files.")
-                .build();
-        OptionGroup mainGroup = new OptionGroup();
-        mainGroup.addOption(makeCells);
-        mainGroup.addOption(makePlate);
-        mainGroup.addOption(makeGraph);
-        mainGroup.addOption(matchCDR3);
-        mainGroup.setRequired(true);
-        mainOptions.addOptionGroup(mainGroup);
-
-        //Reuse clones of this for other options groups, rather than making it lots of times
-        Option outputFile = Option.builder("o")
-                .longOpt("output-file")
-                .hasArg()
-                .argName("filename")
-                .desc("Name of output file")
-                .build();
-        mainOptions.addOption(outputFile);
-
-        //Options cellOptions = new Options();
-        Option numCells = Option.builder("nc")
-                .longOpt("num-cells")
-                .desc("The number of distinct cells to generate")
-                .hasArg()
-                .argName("number")
-                .build();
-        mainOptions.addOption(numCells);
-        Option cdr1Freq = Option.builder("d")
-                .longOpt("peptide-diversity-factor")
-                .hasArg()
-                .argName("number")
-                .desc("Number of distinct CDR3s for every CDR1")
-                .build();
-        mainOptions.addOption(cdr1Freq);
-        //Option cellOutput = (Option) outputFile.clone();
-        //cellOutput.setRequired(true);
-        //mainOptions.addOption(cellOutput);
-
-        //Options plateOptions = new Options();
-        Option inputCells = Option.builder("c")
-                .longOpt("cell-file")
-                .hasArg()
-                .argName("file")
-                .desc("The cell sample file used for filling wells")
-                .build();
-        mainOptions.addOption(inputCells);
-        Option numWells = Option.builder("w")
-                .longOpt("num-wells")
-                .hasArg()
-                .argName("number")
-                .desc("The number of wells on each plate")
-                .build();
-        mainOptions.addOption(numWells);
-        Option numPlates = Option.builder("np")
-                .longOpt("num-plates")
-                .hasArg()
-                .argName("number")
-                .desc("The number of plate files to output")
-                .build();
-        mainOptions.addOption(numPlates);
-        //Option plateOutput = (Option) outputFile.clone();
-        //plateOutput.setRequired(true);
-        //plateOutput.setDescription("Prefix for plate output filenames");
-        //mainOptions.addOption(plateOutput);
-        Option plateErr = Option.builder("err")
-                .longOpt("drop-out-rate")
-                .hasArg()
-                .argName("number")
-                .desc("Well drop-out rate. (Probability between 0 and 1)")
-                .build();
-        mainOptions.addOption(plateErr);
-        Option plateConcentrations = Option.builder("t")
-                .longOpt("t-cells-per-well")
-                .hasArgs()
-                .argName("number 1, number 2, ...")
-                .desc("Number of T cells per well for each plate section")
-                .build();
-        mainOptions.addOption(plateConcentrations);
-
-        //different distributions, mutually exclusive
-        OptionGroup plateDistributions = new OptionGroup();
-        Option plateExp = Option.builder("exponential")
-                .desc("Sample from distinct cells with exponential frequency distribution")
-                .build();
-        plateDistributions.addOption(plateExp);
-        Option plateGaussian = Option.builder("gaussian")
-                .desc("Sample from distinct cells with gaussain frequency distribution")
-                .build();
-        plateDistributions.addOption(plateGaussian);
-        Option platePoisson = Option.builder("poisson")
-                .desc("Sample from distinct cells with poisson frequency distribution")
-                .build();
-        plateDistributions.addOption(platePoisson);
-        mainOptions.addOptionGroup(plateDistributions);
-
-        Option plateStdDev = Option.builder("stddev")
-                .desc("Standard deviation for gaussian distribution")
-                .hasArg()
-                .argName("number")
-                .build();
-        mainOptions.addOption(plateStdDev);
-
-        Option plateLambda = Option.builder("lambda")
-                .desc("Lambda for exponential distribution")
-                .hasArg()
-                .argName("number")
-                .build();
-        mainOptions.addOption(plateLambda);
-
-
-
-        //
-        // String cellFile, String filename, Double stdDev,
-        // Integer numWells, Integer numSections,
-        // Integer[] concentrations, Double dropOutRate
-        //
-
-        //Options matchOptions = new Options();
-        inputCells.setDescription("The cell sample file to be used for matching.");
-        mainOptions.addOption(inputCells);
-        Option lowThresh = Option.builder("low")
-                .longOpt("low-threshold")
-                .hasArg()
-                .argName("number")
-                .desc("Sets the minimum occupancy overlap to attempt matching")
-                .build();
-        mainOptions.addOption(lowThresh);
-        Option highThresh = Option.builder("high")
-                .longOpt("high-threshold")
-                .hasArg()
-                .argName("number")
-                .desc("Sets the maximum occupancy overlap to attempt matching")
-                .build();
-        mainOptions.addOption(highThresh);
-        Option occDiff = Option.builder("occdiff")
-                .longOpt("occupancy-difference")
-                .hasArg()
-                .argName("Number")
-                .desc("Maximum difference in alpha/beta occupancy to attempt matching")
-                .build();
-        mainOptions.addOption(occDiff);
-        Option overlapPer = Option.builder("ovper")
-                .longOpt("overlap-percent")
-                .hasArg()
-                .argName("Percent")
-                .desc("Minimum overlap percent to attempt matching (0 -100)")
-                .build();
-        mainOptions.addOption(overlapPer);
-        Option inputPlates = Option.builder("p")
-                .longOpt("plate-files")
-                .hasArgs()
-                .desc("Plate files to match")
-                .build();
-        mainOptions.addOption(inputPlates);
-
-
+        //Options sets for the different modes
+        Options mainOptions = buildMainOptions();
+        Options cellOptions = buildCellOptions();
+        Options plateOptions = buildPlateOptions();
+        Options graphOptions = buildGraphOptions();
+        Options matchOptions = buildMatchCDR3options();

         CommandLineParser parser = new DefaultParser();
-        try {
-            CommandLine line = parser.parse(mainOptions, args);
-            if(line.hasOption("match")){
-                //line = parser.parse(mainOptions, args);
-                //String cellFile = line.getOptionValue("c");
-                String graphFile = line.getOptionValue("g");
-                Integer lowThreshold = Integer.valueOf(line.getOptionValue(lowThresh));
-                Integer highThreshold = Integer.valueOf(line.getOptionValue(highThresh));
-                Integer occupancyDifference = Integer.valueOf(line.getOptionValue(occDiff));
-                Integer overlapPercent = Integer.valueOf(line.getOptionValue(overlapPer));
-                for(String plate: line.getOptionValues("p")) {
-                    matchCDR3s(graphFile, lowThreshold, highThreshold, occupancyDifference, overlapPercent);
-                }
+        try{
+            CommandLine line = parser.parse(mainOptions, Arrays.copyOfRange(args, 0, 1));
+
+            if (line.hasOption("help")) {
+                HelpFormatter formatter = new HelpFormatter();
+                formatter.printHelp("BiGpairSEQ_Sim", mainOptions);
+                System.out.println();
+                formatter.printHelp("BiGpairSEQ_SIM -cells", cellOptions);
+                System.out.println();
+                formatter.printHelp("BiGpairSEQ_Sim -plate", plateOptions);
+                System.out.println();
+                formatter.printHelp("BiGpairSEQ_Sim -graph", graphOptions);
+                System.out.println();
+                formatter.printHelp("BiGpairSEQ_Sim -match", matchOptions);
             }
-            else if(line.hasOption("cells")){
-                //line = parser.parse(mainOptions, args);
+            else if (line.hasOption("cells")) {
+                line = parser.parse(cellOptions, Arrays.copyOfRange(args, 1, args.length));
+                Integer number = Integer.valueOf(line.getOptionValue("n"));
+                Integer diversity = Integer.valueOf(line.getOptionValue("d"));
                 String filename = line.getOptionValue("o");
-                Integer numDistCells = Integer.valueOf(line.getOptionValue("nc"));
-                Integer freq = Integer.valueOf(line.getOptionValue("d"));
-                makeCells(filename, numDistCells, freq);
+                makeCells(filename, number, diversity);
             }
-            else if(line.hasOption("plates")){
-                //line = parser.parse(mainOptions, args);
-                String cellFile = line.getOptionValue("c");
-                String filenamePrefix = line.getOptionValue("o");
-                Integer numWellsOnPlate = Integer.valueOf(line.getOptionValue("w"));
-                Integer numPlatesToMake = Integer.valueOf(line.getOptionValue("np"));
-                String[] concentrationsToUseString = line.getOptionValues("t");
-                Integer numSections = concentrationsToUseString.length;
-
-                Integer[] concentrationsToUse = new Integer[numSections];
-                for(int i = 0; i <numSections; i++){
-                    concentrationsToUse[i] = Integer.valueOf(concentrationsToUseString[i]);
+            else if (line.hasOption("plate")) {
+                line = parser.parse(plateOptions, Arrays.copyOfRange(args, 1, args.length));
+                //get the cells
+                String cellFilename = line.getOptionValue("c");
+                CellSample cells = getCells(cellFilename);
+                //get the rest of the parameters
+                Integer[] populations;
+                String outputFilename = line.getOptionValue("o");
+                Integer numWells = Integer.parseInt(line.getOptionValue("w"));
+                Double dropoutRate = Double.parseDouble(line.getOptionValue("err"));
+                if (line.hasOption("random")) {
+                    //Array holding values of minimum and maximum populations
+                    Integer[] min_max = Stream.of(line.getOptionValues("random"))
+                            .mapToInt(Integer::parseInt)
+                            .boxed()
+                            .toArray(Integer[]::new);
+                    populations = BiGpairSEQ.getRand().ints(min_max[0], min_max[1] + 1)
+                            .limit(numWells)
+                            .boxed()
+                            .toArray(Integer[]::new);
                 }
-                Double dropOutRate = Double.valueOf(line.getOptionValue("err"));
-                if(line.hasOption("exponential")){
-                    Double lambda = Double.valueOf(line.getOptionValue("lambda"));
-                    for(int i = 1; i <= numPlatesToMake; i++){
-                        makePlateExp(cellFile, filenamePrefix + i, lambda, numWellsOnPlate,
-                                concentrationsToUse,dropOutRate);
-                    }
+                else if (line.hasOption("pop")) {
+                    populations = Stream.of(line.getOptionValues("pop"))
+                            .mapToInt(Integer::parseInt)
+                            .boxed()
+                            .toArray(Integer[]::new);
                 }
-                else if(line.hasOption("gaussian")){
-                    Double stdDev = Double.valueOf(line.getOptionValue("std-dev"));
-                    for(int i = 1; i <= numPlatesToMake; i++){
-                        makePlate(cellFile, filenamePrefix + i, stdDev, numWellsOnPlate,
-                                concentrationsToUse,dropOutRate);
-                    }
+                else{
+                    populations = new Integer[1];
+                    populations[0] = 1;
+                }
+                //make the plate
+                Plate plate;
+                if (line.hasOption("poisson")) {
+                    Double stdDev = Math.sqrt(numWells);
+                    plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, stdDev, false);
+                }
+                else if (line.hasOption("gaussian")) {
+                    Double stdDev = Double.parseDouble(line.getOptionValue("stddev"));
+                    plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, stdDev, false);
+                }
+                else {
+                    assert line.hasOption("exponential");
+                    Double lambda = Double.parseDouble(line.getOptionValue("lambda"));
+                    plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, lambda, true);
+                }
+                PlateFileWriter writer = new PlateFileWriter(outputFilename, plate);
+                writer.writePlateFile();
+            }
+
+            else if (line.hasOption("graph")) { //Making a graph
+                line = parser.parse(graphOptions, Arrays.copyOfRange(args, 1, args.length));
+                String cellFilename = line.getOptionValue("c");
+                String plateFilename = line.getOptionValue("p");
+                String outputFilename = line.getOptionValue("o");
+                //get cells
+                CellSample cells = getCells(cellFilename);
+                //get plate
+                Plate plate = getPlate(plateFilename);
+                GraphWithMapData graph = Simulator.makeGraph(cells, plate, false);
+                if (!line.hasOption("no-binary")) { //output binary file unless told not to
+                    GraphDataObjectWriter writer = new GraphDataObjectWriter(outputFilename, graph, false);
+                    writer.writeDataToFile();
                 }
-                else if(line.hasOption("poisson")){
-                    for(int i = 1; i <= numPlatesToMake; i++){
-                        makePlatePoisson(cellFile, filenamePrefix + i, numWellsOnPlate,
-                                concentrationsToUse,dropOutRate);
-                    }
+                if (line.hasOption("graphml")) { //if told to, output graphml file
+                    GraphMLFileWriter gmlwriter = new GraphMLFileWriter(outputFilename, graph);
+                    gmlwriter.writeGraphToFile();
                 }
             }
+
+            else if (line.hasOption("match")) { //can add a flag for which match type in future, spit this in two
+                line = parser.parse(matchOptions, Arrays.copyOfRange(args, 1, args.length));
+                String graphFilename = line.getOptionValue("g");
+                String outputFilename = line.getOptionValue("o");
+                Integer minThreshold = Integer.parseInt(line.getOptionValue("min"));
+                Integer maxThreshold = Integer.parseInt(line.getOptionValue("max"));
+                Integer minOverlapPct;
+                if (line.hasOption("minpct")) { //see if this filter is being used
+                    minOverlapPct = Integer.parseInt(line.getOptionValue("minpct"));
+                }
+                else {
+                    minOverlapPct = 0;
+                }
+                Integer maxOccupancyDiff;
+                if (line.hasOption("maxdiff")) { //see if this filter is being used
+                    maxOccupancyDiff = Integer.parseInt(line.getOptionValue("maxdiff"));
+                }
+                else {
+                    maxOccupancyDiff = Integer.MAX_VALUE;
+                }
+                GraphWithMapData graph = getGraph(graphFilename);
+                MatchingResult result = Simulator.matchCDR3s(graph, graphFilename, minThreshold, maxThreshold,
+                        maxOccupancyDiff, minOverlapPct, false);
+                MatchingFileWriter writer = new MatchingFileWriter(outputFilename, result);
+                writer.writeResultsToFile();
+                //can put a bunch of ifs for outputting various things from the MatchingResult to System.out here
+                //after I put those flags in the matchOptions
+            }
         }
         catch (ParseException exp) {
             System.err.println("Parsing failed. Reason: " + exp.getMessage());
         }
     }

private static Option outputFileOption() {
|
||||
Option outputFile = Option.builder("o")
|
||||
.longOpt("output-file")
|
||||
.hasArg()
|
||||
.argName("filename")
|
||||
.desc("Name of output file")
|
||||
.required()
|
||||
.build();
|
||||
return outputFile;
|
||||
}
|
||||
|
||||
private static Options buildMainOptions() {
|
||||
Options mainOptions = new Options();
|
||||
Option help = Option.builder("help")
|
||||
.desc("Displays this help menu")
|
||||
.build();
|
||||
Option makeCells = Option.builder("cells")
|
||||
.longOpt("make-cells")
|
||||
.desc("Makes a cell sample file of distinct T cells")
|
||||
.build();
|
||||
Option makePlate = Option.builder("plate")
|
||||
.longOpt("make-plate")
|
||||
.desc("Makes a sample plate file. Requires a cell sample file.")
|
||||
.build();
|
||||
Option makeGraph = Option.builder("graph")
|
||||
.longOpt("make-graph")
|
||||
.desc("Makes a graph/data file. Requires a cell sample file and a sample plate file")
|
||||
.build();
|
||||
Option matchCDR3 = Option.builder("match")
|
||||
.longOpt("match-cdr3")
|
||||
.desc("Matches CDR3s. Requires a graph/data file.")
|
||||
.build();
|
||||
OptionGroup mainGroup = new OptionGroup();
|
||||
mainGroup.addOption(help);
|
||||
mainGroup.addOption(makeCells);
|
||||
mainGroup.addOption(makePlate);
|
||||
mainGroup.addOption(makeGraph);
|
||||
mainGroup.addOption(matchCDR3);
|
||||
mainGroup.setRequired(true);
|
||||
mainOptions.addOptionGroup(mainGroup);
|
||||
return mainOptions;
|
||||
}
|
||||
|
||||
private static Options buildCellOptions() {
|
||||
Options cellOptions = new Options();
|
||||
Option numCells = Option.builder("n")
|
||||
.longOpt("num-cells")
|
||||
.desc("The number of distinct cells to generate")
|
||||
.hasArg()
|
||||
.argName("number")
|
||||
.required().build();
|
||||
Option cdr3Diversity = Option.builder("d")
|
||||
.longOpt("diversity-factor")
|
||||
.desc("The factor by which unique CDR3s outnumber unique CDR1s")
|
||||
.hasArg()
|
||||
.argName("factor")
|
||||
.required().build();
|
||||
cellOptions.addOption(numCells);
|
||||
cellOptions.addOption(cdr3Diversity);
|
||||
cellOptions.addOption(outputFileOption());
|
||||
return cellOptions;
|
||||
}
|
||||
|
||||
private static Options buildPlateOptions() {
|
||||
Options plateOptions = new Options();
|
||||
Option cellFile = Option.builder("c") // add this to plate options
|
||||
.longOpt("cell-file")
|
||||
.desc("The cell sample file to use")
|
||||
.hasArg()
|
||||
.argName("filename")
|
||||
.required().build();
|
||||
Option numWells = Option.builder("w")// add this to plate options
|
||||
.longOpt("wells")
|
||||
.desc("The number of wells on the sample plate")
|
||||
.hasArg()
|
||||
.argName("number")
|
||||
.required().build();
|
||||
//options group for choosing which distribution to use
|
||||
OptionGroup distributions = new OptionGroup();// add this to plate options
|
||||
distributions.setRequired(true);
|
||||
Option poisson = Option.builder("poisson")
|
||||
.desc("Use a Poisson distribution for cell sample")
|
||||
.build();
|
||||
Option gaussian = Option.builder("gaussian")
|
||||
.desc("Use a Gaussian distribution for cell sample")
|
||||
.build();
|
||||
Option exponential = Option.builder("exponential")
|
||||
.desc("Use an exponential distribution for cell sample")
|
||||
.build();
|
||||
distributions.addOption(poisson);
|
||||
distributions.addOption(gaussian);
|
||||
distributions.addOption(exponential);
|
||||
//options group for statistical distribution parameters
|
||||
OptionGroup statParams = new OptionGroup();// add this to plate options
|
||||
Option stdDev = Option.builder("stddev")
|
||||
.desc("If using -gaussian flag, standard deviation for distribution")
|
||||
.hasArg()
|
||||
.argName("value")
|
||||
.build();
|
||||
Option lambda = Option.builder("lambda")
|
||||
.desc("If using -exponential flag, lambda value for distribution")
|
||||
.hasArg()
|
||||
.argName("value")
|
||||
.build();
|
||||
statParams.addOption(stdDev);
|
||||
statParams.addOption(lambda);
|
||||
//Option group for random plate or set populations
|
||||
OptionGroup wellPopOptions = new OptionGroup(); // add this to plate options
|
||||
wellPopOptions.setRequired(true);
|
||||
Option randomWellPopulations = Option.builder("random")
|
||||
.desc("Randomize well populations on sample plate. Takes two arguments: the minimum possible population and the maximum possible population.")
|
||||
.hasArgs()
|
||||
.numberOfArgs(2)
|
||||
.argName("minimum maximum")
|
||||
.build();
|
||||
Option specificWellPopulations = Option.builder("pop")
|
||||
.desc("The well populations for each section of the sample plate. There will be as many sections as there are populations given.")
|
||||
.hasArgs()
|
||||
.argName("number [number]...")
|
||||
.build();
|
||||
Option dropoutRate = Option.builder("err") //add this to plate options
|
||||
.hasArg()
|
||||
.desc("The sequence dropout rate due to amplification error. (0.0 - 1.0)")
|
||||
.argName("rate")
|
||||
.required()
|
||||
.build();
|
||||
wellPopOptions.addOption(randomWellPopulations);
|
||||
wellPopOptions.addOption(specificWellPopulations);
|
||||
plateOptions.addOption(cellFile);
|
||||
plateOptions.addOption(numWells);
|
||||
plateOptions.addOptionGroup(distributions);
|
||||
plateOptions.addOptionGroup(statParams);
|
||||
plateOptions.addOptionGroup(wellPopOptions);
|
||||
plateOptions.addOption(dropoutRate);
|
||||
plateOptions.addOption(outputFileOption());
|
||||
return plateOptions;
|
||||
}
|
||||
|
||||
private static Options buildGraphOptions() {
|
||||
Options graphOptions = new Options();
|
||||
Option cellFilename = Option.builder("c")
|
||||
.longOpt("cell-file")
|
||||
.desc("Cell sample file to use for checking accuracy")
|
||||
.hasArg()
|
||||
.argName("filename")
|
||||
.required().build();
|
||||
Option plateFilename = Option.builder("p")
|
||||
.longOpt("plate-filename")
|
||||
.desc("Sample plate file (made from given cell sample file) to construct graph from")
|
||||
.hasArg()
|
||||
.argName("filename")
|
||||
.required().build();
|
||||
Option outputGraphML = Option.builder("graphml")
|
||||
.desc("Output GraphML file")
|
||||
.build();
|
||||
Option outputSerializedBinary = Option.builder("nb")
|
||||
.longOpt("no-binary")
|
||||
.desc("Don't output serialized binary file")
|
||||
.build();
|
||||
graphOptions.addOption(cellFilename);
|
||||
graphOptions.addOption(plateFilename);
|
||||
graphOptions.addOption(outputFileOption());
|
||||
graphOptions.addOption(outputGraphML);
|
||||
graphOptions.addOption(outputSerializedBinary);
|
||||
return graphOptions;
|
||||
}
|
||||
|
||||
private static Options buildMatchCDR3options() {
|
||||
Options matchCDR3options = new Options();
|
||||
Option graphFilename = Option.builder("g")
|
||||
.longOpt("graph-file")
|
||||
.desc("The graph/data file to use")
|
||||
.hasArg()
|
||||
.argName("filename")
|
||||
.required().build();
|
||||
Option minOccupancyOverlap = Option.builder("min")
|
||||
.desc("The minimum number of shared wells to attempt to match a sequence pair")
|
||||
.hasArg()
|
||||
.argName("number")
|
||||
.required().build();
|
||||
Option maxOccupancyOverlap = Option.builder("max")
|
||||
.desc("The maximum number of shared wells to attempt to match a sequence pair")
|
||||
.hasArg()
|
||||
.argName("number")
|
||||
.required().build();
|
||||
Option minOverlapPercent = Option.builder("minpct")
|
||||
.desc("(Optional) The minimum percentage of a sequence's total occupancy shared by another sequence to attempt matching. (0 - 100) ")
|
||||
.hasArg()
|
||||
.argName("percent")
|
||||
.build();
|
||||
Option maxOccupancyDifference = Option.builder("maxdiff")
|
||||
.desc("(Optional) The maximum difference in total occupancy between two sequences to attempt matching.")
|
||||
.hasArg()
|
||||
.argName("number")
|
||||
.build();
|
||||
matchCDR3options.addOption(graphFilename);
|
||||
matchCDR3options.addOption(minOccupancyOverlap);
|
||||
matchCDR3options.addOption(maxOccupancyOverlap);
|
||||
matchCDR3options.addOption(minOverlapPercent);
|
||||
matchCDR3options.addOption(maxOccupancyDifference);
|
||||
matchCDR3options.addOption(outputFileOption());
|
||||
//options for output to System.out
|
||||
//Option printPairingErrorRate = Option.builder()
|
||||
|
||||
return matchCDR3options;
|
||||
}
|
||||
|
||||
|
||||
|
||||
private static CellSample getCells(String cellFilename) {
assert cellFilename != null;
CellFileReader reader = new CellFileReader(cellFilename);
return reader.getCellSample();
}

private static Plate getPlate(String plateFilename) {
assert plateFilename != null;
PlateFileReader reader = new PlateFileReader(plateFilename);
return reader.getSamplePlate();
}

private static GraphWithMapData getGraph(String graphFilename) {
assert graphFilename != null;
try{
GraphDataObjectReader reader = new GraphDataObjectReader(graphFilename, false);
return reader.getData();

}
catch (IOException ex) {
ex.printStackTrace();
return null;
}
}
|
||||
|
||||
//for calling from command line
|
||||
public static void makeCells(String filename, Integer numCells, Integer cdr1Freq){
|
||||
CellSample sample = Simulator.generateCellSample(numCells, cdr1Freq);
|
||||
public static void makeCells(String filename, Integer numCells, Integer cdr1Freq) {
|
||||
CellSample sample = new CellSample(numCells, cdr1Freq);
|
||||
CellFileWriter writer = new CellFileWriter(filename, sample);
|
||||
writer.writeCellsToFile();
|
||||
}
|
||||
|
||||
public static void makePlateExp(String cellFile, String filename, Double lambda,
|
||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), lambda);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
|
||||
private static void makePlatePoisson(String cellFile, String filename, Integer numWells,
|
||||
Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Double stdDev = Math.sqrt(cellReader.getCellCountDEPRECATED());
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
|
||||
private static void makePlate(String cellFile, String filename, Double stdDev,
|
||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
|
||||
private static void matchCDR3s(String graphFile, Integer lowThreshold, Integer highThreshold,
|
||||
Integer occupancyDifference, Integer overlapPercent) {
|
||||
|
||||
}
|
||||
}
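The `main` method above dispatches in two stages: the first argument selects a mode (`-cells`, `-plate`, `-graph`, `-match`), and the remaining arguments are re-parsed against that mode's own options via `Arrays.copyOfRange(args, 1, args.length)`. The following is a minimal, stdlib-only sketch of that slicing step. The class `TwoStageArgs`, its helper names, and the example filenames are hypothetical and are not part of the project; the real code hands the slice to the Commons CLI parser instead.

```java
import java.util.Arrays;

// Hypothetical illustration of the two-stage argument handling; not project code.
class TwoStageArgs {

    // Returns the mode named by the first argument, without its leading dash.
    static String mode(String[] args) {
        if (args.length == 0 || !args[0].startsWith("-")) {
            throw new IllegalArgumentException("expected a mode flag as the first argument");
        }
        return args[0].substring(1);
    }

    // Everything after the mode flag is what the mode-specific parser would see.
    static String[] rest(String[] args) {
        return Arrays.copyOfRange(args, 1, args.length);
    }

    public static void main(String[] args) {
        // Example filenames are made up for the demonstration.
        String[] example = {"-graph", "-c", "cells.csv", "-p", "plate.csv", "-o", "graph"};
        System.out.println("mode: " + mode(example));
        System.out.println("mode-specific args: " + Arrays.toString(rest(example)));
    }
}
```

Slicing off the first argument keeps each mode's required options from clashing with the required main option group, at the cost of requiring the mode flag to come first.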
|
||||
|
||||
@@ -1,10 +1,12 @@
|
||||
import java.io.*;
|
||||
|
||||
public class GraphDataObjectReader {
|
||||
|
||||
private GraphWithMapData data;
|
||||
private String filename;
|
||||
private boolean verbose = true;
|
||||
|
||||
public GraphDataObjectReader(String filename) throws IOException {
|
||||
public GraphDataObjectReader(String filename, boolean verbose) throws IOException {
|
||||
if(!filename.matches(".*\\.ser")){
|
||||
filename = filename + ".ser";
|
||||
}
|
||||
|
||||
@@ -1,3 +1,5 @@
|
||||
import org.jgrapht.Graph;
|
||||
|
||||
import java.io.BufferedOutputStream;
|
||||
import java.io.FileOutputStream;
|
||||
import java.io.IOException;
|
||||
@@ -7,6 +9,7 @@ public class GraphDataObjectWriter {
|
||||
|
||||
private GraphWithMapData data;
|
||||
private String filename;
|
||||
private boolean verbose = true;
|
||||
|
||||
public GraphDataObjectWriter(String filename, GraphWithMapData data) {
|
||||
if(!filename.matches(".*\\.ser")){
|
||||
@@ -16,13 +19,24 @@ public class GraphDataObjectWriter {
|
||||
this.data = data;
|
||||
}
|
||||
|
||||
public GraphDataObjectWriter(String filename, GraphWithMapData data, boolean verbose) {
|
||||
this.verbose = verbose;
|
||||
if(!filename.matches(".*\\.ser")){
|
||||
filename = filename + ".ser";
|
||||
}
|
||||
this.filename = filename;
|
||||
this.data = data;
|
||||
}
|
||||
|
||||
public void writeDataToFile() {
|
||||
try (BufferedOutputStream bufferedOut = new BufferedOutputStream(new FileOutputStream(filename));
|
||||
|
||||
ObjectOutputStream out = new ObjectOutputStream(bufferedOut);
|
||||
){
|
||||
System.out.println("Writing graph and occupancy data to file. This may take some time.");
|
||||
System.out.println("File I/O time is not included in results.");
|
||||
if(verbose) {
|
||||
System.out.println("Writing graph and occupancy data to file. This may take some time.");
|
||||
System.out.println("File I/O time is not included in results.");
|
||||
}
|
||||
out.writeObject(data);
|
||||
} catch (IOException ex) {
|
||||
ex.printStackTrace();
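The reader and writer constructors in the hunks above each repeat the same check, `filename.matches(".*\\.ser")`, before appending the extension. One possible way to factor that out is sketched below, assuming ordinary single-line filenames, for which `endsWith` gives the same answer as the regular expression. `ExtensionUtil` and `ensureExtension` are hypothetical names and do not exist in the project.

```java
// Hypothetical helper; a sketch of the extension check used by the reader/writer classes.
class ExtensionUtil {

    // Appends the extension only when the filename does not already end with it.
    static String ensureExtension(String filename, String extension) {
        return filename.endsWith(extension) ? filename : filename + extension;
    }

    public static void main(String[] args) {
        System.out.println(ensureExtension("graph", ".ser"));
        System.out.println(ensureExtension("graph.ser", ".ser"));
        System.out.println(ensureExtension("graph", ".graphml"));
    }
}
```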
|
||||
|
||||
@@ -1,35 +0,0 @@
|
||||
import org.jgrapht.graph.SimpleWeightedGraph;
|
||||
import org.jgrapht.nio.graphml.GraphMLImporter;
|
||||
|
||||
import java.io.BufferedReader;
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
|
||||
public class GraphMLFileReader {
|
||||
|
||||
private String filename;
|
||||
private SimpleWeightedGraph graph;
|
||||
|
||||
public GraphMLFileReader(String filename, SimpleWeightedGraph graph) {
|
||||
if(!filename.matches(".*\\.graphml")){
|
||||
filename = filename + ".graphml";
|
||||
}
|
||||
this.filename = filename;
|
||||
this.graph = graph;
|
||||
|
||||
try(//don't need to close reader bc of try-with-resources auto-closing
|
||||
BufferedReader reader = Files.newBufferedReader(Path.of(filename));
|
||||
){
|
||||
GraphMLImporter<SimpleWeightedGraph, BufferedReader> importer = new GraphMLImporter<>();
|
||||
importer.importGraph(graph, reader);
|
||||
}
|
||||
catch (IOException ex) {
|
||||
System.out.println("Graph file " + filename + " not found.");
|
||||
System.err.println(ex);
|
||||
}
|
||||
}
|
||||
|
||||
public SimpleWeightedGraph getGraph() { return graph; }
|
||||
|
||||
}
|
||||
@@ -1,4 +1,8 @@
|
||||
import org.jgrapht.graph.DefaultWeightedEdge;
|
||||
import org.jgrapht.graph.SimpleWeightedGraph;
|
||||
import org.jgrapht.nio.Attribute;
|
||||
import org.jgrapht.nio.AttributeType;
|
||||
import org.jgrapht.nio.DefaultAttribute;
|
||||
import org.jgrapht.nio.dot.DOTExporter;
|
||||
import org.jgrapht.nio.graphml.GraphMLExporter;
|
||||
|
||||
@@ -7,25 +11,69 @@ import java.io.IOException;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.nio.file.StandardOpenOption;
|
||||
import java.util.HashMap;
|
||||
import java.util.LinkedHashMap;
|
||||
import java.util.Map;
|
||||
|
||||
public class GraphMLFileWriter {
|
||||
|
||||
String filename;
|
||||
SimpleWeightedGraph graph;
|
||||
GraphWithMapData data;
|
||||
|
||||
|
||||
public GraphMLFileWriter(String filename, SimpleWeightedGraph graph) {
|
||||
public GraphMLFileWriter(String filename, GraphWithMapData data) {
|
||||
if(!filename.matches(".*\\.graphml")){
|
||||
filename = filename + ".graphml";
|
||||
}
|
||||
this.filename = filename;
|
||||
this.graph = graph;
|
||||
this.data = data;
|
||||
}
|
||||
|
||||
// public void writeGraphToFile() {
|
||||
// try(BufferedWriter writer = Files.newBufferedWriter(Path.of(filename), StandardOpenOption.CREATE_NEW);
|
||||
// ){
|
||||
// GraphMLExporter<SimpleWeightedGraph, BufferedWriter> exporter = new GraphMLExporter<>();
|
||||
// exporter.exportGraph(graph, writer);
|
||||
// } catch(IOException ex){
|
||||
// System.out.println("Could not make new file named "+filename);
|
||||
// System.err.println(ex);
|
||||
// }
|
||||
// }
|
||||
|
||||
public void writeGraphToFile() {
|
||||
SimpleWeightedGraph graph = data.getGraph();
|
||||
Map<Integer, Integer> vertexToAlphaMap = data.getPlateVtoAMap();
|
||||
Map<Integer, Integer> vertexToBetaMap = data.getPlateVtoBMap();
|
||||
Map<Integer, Integer> alphaOccs = data.getAlphaWellCounts();
|
||||
Map<Integer, Integer> betaOccs = data.getBetaWellCounts();
|
||||
try(BufferedWriter writer = Files.newBufferedWriter(Path.of(filename), StandardOpenOption.CREATE_NEW);
|
||||
){
|
||||
GraphMLExporter<SimpleWeightedGraph, BufferedWriter> exporter = new GraphMLExporter<>();
|
||||
//create exporter. Let the vertex labels be the unique ids for the vertices
|
||||
GraphMLExporter<Integer, SimpleWeightedGraph<Vertex, DefaultWeightedEdge>> exporter = new GraphMLExporter<>(v -> v.toString());
|
||||
//set to export weights
|
||||
exporter.setExportEdgeWeights(true);
|
||||
//set type, sequence, and occupancy attributes for each vertex
|
||||
exporter.setVertexAttributeProvider( v -> {
|
||||
Map<String, Attribute> attributes = new HashMap<>();
|
||||
if(vertexToAlphaMap.containsKey(v)) {
|
||||
attributes.put("type", DefaultAttribute.createAttribute("CDR3 Alpha"));
|
||||
attributes.put("sequence", DefaultAttribute.createAttribute(vertexToAlphaMap.get(v)));
|
||||
attributes.put("occupancy", DefaultAttribute.createAttribute(
|
||||
alphaOccs.get(vertexToAlphaMap.get(v))));
|
||||
}
|
||||
else if(vertexToBetaMap.containsKey(v)) {
|
||||
attributes.put("type", DefaultAttribute.createAttribute("CDR3 Beta"));
|
||||
attributes.put("sequence", DefaultAttribute.createAttribute(vertexToBetaMap.get(v)));
|
||||
attributes.put("occupancy", DefaultAttribute.createAttribute(
|
||||
betaOccs.get(vertexToBetaMap.get(v))));
|
||||
}
|
||||
return attributes;
|
||||
});
|
||||
//register the attributes
|
||||
exporter.registerAttribute("type", GraphMLExporter.AttributeCategory.NODE, AttributeType.STRING);
|
||||
exporter.registerAttribute("sequence", GraphMLExporter.AttributeCategory.NODE, AttributeType.STRING);
|
||||
exporter.registerAttribute("occupancy", GraphMLExporter.AttributeCategory.NODE, AttributeType.STRING);
|
||||
//export the graph
|
||||
exporter.exportGraph(graph, writer);
|
||||
} catch(IOException ex){
|
||||
System.out.println("Could not make new file named "+filename);
|
||||
@@ -33,3 +81,4 @@ public class GraphMLFileWriter {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -4,7 +4,6 @@ import org.jgrapht.graph.SimpleWeightedGraph;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.Set;
|
||||
|
||||
public interface GraphModificationFunctions {
|
||||
|
||||
|
||||
@@ -74,7 +74,7 @@ public class InteractiveInterface {
|
||||
System.out.println(ex);
|
||||
sc.next();
|
||||
}
|
||||
CellSample sample = Simulator.generateCellSample(numCells, cdr1Freq);
|
||||
CellSample sample = new CellSample(numCells, cdr1Freq);
|
||||
assert filename != null;
|
||||
System.out.println("Writing cells to file");
|
||||
CellFileWriter writer = new CellFileWriter(filename, sample);
|
||||
@@ -227,16 +227,14 @@ public class InteractiveInterface {
|
||||
Plate samplePlate;
|
||||
PlateFileWriter writer;
|
||||
if(exponential){
|
||||
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||
samplePlate.fillWellsExponential(cellFile, cells.getCells(), lambda);
|
||||
samplePlate = new Plate(cells, cellFile, numWells, populations, dropOutRate, lambda, true);
|
||||
writer = new PlateFileWriter(filename, samplePlate);
|
||||
}
|
||||
else {
|
||||
if (poisson) {
|
||||
stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson
|
||||
}
|
||||
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||
samplePlate.fillWells(cellFile, cells.getCells(), stdDev);
|
||||
samplePlate = new Plate(cells, cellFile, numWells, populations, dropOutRate, stdDev, false);
|
||||
writer = new PlateFileWriter(filename, samplePlate);
|
||||
}
|
||||
System.out.println("Writing Sample Plate to file");
|
||||
@@ -252,7 +250,6 @@ public class InteractiveInterface {
|
||||
String filename = null;
|
||||
String cellFile = null;
|
||||
String plateFile = null;
|
||||
|
||||
try {
|
||||
String str = "\nGenerating bipartite weighted graph encoding occupancy overlap data ";
|
||||
str = str.concat("\nrequires a cell sample file and a sample plate file.");
|
||||
@@ -293,7 +290,7 @@ public class InteractiveInterface {
|
||||
else {
|
||||
System.out.println("Reading Sample Plate file: " + plateFile);
|
||||
PlateFileReader plateReader = new PlateFileReader(plateFile);
|
||||
plate = new Plate(plateReader.getFilename(), plateReader.getWells());
|
||||
plate = plateReader.getSamplePlate();
|
||||
if(BiGpairSEQ.cachePlate()) {
|
||||
BiGpairSEQ.setPlateInMemory(plate, plateFile);
|
||||
}
|
||||
@@ -307,12 +304,18 @@ public class InteractiveInterface {
|
||||
System.out.println("Returning to main menu.");
|
||||
}
|
||||
else{
|
||||
List<Integer[]> cells = cellSample.getCells();
|
||||
GraphWithMapData data = Simulator.makeGraph(cells, plate, true);
|
||||
GraphWithMapData data = Simulator.makeGraph(cellSample, plate, true);
|
||||
assert filename != null;
|
||||
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
|
||||
dataWriter.writeDataToFile();
|
||||
System.out.println("Graph and Data file written to: " + filename);
|
||||
if(BiGpairSEQ.outputBinary()) {
|
||||
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
|
||||
dataWriter.writeDataToFile();
|
||||
System.out.println("Serialized binary graph/data file written to: " + filename);
|
||||
}
|
||||
if(BiGpairSEQ.outputGraphML()) {
|
||||
GraphMLFileWriter graphMLWriter = new GraphMLFileWriter(filename, data);
|
||||
graphMLWriter.writeGraphToFile();
|
||||
System.out.println("GraphML file written to: " + filename);
|
||||
}
|
||||
if(BiGpairSEQ.cacheGraph()) {
|
||||
BiGpairSEQ.setGraphInMemory(data, filename);
|
||||
|
||||
@@ -372,7 +375,7 @@ public class InteractiveInterface {
|
||||
data = BiGpairSEQ.getGraphInMemory();
|
||||
}
|
||||
else {
|
||||
GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename);
|
||||
GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename, true);
|
||||
data = dataReader.getData();
|
||||
if(BiGpairSEQ.cacheGraph()) {
|
||||
BiGpairSEQ.setGraphInMemory(data, graphFilename);
|
||||
@@ -500,7 +503,9 @@ public class InteractiveInterface {
|
||||
System.out.println("1) Turn " + getOnOff(!BiGpairSEQ.cacheCells()) + " cell sample file caching");
|
||||
System.out.println("2) Turn " + getOnOff(!BiGpairSEQ.cachePlate()) + " plate file caching");
|
||||
System.out.println("3) Turn " + getOnOff(!BiGpairSEQ.cacheGraph()) + " graph/data file caching");
|
||||
System.out.println("4) Maximum weight matching algorithm options");
|
||||
System.out.println("4) Turn " + getOnOff(!BiGpairSEQ.outputBinary()) + " serialized binary graph output");
|
||||
System.out.println("5) Turn " + getOnOff(!BiGpairSEQ.outputGraphML()) + " GraphML graph output");
|
||||
System.out.println("6) Maximum weight matching algorithm options");
|
||||
System.out.println("0) Return to main menu");
|
||||
try {
|
||||
input = sc.nextInt();
|
||||
@@ -508,7 +513,9 @@ public class InteractiveInterface {
|
||||
case 1 -> BiGpairSEQ.setCacheCells(!BiGpairSEQ.cacheCells());
|
||||
case 2 -> BiGpairSEQ.setCachePlate(!BiGpairSEQ.cachePlate());
|
||||
case 3 -> BiGpairSEQ.setCacheGraph(!BiGpairSEQ.cacheGraph());
|
||||
case 4 -> algorithmOptions();
|
||||
case 4 -> BiGpairSEQ.setOutputBinary(!BiGpairSEQ.outputBinary());
|
||||
case 5 -> BiGpairSEQ.setOutputGraphML(!BiGpairSEQ.outputGraphML());
|
||||
case 6 -> algorithmOptions();
|
||||
case 0 -> backToMain = true;
|
||||
default -> System.out.println("Invalid input");
|
||||
}
|
||||
|
||||
@@ -21,15 +21,15 @@ public class MatchingResult {
* well populations *
* total alphas found *
* total betas found *
* high overlap threshold
* low overlap threshold
* maximum occupancy difference
* minimum overlap percent
* pairing attempt rate
* correct pairing count
* incorrect pairing count
* pairing error rate
* simulation time
* high overlap threshold *
* low overlap threshold *
* maximum occupancy difference *
* minimum overlap percent *
* pairing attempt rate *
* correct pairing count *
* incorrect pairing count *
* pairing error rate *
* simulation time (seconds)
*/
this.metadata = metadata;
this.comments = new ArrayList<>();
@@ -91,6 +91,22 @@ public class MatchingResult {
return Integer.parseInt(metadata.get("total beta count"));
}

//put in the rest of these methods following the same pattern
public Integer getHighOverlapThreshold() { return Integer.parseInt(metadata.get("high overlap threshold"));}

public Integer getLowOverlapThreshold() { return Integer.parseInt(metadata.get("low overlap threshold"));}

public Integer getMaxOccupancyDifference() { return Integer.parseInt(metadata.get("maximum occupancy difference"));}

public Integer getMinOverlapPercent() { return Integer.parseInt(metadata.get("minimum overlap percent"));}

public Double getPairingAttemptRate() { return Double.parseDouble(metadata.get("pairing attempt rate"));}

public Integer getCorrectPairingCount() { return Integer.parseInt(metadata.get("correct pairing count"));}

public Integer getIncorrectPairingCount() { return Integer.parseInt(metadata.get("incorrect pairing count"));}

public Double getPairingErrorRate() { return Double.parseDouble(metadata.get("pairing error rate"));}

public String getSimulationTime() { return metadata.get("simulation time (seconds)"); }

}
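The getters added to `MatchingResult` above all follow one pattern: look up a string in the metadata map by its key and parse it into the expected type. A minimal sketch of that pattern follows. `MetadataView` is a hypothetical stand-in rather than the project's class, and the values placed in the map are made up for the demonstration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the metadata-backed getters; not project code.
class MetadataView {
    private final Map<String, String> metadata;

    MetadataView(Map<String, String> metadata) {
        this.metadata = metadata;
    }

    // Parses the stored string as an integer; a missing key raises an unchecked exception.
    Integer getInt(String key) {
        return Integer.parseInt(metadata.get(key));
    }

    // Parses the stored string as a double; a missing key raises an unchecked exception.
    Double getDouble(String key) {
        return Double.parseDouble(metadata.get(key));
    }

    public static void main(String[] args) {
        // Insertion order is preserved, which matters when the map is written out as file comments.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("high overlap threshold", "94"); // illustrative value
        metadata.put("pairing error rate", "0.015");  // illustrative value
        MetadataView view = new MetadataView(metadata);
        System.out.println(view.getInt("high overlap threshold"));
        System.out.println(view.getDouble("pairing error rate"));
    }
}
```

Because every value is stored as a string, the key passed to a getter has to match the key used when the map was filled exactly; a mismatch only shows up at run time.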
|
||||
|
||||
@@ -8,7 +8,9 @@ TODO: Implement discrete frequency distributions using Vose's Alias Method
|
||||
import java.util.*;
|
||||
|
||||
public class Plate {
|
||||
private CellSample cells;
|
||||
private String sourceFile;
|
||||
private String filename;
|
||||
private List<List<Integer[]>> wells;
|
||||
private final Random rand = BiGpairSEQ.getRand();
|
||||
private int size;
|
||||
@@ -18,6 +20,25 @@ public class Plate {
|
||||
private double lambda;
|
||||
boolean exponential = false;
|
||||
|
||||
public Plate(CellSample cells, String cellFilename, int numWells, Integer[] populations,
|
||||
double dropoutRate, double stdDev_or_lambda, boolean exponential){
|
||||
this.cells = cells;
|
||||
this.sourceFile = cellFilename;
|
||||
this.size = numWells;
|
||||
this.wells = new ArrayList<>();
|
||||
this.error = dropoutRate;
|
||||
this.populations = populations;
|
||||
this.exponential = exponential;
|
||||
if (this.exponential) {
|
||||
this.lambda = stdDev_or_lambda;
|
||||
fillWellsExponential(cells.getCells(), this.lambda);
|
||||
}
|
||||
else {
|
||||
this.stdDev = stdDev_or_lambda;
|
||||
fillWells(cells.getCells(), this.stdDev);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
public Plate(int size, double error, Integer[] populations) {
|
||||
this.size = size;
|
||||
@@ -26,8 +47,9 @@ public class Plate {
|
||||
wells = new ArrayList<>();
|
||||
}
|
||||
|
||||
public Plate(String sourceFileName, List<List<Integer[]>> wells) {
|
||||
this.sourceFile = sourceFileName;
|
||||
//constructor for returning a Plate from a PlateFileReader
|
||||
public Plate(String filename, List<List<Integer[]>> wells) {
|
||||
this.filename = filename;
|
||||
this.wells = wells;
|
||||
this.size = wells.size();
|
||||
|
||||
@@ -43,10 +65,9 @@ public class Plate {
|
||||
}
|
||||
}
|
||||
|
||||
public void fillWellsExponential(String sourceFileName, List<Integer[]> cells, double lambda){
|
||||
private void fillWellsExponential(List<Integer[]> cells, double lambda){
|
||||
this.lambda = lambda;
|
||||
exponential = true;
|
||||
sourceFile = sourceFileName;
|
||||
int numSections = populations.length;
|
||||
int section = 0;
|
||||
double m;
|
||||
@@ -74,9 +95,8 @@ public class Plate {
|
||||
}
|
||||
}
|
||||
|
||||
public void fillWells(String sourceFileName, List<Integer[]> cells, double stdDev) {
|
||||
private void fillWells( List<Integer[]> cells, double stdDev) {
|
||||
this.stdDev = stdDev;
|
||||
sourceFile = sourceFileName;
|
||||
int numSections = populations.length;
|
||||
int section = 0;
|
||||
double m;
|
||||
@@ -159,4 +179,6 @@ public class Plate {
|
||||
public String getSourceFileName() {
|
||||
return sourceFile;
|
||||
}
|
||||
|
||||
public String getFilename() { return filename; }
|
||||
}
|
||||
|
||||
@@ -56,11 +56,8 @@ public class PlateFileReader {

}

public List<List<Integer[]>> getWells() {
return wells;
public Plate getSamplePlate() {
return new Plate(filename, wells);
}

public String getFilename() {
return filename;
}
}
|
||||
@@ -23,33 +23,10 @@ public class Simulator implements GraphModificationFunctions {
    private static final int cdr1AlphaIndex = 2;
    private static final int cdr1BetaIndex = 3;

    public static CellSample generateCellSample(Integer numDistinctCells, Integer cdr1Freq) {
        //In real T cells, CDR1s have about one third the diversity of CDR3s
        List<Integer> numbersCDR3 = new ArrayList<>();
        List<Integer> numbersCDR1 = new ArrayList<>();
        Integer numDistCDR3s = 2 * numDistinctCells + 1;
        IntStream.range(1, numDistCDR3s + 1).forEach(i -> numbersCDR3.add(i));
        IntStream.range(numDistCDR3s + 1, numDistCDR3s + 1 + (numDistCDR3s / cdr1Freq) + 1).forEach(i -> numbersCDR1.add(i));
        Collections.shuffle(numbersCDR3);
        Collections.shuffle(numbersCDR1);

        //Each cell represented by 4 values
        //two CDR3s, and two CDR1s. First two values are CDR3s (alpha, beta), second two are CDR1s (alpha, beta)
        List<Integer[]> distinctCells = new ArrayList<>();
        for(int i = 0; i < numbersCDR3.size() - 1; i = i + 2){
            Integer tmpCDR3a = numbersCDR3.get(i);
            Integer tmpCDR3b = numbersCDR3.get(i+1);
            Integer tmpCDR1a = numbersCDR1.get(i % numbersCDR1.size());
            Integer tmpCDR1b = numbersCDR1.get((i+1) % numbersCDR1.size());
            Integer[] tmp = {tmpCDR3a, tmpCDR3b, tmpCDR1a, tmpCDR1b};
            distinctCells.add(tmp);
        }
        return new CellSample(distinctCells, cdr1Freq);
    }

    //Make the graph needed for matching CDR3s
    public static GraphWithMapData makeGraph(List<Integer[]> distinctCells, Plate samplePlate, boolean verbose) {
    public static GraphWithMapData makeGraph(CellSample cellSample, Plate samplePlate, boolean verbose) {
        Instant start = Instant.now();
        List<Integer[]> distinctCells = cellSample.getCells();
        int[] alphaIndex = {cdr3AlphaIndex};
        int[] betaIndex = {cdr3BetaIndex};
@@ -137,7 +114,7 @@ public class Simulator implements GraphModificationFunctions {
                distCellsMapAlphaKey, plateVtoAMap, plateVtoBMap, plateAtoVMap,
                plateBtoVMap, alphaWellCounts, betaWellCounts, time);
        //Set source file name in graph to name of sample plate
        output.setSourceFilename(samplePlate.getSourceFileName());
        output.setSourceFilename(samplePlate.getFilename());
        //return GraphWithMapData object
        return output;
    }
@@ -260,6 +237,7 @@ public class Simulator implements GraphModificationFunctions {
        }

        //Metadata comments for CSV file
        String algoType = "LEDA book with heap: " + heapType;
        int min = Math.min(alphaCount, betaCount);
        //rate of attempted matching
        double attemptRate = (double) (trueCount + falseCount) / min;
@@ -290,6 +268,7 @@ public class Simulator implements GraphModificationFunctions {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("sample plate filename", data.getSourceFilename());
        metadata.put("graph filename", dataFilename);
        metadata.put("algorithm type", algoType);
        metadata.put("well populations", wellPopulationsString);
        metadata.put("total alphas found", alphaCount.toString());
        metadata.put("total betas found", betaCount.toString());
@@ -301,7 +280,7 @@ public class Simulator implements GraphModificationFunctions {
        metadata.put("correct pairing count", Integer.toString(trueCount));
        metadata.put("incorrect pairing count", Integer.toString(falseCount));
        metadata.put("pairing error rate", pairingErrorRateTrunc.toString());
        metadata.put("simulation time", nf.format(time.toSeconds()));
        metadata.put("simulation time (seconds)", nf.format(time.toSeconds()));
        //create MatchingResult object
        MatchingResult output = new MatchingResult(metadata, header, allResults, matchMap, time);
        if(verbose){
@@ -690,7 +669,7 @@ public class Simulator implements GraphModificationFunctions {

    private static Map<Integer, Integer> makeVertexToSequenceMap(Map<Integer, Integer> sequences, Integer startValue) {
        Map<Integer, Integer> map = new LinkedHashMap<>(); //LinkedHashMap to preserve order of entry
        Integer index = startValue;
        Integer index = startValue; //is this necessary? I don't think I use this.
        for (Integer k: sequences.keySet()) {
            map.put(index, k);
            index++;
@@ -1,14 +1,20 @@


public class Vertex {
    private final Integer peptide;
    private final Integer vertexLabel;
    private final Integer sequence;
    private final Integer occupancy;

    public Vertex(Integer peptide, Integer occupancy) {
        this.peptide = peptide;
    public Vertex(Integer vertexLabel, Integer sequence, Integer occupancy) {
        this.vertexLabel = vertexLabel;
        this.sequence = sequence;
        this.occupancy = occupancy;
    }

    public Integer getPeptide() {
        return peptide;
    public Integer getVertexLabel() { return vertexLabel; }

    public Integer getSequence() {
        return sequence;
    }

    public Integer getOccupancy() {