21 Commits

Author SHA1 Message Date
6f5afbc6ec Update readme with CLI arguments 2022-02-27 17:01:12 -06:00
fb4d22e7a4 Update readme with CLI arguments 2022-02-27 17:00:54 -06:00
e10350c214 Update readme with CLI arguments 2022-02-27 16:56:58 -06:00
b1155f8100 Format -help CLI option 2022-02-27 16:53:46 -06:00
12b003a69f Add -help CLI option 2022-02-27 16:45:30 -06:00
32c5bcaaff Deactivate file I/O announcement for CLI 2022-02-27 16:16:24 -06:00
2485ac4cf6 Add getters to MatchingResult 2022-02-27 16:15:26 -06:00
05556bce0c Add units to metadata 2022-02-27 16:08:59 -06:00
a822f69ea4 Control verbose output 2022-02-27 16:07:17 -06:00
3d1f8668ee Control verbose output 2022-02-27 16:03:57 -06:00
40c743308b Initialize wells 2022-02-27 15:54:47 -06:00
5246cc4a0c Re-implement command line options 2022-02-27 15:35:07 -06:00
a5f7c0641d Refactor for better encapsulation with CellSamples 2022-02-27 14:51:53 -06:00
8ebfc1469f Refactor plate to fill its own wells in its constructor 2022-02-27 14:25:53 -06:00
b53f5f1cc0 Refactor plate to fill its own wells in its constructor 2022-02-27 14:17:16 -06:00
974d2d650c Refactor plate to fill its own wells in its constructor 2022-02-27 14:17:11 -06:00
6b5837e6ce Add Vose's alias method to to-dos 2022-02-27 11:46:11 -06:00
b4cc240048 Update Readme 2022-02-26 11:03:31 -06:00
ff72c9b359 Update Readme 2022-02-26 11:02:23 -06:00
88eb8aca50 Update Readme 2022-02-26 11:01:44 -06:00
98bf452891 Update Readme 2022-02-26 11:01:20 -06:00
10 changed files with 456 additions and 303 deletions


@@ -48,8 +48,13 @@ For example, to run the program with 32 gigabytes of memory, use the command:
`java -Xmx32G -jar BiGpairSEQ_Sim.jar` `java -Xmx32G -jar BiGpairSEQ_Sim.jar`
Once running, BiGpairSEQ_Sim has an interactive, menu-driven CLI for generating files and simulating TCR pairing. The There are a number of command line options, to allow the program to be used in shell scripts. For a full list,
main menu looks like this: use the -help flag:
`java -jar BiGpairSEQ_Sim.jar -help`
If no command line arguments are given, BiGpairSEQ_Sim will launch with an interactive, menu-driven CLI for
generating files and simulating TCR pairing. The main menu looks like this:
``` ```
--------BiGPairSEQ SIMULATOR-------- --------BiGPairSEQ SIMULATOR--------
@@ -78,6 +83,7 @@ By default, the Options menu looks like this:
0) Return to main menu 0) Return to main menu
``` ```
### INPUT/OUTPUT ### INPUT/OUTPUT
To run the simulation, the program reads and writes 4 kinds of files: To run the simulation, the program reads and writes 4 kinds of files:
@@ -105,6 +111,7 @@ The program's caching behavior can be controlled in the Options menu. By default
The program can optionally output Graph/Data files in .GraphML format (.graphml) for data portability. This can be The program can optionally output Graph/Data files in .GraphML format (.graphml) for data portability. This can be
turned on in the Options menu. By default, GraphML output is OFF. turned on in the Options menu. By default, GraphML output is OFF.
---
#### Cell Sample Files #### Cell Sample Files
Cell Sample files consist of any number of distinct "T cells." Every cell contains Cell Sample files consist of any number of distinct "T cells." Every cell contains
four sequences: Alpha CDR3, Beta CDR3, Alpha CDR1, Beta CDR1. The sequences are represented by four sequences: Alpha CDR3, Beta CDR3, Alpha CDR1, Beta CDR1. The sequences are represented by
@@ -122,7 +129,6 @@ Comments are preceded by `#`
Structure: Structure:
---
# Sample contains 1 unique CDR1 for every 4 unique CDR3s. # Sample contains 1 unique CDR1 for every 4 unique CDR3s.
| Alpha CDR3 | Beta CDR3 | Alpha CDR1 | Beta CDR1 | | Alpha CDR3 | Beta CDR3 | Alpha CDR1 | Beta CDR1 |
|---|---|---|---| |---|---|---|---|
@@ -167,7 +173,6 @@ Dropout sequences are replaced with the value `-1`. Comments are preceded by `#`
Structure: Structure:
---
``` ```
# Cell source file name: # Cell source file name:
# Each row represents one well on the plate # Each row represents one well on the plate
@@ -223,7 +228,6 @@ Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
Example output: Example output:
---
``` ```
# Source Sample Plate file: 4MilCellsPlate.csv # Source Sample Plate file: 4MilCellsPlate.csv
# Source Graph and Data file: 4MilCellsPlateGraph.ser # Source Graph and Data file: 4MilCellsPlateGraph.ser
@@ -292,11 +296,12 @@ slightly less time than the simulation itself. Real elapsed time from start to f
* ~~Enable GraphML output in addition to serialized object binaries, for data portability~~ DONE * ~~Enable GraphML output in addition to serialized object binaries, for data portability~~ DONE
* ~~Custom vertex type with attribute for sequence occupancy?~~ ABANDONED * ~~Custom vertex type with attribute for sequence occupancy?~~ ABANDONED
* Have a branch where this is implemented, but there's a bug that broke matching. Don't currently have time to fix. * Have a branch where this is implemented, but there's a bug that broke matching. Don't currently have time to fix.
* Re-implement command line arguments, to enable scripting and statistical simulation studies * ~~Re-implement command line arguments, to enable scripting and statistical simulation studies~~ DONE
* Re-implement CDR1 matching method * Re-implement CDR1 matching method
* Implement Duan and Su's maximum weight matching algorithm * Implement Duan and Su's maximum weight matching algorithm
* Add controllable algorithm-type parameter? * Add controllable algorithm-type parameter?
* This would be fun and valuable, but probably take more time than I have for a hobby project. * This would be fun and valuable, but probably take more time than I have for a hobby project.
* Implement Vose's alias method for arbitrary statistical distributions of cells
## CITATIONS ## CITATIONS
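The to-do list in this README hunk adds Vose's alias method as a way to sample cells from arbitrary statistical distributions. For reference, a minimal sketch of that method follows. The class `AliasSampler` and its method names are hypothetical and do not appear in BiGpairSEQ_Sim; the weights are assumed to be non-negative with a positive sum.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Hypothetical helper (not part of BiGpairSEQ_Sim): draws indices 0..n-1
// with probability proportional to the given weights, in O(1) per draw.
final class AliasSampler {
    private final double[] prob; // chance of keeping the column that was picked
    private final int[] alias;   // index returned when the column is not kept
    private final Random rand;

    AliasSampler(double[] weights, Random rand) {
        int n = weights.length;
        double total = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            total += w;
        }
        if (n == 0 || total <= 0.0) {
            throw new IllegalArgumentException("need at least one positive weight");
        }
        this.prob = new double[n];
        this.alias = new int[n];
        this.rand = rand;

        // Scale the weights so that the average column height is exactly 1.
        double[] scaled = new double[n];
        Deque<Integer> small = new ArrayDeque<>();
        Deque<Integer> large = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            scaled[i] = weights[i] * n / total;
            if (scaled[i] < 1.0) small.push(i); else large.push(i);
        }
        // Top up each short column with mass taken from a tall one.
        while (!small.isEmpty() && !large.isEmpty()) {
            int s = small.pop();
            int l = large.pop();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] = (scaled[l] + scaled[s]) - 1.0;
            if (scaled[l] < 1.0) small.push(l); else large.push(l);
        }
        // Whatever is left is a full column, up to floating-point rounding.
        while (!large.isEmpty()) prob[large.pop()] = 1.0;
        while (!small.isEmpty()) prob[small.pop()] = 1.0;
    }

    // Picks a column uniformly, then keeps it or falls through to its alias.
    int next() {
        int column = rand.nextInt(prob.length);
        return rand.nextDouble() < prob[column] ? column : alias[column];
    }

    public static void main(String[] args) {
        AliasSampler sampler = new AliasSampler(new double[] {1.0, 3.0}, new Random());
        int[] counts = new int[2];
        for (int i = 0; i < 10_000; i++) {
            counts[sampler.next()]++;
        }
        System.out.println("index 0: " + counts[0] + ", index 1: " + counts[1]);
    }
}
```

Setup costs O(n) and each draw costs O(1), which is why the method suits sampling millions of cells from a fixed distribution. How it would plug into the plate-filling code is left open here.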


@@ -23,8 +23,8 @@ public class BiGpairSEQ {
} }
else { else {
//This will be uncommented when command line arguments are re-implemented. //This will be uncommented when command line arguments are re-implemented.
//CommandLineInterface.startCLI(args); CommandLineInterface.startCLI(args);
System.out.println("Command line arguments are still being re-implemented."); //System.out.println("Command line arguments are still being re-implemented.");
} }
} }
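With this change, arguments are handed to `CommandLineInterface.startCLI(args)`, which parses only the first argument to pick a mode and then parses the remaining arguments against that mode's own option set. The sketch below illustrates that two-stage split using only the standard library. It is a simplification under stated assumptions: `ModeDispatchSketch`, `selectMode`, and `modeArgs` are hypothetical names, only the short mode flags (`-help`, `-cells`, `-plate`, `-graph`, `-match`) are recognized, and the real code delegates parsing to Apache Commons CLI.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the mode dispatch; not the project's code.
final class ModeDispatchSketch {
    // Short mode flags taken from the diff, without the leading dash.
    static final List<String> MODES = List.of("help", "cells", "plate", "graph", "match");

    // No arguments means the interactive menu; otherwise the first argument names the mode.
    static String selectMode(String[] args) {
        if (args.length == 0) {
            return "interactive";
        }
        String first = args[0];
        String name = first.startsWith("-") ? first.substring(1) : first;
        if (!MODES.contains(name)) {
            throw new IllegalArgumentException("Unknown mode: " + first);
        }
        return name;
    }

    // Everything after the mode flag belongs to the mode-specific option set.
    static String[] modeArgs(String[] args) {
        if (args.length <= 1) {
            return new String[0];
        }
        return Arrays.copyOfRange(args, 1, args.length);
    }

    public static void main(String[] args) {
        System.out.println("mode: " + selectMode(args));
        System.out.println("mode arguments: " + Arrays.toString(modeArgs(args)));
    }
}
```

Splitting the argument vector this way lets each mode declare its own required options without those requirements leaking into the other modes.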


@@ -1,5 +1,9 @@
import org.apache.commons.cli.*; import org.apache.commons.cli.*;
import java.io.IOException;
import java.util.Arrays;
import java.util.stream.Stream;
/* /*
* Class for parsing options passed to program from command line * Class for parsing options passed to program from command line
* *
@@ -29,6 +33,8 @@ import org.apache.commons.cli.*;
* cellfile : name of the cell sample file to use as input * cellfile : name of the cell sample file to use as input
* platefile : name of the sample plate file to use as input * platefile : name of the sample plate file to use as input
* output : name of the output file * output : name of the output file
* graphml : output a graphml file
* binary : output a serialized binary object file
* *
* Match flags: * Match flags:
* graphFile : name of graph and data file to use as input * graphFile : name of graph and data file to use as input
@@ -43,246 +49,372 @@ import org.apache.commons.cli.*;
public class CommandLineInterface { public class CommandLineInterface {
public static void startCLI(String[] args) { public static void startCLI(String[] args) {
//These command line options are a big mess //Options sets for the different modes
//Really, I don't think command line tools are expected to work in this many different modes Options mainOptions = buildMainOptions();
//making cells, making plates, and matching are the sort of thing that UNIX philosophy would say Options cellOptions = buildCellOptions();
//should be three separate programs. Options plateOptions = buildPlateOptions();
//There might be a way to do it with option parameters? Options graphOptions = buildGraphOptions();
Options matchOptions = buildMatchCDR3options();
//main options set CommandLineParser parser = new DefaultParser();
try{
CommandLine line = parser.parse(mainOptions, Arrays.copyOfRange(args, 0, 1));
if (line.hasOption("help")) {
HelpFormatter formatter = new HelpFormatter();
formatter.printHelp("BiGpairSEQ_Sim", mainOptions);
System.out.println();
formatter.printHelp("BiGpairSEQ_Sim -cells", cellOptions);
System.out.println();
formatter.printHelp("BiGpairSEQ_Sim -plate", plateOptions);
System.out.println();
formatter.printHelp("BiGpairSEQ_Sim -graph", graphOptions);
System.out.println();
formatter.printHelp("BiGpairSEQ_Sim -match", matchOptions);
}
else if (line.hasOption("cells")) {
line = parser.parse(cellOptions, Arrays.copyOfRange(args, 1, args.length));
Integer number = Integer.valueOf(line.getOptionValue("n"));
Integer diversity = Integer.valueOf(line.getOptionValue("d"));
String filename = line.getOptionValue("o");
makeCells(filename, number, diversity);
}
else if (line.hasOption("plate")) {
line = parser.parse(plateOptions, Arrays.copyOfRange(args, 1, args.length));
//get the cells
String cellFilename = line.getOptionValue("c");
CellSample cells = getCells(cellFilename);
//get the rest of the parameters
Integer[] populations;
String outputFilename = line.getOptionValue("o");
Integer numWells = Integer.parseInt(line.getOptionValue("w"));
Double dropoutRate = Double.parseDouble(line.getOptionValue("err"));
if (line.hasOption("random")) {
//Array holding values of minimum and maximum populations
Integer[] min_max = Stream.of(line.getOptionValues("random"))
.mapToInt(Integer::parseInt)
.boxed()
.toArray(Integer[]::new);
populations = BiGpairSEQ.getRand().ints(min_max[0], min_max[1] + 1)
.limit(numWells)
.boxed()
.toArray(Integer[]::new);
}
else if (line.hasOption("pop")) {
populations = Stream.of(line.getOptionValues("pop"))
.mapToInt(Integer::parseInt)
.boxed()
.toArray(Integer[]::new);
}
else{
populations = new Integer[1];
populations[0] = 1;
}
//make the plate
Plate plate;
if (line.hasOption("poisson")) {
Double stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson
plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, stdDev, false);
}
else if (line.hasOption("gaussian")) {
Double stdDev = Double.parseDouble(line.getOptionValue("stddev"));
plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, stdDev, false);
}
else {
assert line.hasOption("exponential");
Double lambda = Double.parseDouble(line.getOptionValue("lambda"));
plate = new Plate(cells, cellFilename, numWells, populations, dropoutRate, lambda, true);
}
PlateFileWriter writer = new PlateFileWriter(outputFilename, plate);
writer.writePlateFile();
}
else if (line.hasOption("graph")) { //Making a graph
line = parser.parse(graphOptions, Arrays.copyOfRange(args, 1, args.length));
String cellFilename = line.getOptionValue("c");
String plateFilename = line.getOptionValue("p");
String outputFilename = line.getOptionValue("o");
//get cells
CellSample cells = getCells(cellFilename);
//get plate
Plate plate = getPlate(plateFilename);
GraphWithMapData graph = Simulator.makeGraph(cells, plate, false);
if (!line.hasOption("no-binary")) { //output binary file unless told not to
GraphDataObjectWriter writer = new GraphDataObjectWriter(outputFilename, graph, false);
writer.writeDataToFile();
}
if (line.hasOption("graphml")) { //if told to, output graphml file
GraphMLFileWriter gmlwriter = new GraphMLFileWriter(outputFilename, graph);
gmlwriter.writeGraphToFile();
}
}
else if (line.hasOption("match")) { //can add a flag for which match type in future, split this in two
line = parser.parse(matchOptions, Arrays.copyOfRange(args, 1, args.length));
String graphFilename = line.getOptionValue("g");
String outputFilename = line.getOptionValue("o");
Integer minThreshold = Integer.parseInt(line.getOptionValue("min"));
Integer maxThreshold = Integer.parseInt(line.getOptionValue("max"));
Integer minOverlapPct;
if (line.hasOption("minpct")) { //see if this filter is being used
minOverlapPct = Integer.parseInt(line.getOptionValue("minpct"));
}
else {
minOverlapPct = 0;
}
Integer maxOccupancyDiff;
if (line.hasOption("maxdiff")) { //see if this filter is being used
maxOccupancyDiff = Integer.parseInt(line.getOptionValue("maxdiff"));
}
else {
maxOccupancyDiff = Integer.MAX_VALUE;
}
GraphWithMapData graph = getGraph(graphFilename);
MatchingResult result = Simulator.matchCDR3s(graph, graphFilename, minThreshold, maxThreshold,
maxOccupancyDiff, minOverlapPct, false);
MatchingFileWriter writer = new MatchingFileWriter(outputFilename, result);
writer.writeResultsToFile();
//can put a bunch of ifs for outputting various things from the MatchingResult to System.out here
//after I put those flags in the matchOptions
}
}
catch (ParseException exp) {
System.err.println("Parsing failed. Reason: " + exp.getMessage());
}
}
private static Option outputFileOption() {
Option outputFile = Option.builder("o")
.longOpt("output-file")
.hasArg()
.argName("filename")
.desc("Name of output file")
.required()
.build();
return outputFile;
}
private static Options buildMainOptions() {
Options mainOptions = new Options(); Options mainOptions = new Options();
Option help = Option.builder("help")
.desc("Displays this help menu")
.build();
Option makeCells = Option.builder("cells") Option makeCells = Option.builder("cells")
.longOpt("make-cells") .longOpt("make-cells")
.desc("Makes a file of distinct cells") .desc("Makes a cell sample file of distinct T cells")
.build(); .build();
Option makePlate = Option.builder("plates") Option makePlate = Option.builder("plate")
.longOpt("make-plates") .longOpt("make-plate")
.desc("Makes a sample plate file") .desc("Makes a sample plate file. Requires a cell sample file.")
.build(); .build();
Option makeGraph = Option.builder("graph") Option makeGraph = Option.builder("graph")
.longOpt("make-graph") .longOpt("make-graph")
.desc("Makes a graph and data file") .desc("Makes a graph/data file. Requires a cell sample file and a sample plate file")
.build(); .build();
Option matchCDR3 = Option.builder("match") Option matchCDR3 = Option.builder("match")
.longOpt("match-cdr3") .longOpt("match-cdr3")
.desc("Match CDR3s. Requires a cell sample file and any number of plate files.") .desc("Matches CDR3s. Requires a graph/data file.")
.build(); .build();
OptionGroup mainGroup = new OptionGroup(); OptionGroup mainGroup = new OptionGroup();
mainGroup.addOption(help);
mainGroup.addOption(makeCells); mainGroup.addOption(makeCells);
mainGroup.addOption(makePlate); mainGroup.addOption(makePlate);
mainGroup.addOption(makeGraph); mainGroup.addOption(makeGraph);
mainGroup.addOption(matchCDR3); mainGroup.addOption(matchCDR3);
mainGroup.setRequired(true); mainGroup.setRequired(true);
mainOptions.addOptionGroup(mainGroup); mainOptions.addOptionGroup(mainGroup);
return mainOptions;
}
//Reuse clones of this for other options groups, rather than making it lots of times private static Options buildCellOptions() {
Option outputFile = Option.builder("o") Options cellOptions = new Options();
.longOpt("output-file") Option numCells = Option.builder("n")
.hasArg()
.argName("filename")
.desc("Name of output file")
.build();
mainOptions.addOption(outputFile);
//Options cellOptions = new Options();
Option numCells = Option.builder("nc")
.longOpt("num-cells") .longOpt("num-cells")
.desc("The number of distinct cells to generate") .desc("The number of distinct cells to generate")
.hasArg() .hasArg()
.argName("number") .argName("number")
.build(); .required().build();
mainOptions.addOption(numCells); Option cdr3Diversity = Option.builder("d")
Option cdr1Freq = Option.builder("d") .longOpt("diversity-factor")
.longOpt("peptide-diversity-factor") .desc("The factor by which unique CDR3s outnumber unique CDR1s")
.hasArg() .hasArg()
.argName("number") .argName("factor")
.desc("Number of distinct CDR3s for every CDR1") .required().build();
.build(); cellOptions.addOption(numCells);
mainOptions.addOption(cdr1Freq); cellOptions.addOption(cdr3Diversity);
//Option cellOutput = (Option) outputFile.clone(); cellOptions.addOption(outputFileOption());
//cellOutput.setRequired(true); return cellOptions;
//mainOptions.addOption(cellOutput); }
//Options plateOptions = new Options(); private static Options buildPlateOptions() {
Option inputCells = Option.builder("c") Options plateOptions = new Options();
Option cellFile = Option.builder("c") // add this to plate options
.longOpt("cell-file") .longOpt("cell-file")
.desc("The cell sample file to use")
.hasArg() .hasArg()
.argName("file") .argName("filename")
.desc("The cell sample file used for filling wells") .required().build();
.build(); Option numWells = Option.builder("w")// add this to plate options
mainOptions.addOption(inputCells); .longOpt("wells")
Option numWells = Option.builder("w") .desc("The number of wells on the sample plate")
.longOpt("num-wells")
.hasArg() .hasArg()
.argName("number") .argName("number")
.desc("The number of wells on each plate") .required().build();
//options group for choosing which distribution to use
OptionGroup distributions = new OptionGroup();// add this to plate options
distributions.setRequired(true);
Option poisson = Option.builder("poisson")
.desc("Use a Poisson distribution for cell sample")
.build(); .build();
mainOptions.addOption(numWells); Option gaussian = Option.builder("gaussian")
Option numPlates = Option.builder("np") .desc("Use a Gaussian distribution for cell sample")
.longOpt("num-plates") .build();
Option exponential = Option.builder("exponential")
.desc("Use an exponential distribution for cell sample")
.build();
distributions.addOption(poisson);
distributions.addOption(gaussian);
distributions.addOption(exponential);
//options group for statistical distribution parameters
OptionGroup statParams = new OptionGroup();// add this to plate options
Option stdDev = Option.builder("stddev")
.desc("If using -gaussian flag, standard deviation for distribution")
.hasArg() .hasArg()
.argName("number") .argName("value")
.desc("The number of plate files to output")
.build(); .build();
mainOptions.addOption(numPlates); Option lambda = Option.builder("lambda")
//Option plateOutput = (Option) outputFile.clone(); .desc("If using -exponential flag, lambda value for distribution")
//plateOutput.setRequired(true);
//plateOutput.setDescription("Prefix for plate output filenames");
//mainOptions.addOption(plateOutput);
Option plateErr = Option.builder("err")
.longOpt("drop-out-rate")
.hasArg() .hasArg()
.argName("number") .argName("value")
.desc("Well drop-out rate. (Probability between 0 and 1)")
.build(); .build();
mainOptions.addOption(plateErr); statParams.addOption(stdDev);
Option plateConcentrations = Option.builder("t") statParams.addOption(lambda);
.longOpt("t-cells-per-well") //Option group for random plate or set populations
OptionGroup wellPopOptions = new OptionGroup(); // add this to plate options
wellPopOptions.setRequired(true);
Option randomWellPopulations = Option.builder("random")
.desc("Randomize well populations on sample plate. Takes two arguments: the minimum possible population and the maximum possible population.")
.hasArgs() .hasArgs()
.argName("number 1, number 2, ...") .numberOfArgs(2)
.desc("Number of T cells per well for each plate section") .argName("minimum maximum")
.build(); .build();
mainOptions.addOption(plateConcentrations); Option specificWellPopulations = Option.builder("pop")
.desc("The well populations for each section of the sample plate. There will be as many sections as there are populations given.")
//different distributions, mutually exclusive
OptionGroup plateDistributions = new OptionGroup();
Option plateExp = Option.builder("exponential")
.desc("Sample from distinct cells with exponential frequency distribution")
.build();
plateDistributions.addOption(plateExp);
Option plateGaussian = Option.builder("gaussian")
.desc("Sample from distinct cells with gaussian frequency distribution")
.build();
plateDistributions.addOption(plateGaussian);
Option platePoisson = Option.builder("poisson")
.desc("Sample from distinct cells with poisson frequency distribution")
.build();
plateDistributions.addOption(platePoisson);
mainOptions.addOptionGroup(plateDistributions);
Option plateStdDev = Option.builder("stddev")
.desc("Standard deviation for gaussian distribution")
.hasArg()
.argName("number")
.build();
mainOptions.addOption(plateStdDev);
Option plateLambda = Option.builder("lambda")
.desc("Lambda for exponential distribution")
.hasArg()
.argName("number")
.build();
mainOptions.addOption(plateLambda);
//
// String cellFile, String filename, Double stdDev,
// Integer numWells, Integer numSections,
// Integer[] concentrations, Double dropOutRate
//
//Options matchOptions = new Options();
inputCells.setDescription("The cell sample file to be used for matching.");
mainOptions.addOption(inputCells);
Option lowThresh = Option.builder("low")
.longOpt("low-threshold")
.hasArg()
.argName("number")
.desc("Sets the minimum occupancy overlap to attempt matching")
.build();
mainOptions.addOption(lowThresh);
Option highThresh = Option.builder("high")
.longOpt("high-threshold")
.hasArg()
.argName("number")
.desc("Sets the maximum occupancy overlap to attempt matching")
.build();
mainOptions.addOption(highThresh);
Option occDiff = Option.builder("occdiff")
.longOpt("occupancy-difference")
.hasArg()
.argName("Number")
.desc("Maximum difference in alpha/beta occupancy to attempt matching")
.build();
mainOptions.addOption(occDiff);
Option overlapPer = Option.builder("ovper")
.longOpt("overlap-percent")
.hasArg()
.argName("Percent")
.desc("Minimum overlap percent to attempt matching (0 -100)")
.build();
mainOptions.addOption(overlapPer);
Option inputPlates = Option.builder("p")
.longOpt("plate-files")
.hasArgs() .hasArgs()
.desc("Plate files to match") .argName("number [number]...")
.build(); .build();
mainOptions.addOption(inputPlates); Option dropoutRate = Option.builder("err") //add this to plate options
.hasArg()
.desc("The sequence dropout rate due to amplification error. (0.0 - 1.0)")
.argName("rate")
.required()
.build();
wellPopOptions.addOption(randomWellPopulations);
wellPopOptions.addOption(specificWellPopulations);
plateOptions.addOption(cellFile);
plateOptions.addOption(numWells);
plateOptions.addOptionGroup(distributions);
plateOptions.addOptionGroup(statParams);
plateOptions.addOptionGroup(wellPopOptions);
plateOptions.addOption(dropoutRate);
plateOptions.addOption(outputFileOption());
return plateOptions;
}
private static Options buildGraphOptions() {
Options graphOptions = new Options();
Option cellFilename = Option.builder("c")
.longOpt("cell-file")
.desc("Cell sample file to use for checking accuracy")
.hasArg()
.argName("filename")
.required().build();
Option plateFilename = Option.builder("p")
.longOpt("plate-filename")
.desc("Sample plate file (made from given cell sample file) to construct graph from")
.hasArg()
.argName("filename")
.required().build();
Option outputGraphML = Option.builder("graphml")
.desc("Output GraphML file")
.build();
Option outputSerializedBinary = Option.builder("nb")
.longOpt("no-binary")
.desc("Don't output serialized binary file")
.build();
graphOptions.addOption(cellFilename);
graphOptions.addOption(plateFilename);
graphOptions.addOption(outputFileOption());
graphOptions.addOption(outputGraphML);
graphOptions.addOption(outputSerializedBinary);
return graphOptions;
}
private static Options buildMatchCDR3options() {
Options matchCDR3options = new Options();
Option graphFilename = Option.builder("g")
.longOpt("graph-file")
.desc("The graph/data file to use")
.hasArg()
.argName("filename")
.required().build();
Option minOccupancyOverlap = Option.builder("min")
.desc("The minimum number of shared wells to attempt to match a sequence pair")
.hasArg()
.argName("number")
.required().build();
Option maxOccupancyOverlap = Option.builder("max")
.desc("The maximum number of shared wells to attempt to match a sequence pair")
.hasArg()
.argName("number")
.required().build();
Option minOverlapPercent = Option.builder("minpct")
.desc("(Optional) The minimum percentage of a sequence's total occupancy shared by another sequence to attempt matching. (0 - 100) ")
.hasArg()
.argName("percent")
.build();
Option maxOccupancyDifference = Option.builder("maxdiff")
.desc("(Optional) The maximum difference in total occupancy between two sequences to attempt matching.")
.hasArg()
.argName("number")
.build();
matchCDR3options.addOption(graphFilename);
matchCDR3options.addOption(minOccupancyOverlap);
matchCDR3options.addOption(maxOccupancyOverlap);
matchCDR3options.addOption(minOverlapPercent);
matchCDR3options.addOption(maxOccupancyDifference);
matchCDR3options.addOption(outputFileOption());
//options for output to System.out
//Option printPairingErrorRate = Option.builder()
return matchCDR3options;
}
CommandLineParser parser = new DefaultParser(); private static CellSample getCells(String cellFilename) {
assert cellFilename != null;
CellFileReader reader = new CellFileReader(cellFilename);
return reader.getCellSample();
}
private static Plate getPlate(String plateFilename) {
assert plateFilename != null;
PlateFileReader reader = new PlateFileReader(plateFilename);
return reader.getSamplePlate();
}
private static GraphWithMapData getGraph(String graphFilename) {
assert graphFilename != null;
try{ try{
CommandLine line = parser.parse(mainOptions, args); GraphDataObjectReader reader = new GraphDataObjectReader(graphFilename, false);
if(line.hasOption("match")){ return reader.getData();
//line = parser.parse(mainOptions, args);
//String cellFile = line.getOptionValue("c");
String graphFile = line.getOptionValue("g");
Integer lowThreshold = Integer.valueOf(line.getOptionValue(lowThresh));
Integer highThreshold = Integer.valueOf(line.getOptionValue(highThresh));
Integer occupancyDifference = Integer.valueOf(line.getOptionValue(occDiff));
Integer overlapPercent = Integer.valueOf(line.getOptionValue(overlapPer));
for(String plate: line.getOptionValues("p")) {
matchCDR3s(graphFile, lowThreshold, highThreshold, occupancyDifference, overlapPercent);
}
}
else if(line.hasOption("cells")){
//line = parser.parse(mainOptions, args);
String filename = line.getOptionValue("o");
Integer numDistCells = Integer.valueOf(line.getOptionValue("nc"));
Integer freq = Integer.valueOf(line.getOptionValue("d"));
makeCells(filename, numDistCells, freq);
}
else if(line.hasOption("plates")){
//line = parser.parse(mainOptions, args);
String cellFile = line.getOptionValue("c");
String filenamePrefix = line.getOptionValue("o");
Integer numWellsOnPlate = Integer.valueOf(line.getOptionValue("w"));
Integer numPlatesToMake = Integer.valueOf(line.getOptionValue("np"));
String[] concentrationsToUseString = line.getOptionValues("t");
Integer numSections = concentrationsToUseString.length;
Integer[] concentrationsToUse = new Integer[numSections];
for(int i = 0; i <numSections; i++){
concentrationsToUse[i] = Integer.valueOf(concentrationsToUseString[i]);
}
Double dropOutRate = Double.valueOf(line.getOptionValue("err"));
if(line.hasOption("exponential")){
Double lambda = Double.valueOf(line.getOptionValue("lambda"));
for(int i = 1; i <= numPlatesToMake; i++){
makePlateExp(cellFile, filenamePrefix + i, lambda, numWellsOnPlate,
concentrationsToUse,dropOutRate);
}
}
else if(line.hasOption("gaussian")){
Double stdDev = Double.valueOf(line.getOptionValue("std-dev"));
for(int i = 1; i <= numPlatesToMake; i++){
makePlate(cellFile, filenamePrefix + i, stdDev, numWellsOnPlate,
concentrationsToUse,dropOutRate);
}
} }
else if(line.hasOption("poisson")){ catch (IOException ex) {
for(int i = 1; i <= numPlatesToMake; i++){ ex.printStackTrace();
makePlatePoisson(cellFile, filenamePrefix + i, numWellsOnPlate, return null;
concentrationsToUse,dropOutRate);
}
}
}
}
catch (ParseException exp) {
System.err.println("Parsing failed. Reason: " + exp.getMessage());
} }
} }
@@ -292,37 +424,4 @@ public class CommandLineInterface {
CellFileWriter writer = new CellFileWriter(filename, sample); CellFileWriter writer = new CellFileWriter(filename, sample);
writer.writeCellsToFile(); writer.writeCellsToFile();
} }
public static void makePlateExp(String cellFile, String filename, Double lambda,
Integer numWells, Integer[] concentrations, Double dropOutRate){
CellFileReader cellReader = new CellFileReader(cellFile);
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), lambda);
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
writer.writePlateFile();
}
private static void makePlatePoisson(String cellFile, String filename, Integer numWells,
Integer[] concentrations, Double dropOutRate){
CellFileReader cellReader = new CellFileReader(cellFile);
Double stdDev = Math.sqrt(cellReader.getCellCountDEPRECATED());
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
writer.writePlateFile();
}
private static void makePlate(String cellFile, String filename, Double stdDev,
Integer numWells, Integer[] concentrations, Double dropOutRate){
CellFileReader cellReader = new CellFileReader(cellFile);
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
writer.writePlateFile();
}
private static void matchCDR3s(String graphFile, Integer lowThreshold, Integer highThreshold,
Integer occupancyDifference, Integer overlapPercent) {
}
} }
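The `-plate` branch above builds the well populations either from the two values given to `-random` (minimum and maximum) or from the list given to `-pop`. A minimal standalone sketch of those two conversions follows. `WellPopulationSketch` and its method names are hypothetical, and it assumes that `BiGpairSEQ.getRand()` returns a `java.util.Random`, which the diff does not show.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.Stream;

// Hypothetical illustration of how the -random and -pop values become populations.
final class WellPopulationSketch {
    // One population per well, each drawn uniformly from [min, max] inclusive.
    static Integer[] randomPopulations(Random rand, int min, int max, int numWells) {
        return rand.ints(min, max + 1) // the upper bound of ints() is exclusive
                .limit(numWells)
                .boxed()
                .toArray(Integer[]::new);
    }

    // Converts the string values of a multi-argument option into Integers.
    static Integer[] parsePopulations(String[] values) {
        return Stream.of(values)
                .map(Integer::valueOf)
                .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        Integer[] random = randomPopulations(new Random(), 5, 10, 8);
        Integer[] fixed = parsePopulations(new String[] {"10", "100", "1000"});
        System.out.println("random: " + Arrays.toString(random));
        System.out.println("fixed:  " + Arrays.toString(fixed));
    }
}
```

With `-random`, every well gets its own population. With `-pop`, the plate is divided into as many sections as there are values, as the option description states.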


@@ -1,10 +1,12 @@
import java.io.*; import java.io.*;
public class GraphDataObjectReader { public class GraphDataObjectReader {
private GraphWithMapData data; private GraphWithMapData data;
private String filename; private String filename;
private boolean verbose = true;
public GraphDataObjectReader(String filename) throws IOException { public GraphDataObjectReader(String filename, boolean verbose) throws IOException {
if(!filename.matches(".*\\.ser")){ if(!filename.matches(".*\\.ser")){
filename = filename + ".ser"; filename = filename + ".ser";
} }


@@ -1,3 +1,5 @@
import org.jgrapht.Graph;
import java.io.BufferedOutputStream; import java.io.BufferedOutputStream;
import java.io.FileOutputStream; import java.io.FileOutputStream;
import java.io.IOException; import java.io.IOException;
@@ -7,6 +9,7 @@ public class GraphDataObjectWriter {
private GraphWithMapData data; private GraphWithMapData data;
private String filename; private String filename;
private boolean verbose = true;
public GraphDataObjectWriter(String filename, GraphWithMapData data) { public GraphDataObjectWriter(String filename, GraphWithMapData data) {
if(!filename.matches(".*\\.ser")){ if(!filename.matches(".*\\.ser")){
@@ -16,13 +19,24 @@ public class GraphDataObjectWriter {
this.data = data; this.data = data;
} }
public GraphDataObjectWriter(String filename, GraphWithMapData data, boolean verbose) {
this.verbose = verbose;
if(!filename.matches(".*\\.ser")){
filename = filename + ".ser";
}
this.filename = filename;
this.data = data;
}
public void writeDataToFile() { public void writeDataToFile() {
try (BufferedOutputStream bufferedOut = new BufferedOutputStream(new FileOutputStream(filename)); try (BufferedOutputStream bufferedOut = new BufferedOutputStream(new FileOutputStream(filename));
ObjectOutputStream out = new ObjectOutputStream(bufferedOut); ObjectOutputStream out = new ObjectOutputStream(bufferedOut);
){ ){
if(verbose) {
System.out.println("Writing graph and occupancy data to file. This may take some time."); System.out.println("Writing graph and occupancy data to file. This may take some time.");
System.out.println("File I/O time is not included in results."); System.out.println("File I/O time is not included in results.");
}
out.writeObject(data); out.writeObject(data);
} catch (IOException ex) { } catch (IOException ex) {
ex.printStackTrace(); ex.printStackTrace();
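
The hunk above adds a second constructor that repeats the filename handling of the first. One possible way to get the same `verbose` switch without the duplication is constructor chaining. The following is only a sketch under that assumption: `VerboseWriterSketch`, `isVerbose()`, and the use of a plain `Serializable` in place of `GraphWithMapData` are hypothetical and are not part of this commit.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for GraphDataObjectWriter; not the project's class.
class VerboseWriterSketch {
    private final Serializable data;
    private final String filename;
    private final boolean verbose;

    // The two-argument form keeps the old default of verbose output.
    VerboseWriterSketch(String filename, Serializable data) {
        this(filename, data, true);
    }

    VerboseWriterSketch(String filename, Serializable data, boolean verbose) {
        // Append the .ser extension if it is missing.
        this.filename = filename.endsWith(".ser") ? filename : filename + ".ser";
        this.data = data;
        this.verbose = verbose;
    }

    String getFilename() { return filename; }

    boolean isVerbose() { return verbose; }

    void writeDataToFile() {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(filename)))) {
            // Only announce file I/O when the caller asked for it.
            if (verbose) {
                System.out.println("Writing graph and occupancy data to file. This may take some time.");
                System.out.println("File I/O time is not included in results.");
            }
            out.writeObject(data);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
```

With this shape, interactive callers can keep using the two-argument constructor, while a scripted caller would pass `false` to suppress the announcement.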

@@ -227,16 +227,14 @@ public class InteractiveInterface {
Plate samplePlate; Plate samplePlate;
PlateFileWriter writer; PlateFileWriter writer;
if(exponential){ if(exponential){
samplePlate = new Plate(numWells, dropOutRate, populations); samplePlate = new Plate(cells, cellFile, numWells, populations, dropOutRate, lambda, true);
samplePlate.fillWellsExponential(cellFile, cells.getCells(), lambda);
writer = new PlateFileWriter(filename, samplePlate); writer = new PlateFileWriter(filename, samplePlate);
} }
else { else {
if (poisson) { if (poisson) {
stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson
} }
samplePlate = new Plate(numWells, dropOutRate, populations); samplePlate = new Plate(cells, cellFile, numWells, populations, dropOutRate, stdDev, false);
samplePlate.fillWells(cellFile, cells.getCells(), stdDev);
writer = new PlateFileWriter(filename, samplePlate); writer = new PlateFileWriter(filename, samplePlate);
} }
System.out.println("Writing Sample Plate to file"); System.out.println("Writing Sample Plate to file");
@@ -292,7 +290,7 @@ public class InteractiveInterface {
else { else {
System.out.println("Reading Sample Plate file: " + plateFile); System.out.println("Reading Sample Plate file: " + plateFile);
PlateFileReader plateReader = new PlateFileReader(plateFile); PlateFileReader plateReader = new PlateFileReader(plateFile);
plate = new Plate(plateReader.getFilename(), plateReader.getWells()); plate = plateReader.getSamplePlate();
if(BiGpairSEQ.cachePlate()) { if(BiGpairSEQ.cachePlate()) {
BiGpairSEQ.setPlateInMemory(plate, plateFile); BiGpairSEQ.setPlateInMemory(plate, plateFile);
} }
@@ -306,8 +304,7 @@ public class InteractiveInterface {
System.out.println("Returning to main menu."); System.out.println("Returning to main menu.");
} }
else{ else{
List<Integer[]> cells = cellSample.getCells(); GraphWithMapData data = Simulator.makeGraph(cellSample, plate, true);
GraphWithMapData data = Simulator.makeGraph(cells, plate, true);
assert filename != null; assert filename != null;
if(BiGpairSEQ.outputBinary()) { if(BiGpairSEQ.outputBinary()) {
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data); GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
@@ -378,7 +375,7 @@ public class InteractiveInterface {
data = BiGpairSEQ.getGraphInMemory(); data = BiGpairSEQ.getGraphInMemory();
} }
else { else {
GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename); GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename, true);
data = dataReader.getData(); data = dataReader.getData();
if(BiGpairSEQ.cacheGraph()) { if(BiGpairSEQ.cacheGraph()) {
BiGpairSEQ.setGraphInMemory(data, graphFilename); BiGpairSEQ.setGraphInMemory(data, graphFilename);

@@ -21,15 +21,15 @@ public class MatchingResult {
* well populations * * well populations *
* total alphas found * * total alphas found *
* total betas found * * total betas found *
* high overlap threshold * high overlap threshold *
* low overlap threshold * low overlap threshold *
* maximum occupancy difference * maximum occupancy difference *
* minimum overlap percent * minimum overlap percent *
* pairing attempt rate * pairing attempt rate *
* correct pairing count * correct pairing count *
* incorrect pairing count * incorrect pairing count *
* pairing error rate * pairing error rate *
* simulation time * simulation time (seconds)
*/ */
this.metadata = metadata; this.metadata = metadata;
this.comments = new ArrayList<>(); this.comments = new ArrayList<>();
@@ -91,6 +91,22 @@ public class MatchingResult {
return Integer.parseInt(metadata.get("total beta count")); return Integer.parseInt(metadata.get("total beta count"));
} }
//put in the rest of these methods following the same pattern public Integer getHighOverlapThreshold() { return Integer.parseInt(metadata.get("high overlap threshold"));}
public Integer getLowOverlapThreshold() { return Integer.parseInt(metadata.get("low overlap threshold"));}
public Integer getMaxOccupancyDifference() { return Integer.parseInt(metadata.get("maximum occupancy difference"));}
public Integer getMinOverlapPercent() { return Integer.parseInt(metadata.get("minimum overlap percent"));}
public Double getPairingAttemptRate() { return Double.parseDouble(metadata.get("pairing attempt rate"));}
public Integer getCorrectPairingCount() { return Integer.parseInt(metadata.get("correct pairing count"));}
public Integer getIncorrectPairingCount() { return Integer.parseInt(metadata.get("incorrect pairing count"));}
public Double getPairingErrorRate() { return Double.parseDouble(metadata.get("pairing error rate"));}
public String getSimulationTime() { return metadata.get("simulation time (seconds)"); }
} }
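
The new getters all follow one pattern: look up a string in the metadata map and parse it to the expected type. A reduced sketch of that pattern is shown here; `MetadataGettersSketch` and its constructor are hypothetical, and only three of the keys from the diff are reproduced.

```java
import java.util.Map;

// Hypothetical reduced version of MatchingResult that holds only the metadata map.
class MetadataGettersSketch {
    private final Map<String, String> metadata;

    MetadataGettersSketch(Map<String, String> metadata) {
        this.metadata = metadata;
    }

    // Integer-valued entries are parsed on every call.
    Integer getCorrectPairingCount() {
        return Integer.parseInt(metadata.get("correct pairing count"));
    }

    // Rates are stored as decimal strings.
    Double getPairingErrorRate() {
        return Double.parseDouble(metadata.get("pairing error rate"));
    }

    // The key must match the one written by the producer, units included.
    String getSimulationTime() {
        return metadata.get("simulation time (seconds)");
    }
}
```

Because the lookups are by literal key, the rename to `"simulation time (seconds)"` has to be made both where the map is filled and where it is read, as this commit does. A missing or misspelled key would make the parsing getters throw at call time rather than return a default.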

@@ -8,7 +8,9 @@ TODO: Implement discrete frequency distributions using Vose's Alias Method
import java.util.*; import java.util.*;
public class Plate { public class Plate {
private CellSample cells;
private String sourceFile; private String sourceFile;
private String filename;
private List<List<Integer[]>> wells; private List<List<Integer[]>> wells;
private final Random rand = BiGpairSEQ.getRand(); private final Random rand = BiGpairSEQ.getRand();
private int size; private int size;
@@ -18,6 +20,25 @@ public class Plate {
private double lambda; private double lambda;
boolean exponential = false; boolean exponential = false;
public Plate(CellSample cells, String cellFilename, int numWells, Integer[] populations,
double dropoutRate, double stdDev_or_lambda, boolean exponential){
this.cells = cells;
this.sourceFile = cellFilename;
this.size = numWells;
this.wells = new ArrayList<>();
this.error = dropoutRate;
this.populations = populations;
this.exponential = exponential;
if (this.exponential) {
this.lambda = stdDev_or_lambda;
fillWellsExponential(cells.getCells(), this.lambda);
}
else {
this.stdDev = stdDev_or_lambda;
fillWells(cells.getCells(), this.stdDev);
}
}
public Plate(int size, double error, Integer[] populations) { public Plate(int size, double error, Integer[] populations) {
this.size = size; this.size = size;
@@ -26,8 +47,9 @@ public class Plate {
wells = new ArrayList<>(); wells = new ArrayList<>();
} }
public Plate(String sourceFileName, List<List<Integer[]>> wells) { //constructor for returning a Plate from a PlateFileReader
this.sourceFile = sourceFileName; public Plate(String filename, List<List<Integer[]>> wells) {
this.filename = filename;
this.wells = wells; this.wells = wells;
this.size = wells.size(); this.size = wells.size();
@@ -43,10 +65,9 @@ public class Plate {
} }
} }
public void fillWellsExponential(String sourceFileName, List<Integer[]> cells, double lambda){ private void fillWellsExponential(List<Integer[]> cells, double lambda){
this.lambda = lambda; this.lambda = lambda;
exponential = true; exponential = true;
sourceFile = sourceFileName;
int numSections = populations.length; int numSections = populations.length;
int section = 0; int section = 0;
double m; double m;
@@ -74,9 +95,8 @@ public class Plate {
} }
} }
public void fillWells(String sourceFileName, List<Integer[]> cells, double stdDev) { private void fillWells( List<Integer[]> cells, double stdDev) {
this.stdDev = stdDev; this.stdDev = stdDev;
sourceFile = sourceFileName;
int numSections = populations.length; int numSections = populations.length;
int section = 0; int section = 0;
double m; double m;
@@ -159,4 +179,6 @@ public class Plate {
public String getSourceFileName() { public String getSourceFileName() {
return sourceFile; return sourceFile;
} }
public String getFilename() { return filename; }
} }
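
The refactor above moves well filling into the constructor and makes the fill methods private, so a plate can no longer exist in a half-initialized state. The sketch below illustrates only that structure. `PlateSketch` is hypothetical, cells are reduced to plain integers, and the round-robin dealing is a placeholder assumption that stands in for the project's actual sampling from a Gaussian or exponential distribution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for Plate.
class PlateSketch {
    private final List<List<Integer>> wells = new ArrayList<>();
    private final boolean exponential;
    private double lambda;
    private double stdDev;

    PlateSketch(List<Integer> cells, int numWells, double stdDevOrLambda, boolean exponential) {
        this.exponential = exponential;
        for (int i = 0; i < numWells; i++) {
            wells.add(new ArrayList<>());
        }
        // The shared parameter is interpreted according to the chosen distribution.
        if (exponential) {
            this.lambda = stdDevOrLambda;
            fillWellsExponential(cells);
        } else {
            this.stdDev = stdDevOrLambda;
            fillWells(cells);
        }
    }

    // Placeholder: the real method would sample cells using lambda.
    private void fillWellsExponential(List<Integer> cells) {
        dealRoundRobin(cells);
    }

    // Placeholder: the real method would sample cells using stdDev.
    private void fillWells(List<Integer> cells) {
        dealRoundRobin(cells);
    }

    // Assigns cell i to well (i mod number of wells); not the project's algorithm.
    private void dealRoundRobin(List<Integer> cells) {
        for (int i = 0; i < cells.size(); i++) {
            wells.get(i % wells.size()).add(cells.get(i));
        }
    }

    List<List<Integer>> getWells() { return wells; }

    boolean isExponential() { return exponential; }
}
```

Callers then need a single expression to obtain a filled plate, which is the change made in `InteractiveInterface`, where the separate `fillWells` and `fillWellsExponential` calls were dropped.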

@@ -56,11 +56,8 @@ public class PlateFileReader {
} }
public List<List<Integer[]>> getWells() { public Plate getSamplePlate() {
return wells; return new Plate(filename, wells);
} }
public String getFilename() {
return filename;
}
} }

@@ -24,8 +24,9 @@ public class Simulator implements GraphModificationFunctions {
private static final int cdr1BetaIndex = 3; private static final int cdr1BetaIndex = 3;
//Make the graph needed for matching CDR3s //Make the graph needed for matching CDR3s
public static GraphWithMapData makeGraph(List<Integer[]> distinctCells, Plate samplePlate, boolean verbose) { public static GraphWithMapData makeGraph(CellSample cellSample, Plate samplePlate, boolean verbose) {
Instant start = Instant.now(); Instant start = Instant.now();
List<Integer[]> distinctCells = cellSample.getCells();
int[] alphaIndex = {cdr3AlphaIndex}; int[] alphaIndex = {cdr3AlphaIndex};
int[] betaIndex = {cdr3BetaIndex}; int[] betaIndex = {cdr3BetaIndex};
@@ -113,7 +114,7 @@ public class Simulator implements GraphModificationFunctions {
distCellsMapAlphaKey, plateVtoAMap, plateVtoBMap, plateAtoVMap, distCellsMapAlphaKey, plateVtoAMap, plateVtoBMap, plateAtoVMap,
plateBtoVMap, alphaWellCounts, betaWellCounts, time); plateBtoVMap, alphaWellCounts, betaWellCounts, time);
//Set source file name in graph to name of sample plate //Set source file name in graph to name of sample plate
output.setSourceFilename(samplePlate.getSourceFileName()); output.setSourceFilename(samplePlate.getFilename());
//return GraphWithMapData object //return GraphWithMapData object
return output; return output;
} }
@@ -279,7 +280,7 @@ public class Simulator implements GraphModificationFunctions {
metadata.put("correct pairing count", Integer.toString(trueCount)); metadata.put("correct pairing count", Integer.toString(trueCount));
metadata.put("incorrect pairing count", Integer.toString(falseCount)); metadata.put("incorrect pairing count", Integer.toString(falseCount));
metadata.put("pairing error rate", pairingErrorRateTrunc.toString()); metadata.put("pairing error rate", pairingErrorRateTrunc.toString());
metadata.put("simulation time", nf.format(time.toSeconds())); metadata.put("simulation time (seconds)", nf.format(time.toSeconds()));
//create MatchingResult object //create MatchingResult object
MatchingResult output = new MatchingResult(metadata, header, allResults, matchMap, time); MatchingResult output = new MatchingResult(metadata, header, allResults, matchMap, time);
if(verbose){ if(verbose){