Compare commits
29 Commits
906c06062f...v1.3
| SHA1 |
|---|
| 2829b88689 |
| 108b0ec13f |
| a8b58d3f79 |
| bf64d57731 |
| c068c3db3c |
| 4bcda9b66c |
| 17ae763c6c |
| decdb147a9 |
| 74ffbfd8ac |
| 08699ce8ce |
| 69b0cc535c |
| e58f7b0a55 |
| dd2164c250 |
| 7323093bdc |
| f904cf6672 |
| 3ccee9891b |
| 40c2be1cfb |
| 4b597c4e5e |
| b2398531a3 |
| 8e9a250890 |
| e2a996c997 |
| a5db89cb0b |
| 1630f9ccba |
| d785aa0da2 |
| a7afeb6119 |
| f8167b0774 |
| 68ee9e4bb6 |
| fd2ec76b71 |
| 875f457a2d |
48
readme.md
@@ -12,7 +12,7 @@ Unlike pairSEQ, which calculates p-values for every TCR alpha/beta overlap and c
|
|||||||
against a null distribution, BiGpairSEQ does not do any statistical calculations
|
against a null distribution, BiGpairSEQ does not do any statistical calculations
|
||||||
directly.
|
directly.
|
||||||
|
|
||||||
BiGpairSEQ creates a [simple bipartite weighted graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
|
BiGpairSEQ creates a [weighted bipartite graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
|
||||||
The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that share a well
|
The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that share a well
|
||||||
are connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
|
are connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
|
||||||
(Sequences present in *all* wells are filtered out prior to creating the graph, as there is no signal in their occupancy pattern.)
|
(Sequences present in *all* wells are filtered out prior to creating the graph, as there is no signal in their occupancy pattern.)
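
As a rough illustration of the graph construction described above, the sketch below counts, for every TCRA/TCRB pair, the number of wells the two sequences share, after dropping sequences that occur in every well. It uses plain `java.util` collections rather than the project's graph class, and the names `OverlapGraphSketch`, `Well`, and `buildEdgeWeights`, as well as the `"alpha:beta"` key format, are assumptions made for this example only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; not the project's implementation.
class OverlapGraphSketch {

    // A well is modeled as two sets of sequence IDs (assumed representation).
    static final class Well {
        final Set<Integer> alphas;
        final Set<Integer> betas;

        Well(Set<Integer> alphas, Set<Integer> betas) {
            this.alphas = alphas;
            this.betas = betas;
        }
    }

    // Returns edge weights keyed by "alpha:beta"; weight = number of shared wells.
    static Map<String, Integer> buildEdgeWeights(List<Well> wells) {
        // Count well occupancy for each sequence.
        Map<Integer, Integer> alphaOcc = new HashMap<>();
        Map<Integer, Integer> betaOcc = new HashMap<>();
        for (Well w : wells) {
            for (Integer a : w.alphas) alphaOcc.merge(a, 1, Integer::sum);
            for (Integer b : w.betas) betaOcc.merge(b, 1, Integer::sum);
        }
        Map<String, Integer> weights = new TreeMap<>();
        for (Well w : wells) {
            for (Integer a : w.alphas) {
                // Sequences present in all wells carry no signal; skip them.
                if (alphaOcc.get(a) == wells.size()) continue;
                for (Integer b : w.betas) {
                    if (betaOcc.get(b) == wells.size()) continue;
                    weights.merge(a + ":" + b, 1, Integer::sum);
                }
            }
        }
        return weights;
    }
}
```

Keying edges by an alpha/beta pair keeps the two vertex sets separate even when an alpha and a beta happen to share the same integer ID.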
|
||||||
@@ -65,20 +65,33 @@ Please select an option:
|
|||||||
2) Generate a sample plate of T cells
|
2) Generate a sample plate of T cells
|
||||||
3) Generate CDR3 alpha/beta occupancy data and overlap graph
|
3) Generate CDR3 alpha/beta occupancy data and overlap graph
|
||||||
4) Simulate bipartite graph CDR3 alpha/beta matching (BiGpairSEQ)
|
4) Simulate bipartite graph CDR3 alpha/beta matching (BiGpairSEQ)
|
||||||
|
8) Options
|
||||||
9) About/Acknowledgments
|
9) About/Acknowledgments
|
||||||
0) Exit
|
0) Exit
|
||||||
```
|
```
|
||||||
|
|
||||||
### OUTPUT
|
### INPUT/OUTPUT
|
||||||
|
|
||||||
To run the simulation, the program reads and writes 4 kinds of files:
|
To run the simulation, the program reads and writes 4 kinds of files:
|
||||||
* Cell Sample files in CSV format
|
* Cell Sample files in CSV format
|
||||||
* Sample Plate files in CSV format
|
* Sample Plate files in CSV format
|
||||||
* Graph and Data files in binary object serialization format
|
* Graph/Data files in binary object serialization format
|
||||||
* Matching Results files in CSV format
|
* Matching Results files in CSV format
|
||||||
|
|
||||||
When entering filenames, it is not necessary to include the file extension (.csv or .ser). When reading or
|
These files are often generated in sequence. When entering filenames, it is not necessary to include the file extension
|
||||||
writing files, the program will automatically add the correct extension to any filename without one.
|
(.csv or .ser). When reading or writing files, the program will automatically add the correct extension to any filename without one.
|
||||||
|
|
||||||
|
To save file I/O time, the most recent instance of each of these four
|
||||||
|
files either generated or read from disk can be cached in program memory. This is especially important for Graph/Data files,
|
||||||
|
which can be several gigabytes in size. Since some simulations may require running multiple,
|
||||||
|
differently-configured BiGpairSEQ matchings on the same graph, keeping the most recent graph cached can reduce execution time.
|
||||||
|
|
||||||
|
A cached data file won't need to be read in again on subsequent uses until another file of that type is used or generated,
|
||||||
|
or caching is turned off for that file type. The program checks whether it needs to update its cached data by comparing
|
||||||
|
filenames as entered by the user. On encountering a new filename, the program flushes its cache and reads in the new file.
|
||||||
|
|
||||||
|
The program's caching behavior can be controlled in the Options menu. By default, caching for cell sample and
|
||||||
|
sample plate files is OFF, and caching for graph/data files is ON.
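
The caching rule described above (keep the most recent file of a type, compare filenames, flush on a new name or when caching is turned off) can be sketched as a single-entry cache. This is a simplified illustration, not the program's code: the class `FileCacheSketch`, its `loader` parameter, and the `loadCount` helper are hypothetical names introduced here.

```java
import java.util.function.Function;

// Hypothetical single-entry cache keyed by filename; names are illustrative.
class FileCacheSketch<T> {
    private String cachedFilename = null;
    private T cachedValue = null;
    private boolean enabled;
    private int loads = 0; // how many times the loader actually ran

    FileCacheSketch(boolean enabled) {
        this.enabled = enabled;
    }

    // Returns the cached object when the filename matches; otherwise reloads.
    T get(String filename, Function<String, T> loader) {
        if (enabled && filename.equals(cachedFilename)) {
            return cachedValue;
        }
        T value = loader.apply(filename);
        loads++;
        if (enabled) {
            // A new filename replaces whatever was cached before.
            cachedFilename = filename;
            cachedValue = value;
        }
        return value;
    }

    // Turning caching off also drops whatever is currently held.
    void setEnabled(boolean enabled) {
        this.enabled = enabled;
        if (!enabled) {
            cachedFilename = null;
            cachedValue = null;
        }
    }

    int loadCount() {
        return loads;
    }
}
```

Because the comparison is on the filename as entered, this kind of cache would not notice a file that changed on disk under the same name.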
|
||||||
|
|
||||||
#### Cell Sample Files
|
#### Cell Sample Files
|
||||||
Cell Sample files consist of any number of distinct "T cells." Every cell contains
|
Cell Sample files consist of any number of distinct "T cells." Every cell contains
|
||||||
@@ -121,7 +134,7 @@ Options when making a Sample Plate file:
|
|||||||
* Standard deviation size
|
* Standard deviation size
|
||||||
* Exponential
|
* Exponential
|
||||||
* Lambda value
|
* Lambda value
|
||||||
* (Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was exponential with a lambda of approximately 0.6. (Howie, et al. 2015))
|
* *(Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was exponential with a lambda of approximately 0.6. (Howie, et al. 2015))*
|
||||||
* Total number of wells on the plate
|
* Total number of wells on the plate
|
||||||
* Number of sections on plate
|
* Number of sections on plate
|
||||||
* Number of T cells per well
|
* Number of T cells per well
|
||||||
@@ -129,7 +142,7 @@ Options when making a Sample Plate file:
|
|||||||
* Dropout rate
|
* Dropout rate
|
||||||
|
|
||||||
Files are in CSV format. There are no header labels. Every row represents a well.
|
Files are in CSV format. There are no header labels. Every row represents a well.
|
||||||
Every column represents an individual cell, containing four sequences, depicted as an array string:
|
Every value represents an individual cell, containing four sequences, depicted as an array string:
|
||||||
`[CDR3A, CDR3B, CDR1A, CDR1B]`. So a representative cell might look like this:
|
`[CDR3A, CDR3B, CDR1A, CDR1B]`. So a representative cell might look like this:
|
||||||
|
|
||||||
`[525902, 791533, -1, 866282]`
|
`[525902, 791533, -1, 866282]`
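
For readers who want to process Sample Plate files outside the program, a cell value in the array-string form shown above could be parsed roughly as follows. This is a minimal sketch under the assumption that every cell value has exactly four comma-separated integers inside square brackets; `CellStringSketch` and `parseCell` are hypothetical names.

```java
// Hypothetical helper for reading one cell value from a Sample Plate CSV field.
class CellStringSketch {

    // Parses "[CDR3A, CDR3B, CDR1A, CDR1B]" into four integers, in that order.
    static int[] parseCell(String text) {
        String s = text.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Not an array string: " + text);
        }
        String[] parts = s.substring(1, s.length() - 1).split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 sequences: " + text);
        }
        int[] cell = new int[4];
        for (int i = 0; i < 4; i++) {
            // Integer.parseInt accepts a leading minus sign, so -1 parses as is.
            cell[i] = Integer.parseInt(parts[i].trim());
        }
        return cell;
    }
}
```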
|
||||||
@@ -155,14 +168,16 @@ Structure:
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
#### Graph and Data Files
|
#### Graph/Data Files
|
||||||
Graph and Data files are serialized binaries of a Java object containing the weigthed bipartite graph representation of a
|
Graph/Data files are serialized binaries of a Java object containing the weighted bipartite graph representation of a
|
||||||
Sample Plate, along with the necessary metadata for matching and results output. Making them requires a Cell Sample file
|
Sample Plate, along with the necessary metadata for matching and results output. Making them requires a Cell Sample file
|
||||||
(to construct a list of correct sequence pairs for checking the accuracy of BiGpairSEQ simulations) and a
|
(to construct a list of correct sequence pairs for checking the accuracy of BiGpairSEQ simulations) and a
|
||||||
Sample Plate file (to construct the associated occupancy graph). These files can be several gigabytes in size.
|
Sample Plate file (to construct the associated occupancy graph).
|
||||||
Writing them to a file lets us generate a graph and its metadata once, then use it for multiple different BiGpairSEQ simulations.
|
|
||||||
|
|
||||||
Options for creating a Graph and Data file:
|
These files can be several gigabytes in size. Writing them to a file lets us generate a graph and its metadata once,
|
||||||
|
then use it for multiple different BiGpairSEQ simulations.
|
||||||
|
|
||||||
|
Options for creating a Graph/Data file:
|
||||||
* The Cell Sample file to use
|
* The Cell Sample file to use
|
||||||
* The Sample Plate file to use. (This must have been generated from the selected Cell Sample file.)
|
* The Sample Plate file to use. (This must have been generated from the selected Cell Sample file.)
|
||||||
|
|
||||||
@@ -172,8 +187,8 @@ portable data format may be implemented in the future. The tricky part is encodi
|
|||||||
---
|
---
|
||||||
|
|
||||||
#### Matching Results Files
|
#### Matching Results Files
|
||||||
Matching results files consist of the results of a BiGpairSEQ matching simulation.
|
Matching results files consist of the results of a BiGpairSEQ matching simulation. Making them requires a Graph and
|
||||||
Files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
|
Data file. Matching results files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
|
||||||
Metadata about the matching simulation is included as comments. Comments are preceded by `#`.
|
Metadata about the matching simulation is included as comments. Comments are preceded by `#`.
|
||||||
|
|
||||||
Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
|
Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
|
||||||
@@ -239,8 +254,9 @@ slightly less time than the simulation itself. Real elapsed time from start to f
|
|||||||
## TODO
|
## TODO
|
||||||
|
|
||||||
* ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE
|
* ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE
|
||||||
* ~~Hold graph data in memory until another graph is read-in?~~ ABANDONED
|
* Hold graph data in memory until another graph is read-in? ~~ABANDONED~~ ~~UNABANDONED~~ DONE
|
||||||
* *No, this won't work, because BiGpairSEQ simulations alter the underlying graph based on filtering constraints. Changes would cascade with multiple experiments.*
|
* ~~*No, this won't work, because BiGpairSEQ simulations alter the underlying graph based on filtering constraints. Changes would cascade with multiple experiments.*~~
|
||||||
|
* Might have figured out a way to do it, by taking edges out and then putting them back into the graph. This may actually be possible. If so, awesome.
|
||||||
* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
|
* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
|
||||||
* ~~Problem is variable number of cells in a well~~
|
* ~~Problem is variable number of cells in a well~~
|
||||||
* ~~Apache Commons CSV library writes entries a row at a time~~
|
* ~~Apache Commons CSV library writes entries a row at a time~~
|
||||||
|
|||||||
@@ -1,14 +1,154 @@
|
|||||||
//main class. Only job is to choose which interface to use.
|
import java.util.Random;
|
||||||
|
|
||||||
|
//main class. For choosing interface type and caching file data
|
||||||
public class BiGpairSEQ {
|
public class BiGpairSEQ {
|
||||||
|
|
||||||
private static void main(String[] args) {
|
private static final Random rand = new Random();
|
||||||
|
private static CellSample cellSampleInMemory = null;
|
||||||
|
private static String cellFilename = null;
|
||||||
|
private static Plate plateInMemory = null;
|
||||||
|
private static String plateFilename = null;
|
||||||
|
private static GraphWithMapData graphInMemory = null;
|
||||||
|
private static String graphFilename = null;
|
||||||
|
private static boolean cacheCells = false;
|
||||||
|
private static boolean cachePlate = false;
|
||||||
|
private static boolean cacheGraph = true;
|
||||||
|
|
||||||
|
public static void main(String[] args) {
|
||||||
if (args.length == 0) {
|
if (args.length == 0) {
|
||||||
InteractiveInterface.startInteractive();
|
InteractiveInterface.startInteractive();
|
||||||
}
|
}
|
||||||
else {
|
else {
|
||||||
//This will be uncommented when command line arguments are fixed.
|
//This will be uncommented when command line arguments are re-implemented.
|
||||||
//CommandLineInterface.startCLI(args);
|
//CommandLineInterface.startCLI(args);
|
||||||
System.out.println("Command line arguments are still being re-implemented.");
|
System.out.println("Command line arguments are still being re-implemented.");
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
public static Random getRand() {
|
||||||
|
return rand;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static CellSample getCellSampleInMemory() {
|
||||||
|
return cellSampleInMemory;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setCellSampleInMemory(CellSample cellSample, String filename) {
|
||||||
|
if(cellSampleInMemory != null) {
|
||||||
|
clearCellSampleInMemory();
|
||||||
|
}
|
||||||
|
cellSampleInMemory = cellSample;
|
||||||
|
cellFilename = filename;
|
||||||
|
System.out.println("Cell sample file " + filename + " cached.");
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void clearCellSampleInMemory() {
|
||||||
|
cellSampleInMemory = null;
|
||||||
|
cellFilename = null;
|
||||||
|
System.gc();
|
||||||
|
System.out.println("Cell sample file cache cleared.");
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
public static String getCellFilename() {
|
||||||
|
return cellFilename;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static Plate getPlateInMemory() {
|
||||||
|
return plateInMemory;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setPlateInMemory(Plate plate, String filename) {
|
||||||
|
if(plateInMemory != null) {
|
||||||
|
clearPlateInMemory();
|
||||||
|
}
|
||||||
|
plateInMemory = plate;
|
||||||
|
plateFilename = filename;
|
||||||
|
System.out.println("Sample plate file " + filename + " cached.");
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void clearPlateInMemory() {
|
||||||
|
plateInMemory = null;
|
||||||
|
plateFilename = null;
|
||||||
|
System.gc();
|
||||||
|
System.out.println("Sample plate file cache cleared.");
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
public static String getPlateFilename() {
|
||||||
|
return plateFilename;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
public static GraphWithMapData getGraphInMemory() {
return graphInMemory;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setGraphInMemory(GraphWithMapData g, String filename) {
|
||||||
|
if (graphInMemory != null) {
|
||||||
|
clearGraphInMemory();
|
||||||
|
}
|
||||||
|
graphInMemory = g;
|
||||||
|
graphFilename = filename;
|
||||||
|
System.out.println("Graph and data file " + filename + " cached.");
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void clearGraphInMemory() {
|
||||||
|
graphInMemory = null;
|
||||||
|
graphFilename = null;
|
||||||
|
System.gc();
|
||||||
|
System.out.println("Graph and data file cache cleared.");
|
||||||
|
}
|
||||||
|
|
||||||
|
public static String getGraphFilename() {
|
||||||
|
return graphFilename;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
public static boolean cacheCells() {
|
||||||
|
return cacheCells;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setCacheCells(boolean cacheCells) {
|
||||||
|
//if not caching, clear the memory
|
||||||
|
if(!cacheCells){
|
||||||
|
BiGpairSEQ.clearCellSampleInMemory();
|
||||||
|
System.out.println("Cell sample file caching: OFF.");
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Cell sample file caching: ON.");
|
||||||
|
}
|
||||||
|
BiGpairSEQ.cacheCells = cacheCells;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static boolean cachePlate() {
|
||||||
|
return cachePlate;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setCachePlate(boolean cachePlate) {
|
||||||
|
//if not caching, clear the memory
|
||||||
|
if(!cachePlate) {
|
||||||
|
BiGpairSEQ.clearPlateInMemory();
|
||||||
|
System.out.println("Sample plate file caching: OFF.");
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Sample plate file caching: ON.");
|
||||||
|
}
|
||||||
|
BiGpairSEQ.cachePlate = cachePlate;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static boolean cacheGraph() {
|
||||||
|
return cacheGraph;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void setCacheGraph(boolean cacheGraph) {
|
||||||
|
//if not caching, clear the memory
|
||||||
|
if(!cacheGraph) {
|
||||||
|
BiGpairSEQ.clearGraphInMemory();
|
||||||
|
System.out.println("Graph/data file caching: OFF.");
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Graph/data file caching: ON.");
|
||||||
|
}
|
||||||
|
BiGpairSEQ.cacheGraph = cacheGraph;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ public class CellFileReader {
|
|||||||
|
|
||||||
private String filename;
|
private String filename;
|
||||||
private List<Integer[]> distinctCells = new ArrayList<>();
|
private List<Integer[]> distinctCells = new ArrayList<>();
|
||||||
|
private Integer cdr1Freq;
|
||||||
|
|
||||||
public CellFileReader(String filename) {
|
public CellFileReader(String filename) {
|
||||||
if(!filename.matches(".*\\.csv")){
|
if(!filename.matches(".*\\.csv")){
|
||||||
@@ -38,19 +39,37 @@ public class CellFileReader {
|
|||||||
cell[3] = Integer.valueOf(record.get("Beta CDR1"));
|
cell[3] = Integer.valueOf(record.get("Beta CDR1"));
|
||||||
distinctCells.add(cell);
|
distinctCells.add(cell);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
} catch(IOException ex){
|
} catch(IOException ex){
|
||||||
System.out.println("cell file " + filename + " not found.");
|
System.out.println("cell file " + filename + " not found.");
|
||||||
System.err.println(ex);
|
System.err.println(ex);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
//get CDR1 frequency (cell[3] holds the Beta CDR1 value, although the list below is named cdr1Alphas)
|
||||||
|
ArrayList<Integer> cdr1Alphas = new ArrayList<>();
|
||||||
|
for (Integer[] cell : distinctCells) {
|
||||||
|
cdr1Alphas.add(cell[3]);
|
||||||
|
}
|
||||||
|
double count = cdr1Alphas.stream().distinct().count();
|
||||||
|
count = Math.ceil(distinctCells.size() / count);
|
||||||
|
cdr1Freq = (int) count;
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
public CellSample getCellSample() {
|
||||||
|
return new CellSample(distinctCells, cdr1Freq);
|
||||||
}
|
}
|
||||||
|
|
||||||
public String getFilename() { return filename;}
|
public String getFilename() { return filename;}
|
||||||
|
|
||||||
public List<Integer[]> getCells(){
|
//Refactor everything that uses this to have access to a Cell Sample and get the cells there instead.
|
||||||
|
public List<Integer[]> getListOfDistinctCellsDEPRECATED(){
|
||||||
return distinctCells;
|
return distinctCells;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Integer getCellCount() {
|
public Integer getCellCountDEPRECATED() {
|
||||||
|
//Refactor everything that uses this to have access to a Cell Sample and get the count there instead.
|
||||||
return distinctCells.size();
|
return distinctCells.size();
|
||||||
}
|
}
|
||||||
}
|
}
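
The CDR1 frequency computed in the `CellFileReader` constructor above is the number of distinct cells divided by the number of distinct CDR1 values, rounded up. The standalone sketch below restates that arithmetic so it can be checked in isolation; `Cdr1FreqSketch`, `cdr1Freq`, and the `column` parameter are hypothetical names, and the explicit `double` cast is a choice made for this example.

```java
import java.util.List;

// Hypothetical restatement of the CDR1 frequency arithmetic.
class Cdr1FreqSketch {

    // cells: one Integer[] per distinct cell; column: index of the CDR1 value to count.
    static int cdr1Freq(List<Integer[]> cells, int column) {
        long distinct = cells.stream()
                .map(cell -> cell[column])
                .distinct()
                .count();
        // Cast before dividing so the quotient is not truncated before the ceiling.
        return (int) Math.ceil((double) cells.size() / distinct);
    }
}
```

With five cells and three distinct CDR1 values, for instance, the quotient 5/3 rounds up to a frequency of 2.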
|
||||||
|
|||||||
@@ -18,7 +18,7 @@ public class CellSample {
|
|||||||
return cdr1Freq;
|
return cdr1Freq;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Integer population(){
|
public Integer getCellCount(){
|
||||||
return cells.size();
|
return cells.size();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,45 @@
|
|||||||
import org.apache.commons.cli.*;
|
import org.apache.commons.cli.*;
|
||||||
|
|
||||||
//Class for parsing options passed to program from command line
|
/*
|
||||||
|
* Class for parsing options passed to program from command line
|
||||||
|
*
|
||||||
|
* Top-level flags:
|
||||||
|
* cells : to make a cell sample file
|
||||||
|
* plate : to make a sample plate file
|
||||||
|
* graph : to make a graph and data file
|
||||||
|
* match : to do a cdr3 matching (WITH OR WITHOUT MAKING A RESULTS FILE. May just want to print summary for piping.)
|
||||||
|
*
|
||||||
|
* Cell flags:
|
||||||
|
* count : number of cells to generate
|
||||||
|
* diversity factor : factor by which CDR3s are more diverse than CDR1s
|
||||||
|
* output : name of the output file
|
||||||
|
*
|
||||||
|
* Plate flags:
|
||||||
|
* cellfile : name of the cell sample file to use as input
|
||||||
|
* wells : the number of wells on the plate
|
||||||
|
* dist : the statistical distribution to use
|
||||||
|
* (if exponential) lambda : the lambda value of the exponential distribution
|
||||||
|
* (if gaussian) stddev : the standard deviation of the gaussian distribution
|
||||||
|
* rand : randomize well populations, take a minimum argument and a maximum argument
|
||||||
|
* populations : number of t cells per well per section (number of arguments determines number of sections)
|
||||||
|
* dropout : plate dropout rate, double from 0.0 to 1.0
|
||||||
|
* output : name of the output file
|
||||||
|
*
|
||||||
|
* Graph flags:
|
||||||
|
* cellfile : name of the cell sample file to use as input
|
||||||
|
* platefile : name of the sample plate file to use as input
|
||||||
|
* output : name of the output file
|
||||||
|
*
|
||||||
|
* Match flags:
|
||||||
|
* graphFile : name of graph and data file to use as input
|
||||||
|
* min : minimum number of overlap wells to attempt a matching
|
||||||
|
* max : the maximum number of overlap wells to attempt a matching
|
||||||
|
* maxdiff : (optional) the maximum difference in occupancy to attempt a matching
|
||||||
|
* minpercent : (optional) the minimum percent overlap to attempt a matching.
|
||||||
|
* writefile : (optional) the filename to write results to
|
||||||
|
* output : the values to print to System.out for piping
|
||||||
|
*
|
||||||
|
*/
|
||||||
public class CommandLineInterface {
|
public class CommandLineInterface {
|
||||||
|
|
||||||
public static void startCLI(String[] args) {
|
public static void startCLI(String[] args) {
|
||||||
@@ -20,7 +59,7 @@ public class CommandLineInterface {
|
|||||||
.longOpt("make-plates")
|
.longOpt("make-plates")
|
||||||
.desc("Makes a sample plate file")
|
.desc("Makes a sample plate file")
|
||||||
.build();
|
.build();
|
||||||
Option makeGraph = Option.builder("graoh")
|
Option makeGraph = Option.builder("graph")
|
||||||
.longOpt("make-graph")
|
.longOpt("make-graph")
|
||||||
.desc("Makes a graph and data file")
|
.desc("Makes a graph and data file")
|
||||||
.build();
|
.build();
|
||||||
@@ -258,7 +297,7 @@ public class CommandLineInterface {
|
|||||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getCells(), lambda);
|
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), lambda);
|
||||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||||
writer.writePlateFile();
|
writer.writePlateFile();
|
||||||
}
|
}
|
||||||
@@ -266,9 +305,9 @@ public class CommandLineInterface {
|
|||||||
private static void makePlatePoisson(String cellFile, String filename, Integer numWells,
|
private static void makePlatePoisson(String cellFile, String filename, Integer numWells,
|
||||||
Integer[] concentrations, Double dropOutRate){
|
Integer[] concentrations, Double dropOutRate){
|
||||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||||
Double stdDev = Math.sqrt(cellReader.getCellCount());
|
Double stdDev = Math.sqrt(cellReader.getCellCountDEPRECATED());
|
||||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||||
writer.writePlateFile();
|
writer.writePlateFile();
|
||||||
}
|
}
|
||||||
@@ -277,7 +316,7 @@ public class CommandLineInterface {
|
|||||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||||
writer.writePlateFile();
|
writer.writePlateFile();
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -4,10 +4,6 @@ import java.math.MathContext;
|
|||||||
|
|
||||||
public abstract class Equations {
|
public abstract class Equations {
|
||||||
|
|
||||||
public static int getRandomNumber(int min, int max) {
|
|
||||||
return (int) ((Math.random() * (max - min)) + min);
|
|
||||||
}
|
|
||||||
|
|
||||||
//pValue calculation as described in original pairSEQ paper.
|
//pValue calculation as described in original pairSEQ paper.
|
||||||
//Included for comparison with original results.
|
//Included for comparison with original results.
|
||||||
//Not used by BiGpairSEQ for matching.
|
//Not used by BiGpairSEQ for matching.
|
||||||
|
|||||||
@@ -13,6 +13,8 @@ public class GraphDataObjectReader {
|
|||||||
BufferedInputStream fileIn = new BufferedInputStream(new FileInputStream(filename));
|
BufferedInputStream fileIn = new BufferedInputStream(new FileInputStream(filename));
|
||||||
ObjectInputStream in = new ObjectInputStream(fileIn))
|
ObjectInputStream in = new ObjectInputStream(fileIn))
|
||||||
{
|
{
|
||||||
|
System.out.println("Reading graph data from file. This may take some time.");
|
||||||
|
System.out.println("File I/O time is not included in results.");
|
||||||
data = (GraphWithMapData) in.readObject();
|
data = (GraphWithMapData) in.readObject();
|
||||||
} catch (FileNotFoundException | ClassNotFoundException ex) {
|
} catch (FileNotFoundException | ClassNotFoundException ex) {
|
||||||
ex.printStackTrace();
|
ex.printStackTrace();
|
||||||
|
|||||||
@@ -18,8 +18,11 @@ public class GraphDataObjectWriter {
|
|||||||
|
|
||||||
public void writeDataToFile() {
|
public void writeDataToFile() {
|
||||||
try (BufferedOutputStream bufferedOut = new BufferedOutputStream(new FileOutputStream(filename));
|
try (BufferedOutputStream bufferedOut = new BufferedOutputStream(new FileOutputStream(filename));
|
||||||
|
|
||||||
ObjectOutputStream out = new ObjectOutputStream(bufferedOut);
|
ObjectOutputStream out = new ObjectOutputStream(bufferedOut);
|
||||||
){
|
){
|
||||||
|
System.out.println("Writing graph and occupancy data to file. This may take some time.");
|
||||||
|
System.out.println("File I/O time is not included in results.");
|
||||||
out.writeObject(data);
|
out.writeObject(data);
|
||||||
} catch (IOException ex) {
|
} catch (IOException ex) {
|
||||||
ex.printStackTrace();
|
ex.printStackTrace();
|
||||||
|
|||||||
90
src/main/java/GraphModificationFunctions.java
Normal file
@@ -0,0 +1,90 @@
|
|||||||
|
import org.jgrapht.graph.DefaultWeightedEdge;
|
||||||
|
import org.jgrapht.graph.SimpleWeightedGraph;
|
||||||
|
|
||||||
|
import java.util.ArrayList;
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.Map;
|
||||||
|
import java.util.Set;
|
||||||
|
|
||||||
|
public abstract class GraphModificationFunctions {
|
||||||
|
|
||||||
|
//remove over- and under-weight edges
|
||||||
|
public static List<Integer[]> filterByOverlapThresholds(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
||||||
|
int low, int high) {
|
||||||
|
List<Integer[]> removedEdges = new ArrayList<>();
|
||||||
|
for(DefaultWeightedEdge e: graph.edgeSet()){
|
||||||
|
if ((graph.getEdgeWeight(e) > high) || (graph.getEdgeWeight(e) < low)){
|
||||||
|
Integer source = graph.getEdgeSource(e);
|
||||||
|
Integer target = graph.getEdgeTarget(e);
|
||||||
|
Integer weight = (int) graph.getEdgeWeight(e);
|
||||||
|
Integer[] edge = {source, target, weight};
|
||||||
|
removedEdges.add(edge);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for (Integer[] edge : removedEdges) {
|
||||||
|
graph.removeEdge(edge[0], edge[1]);
|
||||||
|
}
|
||||||
|
return removedEdges;
|
||||||
|
}
|
||||||
|
|
||||||
|
//Remove edges for pairs with large occupancy discrepancy
|
||||||
|
public static List<Integer[]> filterByRelativeOccupancy(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
||||||
|
Map<Integer, Integer> alphaWellCounts,
|
||||||
|
Map<Integer, Integer> betaWellCounts,
|
||||||
|
Map<Integer, Integer> plateVtoAMap,
|
||||||
|
Map<Integer, Integer> plateVtoBMap,
|
||||||
|
Integer maxOccupancyDifference) {
|
||||||
|
List<Integer[]> removedEdges = new ArrayList<>();
|
||||||
|
for (DefaultWeightedEdge e : graph.edgeSet()) {
|
||||||
|
Integer alphaOcc = alphaWellCounts.get(plateVtoAMap.get(graph.getEdgeSource(e)));
|
||||||
|
Integer betaOcc = betaWellCounts.get(plateVtoBMap.get(graph.getEdgeTarget(e)));
|
||||||
|
if (Math.abs(alphaOcc - betaOcc) >= maxOccupancyDifference) {
|
||||||
|
Integer source = graph.getEdgeSource(e);
|
||||||
|
Integer target = graph.getEdgeTarget(e);
|
||||||
|
Integer weight = (int) graph.getEdgeWeight(e);
|
||||||
|
Integer[] edge = {source, target, weight};
|
||||||
|
removedEdges.add(edge);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for (Integer[] edge : removedEdges) {
|
||||||
|
graph.removeEdge(edge[0], edge[1]);
|
||||||
|
}
|
||||||
|
return removedEdges;
|
||||||
|
}
|
||||||
|
|
||||||
|
//Remove edges for pairs where overlap size is significantly lower than the well occupancy
|
||||||
|
public static List<Integer[]> filterByOverlapPercent(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
||||||
|
Map<Integer, Integer> alphaWellCounts,
|
||||||
|
Map<Integer, Integer> betaWellCounts,
|
||||||
|
Map<Integer, Integer> plateVtoAMap,
|
||||||
|
Map<Integer, Integer> plateVtoBMap,
|
||||||
|
Integer minOverlapPercent) {
|
||||||
|
List<Integer[]> removedEdges = new ArrayList<>();
|
||||||
|
for (DefaultWeightedEdge e : graph.edgeSet()) {
|
||||||
|
Integer alphaOcc = alphaWellCounts.get(plateVtoAMap.get(graph.getEdgeSource(e)));
|
||||||
|
Integer betaOcc = betaWellCounts.get(plateVtoBMap.get(graph.getEdgeTarget(e)));
|
||||||
|
double weight = graph.getEdgeWeight(e);
|
||||||
|
double min = minOverlapPercent / 100.0;
|
||||||
|
if ((weight / alphaOcc < min) || (weight / betaOcc < min)) {
|
||||||
|
Integer source = graph.getEdgeSource(e);
|
||||||
|
Integer target = graph.getEdgeTarget(e);
|
||||||
|
Integer intWeight = (int) graph.getEdgeWeight(e);
|
||||||
|
Integer[] edge = {source, target, intWeight};
|
||||||
|
removedEdges.add(edge);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for (Integer[] edge : removedEdges) {
|
||||||
|
graph.removeEdge(edge[0], edge[1]);
|
||||||
|
}
|
||||||
|
return removedEdges;
|
||||||
|
}
|
||||||
|
|
||||||
|
public static void addRemovedEdges(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
||||||
|
List<Integer[]> removedEdges) {
|
||||||
|
for (Integer[] edge : removedEdges) {
|
||||||
|
DefaultWeightedEdge e = graph.addEdge(edge[0], edge[1]);
|
||||||
|
graph.setEdgeWeight(e, (double) edge[2]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
}
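
The filter functions above all follow the same pattern: collect the edges that fail a test, remove them, return them as `{source, target, weight}` triples, and let `addRemovedEdges` put them back afterwards so a cached graph is unchanged between matchings. The sketch below shows that remove-and-restore round trip on a plain `Map` instead of a JGraphT `SimpleWeightedGraph`, so that it runs without the library; `EdgeFilterSketch`, `filterByWeight`, and `restore` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the graph: edges keyed by (source, target) with integer weights.
class EdgeFilterSketch {

    // Removes edges with weight outside [low, high] and returns them as {source, target, weight}.
    static List<int[]> filterByWeight(Map<List<Integer>, Integer> edges, int low, int high) {
        List<int[]> removed = new ArrayList<>();
        Iterator<Map.Entry<List<Integer>, Integer>> it = edges.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<List<Integer>, Integer> e = it.next();
            int weight = e.getValue();
            if (weight > high || weight < low) {
                removed.add(new int[]{e.getKey().get(0), e.getKey().get(1), weight});
                it.remove(); // safe removal while iterating
            }
        }
        return removed;
    }

    // Puts previously removed edges back, restoring the original weights.
    static void restore(Map<List<Integer>, Integer> edges, List<int[]> removed) {
        for (int[] r : removed) {
            edges.put(List.of(r[0], r[1]), r[2]);
        }
    }
}
```

Keeping the weight in the returned triple is what makes the restore step lossless, since the edge object itself is gone once it has been removed.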
|
||||||
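Review note: `filterByOverlapPercent` above drops an edge when the overlap weight is less than `minOverlapPercent` of either endpoint's well occupancy. It collects the doomed edges first and removes them in a second pass, presumably so the graph is not modified while its edge set is being iterated. The following is a minimal sketch of the same threshold test, not the project's code. It assumes plain `int[]` triples in place of the JGraphT graph, and the class and method names (`OverlapFilterSketch`, `shouldRemove`, `edgesToRemove`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JGraphT-based filter in the diff.
// Each edge is an int[] of {alphaOccupancy, betaOccupancy, overlapWells}.
class OverlapFilterSketch {

    // True when the overlap is below the minimum fraction of EITHER sequence's occupancy.
    static boolean shouldRemove(int alphaOcc, int betaOcc, int overlap, int minOverlapPercent) {
        double min = minOverlapPercent / 100.0;
        return ((double) overlap / alphaOcc < min) || ((double) overlap / betaOcc < min);
    }

    // Collects the edges that would be removed; the input list is left untouched.
    static List<int[]> edgesToRemove(List<int[]> edges, int minOverlapPercent) {
        List<int[]> removed = new ArrayList<>();
        for (int[] e : edges) {
            if (shouldRemove(e[0], e[1], e[2], minOverlapPercent)) {
                removed.add(e);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<int[]> edges = new ArrayList<>();
        edges.add(new int[]{10, 10, 8}); // overlap is 80% of both occupancies
        edges.add(new int[]{10, 40, 8}); // 80% of alpha, but only 20% of beta
        System.out.println("Edges to remove at 50%: " + edgesToRemove(edges, 50).size());
    }
}
```

Because the test is an OR over both endpoints, a pair survives only if the shared wells make up a large enough share of both the alpha's and the beta's occupancy.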
@@ -11,7 +11,7 @@ public class GraphWithMapData implements java.io.Serializable {
|
|||||||
private String sourceFilename;
|
private String sourceFilename;
|
||||||
private final SimpleWeightedGraph graph;
|
private final SimpleWeightedGraph graph;
|
||||||
private Integer numWells;
|
private Integer numWells;
|
||||||
private Integer[] wellConcentrations;
|
private Integer[] wellPopulations;
|
||||||
private Integer alphaCount;
|
private Integer alphaCount;
|
||||||
private Integer betaCount;
|
private Integer betaCount;
|
||||||
private final Map<Integer, Integer> distCellsMapAlphaKey;
|
private final Map<Integer, Integer> distCellsMapAlphaKey;
|
||||||
@@ -31,7 +31,7 @@ public class GraphWithMapData implements java.io.Serializable {
|
|||||||
Map<Integer, Integer> betaWellCounts, Duration time) {
|
Map<Integer, Integer> betaWellCounts, Duration time) {
|
||||||
this.graph = graph;
|
this.graph = graph;
|
||||||
this.numWells = numWells;
|
this.numWells = numWells;
|
||||||
this.wellConcentrations = wellConcentrations;
|
this.wellPopulations = wellConcentrations;
|
||||||
this.alphaCount = alphaCount;
|
this.alphaCount = alphaCount;
|
||||||
this.betaCount = betaCount;
|
this.betaCount = betaCount;
|
||||||
this.distCellsMapAlphaKey = distCellsMapAlphaKey;
|
this.distCellsMapAlphaKey = distCellsMapAlphaKey;
|
||||||
@@ -52,8 +52,8 @@ public class GraphWithMapData implements java.io.Serializable {
|
|||||||
return numWells;
|
return numWells;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Integer[] getWellConcentrations() {
|
public Integer[] getWellPopulations() {
|
||||||
return wellConcentrations;
|
return wellPopulations;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Integer getAlphaCount() {
|
public Integer getAlphaCount() {
|
||||||
|
|||||||
@@ -1,14 +1,15 @@
|
|||||||
import java.io.IOException;
|
import java.io.IOException;
|
||||||
import java.util.List;
|
import java.util.*;
|
||||||
import java.util.Scanner;
|
import java.util.regex.Matcher;
|
||||||
import java.util.InputMismatchException;
|
import java.util.regex.Pattern;
|
||||||
|
|
||||||
//
|
//
|
||||||
public class InteractiveInterface {
|
public class InteractiveInterface {
|
||||||
|
|
||||||
final static Scanner sc = new Scanner(System.in);
|
private static final Random rand = BiGpairSEQ.getRand();
|
||||||
static int input;
|
private static final Scanner sc = new Scanner(System.in);
|
||||||
static boolean quit = false;
|
private static int input;
|
||||||
|
private static boolean quit = false;
|
||||||
|
|
||||||
public static void startInteractive() {
|
public static void startInteractive() {
|
||||||
|
|
||||||
@@ -26,16 +27,18 @@ public class InteractiveInterface {
|
|||||||
//Need to re-do the CDR3/CDR1 matching to correspond to new pattern
|
//Need to re-do the CDR3/CDR1 matching to correspond to new pattern
|
||||||
//System.out.println("5) Generate CDR3/CDR1 occupancy graph");
|
//System.out.println("5) Generate CDR3/CDR1 occupancy graph");
|
||||||
//System.out.println("6) Simulate CDR3/CDR1 T cell matching");
|
//System.out.println("6) Simulate CDR3/CDR1 T cell matching");
|
||||||
|
System.out.println("8) Options");
|
||||||
System.out.println("9) About/Acknowledgments");
|
System.out.println("9) About/Acknowledgments");
|
||||||
System.out.println("0) Exit");
|
System.out.println("0) Exit");
|
||||||
try {
|
try {
|
||||||
input = sc.nextInt();
|
input = sc.nextInt();
|
||||||
switch (input) {
|
switch (input) {
|
||||||
case 1 -> makeCellsInteractive();
|
case 1 -> makeCells();
|
||||||
case 2 -> makePlateInteractive();
|
case 2 -> makePlate();
|
||||||
case 3 -> makeCDR3GraphInteractive();
|
case 3 -> makeCDR3Graph();
|
||||||
case 4 -> matchCDR3sInteractive();
|
case 4 -> matchCDR3s();
|
||||||
//case 6 -> matchCellsCDR1();
|
//case 6 -> matchCellsCDR1();
|
||||||
|
case 8 -> options();
|
||||||
case 9 -> acknowledge();
|
case 9 -> acknowledge();
|
||||||
case 0 -> quit = true;
|
case 0 -> quit = true;
|
||||||
default -> throw new InputMismatchException("Invalid input.");
|
default -> throw new InputMismatchException("Invalid input.");
|
||||||
@@ -48,7 +51,7 @@ public class InteractiveInterface {
|
|||||||
sc.close();
|
sc.close();
|
||||||
}
|
}
|
||||||
|
|
||||||
private static void makeCellsInteractive() {
|
private static void makeCells() {
|
||||||
String filename = null;
|
String filename = null;
|
||||||
Integer numCells = 0;
|
Integer numCells = 0;
|
||||||
Integer cdr1Freq = 1;
|
Integer cdr1Freq = 1;
|
||||||
@@ -73,19 +76,23 @@ public class InteractiveInterface {
|
|||||||
}
|
}
|
||||||
CellSample sample = Simulator.generateCellSample(numCells, cdr1Freq);
|
CellSample sample = Simulator.generateCellSample(numCells, cdr1Freq);
|
||||||
assert filename != null;
|
assert filename != null;
|
||||||
|
System.out.println("Writing cells to file");
|
||||||
CellFileWriter writer = new CellFileWriter(filename, sample);
|
CellFileWriter writer = new CellFileWriter(filename, sample);
|
||||||
writer.writeCellsToFile();
|
writer.writeCellsToFile();
|
||||||
System.gc();
|
System.out.println("Cell sample written to: " + filename);
|
||||||
|
if(BiGpairSEQ.cacheCells()) {
|
||||||
|
BiGpairSEQ.setCellSampleInMemory(sample, filename);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
//Output a CSV of sample plate
|
//Output a CSV of sample plate
|
||||||
private static void makePlateInteractive() {
|
private static void makePlate() {
|
||||||
String cellFile = null;
|
String cellFile = null;
|
||||||
String filename = null;
|
String filename = null;
|
||||||
Double stdDev = 0.0;
|
Double stdDev = 0.0;
|
||||||
Integer numWells = 0;
|
Integer numWells = 0;
|
||||||
Integer numSections;
|
Integer numSections;
|
||||||
Integer[] concentrations = {1};
|
Integer[] populations = {1};
|
||||||
Double dropOutRate = 0.0;
|
Double dropOutRate = 0.0;
|
||||||
boolean poisson = false;
|
boolean poisson = false;
|
||||||
boolean exponential = false;
|
boolean exponential = false;
|
||||||
@@ -124,10 +131,11 @@ public class InteractiveInterface {
|
|||||||
}
|
}
|
||||||
case 3 -> {
|
case 3 -> {
|
||||||
exponential = true;
|
exponential = true;
|
||||||
System.out.println("Please enter lambda value for exponential distribution.");
|
System.out.print("Please enter lambda value for exponential distribution: ");
|
||||||
lambda = sc.nextDouble();
|
lambda = sc.nextDouble();
|
||||||
if (lambda <= 0.0) {
|
if (lambda <= 0.0) {
|
||||||
throw new InputMismatchException("Value must be positive.");
|
lambda = 0.6;
|
||||||
|
System.out.println("Value must be positive. Defaulting to 0.6.");
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
default -> {
|
default -> {
|
||||||
@@ -140,22 +148,57 @@ public class InteractiveInterface {
|
|||||||
if(numWells < 1){
|
if(numWells < 1){
|
||||||
throw new InputMismatchException("No wells on plate");
|
throw new InputMismatchException("No wells on plate");
|
||||||
}
|
}
|
||||||
System.out.println("\nThe plate can be evenly sectioned to allow multiple concentrations of T-cells/well");
|
//choose whether to make T cell population/well random
|
||||||
System.out.println("How many sections would you like to make (minimum 1)?");
|
boolean randomWellPopulations;
|
||||||
numSections = sc.nextInt();
|
System.out.println("Randomize number of T cells in each well? (y/n)");
|
||||||
if(numSections < 1) {
|
String ans = sc.next();
|
||||||
throw new InputMismatchException("Too few sections.");
|
Pattern pattern = Pattern.compile("(?:yes|y)", Pattern.CASE_INSENSITIVE);
|
||||||
|
Matcher matcher = pattern.matcher(ans);
|
||||||
|
if(matcher.matches()){
|
||||||
|
randomWellPopulations = true;
|
||||||
}
|
}
|
||||||
else if (numSections > numWells) {
|
else{
|
||||||
throw new InputMismatchException("Cannot have more sections than wells.");
|
randomWellPopulations = false;
|
||||||
}
|
}
|
||||||
int i = 1;
|
if(randomWellPopulations) { //if T cell population/well is random
|
||||||
concentrations = new Integer[numSections];
|
numSections = numWells;
|
||||||
while(numSections > 0) {
|
Integer minPop;
|
||||||
System.out.print("Enter number of T-cells per well in section " + i +": ");
|
Integer maxPop;
|
||||||
concentrations[i - 1] = sc.nextInt();
|
System.out.print("Please enter minimum number of T cells in a well: ");
|
||||||
i++;
|
minPop = sc.nextInt();
|
||||||
numSections--;
|
if(minPop < 1) {
|
||||||
|
throw new InputMismatchException("Minimum well population must be positive");
|
||||||
|
}
|
||||||
|
System.out.print("Please enter maximum number of T cells in a well: ");
|
||||||
|
maxPop = sc.nextInt();
|
||||||
|
if(maxPop < minPop) {
|
||||||
|
throw new InputMismatchException("Max well population must not be less than min well population");
|
||||||
|
}
|
||||||
|
//maximum should be inclusive, so need to add one to max of randomly generated values
|
||||||
|
populations = rand.ints(minPop, maxPop + 1)
|
||||||
|
.limit(numSections)
|
||||||
|
.boxed()
|
||||||
|
.toArray(Integer[]::new);
|
||||||
|
System.out.print("Populations: ");
|
||||||
|
System.out.println(Arrays.toString(populations));
|
||||||
|
}
|
||||||
|
else{ //if T cell population/well is not random
|
||||||
|
System.out.println("\nThe plate can be evenly sectioned to allow different numbers of T cells per well.");
|
||||||
|
System.out.println("How many sections would you like to make (minimum 1)?");
|
||||||
|
numSections = sc.nextInt();
|
||||||
|
if (numSections < 1) {
|
||||||
|
throw new InputMismatchException("Too few sections.");
|
||||||
|
} else if (numSections > numWells) {
|
||||||
|
throw new InputMismatchException("Cannot have more sections than wells.");
|
||||||
|
}
|
||||||
|
int i = 1;
|
||||||
|
populations = new Integer[numSections];
|
||||||
|
while (numSections > 0) {
|
||||||
|
System.out.print("Enter number of T cells per well in section " + i + ": ");
|
||||||
|
populations[i - 1] = sc.nextInt();
|
||||||
|
i++;
|
||||||
|
numSections--;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
System.out.println("\nErrors in amplification can induce a well dropout rate for sequences");
|
System.out.println("\nErrors in amplification can induce a well dropout rate for sequences");
|
||||||
System.out.print("Enter well dropout rate (0.0 to 1.0): ");
|
System.out.print("Enter well dropout rate (0.0 to 1.0): ");
|
||||||
@@ -167,32 +210,45 @@ public class InteractiveInterface {
|
|||||||
System.out.println(ex);
|
System.out.println(ex);
|
||||||
sc.next();
|
sc.next();
|
||||||
}
|
}
|
||||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
|
||||||
assert cellFile != null;
|
assert cellFile != null;
|
||||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
CellSample cells;
|
||||||
|
if (cellFile.equals(BiGpairSEQ.getCellFilename())){
|
||||||
|
cells = BiGpairSEQ.getCellSampleInMemory();
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||||
|
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||||
|
cells = cellReader.getCellSample();
|
||||||
|
if(BiGpairSEQ.cacheCells()) {
|
||||||
|
BiGpairSEQ.setCellSampleInMemory(cells, cellFile);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
assert filename != null;
|
||||||
|
Plate samplePlate;
|
||||||
|
PlateFileWriter writer;
|
||||||
if(exponential){
|
if(exponential){
|
||||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getCells(), lambda);
|
samplePlate.fillWellsExponential(cellFile, cells.getCells(), lambda);
|
||||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
writer = new PlateFileWriter(filename, samplePlate);
|
||||||
writer.writePlateFile();
|
|
||||||
}
|
}
|
||||||
else {
|
else {
|
||||||
if (poisson) {
|
if (poisson) {
|
||||||
stdDev = Math.sqrt(cellReader.getCellCount()); //gaussian with square root of elements approximates poisson
|
stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson
|
||||||
}
|
}
|
||||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
samplePlate.fillWells(cellFile, cells.getCells(), stdDev);
|
||||||
assert filename != null;
|
writer = new PlateFileWriter(filename, samplePlate);
|
||||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
}
|
||||||
System.out.println("Writing Sample Plate to file");
|
System.out.println("Writing Sample Plate to file");
|
||||||
writer.writePlateFile();
|
writer.writePlateFile();
|
||||||
System.out.println("Sample Plate written to file: " + filename);
|
System.out.println("Sample Plate written to file: " + filename);
|
||||||
System.gc();
|
if(BiGpairSEQ.cachePlate()) {
|
||||||
|
BiGpairSEQ.setPlateInMemory(samplePlate, filename);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
//Output serialized binary of GraphWithMapData object
|
//Output serialized binary of GraphWithMapData object
|
||||||
private static void makeCDR3GraphInteractive() {
|
private static void makeCDR3Graph() {
|
||||||
String filename = null;
|
String filename = null;
|
||||||
String cellFile = null;
|
String cellFile = null;
|
||||||
String plateFile = null;
|
String plateFile = null;
|
||||||
@@ -212,14 +268,37 @@ public class InteractiveInterface {
|
|||||||
System.out.println(ex);
|
System.out.println(ex);
|
||||||
sc.next();
|
sc.next();
|
||||||
}
|
}
|
||||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
|
||||||
assert cellFile != null;
|
assert cellFile != null;
|
||||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
CellSample cellSample;
|
||||||
System.out.println("Reading Sample Plate file: " + plateFile);
|
//check if cells are already in memory
|
||||||
|
if(cellFile.equals(BiGpairSEQ.getCellFilename()) && BiGpairSEQ.getCellSampleInMemory() != null) {
|
||||||
|
cellSample = BiGpairSEQ.getCellSampleInMemory();
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||||
|
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||||
|
cellSample = cellReader.getCellSample();
|
||||||
|
if(BiGpairSEQ.cacheCells()) {
|
||||||
|
BiGpairSEQ.setCellSampleInMemory(cellSample, cellFile);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
assert plateFile != null;
|
assert plateFile != null;
|
||||||
PlateFileReader plateReader = new PlateFileReader(plateFile);
|
Plate plate;
|
||||||
Plate plate = new Plate(plateReader.getFilename(), plateReader.getWells());
|
//check if plate is already in memory
|
||||||
if (cellReader.getCells().size() == 0){
|
if(plateFile.equals(BiGpairSEQ.getPlateFilename())){
|
||||||
|
plate = BiGpairSEQ.getPlateInMemory();
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
System.out.println("Reading Sample Plate file: " + plateFile);
|
||||||
|
PlateFileReader plateReader = new PlateFileReader(plateFile);
|
||||||
|
plate = new Plate(plateReader.getFilename(), plateReader.getWells());
|
||||||
|
if(BiGpairSEQ.cachePlate()) {
|
||||||
|
BiGpairSEQ.setPlateInMemory(plate, plateFile);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (cellSample.getCells().size() == 0){
|
||||||
System.out.println("No cell sample found.");
|
System.out.println("No cell sample found.");
|
||||||
System.out.println("Returning to main menu.");
|
System.out.println("Returning to main menu.");
|
||||||
}
|
}
|
||||||
@@ -228,22 +307,23 @@ public class InteractiveInterface {
|
|||||||
System.out.println("Returning to main menu.");
|
System.out.println("Returning to main menu.");
|
||||||
}
|
}
|
||||||
else{
|
else{
|
||||||
List<Integer[]> cells = cellReader.getCells();
|
List<Integer[]> cells = cellSample.getCells();
|
||||||
GraphWithMapData data = Simulator.makeGraph(cells, plate, true);
|
GraphWithMapData data = Simulator.makeGraph(cells, plate, true);
|
||||||
assert filename != null;
|
assert filename != null;
|
||||||
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
|
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
|
||||||
System.out.println("Writing graph and occupancy data to file. This may take some time.");
|
|
||||||
System.out.println("File I/O time is not included in results.");
|
|
||||||
dataWriter.writeDataToFile();
|
dataWriter.writeDataToFile();
|
||||||
System.out.println("Graph and Data file written to: " + filename);
|
System.out.println("Graph and Data file written to: " + filename);
|
||||||
System.gc();
|
if(BiGpairSEQ.cacheGraph()) {
|
||||||
|
BiGpairSEQ.setGraphInMemory(data, filename);
|
||||||
|
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
//Simulate matching and output CSV file of results
|
//Simulate matching and output CSV file of results
|
||||||
private static void matchCDR3sInteractive() throws IOException {
|
private static void matchCDR3s() throws IOException {
|
||||||
String filename = null;
|
String filename = null;
|
||||||
String dataFilename = null;
|
String graphFilename = null;
|
||||||
Integer lowThreshold = 0;
|
Integer lowThreshold = 0;
|
||||||
Integer highThreshold = Integer.MAX_VALUE;
|
Integer highThreshold = Integer.MAX_VALUE;
|
||||||
Integer maxOccupancyDiff = Integer.MAX_VALUE;
|
Integer maxOccupancyDiff = Integer.MAX_VALUE;
|
||||||
@@ -251,39 +331,55 @@ public class InteractiveInterface {
|
|||||||
try {
|
try {
|
||||||
System.out.println("\nBiGpairSEQ simulation requires an occupancy data and overlap graph file");
|
System.out.println("\nBiGpairSEQ simulation requires an occupancy data and overlap graph file");
|
||||||
System.out.println("Please enter name of an existing graph and occupancy data file: ");
|
System.out.println("Please enter name of an existing graph and occupancy data file: ");
|
||||||
dataFilename = sc.next();
|
graphFilename = sc.next();
|
||||||
System.out.println("The matching results will be written to a file.");
|
System.out.println("The matching results will be written to a file.");
|
||||||
System.out.print("Please enter a name for the output file: ");
|
System.out.print("Please enter a name for the output file: ");
|
||||||
filename = sc.next();
|
filename = sc.next();
|
||||||
System.out.println("\nWhat is the minimum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
System.out.println("\nWhat is the minimum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
||||||
lowThreshold = sc.nextInt();
|
lowThreshold = sc.nextInt();
|
||||||
if(lowThreshold < 1){
|
if(lowThreshold < 1){
|
||||||
throw new InputMismatchException("Minimum value for low threshold set to 1");
|
lowThreshold = 1;
|
||||||
|
System.out.println("Value for low occupancy overlap threshold must be positive");
|
||||||
|
System.out.println("Value for low occupancy overlap threshold set to 1");
|
||||||
}
|
}
|
||||||
System.out.println("\nWhat is the maximum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
System.out.println("\nWhat is the maximum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
||||||
highThreshold = sc.nextInt();
|
highThreshold = sc.nextInt();
|
||||||
System.out.println("\nWhat is the maximum difference in alpha/beta occupancy to attempt matching?");
|
if(highThreshold < lowThreshold) {
|
||||||
maxOccupancyDiff = sc.nextInt();
|
highThreshold = lowThreshold;
|
||||||
System.out.println("\nWell overlap percentage = pair overlap / sequence occupancy");
|
System.out.println("Value for high occupancy overlap threshold must be >= low overlap threshold");
|
||||||
System.out.println("What is the minimum well overlap percentage to attempt matching? (0 to 100)");
|
System.out.println("Value for high occupancy overlap threshold set to " + lowThreshold);
|
||||||
|
}
|
||||||
|
System.out.println("What is the minimum percentage of a sequence's wells in alpha/beta overlap to attempt matching? (0 - 100)");
|
||||||
minOverlapPercent = sc.nextInt();
|
minOverlapPercent = sc.nextInt();
|
||||||
if (minOverlapPercent < 0 || minOverlapPercent > 100) {
|
if (minOverlapPercent < 0 || minOverlapPercent > 100) {
|
||||||
throw new InputMismatchException("Value outside range. Minimum percent set to 0");
|
minOverlapPercent = 0;
System.out.println("Value outside range. Minimum occupancy overlap percentage set to 0");
|
||||||
|
}
|
||||||
|
System.out.println("\nWhat is the maximum difference in alpha/beta occupancy to attempt matching?");
|
||||||
|
maxOccupancyDiff = sc.nextInt();
|
||||||
|
if (maxOccupancyDiff < 0) {
|
||||||
|
maxOccupancyDiff = 0;
|
||||||
|
System.out.println("Maximum allowable difference in alpha/beta occupancy must be nonnegative");
|
||||||
|
System.out.println("Maximum allowable difference in alpha/beta occupancy set to 0");
|
||||||
}
|
}
|
||||||
} catch (InputMismatchException ex) {
|
} catch (InputMismatchException ex) {
|
||||||
System.out.println(ex);
|
System.out.println(ex);
|
||||||
sc.next();
|
sc.next();
|
||||||
}
|
}
|
||||||
//read object data from file
|
assert graphFilename != null;
|
||||||
System.out.println("Reading graph data from file. This may take some time");
|
//check if this is the same graph we already have in memory.
|
||||||
System.out.println("File I/O time is not included in results");
|
GraphWithMapData data;
|
||||||
assert dataFilename != null;
|
if(graphFilename.equals(BiGpairSEQ.getGraphFilename())) {
|
||||||
GraphDataObjectReader dataReader = new GraphDataObjectReader(dataFilename);
|
data = BiGpairSEQ.getGraphInMemory();
|
||||||
GraphWithMapData data = dataReader.getData();
|
}
|
||||||
//set source file name
|
else {
|
||||||
data.setSourceFilename(dataFilename);
|
GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename);
|
||||||
|
data = dataReader.getData();
|
||||||
|
if(BiGpairSEQ.cacheGraph()) {
|
||||||
|
BiGpairSEQ.setGraphInMemory(data, graphFilename);
|
||||||
|
}
|
||||||
|
}
|
||||||
//simulate matching
|
//simulate matching
|
||||||
MatchingResult results = Simulator.matchCDR3s(data, dataFilename, lowThreshold, highThreshold, maxOccupancyDiff,
|
MatchingResult results = Simulator.matchCDR3s(data, graphFilename, lowThreshold, highThreshold, maxOccupancyDiff,
|
||||||
minOverlapPercent, true);
|
minOverlapPercent, true);
|
||||||
//write results to file
|
//write results to file
|
||||||
assert filename != null;
|
assert filename != null;
|
||||||
@@ -291,7 +387,6 @@ public class InteractiveInterface {
|
|||||||
System.out.println("Writing results to file");
|
System.out.println("Writing results to file");
|
||||||
writer.writeResultsToFile();
|
writer.writeResultsToFile();
|
||||||
System.out.println("Results written to file: " + filename);
|
System.out.println("Results written to file: " + filename);
|
||||||
System.gc();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
///////
|
///////
|
||||||
@@ -398,6 +493,35 @@ public class InteractiveInterface {
|
|||||||
// }
|
// }
|
||||||
// }
|
// }
|
||||||
|
|
||||||
|
private static void options(){
|
||||||
|
boolean backToMain = false;
|
||||||
|
while(!backToMain) {
|
||||||
|
System.out.println("\n--------------OPTIONS---------------");
|
||||||
|
System.out.println("1) Turn " + getOnOff(!BiGpairSEQ.cacheCells()) + " cell sample file caching");
|
||||||
|
System.out.println("2) Turn " + getOnOff(!BiGpairSEQ.cachePlate()) + " plate file caching");
|
||||||
|
System.out.println("3) Turn " + getOnOff(!BiGpairSEQ.cacheGraph()) + " graph/data file caching");
|
||||||
|
System.out.println("0) Return to main menu");
|
||||||
|
try {
|
||||||
|
input = sc.nextInt();
|
||||||
|
switch (input) {
|
||||||
|
case 1 -> BiGpairSEQ.setCacheCells(!BiGpairSEQ.cacheCells());
|
||||||
|
case 2 -> BiGpairSEQ.setCachePlate(!BiGpairSEQ.cachePlate());
|
||||||
|
case 3 -> BiGpairSEQ.setCacheGraph(!BiGpairSEQ.cacheGraph());
|
||||||
|
case 0 -> backToMain = true;
|
||||||
|
default -> throw new InputMismatchException("Invalid input.");
|
||||||
|
}
|
||||||
|
} catch (InputMismatchException ex) {
|
||||||
|
System.out.println(ex);
|
||||||
|
sc.next();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
private static String getOnOff(boolean b) {
|
||||||
|
if (b) { return "on";}
|
||||||
|
else { return "off"; }
|
||||||
|
}
|
||||||
|
|
||||||
private static void acknowledge(){
|
private static void acknowledge(){
|
||||||
System.out.println("This program simulates BiGpairSEQ, a graph theory based adaptation");
|
System.out.println("This program simulates BiGpairSEQ, a graph theory based adaptation");
|
||||||
System.out.println("of the pairSEQ algorithm for pairing T cell receptor sequences.");
|
System.out.println("of the pairSEQ algorithm for pairing T cell receptor sequences.");
|
||||||
|
|||||||
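Review note: the new `makePlate()` branch fills `populations` with one random value per well using `rand.ints(minPop, maxPop + 1)`. The `+ 1` is needed because the upper bound of `Random.ints(origin, bound)` is exclusive. The sketch below isolates that step so the range handling can be checked on its own. The class and method names (`WellPopulationSketch`, `randomPopulations`) are hypothetical, and the argument check is an assumption rather than something the diff does in this form.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper isolating the random well-population step from makePlate().
class WellPopulationSketch {

    // Draws one population per well, uniformly from [minPop, maxPop] inclusive.
    static Integer[] randomPopulations(Random rand, int numWells, int minPop, int maxPop) {
        if (minPop < 1 || maxPop < minPop) {
            throw new IllegalArgumentException("need 1 <= minPop <= maxPop");
        }
        return rand.ints(minPop, maxPop + 1) // upper bound is exclusive, so add one
                .limit(numWells)             // one value per well
                .boxed()
                .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        Integer[] populations = randomPopulations(new Random(), 8, 500, 1000);
        System.out.println("Populations: " + Arrays.toString(populations));
    }
}
```

Passing the `Random` in as a parameter mirrors the diff's move to a shared generator from `BiGpairSEQ.getRand()`, which would allow a seeded run to be reproduced.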
3
src/main/java/META-INF/MANIFEST.MF
Normal file
@@ -0,0 +1,3 @@
|
|||||||
|
Manifest-Version: 1.0
|
||||||
|
Main-Class: BiGpairSEQ
|
||||||
|
|
||||||
@@ -11,7 +11,6 @@ import java.util.List;
|
|||||||
public class MatchingFileWriter {
|
public class MatchingFileWriter {
|
||||||
|
|
||||||
private String filename;
|
private String filename;
|
||||||
private String sourceFileName;
|
|
||||||
private List<String> comments;
|
private List<String> comments;
|
||||||
private List<String> headers;
|
private List<String> headers;
|
||||||
private List<List<String>> allResults;
|
private List<List<String>> allResults;
|
||||||
@@ -21,7 +20,6 @@ public class MatchingFileWriter {
|
|||||||
filename = filename + ".csv";
|
filename = filename + ".csv";
|
||||||
}
|
}
|
||||||
this.filename = filename;
|
this.filename = filename;
|
||||||
this.sourceFileName = result.getSourceFileName();
|
|
||||||
this.comments = result.getComments();
|
this.comments = result.getComments();
|
||||||
this.headers = result.getHeaders();
|
this.headers = result.getHeaders();
|
||||||
this.allResults = result.getAllResults();
|
this.allResults = result.getAllResults();
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ import java.util.List;
|
|||||||
import java.util.Map;
|
import java.util.Map;
|
||||||
|
|
||||||
public class MatchingResult {
|
public class MatchingResult {
|
||||||
private final String sourceFile;
|
|
||||||
private final Map<String, String> metadata;
|
private final Map<String, String> metadata;
|
||||||
private final List<String> comments;
|
private final List<String> comments;
|
||||||
private final List<String> headers;
|
private final List<String> headers;
|
||||||
@@ -12,16 +12,15 @@ public class MatchingResult {
|
|||||||
private final Map<Integer, Integer> matchMap;
|
private final Map<Integer, Integer> matchMap;
|
||||||
private final Duration time;
|
private final Duration time;
|
||||||
|
|
||||||
public MatchingResult(String sourceFileName, Map<String, String> metadata, List<String> headers,
|
public MatchingResult(Map<String, String> metadata, List<String> headers,
|
||||||
List<List<String>> allResults, Map<Integer, Integer>matchMap, Duration time){
|
List<List<String>> allResults, Map<Integer, Integer>matchMap, Duration time){
|
||||||
this.sourceFile = sourceFileName;
|
|
||||||
/*
|
/*
|
||||||
* POSSIBLE KEYS FOR METADATA MAP ARE:
|
* POSSIBLE KEYS FOR METADATA MAP ARE:
|
||||||
* sample plate filename
|
* sample plate filename *
|
||||||
* graph filename
|
* graph filename *
|
||||||
* well populations
|
* well populations *
|
||||||
* total alphas found
|
* total alphas found *
|
||||||
* total betas found
|
* total betas found *
|
||||||
* high overlap threshold
|
* high overlap threshold
|
||||||
* low overlap threshold
|
* low overlap threshold
|
||||||
* maximum occupancy difference
|
* maximum occupancy difference
|
||||||
@@ -66,7 +65,32 @@ public class MatchingResult {
|
|||||||
return time;
|
return time;
|
||||||
}
|
}
|
||||||
|
|
||||||
public String getSourceFileName() {
|
public String getPlateFilename() {
|
||||||
return sourceFile;
|
return metadata.get("sample plate filename");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
public String getGraphFilename() {
|
||||||
|
return metadata.get("graph filename");
|
||||||
|
}
|
||||||
|
|
||||||
|
public Integer[] getWellPopulations() {
|
||||||
|
List<Integer> wellPopulations = new ArrayList<>();
|
||||||
|
String popString = metadata.get("well populations");
|
||||||
|
for (String p : popString.split(", ")) {
|
||||||
|
wellPopulations.add(Integer.parseInt(p));
|
||||||
|
}
|
||||||
|
Integer[] popArray = new Integer[wellPopulations.size()];
|
||||||
|
return wellPopulations.toArray(popArray);
|
||||||
|
}
|
||||||
|
|
||||||
|
public Integer getAlphaCount() {
|
||||||
|
return Integer.parseInt(metadata.get("total alpha count"));
|
||||||
|
}
|
||||||
|
|
||||||
|
public Integer getBetaCount() {
|
||||||
|
return Integer.parseInt(metadata.get("total beta count"));
|
||||||
|
}
|
||||||
|
|
||||||
|
//put in the rest of these methods following the same pattern
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|||||||
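Review note: the new `getWellPopulations()` getter rebuilds an `Integer[]` from the `"well populations"` metadata string by splitting on `", "`. That works only if the writer always emits exactly a comma followed by one space. Below is a slightly more tolerant sketch, assuming the value is a comma-separated list of integers. The class and method names (`PopulationStringSketch`, `parse`) are hypothetical and the empty-input behaviour is an assumption.

```java
import java.util.Arrays;

// Hypothetical parser for a "well populations" metadata value such as "500, 1000, 2000".
class PopulationStringSketch {

    // Splits on commas and trims each piece, so "1,2", "1, 2" and "1 , 2" all parse alike.
    static Integer[] parse(String popString) {
        if (popString == null || popString.trim().isEmpty()) {
            return new Integer[0]; // assumption: missing metadata means no populations
        }
        String[] parts = popString.split(",");
        Integer[] result = new Integer[parts.length];
        for (int i = 0; i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("500, 1000,2000")));
    }
}
```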
@@ -10,7 +10,7 @@ import java.util.*;
|
|||||||
public class Plate {
|
public class Plate {
|
||||||
private String sourceFile;
|
private String sourceFile;
|
||||||
private List<List<Integer[]>> wells;
|
private List<List<Integer[]>> wells;
|
||||||
private Random rand = new Random();
|
private final Random rand = BiGpairSEQ.getRand();
|
||||||
private int size;
|
private int size;
|
||||||
private double error;
|
private double error;
|
||||||
private Integer[] populations;
|
private Integer[] populations;
|
||||||
@@ -51,7 +51,6 @@ public class Plate {
|
|||||||
int section = 0;
|
int section = 0;
|
||||||
double m;
|
double m;
|
||||||
int n;
|
int n;
|
||||||
int test=0;
|
|
||||||
while (section < numSections){
|
while (section < numSections){
|
||||||
for (int i = 0; i < (size / numSections); i++) {
|
for (int i = 0; i < (size / numSections); i++) {
|
||||||
List<Integer[]> well = new ArrayList<>();
|
List<Integer[]> well = new ArrayList<>();
|
||||||
@@ -61,13 +60,6 @@ public class Plate {
|
|||||||
m = (Math.log10((1 - rand.nextDouble()))/(-lambda)) * Math.sqrt(cells.size());
|
m = (Math.log10((1 - rand.nextDouble()))/(-lambda)) * Math.sqrt(cells.size());
|
||||||
} while (m >= cells.size() || m < 0);
|
} while (m >= cells.size() || m < 0);
|
||||||
n = (int) Math.floor(m);
|
n = (int) Math.floor(m);
|
||||||
//n = Equations.getRandomNumber(0, cells.size());
|
|
||||||
// was testing generating the cell sample file with exponential dist, then sampling flat here
|
|
||||||
//that would be more realistic
|
|
||||||
//But would mess up other things in the simulation with how I've coded it.
|
|
||||||
if(n > test){
|
|
||||||
test = n;
|
|
||||||
}
|
|
||||||
Integer[] cellToAdd = cells.get(n).clone();
|
Integer[] cellToAdd = cells.get(n).clone();
|
||||||
for(int k = 0; k < cellToAdd.length; k++){
|
for(int k = 0; k < cellToAdd.length; k++){
|
||||||
if(Math.abs(rand.nextDouble()) < error){//error applied to each sequence
|
if(Math.abs(rand.nextDouble()) < error){//error applied to each sequence
|
||||||
@@ -80,7 +72,6 @@ public class Plate {
|
|||||||
}
|
}
|
||||||
section++;
|
section++;
|
||||||
}
|
}
|
||||||
System.out.println("Highest index: " +test);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
public void fillWells(String sourceFileName, List<Integer[]> cells, double stdDev) {
|
public void fillWells(String sourceFileName, List<Integer[]> cells, double stdDev) {
|
||||||
|
|||||||
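Review note: `fillWellsExponential` picks a cell index by transforming a uniform draw, scaling it by `Math.sqrt(cells.size())`, and redrawing until the value falls inside the list. The diff uses `Math.log10`. Textbook inverse-transform sampling for an exponential distribution uses the natural logarithm, so the `log10` form behaves like an exponential whose rate is `lambda * ln(10)`. The sketch below shows the natural-log variant for comparison only. It is not the project's method, and the names (`ExponentialIndexSketch`, `sampleExponential`, `sampleIndex`) are hypothetical.

```java
import java.util.Random;

// Hypothetical comparison sketch: natural-log inverse-transform sampling with rejection.
class ExponentialIndexSketch {

    // Inverse-transform sample from Exp(lambda): -ln(1 - U) / lambda, with U uniform on [0, 1).
    static double sampleExponential(Random rand, double lambda) {
        return -Math.log(1.0 - rand.nextDouble()) / lambda;
    }

    // Scales the sample by sqrt(listSize) and redraws until it lands in [0, listSize),
    // mirroring the do/while loop in the diff.
    static int sampleIndex(Random rand, double lambda, int listSize) {
        double m;
        do {
            m = sampleExponential(rand, lambda) * Math.sqrt(listSize);
        } while (m >= listSize || m < 0);
        return (int) Math.floor(m);
    }

    public static void main(String[] args) {
        Random rand = new Random();
        System.out.println("Sampled index: " + sampleIndex(rand, 0.6, 10000));
    }
}
```

Low indices are drawn far more often than high ones, so cells near the start of the list are the common clones in the simulated plate.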
@@ -16,8 +16,7 @@ public class PlateFileWriter {
|
|||||||
private Double error;
|
private Double error;
|
||||||
private String filename;
|
private String filename;
|
||||||
private String sourceFileName;
|
private String sourceFileName;
|
||||||
private String[] headers;
|
private Integer[] populations;
|
||||||
private Integer[] concentrations;
|
|
||||||
private boolean isExponential = false;
|
private boolean isExponential = false;
|
||||||
|
|
||||||
public PlateFileWriter(String filename, Plate plate) {
|
public PlateFileWriter(String filename, Plate plate) {
|
||||||
@@ -36,8 +35,8 @@ public class PlateFileWriter {
|
|||||||
}
|
}
|
||||||
this.error = plate.getError();
|
this.error = plate.getError();
|
||||||
this.wells = plate.getWells();
|
this.wells = plate.getWells();
|
||||||
this.concentrations = plate.getPopulations();
|
this.populations = plate.getPopulations();
|
||||||
Arrays.sort(concentrations);
|
Arrays.sort(populations);
|
||||||
}
|
}
|
||||||
|
|
||||||
public void writePlateFile(){
|
public void writePlateFile(){
|
||||||
@@ -58,29 +57,28 @@ public class PlateFileWriter {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
//this took forever and I don't use it
|
// //this took forever and I don't use it
|
||||||
List<List<String>> rows = new ArrayList<>();
|
// //if I wanted to use it, I'd replace printer.printRecords(wellsAsStrings) with printer.printRecords(rows)
|
||||||
List<String> tmp = new ArrayList<>();
|
// List<List<String>> rows = new ArrayList<>();
|
||||||
for(int i = 0; i < wellsAsStrings.size(); i++){//List<Integer[]> w: wells){
|
// List<String> tmp = new ArrayList<>();
|
||||||
tmp.add("well " + (i+1));
|
// for(int i = 0; i < wellsAsStrings.size(); i++){//List<Integer[]> w: wells){
|
||||||
}
|
// tmp.add("well " + (i+1));
|
||||||
rows.add(tmp);
|
// }
|
||||||
for(int row = 0; row < maxLength; row++){
|
// rows.add(tmp);
|
||||||
tmp = new ArrayList<>();
|
// for(int row = 0; row < maxLength; row++){
|
||||||
for(List<String> c: wellsAsStrings){
|
// tmp = new ArrayList<>();
|
||||||
tmp.add(c.get(row));
|
// for(List<String> c: wellsAsStrings){
|
||||||
}
|
// tmp.add(c.get(row));
|
||||||
rows.add(tmp);
|
// }
|
||||||
}
|
// rows.add(tmp);
|
||||||
|
// }
|
||||||
|
|
||||||
//get list of well populations
|
//make string out of populations array
|
||||||
List<Integer> wellPopulations = Arrays.asList(concentrations);
|
|
||||||
//make string out of populations list
|
|
||||||
StringBuilder populationsStringBuilder = new StringBuilder();
|
StringBuilder populationsStringBuilder = new StringBuilder();
|
||||||
populationsStringBuilder.append(wellPopulations.remove(0).toString());
|
populationsStringBuilder.append(populations[0].toString());
|
||||||
for(Integer i: wellPopulations){
|
for(int i = 1; i < populations.length; i++){
|
||||||
populationsStringBuilder.append(", ");
|
populationsStringBuilder.append(", ");
|
||||||
populationsStringBuilder.append(i.toString());
|
populationsStringBuilder.append(populations[i].toString());
|
||||||
}
|
}
|
||||||
String wellPopulationsString = populationsStringBuilder.toString();
|
String wellPopulationsString = populationsStringBuilder.toString();
|
||||||
|
|
||||||
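Note on the populations string: the old side called `remove(0)` on a list from `Arrays.asList`, which is fixed-size and throws `UnsupportedOperationException`, so the indexed loop on the new side avoids that. An alternative sketch using the standard stream joiner is shown here. The class and method names are hypothetical and not part of this repository.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class PopulationsStringSketch {
    // Hypothetical helper: joins well populations as "a, b, c".
    // Returns an empty string for an empty array instead of failing on index 0.
    static String join(Integer[] populations) {
        return Arrays.stream(populations)
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
    }
}
```

The same helper could serve both `PlateFileWriter` and `Simulator`, which currently build the string with near-identical loops.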
|
|||||||
@@ -13,6 +13,8 @@ import java.time.Duration;
|
|||||||
import java.util.*;
|
import java.util.*;
|
||||||
import java.util.stream.IntStream;
|
import java.util.stream.IntStream;
|
||||||
|
|
||||||
|
import static java.lang.Float.*;
|
||||||
|
|
||||||
//NOTE: "sequence" in method and variable names refers to a peptide sequence from a simulated T cell
|
//NOTE: "sequence" in method and variable names refers to a peptide sequence from a simulated T cell
|
||||||
public class Simulator {
|
public class Simulator {
|
||||||
private static final int cdr3AlphaIndex = 0;
|
private static final int cdr3AlphaIndex = 0;
|
||||||
@@ -49,6 +51,7 @@ public class Simulator {
|
|||||||
Instant start = Instant.now();
|
Instant start = Instant.now();
|
||||||
int[] alphaIndex = {cdr3AlphaIndex};
|
int[] alphaIndex = {cdr3AlphaIndex};
|
||||||
int[] betaIndex = {cdr3BetaIndex};
|
int[] betaIndex = {cdr3BetaIndex};
|
||||||
|
|
||||||
int numWells = samplePlate.getSize();
|
int numWells = samplePlate.getSize();
|
||||||
|
|
||||||
if(verbose){System.out.println("Making cell maps");}
|
if(verbose){System.out.println("Making cell maps");}
|
||||||
@@ -63,15 +66,11 @@ public class Simulator {
|
|||||||
if(verbose){System.out.println("All alphas count: " + alphaCount);}
|
if(verbose){System.out.println("All alphas count: " + alphaCount);}
|
||||||
int betaCount = allBetas.size();
|
int betaCount = allBetas.size();
|
||||||
if(verbose){System.out.println("All betas count: " + betaCount);}
|
if(verbose){System.out.println("All betas count: " + betaCount);}
|
||||||
|
|
||||||
if(verbose){System.out.println("Well maps made");}
|
if(verbose){System.out.println("Well maps made");}
|
||||||
|
|
||||||
//Remove saturating-occupancy sequences because they have no signal value.
|
|
||||||
//Remove sequences with total occupancy below minimum pair overlap threshold
|
|
||||||
if(verbose){System.out.println("Removing sequences present in all wells.");}
|
if(verbose){System.out.println("Removing sequences present in all wells.");}
|
||||||
//if(verbose){System.out.println("Removing sequences with occupancy below the minimum overlap threshold");}
|
filterByOccupancyThresholds(allAlphas, 1, numWells - 1);
|
||||||
filterByOccupancyThreshold(allAlphas, 1, numWells - 1);
|
filterByOccupancyThresholds(allBetas, 1, numWells - 1);
|
||||||
filterByOccupancyThreshold(allBetas, 1, numWells - 1);
|
|
||||||
if(verbose){System.out.println("Sequences removed");}
|
if(verbose){System.out.println("Sequences removed");}
|
||||||
int pairableAlphaCount = allAlphas.size();
|
int pairableAlphaCount = allAlphas.size();
|
||||||
if(verbose){System.out.println("Remaining alphas count: " + pairableAlphaCount);}
|
if(verbose){System.out.println("Remaining alphas count: " + pairableAlphaCount);}
|
||||||
@@ -136,6 +135,7 @@ public class Simulator {
|
|||||||
GraphWithMapData output = new GraphWithMapData(graph, numWells, samplePlate.getPopulations(), alphaCount, betaCount,
|
GraphWithMapData output = new GraphWithMapData(graph, numWells, samplePlate.getPopulations(), alphaCount, betaCount,
|
||||||
distCellsMapAlphaKey, plateVtoAMap, plateVtoBMap, plateAtoVMap,
|
distCellsMapAlphaKey, plateVtoAMap, plateVtoBMap, plateAtoVMap,
|
||||||
plateBtoVMap, alphaWellCounts, betaWellCounts, time);
|
plateBtoVMap, alphaWellCounts, betaWellCounts, time);
|
||||||
|
//Set source file name in graph to name of sample plate
|
||||||
output.setSourceFilename(samplePlate.getSourceFileName());
|
output.setSourceFilename(samplePlate.getSourceFileName());
|
||||||
//return GraphWithMapData object
|
//return GraphWithMapData object
|
||||||
return output;
|
return output;
|
||||||
@@ -146,6 +146,8 @@ public class Simulator {
|
|||||||
Integer highThreshold, Integer maxOccupancyDifference,
|
Integer highThreshold, Integer maxOccupancyDifference,
|
||||||
Integer minOverlapPercent, boolean verbose) {
|
Integer minOverlapPercent, boolean verbose) {
|
||||||
Instant start = Instant.now();
|
Instant start = Instant.now();
|
||||||
|
//Integer arrays will contain TO VERTEX, FROM VERTEX, and WEIGHT (which I'll need to cast to double)
|
||||||
|
List<Integer[]> removedEdges = new ArrayList<>();
|
||||||
int numWells = data.getNumWells();
|
int numWells = data.getNumWells();
|
||||||
Integer alphaCount = data.getAlphaCount();
|
Integer alphaCount = data.getAlphaCount();
|
||||||
Integer betaCount = data.getBetaCount();
|
Integer betaCount = data.getBetaCount();
|
||||||
@@ -156,24 +158,26 @@ public class Simulator {
|
|||||||
Map<Integer, Integer> betaWellCounts = data.getBetaWellCounts();
|
Map<Integer, Integer> betaWellCounts = data.getBetaWellCounts();
|
||||||
SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph = data.getGraph();
|
SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph = data.getGraph();
|
||||||
|
|
||||||
//remove weights outside given overlap thresholds
|
//remove edges with weights outside given overlap thresholds, add those to removed edge list
|
||||||
if(verbose){System.out.println("Eliminating edges with weights outside overlap threshold values");}
|
if(verbose){System.out.println("Eliminating edges with weights outside overlap threshold values");}
|
||||||
filterByOccupancyThreshold(graph, lowThreshold, highThreshold);
|
removedEdges.addAll(GraphModificationFunctions.filterByOverlapThresholds(graph, lowThreshold, highThreshold));
|
||||||
if(verbose){System.out.println("Over- and under-weight edges set to 0.0");}
|
if(verbose){System.out.println("Over- and under-weight edges removed");}
|
||||||
|
|
||||||
//Filter by overlap size
|
//remove edges between vertices with too small an overlap size, add those to removed edge list
|
||||||
if(verbose){System.out.println("Eliminating edges with weights less than " + minOverlapPercent.toString() +
|
if(verbose){System.out.println("Eliminating edges with weights less than " + minOverlapPercent.toString() +
|
||||||
" percent of vertex occupancy value.");}
|
" percent of vertex occupancy value.");}
|
||||||
filterByOverlapSize(graph, alphaWellCounts, betaWellCounts, plateVtoAMap, plateVtoBMap, minOverlapPercent);
|
removedEdges.addAll(GraphModificationFunctions.filterByOverlapPercent(graph, alphaWellCounts, betaWellCounts,
|
||||||
if(verbose){System.out.println("Edges with weights too far below vertex occupancy values set to 0.0");}
|
plateVtoAMap, plateVtoBMap, minOverlapPercent));
|
||||||
|
if(verbose){System.out.println("Edges with weights too far below a vertex occupancy value removed");}
|
||||||
|
|
||||||
//Filter by relative occupancy
|
//Filter by relative occupancy
|
||||||
if(verbose){System.out.println("Eliminating edges between vertices with occupancy difference > "
|
if(verbose){System.out.println("Eliminating edges between vertices with occupancy difference > "
|
||||||
+ maxOccupancyDifference);}
|
+ maxOccupancyDifference);}
|
||||||
filterByRelativeOccupancy(graph, alphaWellCounts, betaWellCounts, plateVtoAMap, plateVtoBMap,
|
removedEdges.addAll(GraphModificationFunctions.filterByRelativeOccupancy(graph, alphaWellCounts, betaWellCounts,
|
||||||
maxOccupancyDifference);
|
plateVtoAMap, plateVtoBMap, maxOccupancyDifference));
|
||||||
if(verbose){System.out.println("Edges between vertices with excessively different occupancy values " +
|
if(verbose){System.out.println("Edges between vertices with excessively different occupancy values " +
|
||||||
"set to 0.0");}
|
"removed");}
|
||||||
|
|
||||||
//Find Maximum Weighted Matching
|
//Find Maximum Weighted Matching
|
||||||
//using jheaps library class PairingHeap for improved efficiency
|
//using jheaps library class PairingHeap for improved efficiency
|
||||||
if(verbose){System.out.println("Finding maximum weighted matching");}
|
if(verbose){System.out.println("Finding maximum weighted matching");}
|
||||||
@@ -239,18 +243,26 @@ public class Simulator {
|
|||||||
|
|
||||||
//Metadata comments for CSV file
|
//Metadata comments for CSV file
|
||||||
int min = Math.min(alphaCount, betaCount);
|
int min = Math.min(alphaCount, betaCount);
|
||||||
|
//rate of attempted matching
|
||||||
double attemptRate = (double) (trueCount + falseCount) / min;
|
double attemptRate = (double) (trueCount + falseCount) / min;
|
||||||
BigDecimal attemptRateTrunc = new BigDecimal(attemptRate, mc);
|
BigDecimal attemptRateTrunc = new BigDecimal(attemptRate, mc);
|
||||||
|
//rate of pairing error
|
||||||
double pairingErrorRate = (double) falseCount / (trueCount + falseCount);
|
double pairingErrorRate = (double) falseCount / (trueCount + falseCount);
|
||||||
BigDecimal pairingErrorRateTrunc = new BigDecimal(pairingErrorRate, mc);
|
BigDecimal pairingErrorRateTrunc;
|
||||||
//get list of well concentrations
|
if(Double.isNaN(pairingErrorRate) || Double.isInfinite(pairingErrorRate)) {
|
||||||
List<Integer> wellPopulations = Arrays.asList(data.getWellConcentrations());
|
pairingErrorRateTrunc = new BigDecimal(-1, mc);
|
||||||
//make string out of concentrations list
|
}
|
||||||
|
else{
|
||||||
|
pairingErrorRateTrunc = new BigDecimal(pairingErrorRate, mc);
|
||||||
|
}
|
||||||
|
//get list of well populations
|
||||||
|
Integer[] wellPopulations = data.getWellPopulations();
|
||||||
|
//make string out of populations list
|
||||||
StringBuilder populationsStringBuilder = new StringBuilder();
|
StringBuilder populationsStringBuilder = new StringBuilder();
|
||||||
populationsStringBuilder.append(wellPopulations.remove(0).toString());
|
populationsStringBuilder.append(wellPopulations[0].toString());
|
||||||
for(Integer i: wellPopulations){
|
for(int i = 1; i < wellPopulations.length; i++){
|
||||||
populationsStringBuilder.append(", ");
|
populationsStringBuilder.append(", ");
|
||||||
populationsStringBuilder.append(i.toString());
|
populationsStringBuilder.append(wellPopulations[i].toString());
|
||||||
}
|
}
|
||||||
String wellPopulationsString = populationsStringBuilder.toString();
|
String wellPopulationsString = populationsStringBuilder.toString();
|
||||||
//total simulation time
|
//total simulation time
|
||||||
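Note on guarding the pairing error rate: in Java, `==` never reports any value as equal to NaN, so a guard for the zero-pairings case (0/0) is normally written with `Double.isNaN` and `Double.isInfinite`. A minimal sketch of such a guard follows. The class and method names are hypothetical; the `-1` sentinel mirrors the value used in this hunk.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class ErrorRateSketch {
    // Hypothetical helper: truncated error rate, or -1 when no pairings were attempted.
    static BigDecimal truncatedErrorRate(int trueCount, int falseCount, MathContext mc) {
        // With both counts at zero this is 0.0 / 0.0, which evaluates to NaN.
        double rate = (double) falseCount / (trueCount + falseCount);
        // new BigDecimal(double) throws NumberFormatException for NaN or infinity,
        // so test for those values with the library predicates first.
        if (Double.isNaN(rate) || Double.isInfinite(rate)) {
            return new BigDecimal(-1, mc);
        }
        return new BigDecimal(rate, mc);
    }
}
```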
@@ -265,20 +277,26 @@ public class Simulator {
|
|||||||
metadata.put("total betas found", betaCount.toString());
|
metadata.put("total betas found", betaCount.toString());
|
||||||
metadata.put("high overlap threshold", highThreshold.toString());
|
metadata.put("high overlap threshold", highThreshold.toString());
|
||||||
metadata.put("low overlap threshold", lowThreshold.toString());
|
metadata.put("low overlap threshold", lowThreshold.toString());
|
||||||
metadata.put("maximum occupancy difference", maxOccupancyDifference.toString());
|
|
||||||
metadata.put("minimum overlap percent", minOverlapPercent.toString());
|
metadata.put("minimum overlap percent", minOverlapPercent.toString());
|
||||||
|
metadata.put("maximum occupancy difference", maxOccupancyDifference.toString());
|
||||||
metadata.put("pairing attempt rate", attemptRateTrunc.toString());
|
metadata.put("pairing attempt rate", attemptRateTrunc.toString());
|
||||||
metadata.put("correct pairing count", Integer.toString(trueCount));
|
metadata.put("correct pairing count", Integer.toString(trueCount));
|
||||||
metadata.put("incorrect pairing count", Integer.toString(falseCount));
|
metadata.put("incorrect pairing count", Integer.toString(falseCount));
|
||||||
metadata.put("pairing error rate", pairingErrorRateTrunc.toString());
|
metadata.put("pairing error rate", pairingErrorRateTrunc.toString());
|
||||||
metadata.put("simulation time", nf.format(time.toSeconds()));
|
metadata.put("simulation time", nf.format(time.toSeconds()));
|
||||||
|
//create MatchingResult object
|
||||||
MatchingResult output = new MatchingResult(data.getSourceFilename(), metadata, header, allResults, matchMap, time);
|
MatchingResult output = new MatchingResult(metadata, header, allResults, matchMap, time);
|
||||||
if(verbose){
|
if(verbose){
|
||||||
for(String s: output.getComments()){
|
for(String s: output.getComments()){
|
||||||
System.out.println(s);
|
System.out.println(s);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
//put the removed edges back on the graph
|
||||||
|
System.out.println("Restoring removed edges to graph.");
|
||||||
|
GraphModificationFunctions.addRemovedEdges(graph, removedEdges);
|
||||||
|
|
||||||
|
//return MatchingResult object
|
||||||
return output;
|
return output;
|
||||||
}
|
}
|
||||||
|
|
||||||
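Note on the remove-then-restore pattern: the new side collects filtered edges as `Integer[]` triples and puts them back after matching, so the stored graph is left unchanged. The implementation of `GraphModificationFunctions` is not shown in this diff, so the sketch below is only an illustration on a toy edge map rather than the JGraphT graph. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class EdgeFilterSketch {
    // Toy graph: key is [source, target], value is the edge weight.
    // Removes edges whose weight is outside [low, high] and returns them
    // as {source, target, weight} triples.
    static List<Integer[]> filterByWeight(Map<List<Integer>, Integer> edges, int low, int high) {
        List<Integer[]> removed = new ArrayList<>();
        Iterator<Map.Entry<List<Integer>, Integer>> it = edges.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<List<Integer>, Integer> e = it.next();
            int w = e.getValue();
            if (w > high || w < low) {
                removed.add(new Integer[]{e.getKey().get(0), e.getKey().get(1), w});
                it.remove(); // safe removal while iterating
            }
        }
        return removed;
    }

    // Puts previously removed edges back with their original weights.
    static void restore(Map<List<Integer>, Integer> edges, List<Integer[]> removed) {
        for (Integer[] r : removed) {
            edges.put(Arrays.asList(r[0], r[1]), r[2]);
        }
    }
}
```

Removing through the iterator avoids the `ConcurrentModificationException` that would result from deleting entries inside a for-each loop over the same collection.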
@@ -587,6 +605,18 @@ public class Simulator {
|
|||||||
// return output;
|
// return output;
|
||||||
// }
|
// }
|
||||||
|
|
||||||
|
//Remove sequences based on occupancy
|
||||||
|
public static void filterByOccupancyThresholds(Map<Integer, Integer> wellMap, int low, int high){
|
||||||
|
List<Integer> noise = new ArrayList<>();
|
||||||
|
for(Integer k: wellMap.keySet()){
|
||||||
|
if((wellMap.get(k) > high) || (wellMap.get(k) < low)){
|
||||||
|
noise.add(k);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for(Integer k: noise) {
|
||||||
|
wellMap.remove(k);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
//Counts the well occupancy of the row peptides and column peptides into given maps, and
|
//Counts the well occupancy of the row peptides and column peptides into given maps, and
|
||||||
//fills weights in the given 2D array
|
//fills weights in the given 2D array
|
||||||
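Note on `filterByOccupancyThresholds`: the two-pass approach (collect keys, then remove them) works, and the same effect can be had in one pass with `removeIf` on the map's value view. A minimal sketch follows, with a hypothetical class name and the same parameters as the method in this hunk.

```java
import java.util.Map;

final class OccupancyFilterSketch {
    // Drops every sequence whose well count is above high or below low.
    // Removing from the values() view removes the matching map entries.
    static void filterByOccupancyThresholds(Map<Integer, Integer> wellMap, int low, int high) {
        wellMap.values().removeIf(count -> count > high || count < low);
    }
}
```

Called with `low = 1` and `high = numWells - 1`, this drops sequences found in every well, as the hunk above does.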
@@ -630,62 +660,6 @@ public class Simulator {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
private static void filterByOccupancyThreshold(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
|
||||||
int low, int high) {
|
|
||||||
for(DefaultWeightedEdge e: graph.edgeSet()){
|
|
||||||
if ((graph.getEdgeWeight(e) > high) || (graph.getEdgeWeight(e) < low)){
|
|
||||||
graph.setEdgeWeight(e, 0.0);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
private static void filterByOccupancyThreshold(Map<Integer, Integer> wellMap, int low, int high){
|
|
||||||
List<Integer> noise = new ArrayList<>();
|
|
||||||
for(Integer k: wellMap.keySet()){
|
|
||||||
if((wellMap.get(k) > high) || (wellMap.get(k) < low)){
|
|
||||||
noise.add(k);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
for(Integer k: noise) {
|
|
||||||
wellMap.remove(k);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
//Remove edges for pairs with large occupancy discrepancy
|
|
||||||
private static void filterByRelativeOccupancy(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
|
||||||
Map<Integer, Integer> alphaWellCounts,
|
|
||||||
Map<Integer, Integer> betaWellCounts,
|
|
||||||
Map<Integer, Integer> plateVtoAMap,
|
|
||||||
Map<Integer, Integer> plateVtoBMap,
|
|
||||||
Integer maxOccupancyDifference) {
|
|
||||||
for (DefaultWeightedEdge e : graph.edgeSet()) {
|
|
||||||
Integer alphaOcc = alphaWellCounts.get(plateVtoAMap.get(graph.getEdgeSource(e)));
|
|
||||||
Integer betaOcc = betaWellCounts.get(plateVtoBMap.get(graph.getEdgeTarget(e)));
|
|
||||||
//Adjust this to something cleverer later
|
|
||||||
if (Math.abs(alphaOcc - betaOcc) >= maxOccupancyDifference) {
|
|
||||||
graph.setEdgeWeight(e, 0.0);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
//Remove edges for pairs where overlap size is significantly lower than the well occupancy
|
|
||||||
private static void filterByOverlapSize(SimpleWeightedGraph<Integer, DefaultWeightedEdge> graph,
|
|
||||||
Map<Integer, Integer> alphaWellCounts,
|
|
||||||
Map<Integer, Integer> betaWellCounts,
|
|
||||||
Map<Integer, Integer> plateVtoAMap,
|
|
||||||
Map<Integer, Integer> plateVtoBMap,
|
|
||||||
Integer minOverlapPercent) {
|
|
||||||
for (DefaultWeightedEdge e : graph.edgeSet()) {
|
|
||||||
Integer alphaOcc = alphaWellCounts.get(plateVtoAMap.get(graph.getEdgeSource(e)));
|
|
||||||
Integer betaOcc = betaWellCounts.get(plateVtoBMap.get(graph.getEdgeTarget(e)));
|
|
||||||
double weight = graph.getEdgeWeight(e);
|
|
||||||
double min = minOverlapPercent / 100.0;
|
|
||||||
if ((weight / alphaOcc < min) || (weight / betaOcc < min)) {
|
|
||||||
graph.setEdgeWeight(e, 0.0);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private static Map<Integer, Integer> makeSequenceToSequenceMap(List<Integer[]> cells, int keySequenceIndex,
|
private static Map<Integer, Integer> makeSequenceToSequenceMap(List<Integer[]> cells, int keySequenceIndex,
|
||||||
int valueSequenceIndex){
|
int valueSequenceIndex){
|
||||||
Map<Integer, Integer> keySequenceToValueSequenceMap = new HashMap<>();
|
Map<Integer, Integer> keySequenceToValueSequenceMap = new HashMap<>();
|
||||||
|
|||||||