Compare commits
14 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 4bcda9b66c | | |
| | 17ae763c6c | | |
| | decdb147a9 | | |
| | 74ffbfd8ac | | |
| | 08699ce8ce | | |
| | 69b0cc535c | | |
| | e58f7b0a55 | | |
| | dd2164c250 | | |
| | 7323093bdc | | |
| | f904cf6672 | | |
| | 3ccee9891b | | |
| | 40c2be1cfb | | |
| | 4b597c4e5e | | |
| | b2398531a3 | | |
34
readme.md
@@ -12,7 +12,7 @@ Unlike pairSEQ, which calculates p-values for every TCR alpha/beta overlap and c
against a null distribution, BiGpairSEQ does not do any statistical calculations
directly.
BiGpairSEQ creates a [simple bipartite weighted graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
BiGpairSEQ creates a [weighted bipartite graph](https://en.wikipedia.org/wiki/Bipartite_graph) representing the sample plate.
The distinct TCRA and TCRB sequences form the two sets of vertices. Every TCRA/TCRB pair that shares a well
is connected by an edge, with the edge weight set to the number of wells in which both sequences appear.
(Sequences present in *all* wells are filtered out prior to creating the graph, as there is no signal in their occupancy pattern.)
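
The graph construction described above can be sketched in plain Java. This is a minimal illustration, not the project's `Simulator.makeGraph` code: the class name `OccupancyGraphSketch`, its method names, and the representation of a well as a list of `{alpha, beta}` integer pairs are all assumptions made for the example, and nested maps stand in for a graph library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and data layout are hypothetical.
final class OccupancyGraphSketch {

    // Collects the distinct sequence IDs found at one position of each cell in a well.
    static Set<Integer> distinctAt(List<int[]> well, int index) {
        Set<Integer> ids = new HashSet<>();
        for (int[] cell : well) {
            ids.add(cell[index]);
        }
        return ids;
    }

    // Counts, for each sequence ID, the number of wells that contain it.
    static Map<Integer, Integer> occupancy(List<Set<Integer>> idsPerWell) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Set<Integer> ids : idsPerWell) {
            for (Integer id : ids) {
                counts.merge(id, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns edge weights as alpha ID -> (beta ID -> number of shared wells).
    static Map<Integer, Map<Integer, Integer>> edgeWeights(List<List<int[]>> wells) {
        List<Set<Integer>> alphas = new ArrayList<>();
        List<Set<Integer>> betas = new ArrayList<>();
        for (List<int[]> well : wells) {
            alphas.add(distinctAt(well, 0)); // position 0: alpha sequence (assumed)
            betas.add(distinctAt(well, 1));  // position 1: beta sequence (assumed)
        }
        Map<Integer, Integer> alphaOcc = occupancy(alphas);
        Map<Integer, Integer> betaOcc = occupancy(betas);
        int numWells = wells.size();

        Map<Integer, Map<Integer, Integer>> weights = new HashMap<>();
        for (int w = 0; w < numWells; w++) {
            for (Integer a : alphas.get(w)) {
                if (alphaOcc.get(a) == numWells) continue; // present in every well: no signal
                for (Integer b : betas.get(w)) {
                    if (betaOcc.get(b) == numWells) continue;
                    // One more well in which this alpha and this beta appear together.
                    weights.computeIfAbsent(a, k -> new HashMap<>()).merge(b, 1, Integer::sum);
                }
            }
        }
        return weights;
    }
}
```

Calling `OccupancyGraphSketch.edgeWeights(wells)` returns one entry per edge; a pair that never shares a well has no entry at all, which corresponds to a missing edge.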
@@ -69,16 +69,26 @@ Please select an option:
0) Exit
```
### OUTPUT
### INPUT/OUTPUT
To run the simulation, the program reads and writes 4 kinds of files:
* Cell Sample files in CSV format
* Sample Plate files in CSV format
* Graph and Data files in binary object serialization format
* Graph/Data files in binary object serialization format
* Matching Results files in CSV format
These files are often generated in sequence. To save file I/O time, the most recent instance of each of these four
files either generated or read from disk is cached in program memory. This is especially important for Graph/Data files,
which can be several gigabytes in size. Since some simulations may require running multiple,
differently configured BiGpairSEQ matchings on the same graph, keeping the most recent graph cached drastically reduces
execution time.
A data file that is reused does not need to be read in again until another file of that type is used or generated.
The program checks whether it needs to update its cached data by comparing filenames as entered by the user. On
encountering a new filename, the program flushes its cache and reads in the new file.
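
The caching rule described above (keep the most recent file of a type, and reload only when the filename changes) might look roughly like the following. This is a sketch under stated assumptions: `SingleSlotCache`, `get`, and `loadCount` are hypothetical names, and the reader function stands in for whichever file reader applies.

```java
import java.util.function.Function;

// Illustrative single-slot cache keyed by filename; all names are hypothetical.
final class SingleSlotCache<T> {
    private String cachedFilename = null;
    private T cachedValue = null;
    private int loads = 0;

    // Returns the cached value when the filename matches the cached one.
    // Otherwise drops the old value, reads the new file, and caches the result.
    T get(String filename, Function<String, T> reader) {
        if (cachedValue == null || !filename.equals(cachedFilename)) {
            cachedValue = null;                   // flush before reading the new file
            cachedValue = reader.apply(filename); // stand-in for the real file read
            cachedFilename = filename;
            loads++;
        }
        return cachedValue;
    }

    // Number of times the reader was actually invoked.
    int loadCount() {
        return loads;
    }
}
```

With one such slot per file type, repeated matchings on the same graph file would invoke the reader once, while switching to a different filename and back would trigger a fresh read each time.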
When entering filenames, it is not necessary to include the file extension (.csv or .ser). When reading or
writing files, the program will automatically add the correct extension to any filename without one.
writing files, the program will automatically add the correct extension to any filename without one.
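
As a rough illustration of the extension handling, a helper along these lines would leave a complete filename alone and complete a bare one. `FilenameSketch` and `withExtension` are hypothetical names; the `CellFileReader` constructor in this change set performs a similar check with `filename.matches(".*\\.csv")`.

```java
// Illustrative helper; the class and method names are hypothetical.
final class FilenameSketch {

    // Appends the extension unless the filename already ends with it.
    static String withExtension(String filename, String extension) {
        return filename.endsWith(extension) ? filename : filename + extension;
    }
}
```

Under this sketch, `withExtension("cells", ".csv")` and `withExtension("cells.csv", ".csv")` both yield `cells.csv`.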
#### Cell Sample Files
Cell Sample files consist of any number of distinct "T cells." Every cell contains
@@ -121,7 +131,7 @@ Options when making a Sample Plate file:
* Standard deviation size
* Exponential
* Lambda value
* (Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was exponential with a lambda of approximately 0.6. (Howie, et al. 2015))
* *(Based on the slope of the graph in Figure 4C of the pairSEQ paper, the distribution of the original experiment was exponential with a lambda of approximately 0.6. (Howie, et al. 2015))*
* Total number of wells on the plate
* Number of sections on plate
* Number of T cells per well
@@ -129,7 +139,7 @@ Options when making a Sample Plate file:
* Dropout rate
Files are in CSV format. There are no header labels. Every row represents a well.
Every column represents an individual cell, containing four sequences, depicted as an array string:
Every value represents an individual cell, containing four sequences, depicted as an array string:
`[CDR3A, CDR3B, CDR1A, CDR1B]`. So a representative cell might look like this:
`[525902, 791533, -1, 866282]`
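
A value in this array-string form could be turned back into four integers with a small parser such as the one below. This is only a sketch: `CellStringSketch` and `parseCell` are hypothetical names, and the project's own `PlateFileReader` may parse the file differently.

```java
// Illustrative parser for one cell value; names are hypothetical.
final class CellStringSketch {

    // Parses a string such as "[525902, 791533, -1, 866282]" into its integers,
    // in the order [CDR3A, CDR3B, CDR1A, CDR1B].
    static Integer[] parseCell(String text) {
        String trimmed = text.trim();
        if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("Not an array string: " + text);
        }
        // Drop the surrounding brackets, then split on the comma separators.
        String[] parts = trimmed.substring(1, trimmed.length() - 1).split(",");
        Integer[] cell = new Integer[parts.length];
        for (int i = 0; i < parts.length; i++) {
            cell[i] = Integer.valueOf(parts[i].trim()); // negative values parse as-is
        }
        return cell;
    }
}
```

For the representative cell above, `parseCell` returns an array whose third element is `-1` and whose other elements are the three positive sequence IDs.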
@@ -155,8 +165,8 @@ Structure:
---
#### Graph and Data Files
Graph and Data files are serialized binaries of a Java object containing the weighted bipartite graph representation of a
#### Graph/Data Files
Graph/Data files are serialized binaries of a Java object containing the weighted bipartite graph representation of a
Sample Plate, along with the necessary metadata for matching and results output. Making them requires a Cell Sample file
(to construct a list of correct sequence pairs for checking the accuracy of BiGpairSEQ simulations) and a
Sample Plate file (to construct the associated occupancy graph).
@@ -164,7 +174,7 @@ Sample Plate file (to construct the associated occupancy graph).
These files can be several gigabytes in size. Writing them to a file lets us generate a graph and its metadata once,
then use it for multiple different BiGpairSEQ simulations.
Options for creating a Graph and Data file:
Options for creating a Graph/Data file:
* The Cell Sample file to use
* The Sample Plate file to use. (This must have been generated from the selected Cell Sample file.)
@@ -175,11 +185,7 @@ portable data format may be implemented in the future. The tricky part is encodi
#### Matching Results Files
Matching results files consist of the results of a BiGpairSEQ matching simulation. Making them requires a Graph and
Data file. To save file I/O time, the data from the most recent Graph and Data file read or generated is cached
by the simulator. Subsequent BiGpairSEQ simulations run with the same input filename will use the cached version
rather than reading in again from disk.
Files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
Data file. Matching results files are in CSV format. Rows are sequence pairings with extra relevant data. Columns are pairing-specific details.
Metadata about the matching simulation is included as comments. Comments are preceded by `#`.
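
A consumer of a Matching Results file would typically separate the `#` comment lines from the CSV rows before parsing. The sketch below shows one way to do that on lines already read into memory; `ResultsFileSketch`, `metadata`, and `rows` are hypothetical names, and no particular column layout is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative splitter for results-file lines; names are hypothetical.
final class ResultsFileSketch {

    // Returns the comment lines with the leading '#' and surrounding spaces removed.
    static List<String> metadata(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("#")) {
                out.add(line.substring(1).trim());
            }
        }
        return out;
    }

    // Returns the non-comment, non-blank lines, which hold the CSV data.
    static List<String> rows(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("#") && !line.isBlank()) {
                out.add(line);
            }
        }
        return out;
    }
}
```

The rows returned by `rows` can then be handed to any CSV parser, while the strings from `metadata` carry the simulation settings recorded in the comments.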
Options when running a BiGpairSEQ simulation of CDR3 alpha/beta matching:
@@ -1,6 +1,13 @@
|
||||
//main class. Only job is to choose which interface to use, and hold graph data in memory
|
||||
import java.util.Random;
|
||||
|
||||
//main class. For choosing interface type and caching file data
|
||||
public class BiGpairSEQ {
|
||||
|
||||
private static final Random rand = new Random();
|
||||
private static CellSample cellSampleInMemory = null;
|
||||
private static String cellFilename = null;
|
||||
private static Plate plateInMemory = null;
|
||||
private static String plateFilename = null;
|
||||
private static GraphWithMapData graphInMemory = null;
|
||||
private static String graphFilename = null;
|
||||
|
||||
@@ -15,18 +22,64 @@ public class BiGpairSEQ {
|
||||
}
|
||||
}
|
||||
|
||||
public static GraphWithMapData getGraph() {
|
||||
public static Random getRand() {
|
||||
return rand;
|
||||
}
|
||||
|
||||
public static CellSample getCellSampleInMemory() {
|
||||
return cellSampleInMemory;
|
||||
}
|
||||
|
||||
public static void setCellSampleInMemory(CellSample cellSampleInMemory) {
|
||||
BiGpairSEQ.cellSampleInMemory = cellSampleInMemory;
|
||||
}
|
||||
|
||||
public static void clearCellSampleInMemory() {
|
||||
cellSampleInMemory = null;
|
||||
System.gc();
|
||||
}
|
||||
|
||||
public static String getCellFilename() {
|
||||
return cellFilename;
|
||||
}
|
||||
|
||||
public static void setCellFilename(String cellFilename) {
|
||||
BiGpairSEQ.cellFilename = cellFilename;
|
||||
}
|
||||
|
||||
public static Plate getPlateInMemory() {
|
||||
return plateInMemory;
|
||||
}
|
||||
|
||||
public static void setPlateInMemory(Plate plateInMemory) {
|
||||
BiGpairSEQ.plateInMemory = plateInMemory;
|
||||
}
|
||||
|
||||
public static void clearPlateInMemory() {
|
||||
plateInMemory = null;
|
||||
System.gc();
|
||||
}
|
||||
|
||||
public static String getPlateFilename() {
|
||||
return plateFilename;
|
||||
}
|
||||
|
||||
public static void setPlateFilename(String plateFilename) {
|
||||
BiGpairSEQ.plateFilename = plateFilename;
|
||||
}
|
||||
|
||||
public static GraphWithMapData getGraphInMemory() {
|
||||
return graphInMemory;
|
||||
}
|
||||
|
||||
public static void setGraph(GraphWithMapData g) {
|
||||
public static void setGraphInMemory(GraphWithMapData g) {
|
||||
if (graphInMemory != null) {
|
||||
clearGraph();
|
||||
clearGraphInMemory();
|
||||
}
|
||||
graphInMemory = g;
|
||||
}
|
||||
|
||||
public static void clearGraph() {
|
||||
public static void clearGraphInMemory() {
|
||||
graphInMemory = null;
|
||||
System.gc();
|
||||
}
|
||||
|
||||
@@ -13,6 +13,7 @@ public class CellFileReader {
|
||||
|
||||
private String filename;
|
||||
private List<Integer[]> distinctCells = new ArrayList<>();
|
||||
private Integer cdr1Freq;
|
||||
|
||||
public CellFileReader(String filename) {
|
||||
if(!filename.matches(".*\\.csv")){
|
||||
@@ -38,19 +39,37 @@ public class CellFileReader {
|
||||
cell[3] = Integer.valueOf(record.get("Beta CDR1"));
|
||||
distinctCells.add(cell);
|
||||
}
|
||||
|
||||
|
||||
} catch(IOException ex){
|
||||
System.out.println("cell file " + filename + " not found.");
|
||||
System.err.println(ex);
|
||||
}
|
||||
|
||||
//get CDR1 frequency
|
||||
ArrayList<Integer> cdr1Alphas = new ArrayList<>();
|
||||
for (Integer[] cell : distinctCells) {
|
||||
cdr1Alphas.add(cell[3]);
|
||||
}
|
||||
double count = cdr1Alphas.stream().distinct().count();
|
||||
count = Math.ceil(distinctCells.size() / count);
|
||||
cdr1Freq = (int) count;
|
||||
|
||||
}
|
||||
|
||||
public CellSample getCellSample() {
|
||||
return new CellSample(distinctCells, cdr1Freq);
|
||||
}
|
||||
|
||||
public String getFilename() { return filename;}
|
||||
|
||||
public List<Integer[]> getCells(){
|
||||
//Refactor everything that uses this to have access to a Cell Sample and get the cells there instead.
|
||||
public List<Integer[]> getListOfDistinctCellsDEPRECATED(){
|
||||
return distinctCells;
|
||||
}
|
||||
|
||||
public Integer getCellCount() {
|
||||
public Integer getCellCountDEPRECATED() {
|
||||
//Refactor everything that uses this to have access to a Cell Sample and get the count there instead.
|
||||
return distinctCells.size();
|
||||
}
|
||||
}
|
||||
|
||||
@@ -18,7 +18,7 @@ public class CellSample {
|
||||
return cdr1Freq;
|
||||
}
|
||||
|
||||
public Integer population(){
|
||||
public Integer getCellCount(){
|
||||
return cells.size();
|
||||
}
|
||||
|
||||
|
||||
@@ -297,7 +297,7 @@ public class CommandLineInterface {
|
||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getCells(), lambda);
|
||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), lambda);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
@@ -305,9 +305,9 @@ public class CommandLineInterface {
|
||||
private static void makePlatePoisson(String cellFile, String filename, Integer numWells,
|
||||
Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Double stdDev = Math.sqrt(cellReader.getCellCount());
|
||||
Double stdDev = Math.sqrt(cellReader.getCellCountDEPRECATED());
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
@@ -316,7 +316,7 @@ public class CommandLineInterface {
|
||||
Integer numWells, Integer[] concentrations, Double dropOutRate){
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getListOfDistinctCellsDEPRECATED(), stdDev);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
}
|
||||
|
||||
@@ -4,10 +4,6 @@ import java.math.MathContext;
|
||||
|
||||
public abstract class Equations {
|
||||
|
||||
public static int getRandomNumber(int min, int max) {
|
||||
return (int) ((Math.random() * (max - min)) + min);
|
||||
}
|
||||
|
||||
//pValue calculation as described in original pairSEQ paper.
|
||||
//Included for comparison with original results.
|
||||
//Not used by BiGpairSEQ for matching.
|
||||
|
||||
@@ -11,7 +11,7 @@ public class GraphWithMapData implements java.io.Serializable {
|
||||
private String sourceFilename;
|
||||
private final SimpleWeightedGraph graph;
|
||||
private Integer numWells;
|
||||
private Integer[] wellConcentrations;
|
||||
private Integer[] wellPopulations;
|
||||
private Integer alphaCount;
|
||||
private Integer betaCount;
|
||||
private final Map<Integer, Integer> distCellsMapAlphaKey;
|
||||
@@ -31,7 +31,7 @@ public class GraphWithMapData implements java.io.Serializable {
|
||||
Map<Integer, Integer> betaWellCounts, Duration time) {
|
||||
this.graph = graph;
|
||||
this.numWells = numWells;
|
||||
this.wellConcentrations = wellConcentrations;
|
||||
this.wellPopulations = wellConcentrations;
|
||||
this.alphaCount = alphaCount;
|
||||
this.betaCount = betaCount;
|
||||
this.distCellsMapAlphaKey = distCellsMapAlphaKey;
|
||||
@@ -52,8 +52,8 @@ public class GraphWithMapData implements java.io.Serializable {
|
||||
return numWells;
|
||||
}
|
||||
|
||||
public Integer[] getWellConcentrations() {
|
||||
return wellConcentrations;
|
||||
public Integer[] getWellPopulations() {
|
||||
return wellPopulations;
|
||||
}
|
||||
|
||||
public Integer getAlphaCount() {
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
import java.io.IOException;
|
||||
import java.util.List;
|
||||
import java.util.Scanner;
|
||||
import java.util.InputMismatchException;
|
||||
import java.util.*;
|
||||
import java.util.regex.Matcher;
|
||||
import java.util.regex.Pattern;
|
||||
|
||||
//
|
||||
public class InteractiveInterface {
|
||||
|
||||
final static Scanner sc = new Scanner(System.in);
|
||||
static int input;
|
||||
static boolean quit = false;
|
||||
private static final Random rand = BiGpairSEQ.getRand();
|
||||
private static final Scanner sc = new Scanner(System.in);
|
||||
private static int input;
|
||||
private static boolean quit = false;
|
||||
|
||||
public static void startInteractive() {
|
||||
|
||||
@@ -73,8 +74,15 @@ public class InteractiveInterface {
|
||||
}
|
||||
CellSample sample = Simulator.generateCellSample(numCells, cdr1Freq);
|
||||
assert filename != null;
|
||||
System.out.println("Writing cells to file");
|
||||
CellFileWriter writer = new CellFileWriter(filename, sample);
|
||||
writer.writeCellsToFile();
|
||||
System.out.println("Cell sample written to: " + filename);
|
||||
if(BiGpairSEQ.getCellSampleInMemory() != null) {
|
||||
BiGpairSEQ.clearCellSampleInMemory();
|
||||
}
|
||||
BiGpairSEQ.setCellSampleInMemory(sample);
|
||||
BiGpairSEQ.setCellFilename(filename);
|
||||
}
|
||||
|
||||
//Output a CSV of sample plate
|
||||
@@ -84,7 +92,7 @@ public class InteractiveInterface {
|
||||
Double stdDev = 0.0;
|
||||
Integer numWells = 0;
|
||||
Integer numSections;
|
||||
Integer[] concentrations = {1};
|
||||
Integer[] populations = {1};
|
||||
Double dropOutRate = 0.0;
|
||||
boolean poisson = false;
|
||||
boolean exponential = false;
|
||||
@@ -123,10 +131,11 @@ public class InteractiveInterface {
|
||||
}
|
||||
case 3 -> {
|
||||
exponential = true;
|
||||
System.out.println("Please enter lambda value for exponential distribution.");
|
||||
System.out.print("Please enter lambda value for exponential distribution: ");
|
||||
lambda = sc.nextDouble();
|
||||
if (lambda <= 0.0) {
|
||||
throw new InputMismatchException("Value must be positive.");
|
||||
lambda = 0.6;
|
||||
System.out.println("Value must be positive. Defaulting to 0.6.");
|
||||
}
|
||||
}
|
||||
default -> {
|
||||
@@ -139,22 +148,57 @@ public class InteractiveInterface {
|
||||
if(numWells < 1){
|
||||
throw new InputMismatchException("No wells on plate");
|
||||
}
|
||||
System.out.println("\nThe plate can be evenly sectioned to allow multiple concentrations of T-cells/well");
|
||||
System.out.println("How many sections would you like to make (minimum 1)?");
|
||||
numSections = sc.nextInt();
|
||||
if(numSections < 1) {
|
||||
throw new InputMismatchException("Too few sections.");
|
||||
//choose whether to make T cell population/well random
|
||||
boolean randomWellPopulations;
|
||||
System.out.println("Randomize number of T cells in each well? (y/n)");
|
||||
String ans = sc.next();
|
||||
Pattern pattern = Pattern.compile("(?:yes|y)", Pattern.CASE_INSENSITIVE);
|
||||
Matcher matcher = pattern.matcher(ans);
|
||||
if(matcher.matches()){
|
||||
randomWellPopulations = true;
|
||||
}
|
||||
else if (numSections > numWells) {
|
||||
throw new InputMismatchException("Cannot have more sections than wells.");
|
||||
else{
|
||||
randomWellPopulations = false;
|
||||
}
|
||||
int i = 1;
|
||||
concentrations = new Integer[numSections];
|
||||
while(numSections > 0) {
|
||||
System.out.print("Enter number of T-cells per well in section " + i +": ");
|
||||
concentrations[i - 1] = sc.nextInt();
|
||||
i++;
|
||||
numSections--;
|
||||
if(randomWellPopulations) { //if T cell population/well is random
|
||||
numSections = numWells;
|
||||
Integer minPop;
|
||||
Integer maxPop;
|
||||
System.out.print("Please enter minimum number of T cells in a well: ");
|
||||
minPop = sc.nextInt();
|
||||
if(minPop < 1) {
|
||||
throw new InputMismatchException("Minimum well population must be positive");
|
||||
}
|
||||
System.out.println("Please enter maximum number of T cells in a well: ");
|
||||
maxPop = sc.nextInt();
|
||||
if(maxPop < minPop) {
|
||||
throw new InputMismatchException("Max well population must be greater than min well population");
|
||||
}
|
||||
//maximum should be inclusive, so need to add one to max of randomly generated values
|
||||
populations = rand.ints(minPop, maxPop + 1)
|
||||
.limit(numSections)
|
||||
.boxed()
|
||||
.toArray(Integer[]::new);
|
||||
System.out.print("Populations: ");
|
||||
System.out.println(Arrays.toString(populations));
|
||||
}
|
||||
else{ //if T cell population/well is not random
|
||||
System.out.println("\nThe plate can be evenly sectioned to allow different numbers of T cells per well.");
|
||||
System.out.println("How many sections would you like to make (minimum 1)?");
|
||||
numSections = sc.nextInt();
|
||||
if (numSections < 1) {
|
||||
throw new InputMismatchException("Too few sections.");
|
||||
} else if (numSections > numWells) {
|
||||
throw new InputMismatchException("Cannot have more sections than wells.");
|
||||
}
|
||||
int i = 1;
|
||||
populations = new Integer[numSections];
|
||||
while (numSections > 0) {
|
||||
System.out.print("Enter number of T cells per well in section " + i + ": ");
|
||||
populations[i - 1] = sc.nextInt();
|
||||
i++;
|
||||
numSections--;
|
||||
}
|
||||
}
|
||||
System.out.println("\nErrors in amplification can induce a well dropout rate for sequences");
|
||||
System.out.print("Enter well dropout rate (0.0 to 1.0): ");
|
||||
@@ -166,27 +210,40 @@ public class InteractiveInterface {
|
||||
System.out.println(ex);
|
||||
sc.next();
|
||||
}
|
||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||
assert cellFile != null;
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
CellSample cells;
|
||||
if (cellFile.equals(BiGpairSEQ.getCellFilename())){
|
||||
cells = BiGpairSEQ.getCellSampleInMemory();
|
||||
}
|
||||
else {
|
||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
cells = cellReader.getCellSample();
|
||||
BiGpairSEQ.clearCellSampleInMemory();
|
||||
BiGpairSEQ.setCellSampleInMemory(cells);
|
||||
BiGpairSEQ.setCellFilename(cellFile);
|
||||
}
|
||||
assert filename != null;
|
||||
Plate samplePlate;
|
||||
PlateFileWriter writer;
|
||||
if(exponential){
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWellsExponential(cellReader.getFilename(), cellReader.getCells(), lambda);
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
writer.writePlateFile();
|
||||
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||
samplePlate.fillWellsExponential(cellFile, cells.getCells(), lambda);
|
||||
writer = new PlateFileWriter(filename, samplePlate);
|
||||
}
|
||||
else {
|
||||
if (poisson) {
|
||||
stdDev = Math.sqrt(cellReader.getCellCount()); //gaussian with square root of elements approximates poisson
|
||||
stdDev = Math.sqrt(cells.getCellCount()); //gaussian with square root of elements approximates poisson
|
||||
}
|
||||
Plate samplePlate = new Plate(numWells, dropOutRate, concentrations);
|
||||
samplePlate.fillWells(cellReader.getFilename(), cellReader.getCells(), stdDev);
|
||||
assert filename != null;
|
||||
PlateFileWriter writer = new PlateFileWriter(filename, samplePlate);
|
||||
System.out.println("Writing Sample Plate to file");
|
||||
writer.writePlateFile();
|
||||
System.out.println("Sample Plate written to file: " + filename);
|
||||
samplePlate = new Plate(numWells, dropOutRate, populations);
|
||||
samplePlate.fillWells(cellFile, cells.getCells(), stdDev);
|
||||
writer = new PlateFileWriter(filename, samplePlate);
|
||||
}
|
||||
System.out.println("Writing Sample Plate to file");
|
||||
writer.writePlateFile();
|
||||
System.out.println("Sample Plate written to file: " + filename);
|
||||
BiGpairSEQ.setPlateInMemory(samplePlate);
|
||||
BiGpairSEQ.setPlateFilename(filename);
|
||||
}
|
||||
|
||||
//Output serialized binary of GraphWithMapData object
|
||||
@@ -210,14 +267,37 @@ public class InteractiveInterface {
|
||||
System.out.println(ex);
|
||||
sc.next();
|
||||
}
|
||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||
|
||||
assert cellFile != null;
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
System.out.println("Reading Sample Plate file: " + plateFile);
|
||||
CellSample cellSample;
|
||||
//check if cells are already in memory
|
||||
if(cellFile.equals(BiGpairSEQ.getCellFilename())) {
|
||||
cellSample = BiGpairSEQ.getCellSampleInMemory();
|
||||
}
|
||||
else {
|
||||
BiGpairSEQ.clearCellSampleInMemory();
|
||||
System.out.println("Reading Cell Sample file: " + cellFile);
|
||||
CellFileReader cellReader = new CellFileReader(cellFile);
|
||||
cellSample = cellReader.getCellSample();
|
||||
BiGpairSEQ.setCellSampleInMemory(cellSample);
|
||||
BiGpairSEQ.setCellFilename(cellFile);
|
||||
}
|
||||
|
||||
assert plateFile != null;
|
||||
PlateFileReader plateReader = new PlateFileReader(plateFile);
|
||||
Plate plate = new Plate(plateReader.getFilename(), plateReader.getWells());
|
||||
if (cellReader.getCells().size() == 0){
|
||||
Plate plate;
|
||||
//check if plate is already in memory
|
||||
if(plateFile.equals(BiGpairSEQ.getPlateFilename())){
|
||||
plate = BiGpairSEQ.getPlateInMemory();
|
||||
}
|
||||
else {
|
||||
BiGpairSEQ.clearPlateInMemory();
|
||||
System.out.println("Reading Sample Plate file: " + plateFile);
|
||||
PlateFileReader plateReader = new PlateFileReader(plateFile);
|
||||
plate = new Plate(plateReader.getFilename(), plateReader.getWells());
|
||||
BiGpairSEQ.setPlateInMemory(plate);
|
||||
BiGpairSEQ.setPlateFilename(plateFile);
|
||||
}
|
||||
if (cellSample.getCells().size() == 0){
|
||||
System.out.println("No cell sample found.");
|
||||
System.out.println("Returning to main menu.");
|
||||
}
|
||||
@@ -226,13 +306,13 @@ public class InteractiveInterface {
|
||||
System.out.println("Returning to main menu.");
|
||||
}
|
||||
else{
|
||||
List<Integer[]> cells = cellReader.getCells();
|
||||
List<Integer[]> cells = cellSample.getCells();
|
||||
GraphWithMapData data = Simulator.makeGraph(cells, plate, true);
|
||||
assert filename != null;
|
||||
GraphDataObjectWriter dataWriter = new GraphDataObjectWriter(filename, data);
|
||||
dataWriter.writeDataToFile();
|
||||
System.out.println("Graph and Data file written to: " + filename);
|
||||
BiGpairSEQ.setGraph(data);
|
||||
BiGpairSEQ.setGraphInMemory(data);
|
||||
BiGpairSEQ.setGraphFilename(filename);
|
||||
System.out.println("Graph and Data file " + filename + " cached.");
|
||||
}
|
||||
@@ -256,17 +336,28 @@ public class InteractiveInterface {
|
||||
System.out.println("\nWhat is the minimum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
||||
lowThreshold = sc.nextInt();
|
||||
if(lowThreshold < 1){
|
||||
throw new InputMismatchException("Minimum value for low threshold set to 1");
|
||||
lowThreshold = 1;
|
||||
System.out.println("Value for low occupancy overlap threshold must be positive");
|
||||
System.out.println("Value for low occupancy overlap threshold set to 1");
|
||||
}
|
||||
System.out.println("\nWhat is the maximum number of CDR3 alpha/beta overlap wells to attempt matching?");
|
||||
highThreshold = sc.nextInt();
|
||||
System.out.println("\nWhat is the maximum difference in alpha/beta occupancy to attempt matching?");
|
||||
maxOccupancyDiff = sc.nextInt();
|
||||
System.out.println("\nWell overlap percentage = pair overlap / sequence occupancy");
|
||||
System.out.println("What is the minimum well overlap percentage to attempt matching? (0 to 100)");
|
||||
if(highThreshold < lowThreshold) {
|
||||
highThreshold = lowThreshold;
|
||||
System.out.println("Value for high occupancy overlap threshold must be >= low overlap threshold");
|
||||
System.out.println("Value for high occupancy overlap threshold set to " + lowThreshold);
|
||||
}
|
||||
System.out.println("What is the minimum percentage of a sequence's wells in alpha/beta overlap to attempt matching? (0 - 100)");
|
||||
minOverlapPercent = sc.nextInt();
|
||||
if (minOverlapPercent < 0 || minOverlapPercent > 100) {
|
||||
throw new InputMismatchException("Value outside range. Minimum percent set to 0");
|
||||
minOverlapPercent = 0;
System.out.println("Value outside range. Minimum occupancy overlap percentage set to 0");
|
||||
}
|
||||
System.out.println("\nWhat is the maximum difference in alpha/beta occupancy to attempt matching?");
|
||||
maxOccupancyDiff = sc.nextInt();
|
||||
if (maxOccupancyDiff < 0) {
|
||||
maxOccupancyDiff = 0;
|
||||
System.out.println("Maximum allowable difference in alpha/beta occupancy must be nonnegative");
|
||||
System.out.println("Maximum allowable difference in alpha/beta occupancy set to 0");
|
||||
}
|
||||
} catch (InputMismatchException ex) {
|
||||
System.out.println(ex);
|
||||
@@ -275,17 +366,17 @@ public class InteractiveInterface {
|
||||
assert graphFilename != null;
|
||||
//check if this is the same graph we already have in memory.
|
||||
GraphWithMapData data;
|
||||
if(!(graphFilename.equals(BiGpairSEQ.getGraphFilename())) || BiGpairSEQ.getGraph() == null) {
|
||||
BiGpairSEQ.clearGraph();
|
||||
if(!(graphFilename.equals(BiGpairSEQ.getGraphFilename())) || BiGpairSEQ.getGraphInMemory() == null) {
|
||||
BiGpairSEQ.clearGraphInMemory();
|
||||
//read object data from file
|
||||
GraphDataObjectReader dataReader = new GraphDataObjectReader(graphFilename);
|
||||
data = dataReader.getData();
|
||||
//set new graph in memory and new filename
|
||||
BiGpairSEQ.setGraph(data);
|
||||
BiGpairSEQ.setGraphInMemory(data);
|
||||
BiGpairSEQ.setGraphFilename(graphFilename);
|
||||
}
|
||||
else {
|
||||
data = BiGpairSEQ.getGraph();
|
||||
data = BiGpairSEQ.getGraphInMemory();
|
||||
}
|
||||
//simulate matching
|
||||
MatchingResult results = Simulator.matchCDR3s(data, graphFilename, lowThreshold, highThreshold, maxOccupancyDiff,
|
||||
|
||||
@@ -10,7 +10,7 @@ import java.util.*;
|
||||
public class Plate {
|
||||
private String sourceFile;
|
||||
private List<List<Integer[]>> wells;
|
||||
private Random rand = new Random();
|
||||
private final Random rand = BiGpairSEQ.getRand();
|
||||
private int size;
|
||||
private double error;
|
||||
private Integer[] populations;
|
||||
@@ -51,7 +51,6 @@ public class Plate {
|
||||
int section = 0;
|
||||
double m;
|
||||
int n;
|
||||
int test=0;
|
||||
while (section < numSections){
|
||||
for (int i = 0; i < (size / numSections); i++) {
|
||||
List<Integer[]> well = new ArrayList<>();
|
||||
@@ -61,13 +60,6 @@ public class Plate {
|
||||
m = (Math.log10((1 - rand.nextDouble()))/(-lambda)) * Math.sqrt(cells.size());
|
||||
} while (m >= cells.size() || m < 0);
|
||||
n = (int) Math.floor(m);
|
||||
//n = Equations.getRandomNumber(0, cells.size());
|
||||
// was testing generating the cell sample file with exponential dist, then sampling flat here
|
||||
//that would be more realistic
|
||||
//But would mess up other things in the simulation with how I've coded it.
|
||||
if(n > test){
|
||||
test = n;
|
||||
}
|
||||
Integer[] cellToAdd = cells.get(n).clone();
|
||||
for(int k = 0; k < cellToAdd.length; k++){
|
||||
if(Math.abs(rand.nextDouble()) < error){//error applied to each sequence
|
||||
@@ -80,7 +72,6 @@ public class Plate {
|
||||
}
|
||||
section++;
|
||||
}
|
||||
System.out.println("Highest index: " +test);
|
||||
}
|
||||
|
||||
public void fillWells(String sourceFileName, List<Integer[]> cells, double stdDev) {
|
||||
|
||||
@@ -16,7 +16,7 @@ public class PlateFileWriter {
|
||||
private Double error;
|
||||
private String filename;
|
||||
private String sourceFileName;
|
||||
private Integer[] concentrations;
|
||||
private Integer[] populations;
|
||||
private boolean isExponential = false;
|
||||
|
||||
public PlateFileWriter(String filename, Plate plate) {
|
||||
@@ -35,8 +35,8 @@ public class PlateFileWriter {
|
||||
}
|
||||
this.error = plate.getError();
|
||||
this.wells = plate.getWells();
|
||||
this.concentrations = plate.getPopulations();
|
||||
Arrays.sort(concentrations);
|
||||
this.populations = plate.getPopulations();
|
||||
Arrays.sort(populations);
|
||||
}
|
||||
|
||||
public void writePlateFile(){
|
||||
@@ -73,14 +73,12 @@ public class PlateFileWriter {
|
||||
// rows.add(tmp);
|
||||
// }
|
||||
|
||||
//get list of well populations
|
||||
List<Integer> wellPopulations = Arrays.asList(concentrations);
|
||||
//make string out of populations list
|
||||
//make string out of populations array
|
||||
StringBuilder populationsStringBuilder = new StringBuilder();
|
||||
populationsStringBuilder.append(wellPopulations.remove(0).toString());
|
||||
for(Integer i: wellPopulations){
|
||||
populationsStringBuilder.append(populations[0].toString());
|
||||
for(int i = 1; i < populations.length; i++){
|
||||
populationsStringBuilder.append(", ");
|
||||
populationsStringBuilder.append(i.toString());
|
||||
populationsStringBuilder.append(populations[i].toString());
|
||||
}
|
||||
String wellPopulationsString = populationsStringBuilder.toString();
|
||||
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
import org.jgrapht.Graph;
|
||||
import org.jgrapht.alg.interfaces.MatchingAlgorithm;
|
||||
import org.jgrapht.alg.matching.MaximumWeightBipartiteMatching;
|
||||
import org.jgrapht.generate.SimpleWeightedBipartiteGraphMatrixGenerator;
|
||||
@@ -14,6 +13,8 @@ import java.time.Duration;
|
||||
import java.util.*;
|
||||
import java.util.stream.IntStream;
|
||||
|
||||
import static java.lang.Float.*;
|
||||
|
||||
//NOTE: "sequence" in method and variable names refers to a peptide sequence from a simulated T cell
|
||||
public class Simulator {
|
||||
private static final int cdr3AlphaIndex = 0;
|
||||
@@ -247,10 +248,16 @@ public class Simulator {
|
||||
BigDecimal attemptRateTrunc = new BigDecimal(attemptRate, mc);
|
||||
//rate of pairing error
|
||||
double pairingErrorRate = (double) falseCount / (trueCount + falseCount);
|
||||
BigDecimal pairingErrorRateTrunc = new BigDecimal(pairingErrorRate, mc);
|
||||
//get list of well concentrations
|
||||
Integer[] wellPopulations = data.getWellConcentrations();
|
||||
//make string out of concentrations list
|
||||
BigDecimal pairingErrorRateTrunc;
|
||||
//NaN never compares equal with ==, so test with the Double helper methods
if(Double.isNaN(pairingErrorRate) || Double.isInfinite(pairingErrorRate)) {
|
||||
pairingErrorRateTrunc = new BigDecimal(-1, mc);
|
||||
}
|
||||
else{
|
||||
pairingErrorRateTrunc = new BigDecimal(pairingErrorRate, mc);
|
||||
}
|
||||
//get list of well populations
|
||||
Integer[] wellPopulations = data.getWellPopulations();
|
||||
//make string out of populations list
|
||||
StringBuilder populationsStringBuilder = new StringBuilder();
|
||||
populationsStringBuilder.append(wellPopulations[0].toString());
|
||||
for(int i = 1; i < wellPopulations.length; i++){
|
||||
@@ -270,8 +277,8 @@ public class Simulator {
|
||||
metadata.put("total betas found", betaCount.toString());
|
||||
metadata.put("high overlap threshold", highThreshold.toString());
|
||||
metadata.put("low overlap threshold", lowThreshold.toString());
|
||||
metadata.put("maximum occupancy difference", maxOccupancyDifference.toString());
|
||||
metadata.put("minimum overlap percent", minOverlapPercent.toString());
|
||||
metadata.put("maximum occupancy difference", maxOccupancyDifference.toString());
|
||||
metadata.put("pairing attempt rate", attemptRateTrunc.toString());
|
||||
metadata.put("correct pairing count", Integer.toString(trueCount));
|
||||
metadata.put("incorrect pairing count", Integer.toString(falseCount));
|
||||
|
||||