Revert attempt to switch plate output format. It worked, but introduced a bug in graph filtering I don't want to chase down.
readme.md (28 lines changed)
@@ -94,7 +94,7 @@ Options when making a Cell Sample file:
 Files are in CSV format. Rows are distinct T cells, columns are sequences within the cells.
 Comments are preceded by `#`

-Structure example:
+Structure:

 ---
 # Sample contains 1 unique CDR1 for every 4 unique CDR3s.
@@ -136,20 +136,20 @@ Every column represents an individual cell, containing four sequences, represent
 Notice that the Alpha CDR1 is missing in the cell above, due to sequence dropout.
 Dropouts are represented by replacing sequences with the value `-1`. Comments are preceded by `#`

-Structure Example:
+Structure:

 ---
 ```
-# Cell source file name: 4MilCells.csv
-# Plate size: 96
-# Error rate: 0.1
-# Concentrations: 10000 5000 500
-# Lambda: 0.6
+# Cell source file name:
+# Each row represents one well on the plate
+# Plate size:
+# Concentrations:
+# Lambda:
 ```
-| well 1 | well 2 | well 3| ... |
+| Well 1, cell 1 | Well 1, cell 2 | Well 1, cell 3| ... |
 |---|---|---|---|
-| [105383, 786528, 959247, 925928] | [525902, 791533, -1, 866282] | [409236, 132303, 804465, 942261]| ... |
-| [249930, 301502, 970003, 881099] | [523787, 552952, 997194, 970507]| [425363, 417411, 845399, -1]| ... |
+| **Well 2, cell 1** | **Well 2, cell 2** | **Well 2, cell 3**| ... |
+| **Well 3, cell 1** | **Well 3, cell 2** | **Well 3, cell 3**| ... |
 | ... | ... | ... | ... |

 ---
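
To make the restored row-per-well layout concrete, here is a minimal reading sketch in Python. It is not the project's own reader, and it rests on several assumptions: that each cell is stored as a quoted CSV entry in the bracketed form used in the example above, that `#` comment lines come first, and that a missing sequence is written as `-1`. The names `read_plate`, `parse_cell`, and `DROPOUT` are hypothetical.

```python
import csv

DROPOUT = -1  # the README states that dropouts are written as the value -1


def parse_cell(entry):
    """Parse '[525902, 791533, -1, 866282]' into a list, using None for dropouts."""
    values = [int(v) for v in entry.strip().strip("[]").split(",")]
    return [None if v == DROPOUT else v for v in values]


def read_plate(text):
    """Return (comments, wells). Each CSV row is one well; each entry is one cell."""
    comments, data_lines = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            comments.append(line[1:].strip())
        elif line.strip():
            data_lines.append(line)
    wells = [[parse_cell(e) for e in row if e.strip()]
             for row in csv.reader(data_lines)]
    return comments, wells


# Hypothetical file contents; the quoting of each cell is an assumption.
sample = '''# Each row represents one well on the plate
# Plate size: 96
"[105383, 786528, 959247, 925928]","[525902, 791533, -1, 866282]"
"[249930, 301502, 970003, 881099]"
'''

comments, wells = read_plate(sample)
print(len(wells), wells[0][1])
```

Because each well is its own row, wells holding different numbers of cells need no padding, which is the property the column-per-well layout gave up.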
@@ -222,10 +222,9 @@ using the (2021 corrected) formula from the original pairSEQ paper. (Howie, et a

 ## TODO

-* ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE
+* Try invoking GC at end of workloads to reduce paging to disk
 * ~~Hold graph data in memory until another graph is read-in?~~
 * No, this won't work, because BiGpairSEQ simulations alter the underlying graph based on filtering constraints. Changes would cascade with multiple experiments.
-* ~~See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows~~ DONE
 * Enable GraphML output in addition to serialized object binaries, for data portability
 * Custom vertex type with attribute for sequence occupancy?
 * Re-implement CDR1 matching method
@@ -238,7 +237,10 @@ using the (2021 corrected) formula from the original pairSEQ paper. (Howie, et a
 * Implement sample plates with random numbers of T cells per well
 * Possible BiGpairSEQ advantage over pairSEQ: BiGpairSEQ is resilient to variations in well populations; pairSEQ is not.
 * preliminary data suggests that BiGpairSEQ behaves roughly as though the whole plate had whatever the *average* well concentration is, but that's still speculative.

+* See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows
+* Problem is variable number of cells in a well
+* Apache Commons CSV library writes entries a row at a time
+* Can possibly sort the wells by length first, then construct entries
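
The "sort the wells by length first" idea in the last item can be sketched as follows. This is an illustration in Python, not the project's Java code and not a use of Apache Commons CSV; the function name `wells_to_column_rows` and the sample data are hypothetical. Once wells are ordered longest-first, the wells that still have an i-th cell always form a prefix of that order, so each output row can be written in one pass with no gaps in the middle.

```python
def wells_to_column_rows(wells):
    """Order wells longest-first, then build row i from the i-th cell of each well.

    Returns (ordered_wells, rows). Each well becomes a column; rows get shorter
    as the smaller wells run out of cells.
    """
    ordered = sorted(wells, key=len, reverse=True)
    depth = len(ordered[0]) if ordered else 0
    rows = []
    for i in range(depth):
        # Longest-first order means the wells with an i-th cell are a prefix
        # of `ordered`, so the row is contiguous and needs no padding.
        rows.append([well[i] for well in ordered if len(well) > i])
    return ordered, rows


# Three hypothetical wells with 1, 3, and 2 cells.
wells = [["a1"], ["b1", "b2", "b3"], ["c1", "c2"]]
ordered, rows = wells_to_column_rows(wells)
for row in rows:
    print(row)
```

Sorting changes the order of the wells, so a real file would presumably need to record which column belongs to which well, for example in a header row. That bookkeeping is left out of this sketch.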

 ## CITATIONS
 * Howie, B., Sherwood, A. M., et al. ["High-throughput pairing of T cell receptor alpha and beta sequences."](https://pubmed.ncbi.nlm.nih.gov/26290413/) Sci. Transl. Med. 7, 301ra131 (2015)