Update readme to reflect new default caching behavior.
21
readme.md
@@ -78,16 +78,17 @@ These files are often generated in sequence. When entering filenames, it is not
 (.csv or .ser). When reading or writing files, the program will automatically add the correct extension to any filename without one.
 
 To save file I/O time, the most recent instance of each of these four
-files either generated or read from disk can be cached in program memory. This is especially important for Graph/Data files,
+files either generated or read from disk can be cached in program memory. This could be important for Graph/Data files,
 which can be several gigabytes in size. Since some simulations may require running multiple,
-differently-configured BiGpairSEQ matchings on the same graph, keeping the most recent graph cached can reduce execution time
+differently-configured BiGpairSEQ matchings on the same graph, keeping the most recent graph cached may reduce execution time.
+(The manipulation necessary to re-use a graph incurs its own performance overhead, though, which may scale with graph
+size faster than file I/O does. If so, caching is best for smaller graphs.)
 
-Subsequent uses of the same data file won't need to be read in again until another file of that type is used or generated,
+When caching is active, subsequent uses of the same data file won't need to be read in again until another file of that type is used or generated,
 or caching is turned off for that file type. The program checks whether it needs to update its cached data by comparing
 filenames as entered by the user. On encountering a new filename, the program flushes its cache and reads in the new file.
 
-The program's caching behavior can be controlled in the Options menu. By default, caching for cell sample and
-sample plate files is OFF, and caching for graph/data files is OFF.
+The program's caching behavior can be controlled in the Options menu. By default, all caching is OFF.
 
 #### Cell Sample Files
 Cell Sample files consist of any number of distinct "T cells." Every cell contains
@@ -252,7 +253,8 @@ slightly less time than the simulation itself. Real elapsed time from start to f
 * ~~Try invoking GC at end of workloads to reduce paging to disk~~ DONE
 * Hold graph data in memory until another graph is read-in? ~~ABANDONED~~ ~~UNABANDONED~~ DONE
 * ~~*No, this won't work, because BiGpairSEQ simulations alter the underlying graph based on filtering constraints. Changes would cascade with multiple experiments.*~~
-* Might have figured out a way to do it, by taking edges out and then putting them back into the graph. This may actually be possible. If so, awesome.
+* Might have figured out a way to do it, by taking edges out and then putting them back into the graph. This may actually be possible.
+* It is possible, though the modifications to the graph incur their own performance penalties. Need testing to see which option is best.
 * See if there's a reasonable way to reformat Sample Plate files so that wells are columns instead of rows.
 * ~~Problem is variable number of cells in a well~~
 * ~~Apache Commons CSV library writes entries a row at a time~~
@@ -266,9 +268,10 @@ slightly less time than the simulation itself. Real elapsed time from start to f
 * Re-implement CDR1 matching method
 * Implement Duan and Su's maximum weight matching algorithm
 * Add controllable algorithm-type parameter?
-* Test whether pairing heap (currently used) or Fibonacci heap is more efficient for priority queue in current matching algorithm
-* in theory Fibonacci heap should be more efficient, but complexity overhead may eliminate theoretical advantage
-* Add controllable heap-type parameter?
+* ~~Test whether pairing heap (currently used) or Fibonacci heap is more efficient for priority queue in current matching algorithm~~ DONE
+* ~~in theory Fibonacci heap should be more efficient, but complexity overhead may eliminate theoretical advantage~~
+* ~~Add controllable heap-type parameter?~~
+* Parameter implemented. For large graphs, Fibonacci heap wins. Now the new default.
 
 
 
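For readers of this change, the caching rule the readme describes (keep only the most recent file of each type, compare filenames as entered, flush on a new filename, OFF by default) could be sketched roughly as below. This is only an illustration under assumptions, not the project's actual code: the class `SingleFileCache`, its methods, and the `reader` function are all hypothetical names, and one such cache per file type is assumed.

```java
import java.util.function.Function;

/** Hypothetical one-slot cache: remembers only the most recent file of one type. */
class SingleFileCache<T> {
    private final Function<String, T> reader; // stand-in for the real file reader (assumption)
    private boolean enabled = false;          // caching is OFF by default, as in the readme
    private String cachedName = null;         // filename exactly as the user entered it
    private T cachedData = null;

    SingleFileCache(Function<String, T> reader) {
        this.reader = reader;
    }

    /** Turning caching off also drops whatever is currently held. */
    void setEnabled(boolean enabled) {
        this.enabled = enabled;
        if (!enabled) {
            flush();
        }
    }

    /** Returns the cached data on a filename match; otherwise flushes and reads the file. */
    T get(String filename) {
        if (enabled && filename.equals(cachedName)) {
            return cachedData;
        }
        flush();
        T data = reader.apply(filename);
        store(filename, data);
        return data;
    }

    /** Call after generating a new file so that it replaces the cached one. */
    void store(String filename, T data) {
        if (enabled) {
            cachedName = filename;
            cachedData = data;
        }
    }

    private void flush() {
        cachedName = null;
        cachedData = null;
    }

    public static void main(String[] args) {
        int[] reads = {0}; // counts simulated disk reads
        SingleFileCache<String> graphs = new SingleFileCache<>(name -> {
            reads[0]++;
            return "contents of " + name;
        });
        graphs.setEnabled(true);
        graphs.get("graphA");
        graphs.get("graphA"); // same filename: served from memory
        graphs.get("graphB"); // new filename: flush, then read again
        System.out.println("disk reads: " + reads[0]); // prints "disk reads: 2"
    }
}
```

Because the comparison is on the filename string alone, a file that changes on disk under the same name would not be re-read in this sketch; whether the real program behaves the same way is not stated in the readme.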