Sequencing technologies, in particular RNASeq, have become critical tools in the design, build, test, learn cycle for synthetic biology.
More...Sequencing technologies, in particular RNASeq, have become critical tools in the design, build, test, learn cycle for synthetic biology. They provide a better understanding of synthetic designs and they help identify ways to improve and select designs. While this data is beneficial to design, its collection and analysis is a complex, multi-step process that has implications both on discovery and reproducibility of experiments. Additionally, tool parameters, experimental metadata, and normalization of data and standardization of file formats present challenges that are computationally intensive. This calls for high-throughput pipelines expressly designed to handle the combinatorial and longitudinal nature of synthetic biology. In this paper, we present a pipeline to maximize analytical reproducibility of RNASeq for synthetic biologists. We also explore the impact of reproducibility on the validation of machine learning models.
Overall design: 1,344 samples were collected in quadruplicate, and there are non-induced controls for each organism.
Less...