Add plot_bcv() #1007
LuisHeinzlmeier
left a comment
Ready for a review!
self.fit = fit

@_doc_params(common_plot_args=doc_common_plot_args)
def plot_bcv(  # pragma: no cover # noqa: D417
This is only designed for edgeR right now, right? But it should also work with statsmodels or pydeseq2. Ideally, I'd like to phase out edgeR support completely at some point in the future.
As I understand it, plotBCV() in edgeR is equivalent to plotDispEsts() in DESeq2 because they use the same approach: plotting mean counts against dispersions/variation. The plots differ because edgeR and DESeq2 model differential gene expression slightly differently. But they want to show the same concept.
So I would say it doesn't make sense to add the plotBCV() function from edgeR to DESeq2. I don't think it would even be possible to integrate the same plot into DESeq2.
If you plan to remove edgeR in the long run, I would also add only plotMA() from #969. Those plots also show essentially the same thing, but there is one version for DESeq2 and one for edgeR.
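To make the shared idea concrete (mean expression plotted against dispersion), here is a minimal sketch. It is not what edgeR or DESeq2 actually do: both use more elaborate likelihood-based, shrunken estimates, whereas this swaps in a simple method-of-moments estimate from the negative binomial relation var = mean + dispersion * mean^2. The function name `bcv_from_counts` is hypothetical, and the raw mean count is used on the x-axis instead of edgeR's average log CPM. The only fact taken from edgeR is that the BCV is the square root of the dispersion.

```python
import numpy as np


def bcv_from_counts(counts):
    """Rough per-gene BCV for a samples x genes count matrix (hypothetical helper).

    Assumes all samples belong to one homogeneous group and uses a
    method-of-moments dispersion estimate, not edgeR's estimator.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)

    # Negative binomial: var = mean + dispersion * mean**2
    with np.errstate(divide="ignore", invalid="ignore"):
        dispersion = (var - mean) / mean**2

    # Genes with zero mean give NaN/inf; underdispersed genes give negatives.
    dispersion = np.where(np.isfinite(dispersion), dispersion, 0.0)
    dispersion = np.clip(dispersion, 0.0, None)

    # BCV is the square root of the dispersion.
    return mean, np.sqrt(dispersion)
```

A scatter of `np.log2(mean + 1)` against the returned BCV values gives a plot in the same spirit as plotBCV() or plotDispEsts(), without depending on either backend.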
Got it. I'd be happy if we only had one version that worked for both.
>>> adata.layers["counts"] = adata.X.copy()
>>> pdata = dc.pp.pseudobulk(adata, sample_col="Patient", groups_col="Cluster", layer="counts", mode="sum")
>>> dc.pp.filter_samples(pdata, inplace=True)
>>> edgr = pt.tl.EdgeR(pdata, design="~Efficacy+Treatment")
See above :)
ro.globalenv["fit"] = fit
self.fit = fit
self.dge = dge
plotBCV() can be run before fitting, but it needs the steps that come before 'self.dge = dge' (e.g. estimateDisp()). Storing the result in self.dge prevents those steps from running multiple times, and it also makes it possible to run plotBCV() without fitting the whole model first. In short, the intermediate result is saved so that the plot runs faster and no code is duplicated.
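The caching idea described above can be sketched in plain Python. This is an illustration only: the class `EdgeRLike`, the `prepare_calls` counter, and the dictionary standing in for the DGE object are all hypothetical, and no R code is called. The point is that both `fit()` and `plot_bcv()` go through one preparation step that runs at most once.

```python
class EdgeRLike:
    """Hypothetical stand-in for a model class that caches its prepared DGE object."""

    def __init__(self, counts):
        self.counts = counts
        self.dge = None          # cached result of the expensive preparation
        self.fit_result = None
        self.prepare_calls = 0   # only here to make the caching observable

    def _prepare_dge(self):
        # Run the expensive steps (normalization factors, dispersion
        # estimation in the real code) only if they have not run yet.
        if self.dge is None:
            self.prepare_calls += 1
            self.dge = {"counts": self.counts, "prepared": True}
        return self.dge

    def fit(self):
        dge = self._prepare_dge()
        self.fit_result = {"dge": dge}  # placeholder for the fitted model
        return self.fit_result

    def plot_bcv(self):
        # Works before fit(): it only needs the prepared DGE object.
        dge = self._prepare_dge()
        return dge  # a real implementation would draw the plot here
```

Calling `plot_bcv()` first and `fit()` afterwards, or the other way round, triggers the preparation exactly once.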
- plot_bcv()
- _prepare_dge() to be able to run plot_bcv() before .fit() without calculating normalization factors and estimating dispersions twice
- _ensure_deps() to prevent repetitive imports and code
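As an illustration of what a dependency helper of this kind might look like, here is a minimal sketch. The actual `_ensure_deps()` in this PR is not shown in the thread, so the function name `ensure_deps`, its signature, and the error message are assumptions; only `importlib.import_module` from the standard library is used.

```python
import importlib


def ensure_deps(*module_names):
    """Import the given modules in one place (hypothetical helper).

    Returns a dict mapping module name to module, or raises a single
    ImportError that lists every module that could not be imported.
    """
    modules = {}
    missing = []
    for name in module_names:
        try:
            modules[name] = importlib.import_module(name)
        except ImportError:
            missing.append(name)
    if missing:
        raise ImportError(f"Missing optional dependencies: {', '.join(missing)}")
    return modules
```

Methods that need the optional packages can then call the helper once at the top instead of repeating the same try/except import blocks.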