scDesign3Py.scDesign3.fit_marginal
- scDesign3.fit_marginal(mu_formula: str, sigma_formula: str, family_use: Literal['binomial', 'poisson', 'nb', 'zip', 'zinb', 'gaussian'] | list[str], usebam: bool, data: rpy2.robjects.vectors.ListVector | rpy2.rlike.container.OrdDict | dict = 'default', predictor: str = 'gene', n_cores: int = 'default', parallelization: Literal['mcmapply', 'bpmapply', 'pbmcmapply'] = 'default', bpparam: rpy2.robjects.methods.RS4 | None = 'default', trace: bool = False, return_py: bool = 'default') -> rpy2.robjects.vectors.ListVector
Fit the marginal models
@fit_marginal fits the per-feature regression models.
Details:
The function takes the result of @construct_data as input and fits a regression model for each feature according to the user’s specification.
Arguments:
- mu_formula: str
A string of the mu parameter formula
- sigma_formula: str
A string of the sigma parameter formula
- family_use: str or list[str]
A string or a list of strings specifying the marginal distribution. Each value must be one of ‘binomial’, ‘poisson’, ‘nb’, ‘zip’, ‘zinb’ or ‘gaussian’, which represent the binomial, Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial and Gaussian distributions respectively.
- usebam: bool
If True, call the R function mgcv::bam to speed up the calculation.
- data: rpy2.robjects.vectors.ListVector or rpy2.rlike.container.OrdDict or dict (default: ‘default’)
The result of @construct_data. Default is ‘default’, which uses the class property @construct_data_res.
- predictor: str (default: ‘gene’)
A string naming the predictor for the gam/gamlss model. This is essentially just a name.
- n_cores: int (default: ‘default’)
The number of cores to use. Default is ‘default’, which uses the setting given at initialization.
- parallelization: str (default: ‘default’)
The specific parallelization function to use. If ‘bpmapply’, first call the method @get_bpparam. Default is ‘default’, which uses the setting given at initialization.
- bpparam: rpy2.robjects.methods.RS4 (default: ‘default’)
If @parallelization is ‘bpmapply’, first call the method @get_bpparam to get the robject. If @parallelization is ‘mcmapply’ or ‘pbmcmapply’, it should be None. Default is ‘default’, which uses the setting given at initialization.
- trace: bool (default: False)
If True, the warning/error log and runtime for gam/gamlss will be returned.
- return_py: bool (default: ‘default’)
If True, the function returns a result that is easy to manipulate in Python. Default is ‘default’, which uses the setting given at initialization.
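The argument rules above can be checked before the call is made. The following is a minimal sketch, not part of scDesign3Py: `build_fit_marginal_kwargs` and `ALLOWED_FAMILIES` are hypothetical names introduced here, and the formula `"1"` (an intercept-only model) is only an illustrative choice.

```python
from typing import Any, Dict, List, Optional, Union

# Family names listed in the family_use description above.
ALLOWED_FAMILIES = ("binomial", "poisson", "nb", "zip", "zinb", "gaussian")


def build_fit_marginal_kwargs(
    mu_formula: str,
    sigma_formula: str,
    family_use: Union[str, List[str]],
    usebam: bool = False,
    parallelization: Optional[str] = None,
    bpparam: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and collect them in a dict."""
    # family_use may be a single string or a list of strings.
    families = [family_use] if isinstance(family_use, str) else list(family_use)
    unknown = [f for f in families if f not in ALLOWED_FAMILIES]
    if unknown:
        raise ValueError(f"unsupported family_use value(s): {unknown}")

    kwargs: Dict[str, Any] = {
        "mu_formula": mu_formula,
        "sigma_formula": sigma_formula,
        "family_use": family_use,
        "usebam": usebam,
    }
    # Omit parallelization/bpparam unless set explicitly, so that the
    # 'default' behaviour (settings from initialization) still applies.
    if parallelization is not None:
        if parallelization in ("mcmapply", "pbmcmapply") and bpparam is not None:
            raise ValueError(
                "bpparam should be None unless parallelization is 'bpmapply'"
            )
        kwargs["parallelization"] = parallelization
        kwargs["bpparam"] = bpparam
    return kwargs


kwargs = build_fit_marginal_kwargs(
    mu_formula="1", sigma_formula="1", family_use="nb"
)
# Assumption: `sim` is a scDesign3 instance on which construct_data has
# already been run, so `data` can stay at its default.
# marginal = sim.fit_marginal(**kwargs)
```

Leaving `data`, `n_cores`, `parallelization`, `bpparam` and `return_py` out of the call keeps them at ‘default’, so the values chosen at initialization are reused.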
Output:
A dict-like object. Each key is a gene name, and the number of keys equals the total number of genes.
The entry for each gene has the following keys.
- fit: rpy2.rlike.container.OrdDict
The fitted regression models.
- removed_cell: numpy.ndarray
The cells (observations) removed when fitting the marginal regression model.
- warning: rpy2.rlike.container.OrdDict
Returned only if @trace = True. Holds the warning/error log produced when fitting the marginal models.
- time: numpy.ndarray
Returned only if @trace = True. The runtime for each marginal model. The first value is the runtime of the gam model and the second is the runtime of the gamlss model.
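As an illustration of the output layout described above, the sketch below walks a result and collects a per-gene summary. `summarize_marginal` is a hypothetical helper, and `mock_result` is a hand-written plain dict standing in for a real return value; it assumes the result supports the usual dict operations (`items()`, `in`, indexing) and that `removed_cell` has a length.

```python
from typing import Any, Dict, Mapping


def summarize_marginal(
    result: Mapping[str, Mapping[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: one summary row per gene."""
    summary: Dict[str, Dict[str, Any]] = {}
    for gene, entry in result.items():
        row: Dict[str, Any] = {
            # Number of cells dropped while fitting this gene's model.
            "n_removed": len(entry["removed_cell"]),
            # 'warning' is present only when trace=True was used.
            "has_log": "warning" in entry,
        }
        # 'time' is also present only when trace=True was used:
        # first value is the gam runtime, second the gamlss runtime.
        if "time" in entry:
            row["gam_time"] = float(entry["time"][0])
            row["gamlss_time"] = float(entry["time"][1])
        summary[gene] = row
    return summary


# Mock data in the documented shape (values are made up for illustration).
mock_result = {
    "Gene1": {"fit": None, "removed_cell": [], "time": [0.12, 0.0]},
    "Gene2": {"fit": None, "removed_cell": ["Cell7", "Cell9"], "time": [0.30, 1.5]},
}
summary = summarize_marginal(mock_result)
```

With a real result, the same loop could be used to spot genes whose fit dropped many cells or whose gamlss step dominated the runtime.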