scDesign3Py.scDesign3.fit_marginal

scDesign3.fit_marginal(mu_formula: str, sigma_formula: str, family_use: Literal['binomial', 'poisson', 'nb', 'zip', 'zinb', 'gaussian'] | list[str], usebam: bool, data: rpy2.robjects.vectors.ListVector | rpy2.rlike.container.OrdDict | dict = 'default', predictor: str = 'gene', n_cores: int = 'default', parallelization: Literal['mcmapply', 'bpmapply', 'pbmcmapply'] = 'default', bpparam: rpy2.robjects.methods.RS4 | None = 'default', trace: bool = False, return_py: bool = 'default') -> rpy2.robjects.vectors.ListVector

Fit the marginal models

@fit_marginal fits the per-feature regression models.

Details:

The function takes the result of @construct_data as input and fits a regression model for each feature according to the user’s specification.

Arguments:

mu_formula: str

A string of the mu parameter formula

sigma_formula: str

A string of the sigma parameter formula

family_use: str or list[str]

A string or a list of strings specifying the marginal distribution. Each value must be one of ‘binomial’, ‘poisson’, ‘nb’, ‘zip’, ‘zinb’ or ‘gaussian’, which represent ‘binomial distribution’, ‘poisson distribution’, ‘negative binomial distribution’, ‘zero-inflated poisson distribution’, ‘zero-inflated negative binomial distribution’ and ‘gaussian distribution’ respectively.

usebam: bool

If True, call the R function mgcv::bam to speed up the computation.

data: rpy2.robjects.vectors.ListVector or rpy2.rlike.container.OrdDict or dict (default: ‘default’)

The result of @construct_data. Default is ‘default’, which uses the class property @construct_data_res.

predictor: str (default: ‘gene’)

A string of the predictor for the gam/gamlss model. This is essentially just a name.

n_cores: int (default: ‘default’)

The number of cores to use. Default is ‘default’, which uses the setting chosen at initialization.

parallelization: str (default: ‘default’)

The specific parallelization function to use. If ‘bpmapply’, first call the method @get_bpparam to obtain the @bpparam object. Default is ‘default’, which uses the setting chosen at initialization.

bpparam: rpy2.robjects.methods.RS4 or None (default: ‘default’)

If @parallelization is ‘bpmapply’, first call the function @get_bpparam to get the robject and pass it here. If @parallelization is ‘mcmapply’ or ‘pbmcmapply’, this should be None. Default is ‘default’, which uses the setting chosen at initialization.

trace: bool (default: False)

If True, the warning/error log and runtime for gam/gamlss will be returned.

return_py: bool (default: ‘default’)

If True, the function returns a result that is easy to manipulate in Python. Default is ‘default’, which uses the setting chosen at initialization.
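The constraints described in the arguments above can be checked before calling into R. The following is a minimal sketch, not part of scDesign3Py: `build_fit_marginal_kwargs` is a hypothetical helper, and the formula strings are illustrative values that assume an mgcv-style smooth term and a covariate named `pseudotime` in the constructed data.

```python
from typing import Any, Sequence, Union

# Values listed in the documented signature.
ALLOWED_FAMILIES = ("binomial", "poisson", "nb", "zip", "zinb", "gaussian")
ALLOWED_PARALLELIZATION = ("mcmapply", "bpmapply", "pbmcmapply")


def build_fit_marginal_kwargs(
    mu_formula: str,
    sigma_formula: str,
    family_use: Union[str, Sequence[str]],
    usebam: bool,
    parallelization: str = "default",
    bpparam: Any = "default",
    trace: bool = False,
) -> dict:
    """Hypothetical helper: validate documented constraints, return kwargs."""
    # family_use may be a single string or a list of strings.
    families = [family_use] if isinstance(family_use, str) else list(family_use)
    unknown = [f for f in families if f not in ALLOWED_FAMILIES]
    if unknown:
        raise ValueError(f"unsupported family_use value(s): {unknown}")

    bpparam_is_default = isinstance(bpparam, str) and bpparam == "default"
    if parallelization != "default":
        if parallelization not in ALLOWED_PARALLELIZATION:
            raise ValueError(f"unsupported parallelization: {parallelization!r}")
        if parallelization == "bpmapply" and bpparam is None:
            # The docs say to obtain bpparam from get_bpparam first.
            raise ValueError("'bpmapply' needs a bpparam object, not None")
        if parallelization != "bpmapply" and not (bpparam is None or bpparam_is_default):
            raise ValueError(f"bpparam should be None with {parallelization!r}")

    return {
        "mu_formula": mu_formula,
        "sigma_formula": sigma_formula,
        "family_use": family_use if isinstance(family_use, str) else families,
        "usebam": usebam,
        "parallelization": parallelization,
        "bpparam": bpparam,
        "trace": trace,
    }


# Illustrative values only; adapt the formulas to the covariates in your data.
kwargs = build_fit_marginal_kwargs(
    mu_formula="s(pseudotime, bs = 'cr', k = 10)",
    sigma_formula="1",
    family_use="nb",
    usebam=False,
)

# With an initialized scDesign3 instance `sim` on which construct_data has
# already been run, the call would then look like:
# marginals = sim.fit_marginal(**kwargs)
```

Arguments left at ‘default’ (here @data, @n_cores and @return_py) are simply omitted, so the settings chosen at initialization apply.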

Output:

A dict-like object. Each key is a gene name, and the number of keys equals the total number of genes.

The entry for each gene has the following keys.

fit: rpy2.rlike.container.OrdDict

The fitted regression models.

removed_cell: numpy.ndarray

The cells (observations) removed when fitting the marginal regression model.

warning: rpy2.rlike.container.OrdDict

Returned only if @trace = True. Holds the warning/error log produced when fitting the marginal models.

time: numpy.ndarray

Returned only if @trace = True. The runtime of each marginal model: the first value is the runtime of the gam model and the second is the runtime of the gamlss model.
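The output layout described above can be traversed like an ordinary mapping. The sketch below assumes the result was returned in its Python-friendly form (@return_py = True) and supports `.items()` and `in`. `summarize_marginals` is a hypothetical helper, and the mock result uses made-up gene names, cell labels and timings, with plain lists standing in for numpy arrays.

```python
def summarize_marginals(marginals):
    """Hypothetical helper: collect per-gene bookkeeping from a fit_marginal result."""
    summary = {}
    for gene, res in marginals.items():
        entry = {"n_removed": len(res["removed_cell"])}
        # 'warning' and 'time' are only present when trace=True was used.
        entry["has_warning_log"] = "warning" in res
        if "time" in res:
            gam_time, gamlss_time = res["time"]  # documented order: gam, then gamlss
            entry["gam_time"] = float(gam_time)
            entry["gamlss_time"] = float(gamlss_time)
        summary[gene] = entry
    return summary


# Mock result imitating the documented structure (all values are invented).
mock_result = {
    "gene1": {"fit": object(), "removed_cell": [], "warning": {}, "time": [0.12, 0.0]},
    "gene2": {"fit": object(), "removed_cell": ["cell7"], "warning": {}, "time": [0.10, 0.35]},
}

for gene, info in summarize_marginals(mock_result).items():
    print(gene, info)
```

Checking for the ‘time’ and ‘warning’ keys before use keeps the same code working for results produced with @trace = False.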