scDesign3Py.scDesign3.fit_marginal

scDesign3.fit_marginal(mu_formula: str, sigma_formula: str, family_use: Literal['binomial', 'poisson', 'nb', 'zip', 'zinb', 'gaussian'] | list[str], usebam: bool, data: rpy2.robjects.vectors.ListVector | rpy2.rlike.container.OrdDict | dict = 'default', predictor: str = 'gene', n_cores: int = 'default', parallelization: Literal['mcmapply', 'bpmapply', 'pbmcmapply'] = 'default', bpparam: rpy2.robjects.methods.RS4 | None = 'default', trace: bool = False, return_py: bool = 'default') → rpy2.robjects.vectors.ListVector

Fit the marginal models

@fit_marginal fits the per-feature regression models.

Details:

The function takes the result of @construct_data as input and fits a regression model for each feature according to the user’s specification.

Arguments:

mu_formula: str: A string of the mu parameter formula
sigma_formula: str: A string of the sigma parameter formula
family_use: str or list[str]: A string or a list of strings specifying the marginal distribution. Each value must be one of ‘binomial’, ‘poisson’, ‘nb’, ‘zip’, ‘zinb’ or ‘gaussian’, which represent the binomial, Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial and Gaussian distributions, respectively.
usebam: bool: If True, call the R function mgcv::bam to speed up the computation.
data: rpy2.robjects.vectors.ListVector or rpy2.rlike.container.OrdDict or dict (default: ‘default’): The result of @construct_data. The default, ‘default’, uses the class property @construct_data_res.
predictor: str (default: ‘gene’): A string of the predictor for the gam/gamlss model. This is essentially just a name.
n_cores: int (default: ‘default’): The number of cores to use. The default, ‘default’, uses the setting given at initialization.
parallelization: str (default: ‘default’): The parallelization function to use; one of ‘mcmapply’, ‘bpmapply’ or ‘pbmcmapply’. If ‘bpmapply’, first call the method @get_bpparam. The default, ‘default’, uses the setting given at initialization.
bpparam: rpy2.robjects.methods.RS4 or None (default: ‘default’): If @parallelization is ‘bpmapply’, first call the function @get_bpparam to get the robject. If @parallelization is ‘mcmapply’ or ‘pbmcmapply’, it should be None. The default, ‘default’, uses the setting given at initialization.
trace: bool (default: False): If True, the warning/error log and runtime for gam/gamlss will be returned.
return_py: bool (default: ‘default’): If True, the function returns a result that is easy to manipulate in Python. The default, ‘default’, uses the setting given at initialization.
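
The argument constraints listed above can be checked before the call is made. The following is a minimal sketch only: `build_fit_marginal_kwargs` is a hypothetical helper that is not part of scDesign3Py, and the formula strings in the usage lines are placeholders. It checks `family_use` and `parallelization` against the documented values and enforces the documented rule that `bpparam` should be None for ‘mcmapply’ and ‘pbmcmapply’.

```python
from typing import Any, Dict, List, Union

# Values documented for the family_use and parallelization arguments.
FAMILIES = {"binomial", "poisson", "nb", "zip", "zinb", "gaussian"}
PARALLEL_FUNCS = {"mcmapply", "bpmapply", "pbmcmapply"}


def build_fit_marginal_kwargs(
    mu_formula: str,
    sigma_formula: str,
    family_use: Union[str, List[str]],
    usebam: bool = False,
    parallelization: str = "default",
    bpparam: Any = "default",
    trace: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments, then return them as kwargs."""
    # family_use may be a single string or a list of strings.
    families = [family_use] if isinstance(family_use, str) else list(family_use)
    unknown = [f for f in families if f not in FAMILIES]
    if unknown:
        raise ValueError(f"unsupported family_use value(s): {unknown}")

    if parallelization != "default" and parallelization not in PARALLEL_FUNCS:
        raise ValueError(f"unsupported parallelization: {parallelization!r}")

    # Documented rule: bpparam should be None for mcmapply / pbmcmapply.
    # 'default' is let through because it defers to the initialization setting.
    is_default = isinstance(bpparam, str) and bpparam == "default"
    if parallelization in {"mcmapply", "pbmcmapply"}:
        if bpparam is not None and not is_default:
            raise ValueError("bpparam should be None unless using 'bpmapply'")

    return {
        "mu_formula": mu_formula,
        "sigma_formula": sigma_formula,
        "family_use": family_use,
        "usebam": usebam,
        "parallelization": parallelization,
        "bpparam": bpparam,
        "trace": trace,
    }


# Placeholder formulas; replace them with covariates present in your own data.
kwargs = build_fit_marginal_kwargs("cell_type", "1", family_use="nb", trace=True)
# With an initialized scDesign3 instance (here called `model`, an assumption):
# marginal = model.fit_marginal(**kwargs)
```

Validating on the Python side is a design choice of this sketch; it surfaces a misspelled family name as a Python exception before any R code runs.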

Output:

A dict-like object. Each key is a gene name, and the number of keys equals the total number of genes.

The entry for each gene has the following keys.

fit: rpy2.rlike.container.OrdDict: The fitted regression models.
removed_cell: numpy.ndarray: The cells (observations) removed when fitting the marginal regression model.
warning: rpy2.rlike.container.OrdDict: Returned only if @trace = True. Holds the warning/error log produced when fitting the marginal models.
time: numpy.ndarray: Returned only if @trace = True. The runtime of each marginal model. The first value is the runtime of the gam model and the second is the runtime of the gamlss model.
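
The nested structure described above can be traversed gene by gene. The sketch below uses a hand-written `mock_result` in place of a real return value: plain dicts and lists stand in for the `OrdDict` and `numpy.ndarray` objects, and the gene and cell names are invented. It assumes the result supports `.items()` and key lookup, as a dict-like object would. `summarize_marginals` is a hypothetical helper, not part of scDesign3Py.

```python
from typing import Any, Dict, Mapping


def summarize_marginals(result: Mapping[str, Mapping[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: per-gene removed-cell counts and, if traced, runtimes."""
    summary: Dict[str, Dict[str, Any]] = {}
    for gene, entry in result.items():
        info: Dict[str, Any] = {"n_removed": len(entry["removed_cell"])}
        # 'warning' and 'time' are present only when trace=True was used.
        info["has_warning_log"] = "warning" in entry
        if "time" in entry:
            # Documented order: gam runtime first, gamlss runtime second.
            gam_time, gamlss_time = entry["time"]
            info["gam_time"] = float(gam_time)
            info["gamlss_time"] = float(gamlss_time)
        summary[gene] = info
    return summary


# Mock data shaped like the documented output with trace=True (values invented).
mock_result = {
    "GeneA": {"fit": object(), "removed_cell": [], "warning": {}, "time": [0.12, 0.0]},
    "GeneB": {"fit": object(), "removed_cell": ["cell_3", "cell_8"], "warning": {}, "time": [0.0, 1.5]},
}

summary = summarize_marginals(mock_result)
for gene, info in summary.items():
    print(gene, info)
```

Checking for the `time` and `warning` keys before reading them keeps the same helper usable on results produced with `trace=False`.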