Classes and Utilities

TSeries Class

All methods for computing signatures, fitting null models, validating, building graphs, plotting, and performing community detection are included in the TSeries class.

class siberia.TSeries(data=None, n_jobs=1)

TSeries: Graph-Based Time Series Analysis Class

The TSeries class provides a comprehensive framework for analyzing time series data using graph-based methods. It is designed to process a weighted adjacency matrix (2D numpy array) representing an N x T time series and extract a rich set of statistics, including binary signatures, motif statistics, model fitting, signed graph projection, and signed community detection.

Key Features:

- Standardizes input time series data (row-wise mean ~0, std ~1).
- Computes binary time series representations (positive/negative motifs).
- Calculates motif-based signatures and concordance/discordance matrices.
- Supports model fitting for the binary Signed Random Graph Model (bSRGM) and the binary Signed Configuration Model (bSCM).
- Predicts event probabilities for fitted models.
- Validates signature distributions via ensemble simulations and analytical methods (KS test).
- Builds filtered graphs based on statistical significance (with FDR correction).
- Detects communities using greedy minimization of BIC or frustration objectives.
- Visualizes graphs, communities, and block matrices.

Parameters:

data (np.ndarray): 2D numpy array (N x T) representing the time series matrix (N nodes, T time steps).
n_jobs (int, optional): Number of parallel jobs for computations (default: 1).

Attributes:

n_jobs (int): Number of parallel jobs used.
N (int): Number of nodes (rows in data).
T (int): Number of time steps (columns in data).
tseries (np.ndarray): Standardized time series matrix.
binary_tseries (np.ndarray): Binary sign matrix of time series.
binary_tseries_positive (np.ndarray): Matrix of positive binary motifs.
binary_tseries_negative (np.ndarray): Matrix of negative binary motifs.
ai_plus, ai_minus (np.ndarray): Row-wise sums of positive/negative motifs.
kt_plus, kt_minus (np.ndarray): Column-wise sums of positive/negative motifs.
a_plus, a_minus (float): Total sum of positive/negative motifs.
binary_concordant_motifs (np.ndarray): Matrix of concordant motif counts.
binary_discordant_motifs (np.ndarray): Matrix of discordant motif counts.
binary_signature (np.ndarray): Matrix of binary signature values.
params (np.ndarray): Fitted model parameters.
ll (float): Log-likelihood of the fitted model.
jac (np.ndarray): Jacobian of the fitted model.
norm (float): Norm of the Jacobian.
aic (float): Akaike Information Criterion of the fitted model.
norm_rel_error (float): Relative error of the fitted model.
Name of the fitted model.
x0 (np.ndarray): Initial guess for model parameters.
tol (float): Tolerance for optimization.
eps (float): Step size for numerical approximation.
maxiter (int): Maximum number of optimization iterations.
verbose (int): Verbosity level.
pit_plus, pit_minus (np.ndarray): Predicted probabilities for positive/negative events.
n_ensemble (int): Number of ensemble realizations for validation.
ensemble_signature (np.ndarray): Ensemble signature matrix.
analytical_signature (np.ndarray): Analytical signature matrix.
analytical_signature_dist (np.ndarray): Analytical signature distribution.
ks_score (float): Kolmogorov-Smirnov score for signature validation.
p_values_corrected (np.ndarray): FDR-corrected p-values matrix.
cdfx_condition (np.ndarray): Matrix indicating direction of statistical significance.
graph (np.ndarray): Filtered adjacency matrix (projection graph).
communities (np.ndarray): Community labels for each node.
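The relationship between the standardized series, the sign matrices, and the marginals listed above can be sketched with NumPy. This is a minimal illustration, assuming the positive and negative matrices are 0/1 indicators of the sign of the standardized series; the helper name `sign_marginals` is hypothetical and is not part of the siberia API.

```python
import numpy as np


def sign_marginals(data):
    """Standardize each row, then derive sign matrices and their marginals.

    Assumption: the positive/negative matrices are 0/1 indicators of the sign
    of the standardized series. This follows the attribute descriptions, not
    the library's source code.
    """
    data = np.asarray(data, dtype=float)

    # Row-wise standardization: each series gets mean ~0 and std ~1.
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    tseries = (data - mean) / std

    positive = (tseries > 0).astype(int)
    negative = (tseries < 0).astype(int)

    return {
        "tseries": tseries,
        "binary_tseries": positive - negative,   # entries in {-1, 0, +1}
        "ai_plus": positive.sum(axis=1),         # row-wise sums
        "ai_minus": negative.sum(axis=1),
        "kt_plus": positive.sum(axis=0),         # column-wise sums
        "kt_minus": negative.sum(axis=0),
        "a_plus": positive.sum(),                # totals
        "a_minus": negative.sum(),
    }


rng = np.random.default_rng(0)
marginals = sign_marginals(rng.normal(size=(4, 50)))
```

Under this assumption the row-wise sums and the column-wise sums both add up to the corresponding total, which is a quick consistency check on any input.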

Methods:

__init__(self, data=None, n_jobs=1): Initialize the TSeries instance and compute marginals.
compute_signature(self): Compute binary signatures and motif statistics.
fit(self, model, …): Fit a specified model ('bSRGM', 'bSCM') to the data.
predict(self): Predict event probabilities for the fitted model.
check_distribution_signature(self, n_ensemble=1000, ks_score=True, alpha=0.05): Validate signature distribution using ensemble and analytical methods.
build_graph(self, fdr_correction_flag=True, alpha=0.05): Build filtered graph using statistical significance (with FDR correction).
plot_graph(self, export_path='', show=True): Plot the adjacency matrix as a heatmap.
community_detection(self, trials=500, n_jobs=1, method="bic", show=False, …): Detect communities using greedy minimization of BIC or frustration.
plot_communities(self, export_path="", show=True): Plot reordered adjacency matrix with community blocks.
plot_block_matrix(self, export_path="", show=True): Plot block matrix of the graph based on detected communities.

Notes:

- The class is optimized for parallel computation and large time series datasets.
- All statistical tests and corrections are performed on the upper triangular part of the projection matrices for efficiency.
- Community detection supports robust initialization strategies for reproducibility.
- Visualization methods use discrete colormaps for signed graphs.

Raises:

- If input data is missing or incorrectly formatted, or if required computations are not performed.
- If input types are incorrect or unsupported.

compute_signature()

Computes the binary signatures of time series data. This method calculates the concordant and discordant motifs for binary time series data, then computes the binary signature by subtracting the discordant motifs from the concordant motifs.

The method performs the following steps:

1. Computes pairwise motifs for binary time series data (positive-positive, positive-negative, negative-positive, negative-negative).
2. Calculates the binary concordant motifs as the sum of positive-positive and negative-negative motifs.
3. Calculates the binary discordant motifs as the sum of positive-negative and negative-positive motifs.
4. Computes the binary signature as the difference between binary concordant and discordant motifs.

Attributes:

binary_concordant_motifs (np.ndarray): Matrix of concordant motif counts (positive-positive plus negative-negative).
binary_discordant_motifs (np.ndarray): Matrix of discordant motif counts (positive-negative plus negative-positive).
binary_signature (np.ndarray): Difference between binary concordant and discordant motifs.
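The four steps of compute_signature can be sketched with matrix products. This is an illustration rather than the library's implementation: it assumes a motif between series i and j is counted at every time step where both series carry a nonzero sign, and the function name `binary_signature` is hypothetical.

```python
import numpy as np


def binary_signature(binary_tseries):
    """Return (concordant, discordant, signature) as N x N matrices.

    Assumption: motif counts are dot products of 0/1 indicator rows, one
    indicator matrix for positive signs and one for negative signs.
    """
    signs = np.asarray(binary_tseries)
    pos = (signs > 0).astype(int)
    neg = (signs < 0).astype(int)

    # Step 1: the four pairwise motif counts.
    pos_pos = pos @ pos.T
    pos_neg = pos @ neg.T
    neg_pos = neg @ pos.T
    neg_neg = neg @ neg.T

    # Steps 2-4: concordant, discordant, and their difference.
    concordant = pos_pos + neg_neg
    discordant = pos_neg + neg_pos
    signature = concordant - discordant
    return concordant, discordant, signature


signs = np.array([
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [-1, -1, 1, 1],
])
concordant, discordant, signature = binary_signature(signs)
```

With this counting rule the signature equals `signs @ signs.T`, so series 0 and 2 in the example (which always have opposite signs) get a signature of -4, while series 0 and 1 (two agreements, two disagreements) get 0.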

fit(model, x0=None, maxiter=1000, max_nfev=1000, verbose=0, tol=1e-08, eps=1e-08, output_params_path=None, imported_params=None, solver_type='fixed_point')

Fit the specified model to the data.

Parameters:

model (str): The model to be fitted. Must be one of the implemented models: 'bSRGM', 'bSCM'.
x0 (array-like, optional): Initial guess for the parameters. If None, a random initialization will be used.
maxiter (int, optional): Maximum number of iterations for the optimization algorithm. Default is 1000.
max_nfev (int, optional): Maximum number of function evaluations for the optimization algorithm. Default is 1000.
verbose (int, optional): Verbosity level of the optimization algorithm. Default is 0.
tol (float, optional): Tolerance for termination by the optimization algorithm. Default is 1e-8.
eps (float, optional): Step size used for numerical approximation of the Jacobian. Default is 1e-8.
output_params_path (str, optional): Path to save the fitted parameters. If None, the parameters will not be saved.

Raises:

ValueError: If the model is not initialized or not implemented.
TypeError: If output_params_path is not a string.

Returns:

None

predict()

Predict the probabilities of events based on the specified model. This method computes the probabilities of the occurrence of events for the implemented models:

- binary Signed Random Graph Model (bSRGM)
- binary Signed Configuration Model (bSCM)

Returns:

tuple: For "bSRGM" and "bSCM", returns the computed probabilities (pit_plus, pit_minus).
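To make the shape of the returned probabilities concrete, here is a minimal sketch for the simplest case. It assumes the bSRGM is homogeneous, meaning every entry of the N x T sign matrix has the same probability of being positive or negative, so the maximum-likelihood estimates are the empirical frequencies a_plus / (N T) and a_minus / (N T). This assumption and the function name `homogeneous_sign_probabilities` are illustrative and are not taken from the library.

```python
import numpy as np


def homogeneous_sign_probabilities(binary_tseries):
    """Return (pit_plus, pit_minus) as constant N x T matrices.

    Assumption: a homogeneous model in which each entry is +1 with
    probability p_plus and -1 with probability p_minus, independently.
    """
    signs = np.asarray(binary_tseries)
    n, t = signs.shape

    a_plus = np.count_nonzero(signs > 0)
    a_minus = np.count_nonzero(signs < 0)

    # Empirical frequencies are the maximum-likelihood estimates here.
    p_plus = a_plus / (n * t)
    p_minus = a_minus / (n * t)

    return np.full((n, t), p_plus), np.full((n, t), p_minus)


signs = np.array([
    [1, -1, 1, 0],
    [1, 1, -1, -1],
])
pit_plus, pit_minus = homogeneous_sign_probabilities(signs)
```

In this example four of the eight entries are positive and three are negative, so every entry of pit_plus is 0.5 and every entry of pit_minus is 0.375.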

check_distribution_signature(n_ensemble=1000, ks_score=True, alpha=0.05)

Validate the signature of the model using either ensemble or analytical methods.

Parameters:

n_ensemble (int, optional): Number of ensemble realizations used to build the empirical signature distribution. Default is 1000.
ks_score (bool, optional): If True, compute the Kolmogorov–Smirnov agreement score between empirical and analytical signature distributions. Default is True.
alpha (float, optional): Significance level used in the KS test when computing the KS score. Default is 0.05.

Raises:

ValueError: If the predicted probabilities and conditional weights are not computed before validation, or if the specified model is not valid.

Notes:

This function validates the signature of the model by computing p-values and applying FDR correction. Depending on the model type, it uses different methods for validation:

- For ensemble-based validation, it computes ensemble signatures and derives statistics from them.
- For analytical validation, it computes p-values using specific analytical models for different types of models.
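The ensemble side of this validation can be illustrated by sampling sign matrices from the predicted probabilities and computing a signature for each realization. The sketch below assumes entries are independent and that the signature is the sum over time of sign products (concordant minus discordant); the function name `sample_signature_ensemble` and the `seed` parameter are hypothetical.

```python
import numpy as np


def sample_signature_ensemble(pit_plus, pit_minus, n_ensemble=1000, seed=0):
    """Sample sign matrices and return signatures of shape (n_ensemble, N, N).

    Assumption: each entry is +1 with probability pit_plus, -1 with
    probability pit_minus, and 0 otherwise, independently of all others.
    """
    rng = np.random.default_rng(seed)
    pit_plus = np.asarray(pit_plus, dtype=float)
    pit_minus = np.asarray(pit_minus, dtype=float)
    n, t = pit_plus.shape

    # One uniform draw per entry decides between +1, -1 and 0.
    u = rng.random((n_ensemble, n, t))
    signs = np.where(u < pit_plus, 1,
                     np.where(u < pit_plus + pit_minus, -1, 0))

    # Batched product: sum over time of sign products for every pair.
    return signs @ signs.transpose(0, 2, 1)


pit_plus = np.full((3, 40), 0.5)
pit_minus = np.full((3, 40), 0.5)
ensemble = sample_signature_ensemble(pit_plus, pit_minus, n_ensemble=200)
```

Each slice `ensemble[k]` is one realization of the signature matrix, so `ensemble[:, i, j]` is the empirical null distribution for the pair (i, j) that an analytical distribution could be compared against.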

build_graph(fdr_correction_flag=True, alpha=0.05)

Validate the signature of the model using analytical methods. This function computes p-values and, optionally, applies False Discovery Rate (FDR) correction. It supports two model types: 'bSRGM' and 'bSCM'.

Parameters:

fdr_correction_flag (bool, optional): Flag to indicate whether to apply False Discovery Rate (FDR) correction. Default is True.
alpha (float, optional): Significance level for statistical tests. Default is 0.05.

Returns:

A filtered signature matrix where elements are retained based on the significance level.

Raises:

ValueError: If the predicted probabilities and conditional weights are not computed before validation, or if the specified model is not valid.

Notes:

- For the 'bSRGM' model, p-values are computed using a binomial cumulative distribution function.
- For the 'bSCM' model, p-values are computed using the Poisson binomial distribution.
- The FDR correction is applied to the upper triangular part of the p-values matrix, and the corrected matrix is made symmetric.
- The filtered signature matrix is computed by retaining elements of the empirical signature matrix where the corrected p-values are below the significance level.
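The correction and filtering steps can be sketched as follows. The documentation only says "FDR correction"; the Benjamini-Hochberg step-up procedure used here is an assumption, as are the function name `fdr_filter` and the choice to leave the diagonal out of the filtered matrix.

```python
import numpy as np


def fdr_filter(signature, p_values, alpha=0.05):
    """Keep signature entries whose corrected p-value is below alpha.

    Assumption: the FDR correction is Benjamini-Hochberg, applied to the
    strict upper triangle and then mirrored to make the result symmetric.
    """
    signature = np.asarray(signature)
    p_values = np.asarray(p_values, dtype=float)
    n = p_values.shape[0]

    rows, cols = np.triu_indices(n, k=1)  # upper triangle, no diagonal
    p = p_values[rows, cols]
    m = p.size

    # Benjamini-Hochberg adjusted p-values.
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)

    # Diagonal entries keep a corrected p-value of 1 and are filtered out.
    corrected = np.ones((n, n))
    corrected[rows, cols] = adjusted
    corrected[cols, rows] = adjusted

    filtered = np.where(corrected < alpha, signature, 0)
    return filtered, corrected


signature = np.array([[10, 6, -5], [6, 10, 1], [-5, 1, 10]])
p_values = np.array([[1.0, 0.001, 0.04], [0.001, 1.0, 0.5], [0.04, 0.5, 1.0]])
filtered, corrected = fdr_filter(signature, p_values, alpha=0.05)
```

In this example the pair (0, 2) has a raw p-value of 0.04, which is below alpha, but its corrected value is 0.06, so only the pair (0, 1) survives the filter.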

plot_graph(export_path='', show=True)

Plots the naive and filtered adjacency matrices as heatmaps.

Parameters:

export_path (str, optional): The file path (excluding extension) where the plot will be saved as a PDF. If not provided, the plot will not be saved. Default is an empty string.
show (bool, optional): If True, displays the plot. Default is True.

Raises:

ValueError: If self.filtered_graph is None, indicating that the graph has not been built.

Notes:

The naive adjacency matrix is plotted on the left, and the filtered adjacency matrix is plotted on the right.
The heatmaps use a discrete colormap with three colors: red (-1), white (0), and blue (1).
If export_path is provided, the plot is saved as a PDF with the suffix “_adjacency.pdf”.

community_detection(trials: int = 500, n_jobs: int = 1, method: str = 'bic', show: bool = False, random_state: int = 42, starter: str = 'uniform')

Detect communities in the current graph via greedy minimization with multiple randomized restarts.

This method partitions the nodes of self.graph into communities by greedily minimizing an objective function. Two objectives are supported:

- "bic": Bayesian Information Criterion of a signed stochastic block model (separate probabilities for positive and negative edges in each block);
- "frustration": signed network frustration, penalizing negative edges inside communities and positive edges across communities.

For robustness to local minima, the algorithm performs several independent trials, each starting from a different random community assignment. Trials are run in parallel and the partition with the lowest objective value is returned.

Parameters

trials (int, optional): Number of independent random restarts (trials) of the greedy algorithm. Each trial starts from a different initial community assignment. Default is 500.
n_jobs (int or None, optional): Number of parallel jobs used to run the trials. If None, uses self.n_jobs. Default is 1.
method ({"bic", "frustration"}, optional): Objective to minimize. "bic" uses the BIC of a signed SBM; "frustration" uses network frustration. Default is "bic".
show (bool, optional): If True, passes a verbose flag to the underlying parallel execution to log progress information. Default is False.
random_state (int or None, optional): Seed for the global random number generator that produces per-trial seeds. Use this for reproducible community assignments. Default is 42.
starter (str, optional): Strategy used to generate initial community labels for each trial. If "uniform", each trial starts from a shuffled identity labeling (one unique label per node). Any other value triggers a mixture strategy that randomly chooses between shuffled identity and a random partition into k communities (with 2 ≤ k ≤ min(10, N)). Default is "uniform".

Returns

np.ndarray: One-dimensional array of length N with the community label of each node (labels are relabeled to be contiguous integers starting at 0). The same array is also stored in self.communities.

Raises

ValueError: If self.graph is None (i.e., .build_graph() must be called first), or if method is not one of "bic" or "frustration".

Notes

The underlying graph is represented by self.graph, a signed adjacency matrix. For the BIC objective, a signed stochastic block model with separate probabilities for positive and negative edges in each community pair is fitted, and the BIC is computed as a penalized negative log-likelihood. For the frustration objective, the loss counts (with weights) negative edges inside communities and positive edges between communities.

During optimization, if a community becomes empty after a node move, the labels are renumbered so that community indices remain compact (0, 1, …, K-1).
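The frustration objective and the restart scheme can be illustrated with a small sequential sketch. It is not the library's implementation: it uses an unweighted frustration count, runs trials one after another instead of in parallel, and starts every trial from a shuffled identity labeling. The function names `frustration` and `greedy_frustration` are hypothetical.

```python
import numpy as np


def frustration(graph, labels):
    """Count negative edges inside communities and positive edges between them."""
    graph = np.asarray(graph)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones(graph.shape, dtype=bool), k=1)  # each pair once
    negative_inside = np.sum((graph < 0) & same & upper)
    positive_between = np.sum((graph > 0) & ~same & upper)
    return int(negative_inside + positive_between)


def greedy_frustration(graph, trials=5, random_state=42):
    """Greedy single-node moves with random restarts; returns (labels, loss)."""
    graph = np.asarray(graph)
    n = graph.shape[0]
    rng = np.random.default_rng(random_state)
    best_labels, best_loss = None, np.inf

    for _ in range(trials):
        labels = rng.permutation(n)  # shuffled identity: one label per node
        loss = frustration(graph, labels)
        improved = True
        while improved:
            improved = False
            for node in rng.permutation(n):
                for candidate in np.unique(labels):
                    if candidate == labels[node]:
                        continue
                    trial = labels.copy()
                    trial[node] = candidate
                    trial_loss = frustration(graph, trial)
                    if trial_loss < loss:  # accept only strict improvements
                        labels, loss = trial, trial_loss
                        improved = True
        if loss < best_loss:
            # Relabel to contiguous integers 0, 1, ..., K-1.
            best_labels = np.unique(labels, return_inverse=True)[1]
            best_loss = loss

    return best_labels, best_loss


# Two groups, {0, 1, 2} and {3, 4}: positive inside, negative across.
groups = np.array([0, 0, 0, 1, 1])
graph = np.where(groups[:, None] == groups[None, :], 1, -1)
np.fill_diagonal(graph, 0)
labels, loss = greedy_frustration(graph)
```

On this perfectly balanced toy graph the search recovers the two groups with zero frustration. The label values themselves may be swapped relative to `groups`, since only the partition is meaningful.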

plot_communities(export_path='', show=True)

Plot reordered adjacency matrix by community labels with boxes.

Parameters:

graph_type (str, optional): Either "naive" or "filtered" (default="filtered").
export_path (str, optional): Path to save the PDF figure. If empty, the plot is not saved.
show (bool, optional): If True, display the figure.
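The reordering behind this plot can be sketched without any plotting code. The example assumes community labels are contiguous integers starting at 0, as returned by community_detection; the function name `reorder_by_community` is hypothetical, and the returned boundaries are just one way to locate where the boxes would be drawn.

```python
import numpy as np


def reorder_by_community(graph, communities):
    """Permute rows and columns so nodes of the same community are adjacent.

    Returns the reordered matrix, the node order, and the block boundaries
    (start index of each community block, plus the final end index).
    """
    graph = np.asarray(graph)
    communities = np.asarray(communities)

    # Stable sort keeps the original node order inside each community.
    order = np.argsort(communities, kind="stable")
    reordered = graph[np.ix_(order, order)]

    sizes = np.bincount(communities)
    boundaries = np.concatenate(([0], np.cumsum(sizes)))
    return reordered, order, boundaries


communities = np.array([1, 0, 1, 0])
graph = np.where(communities[:, None] == communities[None, :], 1, -1)
np.fill_diagonal(graph, 0)
reordered, order, boundaries = reorder_by_community(graph, communities)
```

After reordering, the positive entries of this toy graph sit in two diagonal blocks delimited by `boundaries`, and the negative entries fill the off-diagonal blocks.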

plot_block_matrix(export_path='', show=True)

Plot block matrix of the graph based on detected communities.

Parameters:

export_path (str, optional): Path to save the PDF figure. If empty, the plot is not saved.
show (bool, optional): If True, display the figure.