Classes and Utilities

TSeries Class

All methods for computing signatures, fitting null models, validating, building graphs, plotting, and performing community detection are included in the TSeries class.

class siberia.TSeries(data=None, n_jobs=1)

TSeries: Graph-Based Time Series Analysis Class

The TSeries class provides a comprehensive framework for analyzing time series data using graph-based methods. It is designed to process a weighted adjacency matrix (2D numpy array) representing an N x T time series and extract a rich set of statistics, including binary signatures, motif statistics, model fitting, signed graph projection, and signed community detection.

Key Features:

  • Standardizes input time series data (row-wise mean ~0, std ~1).

  • Computes binary time series representations (positive/negative motifs).

  • Calculates motif-based signatures and concordance/discordance matrices.

  • Supports model fitting for binary Signed Random Graph Model (bSRGM) and binary Signed Configuration Model (bSCM).

  • Predicts event probabilities for fitted models.

  • Validates signature distributions via ensemble simulations and analytical methods (KS test).

  • Builds filtered graphs based on statistical significance (with FDR correction).

  • Detects communities using greedy minimization of BIC or frustration objectives.

  • Visualizes graphs, communities, and block matrices.

Parameters:

data : np.ndarray

2D numpy array (N x T) representing the time series matrix (N nodes, T time steps).

n_jobs : int, optional

Number of parallel jobs for computations (default: 1).

Attributes

n_jobs : int

Number of parallel jobs used.

N : int

Number of nodes (rows in data).

T : int

Number of time steps (columns in data).

tseries : np.ndarray

Standardized time series matrix.

binary_tseries : np.ndarray

Binary sign matrix of time series.

binary_tseries_positive : np.ndarray

Matrix of positive binary motifs.

binary_tseries_negative : np.ndarray

Matrix of negative binary motifs.

ai_plus, ai_minus : np.ndarray

Row-wise sums of positive/negative motifs.

kt_plus, kt_minus : np.ndarray

Column-wise sums of positive/negative motifs.

a_plus, a_minus : float

Total sum of positive/negative motifs.

binary_concordant_motifs : np.ndarray

Matrix of concordant motif counts.

binary_discordant_motifs : np.ndarray

Matrix of discordant motif counts.

binary_signature : np.ndarray

Matrix of binary signature values.

params : np.ndarray

Fitted model parameters.

ll : float

Log-likelihood of the fitted model.

jac : np.ndarray

Jacobian of the fitted model.

norm : float

Norm of the Jacobian.

aic : float

Akaike Information Criterion of the fitted model.

norm_rel_error : float

Relative error of the fitted model. Name of the fitted model.

x0 : np.ndarray

Initial guess for model parameters.

tol : float

Tolerance for optimization.

eps : float

Step size for numerical approximation.

maxiter : int

Maximum number of optimization iterations.

verbose : int

Verbosity level.

pit_plus, pit_minus : np.ndarray

Predicted probabilities for positive/negative events.

n_ensemble : int

Number of ensemble realizations for validation.

ensemble_signature : np.ndarray

Ensemble signature matrix.

analytical_signature : np.ndarray

Analytical signature matrix.

analytical_signature_dist : np.ndarray

Analytical signature distribution.

ks_score : float

Kolmogorov-Smirnov score for signature validation.

p_values_corrected : np.ndarray

FDR-corrected p-values matrix.

cdfx_condition : np.ndarray

Matrix indicating direction of statistical significance.

graph : np.ndarray

Filtered adjacency matrix (projection graph).

communities : np.ndarray

Community labels for each node.
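The relationship between the standardized matrix, the sign matrices, and the marginal attributes listed above can be illustrated with plain NumPy. This is only a sketch under stated assumptions, not the library's code: the helper names `standardize_rows` and `binary_marginals` are hypothetical, the population standard deviation (ddof=0) is assumed, and a positive/negative motif is assumed to mean a strictly positive/negative standardized value.

```python
import numpy as np

def standardize_rows(data):
    """Center each row to mean 0 and scale it to unit standard deviation.

    Assumes no row is constant (a constant row would divide by zero).
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)  # assumption: ddof=0
    return (data - mean) / std

def binary_marginals(tseries):
    """Build sign matrices and their row, column, and total sums."""
    sign = np.sign(tseries)
    pos = (sign > 0).astype(int)  # assumed meaning of "positive motifs"
    neg = (sign < 0).astype(int)  # assumed meaning of "negative motifs"
    return {
        "binary_tseries": sign,
        "ai_plus": pos.sum(axis=1),   # row-wise sums, length N
        "ai_minus": neg.sum(axis=1),
        "kt_plus": pos.sum(axis=0),   # column-wise sums, length T
        "kt_minus": neg.sum(axis=0),
        "a_plus": int(pos.sum()),     # total sums
        "a_minus": int(neg.sum()),
    }

rng = np.random.default_rng(0)
data = rng.normal(size=(4, 50))       # N = 4 nodes, T = 50 time steps
z = standardize_rows(data)
marginals = binary_marginals(z)
```

By construction, the row-wise and column-wise sums both add up to the corresponding total, which is a quick consistency check on any such decomposition.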

Methods

__init__(self, data=None, n_jobs=1)

Initialize the TSeries instance and compute marginals.

compute_signature(self)

Compute binary signatures and motif statistics.

fit(self, model, …)

Fit a specified model (‘bSRGM’, ‘bSCM’) to the data.

predict(self)

Predict event probabilities for the fitted model.

check_distribution_signature(self, n_ensemble=1000, ks_score=True, alpha=0.05)

Validate signature distribution using ensemble and analytical methods.

build_graph(self, fdr_correction_flag=True, alpha=0.05)

Build filtered graph using statistical significance (with FDR correction).

plot_graph(self, export_path='', show=True)

Plot the adjacency matrix as a heatmap.

community_detection(self, trials=500, n_jobs=None, method="bic", show=False, …)

Detect communities using greedy minimization of BIC or frustration.

plot_communities(self, export_path="", show=True)

Plot reordered adjacency matrix with community blocks.

plot_block_matrix(self, export_path="", show=True)

Plot block matrix of the graph based on detected communities.

Notes

  • The class is optimized for parallel computation and large time series datasets.

  • All statistical tests and corrections are performed on the upper triangular part of the projection matrices for efficiency.

  • Community detection supports robust initialization strategies for reproducibility.

  • Visualization methods use discrete colormaps for signed graphs.

Raises

ValueError

If input data is missing or incorrectly formatted, or if required computations are not performed.

TypeError

If input types are incorrect or unsupported.
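A typical call sequence, pieced together from the method summaries and their stated prerequisites (build_graph needs predicted probabilities, community_detection needs a built graph), might look like the sketch below. The wrapper `run_pipeline` and its arguments are hypothetical; only the method names and keyword arguments are taken from this page, and the optional validation and plotting steps are left out.

```python
def run_pipeline(tseries_cls, data, model="bSCM", alpha=0.05, trials=100):
    """Run the documented steps in order and return (instance, labels).

    tseries_cls is expected to be siberia.TSeries; it is passed in as an
    argument so the sketch does not depend on how the package is imported.
    """
    ts = tseries_cls(data=data, n_jobs=1)
    ts.compute_signature()   # concordant/discordant motifs and signature
    ts.fit(model)            # 'bSRGM' or 'bSCM'
    ts.predict()             # event probabilities (pit_plus, pit_minus)
    ts.build_graph(fdr_correction_flag=True, alpha=alpha)
    labels = ts.community_detection(
        trials=trials, method="bic", random_state=42
    )
    return ts, labels
```

With the package installed, this would be called as `run_pipeline(siberia.TSeries, data)`, where `data` is an N x T NumPy array.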

compute_signature()

Computes the binary signatures of time series data.

This method calculates the concordant and discordant motifs for binary time series data. It then computes the binary signature by subtracting the discordant motifs from the concordant motifs. The method performs the following steps:

1. Computes pairwise motifs for binary time series data (positive-positive, positive-negative, negative-positive, negative-negative).

2. Calculates the binary concordant motifs as the sum of positive-positive and negative-negative motifs.

3. Calculates the binary discordant motifs as the sum of positive-negative and negative-positive motifs.

4. Computes the binary signature as the difference between binary concordant and discordant motifs.

Attributes:

binary_concordant_motifs (np.ndarray): Matrix of concordant motif counts (positive-positive plus negative-negative) for binary time series data.

binary_discordant_motifs (np.ndarray): Matrix of discordant motif counts (positive-negative plus negative-positive) for binary time series data.

binary_signature (np.ndarray): Difference between binary concordant and discordant motifs.
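The four steps can be sketched with matrix products on the positive and negative indicator matrices. This is an illustration, not the library's implementation: the function name `binary_signature` is hypothetical, and whether the library keeps or discards the diagonal (each series paired with itself) is not stated here, so the sketch leaves it in.

```python
import numpy as np

def binary_signature(pos, neg):
    """Return (concordant, discordant, signature) matrices of shape N x N."""
    # Step 1: pairwise motif counts, summed over time steps
    pp = pos @ pos.T   # both series positive
    nn = neg @ neg.T   # both series negative
    pn = pos @ neg.T   # first positive, second negative
    np_ = neg @ pos.T  # first negative, second positive
    # Steps 2 and 3: concordant and discordant motifs
    concordant = pp + nn
    discordant = pn + np_
    # Step 4: signature as the difference
    return concordant, discordant, concordant - discordant

# Toy sign matrix: N = 3 series, T = 4 time steps
sign = np.array([[ 1,  1, -1, -1],
                 [ 1, -1, -1,  1],
                 [-1, -1,  1,  1]])
pos = (sign > 0).astype(int)
neg = (sign < 0).astype(int)
conc, disc, sig = binary_signature(pos, neg)
```

When no entry is exactly zero, the signature equals `sign @ sign.T`. In the toy data, series 0 and 2 disagree at every time step, so their signature entry is -4.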

fit(model, x0=None, maxiter=1000, max_nfev=1000, verbose=0, tol=1e-08, eps=1e-08, output_params_path=None, imported_params=None, solver_type='fixed_point')

Fit the specified model to the data.

Parameters:

model : str

The model to be fitted. Must be one of the implemented models: ‘bSRGM’, ‘bSCM’.

x0 : array-like, optional

Initial guess for the parameters. If None, a random initialization will be used.

maxiter : int, optional

Maximum number of iterations for the optimization algorithm. Default is 1000.

max_nfev : int, optional

Maximum number of function evaluations for the optimization algorithm. Default is 1000.

verbose : int, optional

Verbosity level of the optimization algorithm. Default is 0.

tol : float, optional

Tolerance for termination by the optimization algorithm. Default is 1e-8.

eps : float, optional

Step size used for numerical approximation of the Jacobian. Default is 1e-8.

output_params_path : str, optional

Path to save the fitted parameters. If None, the parameters will not be saved.

Raises:

ValueError

If the model is not initialized or not implemented.

TypeError

If output_params_path is not a string.

Returns:

None

predict()

Predict the probabilities of events based on the specified model.

This method computes the probabilities of the occurrence of events for the implemented models:

  • binary Signed Random Graph Model (bSRGM)

  • binary Signed Configuration Model (bSCM)

Returns:

tuple: For “bSRGM” and “bSCM”, returns the computed probabilities:

  • (pit_plus, pit_minus)

check_distribution_signature(n_ensemble=1000, ks_score=True, alpha=0.05)

Validate the signature of the model using either ensemble or analytical methods.

Parameters:

n_ensemble : int, optional

Number of ensemble realizations used to build the empirical signature distribution. Default is 1000.

ks_score : bool, optional

If True, compute the Kolmogorov–Smirnov agreement score between empirical and analytical signature distributions. Default is True.

alpha : float, optional

Significance level used in the KS test when computing the KS score. Default is 0.05.

Flag to indicate whether to use analytical methods for validation. Default is True.

Raises:

ValueError

If the predicted probabilities and conditional weights are not computed before validation. If the model specified is not valid.

Notes:

This function validates the signature of the model by computing p-values and applying FDR correction. Depending on the model type and the analytical flag, it uses different methods for validation:

  • For ensemble-based validation, it computes ensemble signatures and elaborates statistics.

  • For analytical validation, it computes p-values using specific analytical models for different types of models.

build_graph(fdr_correction_flag=True, alpha=0.05)

Validate the signature of the model using analytical methods.

This function validates the signature of the model by computing p-values and applying False Discovery Rate (FDR) correction. Depending on the model type, it uses analytical methods for validation. The function supports two model types: ‘bSRGM’ and ‘bSCM’.

A filtered signature matrix where elements are retained based on the significance level.

  • For the ‘bSRGM’ model, p-values are computed using a binomial cumulative distribution function.

  • For the ‘bSCM’ model, p-values are computed using the Poisson Binomial distribution.

  • The FDR correction is applied to the upper triangular part of the p-values matrix, and the corrected matrix is made symmetric.

  • The filtered signature matrix is computed by retaining elements of the empirical signature matrix where the corrected p-values are below the significance level.

Parameters:

fdr_correction_flag : bool, optional

Flag to indicate whether to apply False Discovery Rate (FDR) correction. Default is True.

alpha : float, optional

Significance level for statistical tests. Default is 0.05.

Raises:

ValueError

If the predicted probabilities and conditional weights are not computed before validation. If the model specified is not valid.

Notes:

This function validates the signature of the model by computing p-values and applying FDR correction. The p-values are computed analytically, using a distribution specific to each supported model type.
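The correction and filtering step can be sketched in NumPy as follows. Two assumptions are made loudly here: the Benjamini-Hochberg procedure is used as the FDR method (this page only says "FDR correction"), and the p-values matrix is taken as given rather than derived from the binomial or Poisson Binomial distributions. The helper names `bh_adjust` and `filter_signature` are hypothetical.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values for a 1D array (assumed method)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

def filter_signature(signature, p_values, alpha=0.05):
    """Correct upper-triangle p-values, symmetrize, and filter the signature."""
    n = signature.shape[0]
    upper = np.triu_indices(n, k=1)
    corrected = np.ones((n, n))                     # diagonal stays at 1
    corrected[upper] = bh_adjust(p_values[upper])
    corrected = np.minimum(corrected, corrected.T)  # mirror to lower triangle
    graph = np.where(corrected < alpha, signature, 0)
    return corrected, graph

signature = np.array([[ 4,  3, -2],
                      [ 3,  4, -4],
                      [-2, -4,  4]])
p_values = np.array([[1.00, 0.01, 0.04],
                     [0.01, 1.00, 0.50],
                     [0.04, 0.50, 1.00]])
corrected, graph = filter_signature(signature, p_values, alpha=0.05)
```

In this toy input, the raw p-value 0.04 for the pair (0, 2) would pass an uncorrected test at alpha = 0.05, but its adjusted value of 0.06 does not, so only the pair (0, 1) survives in the filtered matrix.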

plot_graph(export_path='', show=True)

Plots the naive and filtered adjacency matrices as heatmaps.

Parameters:

export_path : str, optional

The file path (excluding extension) where the plot will be saved as a PDF. If not provided, the plot will not be saved. Default is an empty string.

show : bool, optional

If True, displays the plot. Default is True.

Raises:

ValueError

If self.filtered_graph is None, indicating that the graph has not been built.

Notes:

  • The naive adjacency matrix is plotted on the left, and the filtered adjacency matrix is plotted on the right.

  • The heatmaps use a discrete colormap with three colors: red (-1), white (0), and blue (1).

  • If export_path is provided, the plot is saved as a PDF with the suffix “_adjacency.pdf”.

community_detection(trials: int = 500, n_jobs: int = 1, method: str = 'bic', show: bool = False, random_state: int = 42, starter: str = 'uniform')

Detect communities in the current graph via greedy minimization with multiple randomized restarts.

This method partitions the nodes of self.graph into communities by greedily minimizing an objective function. Two objectives are supported:

  • "bic": Bayesian Information Criterion of a signed stochastic block model (separate probabilities for positive and negative edges in each block);

  • "frustration": signed network frustration, penalizing negative edges inside communities and positive edges across communities.

For robustness to local minima, the algorithm performs several independent trials, each starting from a different random community assignment. Trials are run in parallel and the partition with the lowest objective value is returned.

Parameters

trials : int, optional

Number of independent random restarts (trials) of the greedy algorithm. Each trial starts from a different initial community assignment. Default is 500.

n_jobs : int or None, optional

Number of parallel jobs used to run the trials. If None, uses self.n_jobs. Default is 1.

method : {"bic", "frustration"}, optional

Objective to minimize. "bic" uses the BIC of a signed SBM; "frustration" uses network frustration. Default is "bic".

show : bool, optional

If True, passes a verbose flag to the underlying parallel execution to log progress information. Default is False.

random_state : int or None, optional

Seed for the global random number generator that produces per-trial seeds. Use this for reproducible community assignments. Default is 42.

starter : str, optional

Strategy used to generate initial community labels for each trial. If "uniform", each trial starts from a shuffled identity labeling (one unique label per node). Any other value triggers a mixture strategy that randomly chooses between shuffled identity and a random partition into k communities (with 2 ≤ k ≤ min(10, N)). Default is "uniform".

Returns

np.ndarray

One-dimensional array of length N with the community label of each node (labels are relabeled to be contiguous integers starting at 0). The same array is also stored in self.communities.

Raises

ValueError

If self.graph is None (i.e., .build_graph() must be called first), or if method is not one of "bic" or "frustration".

Notes

The underlying graph is represented by self.graph, a signed adjacency matrix. For the BIC objective, a signed stochastic block model with separate probabilities for positive and negative edges in each community pair is fitted, and the BIC is computed as a penalized negative log-likelihood. For the frustration objective, the loss counts (with weights) negative edges inside communities and positive edges between communities.

During optimization, if a community becomes empty after a node move, the labels are renumbered so that community indices remain compact (0, 1, …, K-1).
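The frustration objective and the label renumbering can be sketched as below. This is a simplified illustration rather than the library's code: the function names are hypothetical, every frustrated edge is counted with unit weight (the weighting mentioned in the notes is not specified here), and each undirected edge is counted once via the upper triangle.

```python
import numpy as np

def frustration(graph, labels):
    """Count negative edges inside communities plus positive edges across them."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    upper = np.triu_indices(graph.shape[0], k=1)  # each pair counted once
    edges = graph[upper]
    inside = same[upper]
    negative_inside = np.sum((edges < 0) & inside)
    positive_across = np.sum((edges > 0) & ~inside)
    return int(negative_inside + positive_across)

def compact_labels(labels):
    """Renumber labels to contiguous integers 0, 1, ..., K-1."""
    _, compact = np.unique(np.asarray(labels), return_inverse=True)
    return compact

# Signed graph with two natural groups {0, 1} and {2, 3};
# the positive edge (0, 3) is the only frustrated one under that split.
graph = np.array([[ 0,  1, -1,  1],
                  [ 1,  0,  0, -1],
                  [-1,  0,  0,  1],
                  [ 1, -1,  1,  0]])
two_groups = frustration(graph, [0, 0, 1, 1])
one_group = frustration(graph, [0, 0, 0, 0])
relabeled = compact_labels([5, 2, 5, 9])
```

A greedy pass would move one node at a time to the community that lowers this value most, then call something like `compact_labels` whenever a community empties out.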

plot_communities(export_path='', show=True)

Plot reordered adjacency matrix by community labels with boxes.

Parameters:

graph_type : str, optional

Either "naive" or "filtered" (default="filtered").

export_path : str, optional

Path to save the PDF figure. If empty, the plot is not saved.

show : bool, optional

If True, display the figure.

plot_block_matrix(export_path='', show=True)

Plot block matrix of the graph based on detected communities.

Parameters:

export_path : str, optional

Path to save the PDF figure. If empty, the plot is not saved.

show : bool, optional

If True, display the figure.