Classes and Utilities
TSeries Class
All methods for computing signatures, fitting null models, validating, building graphs,
plotting, and performing community detection are included in the TSeries class.
- class siberia.TSeries(data=None, n_jobs=1)
TSeries: Graph-Based Time Series Analysis Class
The TSeries class provides a comprehensive framework for analyzing time series data using graph-based methods. It is designed to process a weighted adjacency matrix (2D numpy array) representing an N x T time series and extract a rich set of statistics, including binary signatures, motif statistics, model fitting, signed graph projection, and signed community detection.
Key Features:
- Standardizes input time series data (row-wise mean ~0, std ~1).
- Computes binary time series representations (positive/negative motifs).
- Calculates motif-based signatures and concordance/discordance matrices.
- Supports model fitting for binary Signed Random Graph Model (bSRGM) and binary Signed Configuration Model (bSCM).
- Predicts event probabilities for fitted models.
- Validates signature distributions via ensemble simulations and analytical methods (KS test).
- Builds filtered graphs based on statistical significance (with FDR correction).
- Detects communities using greedy minimization of BIC or frustration objectives.
- Visualizes graphs, communities, and block matrices.
Parameters
- data : np.ndarray
2D numpy array (N x T) representing the time series matrix (N nodes, T time steps).
- n_jobs : int, optional
Number of parallel jobs for computations (default: 1).
Attributes
- n_jobs : int
Number of parallel jobs used.
- N : int
Number of nodes (rows in data).
- T : int
Number of time steps (columns in data).
- tseries : np.ndarray
Standardized time series matrix.
- binary_tseries : np.ndarray
Binary sign matrix of time series.
- binary_tseries_positive : np.ndarray
Matrix of positive binary motifs.
- binary_tseries_negative : np.ndarray
Matrix of negative binary motifs.
- ai_plus, ai_minus : np.ndarray
Row-wise sums of positive/negative motifs.
- kt_plus, kt_minus : np.ndarray
Column-wise sums of positive/negative motifs.
- a_plus, a_minus : float
Total sum of positive/negative motifs.
- binary_concordant_motifs : np.ndarray
Matrix of concordant motif counts.
- binary_discordant_motifs : np.ndarray
Matrix of discordant motif counts.
- binary_signature : np.ndarray
Matrix of binary signature values.
- params : np.ndarray
Fitted model parameters.
- ll : float
Log-likelihood of the fitted model.
- jac : np.ndarray
Jacobian of the fitted model.
- norm : float
Norm of the Jacobian.
- aic : float
Akaike Information Criterion of the fitted model.
- norm_rel_error : float
Relative error of the fitted model.
Name of the fitted model.
- x0 : np.ndarray
Initial guess for model parameters.
- tol : float
Tolerance for optimization.
- eps : float
Step size for numerical approximation.
- maxiter : int
Maximum number of optimization iterations.
- verbose : int
Verbosity level.
- pit_plus, pit_minus : np.ndarray
Predicted probabilities for positive/negative events.
- n_ensemble : int
Number of ensemble realizations for validation.
- ensemble_signature : np.ndarray
Ensemble signature matrix.
- analytical_signature : np.ndarray
Analytical signature matrix.
- analytical_signature_dist : np.ndarray
Analytical signature distribution.
- ks_score : float
Kolmogorov-Smirnov score for signature validation.
- p_values_corrected : np.ndarray
FDR-corrected p-values matrix.
- cdfx_condition : np.ndarray
Matrix indicating direction of statistical significance.
- graph : np.ndarray
Filtered adjacency matrix (projection graph).
- communities : np.ndarray
Community labels for each node.
Methods
- __init__(self, data=None, n_jobs=1)
Initialize the TSeries instance and compute marginals.
- compute_signature(self)
Compute binary signatures and motif statistics.
- fit(self, model, …)
Fit a specified model ('bSRGM', 'bSCM') to the data.
- predict(self)
Predict event probabilities for the fitted model.
- check_distribution_signature(self, n_ensemble=1000, ks_score=True, alpha=0.05)
Validate signature distribution using ensemble and analytical methods.
- build_graph(self, fdr_correction_flag=True, alpha=0.05)
Build filtered graph using statistical significance (with FDR correction).
- plot_graph(self, export_path='', show=True)
Plot the adjacency matrix as a heatmap.
- community_detection(self, trials=500, n_jobs=1, method="bic", show=False, …)
Detect communities using greedy minimization of BIC or frustration.
- plot_communities(self, export_path="", show=True)
Plot reordered adjacency matrix with community blocks.
- plot_block_matrix(self, export_path="", show=True)
Notes
- The class is optimized for parallel computation and large time series datasets.
- All statistical tests and corrections are performed on the upper triangular part of the projection matrices for efficiency.
- Community detection supports robust initialization strategies for reproducibility.
- Visualization methods use discrete colormaps for signed graphs.
Raises
- If input data is missing or incorrectly formatted, or if required computations are not performed.
- If input types are incorrect or unsupported.
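The standardization and the marginal attributes listed above (ai_plus, kt_plus, a_plus and their negative counterparts) can be illustrated with a short NumPy sketch. This is not the library's implementation: the helper names `standardize_rows` and `binary_marginals` are hypothetical, and the sketch assumes that a positive motif is a strictly positive standardized value and a negative motif a strictly negative one.

```python
import numpy as np


def standardize_rows(data):
    """Shift each row to mean 0 and scale it to standard deviation 1."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    return (data - mean) / std


def binary_marginals(tseries):
    """Row, column, and total counts of positive and negative signs (assumed definition)."""
    positive = (tseries > 0).astype(int)
    negative = (tseries < 0).astype(int)
    return {
        "ai_plus": positive.sum(axis=1),   # per node (row-wise)
        "ai_minus": negative.sum(axis=1),
        "kt_plus": positive.sum(axis=0),   # per time step (column-wise)
        "kt_minus": negative.sum(axis=0),
        "a_plus": positive.sum(),          # grand totals
        "a_minus": negative.sum(),
    }


# Toy N x T input: 4 nodes, 50 time steps.
rng = np.random.default_rng(0)
data = rng.normal(size=(4, 50))
tseries = standardize_rows(data)
marginals = binary_marginals(tseries)
```

The row-wise and column-wise sums both add up to the corresponding total, which is a quick sanity check on any such decomposition.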
- compute_signature()
Computes the binary signatures of time series data. This method calculates the concordant and discordant motifs for binary time series data, then computes the binary signature by subtracting the discordant motifs from the concordant motifs.
The method performs the following steps:
1. Computes pairwise motifs for binary time series data (positive-positive, positive-negative, negative-positive, negative-negative).
2. Calculates the binary concordant motifs as the sum of positive-positive and negative-negative motifs.
3. Calculates the binary discordant motifs as the sum of positive-negative and negative-positive motifs.
4. Computes the binary signature as the difference between binary concordant and discordant motifs.
Attributes:
- binary_concordant_motifs (np.ndarray): Pairwise sums of concordant motifs for binary time series data.
- binary_discordant_motifs (np.ndarray): Pairwise sums of discordant motifs for binary time series data.
- binary_signature (np.ndarray): Difference between binary concordant and discordant motifs.
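The four steps above can be sketched with matrix products. The function below is an illustration under stated assumptions rather than the library's code: `signature_from_signs` is a hypothetical name, motifs are assumed to be counted as co-occurrences over time steps, and the diagonal is left untouched (the library may treat self-pairs differently).

```python
import numpy as np


def signature_from_signs(positive, negative):
    """Concordant, discordant, and signature matrices from N x T 0/1 indicator matrices."""
    positive = np.asarray(positive, dtype=int)
    negative = np.asarray(negative, dtype=int)
    # Step 1: pairwise motif counts over time for every pair of rows.
    pos_pos = positive @ positive.T
    neg_neg = negative @ negative.T
    pos_neg = positive @ negative.T
    neg_pos = negative @ positive.T
    # Steps 2 and 3: concordant and discordant totals.
    concordant = pos_pos + neg_neg
    discordant = pos_neg + neg_pos
    # Step 4: signature as the difference.
    return concordant, discordant, concordant - discordant


# Three nodes observed over four time steps, encoded as signs.
signs = np.array([
    [1, -1, 1, 1],
    [1, 1, -1, 1],
    [-1, 1, -1, -1],
])
positive = (signs > 0).astype(int)
negative = (signs < 0).astype(int)
concordant, discordant, signature = signature_from_signs(positive, negative)
```

When every entry is either positive or negative, the signature equals the Gram matrix of the sign matrix (`signs @ signs.T`), which gives an independent check of the result.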
- fit(model, x0=None, maxiter=1000, max_nfev=1000, verbose=0, tol=1e-08, eps=1e-08, output_params_path=None, imported_params=None, solver_type='fixed_point')
Fit the specified model to the data.
Parameters:
- model : str
The model to be fitted. Must be one of the implemented models: 'bSRGM', 'bSCM'.
- x0 : array-like, optional
Initial guess for the parameters. If None, a random initialization will be used.
- maxiter : int, optional
Maximum number of iterations for the optimization algorithm. Default is 1000.
- max_nfev : int, optional
Maximum number of function evaluations for the optimization algorithm. Default is 1000.
- verbose : int, optional
Verbosity level of the optimization algorithm. Default is 0.
- tol : float, optional
Tolerance for termination by the optimization algorithm. Default is 1e-8.
- eps : float, optional
Step size used for numerical approximation of the Jacobian. Default is 1e-8.
- output_params_path : str, optional
Path to save the fitted parameters. If None, the parameters will not be saved.
Raises:
- ValueError
If the model is not initialized or not implemented.
- TypeError
If output_params_path is not a string.
Returns:
None
- predict()
Predict the probabilities of events based on the specified model. This method computes the probabilities of the occurrence of events for the implemented models:
- binary Signed Random Graph Model (bSRGM)
- binary Signed Configuration Model (bSCM)
Returns:
- tuple: For "bSRGM" and "bSCM", returns the computed probabilities (pit_plus, pit_minus).
- check_distribution_signature(n_ensemble=1000, ks_score=True, alpha=0.05)
Validate the signature of the model using either ensemble or analytical methods.
Parameters:
- n_ensemble : int, optional
Number of ensemble realizations used to build the empirical signature distribution. Default is 1000.
- ks_score : bool, optional
If True, compute the Kolmogorov–Smirnov agreement score between empirical and analytical signature distributions. Default is True.
- alpha : float, optional
Significance level used in the KS test when computing the KS score. Default is 0.05.
Flag to indicate whether to use analytical methods for validation. Default is True.
Raises:
- ValueError
If the predicted probabilities and conditional weights are not computed before validation. If the model specified is not valid.
Notes:
This function validates the signature of the model by computing p-values and applying FDR correction. Depending on the model type and the analytical flag, it uses different methods for validation:
- For ensemble-based validation, it computes ensemble signatures and derives statistics from them.
- For analytical validation, it computes p-values using the analytical distribution associated with each model type.
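To make the ensemble versus analytical comparison concrete, the sketch below samples sign matrices from given event probabilities and compares the ensemble mean of the signature with its expected value. It rests on assumptions that are not confirmed by this page: events are taken to be independent across nodes and time steps, each entry is +1 with probability pit_plus, -1 with probability pit_minus, and 0 otherwise, and the helper names are hypothetical. The KS score itself is not reproduced here.

```python
import numpy as np


def sample_signs(pit_plus, pit_minus, rng):
    """Draw one N x T sign matrix: +1, -1, or 0 per entry (assumed independent)."""
    u = rng.random(pit_plus.shape)
    return np.where(u < pit_plus, 1, np.where(u < pit_plus + pit_minus, -1, 0))


def ensemble_signatures(pit_plus, pit_minus, n_ensemble, seed=0):
    """Stack of signature matrices, one per sampled realization."""
    rng = np.random.default_rng(seed)
    n = pit_plus.shape[0]
    out = np.empty((n_ensemble, n, n))
    for r in range(n_ensemble):
        signs = sample_signs(pit_plus, pit_minus, rng)
        out[r] = signs @ signs.T  # concordant minus discordant counts
    return out


def expected_signature(pit_plus, pit_minus):
    """Expected signature under independence; valid for off-diagonal pairs only."""
    diff = pit_plus - pit_minus
    return diff @ diff.T


# Hypothetical probabilities for 3 nodes and 20 time steps.
rng = np.random.default_rng(1)
pit_plus = rng.uniform(0.2, 0.5, size=(3, 20))
pit_minus = rng.uniform(0.2, 0.5, size=(3, 20))

samples = ensemble_signatures(pit_plus, pit_minus, n_ensemble=2000)
analytical = expected_signature(pit_plus, pit_minus)
off_diagonal = ~np.eye(3, dtype=bool)
max_gap = np.abs(samples.mean(axis=0) - analytical)[off_diagonal].max()
```

With enough realizations, `max_gap` shrinks toward zero, which is the kind of agreement between empirical and analytical distributions that the validation step is meant to quantify.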
- build_graph(fdr_correction_flag=True, alpha=0.05)
Validate the signature of the model using analytical methods. This function computes p-values and applies False Discovery Rate (FDR) correction. It supports two model types: 'bSRGM' and 'bSCM'.
Parameters:
- fdr_correction_flag : bool, optional
Flag to indicate whether to apply False Discovery Rate (FDR) correction. Default is True.
- alpha : float, optional
Significance level for statistical tests. Default is 0.05.
Returns:
A filtered signature matrix where elements are retained based on the significance level.
Raises:
- ValueError
If the predicted probabilities and conditional weights are not computed before validation. If the model specified is not valid.
Notes:
- For the 'bSRGM' model, p-values are computed using a binomial cumulative distribution function.
- For the 'bSCM' model, p-values are computed using the Poisson binomial distribution.
- The FDR correction is applied to the upper triangular part of the p-values matrix, and the corrected matrix is made symmetric.
- The filtered signature matrix is computed by retaining elements of the empirical signature matrix where the corrected p-values are below the significance level.
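The correction and filtering step can be sketched as follows. The page does not say which FDR procedure is used, so this example assumes Benjamini-Hochberg; `bh_adjust` and `filter_signature` are hypothetical names, and the p-values are made up for illustration.

```python
import numpy as np


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values for a 1D array (assumed FDR procedure)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, scanning from the largest p-value downwards.
    monotone = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(monotone, 0.0, 1.0)
    return adjusted


def filter_signature(signature, p_values, alpha=0.05):
    """Correct the upper triangle, symmetrize, and keep only significant entries."""
    n = signature.shape[0]
    upper = np.triu_indices(n, k=1)
    corrected = np.ones((n, n))
    corrected[upper] = bh_adjust(p_values[upper])
    corrected = np.minimum(corrected, corrected.T)  # mirror onto the lower triangle
    filtered = np.where(corrected < alpha, signature, 0)
    return filtered, corrected


# Illustrative 3-node example with made-up p-values.
signature = np.array([[0, 5, -4], [5, 0, 1], [-4, 1, 0]])
p_values = np.array([
    [1.0, 0.001, 0.02],
    [0.001, 1.0, 0.6],
    [0.02, 0.6, 1.0],
])
filtered, corrected = filter_signature(signature, p_values, alpha=0.05)
```

Correcting only the upper triangle avoids counting each symmetric pair twice, which would otherwise inflate the number of tests.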
- plot_graph(export_path='', show=True)
Plots the naive and filtered adjacency matrices as heatmaps.
Parameters:
- export_path : str, optional
The file path (excluding extension) where the plot will be saved as a PDF. If not provided, the plot will not be saved. Default is an empty string.
- show : bool, optional
If True, displays the plot. Default is True.
Raises:
- ValueError
If self.filtered_graph is None, indicating that the graph has not been built.
Notes:
The naive adjacency matrix is plotted on the left, and the filtered adjacency matrix is plotted on the right.
The heatmaps use a discrete colormap with three colors: red (-1), white (0), and blue (1).
If export_path is provided, the plot is saved as a PDF with the suffix “_adjacency.pdf”.
- community_detection(trials: int = 500, n_jobs: int = 1, method: str = 'bic', show: bool = False, random_state: int = 42, starter: str = 'uniform')
Detect communities in the current graph via greedy minimization with multiple randomized restarts.
This method partitions the nodes of self.graph into communities by greedily minimizing an objective function. Two objectives are supported:
- "bic": Bayesian Information Criterion of a signed stochastic block model (separate probabilities for positive and negative edges in each block);
- "frustration": signed network frustration, penalizing negative edges inside communities and positive edges across communities.
For robustness to local minima, the algorithm performs several independent trials, each starting from a different random community assignment. Trials are run in parallel and the partition with the lowest objective value is returned.
Parameters
- trials : int, optional
Number of independent random restarts (trials) of the greedy algorithm. Each trial starts from a different initial community assignment. Default is 500.
- n_jobs : int or None, optional
Number of parallel jobs used to run the trials. If None, uses self.n_jobs. Default is 1.
- method : {"bic", "frustration"}, optional
Objective to minimize. "bic" uses the BIC of a signed SBM; "frustration" uses network frustration. Default is "bic".
- show : bool, optional
If True, passes a verbose flag to the underlying parallel execution to log progress information. Default is False.
- random_state : int or None, optional
Seed for the global random number generator that produces per-trial seeds. Use this for reproducible community assignments. Default is 42.
- starter : str, optional
Strategy used to generate initial community labels for each trial. If "uniform", each trial starts from a shuffled identity labeling (one unique label per node). Any other value triggers a mixture strategy that randomly chooses between shuffled identity and a random partition into k communities (with 2 ≤ k ≤ min(10, N)). Default is "uniform".
Returns
- np.ndarray
One-dimensional array of length N with the community label of each node (labels are relabeled to be contiguous integers starting at 0). The same array is also stored in self.communities.
Raises
- ValueError
If self.graph is None (i.e., .build_graph() must be called first), or if method is not one of "bic" or "frustration".
Notes
The underlying graph is represented by self.graph, a signed adjacency matrix. For the BIC objective, a signed stochastic block model with separate probabilities for positive and negative edges in each community pair is fitted, and the BIC is computed as a penalized negative log-likelihood. For the frustration objective, the loss counts (with weights) negative edges inside communities and positive edges between communities.
During optimization, if a community becomes empty after a node move, the labels are renumbered so that community indices remain compact (0, 1, …, K-1).
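A minimal sketch of the frustration objective with greedy restarts is given below. It is an illustration under assumptions, not the library's algorithm: `frustration` and `greedy_trial` are hypothetical names, the move rule (try every existing community for each node and accept only strict improvements) is one plausible reading of "greedy minimization", only the "uniform" starter is shown, trials run sequentially rather than in parallel, and labels are compacted once at the end of each trial instead of after every move.

```python
import numpy as np


def frustration(graph, labels):
    """Weight of negative edges inside communities plus positive edges across them."""
    same = labels[:, None] == labels[None, :]
    upper = np.triu_indices(graph.shape[0], k=1)  # count each pair once
    weights, inside = graph[upper], same[upper]
    negative_inside = np.abs(weights[(weights < 0) & inside]).sum()
    positive_across = weights[(weights > 0) & ~inside].sum()
    return float(negative_inside + positive_across)


def greedy_trial(graph, seed):
    """One restart: shuffled identity labels, then single-node moves until no gain."""
    rng = np.random.default_rng(seed)
    n = graph.shape[0]
    labels = rng.permutation(n)  # "uniform" starter: one unique label per node
    best = frustration(graph, labels)
    improved = True
    while improved:
        improved = False
        for node in rng.permutation(n):
            current = labels[node]
            for candidate in np.unique(labels):
                if candidate == current:
                    continue
                labels[node] = candidate
                value = frustration(graph, labels)
                if value < best:
                    best, current, improved = value, candidate, True
            labels[node] = current  # keep the best label found for this node
    # Relabel to contiguous integers 0, 1, ..., K-1.
    _, compact = np.unique(labels, return_inverse=True)
    return best, compact


# Toy signed graph: two friendly blocks of three nodes, hostile across blocks.
block = np.array([0, 0, 0, 1, 1, 1])
graph = np.where(block[:, None] == block[None, :], 1, -1)
np.fill_diagonal(graph, 0)

# Per-trial seeds drawn from one global generator, as with random_state.
master = np.random.default_rng(42)
seeds = master.integers(0, 2**32 - 1, size=5)
results = [greedy_trial(graph, int(seed)) for seed in seeds]
best_value, best_labels = min(results, key=lambda result: result[0])
```

On this toy graph each trial recovers the two blocks with zero frustration; on real data, different restarts can end in different local minima, which is why the lowest-objective partition across trials is kept.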
- plot_communities(export_path='', show=True)
Plot reordered adjacency matrix by community labels with boxes.
Parameters:
- graph_type : str, optional
Either "naive" or "filtered" (default="filtered").
- export_path : str, optional
Path to save the PDF figure. If empty, the plot is not saved.
- show : bool, optional
If True, display the figure.