SciPy CSGraph – Compressed Sparse Graph

Graphs are powerful mathematical structures used to represent relationships between entities in various fields, including computer science, social networks, transportation systems, and more. Analyzing and computing graphs is a fundamental task in many applications, but it can be challenging, especially when dealing with large graphs with sparse connectivity. Fortunately, the scipy.sparse.csgraph subpackage in the SciPy library offers a comprehensive set of tools and algorithms specifically designed for efficient graph analysis using sparse matrix representations. Sparse matrices are matrices where the majority of elements are zero, making them ideal for representing and manipulating large graphs with sparse connectivity.

Note: Before going further, it is strongly recommended that you know how to create a sparse matrix in Python (see the article How to Create a Sparse Matrix in Python).

Key Functionalities of SciPy CSGraph

The scipy.sparse.csgraph subpackage offers a wide range of functionalities and algorithms for efficient graph analysis. Let’s delve into its key features:

Shortest Path Algorithms
- shortest_path: A general interface that runs the algorithm named in its method argument, or picks one automatically.
- Dijkstra’s Algorithm: Find shortest paths in graphs with non-negative edge weights using dijkstra.
- Bellman-Ford Algorithm: Compute shortest paths when some edge weights are negative using bellman_ford.
- Floyd-Warshall Algorithm: Compute the shortest paths between all pairs of nodes using floyd_warshall.
Connected Components
- connected_components: Identify the connected components of a graph, returning the number of components and a label for each node.
Minimum Spanning Tree
- minimum_spanning_tree: Compute a minimum spanning tree of an undirected graph, i.e. a subset of edges that connects all connected nodes with the minimum total weight. The result is returned as a Compressed Sparse Row (CSR) matrix.
Strongly Connected Components
- connected_components with connection='strong': Identify the strongly connected components of a directed graph. The default, connection='weak', returns the weakly connected components instead.
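
As a minimal sketch of the connected-components entries above, the following compares weakly and strongly connected components of a directed graph. The five-node graph and the variable names are made up for illustration and are not part of the original article.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Illustrative directed graph (an assumption, not from the article):
# 0 -> 1 -> 2 -> 0 forms a cycle, and 3 -> 4 is a single one-way edge.
adjacency = np.array([[0, 1, 0, 0, 0],
                      [0, 0, 1, 0, 0],
                      [1, 0, 0, 0, 0],
                      [0, 0, 0, 0, 1],
                      [0, 0, 0, 0, 0]])
graph = csr_matrix(adjacency)

# Weak connectivity ignores edge direction.
n_weak, weak_labels = connected_components(
    graph, directed=True, connection='weak')

# Strong connectivity requires a path in both directions.
n_strong, strong_labels = connected_components(
    graph, directed=True, connection='strong')

print("Weakly connected components:", n_weak)      # 2
print("Strongly connected components:", n_strong)  # 3
```

Nodes 3 and 4 share a weak component because an edge joins them, but they fall into separate strong components because there is no path back from 4 to 3.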

Creating CSGraph From Adjacency Matrix

Define an adjacency matrix that represents the connectivity of the graph.
Keep the matrix as a dense array (a NumPy array or nested list); a sparse matrix must first be converted back with toarray().
Use the csgraph_from_dense function to convert the dense array to a sparse graph representation. By default, zero entries are treated as missing edges.
The resulting graph is directed.

In this example, we use NumPy and SciPy to create an empty 3 x 3 matrix (three nodes, no edges) and then convert it into a graph.

Python3

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
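# Create an empty 3 x 3 sparse matrix and convert it back to a
# dense array, because csgraph_from_dense expects dense input
sparseMatrix = csr_matrix((3, 3),
                          dtype=np.int8).toarray()
 
# Convert the dense array to a graph (a CSR matrix with no edges)
graph = csgraph_from_dense(sparseMatrix)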
 
print(graph.toarray())

Output:

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

In this example, we first define the adjacency matrix of a directed graph and then convert it into a graph.

Python3

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
# Define the adjacency matrix for a directed graph
adjacency_matrix = [[0, 1, 0, 1],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]]
 
# Round-trip through CSR and back to a dense array;
# csgraph_from_dense expects dense input
graph_sparse = csr_matrix(adjacency_matrix).toarray()
 
# Convert the dense array to a sparse graph representation
graph = csgraph_from_dense(graph_sparse)
# Prints each edge as: (source, destination)    edge weight
print(graph)

Output:

  (0, 1)    1.0
  (0, 3)    1.0
  (1, 2)    1.0
  (2, 3)    1.0
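
One caveat is worth a short illustration: because csgraph_from_dense treats zero entries as missing edges by default, an edge with weight 0 is dropped. The sketch below follows the pattern shown in the SciPy documentation: it marks non-edges with np.inf and passes null_value=np.inf so that zero-weight edges are kept. The 3 x 3 matrix is an illustrative assumption.

```python
import numpy as np
from scipy.sparse.csgraph import csgraph_from_dense

# np.inf marks "no edge"; the 0 entries are real edges of weight 0.
dense = np.array([[np.inf, 2, 0],
                  [2, np.inf, np.inf],
                  [0, np.inf, np.inf]])

# Tell csgraph_from_dense which value means "no edge".
graph = csgraph_from_dense(dense, null_value=np.inf)

print("Stored edges:", graph.nnz)    # 4, including the two zero-weight edges
print("Edge weights:", graph.data)
```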

Creating CSGraph from Edge List

Define an edge list that represents the connectivity of the graph.
Convert the edge list to a sparse matrix representation (e.g., COO).
Convert the sparse matrix to a dense array with toarray() and pass it to the csgraph_from_dense function to obtain the graph representation.
The resulting graph is directed.

Python3

import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
# Create an empty 3 x 3 COO matrix (an edge list with no edges)
# and convert it to a dense array
edgeList = coo_matrix((3, 3),
                      dtype=np.int8).toarray()
 
# Convert the dense array to a graph
graph = csgraph_from_dense(edgeList)
 
print(graph.toarray())

Output:

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
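
The matrix above is empty, so it contains no edges. As a minimal sketch of the edge-list steps with actual edges, the following builds a COO matrix from (source, destination, weight) triples. The edge values and the names edges and n_nodes are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical edge list: (source, destination, weight)
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
n_nodes = 4

# Split the triples into separate row, column and data arrays.
sources = np.array([e[0] for e in edges])
targets = np.array([e[1] for e in edges])
weights = np.array([e[2] for e in edges])

# COO format stores exactly these three arrays.
graph_coo = coo_matrix((weights, (sources, targets)),
                       shape=(n_nodes, n_nodes))

# csgraph routines work on sparse input directly, so convert to CSR.
graph = graph_coo.tocsr()
print(graph.toarray())
```

A sparse matrix built this way can be passed straight to the csgraph routines, so the detour through a dense array is optional.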

Creating an Undirected Graph

To create an undirected graph with scipy.sparse.csgraph, use a symmetric adjacency matrix.

Symmetric Matrix: A matrix is symmetric when it is equal to its transpose. In other words, a square matrix is symmetric if the element at row i, column j equals the element at row j, column i for every i and j.

Python3

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
# Define the adjacency matrix for an undirected graph
# Here 1 represents the edge weight between source to destination
adjacency_matrix = [[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]]
 
# Symmetrize the matrix: entry (i, j) becomes the larger of
# the original entries (i, j) and (j, i)
adjacency_matrix = [[max(adjacency_matrix[i][j],
                         adjacency_matrix[j][i])
                     for j in range(len(adjacency_matrix))]
                    for i in range(len(adjacency_matrix))]
 
# Round-trip through CSR and back to a dense array;
# csgraph_from_dense expects dense input
graph_sparse = csr_matrix(adjacency_matrix).toarray()
 
# Convert the dense array to a sparse graph representation
graph = csgraph_from_dense(graph_sparse)
 
print(graph)

Output:

  (0, 1)    1.0
  (0, 3)    1.0
  (1, 0)    1.0
  (1, 2)    1.0
  (2, 1)    1.0
  (2, 3)    1.0
  (3, 0)    1.0
  (3, 2)    1.0
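
The symmetrization step can also be written with NumPy. The sketch below starts from a directed (asymmetric) adjacency matrix, mirrors every edge with np.maximum, and checks the result against its transpose. Using NumPy here is a substitution for the list comprehension shown above, and the variable names are illustrative.

```python
import numpy as np

# Directed adjacency matrix: edges 0->1, 0->3, 1->2, 2->3
directed = np.array([[0, 1, 0, 1],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1],
                     [0, 0, 0, 0]])

# Element-wise maximum of the matrix and its transpose mirrors each edge.
undirected = np.maximum(directed, directed.T)

# A matrix is symmetric when it equals its transpose.
was_symmetric = np.array_equal(directed, directed.T)
is_symmetric = np.array_equal(undirected, undirected.T)

print(undirected)
print("Symmetric before:", was_symmetric)  # False
print("Symmetric after:", is_symmetric)    # True
```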

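Syntax:

breadth_first_order(csgraph, i_start, directed=True, return_predecessors=True): Return a breadth-first ordering starting with the specified node.

Parameters:

csgraph : The N x N array or sparse matrix representing the input graph.

i_start : (int) The index of the starting node.

directed : (bool, optional) If True (default), follow edges only in their stored direction. If False, treat the graph as undirected.

return_predecessors : (bool, optional) If True (default), also return the array of predecessors.

Return:

node_array : (ndarray, one dimension) The breadth-first list of nodes, starting with the specified node. The length of node_array is the number of nodes reachable from the specified node.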

Python3

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order
 
adjMat = [
    [0, 1, 2, 0],
    [0, 0, 0, 1],
    [2, 0, 0, 3],
    [0, 0, 0, 0]
]
 
graph = csr_matrix(adjMat)
print(graph)  
 
# bfs start from Node 0
bfs = breadth_first_order(graph, 0, 
                          return_predecessors=False)
 
print("Breadth-first travelling order:", bfs)

Output:

  (0, 1)    1
  (0, 2)    2
  (1, 3)    1
  (2, 0)    2
  (2, 3)    3
Breadth-first travelling order: [0 1 2 3]
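
With return_predecessors=True, breadth_first_order also returns each node's parent in the breadth-first tree, which is enough to rebuild a path from the start node. The sketch below reuses the adjacency matrix from the example above. The helper path_to is a hypothetical name introduced here and is not part of SciPy.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order

adj_mat = [[0, 1, 2, 0],
           [0, 0, 0, 1],
           [2, 0, 0, 3],
           [0, 0, 0, 0]]
graph = csr_matrix(adj_mat)

# order: nodes in BFS order; predecessors[i]: parent of node i in the
# BFS tree (a negative sentinel for the start node and unreachable nodes).
order, predecessors = breadth_first_order(
    graph, i_start=0, directed=True, return_predecessors=True)


def path_to(target, predecessors, start):
    """Hypothetical helper: walk the predecessor array back to start."""
    path = [int(target)]
    while path[-1] != start:
        parent = int(predecessors[path[-1]])
        if parent < 0:          # no parent: target is unreachable
            return []
        path.append(parent)
    return path[::-1]


bfs_path = path_to(3, predecessors, start=0)
print("BFS order:", order)
print("Path from 0 to 3:", bfs_path)  # [0, 1, 3]
```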

depth_first_order(csgraph, i_start, directed=True, return_predecessors=True): Return a depth-first ordering starting with the specified node. The parameters have the same meaning as in breadth_first_order.

Python3

from scipy.sparse.csgraph import depth_first_order
 
# dfs Travel Start from Node 1
dfs = depth_first_order(graph, i_start=1,
                        return_predecessors=False)
print("Depth First Travelling order:", dfs)

Output:

Depth First Travelling order: [1 3]
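
Only nodes 1 and 3 appear because, in the directed graph, node 1 has a single outgoing edge. As a small sketch of the directed parameter, the following runs the same traversal with directed=False and compares the sets of reachable nodes. The exact visiting order is not shown, since it depends on the traversal details.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import depth_first_order

adj_mat = [[0, 1, 2, 0],
           [0, 0, 0, 1],
           [2, 0, 0, 3],
           [0, 0, 0, 0]]
graph = csr_matrix(adj_mat)

# Follow edges only in their stored direction.
directed_order = depth_first_order(
    graph, i_start=1, directed=True, return_predecessors=False)

# Treat every edge as usable in both directions.
undirected_order = depth_first_order(
    graph, i_start=1, directed=False, return_predecessors=False)

directed_reach = sorted(directed_order.tolist())
undirected_reach = sorted(undirected_order.tolist())
print("Reachable (directed):", directed_reach)      # [1, 3]
print("Reachable (undirected):", undirected_reach)  # [0, 1, 2, 3]
```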

Syntax:

shortest_path(csgraph, method='auto', directed=True, indices=None)

Parameters:

csgraph : The N x N array of distances representing the input graph.

method : (string ['auto'|'FW'|'D'|'BF'|'J'], optional) Algorithm to use for shortest paths. Options are:

'auto' – (default) select the best among 'FW', 'D', 'BF', or 'J' based on the input data.

'FW' – Floyd-Warshall algorithm. Computational cost is approximately O[N^3].

'D' – Dijkstra’s algorithm. It is intended for graphs with non-negative edge weights.

'BF' – Bellman-Ford algorithm. This algorithm can be used when weights are negative.

'J' – Johnson’s algorithm. Like the Bellman-Ford algorithm, it is designed for use when weights are negative.

directed : (bool, optional)

If True (default), find the shortest path on a directed graph.

If False, find the shortest path on an undirected graph.

indices : (array or int, optional) If specified, only compute the paths from the points at the given indices.

Returns:

dist_matrix : (ndarray) The N x N matrix of distances between graph nodes. dist_matrix[i, j] gives the shortest distance from point i to point j along the graph.

Python3

from scipy.sparse.csgraph import shortest_path
 
# Shortest-path distances from Node 1
# to all the remaining nodes
source = 1
dist1 = shortest_path(csgraph=graph,
                      method="auto",
                      directed=False,
                      indices=source)
print(f"Distance from Node {source} to remaining Nodes",
      dist1)
 
# Shortest-path distances between all pairs of nodes
dist_matrix = shortest_path(csgraph=graph,
                            method='FW',
                            directed=False)
print("Distance between all the Nodes\n",
      dist_matrix)

Output:

Distance from Node 1 to remaining Nodes [1. 0. 3. 1.]
Distance between all the Nodes
 [[0. 1. 2. 2.]
 [1. 0. 3. 1.]
 [2. 3. 0. 3.]
 [2. 1. 3. 0.]]

Output 1: dist1[j] is the shortest distance from the source node (Node 1 in this example) to Node j.
Output 2: dist_matrix[i, j] is the shortest-path distance from Node i to Node j.
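
The distances alone do not say which route was taken. As a minimal sketch, shortest_path also accepts return_predecessors=True, and the returned predecessor array can be walked backwards to recover the route. The graph is the one used above. The helper rebuild_path is a hypothetical name introduced for this illustration.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

adj_mat = [[0, 1, 2, 0],
           [0, 0, 0, 1],
           [2, 0, 0, 3],
           [0, 0, 0, 0]]
graph = csr_matrix(adj_mat)

source = 1
# pred[j] is the node visited just before j on the shortest path
# from source (a negative sentinel where no predecessor exists).
dist, pred = shortest_path(graph, method='D', directed=False,
                           indices=source, return_predecessors=True)


def rebuild_path(pred, source, target):
    """Hypothetical helper: follow predecessors from target to source."""
    path = [int(target)]
    while path[-1] != source:
        parent = int(pred[path[-1]])
        if parent < 0:          # target cannot be reached from source
            return []
        path.append(parent)
    return path[::-1]


route = rebuild_path(pred, source, target=2)
print("Distance 1 -> 2:", dist[2])  # 3.0
print("Route 1 -> 2:", route)       # [1, 0, 2]
```

The route goes through Node 0 (cost 1 + 2 = 3) rather than Node 3 (cost 1 + 3 = 4), which matches the distance of 3 reported in the output above.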

Syntax:

minimum_spanning_tree(csgraph, overwrite=False)

A minimum spanning tree is a graph consisting of the subset of edges which together connect all connected nodes, while minimizing the total sum of weights on the edges. It is computed using the Kruskal algorithm.

Parameters:

csgraph : The N x N matrix representing an undirected graph over N nodes.

overwrite : (bool, optional) If True, then parts of the input graph will be overwritten for efficiency. Default is False.

Return:

span_tree : (csr_matrix) The N x N compressed-sparse representation of the undirected minimum spanning tree over the input.

Python3

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
 
X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])
 
# Compute the minimum spanning tree
Tcsr = minimum_spanning_tree(X)
 
# Print the minimum spanning tree as a dense array
print(Tcsr.toarray())

Output:

[[0. 0. 0. 3.]
 [0. 0. 2. 5.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
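
Two quick sanity checks on this result can be sketched as follows: a spanning tree of a connected graph with N nodes has N - 1 edges, and its total weight is the sum of the stored entries. The variable names are illustrative.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])

tree = minimum_spanning_tree(X)

n_nodes = X.shape[0]
n_edges = tree.count_nonzero()      # edges kept in the tree
total_weight = float(tree.sum())    # 3 + 2 + 5

print("Edges in tree:", n_edges)        # 3, i.e. n_nodes - 1
print("Total weight:", total_weight)    # 10.0
```

The edges with weights 8 and 6 are left out because each would close a cycle among nodes that the lighter edges already connect.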

Syntax:

maximum_flow(csgraph, source, sink)

Parameters:

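csgraph : The square sparse matrix representing a directed graph, whose (i, j) entry is the integer capacity of the edge from node i to node j.

source : (int) The source node from which the flow starts.

sink : (int) The sink (destination) node to which the flow goes.

Return:

Returns an instance of the MaximumFlowResult class.

The attributes of the class are flow_value (the value of the maximum flow) and flow (a sparse matrix holding the flow along each edge).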

Python3

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow
 
# Define the adjacency matrix for a directed graph
adjacency_matrix = [[0, 16, 13, 0, 0, 0],
                    [0, 0, 0, 12, 0, 0],
                    [0, 4, 0, 0, 14, 0],
                    [0, 0, 9, 0, 0, 20],
                    [0, 0, 0, 7, 0, 4],
                    [0, 0, 0, 0, 0, 0]]
 
# Convert the adjacency matrix to CSR format
graph_sparse = csr_matrix(adjacency_matrix)
 
# Compute the maximum flow in the graph
flow_dict = maximum_flow(graph_sparse, 0, 5)
 
# Retrieve the maximum flow value
max_flow_value = flow_dict.flow_value
 
# Retrieve the flow distribution along the edges
flow_matrix = flow_dict.flow
 
print("Maximum Flow Value:", max_flow_value)
print("Flow Distribution:")
print(flow_matrix.toarray())

Output:

Maximum Flow Value: 23
Flow Distribution:
[[  0  12  11   0   0   0]
 [-12   0   0  12   0   0]
 [-11   0   0   0  11   0]
 [  0 -12   0   0  -7  19]
 [  0   0 -11   7   0   4]
 [  0   0   0 -19  -4   0]]

The maximum flow value is 23, indicating that a maximum of 23 units of flow can be sent from the source node to the sink node.
The flow distribution matrix shows the flow along each edge. For example, the element flow_matrix[0, 1] represents the flow from node 0 to node 1, which is 12. The negative entry flow_matrix[1, 0] = -12 records the same flow seen from the opposite direction.
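
The result can be checked against the basic properties of a flow. The sketch below assumes a SciPy version in which MaximumFlowResult exposes the flow attribute, as the example above does. It verifies that the flow leaving the source equals the flow entering the sink, that no edge exceeds its capacity, and that every intermediate node passes on exactly what it receives.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

capacity = np.array([[0, 16, 13, 0, 0, 0],
                     [0, 0, 0, 12, 0, 0],
                     [0, 4, 0, 0, 14, 0],
                     [0, 0, 9, 0, 0, 20],
                     [0, 0, 0, 7, 0, 4],
                     [0, 0, 0, 0, 0, 0]], dtype=np.int32)

result = maximum_flow(csr_matrix(capacity), 0, 5)
flow = result.flow.toarray()   # assumes the .flow attribute is available

# Net flow out of the source and into the sink.
out_of_source = int(flow[0].sum())
into_sink = int(flow[:, 5].sum())

# No edge may carry more than its capacity.
within_capacity = bool((flow <= capacity).all())

# Rows of intermediate nodes sum to zero: inflow equals outflow.
conserved = bool((flow[1:5].sum(axis=1) == 0).all())

print("Flow value:", result.flow_value)   # 23
print("Out of source:", out_of_source)    # 23
print("Into sink:", into_sink)            # 23
print("Within capacity:", within_capacity)
print("Flow conserved:", conserved)
```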

Directed v/s Undirected Graph

Aspect	Directed Graph	Undirected Graph
Edge Representation	Edges have a specific direction between nodes. For example, in the output of the second example (the directed adjacency matrix), there is a directed edge from 0 to 1, so we can move from 0 to 1 but not from 1 to 0.	Edges have no specific direction. For example, in the output of the undirected graph example, there is an edge from 0 to 1 and also an edge from 1 to 0, so we can move from 0 to 1 and from 1 to 0.
Symmetry	The adjacency matrix is generally asymmetric, i.e. the relationship between vertices is asymmetric. The adjacency matrix of the second example is asymmetric.	The adjacency matrix is symmetric, i.e. the relationship between vertices is symmetric. The adjacency matrix of the undirected graph example is symmetric.
Edge Notation	Represented as an ordered pair (source vertex, target vertex).	Represented as an unordered pair {vertex A, vertex B}.
Examples	Flow charts, one-way streets	Bidirectional streets

Conclusion:

Throughout this article, we explored the key features and functionalities of scipy.sparse.csgraph. We discussed how to create a graph using different methods such as COO matrix representation and dense matrix conversion. We learned about important graph algorithms like Dijkstra’s algorithm for finding the shortest paths and the maximum flow algorithm for network flow problems.

As you continue exploring the capabilities of scipy.sparse.csgraph, you’ll discover a rich collection of algorithms and methods that can be applied to a wide range of graph-related problems. From graph traversal and connectivity analysis to graph partitioning and network flow optimization, scipy.sparse.csgraph is a versatile tool that opens up a world of possibilities for graph analysis and optimization.
