Louvain
SQL function: cugraph_louvain
Detect communities with Louvain modularity optimization.
Signature
cugraph_louvain(table_name [, src_col, dst_col [, weight_col [, options_json]]])
Allowed argument counts: 1, 3, 4, 5.
Quickstart
SELECT * FROM cugraph_louvain('target_edges');
Positional arguments
| Argument | Type | Required | Default | Notes |
|---|---|---|---|---|
| table_name | Utf8 | yes | | |
| src_col | Utf8 | no | src | |
| dst_col | Utf8 | no | dst | |
| weight_col | Utf8 \| null | no | | Accepted as an edge-column binding, but native algorithm execution does not consume weights, so it has no semantic effect for this algorithm. |
| options_json | Utf8 | no | | |
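The four allowed argument counts can be sketched as follows. This is an illustration only: `target_edges` comes from the Quickstart, while the column names `from_id`, `to_id`, and `w` are hypothetical, and passing `NULL` for `weight_col` is an assumption based on the `Utf8 | null` type listed above.

```sql
-- 1 argument: edge table only; src_col and dst_col fall back to 'src' and 'dst'.
SELECT * FROM cugraph_louvain('target_edges');

-- 3 arguments: explicit endpoint columns (hypothetical names).
SELECT * FROM cugraph_louvain('target_edges', 'from_id', 'to_id');

-- 4 arguments: bind a weight column (hypothetical name); Louvain does not consume it.
SELECT * FROM cugraph_louvain('target_edges', 'from_id', 'to_id', 'w');

-- 5 arguments: no weight binding (assumed NULL) plus a JSON options string.
SELECT * FROM cugraph_louvain('target_edges', 'from_id', 'to_id', NULL, '{"max_level": 10}');
```

A two-argument call is not in the allowed list, so `src_col` and `dst_col` are always supplied together.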
JSON options
| Option | Type | Default | Constraints | Description |
|---|---|---|---|---|
| max_level | UInt32 | 100 | min 1 | |
| resolution | Float64 | 1 | > 0 | |
| threshold | Float64 | 1e-7 | min 0 | |
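A minimal sketch of setting all three options in one call, using the five-argument form. The values are arbitrary picks inside the listed constraints, not recommendations, and the `NULL` weight binding is an assumption. In upstream cuGraph, a lower `resolution` tends to produce fewer, larger communities and `threshold` is the modularity gain below which iteration stops; whether Nexus forwards these options unchanged is not stated here.

```sql
-- Fewer refinement levels, coarser resolution, looser convergence threshold.
SELECT *
FROM cugraph_louvain(
  'target_edges',
  'src',
  'dst',
  NULL,  -- assumed: no weight column bound
  '{"max_level": 20, "resolution": 0.5, "threshold": 1e-6}'
);
```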
Graph construction options
Shared by all cuGraph functions, shown here with this function's defaults. The construction_policy option controls whether Nexus requests Python cuGraph-compatible edge normalization or bypasses it for raw libcugraph-style construction; see graph construction options for the full policy guide.
| Option | Type | Default | Constraints | Description |
|---|---|---|---|---|
| construction_policy | Utf8 | "python_cugraph" | one of "python_cugraph", "raw_libcugraph" | Edge-list construction semantics used before calling libcugraph. |
| directed | Boolean | true | | Whether graph construction treats edges as directed. |
| renumber | Boolean | true | | Whether graph construction may renumber external vertex identifiers internally. |
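As a hedged illustration, the call below overrides two construction defaults. It assumes the graph construction options are passed in the same `options_json` string as the algorithm options, which this page implies but does not state outright; confirm with a dry run before relying on it.

```sql
-- Assumption: construction options share options_json with algorithm options.
SELECT *
FROM cugraph_louvain(
  'target_edges',
  'src',
  'dst',
  NULL,  -- assumed: no weight column bound
  '{"construction_policy": "python_cugraph", "directed": false}'
);
```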
Output schema
| Column | Type | Nullable | Description |
|---|---|---|---|
| vertex | Int64 | no | Vertex assigned to a Louvain community. |
| partition | Int64 | no | Community identifier assigned by Louvain. |
These are the generic registry schemas. Run cugraph_validate_call for the concrete, table-specific output schema of a particular call.
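Given the two output columns above, a common first query is the community size distribution. The sketch below reuses the Quickstart table `target_edges`; the aliases `community` and `members` are illustrative. The column name is double-quoted because `partition` is a SQL keyword.

```sql
-- Ten largest communities by vertex count.
SELECT "partition" AS community, COUNT(*) AS members
FROM cugraph_louvain('target_edges')
GROUP BY "partition"
ORDER BY members DESC
LIMIT 10;
```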
Examples
This example runs on the citation network demo dataset.
Do citation communities match the field labels?
Two views carve out the 2010s AI literature — nodes by field-of-study label,
edges where both endpoints qualify — and Louvain partitions it purely by
citation structure. Cross-tabulating each community against primary_fos
(with a window function to keep the top 3 labels per community) shows how
well the labels survive contact with the actual graph:
CREATE VIEW ai_nodes AS
SELECT paper_id FROM papers
WHERE year >= 2010 AND primary_fos IN (
'Deep learning', 'Artificial neural network', 'Convolutional neural network',
'Recurrent neural network', 'Natural language processing',
'Reinforcement learning', 'Image segmentation', 'Feature extraction',
'Object detection', 'Speech recognition');
CREATE VIEW ai_edges AS
SELECT e.src, e.dst
FROM citation_edges e
JOIN ai_nodes a ON a.paper_id = e.src
JOIN ai_nodes b ON b.paper_id = e.dst;
WITH community_fos AS (
SELECT c."partition" AS community, p.primary_fos, COUNT(*) AS n
FROM cugraph_louvain('ai_edges', 'src', 'dst') c
JOIN papers p ON p.paper_id = c.vertex
GROUP BY c."partition", p.primary_fos),
ranked AS (
SELECT SUM(n) OVER (PARTITION BY community) AS members,
ROW_NUMBER() OVER (PARTITION BY community ORDER BY n DESC) AS rn,
primary_fos,
n
FROM community_fos)
SELECT members, rn, primary_fos, n
FROM ranked
WHERE members > 2500 AND rn <= 3
ORDER BY members DESC, rn;
| members | rn | primary_fos | n |
|---|---|---|---|
| 10,684 | 1 | Convolutional neural network | 3,002 |
| 10,684 | 2 | Object detection | 2,467 |
| 10,684 | 3 | Deep learning | 2,247 |
| 6,707 | 1 | Deep learning | 2,168 |
| 6,707 | 2 | Convolutional neural network | 2,061 |
| 6,707 | 3 | Feature extraction | 778 |
| 5,118 | 1 | Deep learning | 1,591 |
| 5,118 | 2 | Recurrent neural network | 1,289 |
| 5,118 | 3 | Convolutional neural network | 714 |
| 3,960 | 1 | Reinforcement learning | 3,157 |
| 3,960 | 2 | Artificial neural network | 337 |
| 3,960 | 3 | Deep learning | 204 |
Louvain (1,960 communities over 38k papers) recovers real subfield borders:
a computer-vision community, a sequence-modeling community, and a
reinforcement-learning community that is 80% one label (3,157 of 3,960
papers). Note the quoted "partition": the output column name is a SQL
keyword. cugraph_leiden is a drop-in replacement with the same output shape.
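To illustrate the swap, the sketch below counts communities with both functions on the same `ai_edges` view. It assumes `cugraph_leiden` accepts the same positional arguments and returns the same `vertex` and `partition` columns, as the drop-in remark suggests; the counts will generally differ between the two algorithms.

```sql
-- Community counts from Louvain and Leiden over the same edge view.
SELECT 'louvain' AS algorithm, COUNT(DISTINCT c."partition") AS communities
FROM cugraph_louvain('ai_edges', 'src', 'dst') c
UNION ALL
SELECT 'leiden' AS algorithm, COUNT(DISTINCT c."partition") AS communities
FROM cugraph_leiden('ai_edges', 'src', 'dst') c;
```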
Limitations & notes
- dry-run validates table resolution, column presence, static dtypes, and options only
- dry-run does not scan edge data, construct a graph, or prove source-vertex existence
Validate before running
Always dry-run a call before executing it. Validation checks the function, table, columns, dtypes, and options without touching the GPU:
SELECT * FROM cugraph_validate_call(
'cugraph_louvain',
'your_edges_table',
'{"src_col":"src","dst_col":"dst"}'
);
See Discovery & validation for the full cugraph_validate_call contract.
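Applied to the citation example, the same pattern would look like the sketch below. It only substitutes the `ai_edges` view into the call shown above; whether views resolve the same way as base tables during a dry run is an assumption.

```sql
-- Dry-run the example call before executing it on the GPU.
SELECT * FROM cugraph_validate_call(
  'cugraph_louvain',
  'ai_edges',
  '{"src_col":"src","dst_col":"dst"}'
);
```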