K-Core
SQL function: cugraph_k_core
Returns the edge set of the graph's k-core: the maximal subgraph in which every vertex has degree at least k.
Signature
cugraph_k_core(table_name [, src_col, dst_col [, weight_col [, options_json]]])
Allowed argument counts: 1, 3, 4, 5.
Quickstart
SELECT * FROM cugraph_k_core('target_edges');
Positional arguments
| Argument | Type | Required | Default | Notes |
|---|---|---|---|---|
| table_name | Utf8 | yes | | |
| src_col | Utf8 | no | src | |
| dst_col | Utf8 | no | dst | |
| weight_col | Utf8\|null | no | | accepted as an edge-column binding; native algorithm execution does not consume weights; semantic effect: none for this algorithm |
| options_json | Utf8 | no | | |
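The positional rules above (argument counts 1, 3, 4, or 5) can be sketched as a small call builder. This is a minimal illustration only: `sql_literal` and `k_core_call` are hypothetical helper names, not part of any cuGraph or Nexus API, and the fallback to `src`/`dst` simply reuses the defaults documented in the table.

```python
def sql_literal(value):
    """Render a Python string as a SQL string literal, or None as NULL."""
    if value is None:
        return "NULL"
    # Double any embedded single quotes, per standard SQL string escaping.
    return "'" + value.replace("'", "''") + "'"


def k_core_call(table_name, src_col=None, dst_col=None,
                weight_col=None, options_json=None):
    """Hypothetical helper: build a cugraph_k_core(...) call expression.

    Only the documented argument counts (1, 3, 4, 5) can be produced.
    """
    if (src_col is None) != (dst_col is None):
        raise ValueError("src_col and dst_col must be given together")

    args = [table_name]
    if src_col is not None:
        args += [src_col, dst_col]
    elif weight_col is not None or options_json is not None:
        # Later positions need the column slots filled; use documented defaults.
        args += ["src", "dst"]

    if options_json is not None:
        # The weight slot must be present (NULL if unused) before options_json.
        args += [weight_col, options_json]
    elif weight_col is not None:
        args.append(weight_col)

    return "cugraph_k_core(" + ", ".join(sql_literal(a) for a in args) + ")"


one_arg = k_core_call("target_edges")
five_args = k_core_call("citation_edges", "src", "dst", None, '{"k":30}')
```

Here `one_arg` reproduces the Quickstart call, and `five_args` passes `NULL` in the weight slot so that `options_json` lands in the fifth position.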
JSON options
| Option | Type | Default | Constraints | Description |
|---|---|---|---|---|
| degree_type | Utf8 | "in_out" | one of "in", "out", "in_out" | |
| k | UInt32 | 2 | min 1 | |
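A client can check the constraints in this table before sending a query. The sketch below is one possible way to do that; `build_k_core_options` is a hypothetical helper name, and the upper bound on `k` is an assumption inferred from the `UInt32` type rather than a documented limit.

```python
import json

ALLOWED_DEGREE_TYPES = ("in", "out", "in_out")
UINT32_MAX = 2**32 - 1  # assumption: upper bound implied by the UInt32 type


def build_k_core_options(k=2, degree_type="in_out"):
    """Hypothetical helper: validate the options and return an options_json string."""
    if isinstance(k, bool) or not isinstance(k, int):
        raise TypeError("k must be an integer")
    if not 1 <= k <= UINT32_MAX:
        raise ValueError("k must be between 1 and 2**32 - 1")
    if degree_type not in ALLOWED_DEGREE_TYPES:
        raise ValueError("degree_type must be one of 'in', 'out', 'in_out'")
    # Compact separators keep the string short inside a SQL literal.
    return json.dumps({"degree_type": degree_type, "k": k}, separators=(",", ":"))


options_json = build_k_core_options(k=30)
```

The resulting string can be passed as the fifth positional argument; a `k` of 0 or an unknown `degree_type` is rejected locally instead of failing at query time.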
Graph construction options
Shared by all cuGraph functions, shown here with this function's defaults. The construction_policy option controls whether Nexus requests Python cuGraph-compatible edge normalization or bypasses it for raw libcugraph-style construction; see graph construction options for the full policy guide.
| Option | Type | Default | Constraints | Description |
|---|---|---|---|---|
| construction_policy | Utf8 | "python_cugraph" | one of "python_cugraph", "raw_libcugraph" | Edge-list construction semantics used before calling libcugraph. |
| directed | Boolean | false | | Whether graph construction treats edges as directed. |
| renumber | Boolean | true | | Whether graph construction may renumber external vertex identifiers internally. |
Output schema
| Column | Type | Nullable | Description |
|---|---|---|---|
| src | Int64 | no | Source vertex of an edge retained in the k-core subgraph. |
| dst | Int64 | no | Destination vertex of an edge retained in the k-core subgraph. |
These are the generic registry schemas. Run cugraph_validate_call for the concrete, table-specific output schema of a particular call.
Examples
These examples run on the citation network demo dataset.
Extract the citation backbone, then audit it with SQL
Unlike most functions here, cugraph_k_core returns edges, not scores:
the subgraph where every remaining vertex keeps at least k in-edges and
k out-edges. Because the output is reused as an edge relation, materialize it
in the local workspace; plain SQL can then verify the contract it promises:
-- Local workspace materialization; this does not write to lake.citation_network.
CREATE TABLE kcore30 AS
SELECT src, dst FROM cugraph_k_core('citation_edges', 'src', 'dst', NULL, '{"k":30}');
WITH deg AS (
SELECT v, SUM(o) AS outd, SUM(i) AS ind
FROM (SELECT src AS v, 1 AS o, 0 AS i FROM kcore30
UNION ALL
SELECT dst AS v, 0 AS o, 1 AS i FROM kcore30) u
GROUP BY v)
SELECT COUNT(*) AS vertices, MIN(outd) AS min_out, MIN(ind) AS min_in
FROM deg;
| vertices | min_out | min_in |
|---|---|---|
| 33,389 | 30 | 30 |
The 45.6M input edges collapse to a 33k-vertex backbone of papers that both cite and are cited heavily, and the audit confirms every vertex meets the k=30 floor in both directions.
The k-core edge list comes back symmetrized: each undirected core edge
appears in both directions (2,043,052 rows here, i.e. ~1.0M undirected
edges). Note also that this per-direction floor is a stricter condition than
cugraph_core_number's in_out degree, which sums the two directions.
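To make the symmetrized output and the degree audit concrete, here is a minimal pure-Python sketch of k-core peeling on an undirected graph. It is not the libcugraph implementation: the function names `k_core_edges` and `degree_audit` are hypothetical, and the sketch assumes self-loops and duplicate edges are ignored. It shows that on a symmetrized edge list each vertex's in-degree equals its out-degree, so the two minima reported by the audit coincide.

```python
from collections import Counter, defaultdict


def k_core_edges(edges, k):
    """Return the symmetrized edge list of the undirected k-core (sketch)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # assumption: self-loops are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)

    # Repeatedly peel vertices whose remaining degree is below k.
    stack = [v for v in adj if len(adj[v]) < k]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for w in adj[v]:
            adj[w].discard(v)
            if w not in removed and len(adj[w]) < k:
                stack.append(w)
        adj[v].clear()

    # Each surviving undirected edge is emitted once per direction.
    return sorted((u, v) for u in adj for v in adj[u])


def degree_audit(rows):
    """Mirror the SQL audit: (vertex count, min out-degree, min in-degree)."""
    outd = Counter(s for s, _ in rows)
    ind = Counter(d for _, d in rows)
    vertices = set(outd) | set(ind)
    return (len(vertices),
            min((outd[v] for v in vertices), default=0),
            min((ind[v] for v in vertices), default=0))


# A 4-clique on vertices 0..3 with a two-vertex tail hanging off vertex 0.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4), (4, 5)]
core = k_core_edges(toy_edges, 3)
audit = degree_audit(core)  # (4, 3, 3): the clique survives, the tail is peeled
```

With k=3 the tail vertices 4 and 5 are peeled away and the six clique edges come back as twelve directed rows, which is the same two-rows-per-edge pattern as the 2,043,052-row result above.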
Chain it into the next algorithm
The materialized backbone is itself a valid edge relation, so it can feed
another cugraph_* call — a two-stage GPU pipeline held together by a local
workspace table name:
SELECT p.year, p.title
FROM cugraph_pagerank('kcore30', 'src', 'dst') r
JOIN papers p ON p.paper_id = r.vertex
ORDER BY r.value DESC
LIMIT 5;
| year | title |
|---|---|
| 2004 | Distinctive Image Features from Scale-Invariant Keypoints |
| 2014 | VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION |
| 2005 | Histograms of oriented gradients for human detection |
| 2009 | ImageNet: A large-scale hierarchical image database |
| 2016 | Deep Residual Learning for Image Recognition |
Limitations & notes
- dry-run validates table resolution, column presence, static dtypes, and options only
- dry-run does not scan edge data, construct a graph, or prove source-vertex existence
Validate before running
Always dry-run a call before executing it. Validation checks the function, table, columns, dtypes, and options without touching the GPU:
SELECT * FROM cugraph_validate_call(
'cugraph_k_core',
'your_edges_table',
'{"src_col":"src","dst_col":"dst"}'
);
See Discovery & validation for the full cugraph_validate_call contract.