Short answer: yes, this is very feasible.

Below I give an end-to-end plan with concrete SQL and Python snippets, recommended hyperparameters, evaluation and monitoring, pitfalls and mitigations, and rollout steps. It covers topic clustering plus automated RCA and summarization for ServiceNow incidents, with the goal of reducing reassignment counts.

I’ll present this as a staged pipeline. Read through once for the full flow, then use the code snippets in each stage.


1) High-level architecture (overview)

  1. Aggregate and clean journal rows per incident in Snowflake → produce document units (per-incident or per-chunk).
  2. Create embeddings for those documents (Cortex embeddings) and store them.
  3. Cluster embeddings (unsupervised) to find topic buckets.
  4. Build cluster-level summaries and extract resolution steps / candidate RCA using Cortex (LLM).
  5. Map clusters to assignment groups / possible RCA labels (rule suggestion + supervised model).
  6. Deploy routing/suggestion system with human-in-the-loop and measurement (A/B test to reduce reassignment).
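
To make the flow of stages 1, 2, 3 and 5 concrete, here is a minimal, self-contained Python sketch on toy data. Everything in it is an assumption for illustration: the sample rows, the function names, the bag-of-words `embed` (a local stand-in, not a Cortex embedding call), the greedy leader-style clustering with a 0.3 cosine threshold (a stand-in for whichever clustering algorithm you pick), and the majority-vote mapping to assignment groups (a stand-in for the rule suggestion + supervised model). Stage 4 (LLM summaries) is omitted because it needs a live model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical journal rows: (incident_id, journal_text, final_assignment_group)
ROWS = [
    ("INC001", "VPN client fails to connect", "Network"),
    ("INC001", "Reset VPN profile, connect works", "Network"),
    ("INC002", "VPN connect timeout from home", "Network"),
    ("INC003", "Password reset for locked account", "Service Desk"),
    ("INC004", "Account locked after password change", "Service Desk"),
]

def tokenize(text):
    """Lowercase, whitespace-split tokens with trailing punctuation stripped."""
    return [w.strip(",.").lower() for w in text.split()]

def aggregate(rows):
    """Stage 1: concatenate journal entries into one document per incident."""
    parts, groups = defaultdict(list), {}
    for incident_id, text, group in rows:
        parts[incident_id].append(text)
        groups[incident_id] = group
    docs = {inc: " ".join(texts) for inc, texts in parts.items()}
    return docs, groups

def embed(text, vocab):
    """Stage 2 stand-in: L2-normalised bag-of-words vector (NOT a real embedding)."""
    counts = Counter(tokenize(text))
    vec = [counts[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))

def cluster(vectors, threshold=0.3):
    """Stage 3 stand-in: greedy leader clustering.

    Each document joins the first cluster whose seed vector is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    seeds, labels = [], {}
    for inc, vec in vectors.items():
        for idx, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels[inc] = idx
                break
        else:
            seeds.append(vec)
            labels[inc] = len(seeds) - 1
    return labels

def suggest_groups(labels, groups):
    """Stage 5 stand-in: map each cluster to its most common assignment group."""
    by_cluster = defaultdict(Counter)
    for inc, cluster_id in labels.items():
        by_cluster[cluster_id][groups[inc]] += 1
    return {cid: cnt.most_common(1)[0][0] for cid, cnt in by_cluster.items()}

def run_pipeline(rows):
    """Chain the stages: aggregate, embed, cluster, suggest a group per cluster."""
    docs, groups = aggregate(rows)
    vocab = sorted({tok for text in docs.values() for tok in tokenize(text)})
    vectors = {inc: embed(text, vocab) for inc, text in docs.items()}
    labels = cluster(vectors)
    return labels, suggest_groups(labels, groups)
```

Calling `run_pipeline(ROWS)` returns a cluster label per incident and a suggested assignment group per cluster. In the real pipeline each stand-in is replaced by the corresponding stage described in the detailed plan; the function boundaries are the part worth keeping, since they let you swap embedding or clustering methods without touching the rest.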

2) Detailed step-by-step plan (with code & params)

Step 0 — prerequisites & safety


Step 1 — Preprocessing & aggregation (Snowflake SQL)