
Update blog post draft about Cyberpunk 2077 #87

Open
stichbury wants to merge 8 commits into kedro-org:main from stichbury:docs/update-blog-post-draft


Conversation

@stichbury
Contributor

@lrcouto
Contributor

lrcouto commented Nov 28, 2025

Added the relevant skills part, capitalized Cyberpunk in the title, and added a link to the project at the bottom of the post.

For graphs to illustrate, here are a couple of options I thought could work:

A flowchart showing how the project works after introducing the Wiki data; it could fit right before the "Making it a conversation" section:

flowchart TD
    A["Data Sources"] --> B["Transcript<br/>400 pages"]
    A --> C["Clean wiki data<br/>13,000 pages"]
    
    B --> D["Processing Pipeline"]
    C --> D
    
    D --> E["Chunk transcript"]
    
    E --> G["Generate embeddings"]
    
    G --> H[("Kedro Datasets:<br/>Chunked Data<br/>Wiki Embeddings<br/>Character List")]
    
    I["User Query"] --> J["Query Pipeline"]
    H --> J
    
    J --> K["Find relevant<br/>contexts"]
    J --> L["Extract character<br/>mentions"]
    
    K --> M["Semantic Search<br/>with embeddings"]
    L --> M
    
    M --> N["Select most <br/>relevant chunks"]
    
    N --> O["Format Prompt from LangChainPromptDataset with Context"]
    
    O --> P["Prompt LLM"]
    
    P --> Q["Response"]
    
    style A fill:#e1f5ff
    style H fill:#fff3e0
    style I fill:#f3e5f5
    style Q fill:#e8f5e9
    style P fill:#fce4ec
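The query-time steps in the diagram (semantic search with embeddings, then selecting the most relevant chunks) boil down to a nearest-neighbour lookup over precomputed vectors. A minimal plain-Python sketch of that idea; the function names and the toy 3-d embeddings are illustrative, not taken from the project:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_relevant_chunks(query_embedding, chunk_embeddings, chunks, top_k=2):
    """Rank chunks by similarity to the query embedding and keep the top_k."""
    scored = sorted(
        zip(chunks, chunk_embeddings),
        key=lambda pair: cosine_similarity(query_embedding, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:top_k]]

# Toy example: three chunks with hand-made 3-d embeddings.
chunks = ["Johnny backstory", "Night City districts", "Arasaka tower"]
embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]

print(select_relevant_chunks(query, embeddings, chunks, top_k=2))
```

The selected chunks are what gets folded into the prompt in the "Format Prompt" step.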

The same thing, with slightly different formatting:

flowchart TD

    %% --- DATA SECTION ---
    subgraph Data["Raw Data"]
        direction TB
        T["Transcript<br/>(400 pages)"]
        W["Clean Wiki Data<br/>(13k pages)"]
    end

    %% --- PROCESSING SECTION ---
    subgraph Process["Processing Pipeline"]
        direction TB
        C["Chunk Transcript"]
        E["Generate Embeddings"]
    end

    %% --- STORAGE SECTION ---
    subgraph Storage["Processed Kedro Datasets"]
        direction TB
        KD1["Chunked Data"]
        KD2["Wiki Embeddings"]
        KD3["Character List"]
    end

    %% --- QUERY SECTION ---
    subgraph Query["Prepare Prompt"]
        direction TB
        R["Find Contexts"]
        M["Extract Character<br/>Mentions"]
        S["Semantic Search"]
        P["Select Relevant<br/>Chunks"]
        F["Format Prompt"]
    end

    %% --- OUTPUT SECTION ---
    subgraph Output["LLM Interaction"]
        direction TB
        LLM["Prompt LLM"]
        RSP["Get Response"]
    end

    %% --- ARROWS BETWEEN BLOCKS ---
    Data --> Process
    Process --> Storage
    Storage --> Query
    Query --> Output

    %% --- COLORS ---
    style Data fill:#e1f5ff,stroke:#c3dff4
    style Storage fill:#fff3e0,stroke:#f6ddb7
    style Query fill:#f3e5f5,stroke:#e4cae7
    style Output fill:#fce4ec,stroke:#f7ccd9

Explaining how prompt history is saved and reused as context in the CLI conversational chatbot:

graph TD
    A["CLI Query Node Starts"] --> B["Initialize LLM<br/>& History"]
    
    B --> C["Loop Iteration"]
    
    C --> D["Get User Input"]
    
    D --> E{"Exit?"}
    
    E -->|Yes| F["Return"]
    
    E -->|No| G["Find Contexts<br/>& Format Prompt"]
    
    G --> H["Add to History"]
    
    H --> I["Query LLM"]
    
    I --> J["Display Response"]
    
    J --> K["Append to History"]
    
    K --> C
    
    style A fill:#e3f2fd
    style C fill:#fff3e0
    style D fill:#f3e5f5
    style K fill:#e8f5e9
    style F fill:#ffebee
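The loop in the diagram reduces to: build the prompt from the accumulated history, query, then append the new exchange so the next turn sees it. A hedged plain-Python sketch (`format_prompt`, `run_turn`, and the echo stand-in for the LLM are all hypothetical names; the real node reads input interactively):

```python
def format_prompt(history, user_input):
    """Fold the running history into the prompt so the LLM sees prior turns."""
    past = "\n".join(f"User: {q}\nBot: {a}" for q, a in history)
    return f"{past}\nUser: {user_input}" if past else f"User: {user_input}"

def run_turn(history, user_input, query_llm):
    """One loop iteration: build prompt from history, query, record the exchange."""
    prompt = format_prompt(history, user_input)
    response = query_llm(prompt)
    history.append((user_input, response))
    return response

# Stand-in for the real LLM call: reports how much context it was given.
echo_llm = lambda prompt: f"context lines: {len(prompt.splitlines())}"

history = []
run_turn(history, "Who is Johnny?", echo_llm)   # sees 1 line of context
run_turn(history, "Tell me more.", echo_llm)    # sees the prior exchange too
```

Each turn's prompt grows with the history, which is exactly why the "Append to History" box loops back to the next iteration.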

Showing how tags let a single pipeline serve both CLI and Discord outputs:

graph TD
    A["User Query"] --> B["find_relevant_contexts<br/>tags: cli, discord"]
    
    B --> C["format_prompt_with_context<br/>tags: cli, discord"]
    
    C --> D{"Which tags<br/>to run?"}
    
    D -->|kedro run --tags=cli| E["query_llm_cli<br/>tags: cli"]
    D -->|kedro run --tags=discord| F["query_llm_discord<br/>tags: discord"]
    
    E --> G["Interactive Loop<br/>Maintains History"]
    F --> H["Discord Response<br/>Chunked Message"]
    
    G --> I["CLI Output"]
    H --> J["Discord Message"]
    
    style A fill:#f3e5f5
    style B fill:#e8f5e9
    style C fill:#e8f5e9
    style D fill:#fff3e0
    style E fill:#e3f2fd
    style F fill:#fce4ec
    style G fill:#e3f2fd
    style H fill:#fce4ec
    style I fill:#c8e6c9
    style J fill:#f8bbd0
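Conceptually, tag selection is just filtering the node set: the shared upstream nodes carry both tags, so either `kedro run --tags=...` invocation includes them and only the output node differs. A toy sketch of that mechanism (the `Node` dataclass here is an illustrative stand-in, not Kedro's actual API; the node names are the ones in the diagram):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    tags: set = field(default_factory=set)

pipeline = [
    Node("find_relevant_contexts", {"cli", "discord"}),
    Node("format_prompt_with_context", {"cli", "discord"}),
    Node("query_llm_cli", {"cli"}),
    Node("query_llm_discord", {"discord"}),
]

def filter_by_tag(nodes, tag):
    """Keep only the nodes carrying the requested tag."""
    return [n.name for n in nodes if tag in n.tags]

print(filter_by_tag(pipeline, "cli"))
print(filter_by_tag(pipeline, "discord"))
```

Both selections share the first two nodes, so the retrieval and prompt-formatting logic is written once.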

A screenshot of the bot's output on Discord could be cool too:

[image: screenshot of the bot's output on Discord]

Contributor Author

@stichbury stichbury left a comment


Looks great to me, thanks 🎉

@stichbury
Contributor Author

Hi @lrcouto, I have realised that the problem with the graphics is that they are very tall vertically, and this is going to consume a lot of screen real estate in an already long article. Folks are going to get RSI from scrolling!

Would it be possible to convert the Mermaid code to flow horizontally rather than vertically?

@lrcouto
Contributor

lrcouto commented Dec 2, 2025

It looks super long on the horizontal as well. Maybe we can break the flowchart in two like this?

flowchart LR
    A["Data Sources"] --> B["Transcript<br/>(400 pages)"]
    A --> C["Clean Wiki Data<br/>(13,000 pages)"]

    B --> D["Processing Pipeline"]
    C --> D

    D --> E["Chunk Transcript"]
    E --> G["Generate Embeddings"]

    G --> H[("Kedro Datasets:<br/>• Chunked Data<br/>• Wiki Embeddings<br/>• Character List")]

    style A fill:#e1f5ff
    style H fill:#fff3e0
flowchart LR

    H[("Kedro Datasets:<br/>• Chunked Data<br/>• Wiki Embeddings<br/>• Character List")]

    I["User Query"] --> J["Query Pipeline"]
    H --> J

    J --> K["Find Relevant<br/>Contexts"]
    J --> L["Extract Character<br/>Mentions"]

    K --> M["Semantic Search<br/>with Embeddings"]
    L --> M

    M --> N["Select Most<br/>Relevant Chunks"]
    N --> O["Format Prompt"]
    O --> P["Prompt LLM"]
    P --> Q["Response"]

    style H fill:#fff3e0
    style I fill:#f3e5f5
    style P fill:#fce4ec
    style Q fill:#e8f5e9

@lrcouto
Contributor

lrcouto commented Dec 2, 2025

The other two are too small to break, I think. Here's what they look like horizontally.

graph LR
    A["CLI Query Node Starts"] --> B["Initialize LLM<br/>& History"]
    
    B --> C["Loop Iteration"]
    
    C --> D["Get User Input"]
    
    D --> E{"Exit?"}
    
    E -->|Yes| F["Return"]
    
    E -->|No| G["Find Contexts<br/>& Format Prompt"]
    
    G --> H["Add to History"]
    
    H --> I["Query LLM"]
    
    I --> J["Display Response"]
    
    J --> K["Append to History"]
    
    K --> C
    
    style A fill:#e3f2fd
    style C fill:#fff3e0
    style D fill:#f3e5f5
    style K fill:#e8f5e9
    style F fill:#ffebee
graph LR
    A["User Query"] --> B["find_relevant_contexts<br/>tags: cli, discord"]
    
    B --> C["format_prompt_with_context<br/>tags: cli, discord"]
    
    C --> D{"Which tags<br/>to run?"}
    
    D -->|kedro run --tags=cli| E["query_llm_cli<br/>tags: cli"]
    D -->|kedro run --tags=discord| F["query_llm_discord<br/>tags: discord"]
    
    E --> G["Interactive Loop<br/>Maintains History"]
    F --> H["Discord Response<br/>Chunked Message"]
    
    G --> I["CLI Output"]
    H --> J["Discord Message"]
    
    style A fill:#f3e5f5
    style B fill:#e8f5e9
    style C fill:#e8f5e9
    style D fill:#fff3e0
    style E fill:#e3f2fd
    style F fill:#fce4ec
    style G fill:#e3f2fd
    style H fill:#fce4ec
    style I fill:#c8e6c9
    style J fill:#f8bbd0
