69 changes: 42 additions & 27 deletions conf/base/catalog.yml
@@ -1,46 +1,61 @@
-pypi_kedro_downloads:
-  type: kedro_datasets.snowflake.SnowparkTableDataset
-  table_name: "V_PYPI_KEDRO_DOWNLOADS"
-  database: "KEDRO_BI_DB" # Snowflake database
-  schema: "PYPI" # Schema inside the database
-  credentials: snowflake_credentials
+heap_project_statistics:
+  type: kedro_datasets.ibis.TableDataset
+  table_name: KEDRO_PROJECT_STATISTICS
+  database: HEAP_FRAMEWORK_VIZ_PRODUCTION.HEAP
+  connection:
+    backend: snowflake
+    credentials: heap_snowflake_credentials
 
-# Save results locally as CSV
-pypi_kedro_downloads_table:
-  type: pandas.CSVDataset
-  filepath: data/02_intermediate/pypi_kedro_downloads.csv
-  save_args:
-    index: False
+heap_any_command_run:
+  type: kedro_datasets.ibis.TableDataset
+  table_name: ANY_COMMAND_RUN
+  database: HEAP_FRAMEWORK_VIZ_PRODUCTION.HEAP
+  connection:
+    backend: snowflake
+    credentials: heap_snowflake_credentials
 
 new_kedro_users_monthly:
-  type: pandas.CSVDataset
+  type: kedro_datasets.ibis.FileDataset
   filepath: data/02_intermediate/new_kedro_users_monthly.csv
-  save_args:
-    index: False
+  file_format: csv
 
 mau_kedro:
-  type: pandas.CSVDataset
+  type: kedro_datasets.ibis.FileDataset
   filepath: data/02_intermediate/mau_kedro.csv
-  save_args:
-    index: False
+  file_format: csv
 
 kedro_plugins_mau:
-  type: pandas.CSVDataset
+  type: kedro_datasets.ibis.FileDataset
   filepath: data/02_intermediate/kedro_plugins_mau.csv
+  file_format: csv
 
 kedro_commands_mau:
-  type: pandas.CSVDataset
+  type: kedro_datasets.ibis.FileDataset
   filepath: data/02_intermediate/kedro_commands_mau.csv
+  file_format: csv
 
+pypi_kedro_downloads:
+  type: kedro_datasets.ibis.TableDataset
+  table_name: V_PYPI_KEDRO_DOWNLOADS
+  database: KEDRO_BI_DB.PYPI
+  connection:
+    backend: snowflake
+    credentials: snowflake_credentials
+
+pypi_kedro_downloads_table:
+  type: kedro_datasets.ibis.FileDataset
+  filepath: data/02_intermediate/pypi_kedro_downloads.csv
+  file_format: csv
+
 downloads_by_country:
-  type: kedro_datasets.snowflake.SnowparkTableDataset
-  table_name: "V_DOWNLOADS_BY_COUNTRY"
-  database: "KEDRO_BI_DB"
-  schema: "PYPI"
+  type: kedro_datasets.ibis.TableDataset
+  table_name: V_DOWNLOADS_BY_COUNTRY
+  database: KEDRO_BI_DB.PYPI
+  connection:
+    backend: snowflake
+    credentials: snowflake_credentials
 
 downloads_by_country_table:
-  type: pandas.CSVDataset
+  type: kedro_datasets.ibis.FileDataset
   filepath: data/02_intermediate/downloads_by_country.csv
-  save_args:
-    index: False
+  file_format: csv
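One convention worth noting in the new entries: where the Snowpark datasets took separate database and schema keys, the ibis TableDataset entries fold both into a single dotted database value (e.g. KEDRO_BI_DB.PYPI). A tiny illustrative helper capturing that convention (hypothetical, not part of kedro-datasets):

```python
def to_ibis_database(database: str, schema: str) -> str:
    """Join a Snowflake database and schema into the dotted form used
    by the ibis TableDataset entries above, e.g. KEDRO_BI_DB + PYPI
    -> 'KEDRO_BI_DB.PYPI'. Illustrative helper only."""
    return f"{database}.{schema}"
```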
7 changes: 7 additions & 0 deletions conf/base/credentials.yml
@@ -5,3 +5,10 @@ snowflake_credentials:
   password: ${oc.env:SNOWFLAKE_PASSWORD}
   warehouse: "KEDRO_BI_WH_WH"
 
+heap_snowflake_credentials:
+  account: ${oc.env:SNOWFLAKE_ACCOUNT}
+  user: ${oc.env:SNOWFLAKE_USER}
+  password: ${oc.env:SNOWFLAKE_PASSWORD}
+  role: HEAP_NTD_KEDRO
+  warehouse: HEAP_NTD_KEDRO_WH
+
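The ${oc.env:...} placeholders are resolved by OmegaConf's env resolver when the config is loaded, so no Snowflake secret is committed to the repo. A rough stdlib sketch of the substitution (illustrative only; the real resolver is OmegaConf's):

```python
import os
import re

# Matches ${oc.env:VAR_NAME} placeholders as used in credentials.yml.
_ENV_RE = re.compile(r"\$\{oc\.env:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value: str) -> str:
    """Replace ${oc.env:VAR} placeholders with environment values,
    raising KeyError if a variable is unset. Sketch of what OmegaConf
    does, not the actual implementation."""
    return _ENV_RE.sub(lambda m: os.environ[m.group(1)], value)
```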
26 changes: 21 additions & 5 deletions conf/base/parameters_telemetry_data.yml
deepyaman marked this conversation as resolved.
@@ -1,5 +1,21 @@
-# This is a boilerplate parameters config generated for pipeline 'telemetry_data'
-# using Kedro 1.0.0.
-#
-# Documentation for this file format can be found in "Parameters"
-# Link: https://docs.kedro.org/en/1.0.0/configuration/parameters.html
+plugins:
+  - "kedro mlflow"
+  - "kedro docker"
+  - "kedro airflow"
+  - "kedro databricks"
+  - "kedro azureml"
+  - "kedro vertexai"
+  - "kedro gql"
+  - "kedro boot"
+  - "kedro sagemaker"
+  - "kedro coda"
+  - "kedro kubeflow"
+
+commands:
+  - "kedro run"
+  - "kedro viz"
+  - "kedro new"
+  - "kedro pipeline"
+  - "kedro jupyter"
+  - "kedro ipython"
+  - "kedro package"
4 changes: 1 addition & 3 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -2,6 +2,4 @@ ipython>=8.10
 jupyterlab>=3.0
 notebook
 kedro~=1.0.0
-kedro-datasets[snowflake]
-pandas
-pyarrow
+kedro-datasets[ibis-snowflake]
9 changes: 5 additions & 4 deletions src/kedro_pycafe_data/pipelines/data_transfer/nodes.py
@@ -1,4 +1,5 @@
-def fetch_and_save(snowpark_df):
-    """Simply returns data read from Snowflake so Kedro saves it as CSV."""
-    pdf = snowpark_df.to_pandas()
-    return pdf
+import ibis.expr.types as ir
+
+
+def identity(tbl: ir.Table) -> ir.Table:
+    return tbl
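The node body can shrink to an identity function because the I/O has moved into the datasets: the TableDataset's load yields a lazy ibis expression, and the FileDataset's save materialises it to CSV. A minimal stand-in sketch of that runner contract (toy classes, not the kedro-datasets API):

```python
class SourceDataset:
    """Stand-in for an ibis TableDataset: load() hands back a value."""

    def __init__(self, data):
        self._data = data

    def load(self):
        return self._data


class SinkDataset:
    """Stand-in for an ibis FileDataset: save() materialises a value."""

    def __init__(self):
        self.saved = None

    def save(self, data):
        self.saved = data


def identity(tbl):
    return tbl


def run_node(func, source, sink):
    # This is roughly what a runner does around every node:
    # load the inputs, call the function, save the outputs.
    sink.save(func(source.load()))
```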
19 changes: 8 additions & 11 deletions src/kedro_pycafe_data/pipelines/data_transfer/pipeline.py
@@ -1,25 +1,22 @@
-"""
-This is a boilerplate pipeline 'data_transfer'
-generated using Kedro 1.0.0
-"""
-
-from kedro.pipeline import Node, Pipeline  # noqa
-from .nodes import fetch_and_save
+from kedro.pipeline import Pipeline, node
+
+from .nodes import identity
+
 
 def create_pipeline(**kwargs) -> Pipeline:
     return Pipeline(
         [
-            Node(
-                func=fetch_and_save,
+            node(
+                identity,
                 inputs="pypi_kedro_downloads",
                 outputs="pypi_kedro_downloads_table",
                 name="fetch_and_save_snowflake_data",
             ),
-            Node(
-                func=fetch_and_save,
+            node(
+                identity,
                 inputs="downloads_by_country",
                 outputs="downloads_by_country_table",
                 name="fetch_and_save_downloads_by_country",
             ),
         ]
     )

Review comment on the first node( line: We are following the updated pattern of using capital Node and Pipeline. @jitu5, is that right?