sempy_labs.dataflow package
Module contents
- sempy_labs.dataflow.assign_workspace_to_dataflow_storage(dataflow_storage_account: str, workspace: str | UUID | None = None)
Assigns a dataflow storage account to a workspace.
This is a wrapper function for the following API: Dataflow Storage Accounts - Groups AssignToDataflowStorage.
- Parameters:
- sempy_labs.dataflow.discover_dataflow_parameters(dataflow: str | UUID, workspace: str | UUID) → DataFrame
Retrieves all parameters defined in the specified Dataflow.
This is a wrapper function for the following API: Items - Discover Dataflow Parameters.
Service Principal Authentication is supported.
- Parameters:
- Returns:
A pandas dataframe showing all parameters defined in the specified Dataflow.
- Return type:
pandas.DataFrame
- sempy_labs.dataflow.execute_query(dataflow: str | UUID, query_name: str, custom_mashup_document: str | None = None, workspace: str | UUID | None = None) → DataFrame
Executes a query against a dataflow and returns the result.
This is a wrapper function for the following API: Query Execution - Execute Query.
Service Principal Authentication is supported.
- Parameters:
dataflow (str | uuid.UUID) – The name or ID of the dataflow.
query_name (str) – The name of the query to execute from the dataflow (or from the custom mashup document if provided).
custom_mashup_document (str, default=None) – Optional custom mashup document to override the dataflow’s default mashup.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID. Defaults to None, which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
- Returns:
A pandas dataframe showing the results of the query execution.
- Return type:
pandas.DataFrame
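A minimal sketch of calling `execute_query` with keyword arguments, assuming it runs in a Fabric notebook where `sempy_labs` is installed. The wrapper name `run_dataflow_query`, its `execute` parameter (present only so the call can be exercised without a Fabric session), and the example names are hypothetical and not part of the library.

```python
from typing import Any, Callable, Optional


def run_dataflow_query(
    dataflow: str,
    query_name: str,
    workspace: Optional[str] = None,
    execute: Optional[Callable[..., Any]] = None,
) -> Any:
    """Run one named query from a dataflow and return whatever the call returns."""
    if execute is None:
        # Imported lazily so the helper can be defined outside a Fabric notebook.
        from sempy_labs.dataflow import execute_query as execute
    # custom_mashup_document is left at its default (None), so the
    # dataflow's own mashup document is used.
    return execute(dataflow=dataflow, query_name=query_name, workspace=workspace)
```

In a notebook, `run_dataflow_query("Sales Dataflow", "Customers")` would be expected to return the query result as a pandas dataframe, per the description above; both names are placeholders.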
- sempy_labs.dataflow.get_dataflow_definition(dataflow: str | UUID, workspace: str | UUID | None = None, decode: bool = True) → dict
Obtains the definition of a dataflow. This supports Gen1, Gen2, and Gen2 CI/CD dataflows.
This is a wrapper function for the following API: Dataflows - Get Dataflow.
- Parameters:
dataflow (str | uuid.UUID) – The name or ID of the dataflow.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID. Defaults to None, which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
decode (bool, default=True) – If True, decodes the dataflow definition file.
- Returns:
The dataflow definition.
- Return type:
dict
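As a rough illustration, the sketch below fetches a decoded definition and renders it as indented JSON for inspection or for saving to a file. The helper name `definition_as_json` and its `get_definition` parameter (there only so the call can be exercised without a Fabric session) are hypothetical. It also assumes the returned dict is JSON-serializable, which the source does not state.

```python
import json
from typing import Callable, Optional


def definition_as_json(
    dataflow: str,
    workspace: Optional[str] = None,
    get_definition: Optional[Callable[..., dict]] = None,
) -> str:
    """Return the decoded dataflow definition as an indented JSON string."""
    if get_definition is None:
        # Imported lazily so the helper can be defined outside a Fabric notebook.
        from sempy_labs.dataflow import get_dataflow_definition as get_definition
    definition = get_definition(dataflow=dataflow, workspace=workspace, decode=True)
    # default=str is a safety net (an assumption) for any value that the
    # json module cannot serialize natively.
    return json.dumps(definition, indent=2, default=str)
```

Passing `decode=True` explicitly mirrors the default and makes the intent visible at the call site.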
- sempy_labs.dataflow.list_dataflow_storage_accounts() → DataFrame
Shows the accessible dataflow storage accounts.
This is a wrapper function for the following API: Dataflow Storage Accounts - Get Dataflow Storage Accounts.
- Returns:
A pandas dataframe showing the accessible dataflow storage accounts.
- Return type:
pandas.DataFrame
- sempy_labs.dataflow.list_dataflows(workspace: str | UUID | None = None)
Shows a list of all dataflows which exist within a workspace.
This is a wrapper function for the following API: Items - List Dataflows.
Service Principal Authentication is supported.
- Parameters:
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID. Defaults to None, which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
- Returns:
A pandas dataframe showing the dataflows which exist within a workspace.
- Return type:
pandas.DataFrame
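One possible use of `list_dataflows` is a quick inventory across several workspaces. The sketch below counts the rows returned for each workspace, on the assumption that the dataframe holds one row per dataflow. The helper name `count_dataflows` and its `list_fn` parameter (there only so the call can be exercised without a Fabric session) are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def count_dataflows(
    workspaces: Iterable[str],
    list_fn: Optional[Callable[..., Any]] = None,
) -> Dict[str, int]:
    """Map each workspace name or ID to the number of rows list_dataflows returns."""
    if list_fn is None:
        # Imported lazily so the helper can be defined outside a Fabric notebook.
        from sempy_labs.dataflow import list_dataflows as list_fn
    # len() of a pandas dataframe is its row count; no column names are assumed.
    return {ws: len(list_fn(workspace=ws)) for ws in workspaces}
```

The helper deliberately avoids referring to specific columns, since the column layout of the returned dataframe is not described here.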
- sempy_labs.dataflow.list_upstream_dataflows(dataflow: str | UUID, workspace: str | UUID | None = None) → DataFrame
Shows a list of upstream dataflows for the specified dataflow.
This is a wrapper function for the following API: Dataflows - Get Upstream Dataflows In Group.
Service Principal Authentication is supported.
- Parameters:
- Returns:
A pandas dataframe showing a list of upstream dataflows for the specified dataflow.
- Return type:
pandas.DataFrame
- sempy_labs.dataflow.upgrade_dataflow(dataflow: str | UUID, workspace: str | UUID | None = None, new_dataflow_name: str | None = None, new_dataflow_workspace: str | UUID | None = None)
Creates a Dataflow Gen2 CI/CD item based on the mashup definition from an existing Gen1/Gen2 dataflow. After running this function, update the connections in the dataflow to ensure the data can be properly refreshed.
- Parameters:
dataflow (str | uuid.UUID) – The name or ID of the dataflow.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID. Defaults to None, which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
new_dataflow_name (str, default=None) – Name of the new dataflow.
new_dataflow_workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID in which the new dataflow is created. Defaults to None, which resolves to the existing workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
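The following sketch shows one way to upgrade several dataflows in a loop, giving each new Gen2 CI/CD item a name derived from the original. The helper name `upgrade_all`, the `suffix` naming scheme, and the `upgrade` parameter (there only so the call can be exercised without a Fabric session) are hypothetical. The return value of `upgrade_dataflow` is not documented above, so the sketch ignores it.

```python
from typing import Any, Callable, Iterable, List, Optional


def upgrade_all(
    dataflows: Iterable[str],
    workspace: Optional[str] = None,
    suffix: str = " v2",
    upgrade: Optional[Callable[..., Any]] = None,
) -> List[str]:
    """Upgrade each dataflow and return the names requested for the new items."""
    if upgrade is None:
        # Imported lazily so the helper can be defined outside a Fabric notebook.
        from sempy_labs.dataflow import upgrade_dataflow as upgrade
    new_names: List[str] = []
    for name in dataflows:
        new_name = f"{name}{suffix}"
        # new_dataflow_workspace is left at its default (None).
        upgrade(dataflow=name, workspace=workspace, new_dataflow_name=new_name)
        new_names.append(new_name)
    return new_names
```

As noted in the function description, the connections of each new dataflow still need to be updated afterwards before it can refresh.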