sempy_labs.lakehouse package
Module contents
- sempy_labs.lakehouse.create_shortcut_onelake(table_name: str, source_lakehouse: str, source_workspace: str | UUID, destination_lakehouse: str, destination_workspace: str | UUID | None = None, shortcut_name: str | None = None)
Creates a shortcut to a delta table in OneLake.
This is a wrapper function for the following API: OneLake Shortcuts - Create Shortcut.
- Parameters:
table_name (str) – The table name for which a shortcut will be created.
source_lakehouse (str) – The Fabric lakehouse in which the table resides.
source_workspace (str | uuid.UUID) – The name or ID of the Fabric workspace in which the source lakehouse exists.
destination_lakehouse (str) – The Fabric lakehouse in which the shortcut will be created.
destination_workspace (str | uuid.UUID, default=None) – The name or ID of the Fabric workspace in which the shortcut will be created. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
shortcut_name (str, default=None) – The name of the shortcut ‘table’ to be created. This defaults to the ‘table_name’ parameter value.
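A minimal usage sketch for create_shortcut_onelake, assuming a Fabric notebook with sempy_labs installed. The table, lakehouse, and workspace names are hypothetical, and the create_fn parameter is an illustrative hook (not part of the library) that lets the call be stubbed outside Fabric.

```python
from typing import Callable, Optional


def create_sales_shortcut(create_fn: Optional[Callable[..., None]] = None) -> None:
    """Create a shortcut named 'sales_shortcut' pointing at the 'sales' table."""
    if create_fn is None:
        # Imported lazily so the sketch can be loaded outside a Fabric notebook.
        from sempy_labs.lakehouse import create_shortcut_onelake as create_fn

    create_fn(
        table_name="sales",                          # hypothetical source table
        source_lakehouse="SourceLakehouse",          # hypothetical lakehouse name
        source_workspace="Source Workspace",         # hypothetical workspace name
        destination_lakehouse="ReportingLakehouse",  # hypothetical lakehouse name
        destination_workspace=None,  # None: resolved from the attached lakehouse or notebook
        shortcut_name="sales_shortcut",  # omit to reuse table_name as the shortcut name
    )
```

Calling create_sales_shortcut() with no argument would invoke the real function; passing a stub is only useful for dry runs.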
- sempy_labs.lakehouse.delete_shortcut(shortcut_name: str, lakehouse: str | None = None, workspace: str | UUID | None = None)
Deletes a shortcut.
This is a wrapper function for the following API: OneLake Shortcuts - Delete Shortcut.
- Parameters:
shortcut_name (str) – The name of the shortcut.
lakehouse (str, default=None) – The Fabric lakehouse name in which the shortcut resides. Defaults to None which resolves to the lakehouse attached to the notebook.
workspace (str | uuid.UUID, default=None) – The name or ID of the Fabric workspace in which the lakehouse resides. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
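As a rough sketch, several shortcuts could be removed in one pass by calling delete_shortcut once per name. The helper name delete_shortcuts and the delete_fn hook are hypothetical additions for illustration; only delete_shortcut and its documented parameters come from the library.

```python
from typing import Callable, Iterable, Optional


def delete_shortcuts(
    shortcut_names: Iterable[str],
    delete_fn: Optional[Callable[..., None]] = None,
) -> None:
    """Delete each named shortcut from the lakehouse attached to the notebook."""
    if delete_fn is None:
        # Imported lazily so the sketch can be loaded outside a Fabric notebook.
        from sempy_labs.lakehouse import delete_shortcut as delete_fn

    for name in shortcut_names:
        # lakehouse and workspace stay None, so both resolve to the attached
        # lakehouse and its workspace as described in the parameter list.
        delete_fn(shortcut_name=name, lakehouse=None, workspace=None)
```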
- sempy_labs.lakehouse.get_lakehouse_columns(lakehouse: str | UUID | None = None, workspace: str | UUID | None = None) → DataFrame
Shows the tables and columns of a lakehouse and their respective properties.
- Parameters:
lakehouse (str | uuid.UUID, default=None) – The Fabric lakehouse name or ID. Defaults to None which resolves to the lakehouse attached to the notebook.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID used by the lakehouse. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
- Returns:
Shows the tables/columns within a lakehouse and their properties.
- Return type:
pandas.DataFrame
- sempy_labs.lakehouse.get_lakehouse_tables(lakehouse: str | UUID | None = None, workspace: str | UUID | None = None, extended: bool = False, count_rows: bool = False, export: bool = False) → DataFrame
Shows the tables of a lakehouse and their respective properties. Option to include additional properties relevant to Direct Lake guardrails.
This is a wrapper function for the following API: Tables - List Tables plus extended capabilities.
- Parameters:
lakehouse (str | uuid.UUID, default=None) – The Fabric lakehouse name or ID. Defaults to None which resolves to the lakehouse attached to the notebook.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID used by the lakehouse. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
extended (bool, default=False) – Obtains additional columns relevant to the size of each table.
count_rows (bool, default=False) – Obtains a row count for each lakehouse table.
export (bool, default=False) – Exports the resulting dataframe to a delta table in the lakehouse.
- Returns:
Shows the tables within a lakehouse and their properties.
- Return type:
pandas.DataFrame
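The flags above can be combined in a single call. The sketch below requests the size-related columns and row counts without exporting the result; the helper name table_inventory and the list_fn hook are hypothetical, and the columns of the returned dataframe are not assumed here.

```python
from typing import Any, Callable, Optional


def table_inventory(
    lakehouse: Optional[str] = None,
    workspace: Optional[str] = None,
    list_fn: Optional[Callable[..., Any]] = None,
) -> Any:
    """Return the table listing with size columns and row counts included."""
    if list_fn is None:
        # Imported lazily so the sketch can be loaded outside a Fabric notebook.
        from sempy_labs.lakehouse import get_lakehouse_tables as list_fn

    return list_fn(
        lakehouse=lakehouse,   # None: the lakehouse attached to the notebook
        workspace=workspace,   # None: resolved from the attached lakehouse or notebook
        extended=True,         # add the columns relevant to table size
        count_rows=True,       # add a row count per table
        export=False,          # do not write the result to a delta table
    )
```

In a notebook, `df = table_inventory()` would then hold the dataframe for further inspection.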
- sempy_labs.lakehouse.lakehouse_attached() → bool
Identifies if a lakehouse is attached to the notebook.
- Returns:
Returns True if a lakehouse is attached to the notebook.
- Return type:
bool
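Because several functions in this module fall back to the attached lakehouse when lakehouse is None, it can help to check for one before relying on that default. A minimal sketch follows; the helper name require_attached_lakehouse, the is_attached hook, and the error message are all hypothetical.

```python
from typing import Callable, Optional


def require_attached_lakehouse(is_attached: Optional[Callable[[], bool]] = None) -> None:
    """Raise early if the notebook has no attached lakehouse."""
    if is_attached is None:
        # Imported lazily so the sketch can be loaded outside a Fabric notebook.
        from sempy_labs.lakehouse import lakehouse_attached as is_attached

    if not is_attached():
        raise RuntimeError(
            "No lakehouse is attached to this notebook; "
            "pass the lakehouse and workspace arguments explicitly."
        )
```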
- sempy_labs.lakehouse.optimize_lakehouse_tables(tables: str | List[str] | None = None, lakehouse: str | None = None, workspace: str | UUID | None = None)
Runs the OPTIMIZE function over the specified lakehouse tables.
- Parameters:
tables (str | List[str], default=None) – The table(s) to optimize. Defaults to None which resolves to optimizing all tables within the lakehouse.
lakehouse (str, default=None) – The Fabric lakehouse. Defaults to None which resolves to the lakehouse attached to the notebook.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID used by the lakehouse. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
- sempy_labs.lakehouse.vacuum_lakehouse_tables(tables: str | List[str] | None = None, lakehouse: str | None = None, workspace: str | UUID | None = None, retain_n_hours: int | None = None)
Runs the VACUUM function over the specified lakehouse tables.
- Parameters:
tables (str | List[str], default=None) – The table(s) to vacuum. Defaults to None which resolves to vacuuming all tables within the lakehouse.
lakehouse (str, default=None) – The Fabric lakehouse. Defaults to None which resolves to the lakehouse attached to the notebook.
workspace (str | uuid.UUID, default=None) – The Fabric workspace name or ID used by the lakehouse. Defaults to None which resolves to the workspace of the attached lakehouse or, if no lakehouse is attached, to the workspace of the notebook.
retain_n_hours (int, default=None) – The number of hours to retain historical versions of Delta table files. Files older than this retention period will be deleted during the vacuum operation. If not specified, the default retention period configured for the Delta table will be used. The default retention period is 168 hours (7 days) unless manually configured via table properties.
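One possible maintenance routine runs optimize_lakehouse_tables first and vacuum_lakehouse_tables second, so that files left behind by compaction become candidates for removal once they age past the retention period. This ordering is a common Delta practice rather than something the library requires. The helper name maintain_tables and the optimize_fn and vacuum_fn hooks are hypothetical.

```python
from typing import Callable, List, Optional, Union

Tables = Union[str, List[str], None]


def maintain_tables(
    tables: Tables = None,
    retain_n_hours: Optional[int] = None,
    optimize_fn: Optional[Callable[..., None]] = None,
    vacuum_fn: Optional[Callable[..., None]] = None,
) -> None:
    """Optimize, then vacuum, the given tables in the attached lakehouse."""
    if optimize_fn is None or vacuum_fn is None:
        # Imported lazily so the sketch can be loaded outside a Fabric notebook.
        from sempy_labs.lakehouse import (
            optimize_lakehouse_tables,
            vacuum_lakehouse_tables,
        )
        optimize_fn = optimize_fn or optimize_lakehouse_tables
        vacuum_fn = vacuum_fn or vacuum_lakehouse_tables

    # tables=None means every table in the lakehouse.
    optimize_fn(tables=tables)
    # retain_n_hours=None keeps the retention period configured on each table.
    vacuum_fn(tables=tables, retain_n_hours=retain_n_hours)
```

For example, `maintain_tables(["sales", "customers"], retain_n_hours=240)` would keep ten days of history for two hypothetical tables.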