mt.sql.redshift

Useful modules for accessing Redshift

Functions

mt.sql.redshift.rename_schema(old_schema, new_schema, engine, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Renames a schema.

Parameters:
  • old_schema (str) – old schema name

  • new_schema (str) – new schema name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.get_frame_length(frame_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Gets the number of rows of a dataframes (tables/views/materialized views).

Parameters:
  • frame_name (str) – name of the dataframe

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

Returns:

out – number of rows

Return type:

int

Notes

The dataframe must exist.

mt.sql.redshift.rename_table(old_table_name, new_table_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Renames a table of a schema.

Parameters:
  • old_table_name (str) – old table name

  • new_table_name (str) – new table name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

Return type:

whatever exec_sql() returns

mt.sql.redshift.drop_table(table_name, engine, schema: str | None = None, restrict=True, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Drops a table if it exists, with restrict or cascade options.

Parameters:
  • table_name (str) – table name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • restrict (bool) – If True, refuses to drop table if there is any object depending on it. Otherwise it is the ‘cascade’ option which allows you to remove those dependent objects together with the table automatically.

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

Return type:

whatever exec_sql() returns

mt.sql.redshift.rename_view(old_view_name, new_view_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Renames a view of a schema.

Parameters:
  • old_view_name (str) – old view name

  • new_view_name (str) – new view name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.drop_view(view_name, engine, schema: str | None = None, restrict=True, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Drops a view if it exists, with restrict or cascade options.

Parameters:
  • view_name (str) – view name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • restrict (bool) – If True, refuses to drop table if there is any object depending on it. Otherwise it is the ‘cascade’ option which allows you to remove those dependent objects together with the table automatically.

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

Return type:

whatever exec_sql() returns

mt.sql.redshift.rename_matview(old_matview_name, new_matview_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Renames a materialized view of a schema.

Parameters:
  • old_matview_name (str) – old materialized view name

  • new_matview_name (str) – new materialized view name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.refresh_matview(matview_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Refreshes a materialized view of a schema.

Parameters:
  • matview_name (str) – materialized view name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.drop_matview(matview_name, engine, schema: str | None = None, restrict=True, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Drops a mateiralized view if it exists, with restrict or cascade options.

Parameters:
  • matview_name (str) – materialized view name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

  • restrict (bool) – If True, refuses to drop table if there is any object depending on it. Otherwise it is the ‘cascade’ option which allows you to remove those dependent objects together with the table automatically.

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

Return type:

whatever exec_sql() returns

mt.sql.redshift.rename_column(table_name, old_column_name, new_column_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Renames a column of a table.

Parameters:
  • table_name (str) – table name

  • old_column_name (str) – old column name

  • new_column_name (str) – new column name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – schema name

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.drop_column(table_name, column_name, engine, schema: str | None = None, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None)

Drops a column of a table.

Parameters:
  • table_name (str) – table name

  • column_name (str) – column name

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – schema name

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

mt.sql.redshift.conform(df: DataFrame, table_decl: Table) DataFrame

Conforms a dataframe to a declarative base so that the columns are properly represented.

The idea is so that the output dataframe can be used to upload data to a Redshift DB. Primary keys and indices are ignored. But whether an integer column is nullable or not is inspected.

Parameters:
  • df (pandas.DataFrame) – input dataframe

  • table_decl (sqlalchemy.sql.schema.Table) – the table declaration to conform to. The output columns are converted where possible to the right dtype declared by the base. If you have a declarative base x, an instance of sqlalchemy.orm.decl_api.DeclarativeMeta, you can pass x.__table__.

Returns:

out_df – the output dataframe, where columns of the input dataframe are copied and converted properly

Return type:

pandas.DataFrame