mt.sql.base

Base functions dealing with an SQL database.

Functions

mt.sql.base.frame_sql(frame_name, schema: str | None = None)
mt.sql.base.indices(df)

Returns the list of named indices of the dataframe, ignoring any unnamed index.

mt.sql.base.run_func(func, *args, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None, **kwargs)

Attempt to run a function a number of times to overcome OperationalError exceptions.

Parameters:
  • func (function) – function to be invoked

  • args (sequence) – arguments to be passed to the function

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

  • kwargs (dict) – keyword arguments to be passed to the function

mt.sql.base.conn_ctx(engine)
mt.sql.base.engine_execute(engine, sql, *args, **kwargs)
mt.sql.base.read_sql(sql, engine, index_col: str | List[str] | None = None, chunksize: int | None = None, nb_trials: int = 3, exception_handling: str = 'raise', logger: IndentedLoggerAdapter | None = None, **kwargs) DataFrame

Read an SQL query with a number of trials to overcome OperationalError.

The function wraps pandas.read_sql_query(). However, when chunksize is not None, it iterates over chunks and concatenates them. In addition, if logger is not None, a progress bar is shown in that case.

A dataframe is always returned.

Parameters:
  • sql (str or object) – SQL query to be executed. The query can be a string or an sqlalchemy object that can be used for querying. Passed as-is to pandas.read_sql_query().

  • engine (sqlalchemy.engine.Engine) – connection engine to the server

  • index_col (string or list of strings, optional, default: None) – Column(s) to set as index(MultiIndex). Passed as-is to pandas.read_sql_query().

  • chunksize (int, default None) – If specified, iteratively reads a number of chunksize rows. In this case, a progress bar is also shown if logger is provided.

  • nb_trials (int) – number of query trials. If chunksize is provided, this is only effective before an iterator is returned from pandas.

  • exception_handling ({'warn', 'raise'}) – policy for handling SQL-raised exceptions when iterating over many chunks to completely download the result. Only valid when chunksize is provided. Right now there are only 2 policies. Either to raise the exception as-is (‘raise’), or to raise the exception as a warning (‘warn’) and return whatever has been downloaded.

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

  • kwargs (dict) – other keyword arguments to be passed directly to pandas.read_sql_query()

Returns:

the output dataframe

Return type:

pandas.DataFrame

See also

pandas.read_sql_query

mt.sql.base.read_sql_table(table_name, engine, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None, **kwargs)

Read an SQL table with a number of trials to overcome OperationalError.

Parameters:
  • table_name (str) – name of the table to be read

  • engine (sqlalchemy.engine.Engine) – connection engine to the server

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

See also

pandas.read_sql_table

mt.sql.base.exec_sql(sql, engine, *args, nb_trials: int = 3, logger: IndentedLoggerAdapter | None = None, **kwargs)

Execute an SQL query with a number of trials to overcome OperationalError.

Parameters:
  • sql (str) – SQL query to be executed

  • engine (sqlalchemy.engine.Engine) – connection engine to the server

  • args (list) – positional arguments to be passed as-is to sqlalchemy.engine.Engine.execute()

  • nb_trials (int) – number of query trials

  • logger (mt.logg.IndentedLoggerAdapter, optional) – logger for debugging

See also

sqlalchemy.engine.Engine.execute

for more details

mt.sql.base.list_schemas(engine)

Lists all schemas.

Parameters:

engine (sqlalchemy.engine.Engine) – connection engine to the server

Returns:

list of all schema names

Return type:

list

mt.sql.base.list_tables(engine, schema: str | None = None)

Lists all tables of a given schema.

Parameters:
  • engine (sqlalchemy.engine.Engine) – connection engine to the server

  • schema (str, optional) – a valid schema name returned from list_schemas(). Default to sqlalchemy

Returns:

list of all table names

Return type:

list

mt.sql.base.list_views(engine, schema: str | None = None)

Lists all views of a given schema.

Parameters:
  • engine (sqlalchemy.engine.Engine) – connection engine to the server

  • schema (str, optional) – a valid schema name returned from list_schemas(). Default to sqlalchemy

Returns:

list of all view names

Return type:

list

mt.sql.base.table_exists(table_name, engine, schema: str | None = None)

Checks if a table exists.

Parameters:
  • table_name (str) – name of table

  • engine (sqlalchemy.engine.Engine) – an sqlalchemy connection engine created by function create_engine()

  • schema (str or None) – a valid schema name returned from list_schemas()

Returns:

retval – whether a table or a view exists with the given name

Return type:

bool

mt.sql.base.create_temp_id_table(l_ids: list, conn: Connection) str

Creates a temporary table to containing a list of ids.

Parameters:
  • l_ids (list) – list of ids to be inserted into the table

  • conn (sqlalchemy.engine.Connection) – a connection that has been opened

Returns:

table_name – name of the temporary table. The table will be deleted at the end of the connection

Return type:

str