Upgrading to Atoti Python API 0.9.0

Andrew Yang

Atoti Python API 0.9.0 brings many new features, along with some breaking changes. This guide will help you upgrade and get up and running!

Introduction

With the release of Atoti Python API 0.9.0 comes new and exciting features, but also some code changes that affect how we build Atoti applications moving forward.

The main code changes in 0.9.0 affect:

  • Sessions
  • Tables
  • Hierarchies
  • Queries
  • Plugins

Python projects and notebooks built with previous versions of Atoti Python API will need some updates to be compatible with version 0.9.0. In this post, we’ll walk through the steps needed to migrate from a previous version of Atoti Python API to Atoti Python API 0.9.0.

Let’s begin!

💡Note: Visit our documentation and changelog for Atoti Python API 0.9.0 to get more info on what has changed in this latest release.

Environment Upgrade to Atoti Python API 0.9.0

First things first, we’ll need to upgrade our existing atoti Python package to version 0.9.0. We can check our currently installed version of atoti by executing the following in a terminal:

$ uv pip list | grep atoti         
atoti                     0.8.14
atoti-core                0.8.14
atoti-jupyterlab          0.8.14
atoti-query               0.8.14

💡Note: We recommend using uv, but for alternative installation methods like pip and conda, check out our documentation’s installation page. Also, on Windows we may need to use findstr instead of grep in the terminal.
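
If we would rather not depend on grep or findstr, the same check can be sketched in Python with the standard library's importlib.metadata module. The helper name installed_atoti_packages is our own and is not part of Atoti; it simply lists whatever packages starting with "atoti" are installed in the current environment.

```python
from importlib.metadata import distributions


def installed_atoti_packages() -> dict[str, str]:
    """Return {package name: version} for installed packages named atoti*."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        # Skip distributions with missing metadata rather than failing.
        if name and name.lower().startswith("atoti"):
            packages[name] = dist.version
    return packages


# Print one line per package, similar to the `uv pip list` output above.
for name, version in sorted(installed_atoti_packages().items()):
    print(f"{name:<25} {version}")
```

This must be run with the Python interpreter of the environment we want to inspect; an empty result just means no atoti package is installed there.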

Keep in mind that Atoti Python API 0.9.0 now requires (at minimum) Python 3.10 and Java 21. When ready, we can upgrade to Atoti Python API 0.9.0 by running the following:

$ uv add "atoti[jupyterlab]"==0.9.0
Resolved 112 packages in 71ms
Uninstalled 4 packages in 90ms
Installed 6 packages in 28ms
 - atoti==0.8.14
 + atoti==0.9.0
 + atoti-client==0.9.0
 - atoti-core==0.8.14
 + atoti-core==0.9.0
 - atoti-jupyterlab==0.8.14
 + atoti-jupyterlab==0.9.0
 + atoti-server==0.9.0
 - jdk4py==17.0.9.2
 + jdk4py==21.0.4.1

$ uv sync                      
Resolved 112 packages in 0.51ms
Uninstalled 3 packages in 26ms
 - atoti-query==0.8.14
 - importlib-metadata==8.5.0
 - zipp==3.20.2
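
The Python 3.10 and Java 21 minimums mentioned earlier can also be checked from a script. The following is a minimal sketch using only the standard library; the function names are our own, and the Java part only parses the text that `java -version` typically prints. The install log above shows jdk4py being upgraded to a 21.x release, so a separate system-wide Java may not be needed in that setup; the parser is only for environments that rely on their own Java.

```python
import re
import sys

MIN_PYTHON = (3, 10)
MIN_JAVA = 21


def python_is_supported(version_info=sys.version_info) -> bool:
    """Check the (major, minor) Python version against the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def parse_java_major(version_output: str) -> int:
    """Extract the major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError(f"Unrecognized Java version output: {version_output!r}")
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


print(f"Python supported: {python_is_supported()}")
sample = 'openjdk version "21.0.4" 2024-07-16'
print(f"Java supported: {parse_java_major(sample) >= MIN_JAVA}")
```

The sample string stands in for real output; in practice we would capture the output of `java -version` ourselves and pass it to parse_java_major.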

Great, now that we’re up and running with Atoti Python API 0.9.0, let’s take a look at the main code changes in 0.9.0 and how to migrate our code from previous versions of Atoti to version 0.9.0.

Sessions

Spinning up an Atoti session is the first thing we need to do when building Atoti applications. Therefore, it makes sense to begin with session code changes for Atoti Python API 0.9.0.

Session start

Previously, we started an Atoti session by using Session().

import atoti as tt
session = tt.Session()

In 0.9.0, we use the Session.start() function to start our Atoti session.

import atoti as tt
session = tt.Session.start()

Session configuration

Previously, Atoti sessions were configured by setting session parameters, such as user_content_storage and port.

import atoti as tt

session = tt.Session(
    user_content_storage="./content",
    port=9092,
)

In Atoti Python API 0.9.0, we use SessionConfig() in the Session.start() function to configure our session, which looks like the following:

import atoti as tt

session = tt.Session.start(
    tt.SessionConfig(
        user_content_storage="./content",
        port=9092
    )
)

💡 Note: Check out the documentation for more options on SessionConfig.

In addition, when building a URL to the Atoti session in previous versions, we used Session.port to do this.

import atoti as tt

session = tt.Session(
    user_content_storage="./content",
    port=9092,
)

port = session.port
url = f"http://localhost:{port}"

However, in 0.9.0 we now use Session.url to build a URL to the Atoti session. If only the port value is needed, then we can use the urllib package to parse out the port info from the returned output of Session.url.

import atoti as tt
from urllib.parse import urlparse

session = tt.Session.start(
    tt.SessionConfig(
        user_content_storage="./content",
        port=9092
    )
)

url = session.url
port = urlparse(url).port
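
One detail to keep in mind: urlparse(url).port is None when the URL does not contain an explicit port. The sketch below wraps the lookup in a small helper that falls back to the scheme's default port. The helper name port_from_url and the DEFAULT_PORTS table are our own additions, not part of Atoti.

```python
from urllib.parse import urlparse

# Standard default ports, used only when the URL has no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_from_url(url: str) -> int:
    """Return the port of a URL, falling back to the scheme's default."""
    parsed = urlparse(url)
    if parsed.port is not None:
        return parsed.port
    try:
        return DEFAULT_PORTS[parsed.scheme]
    except KeyError:
        raise ValueError(f"Cannot determine a port for {url!r}") from None


print(port_from_url("http://localhost:9092"))
```

With a running session, the call would be port_from_url(session.url).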

Finally, prior to 0.9.0, we configured user content storage by passing UserContentStorageConfig to the Session.user_content_storage input parameter.

import atoti as tt

config = tt.UserContentStorageConfig(url)
session = tt.Session(
    user_content_storage=config,
    port=9092,
)

However, in 0.9.0, we need to import the atoti_jdbc package (formerly known as atoti-sql) and use atoti_jdbc.UserContentStorageConfig instead.

import atoti as tt
from atoti_jdbc import UserContentStorageConfig

config = UserContentStorageConfig(url)
session = tt.Session.start(
    tt.SessionConfig(
        user_content_storage=config,
        port=9092
    )
)

Session security

In previous versions of Atoti Python API, configuring session security of an Atoti session was done by passing the following to the Session.authentication input parameter:

  • BasicAuthenticationConfig
  • KerberosConfig
  • LdapConfig
  • OidcConfig

import atoti as tt

authentication = tt.BasicAuthenticationConfig()
session = tt.Session(
    port=10011,
    authentication=authentication,
    user_content_storage="./content",
    java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
)

In 0.9.0, we pass a SecurityConfig to the SessionConfig.security input parameter of Session.start().

import atoti as tt

config = tt.BasicAuthenticationConfig()
session = tt.Session.start(
    tt.SessionConfig(
        port=10011,
        security=tt.SecurityConfig(basic_authentication=config),
        user_content_storage="./content",
        java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
    )
)

If we are configuring basic authentication, we need to use the SecurityConfig.basic_authentication input parameter. However, if we are setting up single sign-on (SSO) types of authentication, we need to use the SecurityConfig.sso input parameter.

  • basic_authentication
    • BasicAuthenticationConfig
  • sso
    • KerberosConfig
    • LdapConfig
    • OidcConfig
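
When migrating several scripts, the mapping in the list above can be captured in a small lookup. This is only a sketch: the helper security_parameter_for is our own name, it matches on class names rather than importing Atoti, and the KerberosConfig class defined here is a stand-in so the snippet runs without Atoti installed.

```python
# SecurityConfig parameter to use for each authentication config class,
# following the list above.
SECURITY_PARAMETER_BY_CONFIG = {
    "BasicAuthenticationConfig": "basic_authentication",
    "KerberosConfig": "sso",
    "LdapConfig": "sso",
    "OidcConfig": "sso",
}


def security_parameter_for(config: object) -> str:
    """Return the SecurityConfig parameter name matching a config object."""
    name = type(config).__name__
    try:
        return SECURITY_PARAMETER_BY_CONFIG[name]
    except KeyError:
        raise TypeError(f"Unsupported authentication config: {name}") from None


class KerberosConfig:
    """Stand-in for tt.KerberosConfig, for demonstration only."""


print(security_parameter_for(KerberosConfig()))
```

With the real classes, the result could be used as tt.SecurityConfig(**{security_parameter_for(config): config}).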

The example below shows us how to set up Kerberos SSO in Atoti Python API 0.9.0:

import atoti as tt

config = tt.KerberosConfig(...)
session = tt.Session.start(
    tt.SessionConfig(
        port=10011,
        security=tt.SecurityConfig(sso=config),
        user_content_storage="./content",
        java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
    )
)

Session create_cube

Previously, when creating an Atoti session and cube, we used the Session.create_cube() function and passed in a table to be used as the base_table to the cube.

table = session.create_table(
    "Table",
    types={"id": tt.STRING, "value": tt.DOUBLE},
)
cube = session.create_cube(
    base_table=table
)

In 0.9.0, the base_table input parameter has been renamed to fact_table.

table = session.create_table(
    "Table",
    types={"id": tt.STRING, "value": tt.DOUBLE},
)
cube = session.create_cube(
    fact_table=table
)

Session transactions

Atoti Python API allows us to batch several table operations within a single transaction for better performance.

Previously, this was done using the Session.start_transaction() function.

import atoti as tt
import pandas as pd

df = pd.DataFrame(
    columns=["City", "Price"],
    data=[
        ("Berlin", 150.0),
        ("London", 240.0),
        ("New York", 270.0),
        ("Paris", 200.0),
    ],
)
session = tt.Session()
table = session.read_pandas(
    df, keys=["City"], table_name="start_transaction example"
)
cube = session.create_cube(table)
extra_df = pd.DataFrame(
    columns=["City", "Price"],
    data=[
        ("Singapore", 250.0),
    ],
)
with session.start_transaction():
    table += ("New York", 100.0)
    table.drop(table["City"] == "Paris")
    table.load_pandas(extra_df)
table.head().sort_index()
           Price
City
Berlin     150.0
London     240.0
New York   100.0
Singapore  250.0

In Atoti Python API 0.9.0, we use Session.tables.data_transaction() to achieve the same result.

import pandas as pd
import atoti as tt

df = pd.DataFrame(
    columns=["City", "Price"],
    data=[
        ("Berlin", 150.0),
        ("London", 240.0),
        ("New York", 270.0),
        ("Paris", 200.0),
    ],
)
session = tt.Session.start()
table = session.read_pandas(df, keys=["City"], table_name="Cities")
cube = session.create_cube(table)
extra_df = pd.DataFrame(
    columns=["City", "Price"],
    data=[
        ("Singapore", 250.0),
    ],
)
with session.tables.data_transaction():
    table += ("New York", 100.0)
    table.drop(table["City"] == "Paris")
    table.load_pandas(extra_df)
table.head().sort_index()
           Price
City
Berlin     150.0
London     240.0
New York   100.0
Singapore  250.0

Similarly, in Atoti Python API 0.9.0, we no longer have to perform independent Measures.update() operations for related measure changes. 

For example, if we execute the following code, we get a KeyError.

import atoti as tt

session = tt.Session()
fact_table = session.create_table(
    "Base",
    keys={"ID"},
    types={"ID": "String", "Quantity": "int"},
)
fact_table += ("123xyz", 1)
cube = session.create_cube(fact_table)
h, l, m = cube.hierarchies, cube.levels, cube.measures
m.update({"Quantity - 1": m["Quantity.SUM"] - 1, "Quantity + 1": m["Quantity.SUM"] + 1, "Quantity + 2": m["Quantity + 1"] + 1})
...
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
---> 13 m.update({"Quantity - 1": m["Quantity.SUM"] - 1, "Quantity + 1": m["Quantity.SUM"] + 1, "Quantity + 2": m["Quantity + 1"] + 1})

This is because when creating our measures, m["Quantity + 1"] does not yet exist even though we’ve specified it in our chain of operations.
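
The same behavior can be reproduced with a plain Python dictionary, which shows that the error comes from ordinary evaluation order: the dictionary literal is built in full before update() ever receives it. In the sketch below, the measures dict and its starting value are stand-ins for the cube's measures, not Atoti objects.

```python
# A plain dict standing in for the cube's measures.
measures = {"Quantity.SUM": 10}

try:
    # The whole literal is evaluated first, so the lookup of
    # "Quantity + 1" happens before update() adds anything.
    measures.update(
        {
            "Quantity + 1": measures["Quantity.SUM"] + 1,
            "Quantity + 2": measures["Quantity + 1"] + 1,
        }
    )
except KeyError as error:
    missing_key = error.args[0]
    print(f"KeyError: {missing_key!r}")

# Assigning one key at a time makes each result visible to the next line.
measures["Quantity + 1"] = measures["Quantity.SUM"] + 1
measures["Quantity + 2"] = measures["Quantity + 1"] + 1
print(measures["Quantity + 2"])
```

The first print reports the missing "Quantity + 1" key, and the sequential assignments then succeed because each statement sees the result of the previous one.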

In Atoti 0.9.0, we can use Session.data_model_transaction() to batch measure operations together and allow for intermediary steps to be visible to subsequent statements within a transaction.

with session.data_model_transaction():
    m["Quantity - 1"] = m["Quantity.SUM"] - 1
    m["Quantity + 1"] = m["Quantity.SUM"] + 1
    m["Quantity + 2"] = m["Quantity + 1"] + 1

Tables

Let’s now walk through some of the minor changes affecting tables in Atoti Python API 0.9.0.

Table keys

Prior to 0.9.0, we used Table.keys to return a list of table keys.

import atoti as tt

table = session.create_table(
    "Example",
    keys=["Country", "City"],
    types={
        "Country": "String",
        "City": "String",
        "Year": "int",
        "Population": "int",
    },
)
table.keys
['Country', 'City']

In Atoti Python API 0.9.0, Table.keys now returns a tuple, signifying that keys can no longer be changed once the table exists.

import atoti as tt

table = session.create_table(
    "Example",
    keys=["Country", "City"],
    types={
        "Country": "String",
        "City": "String",
        "Year": "int",
        "Population": "int",
    },
)
table.keys
('Country', 'City')

Table row and column counts

Previously, we used the len() function on a Table or ExternalTable to count its rows, and the len() function on Table.columns to count its columns.

print(f"# rows: {len(product_tbl)}, # columns: {len(product_tbl.columns)}")

However, in 0.9.0, we’ve introduced Table.row_count for the count of rows, and deprecated Table.columns. For column counts, we now use the list() function to return the list of columns for the table and then use the len() function on the list of returned column names.

print(f"# rows: {product_tbl.row_count}, # columns: {len(list(product_tbl))}")

In addition, to iterate over table columns in 0.9.0 we create a for loop to iterate over the table itself (rather than on returned Table.columns output).

for column_name in product_tbl:
    print(column_name)
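
For code that has to run against both old and new versions during a migration, the two styles can be hidden behind one helper. This is a sketch under the assumption that the presence of a row_count attribute is a good enough signal for the 0.9.0 behavior; table_shape and the two stand-in classes are our own and only mimic the behaviors described above.

```python
from typing import Any


def table_shape(table: Any) -> tuple[int, int]:
    """Return (row count, column count) using whichever style is available."""
    if hasattr(table, "row_count"):
        # 0.9.0 style: row_count, and iterating the table yields column names.
        return table.row_count, len(list(table))
    # Earlier style: len() on the table and on Table.columns.
    return len(table), len(table.columns)


class NewStyleTable:
    """Stand-in mimicking a 0.9.0 table, for demonstration only."""

    row_count = 3

    def __iter__(self):
        return iter(["Product", "Price"])


class OldStyleTable:
    """Stand-in mimicking a pre-0.9.0 table, for demonstration only."""

    columns = ["Product", "Price"]

    def __len__(self):
        return 3


print(table_shape(NewStyleTable()), table_shape(OldStyleTable()))
```

With a real table, the call would simply be table_shape(product_tbl).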

Hierarchies

Let’s take a look at how 0.9.0 code changes affect hierarchies.

Levels

Previously, we used the list() function in conjunction with Hierarchy.levels to return the levels for a given hierarchy.

prices_df = pd.DataFrame(
    columns=["Nation", "City", "Color", "Price"],
    data=[
        ("France", "Paris", "red", 20.0),
        ("France", "Lyon", "blue", 15.0),
        ("France", "Toulouse", "green", 10.0),
        ("UK", "London", "red", 20.0),
        ("UK", "Manchester", "blue", 15.0),
    ],
)
table = session.read_pandas(prices_df, table_name="Prices")
cube = session.create_cube(table, mode="manual")
h = cube.hierarchies
level_names = list(h["Nation"].levels)

In Atoti Python API 0.9.0, Hierarchy.levels has been deprecated and we now use the list() function on the Hierarchy itself to return the levels for a given hierarchy.

prices_df = pd.DataFrame(
    columns=["Nation", "City", "Color", "Price"],
    data=[
        ("France", "Paris", "red", 20.0),
        ("France", "Lyon", "blue", 15.0),
        ("France", "Toulouse", "green", 10.0),
        ("UK", "London", "red", 20.0),
        ("UK", "Manchester", "blue", 15.0),
    ],
)
table = session.read_pandas(prices_df, table_name="Prices")
cube = session.create_cube(table, mode="manual")
h = cube.hierarchies
level_names = list(h["Nation"])

Queries

Let’s now explore the changes in Atoti Python API 0.9.0 that affect how we implement queries in Atoti.

Query explains

In Atoti Python API, we can return the execution plan for a given query, which reveals how the query is processed. This provides valuable information for performance tuning and query optimization.

Previously, we used Session.explain_mdx_query() to return the query execution plan for a query written in MDX.

from datetime import date
import pandas as pd
import atoti as tt

df = pd.DataFrame(
    columns=["Country", "Date", "Price"],
    data=[
        ("China", date(2020, 3, 3), 410.0),
        ("France", date(2020, 1, 1), 480.0),
        ("France", date(2020, 2, 2), 500.0),
        ("France", date(2020, 3, 3), 400.0),
        ("India", date(2020, 1, 1), 360.0),
        ("India", date(2020, 2, 2), 400.0),
        ("UK", date(2020, 2, 2), 960.0),
    ],
)
table = session.read_pandas(
    df, keys=["Country", "Date"], table_name="Prices"
)
cube = session.create_cube(table)
mdx = (
    "SELECT"
    "  NON EMPTY Hierarchize("
    "    DrilldownLevel("
    "      [Prices].[Country].[ALL].[AllMember]"
    "    )"
    "  ) ON ROWS,"
    "  NON EMPTY Crossjoin( "
    "    [Measures].[Price.SUM],"
    "    Hierarchize("
    "      DrilldownLevel("
    "        [Prices].[Date].[ALL].[AllMember]"
    "      )"
    "    )"
    "  ) ON COLUMNS"
    "  FROM [Prices]"
)
session.explain_mdx_query(mdx)

In Atoti Python API 0.9.0, we use Session.query_mdx() with an explain parameter to achieve the same result.

session.query_mdx(mdx, explain=True)

Similarly, we previously used Cube.explain_query() to return the execution plan for a query built with Atoti Python API syntax.

import pandas as pd
import atoti as tt

df = pd.DataFrame(
    columns=["Continent", "Country", "Currency", "Year", "Month", "Price"],
    data=[
        ("Europe", "France", "EUR", 2023, 10, 200.0),
        ("Europe", "Germany", "EUR", 2024, 2, 150.0),
        ("Europe", "United Kingdom", "GBP", 2022, 10, 120.0),
        ("America", "United states", "USD", 2020, 5, 240.0),
        ("America", "Mexico", "MXN", 2021, 3, 270.0),
    ],
)
session = tt.Session()
table = session.read_pandas(
    df,
    keys={"Continent", "Country", "Currency", "Year", "Month"},
    table_name="Prices",
)
cube = session.create_cube(table)
# Aliases used by the query below.
l, m = cube.levels, cube.measures
cube.explain_query(
    m["Price.SUM"],
    levels=[l["Country"]],
    filter=l["Continent"] == "Europe",
)

In Atoti Python API 0.9.0, we use Cube.query() with an explain parameter as well to achieve the same result.

cube.query(
    m["Price.SUM"],
    levels=[l["Country"]],
    filter=l["Continent"] == "Europe",
    explain=True
)

Querying existing Atoti sessions

Some of us use Atoti Python API to connect to and read data from another existing Atoti cube.

In previous versions of Atoti Python API, we connected to an existing Atoti session and queried it by importing QuerySession from the atoti-query package.

from atoti_query import QuerySession
existing_session = QuerySession(url)
existing_session.query_mdx(...)

However, as of Atoti Python API 0.9.0, the atoti-query package has been deprecated and we now use Session.connect() to connect and query an existing Atoti session instead.

import atoti as tt
existing_session = tt.Session.connect(url)
existing_session.query_mdx(...)

💡Note: We can also use the slimmer atoti-client package for projects that exclusively use the Atoti Python API to connect to an existing session. This significantly reduces the size of installed dependencies.

Plugins (optional)

Finally, we look at how plugins have been affected in 0.9.0.

DirectQuery configurations

Atoti Python API supports using DirectQuery to connect to and query external databases as external tables in Atoti without having to load data in-memory. This is useful when working with cold vs. hot data, where in-memory performance is not needed for cold or infrequently used data.

💡Note: Read our documentation for more information on Atoti DirectQuery.

Previously, when using the Atoti DirectQuery feature, we would need to import the respective ConnectionInfo and TableOption classes from our DirectQuery plugin packages. 

For example, when we downloaded the atoti-directquery-clickhouse package, we imported the following classes, made the connection to the external database, and set up an external table in Atoti.

import os

import atoti as tt
from atoti_directquery_clickhouse import (
    ClickhouseConnectionInfo,
    ClickhouseTableOptions,
)

session = tt.Session()
connection_info = ClickhouseConnectionInfo(
    f"clickhouse:https://{os.environ['CLICKHOUSE_USER']}@{os.environ['CLICKHOUSE_HOST']}:{os.environ['CLICKHOUSE_PORT']}",
    password=os.environ["CLICKHOUSE_PASSWORD"],
)
db = session.connect_to_external_database(connection_info)
db.cache = True

trades_atoti = session.add_external_table(
    db.tables["TRADE_PNLS"],
    table_name="Trade PnL",
    options=ClickhouseTableOptions(keys=["AsOfDate", "TradeId", "BookId", "DataSet"]),
)

In 0.9.0, we import and use ConnectionConfig and TableConfig, respectively. In addition, we’ll need to update the following:

  1. Explicitly use the ConnectionConfig.url parameter to specify our URL.
  2. Move the DirectQuery connection cache property to ConnectionConfig as a parameter.
  3. Rename the options input parameter for Session.add_external_table() to config.

import os

import atoti as tt
from atoti_directquery_clickhouse import (
    ConnectionConfig,
    TableConfig,
)

session = tt.Session.start()
connection_info = ConnectionConfig(
    url=f"clickhouse:https://{os.environ['CLICKHOUSE_USER']}@{os.environ['CLICKHOUSE_HOST']}:{os.environ['CLICKHOUSE_PORT']}",
    password=os.environ["CLICKHOUSE_PASSWORD"], 
    cache=True
)

db = session.connect_to_external_database(connection_info)
trades_atoti = session.add_external_table(
    db.tables["TRADE_PNLS"],
    table_name="Trade PnL",
    config=TableConfig(keys=["AsOfDate", "TradeId", "BookId", "DataSet"]),
)
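
Because the connection URL is assembled from several environment variables, a missing variable surfaces as a bare KeyError in the middle of an f-string. The sketch below gathers the same values up front and reports every missing variable at once. The helper clickhouse_settings is our own name; the URL format and variable names are copied from the example above.

```python
import os
from collections.abc import Mapping

REQUIRED_VARIABLES = (
    "CLICKHOUSE_USER",
    "CLICKHOUSE_HOST",
    "CLICKHOUSE_PORT",
    "CLICKHOUSE_PASSWORD",
)


def clickhouse_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the url and password values from environment variables."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    url = (
        f"clickhouse:https://{env['CLICKHOUSE_USER']}"
        f"@{env['CLICKHOUSE_HOST']}:{env['CLICKHOUSE_PORT']}"
    )
    return {"url": url, "password": env["CLICKHOUSE_PASSWORD"]}


# Example with made-up values instead of the real environment.
example = clickhouse_settings(
    {
        "CLICKHOUSE_USER": "user",
        "CLICKHOUSE_HOST": "example.com",
        "CLICKHOUSE_PORT": "8443",
        "CLICKHOUSE_PASSWORD": "secret",
    }
)
print(example["url"])
```

The result could then be passed on as ConnectionConfig(**clickhouse_settings(), cache=True).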

Lastly, a few more minor changes: 

When using the atoti_directquery_redshift plugin, we can now use the newly added atoti_directquery_redshift.ConnectionConfig.connection_pool_size. This configuration sets the maximum size of the connection pool when connecting to Redshift using Atoti DirectQuery. 

When using the atoti_directquery_databricks plugin, we’ve renamed atoti_directquery_databricks.DatabricksConnectionInfo.heavy_load_url to atoti_directquery_databricks.ConnectionConfig.feeding_url.

Cloud storage client-side encryption

Atoti Python API can connect to and load data from cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage. For client-side encrypted data existing in cloud storage, Atoti also supports client-side encryption to decrypt data loaded into Atoti tables.

💡Note: Learn more about client-side encryption.

Previously, we performed client-side data encryption using the atoti_aws and atoti_azure plugins and imported the respective classes.

import atoti
from atoti_aws import AwsKeyPair, AwsKmsConfig
from atoti_azure import AzureKeyPair

client_side_encryption_aws_keypair = AwsKeyPair(
    "public_key",
    "private_key",
    region="eu-west-3",
)

client_side_encryption_aws_kms = AwsKmsConfig(
    region="eu-west-3",
    key_id="key_id",
)

client_side_encryption_azure_keypair = AzureKeyPair(
    "public_key",
    "private_key",
    key_id="key_id"
)

In Atoti Python API 0.9.0, these classes have been renamed.

  • atoti_aws.AwsKeyPair → atoti_aws.KeyPair
  • atoti_aws.AwsKmsConfig → atoti_aws.KmsConfig
  • atoti_azure.AzureKeyPair → atoti_azure.KeyPair

We now need to import the renamed classes.

import atoti
import atoti_aws
import atoti_azure

client_side_encryption_aws_keypair = atoti_aws.KeyPair(
    "public_key",
    "private_key",
    region="eu-west-3",
)

client_side_encryption_aws_kms = atoti_aws.KmsConfig(
    region="eu-west-3",
    key_id="key_id",
)

client_side_encryption_azure_keypair = atoti_azure.KeyPair(
    "public_key",
    "private_key",
    key_id="key_id"
)

Final Thoughts

Migrating assets can be hard, but we hope that this walkthrough helps!
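
As a recap, the renames covered in this guide can be collected into a small scanner that flags lines still using the pre-0.9.0 names. This is a rough sketch rather than a complete migration tool: the patterns are simple regular expressions of our own, they only cover changes mentioned in this post, and the generic ConnectionInfo and TableOptions patterns assume other DirectQuery plugins followed the same naming as the ClickHouse and Databricks examples.

```python
import re

# Old usage pattern -> suggested replacement, taken from this guide.
MIGRATION_HINTS = {
    r"\bSession\(": "use tt.Session.start(tt.SessionConfig(...))",
    r"\bstart_transaction\(": "use session.tables.data_transaction()",
    r"\bbase_table\s*=": "rename base_table to fact_table",
    r"\bexplain_mdx_query\(": "use session.query_mdx(mdx, explain=True)",
    r"\bexplain_query\(": "use cube.query(..., explain=True)",
    r"\batoti_query\b": "use tt.Session.connect(url)",
    r"\b\w+ConnectionInfo\b": "use the plugin's ConnectionConfig",
    r"\b\w+TableOptions\b": "use the plugin's TableConfig",
    r"\bheavy_load_url\b": "rename heavy_load_url to feeding_url",
    r"\b(?:AwsKeyPair|AzureKeyPair|AwsKmsConfig)\b": "use KeyPair / KmsConfig",
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line number, hint) pairs for lines matching an old pattern."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for pattern, hint in MIGRATION_HINTS.items():
            if re.search(pattern, line):
                findings.append((number, hint))
    return findings


sample = (
    "session = tt.Session()\n"
    "cube = session.create_cube(base_table=table)\n"
    "session = tt.Session.start()\n"
)
for number, hint in scan(sample):
    print(f"line {number}: {hint}")
```

To check a real file, we could pass its contents to scan(), for example with pathlib.Path("app.py").read_text(), where app.py is a placeholder file name. Matches are only hints, so each flagged line still deserves a manual look.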

If you have any questions or feedback, we’d love to hear from you. Stay tuned for more guides on Atoti Python API 0.9.0. Until then, take care! 👋
