Atoti Python API 0.9.0 brings many new features along with some breaking changes. This guide will help you upgrade and get up and running!
Introduction
With the release of Atoti Python API 0.9.0 comes new and exciting features, but also some code changes that affect how we build Atoti applications moving forward.
The main code changes in 0.9.0 affect:
- Sessions
- Tables
- Hierarchies
- Queries
- Plugins
Python projects and notebooks built with previous versions of Atoti Python API will need some updates to be compatible with version 0.9.0. In this post, we’ll walk through the steps needed to migrate from a previous version of Atoti Python API to Atoti Python API 0.9.0.
Let’s begin!
💡Note: Visit our documentation and changelog for Atoti Python API 0.9.0 to get more info on what has changed in this latest release.
Environment Upgrade to Atoti Python API 0.9.0
First things first, we’ll need to upgrade our existing atoti Python package to version 0.9.0. We can check our currently installed version of atoti by executing the following in a terminal:
$ uv pip list | grep atoti
atoti 0.8.14
atoti-core 0.8.14
atoti-jupyterlab 0.8.14
atoti-query 0.8.14
💡Note: We recommend using uv, but for alternative installation methods like pip and conda, check out our documentation’s installation page. Also, we may need to use findstr instead of grep when using the terminal on Windows.
Keep in mind that Atoti Python API 0.9.0 now requires (at minimum) Python 3.10 and Java 21. When ready, we can upgrade to Atoti Python API 0.9.0 by running the following:
$ uv add "atoti[jupyterlab]"==0.9.0
Resolved 112 packages in 71ms
Uninstalled 4 packages in 90ms
Installed 6 packages in 28ms
- atoti==0.8.14
+ atoti==0.9.0
+ atoti-client==0.9.0
- atoti-core==0.8.14
+ atoti-core==0.9.0
- atoti-jupyterlab==0.8.14
+ atoti-jupyterlab==0.9.0
+ atoti-server==0.9.0
- jdk4py==17.0.9.2
+ jdk4py==21.0.4.1
$ uv sync
Resolved 112 packages in 0.51ms
Uninstalled 3 packages in 26ms
- atoti-query==0.8.14
- importlib-metadata==8.5.0
- zipp==3.20.2
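The same check can be done from Python itself. The snippet below is a minimal sketch that uses only the standard library; the helper names (parse_version, meets_minimum, installed_version) are illustrative and not part of Atoti, and the Java requirement is not checked here.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "0.9.0" into (0, 9, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse_version(version) >= parse_version(minimum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


python_ok = sys.version_info >= (3, 10)
atoti_version = installed_version("atoti")
atoti_ok = atoti_version is not None and meets_minimum(atoti_version, "0.9.0")

print(f"Python >= 3.10: {python_ok}")
print(f"atoti installed: {atoti_version} (>= 0.9.0: {atoti_ok})")
```

Comparing parsed tuples rather than raw strings avoids surprises such as "0.10.0" sorting before "0.9.0".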
Great, now that we’re up and running with Atoti Python API 0.9.0, let’s take a look at the main code changes in 0.9.0 and how to migrate our code from previous versions of Atoti to version 0.9.0.
Sessions
Spinning up an Atoti session is the first thing we need to do when building Atoti applications. Therefore, it makes sense to begin with session code changes for Atoti Python API 0.9.0.
Session start
Previously, we started an Atoti session by using Session().
import atoti as tt
session = tt.Session()
In 0.9.0, we use the Session.start() function to start our Atoti session.
import atoti as tt
session = tt.Session.start()
Session configuration
Previously, Atoti sessions were configured by setting session parameters, such as user_content_storage and port.
import atoti as tt
session = tt.Session(
user_content_storage="./content",
port=9092,
)
In Atoti Python API 0.9.0, we pass a SessionConfig() to the Session.start() function to configure our session, which looks like the following:
import atoti as tt
session = tt.Session.start(
tt.SessionConfig(
user_content_storage="./content",
port=9092
)
)
💡 Note: Check out the documentation for more options on SessionConfig.
In addition, previous versions used Session.port to build a URL to the Atoti session.
import atoti as tt
session = tt.Session(
user_content_storage="./content",
port=9092,
)
port = session.port
url = f"http://localhost:{port}"
However, in 0.9.0 we now use Session.url to get the URL of the Atoti session. If only the port value is needed, we can use urlparse from the standard library's urllib.parse module to extract the port from the value returned by Session.url.
import atoti as tt
from urllib.parse import urlparse
session = tt.Session.start(
tt.SessionConfig(
user_content_storage="./content",
port=9092
)
)
url = session.url
port = urlparse(url).port
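To see the parsing step in isolation, here is a small sketch that runs without a session. The URL string is a stand-in for the value returned by Session.url, and its exact form is an assumption.

```python
from urllib.parse import urlparse

# Stand-in for the value returned by `session.url` (assumed form).
url = "http://localhost:9092"

parsed = urlparse(url)
port = parsed.port  # an int, or None when the URL has no explicit port
host = parsed.hostname

print(host, port)
```

Note that urlparse(...).port is None for a URL without an explicit port, so code that relies on it may want to handle that case.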
Finally, prior to 0.9.0, we configured user content storage by passing UserContentStorageConfig to the Session.user_content_storage input parameter.
import atoti as tt
config = tt.UserContentStorageConfig(url)
session = tt.Session(
user_content_storage=config,
port=9092,
)
However, in 0.9.0, we need to import the atoti_jdbc package (formerly known as atoti-sql) and use atoti_jdbc.UserContentStorageConfig instead.
import atoti as tt
from atoti_jdbc import UserContentStorageConfig
config = UserContentStorageConfig(url)
session = tt.Session.start(
tt.SessionConfig(
user_content_storage=config,
port=9092
)
)
Session security
In previous versions of Atoti Python API, we configured the security of an Atoti session by passing one of the following to the Session.authentication input parameter:
- BasicAuthenticationConfig
- KerberosConfig
- LdapConfig
- OidcConfig
import atoti as tt
authentication = tt.BasicAuthenticationConfig()
session = tt.Session(
port=10011,
authentication=authentication,
user_content_storage="./content",
java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
)
In 0.9.0, we pass a SecurityConfig to the SessionConfig.security input parameter of Session.start().
import atoti as tt
config = tt.BasicAuthenticationConfig()
session = tt.Session.start(
tt.SessionConfig(
port=10011,
security=tt.SecurityConfig(basic_authentication=config),
user_content_storage="./content",
java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
)
)
If we are configuring basic authentication, we need to use the SecurityConfig.basic_authentication input parameter. However, if we are setting up single sign-on (SSO) types of authentication, we need to use the SecurityConfig.sso input parameter.
- basic_authentication
  - BasicAuthenticationConfig
- sso
  - KerberosConfig
  - LdapConfig
  - OidcConfig
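The mapping above can be captured in a small lookup helper when migrating several projects. This is only a sketch: SECURITY_PARAMETER and security_parameter_for are hypothetical names that are not part of Atoti, and the KerberosConfig class below is a stand-in so the snippet runs without Atoti installed.

```python
# Hypothetical lookup table built from the list above; not part of Atoti.
SECURITY_PARAMETER = {
    "BasicAuthenticationConfig": "basic_authentication",
    "KerberosConfig": "sso",
    "LdapConfig": "sso",
    "OidcConfig": "sso",
}


def security_parameter_for(config) -> str:
    """Return the SecurityConfig parameter name matching an authentication config object."""
    name = type(config).__name__
    try:
        return SECURITY_PARAMETER[name]
    except KeyError:
        raise ValueError(f"Unsupported authentication config: {name}") from None


class KerberosConfig:
    """Stand-in for tt.KerberosConfig, used only to exercise the helper."""


parameter = security_parameter_for(KerberosConfig())
print(parameter)
```

With a real config object, the result could then be used as a keyword argument, for example tt.SecurityConfig(**{parameter: config}).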
The example below shows us how to set up Kerberos SSO in Atoti Python API 0.9.0:
import atoti as tt
config = tt.KerberosConfig(...)
session = tt.Session.start(
tt.SessionConfig(
port=10011,
security=tt.SecurityConfig(sso=config),
user_content_storage="./content",
java_options=["-Dlogging.level.org.springframework.security=DEBUG"],
)
)
Session create_cube
Previously, when creating an Atoti session and cube, we used the Session.create_cube() function and passed in a table to be used as the base_table of the cube.
table = session.create_table(
"Table",
types={"id": tt.STRING, "value": tt.DOUBLE},
)
cube = session.create_cube(
base_table=table
)
In 0.9.0, the base_table input parameter has been renamed to fact_table.
table = session.create_table(
"Table",
types={"id": tt.STRING, "value": tt.DOUBLE},
)
cube = session.create_cube(
fact_table=table
)
Session transactions
Atoti Python API allows us to batch several table operations within a single transaction for better performance.
Previously, this was done using the Session.start_transaction() function.
import atoti as tt
import pandas as pd
df = pd.DataFrame(
columns=["City", "Price"],
data=[
("Berlin", 150.0),
("London", 240.0),
("New York", 270.0),
("Paris", 200.0),
],
)
session = tt.Session()
table = session.read_pandas(
df, keys=["City"], table_name="start_transaction example"
)
cube = session.create_cube(table)
extra_df = pd.DataFrame(
columns=["City", "Price"],
data=[
("Singapore", 250.0),
],
)
with session.start_transaction():
    table += ("New York", 100.0)
    table.drop(table["City"] == "Paris")
    table.load_pandas(extra_df)
table.head().sort_index()
Price
City
Berlin 150.0
London 240.0
New York 100.0
Singapore 250.0
In Atoti Python API 0.9.0, we use Session.tables.data_transaction() to achieve the same result.
import pandas as pd
import atoti as tt
df = pd.DataFrame(
columns=["City", "Price"],
data=[
("Berlin", 150.0),
("London", 240.0),
("New York", 270.0),
("Paris", 200.0),
],
)
session = tt.Session.start()
table = session.read_pandas(df, keys=["City"], table_name="Cities")
cube = session.create_cube(table)
extra_df = pd.DataFrame(
columns=["City", "Price"],
data=[
("Singapore", 250.0),
],
)
with session.tables.data_transaction():
    table += ("New York", 100.0)
    table.drop(table["City"] == "Paris")
    table.load_pandas(extra_df)
table.head().sort_index()
Price
City
Berlin 150.0
London 240.0
New York 100.0
Singapore 250.0
Similarly, in Atoti Python API 0.9.0, we no longer have to perform independent Measures.update() operations for related measure changes.
For example, if we execute the following code, we get a KeyError.
import atoti as tt
session = tt.Session()
fact_table = session.create_table(
"Base",
keys={"ID"},
types={"ID": "String", "Quantity": "int"},
)
fact_table += ("123xyz", 1)
cube = session.create_cube(fact_table)
h, l, m = cube.hierarchies, cube.levels, cube.measures
m.update({"Quantity - 1": m["Quantity.SUM"] - 1, "Quantity + 1": m["Quantity.SUM"] + 1, "Quantity + 2": m["Quantity + 1"] + 1})
...
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
---> 13 m.update({"Quantity - 1": m["Quantity.SUM"] - 1, "Quantity + 1": m["Quantity.SUM"] + 1, "Quantity + 2": m["Quantity + 1"] + 1})
This is because the dictionary passed to update() is evaluated in full before any measure is created, so m["Quantity + 1"] does not yet exist when "Quantity + 2" refers to it, even though we’ve specified it in our chain of operations.
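The same behavior can be reproduced with a plain Python dict, which makes the cause easier to see. This is an analogy only: the measures variable below is an ordinary dict, not an Atoti measures object.

```python
measures = {"Quantity.SUM": 10}

try:
    # The dict literal is fully evaluated before update() is called,
    # so measures["Quantity + 1"] is looked up while it does not exist yet.
    measures.update(
        {
            "Quantity + 1": measures["Quantity.SUM"] + 1,
            "Quantity + 2": measures["Quantity + 1"] + 1,
        }
    )
except KeyError as error:
    missing_key = error.args[0]
    print(f"KeyError: {missing_key!r}")

# Assigning one entry at a time makes each result visible to the next statement.
measures["Quantity + 1"] = measures["Quantity.SUM"] + 1
measures["Quantity + 2"] = measures["Quantity + 1"] + 1
print(measures["Quantity + 2"])
```

Because the failing update() never runs, the dict is left unchanged by the first attempt.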
In Atoti 0.9.0, we can use Session.data_model_transaction() to batch measure operations together and allow for intermediary steps to be visible to subsequent statements within a transaction.
with session.data_model_transaction():
    m["Quantity - 1"] = m["Quantity.SUM"] - 1
    m["Quantity + 1"] = m["Quantity.SUM"] + 1
    m["Quantity + 2"] = m["Quantity + 1"] + 1
Tables
Let’s now walk through some of the minor changes affecting tables in Atoti Python API 0.9.0.
Table keys
Prior to 0.9.0, we used Table.keys to return a list of table keys.
import atoti as tt
table = session.create_table(
"Example",
keys=["Country", "City"],
types={
"Country": "String",
"City": "String",
"Year": "int",
"Population": "int",
},
)
table.keys
['Country', 'City']
In Atoti Python API 0.9.0, Table.keys now returns a tuple, signifying that keys can no longer be changed once the table exists.
import atoti as tt
table = session.create_table(
"Example",
keys=["Country", "City"],
types={
"Country": "String",
"City": "String",
"Year": "int",
"Population": "int",
},
)
table.keys
('Country', 'City')
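Existing code that treated the keys as a list may need small adjustments. The sketch below uses a plain tuple as a stand-in for the value returned by Table.keys, so it runs without a session.

```python
# Stand-in for the value returned by `table.keys` in 0.9.0.
keys = ("Country", "City")

try:
    keys.append("Year")  # worked on a list, fails on a tuple
except AttributeError as error:
    append_error = str(error)
    print(append_error)

# A tuple never compares equal to a list, so old equality checks need updating.
same_as_list = keys == ["Country", "City"]
same_as_tuple = keys == ("Country", "City")
print(same_as_list, same_as_tuple)

# Code that needs a list can build a new one from the tuple.
columns_to_group_by = [*keys, "Year"]
print(columns_to_group_by)
```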
Table row and column counts
Previously, we used the len() function on a Table or ExternalTable to count its rows, and the len() function on Table.columns to count its columns.
print(f"# rows: {len(product_tbl)}, # columns: {len(product_tbl.columns)}")
However, in 0.9.0, we’ve introduced Table.row_count for the count of rows, and deprecated Table.columns. For column counts, we now call the list() function on the table to get its column names, then use the len() function on that list.
print(f"# rows: {product_tbl.row_count}, # columns: {len(list(product_tbl))}")
In addition, to iterate over table columns in 0.9.0, we loop over the table itself (rather than over the output of Table.columns).
for column_name in product_tbl:
    print(column_name)
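The row and column counting pattern can be wrapped in a small helper. The sketch below is runnable without Atoti: FakeTable is a hypothetical stand-in that only mimics the two behaviors described above (iterating yields column names, row_count gives the number of rows), and describe_shape is an illustrative name.

```python
class FakeTable:
    """Stand-in for an Atoti table: iterating yields column names, row_count is the number of rows."""

    def __init__(self, column_names, row_count):
        self._column_names = list(column_names)
        self.row_count = row_count

    def __iter__(self):
        return iter(self._column_names)


def describe_shape(table) -> str:
    """Format row and column counts using the 0.9.0 style of access."""
    column_names = list(table)
    return f"# rows: {table.row_count}, # columns: {len(column_names)}"


product_tbl = FakeTable(["Product", "Category", "Price"], row_count=42)
shape = describe_shape(product_tbl)
print(shape)
```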
Hierarchies
Let’s take a look at how 0.9.0 code changes affect hierarchies.
Levels
Previously, we used the list() function in conjunction with Hierarchy.levels to return the levels for a given hierarchy.
prices_df = pd.DataFrame(
columns=["Nation", "City", "Color", "Price"],
data=[
("France", "Paris", "red", 20.0),
("France", "Lyon", "blue", 15.0),
("France", "Toulouse", "green", 10.0),
("UK", "London", "red", 20.0),
("UK", "Manchester", "blue", 15.0),
],
)
table = session.read_pandas(prices_df, table_name="Prices")
cube = session.create_cube(table, mode="manual")
h = cube.hierarchies
level_names = list(h["Nation"].levels)
In Atoti Python API 0.9.0, Hierarchy.levels has been deprecated and we now use the list() function on the Hierarchy itself to return the levels for a given hierarchy.
prices_df = pd.DataFrame(
columns=["Nation", "City", "Color", "Price"],
data=[
("France", "Paris", "red", 20.0),
("France", "Lyon", "blue", 15.0),
("France", "Toulouse", "green", 10.0),
("UK", "London", "red", 20.0),
("UK", "Manchester", "blue", 15.0),
],
)
table = session.read_pandas(prices_df, table_name="Prices")
cube = session.create_cube(table, mode="manual")
h = cube.hierarchies
level_names = list(h["Nation"])
Queries
Let’s now explore the changes in Atoti Python API 0.9.0 that affect how we implement queries in Atoti.
Query explains
In Atoti Python API, we can return the execution plan for a given query, which reveals how the query is processed. This provides valuable information for performance tuning and query optimization.
Previously, we used Session.explain_mdx_query() to return the query execution plan for a query written in MDX.
from datetime import date
import pandas as pd
import atoti as tt
df = pd.DataFrame(
columns=["Country", "Date", "Price"],
data=[
("China", date(2020, 3, 3), 410.0),
("France", date(2020, 1, 1), 480.0),
("France", date(2020, 2, 2), 500.0),
("France", date(2020, 3, 3), 400.0),
("India", date(2020, 1, 1), 360.0),
("India", date(2020, 2, 2), 400.0),
("UK", date(2020, 2, 2), 960.0),
],
)
table = session.read_pandas(
df, keys=["Country", "Date"], table_name="Prices"
)
cube = session.create_cube(table)
mdx = (
"SELECT"
" NON EMPTY Hierarchize("
" DrilldownLevel("
" [Prices].[Country].[ALL].[AllMember]"
" )"
" ) ON ROWS,"
" NON EMPTY Crossjoin( "
" [Measures].[Price.SUM],"
" Hierarchize("
" DrilldownLevel("
" [Prices].[Date].[ALL].[AllMember]"
" )"
" )"
" ) ON COLUMNS"
" FROM [Prices]"
)
session.explain_mdx_query(mdx)
In Atoti Python API 0.9.0, we use Session.query_mdx() with an explain parameter to achieve the same result.
session.query_mdx(mdx, explain=True)
Similarly, we used Cube.explain_query() to return the query execution plan for a query written with the Atoti Python API syntax.
import pandas as pd
import atoti as tt
df = pd.DataFrame(
columns=["Continent", "Country", "Currency", "Year", "Month", "Price"],
data=[
("Europe", "France", "EUR", 2023, 10, 200.0),
("Europe", "Germany", "EUR", 2024, 2, 150.0),
("Europe", "United Kingdom", "GBP", 2022, 10, 120.0),
("America", "United States", "USD", 2020, 5, 240.0),
("America", "Mexico", "MXN", 2021, 3, 270.0),
],
)
session = tt.Session()
table = session.read_pandas(
df,
keys={"Continent", "Country", "Currency", "Year", "Month"},
table_name="Prices",
)
cube = session.create_cube(table)
# Shortcuts used by the query below.
l, m = cube.levels, cube.measures
cube.explain_query(
m["Price.SUM"],
levels=[l["Country"]],
filter=l["Continent"] == "Europe",
)
In Atoti Python API 0.9.0, we use Cube.query() with an explain parameter as well to achieve the same result.
cube.query(
m["Price.SUM"],
levels=[l["Country"]],
filter=l["Continent"] == "Europe",
explain=True
)
Querying existing Atoti sessions
Some of us use Atoti Python API to connect to and read data from another existing Atoti cube.
In previous versions of Atoti Python API, we connected to an existing Atoti session and queried it by importing QuerySession from the atoti-query package.
from atoti_query import QuerySession
existing_session = QuerySession(url)
existing_session.query_mdx(...)
However, as of Atoti Python API 0.9.0, the atoti-query package has been deprecated and we now use Session.connect() to connect to and query an existing Atoti session instead.
import atoti as tt
existing_session = tt.Session.connect(url)
existing_session.query_mdx(...)
💡Note: We can also use the slimmer atoti-client package for projects that exclusively use the Atoti Python API to connect to an existing session. This significantly reduces the size of installed dependencies.
Plugins (optional)
Finally, we look at how plugins have been affected in 0.9.0.
DirectQuery configurations
Atoti Python API supports using DirectQuery to connect to and query external databases as external tables in Atoti without having to load data in-memory. This is useful when working with a mix of hot and cold data, where in-memory performance is not needed for cold or infrequently used data.
💡Note: Read our documentation for more information on Atoti DirectQuery.
Previously, when using the Atoti DirectQuery feature, we would need to import the respective ConnectionInfo and TableOptions classes from our DirectQuery plugin packages.
For example, when we installed the atoti-directquery-clickhouse package, we imported the following classes, made the connection to the external database, and set up an external table in Atoti.
import os
import atoti as tt
from atoti_directquery_clickhouse import (
    ClickhouseConnectionInfo,
    ClickhouseTableOptions,
)
session = tt.Session()
# Connection details are read from environment variables.
connection_info = ClickhouseConnectionInfo(
    f"clickhouse:https://{os.environ['CLICKHOUSE_USER']}@{os.environ['CLICKHOUSE_HOST']}:{os.environ['CLICKHOUSE_PORT']}",
    password=os.environ["CLICKHOUSE_PASSWORD"],
)
db = session.connect_to_external_database(connection_info)
db.cache = True
trades_atoti = session.add_external_table(
    db.tables["TRADE_PNLS"],
    table_name="Trade PnL",
    options=ClickhouseTableOptions(keys=["AsOfDate", "TradeId", "BookId", "DataSet"]),
)
In 0.9.0, we import and use ConnectionConfig and TableConfig, respectively. In addition, we’ll need to update the following:
- Explicitly use the ConnectionConfig.url parameter to specify our URL.
- Move the DirectQuery connection cache property to ConnectionConfig as a parameter.
- Rename the options input parameter for Session.add_external_table() to config.
import os
import atoti as tt
from atoti_directquery_clickhouse import (
    ConnectionConfig,
    TableConfig,
)
session = tt.Session.start()
# The URL is now passed explicitly and cache is a ConnectionConfig parameter.
connection_info = ConnectionConfig(
    url=f"clickhouse:https://{os.environ['CLICKHOUSE_USER']}@{os.environ['CLICKHOUSE_HOST']}:{os.environ['CLICKHOUSE_PORT']}",
    password=os.environ["CLICKHOUSE_PASSWORD"],
    cache=True,
)
db = session.connect_to_external_database(connection_info)
trades_atoti = session.add_external_table(
    db.tables["TRADE_PNLS"],
    table_name="Trade PnL",
    config=TableConfig(keys=["AsOfDate", "TradeId", "BookId", "DataSet"]),
)
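Since the connection URL is assembled from several environment variables, it can help to validate them up front. The sketch below is one way to do that using only the standard library; clickhouse_settings and REQUIRED_VARIABLES are hypothetical names, and the URL format simply mirrors the one used in the example above.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARIABLES = (
    "CLICKHOUSE_USER",
    "CLICKHOUSE_HOST",
    "CLICKHOUSE_PORT",
    "CLICKHOUSE_PASSWORD",
)


def clickhouse_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the url and password values, failing early if a variable is missing or empty."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    url = (
        f"clickhouse:https://{env['CLICKHOUSE_USER']}"
        f"@{env['CLICKHOUSE_HOST']}:{env['CLICKHOUSE_PORT']}"
    )
    return {"url": url, "password": env["CLICKHOUSE_PASSWORD"]}


# Example with made-up values instead of the real environment.
example_env = {
    "CLICKHOUSE_USER": "analyst",
    "CLICKHOUSE_HOST": "db.example.com",
    "CLICKHOUSE_PORT": "8443",
    "CLICKHOUSE_PASSWORD": "secret",
}
settings = clickhouse_settings(example_env)
print(settings["url"])
```

The returned dictionary could then be unpacked into the configuration, for example ConnectionConfig(**clickhouse_settings(), cache=True).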
Lastly, a few more minor changes:
- When using the atoti_directquery_redshift plugin, we can now use the newly added atoti_directquery_redshift.ConnectionConfig.connection_pool_size. This configuration sets the maximum size of the connection pool when connecting to Redshift using Atoti DirectQuery.
- When using the atoti_directquery_databricks plugin, atoti_directquery_databricks.DatabricksConnectionInfo.heavy_load_url has been renamed to atoti_directquery_databricks.ConnectionConfig.feeding_url.
Cloud storage client-side encryption
Atoti Python API can connect to and load data from cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage. For client-side encrypted data existing in cloud storage, Atoti also supports client-side encryption to decrypt data loaded into Atoti tables.
💡Note: Learn more about client-side encryption.
Previously, we configured client-side encryption using the atoti_aws and atoti_azure plugins by importing the respective classes.
import atoti
from atoti_aws import AwsKeyPair, AwsKmsConfig
from atoti_azure import AzureKeyPair
client_side_encryption_aws_keypair = AwsKeyPair(
"public_key",
"private_key",
region="eu-west-3",
)
client_side_encryption_aws_kms = AwsKmsConfig(
region="eu-west-3",
key_id="key_id",
)
client_side_encryption_azure_keypair = AzureKeyPair(
"public_key",
"private_key",
key_id="key_id"
)
In Atoti Python API 0.9.0, these classes have been renamed.
- atoti_aws.AwsKeyPair → atoti_aws.KeyPair
- atoti_aws.AwsKmsConfig → atoti_aws.KmsConfig
- atoti_azure.AzureKeyPair → atoti_azure.KeyPair
We now need to import the renamed classes.
import atoti
import atoti_aws
import atoti_azure
client_side_encryption_aws_keypair = atoti_aws.KeyPair(
"public_key",
"private_key",
region="eu-west-3",
)
client_side_encryption_aws_kms = atoti_aws.KmsConfig(
    region="eu-west-3",
    key_id="key_id",
)
client_side_encryption_azure_keypair = atoti_azure.KeyPair(
"public_key",
"private_key",
key_id="key_id"
)
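To locate code that still uses the older names covered in this guide, a simple text scan can be a useful first pass. The sketch below is a hypothetical helper, not an Atoti tool: the patterns are taken from the renames described in this guide, they assume atoti is imported as tt, and a regex scan like this can produce false positives or miss unusual formatting.

```python
import re
from typing import List, Tuple

# Legacy pattern -> suggested 0.9.0 replacement, taken from the renames in this guide.
LEGACY_PATTERNS = {
    r"\btt\.Session\(": "tt.Session.start(...)",
    r"\.start_transaction\(": "session.tables.data_transaction()",
    r"\bbase_table\s*=": "fact_table=",
    r"\.explain_mdx_query\(": "session.query_mdx(..., explain=True)",
    r"\.explain_query\(": "cube.query(..., explain=True)",
    r"\batoti_query\b": "tt.Session.connect(url)",
    r"\bAwsKeyPair\b": "atoti_aws.KeyPair",
    r"\bAwsKmsConfig\b": "atoti_aws.KmsConfig",
    r"\bAzureKeyPair\b": "atoti_azure.KeyPair",
}


def find_legacy_usages(source: str) -> List[Tuple[int, str]]:
    """Return (line number, suggested replacement) pairs for each legacy pattern found."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for pattern, replacement in LEGACY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((line_number, replacement))
    return findings


example = '''import atoti as tt
session = tt.Session()
cube = session.create_cube(base_table=table)
session.explain_mdx_query(mdx)
'''

for line_number, replacement in find_legacy_usages(example):
    print(f"line {line_number}: consider {replacement}")
```

The scan only reports candidates; each finding still needs a manual review against the relevant section above.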
Final Thoughts
Migrating assets can be hard, but we hope that this walkthrough helps!
If you have any questions or feedback, we’d love to hear from you. Stay tuned for more guides on Atoti Python API 0.9.0. Until then, take care! 👋