Analytics Platform configuration¶
This guide covers the configuration of the Analytics Platform (AP) software.
Default client and user account¶
AP features a multi-tenant architecture, which means at least one client (tenant) must be created in order to use the software. The first time AP is deployed and started, a default client and default user account will be inserted into the database.
Default client¶
Property | Value |
---|---|
UID | WxMvtZb9eNP |
Code | ADMIN |
Name | Admin |
The default client can be renamed; alternatively, a new client can be created and the default client removed.
Default user account¶
Property | Value |
---|---|
UID | Xk6Gfr24Rj7 |
Username | administrator |
Password | Admin_1234 |
Name | Admin |
Note
Change the password of the default user account after logging in for the first time, via the profile menu and Change password, and before making AP available on the network, as the default password is publicly known.
Data pipeline config¶
The configuration of the infrastructure/cloud provider, the data storage, and the data warehouse for AP is referred to as the data pipeline config. The data pipeline config can be specified using the API or the web UI.
Environments¶
The following infrastructure environments are supported.
Infrastructure | Data storage | Data warehouse |
---|---|---|
AWS | Amazon S3 | Amazon Redshift |
AWS | Amazon S3 | ClickHouse |
Azure | Azure Blob Storage | SQL Database |
Azure | Azure Blob Storage | Synapse |
On-prem | Local filesystem | ClickHouse |
On-prem | Local filesystem | SQL Server |
On-prem | Local filesystem | PostgreSQL |
Configuration of the AWS and Azure cloud environments can be done in several ways and is well covered in online guides. It is hence considered outside the scope of this guide, which addresses on-premise deployments using the local filesystem as the blob store (data storage) and ClickHouse or PostgreSQL as the data warehouse.
UID¶
The UID format specification is as follows:
- Is exactly 11 characters long
- Contains only uppercase letters, lowercase letters and digits
- Starts with a letter
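For reference, a UID can be validated with a short regular expression. Below is a minimal sketch in Python; the function name is illustrative and not part of AP.

```python
import re

# The AP UID format: exactly 11 characters, only letters and digits,
# starting with a letter.
UID_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9]{10}$")

def is_valid_uid(value: str) -> bool:
    """Return True if the value conforms to the AP UID format."""
    return UID_PATTERN.fullmatch(value) is not None

# Examples using UIDs from this guide.
assert is_valid_uid("WxMvtZb9eNP")      # default client UID
assert is_valid_uid("Xk6Gfr24Rj7")      # default user account UID
assert not is_valid_uid("9xMvtZb9eNP")  # invalid: must start with a letter
```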
Properties¶
The following properties make up the data pipeline config. The `blobStoreConfig` and `dataWarehouseConfig` objects are required. The `publicHostname` property is optional and can be used in situations where AP should connect to the data warehouse using a local IP address, while external clients should connect using a public hostname. The `supersetConfig` object is optional and refers to an integrated instance of Apache Superset.
| Property | Description | Value |
|---|---|---|
| client | Client identifier | UID |
| provider | Infrastructure provider, i.e. cloud provider/on-premise environment and data warehouse platform | `AWS_REDSHIFT`, `S3_CLICKHOUSE`, `AZURE_SQL_SERVER`, `AZURE_SYNAPSE`, `LOCAL_CLICKHOUSE`, `LOCAL_SQL_SERVER`, `LOCAL_POSTGRESQL` |
| blobStoreConfig | Configuration for blob store, i.e. data storage environment | Object |
| identity | AWS: Access key. Azure: Storage account name. On-prem: NA. | String |
| credential | AWS: Secret key. Azure: Storage account key. On-prem: NA. | String |
| container | AWS: Bucket name. Azure: Container name. On-prem: Root directory name. | String |
| account | Azure: Storage account name. AWS and on-prem: NA. | String |
| dataWarehouseConfig | Data warehouse configuration | Object |
| hostname | Data warehouse hostname | String |
| publicHostname | Data warehouse public hostname (optional) | String |
| database | Database name | String |
| username | Data warehouse admin account username | String |
| password | Data warehouse admin account password | String |
| iamRoleArn | AWS: Redshift IAM role ARN. Azure and on-prem: NA. | String |
| supersetConfig | Apache Superset configuration (optional) | Object |
| url | Superset domain name | String |
| username | Superset username | String |
| password | Superset password | String |
| databaseId | Superset AP database identifier | Integer |
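To illustrate how the provider-specific fields apply, below is a hypothetical payload sketch for the `AWS_REDSHIFT` provider, expressed as a Python dict. All values (access key, bucket name, hostname, IAM role ARN) are illustrative assumptions, not values shipped with AP.

```python
# Hypothetical AWS_REDSHIFT data pipeline config; every value below is an
# illustrative assumption (see the property table above for field semantics).
aws_redshift_config = {
    "client": "TKcmL3RbA3I",                 # client UID
    "provider": "AWS_REDSHIFT",
    "blobStoreConfig": {
        "identity": "AKIAXXXXXXXXXXXXXXXX",  # AWS access key (placeholder)
        "credential": "{secret}",            # AWS secret key
        "container": "my-ap-bucket",         # S3 bucket name (assumed)
    },
    "dataWarehouseConfig": {
        "hostname": "my-cluster.abc123.eu-west-1.redshift.amazonaws.com",  # assumed
        "database": "baoanalytics",
        "username": "baoanalytics",
        "password": "{secret}",
        "iamRoleArn": "arn:aws:iam::123456789012:role/ap-redshift",  # assumed
    },
}
```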
API configuration¶
The data pipeline config can be configured using the API. The configuration payload in JSON format, using the local filesystem for data storage and ClickHouse as the data warehouse, can be defined as below. The value `na` refers to "not applicable". Under `blobStoreConfig`, the `container` value `bao-ap-client-main` refers to a directory below the data storage root directory on the local filesystem.
```json
{
  "client": "TKcmL3RbA3I",
  "provider": "LOCAL_CLICKHOUSE",
  "blobStoreConfig": {
    "identity": "na",
    "credential": "na",
    "container": "bao-ap-client-main"
  },
  "dataWarehouseConfig": {
    "hostname": "127.0.0.1",
    "publicHostname": null,
    "database": "baoanalytics",
    "username": "baoanalytics",
    "password": "{secret}"
  },
  "supersetConfig": {
    "url": "https://superset.mydomain.org",
    "username": "admin",
    "password": "{secret}",
    "databaseId": 1
  }
}
```
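A minimal sketch of submitting this payload with Python and the requests library is shown below. The endpoint path `/api/dataPipelineConfig` and the basic-authentication credentials are assumptions for illustration; consult the AP API reference for the actual endpoint and authentication scheme.

```python
import requests

# Base URL and endpoint path are assumptions for illustration; substitute
# the actual AP API endpoint for your deployment.
AP_BASE_URL = "https://ap.mydomain.org"

payload = {
    "client": "TKcmL3RbA3I",
    "provider": "LOCAL_CLICKHOUSE",
    "blobStoreConfig": {
        "identity": "na",
        "credential": "na",
        "container": "bao-ap-client-main",
    },
    "dataWarehouseConfig": {
        "hostname": "127.0.0.1",
        "publicHostname": None,  # serialized as JSON null
        "database": "baoanalytics",
        "username": "baoanalytics",
        "password": "{secret}",
    },
}

response = requests.post(
    f"{AP_BASE_URL}/api/dataPipelineConfig",  # assumed path
    json=payload,
    auth=("administrator", "{secret}"),       # basic auth is an assumption
)
response.raise_for_status()  # raise on HTTP error status codes
print(response.status_code)
```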
Web UI configuration¶
The data pipeline config can be configured using the web UI. The steps below assume you are logged in as the relevant client.
- From the top-right context menu, click Clients.
- Select the relevant client name.
- Click Manage data pipeline config.
- Click Update.
- Enter the required values using the information above.
- Click Update.
Example values for the configuration form, using the local server filesystem as data storage and ClickHouse as the data warehouse, are shown below.
Field | Value |
---|---|
Client | Prefilled |
Provider | Local - ClickHouse |
Blob store config | |
Identity | na |
Credential | na |
Container | bao-ap-client-main |
Account | |
Data warehouse config | |
Hostname | 127.0.0.1 |
Public hostname | |
Database | baoanalytics |
Username | baoanalytics |
Password | {secret} |
IAM role ARN | |
Superset config | |
URL | https://superset.mydomain.org |
Username | admin |
Password | {secret} |
Database ID | 1 |
Connection test¶
In the web UI, the data pipeline config page offers testing of the blob store, data warehouse, and Apache Superset connections.
- To test the blob store connection, click Test blob store connection.
- To test the data warehouse connection, click Test data warehouse connection.
- To test the Apache Superset connection, click Test Superset connection.
A window will open and indicate the test outcome.
Initialize data warehouse¶
The supported data warehouses may require initial setup.
- To initialize the data warehouse, click Initialize data warehouse.
Running the initialization more than once will cause the operation to fail, but will not leave the data warehouse in an invalid state.