Use cases¶
Overview¶
This page describes common use cases for data analytics with Analytics Platform.
Natural language queries and analytics¶
AP provides the following main capabilities for natural language analytics, also referred to as conversational analytics.
- Convo web app: Convo is a DHIS2 web app for conversational analytics that allows users to ask questions about their data in natural text. Convo interprets natural-text questions and uses DHIS2 API calls to retrieve the relevant information. See the "Convo" documentation page.
- Data browser: The data browser in AP allows users to ask questions about their data in natural text. The text-to-SQL engine converts questions into SQL queries which retrieve the relevant data.
- Script editor: The script editor supports Python and R scripting and allows users to describe, in natural text, the data analytics and data science tasks they want to perform. The text-to-script engine converts the natural text input into Python and R scripts.
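To illustrate the text-to-SQL flow described above, the sketch below shows the kind of SQL query an engine might generate for a natural-text question and how the query retrieves the answer. The table name, columns and sample data are hypothetical; in AP the SQL is produced by the text-to-SQL engine, not written by hand.

```python
import sqlite3

# Hypothetical warehouse table standing in for pipeline-loaded data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE malaria_cases (
        district TEXT,
        period   TEXT,    -- reporting year, e.g. '2024'
        cases    INTEGER
    )""")
conn.executemany(
    "INSERT INTO malaria_cases VALUES (?, ?, ?)",
    [("North", "2024", 120), ("South", "2024", 85), ("North", "2023", 90)],
)

question = "How many malaria cases were reported per district in 2024?"

# The SQL a text-to-SQL engine might generate for the question above.
generated_sql = """
    SELECT district, SUM(cases) AS total_cases
    FROM malaria_cases
    WHERE period = '2024'
    GROUP BY district
    ORDER BY district
"""
for district, total in conn.execute(generated_sql):
    print(district, total)
```

The generated query aggregates and filters exactly as the question implies, which is why the user never needs to know the underlying table structure.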
Geospatial visualization and maps¶
AP does not provide a native geospatial visualization and maps (GIS) component. The following systems are recommended for use with AP for geospatial analysis:
- DHIS2 Maps: The DHIS2 platform features a powerful, web-based thematic mapping and geospatial visualization tool. DHIS2 is integrated with AP through the DHIS2 data pipeline.
- ArcGIS: ArcGIS from Esri is a comprehensive geographic information system (GIS) platform that enables users to create, manage, analyze, and map spatial data to gain actionable insights. ArcGIS is integrated through the DHIS2 to ArcGIS Connector app.
Audit trail¶
AP loads audit records from source systems through data pipelines, provided that the source system stores audit information. The main entities in AP, including workflows, data pipelines, data quality check groups, materialized views and destinations, create and store a detailed change log for every task. For example, every time a workflow runs, whether scheduled or ad-hoc, a change log with detailed entries is stored. The change log includes the time the task started and completed. For ad-hoc runs, the change log includes information about which user started the task. This ensures auditability of the system, both for source system data changes and for operations within AP, including data loading and transformation.
Project-based configuration model¶
AP supports stand-alone deployment in on-premise server environments and is designed with a multi-tenant architecture. These features allow AP to be set up so that multiple projects or teams within the same organization can work independently of and isolated from each other. To achieve this, a separate client is created for each team. This allows each team to create data pipelines, views, scripts and workflows, as well as a dedicated data warehouse, independently of the other teams.
Unified and integrated data layer¶
AP allows for importing data from data sources into a central data warehouse through the data pipeline system. Each data pipeline creates and loads data into several data tables in a dedicated data warehouse schema. SQL views and Python and R scripts are used to join the data tables, either by specifying SQL and script code directly, or from natural text, where AP converts the natural text input to SQL and script code with the use of LLM/AI technology.
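The sketch below illustrates joining pipeline-loaded tables with a SQL view, as described above. The schema, table names (`facility`, `visit`) and data are hypothetical stand-ins for tables a data pipeline would create in a dedicated warehouse schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables loaded by a data pipeline
    CREATE TABLE facility (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visit (facility_id INTEGER, visit_date TEXT);
    INSERT INTO facility VALUES (1, 'Clinic A'), (2, 'Clinic B');
    INSERT INTO visit VALUES
        (1, '2024-05-01'), (1, '2024-05-02'), (2, '2024-05-01');

    -- A SQL view joining the pipeline tables into an analysis-ready set
    CREATE VIEW visits_per_facility AS
    SELECT f.name, COUNT(*) AS visit_count
    FROM visit v
    JOIN facility f ON f.id = v.facility_id
    GROUP BY f.name
    ORDER BY f.name;
""")
for name, count in conn.execute("SELECT * FROM visits_per_facility"):
    print(name, count)
```

Because the join logic lives in a view, downstream tools and scripts query a single unified dataset rather than the individual pipeline tables.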
Data capture¶
AP does not provide a built-in data capture app or module. Instead, AP provides no-code, easily configured connectors (data pipelines) for a variety of data capture solutions. The following software systems are integrated and recommended for data capture:
- DHIS2: A highly flexible and configurable information system with support for capture of aggregate data, survey and event data, and individual data. Supports desktop web clients and Android mobile clients. Data is ingested into AP using the DHIS2 data pipeline.
- ODK: Flexible and easy-to-configure software for survey data collection through an Android mobile app. Data is ingested into AP using the ODK data pipeline.
- CSV: Data can be collected in CSV and Excel files. Data is ingested into AP using the CSV file upload data pipeline.
Data governance¶
AP supports the key aspects of data governance through its extensive feature set.
- Data quality: The data quality checks, SQL views, scripting environment, workflows and notification system in AP are useful for ensuring that data is accurate, complete, and consistent.
- Data security and privacy: The multi-dimensional sharing and access control system, the secure-by-default approach and compliance features are useful for managing data security and privacy.
- Role based access control: User groups in AP allow for grouping users by their role within the organization, allowing access and permissions to be granted at the role level.
- Metadata management: The data catalog in AP allows for creating a detailed inventory of data sources and datasets through its rich metadata model.
- Auditing: All main entities in AP, including data pipelines, data quality checks, views and workflows, provide a change log with detailed entries for every task, ensuring that all activity is audited.
- Data lifecycle management: In addition to data loading, AP supports data archiving and data deletion, allowing for managing the lifecycle of datasets and data records.
Role based access control (RBAC)¶
Role-Based Access Control (RBAC) is a security model that restricts system access and permissions based on roles within an organization, rather than individual user identities. It simplifies administration by granting permissions to roles, such as data owner, data steward, data analyst and data viewer, rather than directly to individual users. AP supports user groups: users can be assigned to groups depending on their role in the organization, after which user groups can be granted access to datasets, workflows, views and other entities and objects.
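The group-based permission model above can be sketched as follows. The group names, permission sets and check function are hypothetical; AP's actual sharing and access control model is configured through its own interface, not through code.

```python
# Hypothetical role-to-permission mapping, following the RBAC model:
# permissions are granted to roles (groups), never to individual users.
GROUP_PERMISSIONS = {
    "data_owner":   {"read", "write", "share", "delete"},
    "data_steward": {"read", "write"},
    "data_analyst": {"read"},
}

# Users are assigned to groups depending on their role in the organization.
USER_GROUPS = {
    "alice": ["data_owner"],
    "bob":   ["data_analyst"],
}

def has_permission(user: str, permission: str) -> bool:
    """Grant a permission if any of the user's groups carries it."""
    return any(
        permission in GROUP_PERMISSIONS.get(group, set())
        for group in USER_GROUPS.get(user, [])
    )

print(has_permission("alice", "write"))  # True: alice is a data owner
print(has_permission("bob", "write"))    # False: analysts are read-only
```

Administration stays simple because changing what analysts may do means editing one group entry, not every analyst's account.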
Data archiving and removal¶
AP supports archiving data from data sources. A "frozen" data pipeline can be used to ingest data from a source system and then prevent the data pipeline from subsequently being refreshed or deleted. This allows for archival of historical datasets, meaning the historical data can be deleted from the source system and instead be accessed with a data exploration or BI tool in AP. Data can be permanently removed based on policies and time thresholds. This is achieved through SQL statements with filters, combined with workflows. Automated data lifecycle management can be enforced through scheduled workflows running at a fixed interval. The data archiving and removal operations are audited through the workflow changelog system.
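Policy-based removal with a time threshold, as described above, can be sketched as a filtered SQL delete. The table, column names and retention period are hypothetical; in AP such a statement would run inside a scheduled workflow, and the run would be recorded in the workflow changelog.

```python
import sqlite3
from datetime import date, timedelta

# Example retention policy: keep five years of records.
RETENTION_DAYS = 365 * 5

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER, event_date TEXT)")
conn.executemany("INSERT INTO event VALUES (?, ?)", [
    (1, "2010-01-15"),            # older than the retention threshold
    (2, date.today().isoformat()) # within the retention window
])

# Delete everything older than the threshold (ISO dates compare lexically).
threshold = (date.today() - timedelta(days=RETENTION_DAYS)).isoformat()
cur = conn.execute("DELETE FROM event WHERE event_date < ?", (threshold,))
conn.commit()
print(cur.rowcount, "record(s) removed")
```

Scheduling this statement at a fixed interval gives the automated lifecycle management the section describes, with each run leaving an auditable log entry.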