Middleware installation¶
This guide covers installation of the middleware required by the Analytics Platform (AP). It assumes that Ubuntu Linux 22.04 LTS is used as the operating system and that the reader has some familiarity with Linux and the terminal. The text editor used is nano.
Please consider the following:
- There are many approaches to hosting a Java-based application such as AP. This guide outlines one of them.
- Topics including security hardening and backup strategy are important but beyond the scope of this guide.
- Managed cloud offerings may be available for several of the middleware components. This guide focuses on the on-premises installation scenario.
OpenJDK 17¶
Start by updating the operating system packages.
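A minimal sketch of this step, assuming the `apt-get` front end (plain `apt` works equally well):

```shell
# Refresh the package index, then upgrade the installed packages
sudo apt-get update
sudo apt-get -y upgrade
```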
Install OpenJDK version 17.
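On Ubuntu 22.04 the JDK is available from the standard repositories. The package name below is an assumption; the headless variant is usually sufficient on a server without a desktop environment.

```shell
sudo apt-get -y install openjdk-17-jdk-headless

# Confirm that version 17 is the active Java runtime
java -version
```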
PostgreSQL 14¶
Install PostgreSQL version 14. Note that later versions of PostgreSQL are supported. The installation of PostgreSQL is well covered in online installation guides.
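Version 14 is the release shipped in the standard Ubuntu 22.04 repositories, so a plain package install is likely enough. The package name is an assumption about your setup.

```shell
sudo apt-get -y install postgresql-14
```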
The PostgreSQL service is enabled on boot by default after installation. Verify the status of the PostgreSQL process.
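The status can be checked through systemd. The second command is optional and assumes the `pg_lsclusters` tool that comes with Ubuntu's PostgreSQL packaging.

```shell
sudo systemctl status postgresql

# Optional: list the installed clusters with their ports and state
pg_lsclusters
```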
Set the PostgreSQL authentication method to md5.
Make sure the authentication method is set to md5 for localhost connections, typically by modifying the last two lines.
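As a sketch, assuming the default Ubuntu location of the host-based authentication file for a version 14 cluster named `main`:

```shell
# Open the host-based authentication file (path assumes the default 14/main cluster)
sudo nano /etc/postgresql/14/main/pg_hba.conf

# After editing, the localhost entries should look similar to:
#   host    all    all    127.0.0.1/32    md5
#   host    all    all    ::1/128         md5

# Print the active host entries to check the result
sudo grep -E '^host' /etc/postgresql/14/main/pg_hba.conf
```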
Adjust performance settings by creating a new configuration file.
# PostgreSQL performance settings
max_connections = 100
shared_buffers = 768MB
work_mem = 16MB
maintenance_work_mem = 256MB
temp_buffers = 16MB
effective_cache_size = 2GB
checkpoint_completion_target = 0.8
wal_writer_delay = 1s
random_page_cost = 1.1
max_locks_per_transaction = 1024
track_activity_query_size = 8192
Set owner and permissions for the configuration file, and move it to the PostgreSQL configuration directory.
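One way to carry out this step, assuming the settings were saved as `postgresql-ap.conf` (a hypothetical file name) in the current directory, and that the cluster reads the `conf.d` include directory, which Ubuntu's packaging normally enables:

```shell
CONF_FILE="postgresql-ap.conf"   # hypothetical file name

sudo chown postgres:postgres "$CONF_FILE"
sudo chmod 644 "$CONF_FILE"

# Path assumes the default 14/main cluster
sudo mv "$CONF_FILE" /etc/postgresql/14/main/conf.d/
```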
Restart PostgreSQL to have changes take effect.
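The restart goes through systemd. The optional query afterwards is one way to confirm that a value from the new file is active.

```shell
sudo systemctl restart postgresql

# Optional: check that a setting from the new file was picked up
sudo -u postgres psql -c "SHOW shared_buffers;"
```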
nginx¶
Install nginx.
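Assuming the nginx package from the standard Ubuntu repositories:

```shell
sudo apt-get -y install nginx
```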
The nginx service is enabled on boot by default after installation. Verify the status of the nginx process.
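A status check through systemd could look like this:

```shell
sudo systemctl status nginx
```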
Configure a proxy cache inside the http element of the nginx config.
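A sketch of what the cache definition could look like. The zone name `ap` matches the `proxy_cache ap;` directives used in the server configuration; the cache path, sizes and expiry time are illustrative assumptions.

```shell
# Open the main nginx configuration file (default Ubuntu path)
sudo nano /etc/nginx/nginx.conf

# Inside the http { ... } block, define a cache zone named "ap", for example:
#   proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=ap:10m
#                    max_size=1g inactive=60m use_temp_path=off;
```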
Configure nginx by creating a file analytics-platform.conf and placing it in the nginx sites-available directory.
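The file can be created in place with the editor. The path assumes Ubuntu's default nginx layout.

```shell
sudo nano /etc/nginx/sites-available/analytics-platform.conf
```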
Configure nginx with SSL and static web app UI served from Amazon S3.
- SSL and certificate configuration are left out, and should be configured appropriately.
- The apigateway, web and identity services are defined as upstreams and referred to later in the config.
- The manager and user web apps are served from Amazon S3.
- Additional security hardening may be appropriate in a production environment.
- Update server_name from ap.mydomain.org to match your environment.
# Upstream
upstream apigateway {
    server 127.0.0.1:8085;
}
upstream web {
    server 127.0.0.1:8081;
}
upstream identity {
    server 127.0.0.1:8086;
}

# Redirect HTTP to HTTPS
server {
    listen [::]:80;
    listen 80;
    server_name ap.mydomain.org;
    return 301 https://$host$request_uri;
}

# HTTPS server
server {
    listen [::]:443 ssl;
    listen 443 ssl;
    server_name ap.mydomain.org;

    # Compression
    gzip on;
    gzip_types application/json application/javascript text/javascript text/css text/plain;

    # Includes for the default hostname
    include default.d/*.conf;

    # Includes for the default hostname under HTTPS
    include default.d/*-https.inc;

    # https://developer.mozilla.org/en-US/docs/Web/HTTP/X-Frame-Options
    add_header X-Frame-Options DENY;

    # https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
    add_header Content-Security-Policy "frame-ancestors 'none';";

    # Enable Strict Transport Security (HSTS) for https
    add_header Strict-Transport-Security "max-age=31536000" always;

    # Root URL redirect to the manager web app (login page)
    location = / {
        return 301 https://$host/manager/;
    }

    # Proxy settings
    proxy_set_header host $http_host;
    proxy_set_header x-forwarded-host $host;
    proxy_set_header x-real-ip $remote_addr;
    proxy_set_header x-forwarded-for $proxy_add_x_forwarded_for;
    proxy_set_header x-forwarded-proto $scheme;
    proxy_set_header x-forwarded-port $server_port;
    proxy_buffer_size 128k;
    proxy_buffers 8 128k;
    proxy_busy_buffers_size 256k;

    # Proxy forwards

    # Login check and logout
    location /login_check {
        proxy_pass http://identity/login_check;
    }
    location /session_logout {
        proxy_pass http://identity/session_logout;
    }

    # App
    location ~* ^/(app|doc|node_modules) {
        rewrite ^/(.*) /$1 break;
        proxy_pass http://web;
    }

    # Manager web app to Amazon S3
    location /manager {
        proxy_intercept_errors on;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_hide_header x-amz-id-2;
        proxy_hide_header x-amz-request-id;
        proxy_pass http://bao-cloud-manager-prod.s3-website-us-east-1.amazonaws.com/manager;
        proxy_cache ap;
    }

    # User web app to Amazon S3
    location /users {
        proxy_intercept_errors on;
        proxy_set_header x-real-ip $remote_addr;
        proxy_set_header x-forwarded-for $proxy_add_x_forwarded_for;
        proxy_hide_header x-amz-id-2;
        proxy_hide_header x-amz-request-id;
        proxy_pass http://bao-cloud-manager-prod.s3-website-us-east-1.amazonaws.com/users;
        proxy_cache ap;
    }

    # Increased max upload size and timeout for file upload API endpoints
    location /api/dataPipelines {
        proxy_pass http://apigateway/api/dataPipelines;
        client_max_body_size 2048M;
        proxy_read_timeout 600;
        proxy_connect_timeout 600;
        proxy_send_timeout 600;
    }

    # API requests to API gateway service
    location /api {
        proxy_pass http://apigateway/api;
    }
}
Enable the server configuration by creating a symlink to the nginx sites-enabled directory.
sudo ln -s /etc/nginx/sites-available/analytics-platform.conf \
/etc/nginx/sites-enabled/analytics-platform.conf
Remove the default server configuration file.
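On Ubuntu the default site is normally enabled through a symlink in `sites-enabled`. Assuming that layout, removing the link is enough:

```shell
sudo rm /etc/nginx/sites-enabled/default
```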
Restart nginx to make changes take effect.
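It can be worth validating the configuration first, so that a syntax error does not take the server down on restart:

```shell
# Check the configuration syntax before restarting
sudo nginx -t
sudo systemctl restart nginx
```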
Redis¶
Install the Redis server.
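Assuming the `redis-server` package from the standard Ubuntu repositories:

```shell
sudo apt-get -y install redis-server
```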
The Redis service is enabled on boot by default after installation. Verify the status of the Redis process.
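A possible check, assuming the systemd unit name `redis-server` used by the Ubuntu package. The `redis-cli ping` call should answer `PONG` when the server accepts connections.

```shell
sudo systemctl status redis-server

# Optional: confirm that the server responds
redis-cli ping
```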
Edit the Redis configuration file.
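Which settings to change depends on your environment. The sketch below only opens the file at Ubuntu's default path and restarts the service so that any edits take effect.

```shell
sudo nano /etc/redis/redis.conf

# Apply the changes
sudo systemctl restart redis-server
```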
Apache Pulsar¶
Installation¶
Install Apache Pulsar using the binary distribution. First, download and extract Pulsar using wget. Alternatively, visit the Apache Pulsar downloads page. You may want to check for a later version of Apache Pulsar.
PULSAR_VER="3.3.3"
wget https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_VER}/apache-pulsar-${PULSAR_VER}-bin.tar.gz
tar xvfz apache-pulsar-${PULSAR_VER}-bin.tar.gz
mv apache-pulsar-${PULSAR_VER} apache-pulsar
Set root as owner and make binary files executable.
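A sketch of this step, run from the directory in which the archive was extracted:

```shell
sudo chown -R root:root apache-pulsar
sudo chmod +x apache-pulsar/bin/*
```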
Configuration¶
Optional: Adjust memory usage by modifying pulsar_env.sh.
Set the PULSAR_MEM variable and specify memory usage, adjusted to available server resources.
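The file lives in the `conf` directory of the Pulsar distribution. The memory values shown in the comment are purely illustrative assumptions and should be sized to your server.

```shell
sudo nano apache-pulsar/conf/pulsar_env.sh

# Example value only (heap and direct memory limits for the JVM):
#   PULSAR_MEM="-Xms1g -Xmx1g -XX:MaxDirectMemorySize=1g"
```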
Optional: Set a new port for the Pulsar HTTP server so that it does not occupy port 8080.
Set the webServicePort property to 8098.
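For standalone mode the property is expected in `conf/standalone.conf`. The `sed` call below is a non-interactive alternative to editing the file by hand, and assumes the property is already present on a line of its own.

```shell
# Replace the existing webServicePort value with 8098
sudo sed -i 's/^webServicePort=.*/webServicePort=8098/' apache-pulsar/conf/standalone.conf

# Show the resulting line
grep '^webServicePort=' apache-pulsar/conf/standalone.conf
```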
Move the directory to a suitable installation location.
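The systemd service in this guide starts Pulsar from `/var/lib/apache-pulsar`, so that location is used here:

```shell
sudo mv apache-pulsar /var/lib/apache-pulsar
```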
Create a systemd service file called apache-pulsar.service for running Pulsar in standalone mode.
[Unit]
Description = Apache Pulsar
[Service]
ExecStart = /var/lib/apache-pulsar/bin/pulsar standalone -nss
[Install]
WantedBy = multi-user.target
Note
The -nss flag is set due to Pulsar bug 5668.
Set owner and permissions for the service file.
Move the service file to the systemd directory.
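The two steps above might look as follows, assuming the service file was created in the current directory and that `/etc/systemd/system` is used for locally defined units:

```shell
sudo chown root:root apache-pulsar.service
sudo chmod 644 apache-pulsar.service
sudo mv apache-pulsar.service /etc/systemd/system/
```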
Reload the systemd daemon.
Enable Pulsar on startup.
Start Pulsar.
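Reloading, enabling and starting the service can be done with `systemctl`. The unit name follows from the service file name used in this guide.

```shell
sudo systemctl daemon-reload
sudo systemctl enable apache-pulsar
sudo systemctl start apache-pulsar
```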
Verify that the Pulsar service is running.
View the Pulsar log.
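A sketch of the status check and log view, assuming that Pulsar's output is captured by the systemd journal, which is the case for a service started as described here:

```shell
sudo systemctl status apache-pulsar

# Follow the log output; press Ctrl+C to stop
sudo journalctl -u apache-pulsar -f
```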
You should now have Pulsar running on port 6650.
Pulsar can also be run manually in the foreground, without systemd.
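The command mirrors the `ExecStart` line of the service file. Stop the systemd service first to avoid a port conflict.

```shell
sudo /var/lib/apache-pulsar/bin/pulsar standalone -nss
```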
Troubleshooting: If Apache Pulsar fails to start due to local data corruption, one solution is to stop the service, delete the local data directory and start the service again. Local data will be lost; however, Apache Pulsar topics are not persisted, and the data directory will be recreated on the next start.
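A sketch of that recovery procedure. The data path is an assumption: it presumes the standalone data directory sits inside the installation directory, so verify the location before deleting anything.

```shell
sudo systemctl stop apache-pulsar

# Assumed data location; double-check the path, the deletion is irreversible
sudo rm -rf /var/lib/apache-pulsar/data

sudo systemctl start apache-pulsar
```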
Extra¶
Shorthand notation for installing packages in standard Ubuntu repositories.
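For example, the packaged middleware from this guide could be installed with a single command. The package names are assumptions based on the standard Ubuntu 22.04 repositories.

```shell
sudo apt-get update
sudo apt-get -y install openjdk-17-jdk-headless postgresql-14 nginx redis-server
```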