Troubleshooting Okta Privileged Access

This post looks at the tools to use when troubleshooting issues with Okta Privileged Access (OPA). It’s not an “if you see this error, go do this” article – Google is great for that! Instead, it looks at where to find diagnostic information to help troubleshoot issues.

This article is based on the older Troubleshooting Okta Advanced Server Access (ASA) post, as many of the infrastructure components, and thus the diagnosis approaches, are the same.

Revisiting the Okta Privileged Access Components and Flows

Before diving into troubleshooting, it’s worth revisiting the product components and flows so you know where to look for something breaking.

Okta Privileged Access Architecture

The Okta Privileged Access architecture is shown in the following figure.

There are multiple cloud services involved: the Okta Workforce Identity Cloud (WIC) org, the Okta Access Requests Platform tenant, and the Okta Privileged Access tenant (“team”). Okta Workflows is not included in the Okta Privileged Access license, but may be used for bespoke automation.

The bottom quarter of the figure shows the infrastructure components that are currently used in the server access use cases. These are: the Client (“sft”) that runs on the user’s workstation, the Agent (“sftd”) that runs on the servers, and optionally the Gateway (“sft-gatewayd”) that runs on a gateway server.

These infrastructure components are the same as for Okta Advanced Server Access.

Components and Flows

The flows between the components are shown in the following figure.

The three infrastructure components will only ever communicate with the Okta Privileged Access team (the team will never contact the components):

The Client will contact the team to perform authentication and authorization of the user and the command they are trying to run, request and consume policy evaluation, raise an access request or get passwords/certificates issued.
The Gateway will contact the team for enrollment and to get certs issued.
The Agent will contact the team for enrollment, certificate issue, user provisioning and writing audit events.

The Client will establish SSH or RDP sessions either directly with the server, or via the Gateway. Any gateway used will be based on the Settings in the Project for the relevant server.

The Okta Privileged Access team will interact with the other cloud services to raise access requests, consume users and groups from Okta WIC and write audit events to the Okta System Log.

Communication Protocols and Ports

The port requirements are listed in the product documentation – https://help.okta.com/en-us/content/topics/privileged-access/pam-default-ports.htm.

The following figure summarizes the protocols and ports used.

All of the cloud components communicate with each other over HTTPS on port 443. The infrastructure components also use HTTPS on port 443 when they communicate with the Okta Privileged Access team.

The gateway service (“sft-gatewayd”) is listening on port 7234. If the client is to connect to a server via the gateway (based on policy) the client will connect over SSH to the gateway on port 7234.

The agent service (“sftd”) is listening on port 4421. If a connection involves just-in-time provisioning, the client (optionally via the gateway) will connect to the agent on port 4421 to check if the account is there and trigger provisioning.

If the target server is a Microsoft Windows server, the client (optionally via the gateway) will also establish an SSH tunnel (mTLS) to the server (the agent in this case acts as the SSH daemon, listening on 4421). The RDP session is then established as a localhost connection on the server to port 3389. See also https://help.okta.com/en-us/content/topics/privileged-access/pam-windows.htm.

If the target server is a Linux server, the client will establish an SSH connection from the local (workstation) SSH client to the SSH daemon on the server. This is a standard SSH connection to port 22 on the server.

Note that the gateway and agent listening ports can be configured. Also, the arrows in the figure going to the Agent (B, C, D, E) could terminate at the server or agent depending on the flows described above.
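When a connection fails, it can help to write down which hops should exist for that session before testing them. The following is a minimal Python sketch, not an Okta tool. It encodes the default ports described above and assumes that, when a gateway is in the path, the gateway originates the onward connection to the server. The function names (expected_hops, port_open) and the hop labels are made up for illustration.

```python
import socket

# Default ports from the text above. The gateway and agent ports are
# configurable, so adjust these if your environment differs.
DEFAULT_PORTS = {"team": 443, "gateway": 7234, "agent": 4421, "ssh": 22}


def expected_hops(target_os, via_gateway=False, jit=False):
    """Return the (source, destination, port) hops to verify for one session."""
    hops = [("client", "team", DEFAULT_PORTS["team"])]
    source = "client"
    if via_gateway:
        hops.append(("client", "gateway", DEFAULT_PORTS["gateway"]))
        # Assumption: the gateway makes the onward connection to the server.
        source = "gateway"
    if target_os == "windows":
        # RDP rides inside a tunnel that terminates at the agent.
        hops.append((source, "agent", DEFAULT_PORTS["agent"]))
    elif target_os == "linux":
        if jit:
            # Just-in-time provisioning check against the agent.
            hops.append((source, "agent", DEFAULT_PORTS["agent"]))
        hops.append((source, "sshd", DEFAULT_PORTS["ssh"]))
    else:
        raise ValueError(f"unknown target_os: {target_os!r}")
    return hops


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, running port_open("gateway.example.internal", 7234) from the workstation (the hostname is a placeholder) tells you whether the gateway port is reachable at the TCP level. It says nothing about whether the service behind the port is healthy, so treat it as a first check only.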

Troubleshooting the Cloud Services

The three cloud services, the Okta WIC org, the Okta Privileged Access team, and the Okta Access Requests platform instance, are closed applications hosted by Okta.

The only logging available to users is the Okta System Log. Events from all three platforms are written here and can be used to check activity and timestamps. Note that all Okta Privileged Access events have an EventType that begins with “pam”.

Expanding an event shows key event information, including the team, project and user.

Expanding all for an event shows a wealth of information, including the DebugData, which is often very useful for determining what was going on. In the following example, you can see the policy being used, the target system, the MFA challenge info, and the access method (“sally via RDP (admin level individual account)”).

This information can be useful in debugging issues with policy configuration. Similarly, if you suspect issues with MFA or access request conditions, you should see events in the System Log corresponding to those activities.
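If you export System Log events as JSON (for example via the System Log API), the “pam” prefix makes it easy to narrow them down. The sketch below assumes each event is a dictionary with the eventType, published, displayMessage and debugContext.debugData fields used by the System Log API. The helper names are hypothetical, and this is only one way to slice the data.

```python
def pam_events(events):
    """Keep only events whose eventType starts with 'pam'."""
    return [e for e in events if str(e.get("eventType", "")).startswith("pam")]


def summarize(event):
    """Return a one-line summary: timestamp, event type and message."""
    return "{} {} {}".format(
        event.get("published", "?"),
        event.get("eventType", "?"),
        event.get("displayMessage", ""),
    )


def debug_data(event):
    """Return the DebugData dictionary for an event, or {} if it is absent."""
    return (event.get("debugContext") or {}).get("debugData") or {}
```

Printing summarize(e) for each event returned by pam_events gives a quick timeline, and debug_data(e) exposes the same detail you see when expanding an event in the console.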

Troubleshooting the Client

The infrastructure components are the client (“sft”), server agent (“sftd”) and gateway (“sft-gatewayd”). Let’s look at the client first, as it will often be where issues with connecting to a server show up.

Verifying the Client Version

With the constant rate of change of functionality, it’s not uncommon for failures to be due to an old version of the client. Before starting debugging you should check the client is at the appropriate version (preferably the latest version).

The client version can be confirmed by running the sft -version command in a command window.

larry@larrys-desktop:~$ sft -version
sft version 1.84.0

Workstation platforms like Windows and macOS may have other means to check the version (such as the ScaleFT icon on the task bar).
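If you want to compare versions across several machines, it is safer to compare them as numbers than as strings (as text, "1.9.0" sorts after "1.10.0"). Here is a small sketch that assumes the output has the form shown above, "<name> version X.Y.Z". The function names are illustrative only.

```python
import re


def parse_version(output):
    """Extract (major, minor, patch) from output such as 'sft version 1.84.0'."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def is_older(output, minimum):
    """Return True if the reported version is lower than the minimum tuple."""
    return parse_version(output) < minimum
```

The same parser should also cope with output in the same shape from the other components, such as "sft-gatewayd version 1.83.1".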

Client Logs

On a Mac or Linux system you should see logs in <user>/Library/Logs/ScaleFT/sft or equivalent (e.g. <user>/.cache/ScaleFT/logs/sft). On a Windows system you should see the logs in <user>\AppData\Local\ScaleFT\logs.

Client Command Debugging

You can also run the sft client in debug mode by setting the SFT_DEBUG environment variable before running any sft command.

On Windows workstations:

# In PowerShell
PS> C:\Users\<user>> $env:SFT_DEBUG="1"
PS> C:\Users\<user>> sft rdp <server_name>

# Windows CMD (no quotes here: CMD would store them as part of the value)
C:\Users\<user>> set SFT_DEBUG=1
C:\Users\<user>> sft rdp <server_name>

On Linux workstations:

larry@larrys-desktop:~$ SFT_DEBUG=1 sft rdp <server_name>

You can set this for any sft command (the example below is on a macOS workstation).

david.edwards@N3F4YXC99J ~ % SFT_DEBUG=1 sft list-servers

2024-10-01T14:20:57.501+1000 INFO sft command {"version": "1.83.5", "pid": 35101, "args": ["sft", "list-servers"], "log_directory": "/Users/david.edwards/Library/Logs/ScaleFT/sft"}

2024-10-01T14:20:57.519+1000 DEBUG macOS User Defaults check returned error {"error": "exit status 1"}

2024-10-01T14:20:57.743+1000 INFO RecordSpan {"tags": null, "operation": "device.cloud.info", "start": "2024-10-01T14:20:57.743+1000", "duration": "4.292µs", "traceID": 1907366058819643408, "t": "trace", "spanID": 2600321935164281505}

2024-10-01T14:20:57.775+1000 INFO RecordSpan {"t": "trace", "traceID": 1907366058819643408, "spanID": 2043889579375440998, "tags": null, "operation": "device.mac.filevault", "start": "2024-10-01T14:20:57.743+1000", "duration": "31.972416ms"}
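Debug output like this gets long quickly, so it can be worth splitting each line into its parts and filtering on the level or on a field. The sketch below assumes the layout seen in the sample above: a timestamp, a level, a free-text message, and an optional trailing JSON object. The function name is hypothetical.

```python
import json


def parse_debug_line(line):
    """Split one debug line into (timestamp, level, message, fields)."""
    timestamp, level, rest = line.strip().split(None, 2)
    brace = rest.find("{")
    if brace == -1:
        # No JSON payload on this line.
        return timestamp, level, rest.strip(), {}
    try:
        fields = json.loads(rest[brace:])
    except json.JSONDecodeError:
        # The brace was part of the message, not a JSON payload.
        return timestamp, level, rest.strip(), {}
    return timestamp, level, rest[:brace].strip(), fields
```

With the lines parsed, you could, for instance, keep only the entries where the level is DEBUG or where fields contains an "error" key.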

With SSH commands you can also pass the debug argument (-v = debug level 1, -vv = debug level 2 and -vvv = debug level 3) if you have aliased ssh to run via sft. For example:

larry@larrys-desktop:~$ ssh -v opa-demo-linux
OpenSSH_8.6p1, Ubuntu-3ubuntu0.10, OpenSSL 3.0.2 15 Mar 2022
debug1: Reading configuration data /home/larry/.ssh/config
debug1: Executing command: '/usr/local/bin/sft resolve -q opa-demo-linux'

If you need to log a ticket with Okta support, there are sft support collect and sft support submit commands. There are equivalent commands for the other infrastructure components – https://help.okta.com/en-us/content/topics/privileged-access/pam-get-support.htm.

Troubleshooting the Agent

The agent may be called the “agent”, the “server agent”, or identified as “ScaleFT Server Tools”. The process/daemon is sftd.

Verifying the Agent Version

On Windows servers you can view the Control Panel > Programs > Programs and Features window and find ScaleFT Server Tools.

On both Windows and Linux you can check the log files (see next section) and look for the last log entry with “sftd: Starting”. It will list the version.

The sftd binary also supports the -v (or --version) option.

Agent Logs

On a Windows server the log files can be found in: C:\Windows\System32\config\systemprofile\AppData\Local\scaleft\Logs\sftd

On Linux systems the native system logging mechanism is used, and it can differ depending on the platform. From a colleague: “When the agent is installed on a Linux server, it identifies if the server is running systemd (RHEL7+, etc.), and specifically the journald component of systemd. If it finds journald, the agent will use systemd-journald.service for logging. If systemd is NOT present, it will fall back to whichever syslog server is available, i.e. syslog-ng or rsyslog.”

For most Linux servers you can use the journalctl -u sftd command to see the logs. Otherwise, look in the /var/log/sftd folder.
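If you pull these logs regularly, you may prefer to script the journalctl call. The following sketch only wraps standard journalctl options (-u, --since, --no-pager, -o json). The function names are made up, and it returns None on systems without journalctl rather than guessing at a fallback location.

```python
import shutil
import subprocess


def journalctl_command(unit, since=None, as_json=False):
    """Build the journalctl argument list for one systemd unit."""
    args = ["journalctl", "--no-pager", "-u", unit]
    if since:
        args += ["--since", since]
    if as_json:
        args += ["-o", "json"]
    return args


def read_unit_logs(unit, since=None):
    """Return the unit's log text, or None if journalctl is not installed."""
    if shutil.which("journalctl") is None:
        return None
    result = subprocess.run(
        journalctl_command(unit, since),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

For example, read_unit_logs("sftd", since="1 hour ago") limits the output to the recent window around a failed connection attempt.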

You may also find useful information in system/security logs. For example, the following is from the /var/log/auth.log on an Ubuntu system and shows both sftd (user creation) and user activity via ssh.

System logs like these may help with adding context to the system logs for the sftd process. If you have a SIEM you should capture these system logs as well as the sftd process logs and correlate the events.

Setting the LogLevel for Agents

The logging level is set to info by default, but can be set to any one of warn, info or debug (increasing levels of verbosity). This is controlled by the LogLevel option in the sftd.yaml file. The location of the configuration file depends on the operating system running the server agent (it can be created if it doesn’t exist).

Linux: /etc/sft/sftd.yaml
Windows: C:\Windows\System32\config\systemprofile\AppData\Local\scaleft\sftd.yaml

You may need to restart the sftd process for the new log level setting to take effect.

On Linux use the systemctl stop sftd and systemctl start sftd commands.
On Windows use the Services function (services.msc) or the commands listed in https://help.okta.com/oie/en-us/content/topics/privileged-access/server-agent/pam-manage-client-servers.htm.

Troubleshooting the Gateway

Gateways may be used to route traffic between clients and servers, and also perform additional functions like session recording.

The gateway is software running on a Linux server and has its own process – sft-gatewayd.

root@ip-172-31-26-3:/home/ubuntu# ps -ef | grep gateway
root 363 1 0 03:45 ? 00:00:10 /usr/sbin/sft-gatewayd service
sft-gat+ 698 363 0 03:45 ? 00:00:18 /usr/sbin/sft-gatewayd proxy --log-level info

Verifying the Gateway Version

As with the other components there is a command to check the version:

root@ip-172-31-26-3:/home/ubuntu# sft-gatewayd -v
sft-gatewayd version 1.83.1

You can also look in the logs for the last “service starting” message to get the current version.

… sft-gatewayd service starting {"version": "1.83.1"}
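Finding the last “service starting” entry by eye is tedious in a long log, so here is a minimal sketch that does it for you. It assumes the line ends with a JSON object containing a "version" key, as in the sample above. The function name and the marker parameter are hypothetical.

```python
import json


def last_started_version(lines, marker="service starting"):
    """Return the version from the last matching start-up line, or None."""
    version = None
    for line in lines:
        if marker not in line:
            continue
        brace = line.find("{")
        if brace == -1:
            continue
        try:
            fields = json.loads(line[brace:])
        except json.JSONDecodeError:
            # Not the JSON payload we expected; skip this line.
            continue
        version = fields.get("version", version)
    return version
```

Feed it the lines of the gateway log (for example the output of journalctl -u sft-gatewayd split on newlines) and compare the result with what the installed binary reports.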

Gateway Logs

As with the agent on Linux, the gateway logs are viewed using the journalctl command – journalctl -u sft-gatewayd.

The logs are a rich source of information on all traffic flowing through the gateway and will often pinpoint connection errors. There may be some spurious ERROR messages relating to the config file and setup token, but you can generally get a lot of information about the user and the connection from the client to the server.

Setting the LogLevel for Gateways

The logging level is set to info by default, but can be set to any one of error, warn, info or debug (increasing level of verbosity). This is controlled by the LogLevel option in the /etc/sft/sft-gatewayd.yaml file (it can be created if it does not exist).

To restart the service, use the systemctl stop sft-gatewayd and systemctl start sft-gatewayd commands.
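If you switch log levels often, a small helper can save hand-editing the file. This sketch works on the text of the configuration file and assumes the option is written as a plain YAML key/value line of the form "LogLevel: <level>". That format is an assumption on my part, so check it against your own file first. The level lists come from the agent and gateway sections of this article, and the function name is illustrative.

```python
# Levels as listed in this article for each component.
AGENT_LEVELS = ("warn", "info", "debug")
GATEWAY_LEVELS = ("error", "warn", "info", "debug")


def set_log_level(config_text, level, allowed):
    """Return config_text with its LogLevel line set to the given level."""
    if level not in allowed:
        raise ValueError(f"level must be one of {allowed}, got {level!r}")
    new_line = f"LogLevel: {level}"
    lines = config_text.splitlines()
    for index, line in enumerate(lines):
        # Commented-out lines start with '#', so they are left untouched.
        if line.strip().startswith("LogLevel:"):
            lines[index] = new_line
            break
    else:
        # No existing setting found: append one.
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

You would read the existing file (or start from an empty string if it does not exist yet), write the result back, and then restart the service as described above. Remember to set the level back to info afterwards, as debug logging is verbose.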

What About Specific Issues?

As mentioned at the outset, this article is meant to help you diagnose issues, not list the common problems. If you encounter a problem it’s worth Googling any error messages you see.

There is a dedicated Okta Privileged Access tag in the support knowledgebase – see https://support.okta.com/help/s/topic/0TO4z000000a942GAA/okta-privileged-access?language=en_US.

Two particularly useful support articles:

Okta Privileged Access FAQs – https://support.okta.com/help/s/article/okta-privileged-access-frequently-asked-questions?language=en_US
Okta Privileged Access Troubleshooting – https://support.okta.com/help/s/article/okta-privileged-access-troubleshooting-tips?language=en_US

The suggested approach to any debugging exercise is:

Check the versions of the different components
Google the error message returned by the client
Check the gateway logs if the gateway is in the flow – this log is the most useful in debugging
If there isn’t a gateway, or it doesn’t help identify the issue, look at the agent logs

Conclusion

This article has provided a guide to troubleshooting Okta Privileged Access. It has looked at the architecture, components and flows to help you zero in on the problem area. It has explored the logging and files that can be used for troubleshooting.

Hopefully you won’t need to use the information in this article, but if you do it should help you identify the issue and continue using Okta Privileged Access. We recommend you have a look at your environment and explore the process and logs to understand how the components are working together.
