Game Health dashboards
Only Private Cloud customers can access the Game Health dashboards in Grafana Cloud.
Overview
The AccelByte Gaming Services (AGS) Game Health dashboards in Grafana Cloud offer a clear view of important metrics that affect game performance. These dashboards gather data from various services, providing real-time insights into key areas such as concurrent users (CCU), server health, and matchmaking activity.
This unified solution helps game developers monitor and optimize performance, ensuring smooth gameplay and a great player experience. The dashboard features tailored views for different services, including IAM, Lobby, and Matchmaking, giving a comprehensive overview of game health.
To learn how to access and view Grafana Cloud dashboards, refer to the Access Grafana Cloud and View dashboards pages.
Game Health dashboards
The Game Health dashboards provided by the AGS and Grafana Cloud integration are as follows:
- Game Health Overview Dashboard
- IAM Service Dashboard
- Lobby Service Dashboard
- Matchmaking Service Dashboard
- AccelByte Multiplayer Servers (AMS) Service Dashboard
- Cloud Save Service Dashboard
- E-commerce Service Dashboard
Game Health Overview Dashboard
The Game Health Overview Dashboard provides a comprehensive snapshot of the game's performance and player engagement. By surfacing trends and potential issues, it supports data-driven decisions that improve the gameplay experience and guide further optimization.
The Game Health Overview Dashboard has three sections: Login and Lobby Overview, Matchmaking and Session Overview, and Other Service Activity Overview.
Login and Lobby Overview section
This section shows metrics about your game's logins and lobby connections, presented in the following panels:
Panel | Description |
---|---|
Current CCU | The current number of concurrent users (CCU) actively connected to the game. |
CCU Change per Minute | The change in the number of concurrent users compared to one minute prior. |
CCU Per Platform | The number of concurrent users currently connected to the game, tagged by platform (e.g., PSN, Xbox, Steam). |
CCU by Closest Configured Region | The number of concurrent users currently connected to the game, based on their proximity to the nearest game server region. A “To be determined” tag indicates that the region information is still being processed on the client side as QoS measurements and lobby connections occur simultaneously. |
Login Activity Count by Platform | The number of login attempts for each platform, based on the chosen game namespace. |
Login Success Rate by Platform | The percentage of successful logins, calculated by dividing the number of successful attempts by the total login attempts for each platform. |
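The two derived metrics in the table above (CCU Change per Minute and Login Success Rate by Platform) can be sketched in a few lines. This is only an illustration of the arithmetic described in the panel descriptions, not the query the dashboards actually run. The function names and the `(platform, succeeded)` input shape are assumptions made for the example.

```python
from collections import defaultdict


def login_success_rate_by_platform(attempts):
    """Return {platform: success percentage}.

    `attempts` is a hypothetical iterable of (platform, succeeded) pairs,
    one pair per login attempt.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for platform, succeeded in attempts:
        totals[platform] += 1
        if succeeded:
            successes[platform] += 1
    # Successful attempts divided by total attempts, per platform.
    return {p: 100.0 * successes[p] / totals[p] for p in totals}


def ccu_change_per_minute(ccu_now, ccu_one_minute_ago):
    """Difference between the current CCU and the CCU one minute prior."""
    return ccu_now - ccu_one_minute_ago


if __name__ == "__main__":
    sample = [("steam", True), ("steam", False), ("psn", True),
              ("steam", True), ("steam", True)]
    print(login_success_rate_by_platform(sample))
    print(ccu_change_per_minute(1200, 1150))
```

A negative result from `ccu_change_per_minute` means more players left than joined during the last minute.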
Matchmaking and Session Overview section
This section shows metrics related to the matchmaking events and sessions in your game, presented in the following panels:
Panel | Description |
---|---|
Current Total Players In Match | The total number of players currently engaged in active matches across all game modes. |
Total Players by Matchpool | The number of active players currently participating in each specific match pool. |
Acquiring DS Wait Time | Measures the time players wait to acquire a dedicated server (DS) in seconds, helping to identify potential delays in server allocation. |
Average Time to Match | The average time (in seconds) it takes for players to be matched with others. This is measured across all matchmaking attempts. |
Matchmaking Success Rate | The average percentage of players successfully matched for a game session compared to the total number of matchmaking attempts. |
Graceless/Abnormal Lobby Disconnects | The number of unexpected disconnections of players from the Lobby service. |
Other Service Activity Overview section
This section shows metrics related to the performance and reliability of various services, such as Platform, Social, and Cloud Save, presented in the following panels:
Panel | Description |
---|---|
All Services Availability / Success Rate | The overall percentage of successful service requests across all systems. This reflects the health and availability of all game services. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
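To make the percentile definitions above concrete, here is a minimal sketch that computes P95 and P99 from a list of raw latency samples using the nearest-rank method. The choice of method and the `percentile` helper are assumptions for illustration. Monitoring backends often estimate percentiles from histogram buckets instead, so the values shown on the dashboards may differ slightly from an exact calculation like this one.

```python
import math


def percentile(latencies_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Rank is 1-based; clamp to 1 so that pct=0 returns the minimum.
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


if __name__ == "__main__":
    samples = [120, 80, 95, 300, 110]
    print("P95:", percentile(samples, 95))
    print("P99:", percentile(samples, 99))
```

Because P99 is driven by the slowest 1% of requests, it reacts to rare outliers that P95 can hide.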
IAM Service Dashboard
The IAM Service Dashboard enables quick identification of trends and issues affecting user authorization and authentication through transparent metrics. Tracking key data such as Login Success Rates and 3rd-Party Token Validation insights enables you to proactively address problems and enhance user experience.
The IAM Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to login activities and third-party token validation from various platforms, presented in the following panels:
Panel | Description |
---|---|
Login Activity Count by Platform | The number of login attempts for each platform based on the chosen game namespace. |
Login Success Rate by Platform | The percentage of successful logins, calculated by dividing the number of successful attempts by the total login attempts for each platform. |
3rd Party Token Validation Errors (PSN, Xbox, Epic, Steam) | The number of token validation errors from the AGS IAM Service to various third-party platforms, such as PSN, Xbox, Epic, and Steam. |
3rd Party Token Validation p95 Latency (PSN, Xbox, Epic, Steam) | The response time within which 95% of third-party token validation requests complete, measured from the AGS IAM service to the platform. This shows how quickly most players are logging into the game on each platform. |
3rd Party Token Validation p99 Latency (PSN, Xbox, Epic, Steam) | The response time within which 99% of third-party token validation requests complete, measured from the AGS IAM service to the platform. This shows how quickly nearly all players are logging into the game on each platform, including the slowest cases. |
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AGS IAM service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the IAM service handles each second, providing insight into its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the IAM service restarts, which may indicate stability issues that could disrupt player experiences. |
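The request-rate and error-rate panels above can be read as simple ratios over a time window. The sketch below assumes the error rates are expressed as a percentage of all requests in the window. That is an assumption, since the panels could also be plotted as errors per second. `load_summary` and its input shape are hypothetical names used only for this example.

```python
def load_summary(status_codes, window_seconds):
    """Summarize a window of HTTP responses.

    `status_codes` is a list of HTTP status codes observed during a
    window that lasted `window_seconds` seconds.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(status_codes)
    client_errors = sum(1 for code in status_codes if 400 <= code <= 499)
    server_errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return {
        "rps": total / window_seconds,
        # Assumption: error rate as a percentage of all requests.
        "4xx_rate": 100.0 * client_errors / total if total else 0.0,
        "5xx_rate": 100.0 * server_errors / total if total else 0.0,
    }


if __name__ == "__main__":
    codes = [200] * 7 + [404] * 2 + [500]
    print(load_summary(codes, window_seconds=5))
```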
Lobby Service Dashboard
This dashboard tracks key metrics for the Lobby service, providing insights into real-time concurrent users (CCU) connected to the service, lobby connection health, and the success of lobby notifications and third-party friend synchronization.
The Lobby Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to concurrent users and connectivity with the AGS Lobby service within the game, presented in the following panels:
Panel | Description |
---|---|
Current CCU | The current number of concurrent users (CCU) actively connected to the game. |
CCU Rate/Min | The rate, per minute, at which users are joining or leaving the game. |
CCU Per Platform | The number of concurrent users currently connected to the game, broken down by platform (e.g., PSN, Xbox, Steam). |
CCU by Closest Configured Region | Shows the total number of concurrent users based on their proximity to the nearest game server region, with 'To be determined' indicating that the region information is still being processed on the client side as QoS measurements and lobby connections occur simultaneously. |
Graceless/Abnormal Lobby Disconnect | The number of unexpected disconnections of players from the lobby service. |
Get Notification Success Rate | Displays the success rate of lobby notifications, ensuring proper communication between users and the service. |
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AGS Lobby service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the Lobby service handles each second, giving an idea of its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the Lobby service restarts, which may indicate stability issues that could disrupt player experiences. |
Matchmaking Service Dashboard
This dashboard provides key insights into the performance of the matchmaking service, helping developers monitor player activity, matchmaking efficiency, and wait times.
The Matchmaking Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to matchmaking engagement and performance, presented in the following panels:
Panel | Description |
---|---|
Current Total Players In Match | The total number of players currently engaged in active matches across all game modes. |
Players by Matchpool | The number of active players currently participating in each specific match pool. |
Matchmaking Success Rate | The percentage of players successfully matched for a game session compared to the total matchmaking attempts. |
Average Time to Match in Seconds | The average time it takes for players to be matched with others, measured in seconds across all matchmaking attempts. |
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AGS Matchmaking service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the Matchmaking service handles each second, giving an idea of its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the Matchmaking service restarts, which may indicate stability issues that could disrupt player experiences. |
AMS Service Dashboard
This dashboard provides visibility into the availability and performance of dedicated game servers, focusing on server count and wait times for acquiring servers.
The AMS Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to the availability and performance of dedicated game servers (DS), presented in the following panels:
Panel | Description |
---|---|
DS Count | Displays the total number of active dedicated servers available to handle game sessions. |
Acquiring DS Wait Time | Measures the time players wait to acquire a dedicated server (DS), helping to identify potential delays in server allocation. |
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AMS service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the AMS service handles each second, giving an idea of its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the AMS service restarts, which may indicate stability issues that could disrupt player experiences. |
Cloud Save Service Dashboard
This dashboard monitors the performance of the cloud save service, focusing on data payload sizes and the success rate of save and update operations.
The Cloud Save Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to the performance and efficiency of Cloud Save operations, presented in the following panels:
Panel | Description |
---|---|
Payload Size Distribution | Displays the distribution of data payload sizes to help optimize storage and transmission efficiency. JSON records over 1 MB can negatively impact performance because of the additional space required for formatting and escaping characters. For larger payloads, using Binary Cloud Save is recommended for better performance and efficiency. For more details, refer to the official documentation. |
Create and Update Success Rate | Tracks the percentage of successful create and update operations on Public Game Records to ensure reliable cloud save functionality. |
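As a rough way to check a record against the 1 MB guidance before saving it, the serialized JSON size can be measured on the client. The sketch below uses only the Python standard library. The helper names and the choice of 1 MB = 1,048,576 bytes are assumptions for illustration and are not part of the AGS SDK.

```python
import json

ONE_MB = 1024 * 1024  # assumption: 1 MB treated as 1,048,576 bytes


def json_payload_size(record):
    """Size in bytes of the record once serialized as UTF-8 JSON."""
    return len(json.dumps(record).encode("utf-8"))


def exceeds_json_guidance(record, limit=ONE_MB):
    """True if the serialized record is larger than `limit` bytes,
    in which case Binary Cloud Save may be the better fit."""
    return json_payload_size(record) > limit


if __name__ == "__main__":
    small = {"level": 3, "name": "player-one"}
    large = {"blob": "x" * (2 * ONE_MB)}
    print(json_payload_size(small), exceeds_json_guidance(small))
    print(json_payload_size(large), exceeds_json_guidance(large))
```

Escaping is part of the overhead mentioned above: a one-character newline in a string becomes the two characters `\n` in the serialized JSON, so text-heavy or encoded binary data grows when stored as JSON.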
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AGS Cloud Save service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the Cloud Save service handles each second, giving an idea of its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to or enjoying the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the Cloud Save service restarts, which may indicate stability issues that could disrupt player experiences. |
E-commerce Service Dashboard
This dashboard provides insights into the performance of the e-commerce service, focusing on transaction success and DLC synchronization efficiency.
The E-commerce Service Dashboard has two sections: Overview and Resilience and Load Monitoring.
Overview section
This section shows metrics related to player purchases, including in-app purchases and downloadable content (DLC) synchronization, presented in the following panels:
Panel | Description |
---|---|
In-app Purchase Success Rate | Tracks the percentage of successful transactions to ensure smooth and reliable purchases. |
DLC Sync Success Rate | Measures the success rate of 3rd Party Store DLC synchronization, ensuring that purchased content is correctly delivered to users. |
Resilience and Load Monitoring section
This section shows metrics related to the overall performance and reliability of the AGS Platform service, focusing on its ability to handle requests efficiently and maintain stability under load, presented in the following panels:
Panel | Description |
---|---|
Requests per second (RPS) | Measures how many requests the Platform service handles each second, giving an idea of its overall workload. |
4xx Error Rate | Shows how often players encounter issues due to invalid requests (e.g., accessing unavailable content). |
5xx Error Rate | Tracks the frequency of server errors that can prevent players from connecting to or enjoying the game. |
P95 Latency | The 95th percentile of latency, meaning 95% of requests are completed in this latency or less. |
P99 Latency | The 99th percentile of latency, meaning 99% of requests are completed in this latency or less. |
Service Restarts | Counts how often the Platform service restarts, which may indicate stability issues that could disrupt player experiences. |