The CFS sessions race condition test validates the robustness and reliability of the Configuration Framework Service (CFS) API when handling concurrent operations on CFS sessions. This test is designed to detect race conditions, verify proper handling of parallel requests, and ensure data consistency when multiple API operations execute simultaneously.
This page provides information about the CFS sessions race condition test and describes how the test script works.
The CFS sessions race condition test serves several critical purposes:
This test is particularly important after CFS upgrades, configuration changes, or when troubleshooting intermittent CFS issues that may be related to concurrent operations.
The test is provided by the cray-cmstools RPM.
The script file location is /opt/cray/tests/integration/csm/cfs_sessions_rc_test.
Review the Test prerequisites before proceeding.
If no parameters are specified, this script performs the following steps:
Environment setup:
Records the current cray-cfs-operator deployment replicas.
Scales the cray-cfs-operator deployment to 0 replicas to prevent sessions from being processed (this keeps sessions in pending state for testing).
CFS version configuration:
× max-sessions).
Pre-existing session cleanup:
cfs-race-condition-test- by default).
If --delete-previous-sessions is specified, deletes any matching sessions found.
If --delete-previous-sessions is not specified and matching sessions exist, exits with an error.
Session creation:
Subtest execution:
--run-subtests or --skip-subtests).
Cleanup and restoration:
Restores the cray-cfs-operator deployment to its original replica count.
The script reports progress as it runs and provides a link to a log file with more detailed information. If the test fails, begin the investigation with the service that was in use at the time of the failure.
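The record, scale down, and restore sequence described in the steps above can be sketched in shell as follows. This is an illustration only and not the test's actual implementation, which this page does not show. The function names and the `KUBECTL` override variable are hypothetical; the `services` namespace and the deployment name are taken from the troubleshooting commands on this page.

```shell
# Sketch only: not the test's own code.
# KUBECTL is a hypothetical override so the logic can be tried without a cluster.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="services"
DEPLOYMENT="cray-cfs-operator"

# Print the replica count currently requested for the deployment.
get_replicas() {
    "$KUBECTL" get deployment -n "$NAMESPACE" "$DEPLOYMENT" -o jsonpath='{.spec.replicas}'
}

# Scale the deployment to the replica count given as $1.
set_replicas() {
    "$KUBECTL" scale deployment -n "$NAMESPACE" "$DEPLOYMENT" --replicas="$1"
}

# Run the command given as arguments with the operator scaled to 0,
# then restore the recorded replica count even if the command fails.
with_operator_stopped() {
    local saved rc
    saved="$(get_replicas)" || return 1
    set_replicas 0 || return 1
    rc=0
    "$@" || rc=$?
    set_replicas "$saved" || return 1
    return "$rc"
}
```

For example, `with_operator_stopped some_command` would run the hypothetical `some_command` while sessions stay in pending state, and return that command's exit status after restoring the replica count.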
Each subtest may take several seconds to minutes, depending on the number of parallel requests and sessions configured.
The CFS sessions race condition test includes six distinct subtests, each designed to test different concurrent operation patterns.
By default, all subtests are executed. You can control which subtests run using the --run-subtests or --skip-subtests
options (see Controlling which subtests run).
single_delete
Purpose: Tests concurrent deletion of the same individual CFS session by multiple parallel requests.
Behavior:
Expected results:
2xx status code.
multi_delete
Purpose: Tests concurrent batch deletion operations using the CFS multi-delete API endpoint.
Behavior:
Expected results:
single_delete_single_get
Purpose: Tests concurrent operations combining single-session deletion with single-session retrieval.
Behavior:
Expected results:
single_delete_multi_get
Purpose: Tests concurrent operations combining single-session deletion with multi-session retrieval.
Behavior:
Expected results:
multi_delete_single_get
Purpose: Tests concurrent operations combining batch deletion with single-session retrieval.
Behavior:
Expected results:
multi_delete_multi_get
Purpose: Tests concurrent operations combining batch deletion with multi-session retrieval.
Behavior:
Expected results:
(ncn-mw#) The script usage message can be displayed by running it with the --help argument.
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --help
The following sections cover the most commonly used options.
By default, the test runs all six subtests. You can control which subtests execute using one of two mutually exclusive options:
Run specific subtests only using the --run-subtests argument with a comma-separated list of subtest names:
(ncn-mw#) Example of running only the single_delete and multi_delete subtests:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --run-subtests single_delete,multi_delete
Skip specific subtests using the --skip-subtests argument with a comma-separated list of subtest names to exclude:
(ncn-mw#) Example of running all subtests except multi_delete_multi_get:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --skip-subtests multi_delete_multi_get
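When the subtest list is built by a wrapper script, it can help to check the names before invoking the test. The following is a minimal sketch, assuming the six subtest names documented on this page; the `validate_subtests` function and the `VALID_SUBTESTS` variable are hypothetical helpers and not part of the test itself.

```shell
# The six subtest names documented for this test.
VALID_SUBTESTS="single_delete multi_delete single_delete_single_get single_delete_multi_get multi_delete_single_get multi_delete_multi_get"

# Return 0 if every name in the comma-separated list $1 is a valid subtest name.
# Hypothetical helper; the test performs its own argument checking.
validate_subtests() {
    local list="$1" name known found seen
    seen=0
    # Turn the commas into spaces so the shell can iterate over the names.
    for name in $(printf '%s' "$list" | tr ',' ' '); do
        seen=1
        found=1
        for known in $VALID_SUBTESTS; do
            if [ "$name" = "$known" ]; then
                found=0
                break
            fi
        done
        if [ "$found" -ne 0 ]; then
            echo "unknown subtest: $name" >&2
            return 1
        fi
    done
    # Reject an empty list (or one containing only commas).
    [ "$seen" -eq 1 ]
}
```

A wrapper could then run `validate_subtests "$list"` and pass `$list` to `--run-subtests` or `--skip-subtests` only if the check succeeds.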
Valid subtest names are:
single_delete
multi_delete
single_delete_single_get
single_delete_multi_get
multi_delete_single_get
multi_delete_multi_get
The --max-sessions argument controls how many CFS sessions are created for subtests that use multiple sessions (default: 20).
(ncn-mw#) Example of creating 50 sessions for testing:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --max-sessions 50
The --name argument specifies the prefix for all session names created by the test (default: cfs-race-condition-test).
(ncn-mw#) Example of using a custom session name prefix:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --name my-test-prefix
Note: The session name prefix must be between 1 and 40 characters in length.
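The length rule in the note above can be checked before the test is started. This is a small sketch; `valid_name_prefix` is a hypothetical helper name, and it checks only the length, not which characters CFS accepts in a session name.

```shell
# Return 0 if the prefix given as $1 is between 1 and 40 characters long.
# Hypothetical helper; only the documented length rule is checked.
valid_name_prefix() {
    local len=${#1}
    [ "$len" -ge 1 ] && [ "$len" -le 40 ]
}
```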
The --cfs-version argument specifies which CFS API version to use for the test (default: v3).
(ncn-mw#) Example of testing with CFS API v2:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --cfs-version v2
Valid values are:
v2 - Uses CFS API version 2
v3 - Uses CFS API version 3 (default)
Important considerations for CFS v2:
--max-sessions to prevent GET request failures.
If you specify a --page-size value less than --max-sessions when using v2, the test will fail with an error.
Several arguments control the maximum number of parallel requests for different operation types:
Multi-delete requests using --max-multi-delete-reqs (default: 4):
(ncn-mw#) Example of allowing 8 parallel multi-delete requests:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --max-multi-delete-reqs 8
Multi-get requests using --max-multi-get-reqs (default: 4):
(ncn-mw#) Example of allowing 10 parallel multi-get requests:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --max-multi-get-reqs 10
Single-delete requests using --max-single-delete-reqs (default: 4):
(ncn-mw#) Example of allowing 6 parallel single-delete requests:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --max-single-delete-reqs 6
Single-get requests using --max-single-get-reqs (default: 4):
(ncn-mw#) Example of allowing 8 parallel single-get requests:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --max-single-get-reqs 8
Note: All parallel request limits must be at least 1.
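A wrapper that passes these limits through from its own configuration can reject bad values early. The sketch below checks the rule in the note above; `valid_parallel_limit` is a hypothetical helper name, and the test itself may report such errors differently.

```shell
# Return 0 if $1 is a whole number that is at least 1.
# Hypothetical helper illustrating the "at least 1" rule.
valid_parallel_limit() {
    case "$1" in
        # Reject empty values and anything containing a non-digit character.
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ]
}
```

For example, a wrapper might call `valid_parallel_limit "$limit"` for each of the four values before adding the matching `--max-*-reqs` argument to the command line.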
The --page-size argument specifies the page size for CFS API multi-get requests.
Default behavior:
× --max-sessions (minimum: 1)
× --max-sessions) or the current global page-size (minimum: equal to --max-sessions)
(ncn-mw#) Example of specifying a custom page size:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --page-size 100
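The CFS v2 rule documented on this page (an explicit --page-size must not be less than --max-sessions) can be checked before the test is run. This is a minimal sketch; `page_size_ok` is a hypothetical helper, and it applies the check only for v2 because that is the only case this page states explicitly.

```shell
# Return 0 if the combination is acceptable under the documented v2 rule.
# $1 = CFS API version (v2 or v3), $2 = page size, $3 = max sessions
# Hypothetical helper; for v3 no constraint is assumed here.
page_size_ok() {
    if [ "$1" = "v2" ] && [ "$2" -lt "$3" ]; then
        echo "page size $2 is less than max sessions $3 (CFS v2)" >&2
        return 1
    fi
    return 0
}
```

For example, `page_size_ok v2 100 50` succeeds, while `page_size_ok v2 10 20` fails and prints a message to standard error.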
Important considerations:
--page-size, it must be at least equal to --max-sessions.
Output is directed to both the console and a log file that holds more detailed information
about the run and any problems found. The log file is written to
/opt/cray/tests/integration/logs/csm/cmstools/cfs_sessions_rc_test/ with a
timestamp-based filename (e.g., 20250101_120000.log). Each test run creates a new log file,
preserving the history of previous test executions.
The messages output to the console and the log file may be controlled separately through environment variables.
To control the information being sent to the console, set the variable CONSOLE_LOG_LEVEL.
To control the information being sent to the log file, set the variable FILE_LOG_LEVEL.
Valid values in increasing levels of detail are: CRITICAL, ERROR, WARNING, INFO, DEBUG.
The default for the console output is INFO and the default for the log file is DEBUG.
(ncn-mw#) Here is an example of running the script with more information displayed on the console during the execution of the test:
CONSOLE_LOG_LEVEL=DEBUG /opt/cray/tests/integration/csm/cfs_sessions_rc_test
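Because each run writes a new timestamp-named log file, finding the most recent one is a common follow-up step. The sketch below assumes that every log file name follows the 20250101_120000.log pattern shown in the example above, so that alphabetical order matches chronological order. The `latest_log` function and the `LOG_DIR` variable are hypothetical helpers.

```shell
# Log directory documented for this test; LOG_DIR is a hypothetical override.
LOG_DIR="${LOG_DIR:-/opt/cray/tests/integration/logs/csm/cmstools/cfs_sessions_rc_test}"

# Print the path of the newest *.log file in directory $1 (default: LOG_DIR).
# Assumes timestamp-based names, which sort chronologically.
latest_log() {
    local dir="${1:-$LOG_DIR}" f newest=""
    # Glob results are sorted, so the last existing match is the newest file.
    for f in "$dir"/*.log; do
        if [ -e "$f" ]; then
            newest="$f"
        fi
    done
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
```

After a run, `less "$(latest_log)"` would open the most recent log file. Both environment variables can also be set on the same command line, for example `CONSOLE_LOG_LEVEL=WARNING FILE_LOG_LEVEL=DEBUG` placed before the script path.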
By default, if the test finds any existing CFS sessions with the configured name prefix (default: cfs-race-condition-test) in pending state, it will exit with an error.
The --delete-previous-sessions flag instructs the test to automatically delete any such sessions before proceeding:
(ncn-mw#) Example of automatically cleaning up pre-existing test sessions:
/opt/cray/tests/integration/csm/cfs_sessions_rc_test --delete-previous-sessions
Note: Only sessions in pending state with the specified name prefix will be deleted. Sessions in other states or with different names are not affected.
This error occurs when sessions with the configured name prefix already exist in pending state.
Solutions:
Use --delete-previous-sessions to automatically clean them up.
Use a different --name prefix that doesn’t conflict with existing sessions.
This error occurs when the specified --page-size is less than --max-sessions when using CFS v2.
Solutions:
Increase --page-size to at least equal --max-sessions.
Reduce --max-sessions to match your desired page size.
Omit --page-size to let the test calculate an appropriate value automatically.
This can occur if the CFS operator is not properly scaled down or if there are connectivity issues.
Solutions:
Verify that the cray-cfs-operator deployment can be scaled (sufficient permissions).
Check the log files in /opt/cray/tests/integration/logs/csm/cmstools/cfs_sessions_rc_test/ for more information.
If the test is interrupted or fails unexpectedly, the CFS operator may remain scaled to 0 replicas.
Solutions:
Manually check the current replica count:
kubectl get deployment -n services cray-cfs-operator
Manually restore the replica count (typically 1):
kubectl scale deployment -n services cray-cfs-operator --replicas=1
If the global page-size was modified (CFS v2 only), it may need to be manually restored using the Cray CLI.
(ncn-mw#) Example of restoring the default page size to 1000:
cray cfs v3 options update --default-page-size 1000
Example output:
additional_inventory_source = ""
additional_inventory_url = ""
batch_size = 25
batch_window = 60
batcher_check_interval = 10
batcher_disable = false
batcher_max_backoff = 3600
batcher_pending_timeout = 300
debug_wait_time = 3600
default_ansible_config = "cfs-default-ansible-cfg"
default_batcher_retry_policy = 3
default_page_size = 1000
default_playbook = "site.yml"
hardware_sync_interval = 10
include_ara_links = true
logging_level = "INFO"
session_ttl = "7d"