© KX 2025
"nano" calculates basic raw CPU and I/O capabilities of non-volatile storage, as measured from kdb+ perspective. It is considered a storage, memory and CPU benchmark that performs sequential/random read and write together with aggregation (e.g. sum and sort) and meta operations (like opening a file).
Nano measures results from running on one kdb+ process or aggregated across several worker processes. The kdb+ processes can be attached either directly to storage, or connected to a distributed/shared storage system.
The throughput and latency measurements are taken directly from kdb+/q; results include read/mmap allocation, and the creation and ingest of data. There is an option to test compressed data.
This utility is used to confirm the basic expectations for the CPU and storage subsystem prior to a more detailed testing regime.
Nano does not simulate parallel I/O rates via the creation of multi-column tick databases. Instead, it shows a maximum I/O capability for kdb+ on the system under test, and is a useful utility for examining OS settings and scalability and for identifying pinch points related to the kdb+ I/O models.
Multi-node client testing can be used to test read/write rates for multiple hosts using multiple different storage targets on a shared storage device.
In this document, we use "kdb+ process" and "q process" interchangeably.
Please install kdb+ 4.1. If the q home differs from $HOME/q, set QHOME in config/kdbenv.
The script assumes that a number of commands are available; see the Dockerfile for more information.
The scripts start several kdb+ processes that open a port for incoming messages. By default, the controller opens port 5100 and the workers open 5500, 5501, 5502, etc. You can change these ports by editing the variables CONTROLLERPORT and WORKERBASEPORT in nano.sh.
Place these scripts in a single working directory. Do not place them directly in the destination location used for the I/O testing, as that directory may be unmounted during the tests. If the scripts are to be executed across multiple nodes, place the directory containing them in a shared (e.g. NFS) directory.
The scripts are best run as the root user or started with high priority.
The benchmark can run with many parameters. Some parameters are set as environment variables (e.g. the number of secondary threads of the kdb+ processes) and some are set by command line parameters (e.g. the number of worker kdb+ processes). Environment variables are placed in config/env.
One key environment variable is FLUSH, which points to a storage cache flush script. Some sample scripts are provided; the default assumes locally attached block storage, for which echo 3 > /proc/sys/vm/drop_caches does the job. If the root user is not available to you, the flush script will have to be added to the sudoers list by your system administrator.
Create a directory on each storage medium where data will be written during the test. List these data directories in the file partitions, using absolute path names.
Using a single-line entry is a simple way of testing a single shared/parallel file system. This lets the shared FS control the distribution of the data across a number of storage targets (objects, et al) automatically. An illustrative partitions file is shown below.
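For illustration only, a partitions file for two locally mounted drives might look like this (the mount points below are hypothetical):
/mnt/nvme0/nanodata
/mnt/nvme1/nanodata
A single shared file system would instead be listed as a single line, e.g. /mnt/sharedfs/nanodata.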
It is recommended to exit all memory-heavy applications. The benchmark uses approximately half of the free memory and creates files on the disks accordingly.
Scripts below rely on environment variables (e.g. QBIN), so first do
$ source ./config/kdbenv
$ source ./config/env
The main script nano.sh starts multiple benchmark processes. Pass -h to learn about the parameters.
Example usages:
$ ./nano.sh -p $(nproc)
$ ./nano.sh -p 8 --noclean
$ ./nano.sh -p 8 -s readonly --noclean -d 0408_152349
Typical examples of the number of worker processes to test are 1, 2, 4, 8, 16, 32, 64, 128.
The results are saved as PSV files in the directory set by the command line parameter resultdir. If resultdir is not set, then results/mmdd_HHMMSS is used.
File HOSTNAME-throughput.psv aggregates (sums) the throughput metrics from the detailed result files. Column throughput displays the throughput of the test from a kdb+ perspective. The read/write throughput based on iostat is also displayed. Column accuracy tries to capture the impact of the offset problem: for each test, it divides the maximal difference of worker start times by the average test elapsed time.
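As a rough illustration of the accuracy calculation (the values below are invented and this is not the code nano itself runs), the figure for one test can be derived from the per-worker start times and elapsed times:

starttimes:2025.04.08D15:23:49.1 2025.04.08D15:23:49.3 2025.04.08D15:23:49.2  / per-worker start times (made up)
elapsed:12.1 11.8 12.4                                                         / per-worker elapsed seconds (made up)
offset:(max[starttimes]-min starttimes)%0D00:00:01                             / largest start-time difference, in seconds
accuracy:offset%avg elapsed                                                    / small values mean well-aligned workers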
If you are interested in how the storage medium scales with the number of parallel requests, run multiproc.sh. It simply calls nano.sh with different process counts and aggregates the results into a final PSV file. The results are saved in the file results/throughput_total.csv, but this can be overridden by a command line parameter. Run the script with --help to learn about the command line parameters.
For multinode tests (multihost.sh), list the other nodes in the file hostlist. The script will ssh into each remote node to start the main script. Each server in hostlist creates the same number of benchmark processes on its node, and no data is shared between individual processes. An illustrative hostlist is shown below.
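For illustration, hostlist contains one hostname per line; the names below are hypothetical:
node2.example.com
node3.example.com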
If running across a cluster of nodes, each of the nodes must be time-synced (e.g. by ntp).
Note that the execution of the top-level script multihost.sh may require tty control to be added to the sudoers file if you are not already running as root. multihost.sh uses ssh to reach the remote hosts, so you may need to use ~/.ssh/config to set passwords or identity files.
$ source ./config/kdbenv
$ source ./config/env
$ ./multihost.sh -p 32
The script reports the progress on the standard output. The log of each kdb+ process is available in the directory logs/mmdd_HHMMSS.
nano version 2.1+ supports object storage. You simply need to put the object storage path into the file partitions. See the KX object storage page for information about setup and required environment variables. Example lines:
s3://kxnanotest/firsttest
gs://kxnanotest/firsttest
ms://kxnanotest.blob.core.windows.net/firsttest
The environment executing this test must have the associated cloud vendor CLI set up and configured, as if you were manually uploading files to object storage.
You can also test the cache impact of the object storage library. The recommended way is to:
- Provide an object storage path in partitions and choose a database subdirectory, e.g. DBDIR=$(date +%m%d_%H%M%S)
- Run a full test to populate data: ./nano.sh --processnr $(nproc) --noclean --dbsubdir $DBDIR
- Assign an empty directory on a fast local disk to the environment variable KX_OBJSTR_CACHE_PATH in file ./config/env
- Run the test to populate the cache: ./nano.sh --processnr $(nproc) --scope readonly --dbsubdir $DBDIR --noclean
- Run the test again to use the cache: ./nano.sh --processnr $(nproc) --scope readonly --dbsubdir $DBDIR
- Delete the cache files: source ./config/env; rm -rf ${KX_OBJSTR_CACHE_PATH}/objects
The random read kdb+ script is deterministic, i.e. it reads the same blocks in consecutive runs: random blocks are selected with roll, which uses a fixed seed.
Environment variable THREADNR plays an important role in random read speed from object storage. See https://code.kx.com/insights/1.6/core/objstor/main.html#secondary-threads for more details.
The bash script starts a controller kdb+ process that is responsible for starting each test function at the same time on all worker kdb+ processes. This cannot be achieved perfectly; there is some offset, so the aggregate results are better considered upper bounds. You can check the startup offset by e.g. inspecting the starttime column of the detailed result files. The more kdb+ workers there are, the larger the offset is.
A Docker image is available for nano on GitLab and on Nexus:
$ docker pull registry.gitlab.com/kxdev/benchmarking/nano/nano:latest
$ docker pull ext-dev-registry.kxi-dev.kx.com/benchmarking/nano:latest
The nano scripts are placed in the Docker image under directory /opt/kx/app; see the Dockerfile.
To run the script via Docker, you need to mount host directories to the following target directories:
- /tmp/qlic: contains your kdb+ license file
- /appdir: results and logs are saved here, into subdirectories results and logs respectively
- /data: located on the storage you would like to test. For a multi-partition test, you need to overwrite /opt/kx/app/partitions with your partition file.
You can overwrite default environment variables (e.g. THREADNR) that are listed in ./config/env.
By default flush/directmount.sh is selected as the flush script, which requires the --privileged option. You can choose other flush scripts by setting the environment variable FLUSH. For example, setting FLUSH to /opt/kx/app/noflush.sh does not require privileged mode.
Example usages:
$ docker run --rm -it -v $QHOME:/tmp/qlic:ro -v /mnt/$USER/nano:/appdir -v /mnt/$USER/nanodata:/data --privileged ext-dev-registry.kxi-dev.kx.com/benchmarking/nano:latest -p 4
$ docker run --rm -it -v $QHOME:/tmp/qlic:ro -v /mnt/$USER/nano:/appdir -v /mnt/storage1/nanodata:/data1 -v /mnt/storage2/nanodata:/data2 -v ${PWD}/partitions_2disks:/opt/kx/app/partitions:ro -e FLUSH=/opt/kx/app/flush/noflush.sh -e THREADNR=5 ext-dev-registry.kxi-dev.kx.com/benchmarking/nano:latest -p 4
If you see the error Too many open files in a log file, then increase the limit by
$ ulimit -n 2048
The controller opens a TCP connection to all workers, so this error typically occurs with a large number of workers (e.g. 512).
The ulimit change above is temporary and applies only to the current shell session.
The memory requirement is proportional to the number of threads (THREADNR) for several q operations. For CPU tests, because vector lengths are fixed, the total memory required scales directly with the number of kdb+ workers.
If your test run fails with an OutOfMemory (OOM) error, you can address this by excluding the most memory-intensive tests. The primary tests known for high memory consumption are .cpu.groupIntLarge and .cpu.groupFloatLarge. Set the EXCLUDETESTS environment variable as shown below:
EXCLUDETESTS=".cpu.groupIntLarge .cpu.groupFloatLarge"
and rerun your test.
If the benchmark fails with e.g.
ERROR: Port 5501 is in use. Maybe leftover kdb+ processes are running.
then either stop the process that occupies the port or set alternative ports for the benchmark.
Most likely, the port is taken by a previously failed test, so either
$ killall q
or
$ kill -9 $(pidof q)
will do the job. Be careful with these commands, as they will terminate ALL q processes running under your user account. Alternatively, you can use lsof to find the process using the port, for example lsof -i :5501.
All kdb+ workers and the kdb+ controller need a port. Environment variables WORKERBASEPORT and CONTROLLERPORT, exported in nano.sh, set these ports. Feel free to modify them. For N workers, ports WORKERBASEPORT+1 through WORKERBASEPORT+N will be used.
The script calculates the throughput (MiB/sec) of an operation from the data size and the elapsed time of the operation.
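As a simple illustration of that calculation (the sizes and times below are invented, not nano output):

n:125000000                      / number of 8-byte longs transferred (made up)
bytes:8*n                        / data size in bytes
elapsed:3.2                      / elapsed seconds (made up)
mibps:(bytes%1024*1024)%elapsed  / throughput in MiB/sec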
Script ./nano.sh executes 7 major tests:
- CPU
- Write
- Sequential read
- Sequential reread
- Meta
- Random read and reread
- xasc
If the scope is readonly then Write and Meta tests are omitted.
All tests start multiple kdb+ processes (set by the parameter -p of ./nano.sh), each having its own working space on the disk.
The cache is flushed before each test, except for reread tests.
We detail each test in the next section.
The CPU test starts a few tests on in-memory lists that mainly stress the CPU and memory. It uses tiny, small (16k long), medium and large vectors fitting into the L1, L2 and L3 caches and into memory. Tests include the following; an illustrative q sketch appears after the list:
- creating a random permutation
- sorting
- calculating deltas
- generating indices based on modulo
- calculating moving and weighted averages
- arithmetic and implicit iteration
- serialization, deserialization and compression
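The snippet below is only an illustrative q sketch of the operation types listed above; the vector length is arbitrary and the expressions are not nano's actual test code:

v:1000000?1000f        / random float vector (length chosen arbitrarily)
p:0N?count v           / random permutation of the indices
s:asc v                / sorting
d:deltas v             / deltas
i:(til count v) mod 7  / indices based on modulo
m:mavg[10;v]           / moving average
w:v wavg v             / weighted average
a:1f+v*v               / arithmetic with implicit iteration
b:-9!-8!v              / serialization (-8!) followed by deserialization (-9!)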
The Write test performs the following write subtests:
- open append tiny: appending a tiny integer list to the end of a list on disk (this operation includes opening and closing the file): .[; (); ,; 2 3 5 7]
- append tiny: appending a tiny integer list via an open handle to a kdb+ file: H: hopen ...; H 2 3 5 7
- open replace tiny: overwriting the file content with two integers: .[; (); :; 42 7]
- create list: creates a list in memory (function til), i.e. allocating memory and filling it with consecutive longs. The length of the list is set by SEQWRITETESTLIMIT.
- write rate: writes the list (set) to the file readtest.
- sync rate: calling the system command sync on readtest.
- open append small: appends a small integer list many times to a file.
- open append large sym: appends a large block a few times to a file. The result file is the sym column of a splayed table used in the xasc test.
- saves files for the meta test:
  - two long lists of length 63k
  - a long list of length 31M
The Sequential read test (a q sketch of these write and read idioms follows the Reread description below):
- memory maps (get) the file readtest to the variable mapped
- attempts to force readtest to be resident in memory (-23!) by calling the Linux command madvise with parameter MADV_WILLNEED
- calls max on mapped, which sequentially marches through all the pages
- performs a binary read by calling read1
This is the same as the first two steps of the Read test. We do not flush the cache before this test, so the data pages are already in cache. Reread is not expected to involve any data copy, so its execution (mmap and madvise) should be much faster than Read.
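The q sketch below illustrates the idioms named in the Write, Sequential read and Sequential reread descriptions above. It is not nano's actual code: the file names and list lengths are placeholders.

t:`:nano_tiny_demo             / placeholder file for the tiny write idioms
t set 0#0                      / start from an empty long vector on disk
.[t; (); ,; 2 3 5 7]           / open append tiny: append a tiny list, opening and closing the file
H:hopen t; H 2 3 5 7; hclose H / append tiny: append via an already open handle
.[t; (); :; 42 7]              / open replace tiny: overwrite the file content with two integers
f:`:nano_read_demo             / placeholder file for the larger write and the read steps
f set til 1000000              / create list + write rate: consecutive longs written with set
system"sync"                   / sync rate: ask the OS to flush its write buffers
mapped:get f                   / Sequential read: memory map the file into variable mapped
-23!mapped                     / hint madvise(MADV_WILLNEED) so the pages become resident
r:max mapped                   / sequential scan that touches every page
b:read1 f                      / binary read of the raw file bytes
/ the Reread test repeats the get and -23! steps without flushing the OS cache first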
The meta operations are executed N thousand times (N is a variable in the script) and the average execution time is returned; a q timing sketch follows the list below.
- opening then closing a kdb+ file: hclose hopen x
- getting the size of a file: hcount
- reading and parsing kdb+ data: get
- locking a file. Enum extend is used for this, which does more than just locking.
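A rough sketch of timing such meta operations in the q console (the file and repeat count are placeholders; nano's harness records timings differently):

f:`:nano_meta_demo
f set til 1000         / small placeholder file
\t:1000 hclose hopen f / open then close, repeated 1000 times; returns elapsed milliseconds
\t:1000 hcount f       / size of the file
\t:1000 get f          / read and parse the kdb+ data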
The Random read test consists of four subtests. Each subtest random-reads 800 MiB of data by indexing a list stored on disk. Consecutive integers are used for indexing. Each random read uses a different random offset (deal). The 800 MiB is achieved by reading blocks of size 1M, 64k or 4k. Each random read can also perform an mmap; this variant is denoted by an mmaps postfix, e.g. mmaps,random read 64k stands for random-reading an integer list of size 64k after a memory map.
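For illustration only (the file, block size and number of blocks are placeholders, and nano's real subtests differ in detail), random-reading a long vector on disk by blocks of consecutive indices looks roughly like this:

f:`:nano_rand_demo
f set til 2000000                  / placeholder long vector on disk (about 15 MiB)
mapped:get f                       / memory map it
blk:8192                           / 8192 longs = 64 KiB per block (placeholder)
nblk:(count mapped) div blk        / number of whole blocks in the file
starts:blk*(neg 100)?nblk          / deal: 100 distinct random block offsets; the default seed is fixed
r:sum mapped raze starts+\:til blk / index consecutive integers within each randomly chosen block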
In a typical select statement with a where clause, kdb+ does random reads with memory mappings. If you started your kdb+ process with .Q.MAP, then memory mapping is done during .Q.MAP and the select statement only does a random read.
The throughput is based on the useful data read. For example, if you index a vector of longs by 8000 consecutive numbers, then the useful data size is 8x8000 bytes (the size of a long is 8 bytes). In reality, Linux may read much more data from the disk due to e.g. prefetching. Just change the content of /sys/block/DBDEVICE/queue/read_ahead_kb and see how the throughput changes. The disk may be fully saturated while the useful throughput is smaller.
The xasc test does an on-disk sort by xasc on sym. The second test applies the attribute p to the column sym.
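For reference, the on-disk sort and the attribute application use standard q idioms along these lines (the splayed-table path and contents are placeholders, not nano's layout):

dir:`:nano_splay_demo/                                / placeholder splayed-table directory
dir set .Q.en[`:.] ([]sym:1000?`aa`bb`cc;px:1000?1f)  / small splayed table; syms enumerated for the example
`sym xasc dir                                         / on-disk sort of the splayed table by sym
@[dir;`sym;`p#]                                       / apply the parted attribute to the sym column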