
haproxy2.4 up to 3.1.6 cpu up 10% #2968


Closed

QQxiaoyuyu opened this issue May 12, 2025 · 4 comments
Labels
status: works as designed This issue stems from a misunderstanding of how HAProxy is supposed to work.
type: bug This issue describes a bug.

Comments

@QQxiaoyuyu

Detailed Description of the Problem

When I upgraded from HAProxy 2.4 to HAProxy 3.1.6, CPU usage went up by about 10%.

Expected Behavior

I expected the new HAProxy version to perform better than the old one, but CPU usage went up instead. I don't know whether this is normal or not.

Steps to Reproduce the Behavior

My HAProxy 2.4 setup used nbproc 36, while HAProxy 3.1.6 uses nbthread 36 (see the sketch below).
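
For context, a minimal sketch of the two global-section variants being compared. The 2.4 fragment is a hypothetical reconstruction, since only the 3.1.6 configuration appears below:

    # HAProxy 2.4, process model (hypothetical reconstruction; the report
    # does not include the old configuration):
    #   nbproc 36
    #   cpu-map auto:1-36 0-35
    #
    # HAProxy 3.1.6, thread model (matches the global section quoted below):
    #   nbthread 36
    #   cpu-map auto:1/1-36 0-35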

Do you have any idea what may have caused this?

I don't know

Do you have an idea how to solve the issue?

No response

What is your configuration?

global
    chroot /home/system/haproxy1_5
    daemon
    group root
    user root
    log 127.0.0.1:514 local5 info
    stats socket        /tmp/haproxy.socket uid haproxy mode 770 level admin
    pidfile /home/system/haproxy1_5/haproxy3.pid
    spread-checks 3
    nbthread 36
    cpu-map auto:1/1-36 0-35
    tune.applet.zero-copy-forwarding on
    tune.ssl.default-dh-param 2048
    tune.ssl.cachesize 1000000
    ssl-default-bind-options ssl-min-ver TLSv1.2



defaults
    mode http
    option httplog
    maxconn 100000
    option redispatch
    option tcpka
    option srvtcpka
    option clitcpka
    retries 3
    timeout connect 20
    timeout http-request 10000
    timeout client 10000
    timeout server 10000
    timeout check 500
    timeout http-keep-alive 30000
    stats enable
    stats refresh 30s
    stats uri /admin?stats
    stats realm afa\haproxy
    stats hide-version

Output of haproxy -vv

HAProxy version 3.1.6-d929ca2 2025/03/20 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-3.1.6.html
Running on: Linux 5.4.219-1.el7.elrepo.x86_64 #1 SMP Sun Oct 16 10:03:45 EDT 2022 x86_64
Build options :
  TARGET  = linux-glibc
  CC      = cc
  CFLAGS  = -O2 -g -fwrapv
  OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_PCRE=1
  DEBUG   = 

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT +PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN -SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL +ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=40).
Built with OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.4.4
Built with network namespace support.
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 9.3.1 20200408 (Red Hat 9.3.1-2)

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
       spop : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG

Available services : none

Available filters :
        [BWLIM] bwlim-in
        [BWLIM] bwlim-out
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [SPOE] spoe
        [TRACE] trace

Last Outputs and Backtraces

No response
Additional Information

No response

@QQxiaoyuyu added the "type: bug" and "status: needs-triage" labels May 12, 2025
@wtarreau
Member

Honestly, only a 10% increase when switching from independent processes that share nothing to threads doesn't seem like much at all. In exchange you gain a lot of nice stuff: sharing the outgoing idle connections between all threads, a per-server maxconn that really works (no longer one per process), same for leastconn, stick-tables that are always known by all threads, consistent health checks, common stats, reduced memory usage when updating certs/maps/acls, and many other things that don't immediately come to mind. Synchronizing state between threads necessarily causes a bit of overhead, which depends on the subsystems.

If you're interested, you can run "perf top" and look at the outliers. Maybe some of them are already known, maybe some have already been addressed, or maybe what you're seeing is already the best we can do.

Also, what CPU are you running on? Could you please run lscpu -e? Maybe there's even room for improvement there (see the sketch below).
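
A minimal sketch of those two diagnostics, assuming haproxy runs under that process name; the pidof-based PID discovery is an assumption, adjust it to your environment:

    # Show the hottest functions live in the running haproxy process(es);
    # perf top -p takes a comma-separated PID list.
    perf top -p "$(pidof haproxy | tr ' ' ',')"

    # Print one line per logical CPU (core, socket, NUMA node) to check
    # whether the cpu-map above matches the hardware topology.
    lscpu -e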

@wtarreau added the "status: works as designed" label and removed the "status: needs-triage" label May 12, 2025
@QQxiaoyuyu
Author

Okay, so that means upgrading to a new version is expected to bring CPU improvements, right?

@wtarreau
Member

What this means is that there is always a tradeoff between CPU efficiency and scalability. The previous model (nbproc) had long reached its limits by no longer being observable, monitorable, reliable, or manageable, and absolutely needed to be reintegrated in a consistent way. This necessarily costs a bit of CPU, there's no way around it, but our efforts since 1.8 have mainly focused on diminishing this CPU impact while keeping the features.

For example, in the nbproc model, if you needed to delete a server, you would have had to create 36 sockets, iterate on all of them to check whether the server still had traffic, then iterate on all of them to turn the server down, then iterate again on all of them to delete the server, and so on (see the sketch below). You really had 36 totally independent processes. For sure this has zero CPU cost, but it is impractical for any serious use. And peers, for example, were not usable with nbproc. Now all of this is totally addressed at the expense of a small CPU cost (depending on which features; for the vast majority there's zero cost). And a number of features benefit from the new model (e.g. connection reuse) and will even increase performance. Total RAM usage is way better as well (stick-tables, ACLs, and maps are no longer multiplied by the number of processes).
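
To illustrate the runtime-API difference, a sketch only; the per-process socket paths and the bk_app/srv1 backend/server names are hypothetical, while /tmp/haproxy.socket matches the configuration quoted above:

    # Old nbproc model: one stats socket per process, so every runtime
    # action has to be repeated against all 36 of them.
    for i in $(seq 1 36); do
        echo "disable server bk_app/srv1" | socat stdio UNIX-CONNECT:/tmp/haproxy${i}.sock
    done

    # Thread model: a single socket covers all 36 threads.
    echo "disable server bk_app/srv1" | socat stdio UNIX-CONNECT:/tmp/haproxy.socket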

3.2 further reduces this cost (e.g. certain LB algorithms like leastconn still had an impact with threads). So in a sense, yes, new versions improve CPU usage. But there's inevitably a CPU cost in switching from the old nbproc to nbthread.

If you're in the rare case where you're not monitoring your processes, not doing health checks, not using stick-tables or peers, doing nothing dynamic, etc., you could still work in the old model by starting the program 36 times with the same conf (see the sketch below). It will basically do the same thing, and everyone understands why this generally isn't a good idea for most use cases.
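
A sketch of that workaround, under the assumption that each instance gets its own pidfile (each would also need its own stats socket, which the configuration above hardcodes); the -${i} path suffixes are illustrative:

    # Start 36 fully independent instances sharing one configuration file.
    # On Linux, haproxy binds listeners with SO_REUSEPORT by default, so
    # the instances can share the same listening ports.
    for i in $(seq 1 36); do
        haproxy -D -f /etc/haproxy/haproxy.cfg -p /run/haproxy-${i}.pid
    done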

@QQxiaoyuyu
Author

Okay, I understand. Thank you very much for the answer and explanation.

@QQxiaoyuyu closed this as completed May 13, 2025