baseline by lgarithm · Pull Request #2 · lgarithm/faabric · GitHub
baseline #2

Open. lgarithm wants to merge 21 commits into base: main

Conversation

lgarithm
Owner

No description provided.

@lgarithm
Owner Author

test

@lgarithm
Owner Author
BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0138s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0143s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0142s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0143s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0130s, total workload: 384000B, rate: 0.027GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0154s, total workload: 384000B, rate: 0.023GiB/s
bench_allreduce(np=4) took 0.3252s, total workload: 1.144GiB, rate: 3.517GiB/s
bench_allreduce(np=4) took 0.3012s, total workload: 1.144GiB, rate: 3.797GiB/s
bench_allreduce(np=4) took 0.3010s, total workload: 1.144GiB, rate: 3.799GiB/s
bench_allreduce(np=4) took 0.2990s, total workload: 1.144GiB, rate: 3.824GiB/s
bench_allreduce(np=4) took 0.3048s, total workload: 1.144GiB, rate: 3.752GiB/s
bench_allreduce(np=4) took 0.3038s, total workload: 1.144GiB, rate: 3.765GiB/s
bench_allreduce(np=4) took 0.3036s, total workload: 1.144GiB, rate: 3.767GiB/s
bench_allreduce(np=4) took 0.3024s, total workload: 1.144GiB, rate: 3.782GiB/s
bench_allreduce(np=4) took 0.3026s, total workload: 1.144GiB, rate: 3.779GiB/s
bench_allreduce(np=4) took 0.3025s, total workload: 1.144GiB, rate: 3.781GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.4133s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3736s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3564s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3705s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3595s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3422s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3583s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3532s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3533s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3522s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.1201s, total workload: 1.144GiB, rate: 1.021GiB/s
bench_allreduce(np=4) took 1.1197s, total workload: 1.144GiB, rate: 1.021GiB/s
bench_allreduce(np=4) took 1.0404s, total workload: 1.144GiB, rate: 1.099GiB/s
bench_allreduce(np=4) took 1.0849s, total workload: 1.144GiB, rate: 1.054GiB/s
bench_allreduce(np=4) took 1.0356s, total workload: 1.144GiB, rate: 1.104GiB/s
bench_allreduce(np=4) took 1.0564s, total workload: 1.144GiB, rate: 1.083GiB/s
bench_allreduce(np=4) took 1.0920s, total workload: 1.144GiB, rate: 1.047GiB/s
bench_allreduce(np=4) took 1.0471s, total workload: 1.144GiB, rate: 1.092GiB/s
bench_allreduce(np=4) took 1.0921s, total workload: 1.144GiB, rate: 1.047GiB/s
bench_allreduce(np=4) took 1.0669s, total workload: 1.144GiB, rate: 1.072GiB/s
END ======================================== bench_allreduce remote ========================================
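
The `rate` column in these logs appears to be the total workload divided by the elapsed time, expressed in GiB/s. The sketch below is a hypothetical helper (not part of faabric) that parses one log line and recomputes the rate, so the reported figures can be sanity-checked. The names `parse_line` and `recomputed_rate` are assumptions made for illustration, and the regular expression only covers the line format shown above.

```python
import re

GIB = 1024 ** 3  # bytes per GiB

# Matches lines such as:
# bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
LINE_RE = re.compile(
    r"bench_allreduce\(np=(\d+)\) took ([\d.]+)s, "
    r"total workload: ([\d.]+)(B|GiB), rate: ([\d.]+)GiB/s"
)


def parse_line(line):
    """Return the fields of one benchmark line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    np_, secs, amount, unit, rate = m.groups()
    # Normalise the workload to bytes regardless of the printed unit.
    workload = float(amount) * (GIB if unit == "GiB" else 1)
    return {
        "np": int(np_),
        "seconds": float(secs),
        "bytes": workload,
        "rate_gib_s": float(rate),
    }


def recomputed_rate(rec):
    """Workload divided by elapsed time, in GiB/s."""
    return rec["bytes"] / rec["seconds"] / GIB


small = parse_line(
    "bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s"
)
large = parse_line(
    "bench_allreduce(np=4) took 0.3252s, total workload: 1.144GiB, rate: 3.517GiB/s"
)
print(round(recomputed_rate(small), 3), round(recomputed_rate(large), 3))
```

The recomputed values agree with the logged ones to within rounding of the printed workload. For the 384000B remote runs the rate is below 0.001GiB/s before rounding, so the `took` column is the more useful figure there.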

@lgarithm
Owner Author

7483943.log

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0144s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0138s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0139s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0139s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0148s, total workload: 384000B, rate: 0.024GiB/s
bench_allreduce(np=4) took 0.0152s, total workload: 384000B, rate: 0.023GiB/s
bench_allreduce(np=4) took 0.0142s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0141s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0143s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0142s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.3064s, total workload: 1.144GiB, rate: 3.733GiB/s
bench_allreduce(np=4) took 0.2885s, total workload: 1.144GiB, rate: 3.965GiB/s
bench_allreduce(np=4) took 0.2835s, total workload: 1.144GiB, rate: 4.034GiB/s
bench_allreduce(np=4) took 0.2832s, total workload: 1.144GiB, rate: 4.038GiB/s
bench_allreduce(np=4) took 0.2804s, total workload: 1.144GiB, rate: 4.079GiB/s
bench_allreduce(np=4) took 0.2787s, total workload: 1.144GiB, rate: 4.103GiB/s
bench_allreduce(np=4) took 0.2824s, total workload: 1.144GiB, rate: 4.049GiB/s
bench_allreduce(np=4) took 0.2817s, total workload: 1.144GiB, rate: 4.060GiB/s
bench_allreduce(np=4) took 0.2839s, total workload: 1.144GiB, rate: 4.029GiB/s
bench_allreduce(np=4) took 0.2810s, total workload: 1.144GiB, rate: 4.070GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.4241s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3890s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3636s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3709s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3625s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3358s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3466s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3395s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3374s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3349s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.0444s, total workload: 1.144GiB, rate: 1.095GiB/s
bench_allreduce(np=4) took 1.0518s, total workload: 1.144GiB, rate: 1.087GiB/s
bench_allreduce(np=4) took 0.9910s, total workload: 1.144GiB, rate: 1.154GiB/s
bench_allreduce(np=4) took 1.0018s, total workload: 1.144GiB, rate: 1.142GiB/s
bench_allreduce(np=4) took 1.0422s, total workload: 1.144GiB, rate: 1.097GiB/s
bench_allreduce(np=4) took 1.0015s, total workload: 1.144GiB, rate: 1.142GiB/s
bench_allreduce(np=4) took 1.0119s, total workload: 1.144GiB, rate: 1.130GiB/s
bench_allreduce(np=4) took 1.0095s, total workload: 1.144GiB, rate: 1.133GiB/s
bench_allreduce(np=4) took 1.0127s, total workload: 1.144GiB, rate: 1.129GiB/s
bench_allreduce(np=4) took 1.0237s, total workload: 1.144GiB, rate: 1.117GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author

8d10fa2.log

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0132s, total workload: 384000B, rate: 0.027GiB/s
bench_allreduce(np=4) took 0.0124s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0127s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0129s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0158s, total workload: 384000B, rate: 0.023GiB/s
bench_allreduce(np=4) took 0.0158s, total workload: 384000B, rate: 0.023GiB/s
bench_allreduce(np=4) took 0.0163s, total workload: 384000B, rate: 0.022GiB/s
bench_allreduce(np=4) took 0.0162s, total workload: 384000B, rate: 0.022GiB/s
bench_allreduce(np=4) took 0.0160s, total workload: 384000B, rate: 0.022GiB/s
bench_allreduce(np=4) took 0.2806s, total workload: 1.144GiB, rate: 4.076GiB/s
bench_allreduce(np=4) took 0.2663s, total workload: 1.144GiB, rate: 4.294GiB/s
bench_allreduce(np=4) took 0.2694s, total workload: 1.144GiB, rate: 4.245GiB/s
bench_allreduce(np=4) took 0.2684s, total workload: 1.144GiB, rate: 4.260GiB/s
bench_allreduce(np=4) took 0.2707s, total workload: 1.144GiB, rate: 4.225GiB/s
bench_allreduce(np=4) took 0.2719s, total workload: 1.144GiB, rate: 4.207GiB/s
bench_allreduce(np=4) took 0.2709s, total workload: 1.144GiB, rate: 4.221GiB/s
bench_allreduce(np=4) took 0.2796s, total workload: 1.144GiB, rate: 4.090GiB/s
bench_allreduce(np=4) took 0.2706s, total workload: 1.144GiB, rate: 4.227GiB/s
bench_allreduce(np=4) took 0.2748s, total workload: 1.144GiB, rate: 4.162GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3628s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3409s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3419s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3453s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3327s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3349s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3218s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3159s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3158s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3102s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.4023s, total workload: 1.144GiB, rate: 0.816GiB/s
bench_allreduce(np=4) took 1.1479s, total workload: 1.144GiB, rate: 0.996GiB/s
bench_allreduce(np=4) took 1.0784s, total workload: 1.144GiB, rate: 1.061GiB/s
bench_allreduce(np=4) took 1.1027s, total workload: 1.144GiB, rate: 1.037GiB/s
bench_allreduce(np=4) took 1.0772s, total workload: 1.144GiB, rate: 1.062GiB/s
bench_allreduce(np=4) took 1.0918s, total workload: 1.144GiB, rate: 1.047GiB/s
bench_allreduce(np=4) took 1.0852s, total workload: 1.144GiB, rate: 1.054GiB/s
bench_allreduce(np=4) took 1.0531s, total workload: 1.144GiB, rate: 1.086GiB/s
bench_allreduce(np=4) took 0.9709s, total workload: 1.144GiB, rate: 1.178GiB/s
bench_allreduce(np=4) took 1.0547s, total workload: 1.144GiB, rate: 1.084GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author
lgarithm commented Mar 1, 2024

7483943

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0151s, total workload: 384000B, rate: 0.024GiB/s
bench_allreduce(np=4) took 0.0138s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0130s, total workload: 384000B, rate: 0.027GiB/s
bench_allreduce(np=4) took 0.0129s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0129s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0125s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0125s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0125s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0126s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0125s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.2667s, total workload: 1.144GiB, rate: 4.289GiB/s
bench_allreduce(np=4) took 0.2396s, total workload: 1.144GiB, rate: 4.773GiB/s
bench_allreduce(np=4) took 0.2357s, total workload: 1.144GiB, rate: 4.853GiB/s
bench_allreduce(np=4) took 0.2451s, total workload: 1.144GiB, rate: 4.667GiB/s
bench_allreduce(np=4) took 0.2535s, total workload: 1.144GiB, rate: 4.512GiB/s
bench_allreduce(np=4) took 0.2547s, total workload: 1.144GiB, rate: 4.490GiB/s
bench_allreduce(np=4) took 0.2548s, total workload: 1.144GiB, rate: 4.488GiB/s
bench_allreduce(np=4) took 0.2567s, total workload: 1.144GiB, rate: 4.455GiB/s
bench_allreduce(np=4) took 0.2589s, total workload: 1.144GiB, rate: 4.417GiB/s
bench_allreduce(np=4) took 0.2648s, total workload: 1.144GiB, rate: 4.319GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3935s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3199s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3076s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3013s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3354s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3307s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3446s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3810s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3932s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3753s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.4619s, total workload: 1.144GiB, rate: 0.782GiB/s
bench_allreduce(np=4) took 1.1683s, total workload: 1.144GiB, rate: 0.979GiB/s
bench_allreduce(np=4) took 0.9212s, total workload: 1.144GiB, rate: 1.242GiB/s
bench_allreduce(np=4) took 0.8812s, total workload: 1.144GiB, rate: 1.298GiB/s
bench_allreduce(np=4) took 0.8741s, total workload: 1.144GiB, rate: 1.308GiB/s
bench_allreduce(np=4) took 0.8586s, total workload: 1.144GiB, rate: 1.332GiB/s
bench_allreduce(np=4) took 0.8343s, total workload: 1.144GiB, rate: 1.371GiB/s
bench_allreduce(np=4) took 0.8592s, total workload: 1.144GiB, rate: 1.331GiB/s
bench_allreduce(np=4) took 0.8379s, total workload: 1.144GiB, rate: 1.365GiB/s
bench_allreduce(np=4) took 0.8672s, total workload: 1.144GiB, rate: 1.319GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author
lgarithm commented Mar 1, 2024

8d10fa2

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0143s, total workload: 384000B, rate: 0.025GiB/s
bench_allreduce(np=4) took 0.0137s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0137s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0136s, total workload: 384000B, rate: 0.026GiB/s
bench_allreduce(np=4) took 0.0131s, total workload: 384000B, rate: 0.027GiB/s
bench_allreduce(np=4) took 0.0122s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0122s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0123s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0123s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.2649s, total workload: 1.144GiB, rate: 4.317GiB/s
bench_allreduce(np=4) took 0.2585s, total workload: 1.144GiB, rate: 4.425GiB/s
bench_allreduce(np=4) took 0.2618s, total workload: 1.144GiB, rate: 4.369GiB/s
bench_allreduce(np=4) took 0.2572s, total workload: 1.144GiB, rate: 4.447GiB/s
bench_allreduce(np=4) took 0.2375s, total workload: 1.144GiB, rate: 4.816GiB/s
bench_allreduce(np=4) took 0.2565s, total workload: 1.144GiB, rate: 4.459GiB/s
bench_allreduce(np=4) took 0.2618s, total workload: 1.144GiB, rate: 4.368GiB/s
bench_allreduce(np=4) took 0.2624s, total workload: 1.144GiB, rate: 4.358GiB/s
bench_allreduce(np=4) took 0.2637s, total workload: 1.144GiB, rate: 4.337GiB/s
bench_allreduce(np=4) took 0.2654s, total workload: 1.144GiB, rate: 4.310GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3693s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3492s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3510s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3492s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3472s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3508s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3212s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3187s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3108s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3327s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.4654s, total workload: 1.144GiB, rate: 0.780GiB/s
bench_allreduce(np=4) took 1.3593s, total workload: 1.144GiB, rate: 0.841GiB/s
bench_allreduce(np=4) took 1.0765s, total workload: 1.144GiB, rate: 1.062GiB/s
bench_allreduce(np=4) took 0.9870s, total workload: 1.144GiB, rate: 1.159GiB/s
bench_allreduce(np=4) took 0.9946s, total workload: 1.144GiB, rate: 1.150GiB/s
bench_allreduce(np=4) took 0.9512s, total workload: 1.144GiB, rate: 1.202GiB/s
bench_allreduce(np=4) took 0.9667s, total workload: 1.144GiB, rate: 1.183GiB/s
bench_allreduce(np=4) took 0.9778s, total workload: 1.144GiB, rate: 1.170GiB/s
bench_allreduce(np=4) took 0.9656s, total workload: 1.144GiB, rate: 1.184GiB/s
bench_allreduce(np=4) took 0.9692s, total workload: 1.144GiB, rate: 1.180GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author
lgarithm commented Mar 1, 2024

578c079

@lgarithm
Owner Author
lgarithm commented Mar 1, 2024

5c7afeb

@lgarithm
Owner Author
lgarithm commented Mar 1, 2024

3f0f4d7

@lgarithm
Owner Author

5b7667e

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0130s, total workload: 384000B, rate: 0.028GiB/s
bench_allreduce(np=4) took 0.0124s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0123s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0123s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0122s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0124s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0120s, total workload: 384000B, rate: 0.030GiB/s
bench_allreduce(np=4) took 0.0116s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0116s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0116s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.2637s, total workload: 1.144GiB, rate: 4.337GiB/s
bench_allreduce(np=4) took 0.2543s, total workload: 1.144GiB, rate: 4.497GiB/s
bench_allreduce(np=4) took 0.2547s, total workload: 1.144GiB, rate: 4.491GiB/s
bench_allreduce(np=4) took 0.2571s, total workload: 1.144GiB, rate: 4.448GiB/s
bench_allreduce(np=4) took 0.2604s, total workload: 1.144GiB, rate: 4.391GiB/s
bench_allreduce(np=4) took 0.2504s, total workload: 1.144GiB, rate: 4.567GiB/s
bench_allreduce(np=4) took 0.2560s, total workload: 1.144GiB, rate: 4.468GiB/s
bench_allreduce(np=4) took 0.2545s, total workload: 1.144GiB, rate: 4.494GiB/s
bench_allreduce(np=4) took 0.2544s, total workload: 1.144GiB, rate: 4.495GiB/s
bench_allreduce(np=4) took 0.2533s, total workload: 1.144GiB, rate: 4.515GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3831s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3537s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3587s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3512s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3410s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3503s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3591s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3488s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3470s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3470s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.0902s, total workload: 1.144GiB, rate: 1.049GiB/s
bench_allreduce(np=4) took 1.0794s, total workload: 1.144GiB, rate: 1.060GiB/s
bench_allreduce(np=4) took 0.9957s, total workload: 1.144GiB, rate: 1.149GiB/s
bench_allreduce(np=4) took 0.8494s, total workload: 1.144GiB, rate: 1.346GiB/s
bench_allreduce(np=4) took 0.7802s, total workload: 1.144GiB, rate: 1.466GiB/s
bench_allreduce(np=4) took 0.7733s, total workload: 1.144GiB, rate: 1.479GiB/s
bench_allreduce(np=4) took 0.7723s, total workload: 1.144GiB, rate: 1.481GiB/s
bench_allreduce(np=4) took 0.7585s, total workload: 1.144GiB, rate: 1.508GiB/s
bench_allreduce(np=4) took 0.7731s, total workload: 1.144GiB, rate: 1.479GiB/s
bench_allreduce(np=4) took 0.7653s, total workload: 1.144GiB, rate: 1.494GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author

ba0d691

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0124s, total workload: 384000B, rate: 0.029GiB/s
bench_allreduce(np=4) took 0.0120s, total workload: 384000B, rate: 0.030GiB/s
bench_allreduce(np=4) took 0.0120s, total workload: 384000B, rate: 0.030GiB/s
bench_allreduce(np=4) took 0.0119s, total workload: 384000B, rate: 0.030GiB/s
bench_allreduce(np=4) took 0.0119s, total workload: 384000B, rate: 0.030GiB/s
bench_allreduce(np=4) took 0.0115s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0114s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0114s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0114s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.0115s, total workload: 384000B, rate: 0.031GiB/s
bench_allreduce(np=4) took 0.2637s, total workload: 1.144GiB, rate: 4.336GiB/s
bench_allreduce(np=4) took 0.2531s, total workload: 1.144GiB, rate: 4.519GiB/s
bench_allreduce(np=4) took 0.2512s, total workload: 1.144GiB, rate: 4.553GiB/s
bench_allreduce(np=4) took 0.2541s, total workload: 1.144GiB, rate: 4.500GiB/s
bench_allreduce(np=4) took 0.2541s, total workload: 1.144GiB, rate: 4.501GiB/s
bench_allreduce(np=4) took 0.2536s, total workload: 1.144GiB, rate: 4.510GiB/s
bench_allreduce(np=4) took 0.2544s, total workload: 1.144GiB, rate: 4.495GiB/s
bench_allreduce(np=4) took 0.2535s, total workload: 1.144GiB, rate: 4.512GiB/s
bench_allreduce(np=4) took 0.2543s, total workload: 1.144GiB, rate: 4.497GiB/s
bench_allreduce(np=4) took 0.2528s, total workload: 1.144GiB, rate: 4.524GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3830s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3861s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3815s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3696s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3268s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3261s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3305s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3322s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3092s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3340s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.1843s, total workload: 1.144GiB, rate: 0.966GiB/s
bench_allreduce(np=4) took 1.0431s, total workload: 1.144GiB, rate: 1.096GiB/s
bench_allreduce(np=4) took 1.0408s, total workload: 1.144GiB, rate: 1.099GiB/s
bench_allreduce(np=4) took 0.9793s, total workload: 1.144GiB, rate: 1.168GiB/s
bench_allreduce(np=4) took 0.9878s, total workload: 1.144GiB, rate: 1.158GiB/s
bench_allreduce(np=4) took 1.0487s, total workload: 1.144GiB, rate: 1.091GiB/s
bench_allreduce(np=4) took 0.9917s, total workload: 1.144GiB, rate: 1.153GiB/s
bench_allreduce(np=4) took 1.0071s, total workload: 1.144GiB, rate: 1.136GiB/s
bench_allreduce(np=4) took 0.9798s, total workload: 1.144GiB, rate: 1.167GiB/s
bench_allreduce(np=4) took 1.0379s, total workload: 1.144GiB, rate: 1.102GiB/s
END ======================================== bench_allreduce remote ========================================

@lgarithm
Owner Author

71d3f79

BGN ======================================== bench_allreduce local ========================================
bench_allreduce(np=4) took 0.0038s, total workload: 384000B, rate: 0.093GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.097GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.097GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.097GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.097GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.098GiB/s
bench_allreduce(np=4) took 0.0037s, total workload: 384000B, rate: 0.097GiB/s
bench_allreduce(np=4) took 0.0035s, total workload: 384000B, rate: 0.103GiB/s
bench_allreduce(np=4) took 0.0033s, total workload: 384000B, rate: 0.108GiB/s
bench_allreduce(np=4) took 0.0032s, total workload: 384000B, rate: 0.110GiB/s
bench_allreduce(np=4) took 0.2252s, total workload: 1.144GiB, rate: 5.079GiB/s
bench_allreduce(np=4) took 0.2040s, total workload: 1.144GiB, rate: 5.606GiB/s
bench_allreduce(np=4) took 0.2034s, total workload: 1.144GiB, rate: 5.622GiB/s
bench_allreduce(np=4) took 0.2023s, total workload: 1.144GiB, rate: 5.652GiB/s
bench_allreduce(np=4) took 0.2098s, total workload: 1.144GiB, rate: 5.450GiB/s
bench_allreduce(np=4) took 0.2046s, total workload: 1.144GiB, rate: 5.590GiB/s
bench_allreduce(np=4) took 0.2052s, total workload: 1.144GiB, rate: 5.573GiB/s
bench_allreduce(np=4) took 0.2044s, total workload: 1.144GiB, rate: 5.596GiB/s
bench_allreduce(np=4) took 0.2060s, total workload: 1.144GiB, rate: 5.551GiB/s
bench_allreduce(np=4) took 0.2059s, total workload: 1.144GiB, rate: 5.553GiB/s
END ======================================== bench_allreduce local ========================================
BGN ======================================== bench_allreduce remote ========================================
bench_allreduce(np=4) took 0.3308s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3140s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3002s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3112s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3261s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3326s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3107s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.3072s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.2969s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 0.2862s, total workload: 384000B, rate: 0.001GiB/s
bench_allreduce(np=4) took 1.0909s, total workload: 1.144GiB, rate: 1.048GiB/s
bench_allreduce(np=4) took 0.9866s, total workload: 1.144GiB, rate: 1.159GiB/s
bench_allreduce(np=4) took 0.9789s, total workload: 1.144GiB, rate: 1.168GiB/s
bench_allreduce(np=4) took 0.9276s, total workload: 1.144GiB, rate: 1.233GiB/s
bench_allreduce(np=4) took 0.9301s, total workload: 1.144GiB, rate: 1.230GiB/s
bench_allreduce(np=4) took 0.9553s, total workload: 1.144GiB, rate: 1.197GiB/s
bench_allreduce(np=4) took 0.9490s, total workload: 1.144GiB, rate: 1.205GiB/s
bench_allreduce(np=4) took 0.9726s, total workload: 1.144GiB, rate: 1.176GiB/s
bench_allreduce(np=4) took 0.8959s, total workload: 1.144GiB, rate: 1.277GiB/s
bench_allreduce(np=4) took 0.9749s, total workload: 1.144GiB, rate: 1.173GiB/s
END ======================================== bench_allreduce remote ========================================
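
For anyone comparing these runs, the `rate` column looks like total workload divided by wall time, expressed in GiB/s. That is inferred from the printed numbers, not taken from the benchmark source, so treat the sketch below as an assumption. The regex and the helper name `recompute_rate` are hypothetical and only match the line format shown in these logs. Small deviations from the reported rate are expected because the printed time and workload are rounded.

```python
import re

# Matches the log line format shown above (assumed stable across runs).
LINE = re.compile(
    r"bench_allreduce\(np=(\d+)\) took ([\d.]+)s, "
    r"total workload: ([\d.]+)(B|GiB), rate: ([\d.]+)GiB/s"
)
GIB = 1 << 30  # bytes per GiB


def recompute_rate(line):
    """Return (recomputed GiB/s, reported GiB/s), or None if the line does not match."""
    m = LINE.search(line)
    if m is None:
        return None
    seconds = float(m.group(2))
    size = float(m.group(3))
    # The workload is printed either in raw bytes or in GiB.
    workload_bytes = size * GIB if m.group(4) == "GiB" else size
    reported = float(m.group(5))
    return workload_bytes / seconds / GIB, reported


if __name__ == "__main__":
    sample = (
        "bench_allreduce(np=4) took 0.2059s, "
        "total workload: 1.144GiB, rate: 5.553GiB/s"
    )
    print(recompute_rate(sample))
```

Feeding each log line through `recompute_rate` gives a quick sanity check that a pasted log is internally consistent before comparing commits.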
