I get different results on simple network operation on a computer with and without AVX512

@jgong5

🐛 Describe the bug

I get different results when executing the following minimal example on a computer with AVX-512 compared to a computer without AVX-512:

import numpy as np
import torch
import torch.nn as nn
import random
torch.use_deterministic_algorithms(True)

seed = 42

random.seed(seed)
torch.manual_seed(seed)
np.random.seed(seed)

device = torch.device("cpu")

def layer_init(layer, std=np.sqrt(2), bias_const=0.0):
    torch.nn.init.orthogonal_(layer.weight, std)
    torch.nn.init.constant_(layer.bias, bias_const)
    return layer

network = nn.Sequential(
    layer_init(nn.Linear(100, 100)),
    layer_init(nn.Linear(100, 1), std=1.0),
).to(device)
with torch.no_grad():
    action = network(torch.rand(size=(100,)).to(device))

print(format(action.cpu().numpy()[0].item(), '.60g'))

The result on the AVX-512 computer is 0.4804637432098388671875; on the non-AVX-512 computer I get 0.4804628789424896240234375, a difference of about 8.6e-7.
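To put the size of the gap in perspective, here is a small sketch that measures the distance between the two printed values in float32 units in the last place (ULPs). It uses only the standard library; the helper names `float32_bits` and `ulp_distance` are made up for this illustration and are not part of PyTorch.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the raw IEEE-754 bit pattern of x rounded to float32."""
    return struct.unpack("<i", struct.pack("<f", x))[0]

def ulp_distance(a: float, b: float) -> int:
    """Number of float32 steps between a and b (valid for same-sign finite values)."""
    return abs(float32_bits(a) - float32_bits(b))

# The two values reported above.
avx512_result = 0.4804637432098388671875
other_result = 0.4804628789424896240234375

abs_diff = abs(avx512_result - other_result)
ulps = ulp_distance(avx512_result, other_result)
print(f"absolute difference: {abs_diff:.3e}")
print(f"float32 ULP distance: {ulps}")
```

A distance of a few dozen ULPs after two 100-wide linear layers would be consistent with a different floating-point accumulation order, but that is an assumption on my side and not something I have confirmed.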

I ran the test on a VM with and without AVX-512 to make sure that the difference comes from AVX-512 and not from a dependency or from other hardware differences; I also ran it on several computers to confirm. For better reproducibility I use a torch build from Guix, but I also ran the test without Guix and still obtained different results.

Is this a bug or a known limitation? If it is a known limitation, where does it come from, and can I disable the acceleration to get reproducible results?
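For reference, this is the kind of workaround I have in mind: a minimal sketch that caps the instruction set through environment variables before torch is loaded. `ATEN_CPU_CAPABILITY` is read by ATen's kernel dispatcher and `ONEDNN_MAX_CPU_ISA` by oneDNN's CPU dispatcher; whether either one affects the `nn.Linear` path in this particular build is an assumption, since the matrix multiply may go through a BLAS library with its own dispatch. The helper `restricted_isa_env` and the script name `repro.py` are hypothetical.

```python
import os

def restricted_isa_env(base_env=None):
    """Return a copy of the environment that asks CPU backends to stay at AVX2."""
    env = dict(os.environ if base_env is None else base_env)
    # ATen's vectorized kernels: accepted values include default, avx2, avx512.
    env["ATEN_CPU_CAPABILITY"] = "avx2"
    # oneDNN's dispatcher; only matters if the build uses oneDNN (assumption).
    env["ONEDNN_MAX_CPU_ISA"] = "AVX2"
    return env

# Usage sketch (repro.py is a hypothetical file holding the example above).
# The variables must be set before torch is imported, hence the subprocess:
#
#   import subprocess, sys
#   subprocess.run([sys.executable, "repro.py"], env=restricted_isa_env(), check=True)
```

If the two machines still disagree with these set, the difference presumably comes from a layer below ATen and oneDNN.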

Versions

PyTorch version: 2.5.0a0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: linux (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.39

Python version: 3.10.7 (main, Jan  1 1970, 00:00:01) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.12.31-1-lts-x86_64-with-glibc2.39
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
/gnu/store/m0xdsa8cfq6mq1kxgxmpmpg71la4f0b9-bash-minimal-5.1.16/bin/sh: line 1: lscpu: command not found

Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] 
[pip3] 
[pip3] optree==0.14.0
[pip3] torch==2.5.0a0+gitunknown
[conda] Could not collect

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @frank-wei
