10000 Optimise `applyQFT()` · Issue #604 · QuEST-Kit/QuEST · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Optimise applyQFT() #604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
TysonRayJones opened this issue Apr 30, 2025 · 1 comment
Open

Optimise applyQFT() #604

TysonRayJones opened this issue Apr 30, 2025 · 1 comment

Comments

@TysonRayJones
Copy link
Member

The current implementation of applyQuantumFourierTransform() merely effects the canonical gates of the QFT in-turn. An optimised routine is possible, whereby we merge contiguous phase gates into a single, diagonal operator. In QuEST v3, this was implemented using the (now deprecated and absolutely awful) applyPhaseFunc() routine.

In QuEST v4, an equivalent diagonal operator could be effected by a applyFullStateDiagMatr using a temporarily allocated applyFullStateDiagMatr. However, this incurs gratuitous memory allocation costs given that the elements are compile-time known.

We should instead implement the necessary diagonal operator as a bespoke routine. This requires re-deriving it (since removed last-minute from the distributed manuscript) and updating applyQuantumFourierTransform() to make use of it. The algorithm is very cute, compatible with both distribution, and density matrices, and non-contiguous arbitrarily-ordered target qubits. As such, it may be a superior solution to performing the FFT upon the constituent amplitudes directly.

This may serve as an excellent project for a student and worthy of publication 🙌

@TysonRayJones
Copy link
Member Author

Worth pointing out that QFT on a non-contiguous, arbitrarily-ordered subset of qubits can make use of FFT; we simply swap active qubits to the lowest ordered positions, apply many instances of FFT in parallel upon all 2^#targets sub-arrays, and SWAP back. This works also on density matrices whereby we process each column independently. However...

  • this precludes full-state QFT in distributed settings; we could not target log2(#nodes) qubits.
  • this would really benefit from a fused SWAP
  • this complicates threading logic; we should parallelise within one big FFT, but across multiple small FFTs.
  • this requires ill-motivated use of an external library (like FFTW) for a core function, or reinvention of the wheel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant
0