I've come across some simple Bayesian models for which Mooncake is significantly (~4x) slower than Enzyme or an alternative, very limited, proof-of-concept Julia AD method (StanBlocksAD.jl). AFAICT, Mooncake should be able to match the performance of Enzyme and StanBlocksAD.jl. It's unclear to me what exactly is "dragging Mooncake down".
Furthermore, for a batched version of that model, neither Enzyme nor Mooncake achieves the same scaling as StanBlocksAD.jl. To clarify/summarize, the timings relative to the scalar StanBlocksAD.jl/Enzyme.jl timing are roughly:
I don't intend to continue developing StanBlocksAD.jl, but I find it interesting that there are apparently still performance gains to be had for something purely Julian. We can discuss what StanBlocksAD.jl does differently from Mooncake and what, if anything, could be ported to Mooncake. But this issue is mainly meant to record this link, and to be revisited at some later point.
Using StanBlocks.constview here instead of a regular view (as commented out above) made all versions faster IIRC, because it avoids an allocation that Base.view apparently feels compelled to do.
This is a low priority issue.
Notebook with (slightly different) timings and potentially reproducible code: https://nsiccha.github.io/StanBlocksAD.jl/#why
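For anyone who wants a rough comparison without opening the notebook, here is a minimal sketch of timing a Mooncake gradient against an Enzyme gradient. It assumes DifferentiationInterface.jl and BenchmarkTools.jl are installed alongside Mooncake and Enzyme, and it uses a hypothetical stand-in log density (`logdensity`, a standard normal), not the model from this issue, so the ratio it prints need not resemble the ~4x reported above.

```julia
using DifferentiationInterface   # assumed front end; not necessarily what the notebook uses
using BenchmarkTools
import Mooncake, Enzyme

# Hypothetical stand-in target: an unnormalized standard normal log density.
# This is NOT the Bayesian model discussed in the issue.
logdensity(x) = -0.5 * sum(abs2, x)

x = randn(100)

for backend in (AutoMooncake(; config=nothing), AutoEnzyme())
    # Preparation lets each backend build its rule/tape once, outside the timed region.
    prep = prepare_gradient(logdensity, backend, x)
    g = similar(x)
    # Time only the in-place gradient evaluation.
    t = @belapsed gradient!($logdensity, $g, $prep, $backend, $x)
    println(backend, ": ", t, " s per gradient")
end
```

Preparing once and timing only `gradient!` keeps one-off setup cost out of the measurement, which matters when the gap under discussion is a small constant factor.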