Hadi Daneshmand
2020 – today
- 2024
  - [c15] Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand: Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion. ICLR 2024
  - [i19] Jiuqi Wang, Ethan Blaser, Hadi Daneshmand, Shangtong Zhang: Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning. CoRR abs/2405.13861 (2024)
  - [i18] Hadi Daneshmand: Provable optimal transport with transformers: The essence of depth and prompt engineering. CoRR abs/2410.19931 (2024)
- 2023
  - [c14] Hadi Daneshmand, Jason D. Lee, Chi Jin: Efficient displacement convex optimization with particle gradient descent. ICML 2023: 6836-6854
  - [c13] Amir Joudaki, Hadi Daneshmand, Francis R. Bach: On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization. ICML 2023: 15388-15400
  - [c12] Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra: Transformers learn to implement preconditioned gradient descent for in-context learning. NeurIPS 2023
  - [c11] Amir Joudaki, Hadi Daneshmand, Francis R. Bach: On the impact of activation and normalization in obtaining isometric embeddings at initialization. NeurIPS 2023
  - [i17] Hadi Daneshmand, Jason D. Lee, Chi Jin: Efficient displacement convex optimization with particle gradient descent. CoRR abs/2302.04753 (2023)
  - [i16] Amir Joudaki, Hadi Daneshmand, Francis R. Bach: On the impact of activation and normalization in obtaining isometric embeddings at initialization. CoRR abs/2305.18399 (2023)
  - [i15] Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra: Transformers learn to implement preconditioned gradient descent for in-context learning. CoRR abs/2306.00297 (2023)
  - [i14] Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand: Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion. CoRR abs/2310.02012 (2023)
- 2022
  - [i13] Hadi Daneshmand, Francis R. Bach: Polynomial-time sparse measure recovery. CoRR abs/2204.07879 (2022)
  - [i12] Amir Joudaki, Hadi Daneshmand, Francis R. Bach: Entropy Maximization with Depth: A Variational Principle for Random Neural Networks. CoRR abs/2205.13076 (2022)
- 2021
  - [c10] Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith: Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization. AISTATS 2021: 3979-3987
  - [c9] Hadi Daneshmand, Amir Joudaki, Francis R. Bach: Batch Normalization Orthogonalizes Representations in Deep Random Networks. NeurIPS 2021: 4896-4906
  - [c8] Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand: Rethinking the Variational Interpretation of Accelerated Optimization Methods. NeurIPS 2021: 14396-14406
  - [i11] Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith: Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization. CoRR abs/2102.11537 (2021)
  - [i10] Hadi Daneshmand, Amir Joudaki, Francis R. Bach: Batch Normalization Orthogonalizes Representations in Deep Random Networks. CoRR abs/2106.03970 (2021)
- 2020
  - [b1] Hadi Daneshmand: Optimization for Neural Networks: Quest for Theoretical Understandings. ETH Zurich, Zürich, Switzerland, 2020
  - [c7] Hadi Daneshmand, Jonas Moritz Kohler, Francis R. Bach, Thomas Hofmann, Aurélien Lucchi: Batch normalization provably avoids ranks collapse for randomly initialised deep networks. NeurIPS 2020
  - [i9] Hadi Daneshmand, Jonas Moritz Kohler, Francis R. Bach, Thomas Hofmann, Aurélien Lucchi: Theoretical Understanding of Batch-normalization: A Markov Chain Perspective. CoRR abs/2003.01652 (2020)
2010 – 2019
- 2019
  - [c6] Leonard Adolphs, Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann: Local Saddle Point Optimization: A Curvature Exploitation Approach. AISTATS 2019: 486-495
  - [c5] Jonas Moritz Kohler, Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann, Ming Zhou, Klaus Neymeyr: Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization. AISTATS 2019: 806-815
  - [i8] Peiyuan Zhang, Hadi Daneshmand, Thomas Hofmann: Mixing of Stochastic Accelerated Gradient Descent. CoRR abs/1910.14616 (2019)
- 2018
  - [c4] Hadi Daneshmand, Jonas Moritz Kohler, Aurélien Lucchi, Thomas Hofmann: Escaping Saddles with Stochastic Gradients. ICML 2018: 1163-1172
  - [i7] Hadi Daneshmand, Jonas Moritz Kohler, Aurélien Lucchi, Thomas Hofmann: Escaping Saddles with Stochastic Gradients. CoRR abs/1803.05999 (2018)
  - [i6] Leonard Adolphs, Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann: Local Saddle Point Optimization: A Curvature Exploitation Approach. CoRR abs/1805.05751 (2018)
  - [i5] Jonas Moritz Kohler, Hadi Daneshmand, Aurélien Lucchi, Ming Zhou, Klaus Neymeyr, Thomas Hofmann: Towards a Theoretical Understanding of Batch Normalization. CoRR abs/1805.10694 (2018)
- 2017
  - [i4] Hadi Daneshmand, Hamed Hassani, Thomas Hofmann: Accelerated Dual Learning by Homotopic Initialization. CoRR abs/1706.03958 (2017)
- 2016
  - [j1] Manuel Gomez-Rodriguez, Le Song, Hadi Daneshmand, Bernhard Schölkopf: Estimating Diffusion Networks: Recovery Conditions, Sample Complexity and Soft-thresholding Algorithm. J. Mach. Learn. Res. 17: 90:1-90:29 (2016)
  - [c3] Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann: Starting Small - Learning with Adaptive Sample Sizes. ICML 2016: 1463-1471
  - [c2] Aryan Mokhtari, Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann, Alejandro Ribeiro: Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy. NIPS 2016: 4062-4070
  - [i3] Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann: Starting Small - Learning with Adaptive Sample Sizes. CoRR abs/1603.02839 (2016)
  - [i2] Hadi Daneshmand, Aurélien Lucchi, Thomas Hofmann: DynaNewton - Accelerating Newton's Method for Machine Learning. CoRR abs/1605.06561 (2016)
- 2014
  - [c1] Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schölkopf: Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm. ICML 2014: 793-801
  - [i1] Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schölkopf: Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm. CoRR abs/1405.2936 (2014)
last updated on 2024-11-30 00:13 CET by the dblp team
all metadata released as open data under CC0 1.0 license