Dhawal Gupta
2020 – today
- 2024
- [j7] Alex Ayoub, David Szepesvari, Francesco Zanini, Bryan Chan, Dhawal Gupta, Bruno Castro da Silva, Dale Schuurmans: Mitigating the Curse of Horizon in Monte-Carlo Returns. RLJ 2: 563-572 (2024)
- [j6] Kartik Choudhary, Dhawal Gupta, Philip S. Thomas: ICU-Sepsis: A Benchmark MDP Built from Real Medical Data. RLJ 4: 1546-1566 (2024)
- [j5] Mehwash Weqar, Shabana Mehfuz, Dhawal Gupta, Shabana Urooj: Adaptive Switching Based Data-Communication Model for Internet of Healthcare Things Networks. IEEE Access 12: 11530-11548 (2024)
- [c7] Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva: From Past to Future: Rethinking Eligibility Traces. AAAI 2024: 12253-12260
- [i8] Kartik Choudhary, Dhawal Gupta, Philip S. Thomas: ICU-Sepsis: A Benchmark MDP Built from Real Medical Data. CoRR abs/2406.05646 (2024)
- [i7] Erfan Entezami, Mahsa Sahebdel, Dhawal Gupta: A Safe Exploration Strategy for Model-free Task Adaptation in Safety-constrained Grid Environments. CoRR abs/2408.00997 (2024)
- 2023
- [c6] Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier: A Mixture-of-Expert Approach to RL-based Dialogue Management. ICLR 2023
- [c5] Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno C. da Silva: Behavior Alignment via Reward Function Optimization. NeurIPS 2023
- [c4] Dhawal Gupta, Yinlam Chow, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier: Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management. NeurIPS 2023
- [i6] Dhawal Gupta, Yinlam Chow, Mohammad Ghavamzadeh, Craig Boutilier: Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management. CoRR abs/2302.10850 (2023)
- [i5] James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas: Coagent Networks: Generalized and Scaled. CoRR abs/2305.09838 (2023)
- [i4] Simeng Sun, Dhawal Gupta, Mohit Iyyer: Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF. CoRR abs/2309.09055 (2023)
- [i3] Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva: Behavior Alignment via Reward Function Optimization. CoRR abs/2310.19007 (2023)
- [i2] Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva: From Past to Future: Rethinking Eligibility Traces. CoRR abs/2312.12972 (2023)
- 2021
- [j4] Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya: Emotion Aided Dialogue Act Classification for Task-Independent Conversations in a Multi-modal Framework. Cogn. Comput. 13(2): 277-289 (2021)
- [j3] Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya: A hierarchical approach for efficient multi-intent dialogue policy learning. Multim. Tools Appl. 80(28-29): 35025-35050 (2021)
- [j2] Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya: A Unified Dialogue Management Strategy for Multi-intent Dialogue Conversations in Multiple Languages. ACM Trans. Asian Low Resour. Lang. Inf. Process. 20(6): 99:1-99:22 (2021)
- [c3] Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James E. Kostas, Philip S. Thomas, Martha White: Structural Credit Assignment in Neural Networks using Reinforcement Learning. NeurIPS 2021: 30257-30270
- 2020
- [j1] Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya: Towards integrated dialogue policy learning for multiple domains and intents using Hierarchical Deep Reinforcement Learning. Expert Syst. Appl. 162: 113650 (2020)
- [c2] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White: Gradient Temporal-Difference Learning with Regularized Corrections. ICML 2020: 3524-3534
- [i1] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White: Gradient Temporal-Difference Learning with Regularized Corrections. CoRR abs/2007.00611 (2020)
2010 – 2019
- 2018
- [c1] Tulika Saha, Dhawal Gupta, Sriparna Saha, Pushpak Bhattacharyya: Reinforcement Learning Based Dialogue Management Strategy. ICONIP (3) 2018: 359-372
last updated on 2024-11-25 22:48 CET by the dblp team
all metadata released as open data under CC0 1.0 license