DOI: 10.1145/2835776.2835847
invited-talk

Is Mail The Next Frontier In Search And Data Mining?

Published: 08 February 2016

Abstract

The nature of Web mail traffic has evolved significantly in the last two decades, and the behavior of Web mail users has changed with it. For instance, a recent study conducted by Yahoo Labs showed that today 90% of Web mail traffic is machine-generated. This partly explains why email traffic continues to grow even though a significant share of personal communication has moved to social media. Most users today receive important invoices, receipts, and travel itineraries in their inbox, together with non-malicious junk mail, such as hotel newsletters or shopping promotions, that they could safely ignore. This is one of the reasons that a majority of messages remain unread, and many are deleted without being read. In that sense, Web mail has become quite similar to traditional snail mail. In spite of this drastic change in nature, many mail features remain unchanged. While 70% of mail users do not define even a single folder, folders are still prominent in the left rail of many Web mail clients. Mail search results are still mostly ranked by date, which makes retrieving older messages extremely challenging. This is all the more painful for users because, unlike in Web search, they know when a relevant, previously read message has not been returned.
In this talk, I present the results of multiple large-scale studies conducted at Yahoo Labs in the last few years. I highlight the inherent challenges associated with such studies, especially around privacy concerns. I will discuss the new nature of consumer Web mail, which is dominated by machine-generated messages of highly heterogeneous form and value. I will show how this change has not yet been fully recognized by most email clients. As an example, why should there still be a reply option associated with a message coming from a "do-not-reply@" address? I will introduce some approaches for large-scale mail mining specifically tailored to machine-generated email. I will conclude by discussing possible applications and research directions.



    Published In

    WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
    February 2016
    746 pages
    ISBN: 9781450337168
    DOI: 10.1145/2835776
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. machine-generated email
    2. mail search
    3. web mail

    Qualifiers

    • Invited-talk

    Conference

    WSDM 2016
    WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining
    February 22 - 25, 2016
    San Francisco, California, USA

    Acceptance Rates

    WSDM '16 Paper Acceptance Rate 67 of 368 submissions, 18%;
    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Cited By

    • (2019) Online template induction for machine-generated emails. Proceedings of the VLDB Endowment 12:11 (1235-1248). DOI: 10.14778/3342263.3342264. Online publication date: 1-Jul-2019
    • (2019) RiSER: Learning Better Representations for Richly Structured Emails. The World Wide Web Conference (886-895). DOI: 10.1145/3308558.3313720. Online publication date: 13-May-2019
    • (2018) Anatomy of a Privacy-Safe Large-Scale Information Extraction System Over Email. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (734-743). DOI: 10.1145/3219819.3219901. Online publication date: 19-Jul-2018
    • (2018) Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction. 2018 IEEE International Conference on Big Data (Big Data) (1846-1855). DOI: 10.1109/BigData.2018.8622048. Online publication date: Dec-2018
    • (2018) Template Trees: Extracting Actionable Information from Machine Generated Emails. Database and Expert Systems Applications (3-18). DOI: 10.1007/978-3-319-98812-2_1. Online publication date: 9-Aug-2018
    • (2017) Email Category Prediction. Proceedings of the 26th International Conference on World Wide Web Companion (495-503). DOI: 10.1145/3041021.3055166. Online publication date: 3-Apr-2017
