SciTePress - Publication Details

Paper

Filtering a Reference Corpus to Generalize Stylometric Representations

Topics: Deep Learning; Machine Learning; Neural Networks

In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - KDIR, pages 259-268, 2020

Authors: Julien Hay ¹,²,³ ; Bich-Liên Doan ¹,² ; Fabrice Popineau ¹,² and Ouassim Ait Elhara ³

Affiliations: ¹ CentraleSupélec, Paris-Saclay University, 91190 Gif-sur-Yvette, France ; ² Laboratoire de Recherche en Informatique, Paris-Saclay University, 91190 Gif-sur-Yvette, France ; ³ Octopeek SAS, 95880 Enghien-les-Bains, France

Keyword(s): Writing Style, Authorship Analysis, Representation Learning, Deep Learning, Filtering, Preprocessing.

Abstract: Authorship analysis aims at studying writing styles to predict authorship of a portion of a written text. Our main task is to represent documents so that they reflect authorship. To reach the goal, we use these representations for authorship attribution, which means the author of a document is identified out of a list of known authors. We have recently shown that style can be generalized to a set of reference authors. We trained a DNN to identify the authors of a large reference corpus and then learnt how to represent style in a general stylometric space. By using such a representation learning method, we can embed new documents into this stylometric space, and therefore stylistic features can be highlighted. In this paper, we want to validate the following hypothesis: the more authorship terms are filtered, the more models can be generalized. Attention can thus be focused on style-related and constituent linguistic structures in authors’ styles. To reach this aim, we suggest a new, efficient and highly scalable filtering process. This process permits a higher accuracy on various test sets on both authorship attribution and clustering tasks.

CC BY-NC-ND 4.0

Paper citation in several formats:

Hay, J. ; Doan, B. ; Popineau, F. and Elhara, O. (2020). Filtering a Reference Corpus to Generalize Stylometric Representations. In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - KDIR; ISBN 978-989-758-474-9; ISSN 2184-3228, SciTePress, pages 259-268. DOI: 10.5220/0010138802590268

@conference{kdir20,
author={Julien Hay and Bich{-}Liên Doan and Fabrice Popineau and Ouassim Ait Elhara},
title={Filtering a Reference Corpus to Generalize Stylometric Representations},
booktitle={Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - KDIR},
year={2020},
pages={259-268},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010138802590268},
isbn={978-989-758-474-9},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - KDIR
TI - Filtering a Reference Corpus to Generalize Stylometric Representations
SN - 978-989-758-474-9
IS - 2184-3228
AU - Hay, J.
AU - Doan, B.
AU - Popineau, F.
AU - Elhara, O.
PY - 2020
SP - 259
EP - 268
DO - 10.5220/0010138802590268
PB - SciTePress