Hongyu Shi
Author Profile Pages
- Description: The Author Profile Page initially collects all the professional information known about authors from the publications record as known by the ACM bibliographic database, the Guide. Coverage of ACM publications is comprehensive from the 1950's. Coverage of other publishers generally starts in the mid 1980's. The Author Profile Page supplies a quick snapshot of an author's contribution to the field and some rudimentary measures of influence upon it. Over time, the contents of the Author Profile page may expand at the direction of the community.
Please see the following 2007 Turing Award winners' profiles as examples:
- History: Disambiguation of author names is of course required for precise identification of all the works, and only those works, by a unique individual. Of equal importance to ACM, author name normalization is also one critical prerequisite to building accurate citation and download statistics. For the past several years, ACM has worked to normalize author names, expand reference capture, and gather detailed usage statistics, all intended to provide the community with a robust set of publication metrics. The Author Profile Pages reveal the first result of these efforts.
- Normalization: ACM uses normalization algorithms to weigh several types of evidence for merging and splitting names.
These include:
- co-authors: if two names cannot be disambiguated on the name alone, we check whether they have a co-author in common. If so, this weighs toward the two names being the same person.
- affiliations: matching names with the same affiliation weigh toward the two names being the same person.
- publication title: matching names whose works are published in the same journal weigh toward the two names being the same person.
- keywords: matching names whose works address the same subject matter, as determined from title and keywords, weigh toward the two names being the same person.
The more conservative the merging algorithms, the more bits of evidence are required before a merge is made, resulting in greater precision but lower recall of works for a given Author Profile. Many bibliographic records have only author initials. Many names lack affiliations. With very common family names, typical in Asia, more liberal algorithms result in mistaken merges.
Automatic normalization of author names is not exact. Hence it is clear that manual intervention based on human knowledge is required to perfect algorithmic results. ACM is meeting this challenge, continuing to work to improve the automated merges by tweaking the weighting of the evidence in light of experience.
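The evidence-weighing idea described above can be illustrated with a minimal sketch. ACM's actual algorithm is not given in this text, so everything here is an assumption made for illustration: the `NameRecord` fields, the `WEIGHTS` values, and the threshold-based `should_merge` rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NameRecord:
    """A hypothetical bibliographic name record (not ACM's actual schema)."""
    name: str
    coauthors: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    venues: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


# Illustrative weights only; the real weighting is not stated in the text.
WEIGHTS = {"coauthors": 3.0, "affiliations": 2.0, "venues": 1.0, "keywords": 1.0}


def merge_score(a, b):
    """Sum the weights of every evidence type the two records share."""
    score = 0.0
    for kind, weight in WEIGHTS.items():
        if getattr(a, kind) & getattr(b, kind):  # non-empty set intersection
            score += weight
    return score


def should_merge(a, b, threshold):
    """A higher threshold is more conservative: fewer merges, higher precision."""
    return merge_score(a, b) >= threshold


r1 = NameRecord("J. Smith", coauthors={"A. Author"}, venues={"Journal X"})
r2 = NameRecord("Jane Smith", coauthors={"A. Author"}, affiliations={"Univ. Y"})

print(merge_score(r1, r2))        # only the co-author evidence is shared
print(should_merge(r1, r2, 2.0))  # liberal threshold
print(should_merge(r1, r2, 5.0))  # conservative threshold
```

In this sketch the same pair of records is merged under the liberal threshold and kept separate under the conservative one, which is the precision versus recall trade-off the text describes.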
- Bibliometrics: In 1926, Alfred Lotka formulated his power law (known as Lotka's Law) describing the frequency of publication by authors in a given field. According to this bibliometric law of scientific productivity, only a very small percentage (~6%) of authors in a field will produce more than 10 articles, while the majority (perhaps 60%) will have but a single article published. With ACM's first cut at author name normalization in place, the distribution of our authors with 1, 2, 3, ..., n publications does not match Lotka's Law precisely, but neither is the distribution curve far off. For a definition of ACM's first set of publication statistics, see Bibliometrics.
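The two percentages quoted for Lotka's Law can be checked with a short calculation. The sketch below assumes the classic inverse-square form of the law, in which the share of authors with exactly n publications is proportional to 1/n^2; the exponent 2 is the textbook value, not something stated in this text, and the function names are illustrative.

```python
import math

# Normalizing constant: the sum of 1/n**2 over all n >= 1 equals pi**2 / 6.
ZETA_2 = math.pi ** 2 / 6


def share_with_exactly(n):
    """Share of authors with exactly n publications under the inverse-square law."""
    return (1 / n ** 2) / ZETA_2


def share_with_more_than(k):
    """Share of authors with more than k publications."""
    return 1 - sum(share_with_exactly(n) for n in range(1, k + 1))


print(f"single article: {share_with_exactly(1):.1%}")
print(f"more than 10:   {share_with_more_than(10):.1%}")
```

Under that assumption, roughly 61% of authors have a single article and roughly 6% have more than 10, consistent with the figures given above.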
- Future Direction:
The initial release of the Author Edit Screen is open to anyone in the community with an ACM account, but it is limited to personal information. An author's photograph, a Home Page URL, and an email may be added, deleted or edited. Changes are reviewed before they are made available on the live site.
ACM will expand this edit facility to accommodate more types of data and facilitate ease of community participation with appropriate safeguards. In particular, authors or members of the community will be able to indicate works in their profile that do not belong there and merge others that do belong but are currently missing.
A direct search interface for Author Profiles will be built.
An institutional view of works emerging from their faculty and researchers will be provided along with a relevant set of metrics.
It is possible, too, that the Author Profile page may evolve to allow interested authors to upload unpublished professional materials to an area available for search and free educational use, but distinct from the ACM Digital Library proper. It is hard to predict what shape such an area for user-generated content may take, but it carries interesting potential for input from the community.
Bibliometrics
The ACM DL is a comprehensive repository of publications from the entire field of computing.
It is ACM's intention to make the derivation of any publication statistics it generates clear to the user.
- Average citations per article = The total Citation Count divided by the total Publication Count.
- Citation Count = cumulative total number of times all authored works by this author were cited by other works within ACM's bibliographic database. Almost all reference lists in articles published by ACM have been captured. Reference lists from other publishers are less well-represented in the database. Unresolved references are not included in the Citation Count. The Citation Count is citations TO any type of work, but the references counted are only FROM journal and proceedings articles. Reference lists from books, dissertations, and technical reports have not generally been captured in the database. (Citation Counts for individual works are displayed with the individual record listed on the Author Page.)
- Publication Count = all works of any genre within the universe of ACM's bibliographic database of computing literature of which this person was an author. Works where the person has a role such as editor, advisor, or chair are listed on the page but are not part of the Publication Count.
- Publication Years = the span from the earliest year of publication on a work by this author to the most recent year of publication of a work by this author captured within the ACM bibliographic database of computing literature (The ACM Guide to Computing Literature, also known as "the Guide").
- Available for download = the total number of works by this author whose full texts may be downloaded from an ACM full-text article server. Downloads from external full-text sources linked to from within the ACM bibliographic space are not counted as 'available for download'.
- Average downloads per article = The total number of cumulative downloads divided by the number of articles (including multimedia objects) available for download from ACM's servers.
- Downloads (cumulative) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server since the downloads were first counted in May 2003. The counts displayed are updated monthly and are therefore 0-31 days behind the current date. Robotic activity is scrubbed from the download statistics.
- Downloads (12 months) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 12-month period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (12-month download counts for individual works are displayed with the individual record.)
- Downloads (6 weeks) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 6-week period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (6-week download counts for individual works are displayed with the individual record.)
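The definitions above can be sketched as a small calculation. This is an illustration under stated assumptions, not ACM's implementation: the `Work` record, its field names, and the `profile_metrics` function are hypothetical, and the sample data is invented purely to exercise the formulas.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Hypothetical per-work record; field names are illustrative."""
    year: int
    citations: int        # resolved citations to this work
    downloadable: bool    # full text served from an ACM server
    downloads: int        # cumulative downloads (0 if not downloadable)
    role: str = "author"  # editor, advisor, chair, etc. are not counted


def profile_metrics(works):
    # Only works where the person is an author enter the Publication Count.
    authored = [w for w in works if w.role == "author"]
    # Only works downloadable from ACM's servers count as available.
    available = [w for w in authored if w.downloadable]

    publication_count = len(authored)
    citation_count = sum(w.citations for w in authored)
    downloads = sum(w.downloads for w in available)
    years = [w.year for w in authored]

    return {
        "publication_count": publication_count,
        "citation_count": citation_count,
        "avg_citations_per_article": (
            citation_count / publication_count if publication_count else 0.0
        ),
        "publication_years": (min(years), max(years)) if years else None,
        "available_for_download": len(available),
        "avg_downloads_per_article": (
            downloads / len(available) if available else 0.0
        ),
    }


sample = [
    Work(2005, 10, True, 200),
    Work(2008, 2, False, 0),
    Work(2010, 0, True, 100),
    Work(2012, 5, True, 50, role="editor"),  # listed, but not counted
]
metrics = profile_metrics(sample)
print(metrics)
```

Note that the average downloads figure divides by the number of works available for download, not by the Publication Count, which follows the definitions as written.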