Large-Scale Shill Bidder Detection in E-commerce
DOI: https://doi.org/10.1145/3589462.3589479
IDEAS 2023: International Database Engineered Applications Symposium Conference, Heraklion, Crete, Greece, May 2023
User feedback is one of the most effective methods to build and maintain trust in electronic commerce platforms. Unfortunately, dishonest sellers often bend over backward to manipulate users’ feedback or place phony bids in order to increase their own sales and harm competitors. The black market of user feedback, supported by a plethora of shill bidders, prospers on top of legitimate electronic commerce. In this paper, we investigate the ecosystem of shill bidders based on large-scale data by analyzing hundreds of millions of users who performed billions of transactions, and we propose a machine-learning-based method for identifying communities of users that methodically provide dishonest feedback. Our results show that (1) shill bidders can be identified with high precision based on their transaction and feedback statistics; and (2) in contrast to legitimate buyers and sellers, shill bidders form cliques to support each other.
ACM Reference Format:
Michael Fire, Rami Puzis, Dima Kagan, and Yuval Elovici. 2023. Large-Scale Shill Bidder Detection in E-commerce. In International Database Engineered Applications Symposium Conference (IDEAS 2023), May 05--07, 2023, Heraklion, Crete, Greece. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3589462.3589479
1 INTRODUCTION
Electronic commerce (e-commerce) usage has increased sharply as e-commerce platforms have become interwoven into people's everyday lives as places to buy and sell products. Worldwide e-commerce sales to consumers are expected to pass the eight-trillion-dollar mark in 2026, almost quadrupling compared to 2014 [5, 9]. E-commerce platforms, such as Amazon,1 Alibaba,2 eBay,3 Etsy,4 and OnlineAuction,5 already have hundreds of millions of active users [7, 11, 39]. On many e-commerce platforms, users can sell and buy a wide range of products from each other, from simple everyday items, such as a cup holder costing only a few dollars, to more exotic products, such as a fighter jet costing several million dollars [32]. Furthermore, more than 25 million individuals use e-commerce platforms as their primary or secondary source of income [4].
On many e-commerce platforms, after a user purchases a product, he or she can rate the product and write a review regarding the product's features, such as its quality [33]. Furthermore, many e-commerce platforms employ reputation systems, in which the buyer can leave feedback regarding the seller of the product [37]. The accumulated feedback on each seller, which in many cases is viewable by other users of the website, can assist other users in building trust towards the seller [37]. According to Lucking-Reiley et al. [30], the feedback sellers receive can have a measurable effect on their auction prices, where negative feedback has a much greater effect than positive feedback ratings. Moreover, when sellers receive too much negative feedback from other users, the website operator can decide to revoke their selling privileges. For example, Facebook banned, or reduced the number of ads of, businesses that received too much negative feedback from buyers [3].
As in many other online platforms, such as search engines, online social networks, and online gaming platforms, users can employ dishonest techniques to manipulate a platform's statistics for profit [24]. Moreover, on many online platforms, users can purchase this type of service from third-party providers, which in most cases violates the platforms' terms of service. For example, the market for fake followers and fake retweets on Twitter is already a multimillion-dollar business [34]. The e-commerce equivalent of Twitter's fake followers is fake reviews. The fake review industry is flourishing; there are many paid reviewers and even people who receive free products in exchange for a positive review [2, 42]. On e-commerce platforms, dishonest users (referred to as shill bidders) can bid on products in auctions with the intent to artificially increase the product's price or desirability [19, 41]. Additionally, shill bidders are also users who buy products in order to artificially improve a seller's feedback or a product's search standing [19]. Shill bidding is forbidden on many e-commerce platforms, such as eBay [19] and Flippa [1]. Moreover, shill bidding in online auctions is illegal in some countries, such as the United States, where it can be prosecuted as wire fraud, a felony carrying a maximum penalty of up to four years in prison and a million dollars in fines [10].
In this paper, we study the shill bidder ecosystem. The main contributions of this study are threefold. First, we offer generic algorithms for identifying shill bidders in e-commerce platforms and evaluate them on one of the largest e-commerce datasets in the world (referred to as the e-commerce dataset), which includes several billion buying transactions and several billion feedback interactions between the platform's users. Second, we analyze the activities of 187,224 identified shill bidders to better understand their characteristics. Lastly, we investigate the ecosystem of shill bidders and show how this ecosystem's properties can be utilized to better identify shill bidders. To the best of our knowledge, this study is the largest of its kind to date aimed at identifying shill bidders.
The remainder of the paper is structured as follows: In Section 2, we give a brief overview of previous studies related to e-commerce security and shill bidders, and we introduce several studies that applied similar data mining algorithms to other online platforms, such as Facebook and Twitter. In Section 3, we present the methods and algorithms we developed for identifying shill bidders. In Section 4, we describe the e-commerce dataset, the empirical evaluation process, and the results of the shill bidder identification algorithms on the e-commerce dataset. In Section 5, we present our empirical study of the shill bidders' characteristics, together with our analysis of the shill bidder ecosystem. Lastly, in Section 6, we present our conclusions and offer future research directions.
2 RELATED WORK
In the past decade, alongside the rise in Internet availability, e-commerce has seen constant growth in popularity and volume [23]. However, the extreme popularity and the high degree of anonymity of e-commerce have attracted many fraudsters. According to an Aite report [6], online revenue losses due to fraud were estimated at 443 billion USD in 2021.
Over the years, many research efforts have been devoted to studying e-commerce platforms' security and trust [43]; however, only a tiny portion of this work has centered on detecting shill bidders. Chakraborty et al. [15] defined shill bidding as the illicit participation of sellers in auctions designed to increase the price at which a product sells. There are many types of shill bidding; we present two notable types as examples. The first type is competitive shilling, the most straightforward kind of shill: Kauffman et al. [26] explain that it is used to make legitimate bidders pay more for a product, with the seller or his or her accomplices entering bids to drive up the price. The second type is reserve price shilling, which Kauffman et al. [26] note is used to avoid paying auction house fees such as insertion fees or secret reserve fees. In addition, there are questionable techniques that stand on the thin line between tips and tricks and shill bidding. For example, Roth and Ockenfels [38] describe last-minute bidding (“sniping”), i.e., placing a bid just before the auction ends in order to pay a lower price and snatch the product. Sniping is controversial but is considered legal [29, 40]. Moreover, Roth and Ockenfels [38] note that late bidding may be used by a dishonest seller who attempts to raise the price by using shill bidders.
The classical approach for tackling the shill-bidding problem is to use detection and prediction techniques. In 2005, Chau and Faloutsos [17] proposed a feature-based method for fraudster detection in online auctions. The features were extracted from a dataset of 115 eBay users and fed into a C5.0 decision-tree-based classification algorithm. With their best combination of features, they were able to detect malicious users with a precision of 82%, a True Positive (TP) rate of 83%, and a False Positive (FP) rate of 11%.
In 2007, Pandit et al. [35] developed a model representing users and transactions as a Markov Random Field and used a Belief Propagation mechanism to detect fraudulent users. They detected fraudsters with a precision of 0.9 on a synthetic dataset; however, they were unable to present evaluation results on real-world datasets because they could not label them. In 2008, Beyene et al. [12] presented a model representing eBay transactions and feedback as a graph. In their study, they explained the differences between the eBay graph and a standard social network graph: first, they found that the rich-club phenomenon [44] did not exist in the eBay feedback graph; second, they showed that preferential attachment holds only partially. In addition, they discovered that once a user is sufficiently trusted, the exact number of positive reviews becomes less critical, whereas negative reviews significantly hurt the user's credibility. In 2011, Chang et al. [16] proposed a new two-stage model for early fraud detection in online auctions. The model's first phase constructed behavior models based on users' transaction histories and extracted features representing significant behavioral differences between legitimate users and fraudsters. The second phase was fraud detection, where the data of a suspicious account was fed into the detection model to test whether the seller is legitimate. Using an Instance-Based Learning (IBL) algorithm, they achieved an average recall of over 93%.
In 2016 Majadi et al. [31] analyzed previous studies on shill bidders and found that the most common features used in past literature were: first bidding, last bidding, bid increment, outbid time, bid frequency, affinity to the sellers, and winning ratio.
In 2017, Kaghazgaran et al. [25] analyzed the properties of fake reviews. They found that fake reviewers write longer reviews and tend to write their reviews in bursts.
In 2018, Ganguly et al. [21] proposed an SVM-based method for shill bidder detection. They evaluated their method on a dataset that contained information on 149 auctions and 1024 bidders who bid on PDAs on eBay. In order to label the dataset, they used hierarchical clustering and manually labeled clusters that looked suspicious. Their classifier achieved an AUC of 0.86 using 10-fold cross validation.
Table 1: The transaction and feedback features extracted for each user v.
Name | Description | Formula |
Transaction Features | ||
Buy-Trans-Num(v) | The total number of buying transactions which v performed. With respect to GT’s topology, the Buy-Trans-Num(v) feature is equal to the out-degree of v in the multigraph GT. | ∣{(v, u, p, d) ∈ ET ∣ u ∈ V}∣ |
Sell-Trans-Num(v) | The total number of selling transactions which v performed. With respect to GT’s topology, the Sell-Trans-Num(v) feature is equal to the in-degree of v in the multigraph GT. | ∣{(u, v, p, d) ∈ ET ∣ u ∈ V}∣ |
Unique-Sellers(v) | The distinct number of users which v bought products from. With respect to GT’s topology, the Unique-Sellers(v) feature is equal to the number of vertices in the multigraph GT which are connected to v by at least one out-link. | ∣{u ∈ V ∣ ∃(v, u, p, d) ∈ ET}∣ |
Unique-Buyers(v) | The distinct number of users which v sold products to. With respect to GT’s topology, the Unique-Buyers(v) feature is equal to the number of vertices in the multigraph GT which are connected to v by at least one in-link. | ∣{u ∈ V ∣ ∃(u, v, p, d) ∈ ET}∣ |
Bidir-Trans-Users(v) | The distinct number of users which v sold products to, and also bought products from. | ∣{u ∈ V∣∃(u, v, p, d), (v, u, p, d) ∈ ET}∣ |
Max-Buy-Price(v) | The maximal paid amount in USD which v paid to another user in a single buying transaction. | $max(\lbrace e^a_T \mid \exists e_T:=(v,u,p,d) \in E_T, u \in V\rbrace)$ |
Min-Buy-Price(v) | The minimal paid amount in USD which v paid to another user in a single buying transaction. | $min(\lbrace e^a_T \mid \exists e_T:=(v,u,p,d) \in E_T, u \in V\rbrace)$ |
Max-Buy-Quantity(v) | The maximal number of products which v bought from another user in a single buying transaction. | $max(\lbrace e^q_T \mid \exists e_T:=(v,u,p,d) \in E_T, u \in V\rbrace)$ |
Total-Buy-Quantity(v) | The overall number of products which v bought from other users. | $ \sum _{\lbrace e_T=(v,u,p,d) \in E_T \mid u \in V\rbrace }e^q_T$ |
Total-Buy-Amount(v) | The total amount in USD which v bought in products from other users. | $ \sum _{\lbrace e_T=(v,u,p,d) \in E_T \mid u \in V\rbrace }e^a_T$ |
Max-Sell-Price(v) | The maximal amount in USD which v received from another user in a single selling transaction. | $max(\lbrace e^a_T \mid \exists e_T:=(u,v,p,d) \in E_T, u \in V\rbrace)$ |
Min-Sell-Price(v) | The minimal amount in USD which v received from another user in a single selling transaction. | $ min(\lbrace e^a_T \mid \exists e_T:=(u,v,p,d) \in E_T, u \in V\rbrace)$ |
Max-Sell-Quantity(v) | The maximal number of products which v sold to another user in a single selling transaction. | $max(\lbrace e^q_T \mid \exists e_T:=(u,v,p,d) \in E_T, u \in V\rbrace)$ |
Total-Sell-Quantity(v) | The overall number of products which v sold to other users. | $\sum _{\lbrace e_T=(u,v,p,d) \in E_T \mid u \in V\rbrace }e^q_T$ |
Total-Sell-Amount(v) | The total amount in USD which v sold in products to other users. | $\sum _{\lbrace e_T=(u,v,p,d) \in E_T \mid u \in V\rbrace }e^a_T$ |
Feedback Features | ||
Gvn-Fdbk-Num(v) | The total amount of feedback a user v gave to other users. | ∣{(v, u, r, d) ∈ EF ∣ u ∈ V}∣ |
Rcv-Fdbk-Num(v) | The total amount of feedback a user v received from other users. | ∣{(u, v, r, d) ∈ EF ∣ u ∈ V}∣ |
Gvn-Unique-Fdbk(v) | The number of unique users which received feedback from v. | ∣{u ∈ V∣∃(v, u, r, d) ∈ EF}∣ |
Rcv-Unique-Fdbk(v) | The number of unique users, which gave feedback to v. | ∣{u ∈ V∣∃(u, v, r, d) ∈ EF}∣ |
Bidir-Fdbk-Users(v) | The distinct number of users which v gave feedback to, and also received feedback from. | ∣{u ∈ V∣∃(u, v, r, d), (v, u, r, d) ∈ EF}∣ |
Gvn-Pos-Fdbk(v) | The number of positive feedback ratings a user v gave to other users. | ∣{(v, u, r, d) ∈ EF∣r > 0}∣ |
Gvn-Neg-Fdbk(v) | The number of negative feedback ratings a user v gave to other users. | ∣{(v, u, r, d) ∈ EF∣r < 0}∣ |
Rcv-Pos-Fdbk(v) | The number of positive feedback ratings a user v received from other users. | ∣{(u, v, r, d) ∈ EF∣r > 0}∣ |
Rcv-Neg-Fdbk(v) | The number of negative feedback ratings a user v received from other users. | ∣{(u, v, r, d) ∈ EF∣r < 0}∣ |
Gvn-Fdbk-RSum(v) | The sum of the feedback ratings a user v gave to other users. | $\sum _{\lbrace (v,u,r,d) \in E_F \mid \forall u \in V \rbrace } r$ |
Rcv-Fdbk-RSum(v) | The sum of the feedback ratings a user v received from other users. | $\sum _{\lbrace (u,v,r,d) \in E_F \mid \forall u \in V \rbrace } r$ |
Gvn-Fdbk-Avg(v) | The average of the feedback ratings a user v gave to other users. | $\frac{Gvn-Fdbk-RSum(v)}{Gvn-Fdbk-Num(v)}$ |
Rcv-Fdbk-Avg(v) | The average of the feedback ratings a user v received from other users. | $\frac{Rcv-Fdbk-RSum(v)}{Rcv-Fdbk-Num(v)}$ |
3 IDENTIFYING SHILL BIDDERS USING SUPERVISED LEARNING
In this study, we focus on developing generic classifiers that can identify shill bidders in various e-commerce platforms. To cope with the challenge of identifying shill bidders in e-commerce platforms, we follow the standard methodology of using supervised learning algorithms to predict the likelihood of a user being a shill bidder. User features employed in this study to identify shill bidders are common to many e-commerce platforms.
3.1 Feature Extraction
In order to construct classifiers for identifying shill bidders, we define features to be extracted from the transaction and feedback data of each e-commerce user. Next, we describe in detail each one of the features we extract for every e-commerce user.
Let v be a user in an e-commerce platform. For each user v, we can extract features based on v’s buying and selling transactions, such as v’s number of buying transactions, and based on v’s given and received feedback, such as the amount of feedback v gave. In this section, we describe in detail all the features we extracted and used during this study's experiments. We open this section with the formal definitions of the extracted features based on the users’ buying and selling transactions. Afterward, we introduce the formal definitions of all the extracted features based on the users’ feedback activities. Next, we present several personal user features, such as the country the user has declared to live in. Lastly, we present the target class feature.
Transaction Features. Let GT = < V, ET > be the directed multigraph that represents the buying and selling transactions between two users in the e-commerce platform, where V is the multigraph vertices set, which contains all the e-commerce users, and ET is the multigraph's links set, which contains data on all the transaction interactions between e-commerce users. The links in the transactions multigraph are denoted by eT ≔ (u, v, p, d) ∈ ET, where u, v ∈ V are two e-commerce users, p is the product which was purchased, and d is the purchase time and date. Each link eT represents a buying transaction of a single product p, which a user u bought from a user v in time and date d. For each link eT ∈ ET, we also define the following three additional properties:
- $e^q_T$ the transaction's product quantity. Namely, the purchased quantity of the product in the transaction.
- $e^p_T$ the purchased product price in US dollars.
- $e^a_T := e^q_T \cdot e^p_T$ - the transaction total amount in US dollars.
Using these definitions, we define the transaction features for each v ∈ V (see Table 1).
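As an illustration only (the paper does not publish its extraction code), a few of the transaction features can be computed from a flat list of links e_T; the tuple layout and function name below are assumptions:

```python
def transaction_features(edges, v):
    """Sketch of a few Table 1 transaction features for user v.

    Each edge is (buyer, seller, product, date, quantity, unit_price),
    mirroring a link e_T = (u, v, p, d) with its properties e^q_T and e^p_T.
    """
    buys = [e for e in edges if e[0] == v]    # out-links: v is the buyer
    sells = [e for e in edges if e[1] == v]   # in-links: v is the seller
    buy_amounts = [q * p for (_, _, _, _, q, p) in buys]  # e^a_T = e^q_T * e^p_T
    return {
        "Buy-Trans-Num": len(buys),
        "Sell-Trans-Num": len(sells),
        "Unique-Sellers": len({s for (_, s, _, _, _, _) in buys}),
        "Unique-Buyers": len({b for (b, _, _, _, _, _) in sells}),
        "Max-Buy-Price": max(buy_amounts) if buy_amounts else 0.0,
        "Total-Buy-Amount": sum(buy_amounts),
    }
```

Note how Buy-Trans-Num counts v's out-links and Sell-Trans-Num its in-links, matching the degree-based interpretation given above.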
Feedback Features. Similar to the transactions-directed multigraph, we can also define the feedback-directed multigraph, which is based on the e-commerce users’ feedback. Formally, let GF = < V, EF > be the directed multigraph that represents the feedback activities between two users in the e-commerce platforms, where V is the multigraph vertices set, which contains all the e-commerce users, and EF is the multigraph's links set, which contains data on all the feedback interactions between e-commerce users. The links in the feedback multigraph are denoted by eF ≔ (u, v, r, d) ∈ EF, where u, v ∈ V are two e-commerce users, $ r \in \mathbb {Z}$ is the feedback rating, and d is the feedback's time and date. Each link eF represents a feedback interaction with a rating of r, which a user u gave a user v in time and date d.6 Using these feedback-directed multigraph definitions, we define features for each user v ∈ V (see Table 1).
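Analogously, a hedged sketch of a few of the feedback features, again assuming a simple (giver, receiver, rating, date) tuple layout for the links e_F:

```python
def feedback_features(feedback, v):
    """Sketch of a few Table 1 feedback features for user v.

    Each element of `feedback` is (giver, receiver, rating, date),
    mirroring a link e_F = (u, v, r, d); ratings are integers.
    """
    given = [e for e in feedback if e[0] == v]       # feedback v gave
    received = [e for e in feedback if e[1] == v]    # feedback v received
    rcv_sum = sum(r for (_, _, r, _) in received)
    return {
        "Gvn-Fdbk-Num": len(given),
        "Rcv-Fdbk-Num": len(received),
        "Rcv-Pos-Fdbk": sum(1 for (_, _, r, _) in received if r > 0),
        "Rcv-Neg-Fdbk": sum(1 for (_, _, r, _) in received if r < 0),
        "Rcv-Fdbk-RSum": rcv_sum,
        "Rcv-Fdbk-Avg": rcv_sum / len(received) if received else 0.0,
    }
```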
Users Details Features. To construct our supervised learning classifiers, we also utilized the following users’ details, which can be mainly extracted from the user's registration form, which exists in many e-commerce platforms:
- Birth-Year(v) - the declared birth year of v.
- State(v) - the declared state of v. For consistency, we converted the State(v) feature to an integer using the CRC32 hash function [13].
- Active-Days(v) - the number of days v was active in the e-commerce platform. In this study, we calculate this feature by calculating the number of days passed between the date the user created a profile in the e-commerce platform and the last date the user performed a selling or buying transaction.
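The State(v) conversion and the Active-Days(v) computation can be sketched with Python's standard library; zlib's CRC32 stands in here for the CRC32 variant of [13], which may differ in its parameters:

```python
import zlib
from datetime import date

def user_detail_features(birth_year, state, created, last_trans):
    """Sketch of the user detail features: State is hashed to an integer
    via CRC32, and Active-Days is the number of days between registration
    and the user's last buying or selling transaction."""
    return {
        "Birth-Year": birth_year,
        "State": zlib.crc32(state.encode("utf-8")),  # deterministic integer
        "Active-Days": (last_trans - created).days,
    }
```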
Target class. Every instance in the training set includes a binary target attribute indicating whether the user was identified as a shill bidder. In this study, we assumed that a list of identified shill bidders (referred to as shill bidders list) is provided. These shill bidders are identified by the e-commerce platform experts and are used to train the supervised learning algorithms. A part of this list is also used as a ground truth during the evaluation of the algorithms, as described below.
Although unidentified shill bidders do not appear in the shill bidders list, as we see next, the fraction of such users is expected to be extremely low. Therefore, for training supervised learning algorithms, we assume that a randomly picked user is a benign user if it does not appear in the shill bidders list.
3.2 Selecting Users for the Training Set
In this study, we assumed that the actual ratio between shill bidders and benign users in an e-commerce platform is unknown, yet there are indications that benign users greatly outnumber shill bidders. For example, in other online platforms, such as online social networks, there are clear indications that benign entities far outnumber malicious ones: the official Facebook estimate is that approximately 5% of Facebook users are malicious [8], and according to Rahman et al. [36], about 13% of applications on Facebook are malicious. To construct our classifiers, we chose to create a balanced training set with an equal number of benign users and shill bidders, similar to the methodology used by Leskovec et al. [27] to predict positive and negative links, and by Fire et al. [20] to identify fake users on Facebook.
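A minimal sketch of this balanced-sampling step, under the Section 3.1 assumption that any user absent from the shill bidders list is benign; the function name and identifier types are illustrative:

```python
import random

def balanced_training_ids(shill_ids, all_user_ids, seed=42):
    """Pair every listed shill bidder (label 1) with one randomly drawn
    user not on the shill bidders list (assumed benign, label 0)."""
    rng = random.Random(seed)
    shills = set(shill_ids)
    benign_pool = [u for u in all_user_ids if u not in shills]
    benign = rng.sample(benign_pool, len(shills))
    return [(u, 1) for u in shills] + [(u, 0) for u in benign]
```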
3.3 Choosing a Supervised Learning Algorithm
To identify the supervised learning algorithm which yields the best classification results on our datasets, we trained the classifiers on all the users in the provided shill bidders list and an equal number of benign users. We then extracted for each user all the 31 features (see Table 1), which were described in Section 3.1. Afterward, we used the constructed balanced training set and fed it to Weka [22], a popular suite of machine learning software. We used Weka's OneR, C4.5 (J48) decision tree, K-Nearest-Neighbors (IBk; with K=3), Naive-Bayes, Random-Forest, LogitBoost, Rotation-Forest, and Bagging implementations of the corresponding algorithms. For each of these algorithms, all of the configurable parameters were set to their default values.
We evaluated each classifier using the 10-fold cross validation method and calculated the True-Positive (TP) rate, False-Positive (FP) rate, F-Measure (FM) value, and the Area-Under-Curve (AUC) measure. These metrics assisted us in selecting the best supervised learning algorithm for identifying shill bidders.
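The paper's experiments ran in Weka; the following hedged scikit-learn analogue illustrates the same 10-fold cross-validated AUC protocol on synthetic stand-in data, with Random-Forest representing the ensemble algorithms above (Rotation-Forest has no stock scikit-learn implementation):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the balanced feature matrix (31 features per user).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 31))
y = np.repeat([0, 1], 200)
X[y == 1] += 0.8  # give the synthetic "shill" class a detectable shift

clf = RandomForestClassifier(n_estimators=100, random_state=0)
auc_scores = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
print(f"mean 10-fold AUC: {auc_scores.mean():.3f}")
```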
3.4 Supervised Learning Algorithm Evaluation
To evaluate the precision of the selected supervised learning algorithm (the one that received the highest AUC) “in the wild,” i.e., on a realistically imbalanced e-commerce dataset, we performed the following steps:7
- We created a balanced training set by randomly selecting 90% of the shill bidder users in the shill bidders list and randomly selecting an equal number of benign users.
- The remaining 10% of the users in the shill bidders list, which were excluded in step 1, were utilized to construct several test sets having various imbalance rates. We created five test sets with 2, 5, 10, 20, and 100 benign users per single shill bidder.
- For each user in the constructed training and testing sets, we extracted all the features, which were described in Section 3.1.
- Using the balanced training set and the selected supervised learning algorithm, we constructed a shill bidder identification classifier.
- For each user in each of the five imbalanced testing sets, we used the constructed balanced classifier to predict the user's likelihood of being a shill bidder.
- For each imbalanced testing set, we calculated the classifier precision at the top k (precision@k). Namely, for each one of the five imbalanced datasets and for an integer k ∈ [1, n], we calculated the percent of the top k users, which received the highest likelihood of being shill bidders and were actually shill bidders.
- Lastly, to reduce variability, we repeated steps 1 to 6 three times and averaged the obtained precision at k results for each k.
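Steps 5 and 6 above reduce to ranking users by predicted likelihood and measuring precision@k, which can be sketched as:

```python
def precision_at_k(scores, labels, k):
    """precision@k: the fraction of the k highest-scoring users that are
    actually shill bidders (label 1)."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    top_k = [label for _, label in ranked[:k]]
    return sum(top_k) / k
```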
4 EMPIRICAL EVALUATION
4.1 The E-commerce Dataset
In this study, we used anonymized datasets provided by one of the largest e-commerce companies in the world to evaluate our shill-bidder detection algorithms. To query the provided datasets, we used a Hadoop infrastructure consisting of several thousand Hadoop nodes. The e-commerce dataset included information on several billion buying and selling transactions and several billion feedback interactions. Each feedback interaction included a rating with one of three possible values: negative (-1), neutral (0), or positive (+1). All of the transactions in the e-commerce dataset were actual transactions performed by several hundred million platform users through the end of 2012. Furthermore, we were provided with a list of 187,224 users who were marked as shill bidders by the company's proprietary algorithms. To select a benign users list for our training and testing sets throughout this study, we randomly selected from the e-commerce dataset a list of 500,000 seller-users who performed at least one selling transaction. We then removed from the benign users list all the users who also appeared in the shill bidders list.
4.2 Experiment Setup
We evaluated various supervised learning algorithms to construct classifiers that can identify which users are shill bidders. Using the provided 187,224 shill bidders and an additional 187,224 benign users from the e-commerce dataset, we constructed a balanced training set and evaluated various supervised learning algorithms (see Section 3.3). Furthermore, using the e-commerce dataset, we constructed five imbalanced testing sets with the following ratios: (1) 1 to 2 - with 18,722 shill bidders and 37,444 benign users; (2) 1 to 5 - with 18,722 shill bidders and 93,610 benign users; (3) 1 to 10 - with 18,722 shill bidders and 187,220 benign users; (4) 1 to 20 - with 15,000 shill bidders and 300,000 benign users; and (5) 1 to 100 - with 3,200 shill bidders and 320,000 benign users.
We used these imbalanced datasets to evaluate the constructed classifiers’ precision at k for k ∈ [1, 30, 1000], using the methods described in Section 3.4.
Table 2: Evaluation results of the supervised learning algorithms using 10-fold cross validation.
Classifier | TP | FP | FM | AUC |
OneR | 0.800 | 0.252 | 0.780 | 0.774 |
Naïve-Bayes | 0.886 | 0.764 | 0.645 | 0.752 |
Decision-Tree(J48) | 0.822 | 0.193 | 0.816 | 0.860 |
Random-Forest | 0.854 | 0.230 | 0.820 | 0.885 |
Bagging | 0.834 | 0.179 | 0.829 | 0.902 |
LogitBoost | 0.811 | 0.170 | 0.819 | 0.901 |
Rotation-Forest | 0.845 | 0.173 | 0.838 | 0.912 |
4.3 Results
In this section, we present the results obtained by the method described in Section 3.
According to our evaluation results, among all the tested supervised learning algorithms, the Rotation-Forest classifier performed best on the e-commerce dataset, with an especially good AUC of 0.912 and a TP rate of 0.845 (see Table 2). Therefore, we chose to construct the shill bidder identification classifiers using the Rotation-Forest algorithm and to evaluate the constructed classifiers’ precision at k on the five imbalanced testing sets defined in Section 3.4. The classifiers’ average precision at k values for all five testing sets are presented in Figure 1. According to the evaluation results, the developed shill bidder identification classifiers achieved high precision at k, with precision at 1,000 of 1, 1, 0.999, 0.925, and 0.358 when the ratio between the number of shill bidders and the number of benign users was 1 to 2, 1 to 5, 1 to 10, 1 to 20, and 1 to 100, respectively (see Figure 1). These results indicate that the presented shill bidder identification algorithms can detect shill bidders far better than a random algorithm. Moreover, the presented algorithm yielded very high precision rates when the percentage of shill bidders in the e-commerce dataset was at least 5%.
Table 3: The features’ Information Gain scores.
Feature | Info Gain Score |
Min-Sell-Price | 0.268 |
Sell-Trans-Num | 0.248 |
Total-Sell-Quantity | 0.247 |
Unique-Buyers | 0.240 |
State | 0.240 |
Total-Sell-Amount | 0.196 |
Rcv-Unique-Fdbk | 0.188 |
Rcv-Fdbk-Num | 0.187 |
Min-Buy-Price | 0.180 |
Rcv-Fdbk-RSum | 0.162 |
Rcv-Pos-Fdbk | 0.162 |
Gvn-Unique-Fdbk | 0.154 |
Gvn-Fdbk-Num | 0.153 |
Gvn-Pos-Fdbk | 0.153 |
Gvn-Fdbk-RSum | 0.153 |
Fdbk-Bi-Degree | 0.150 |
Total-Buy-Quantity | 0.142 |
Buy-Trans-Num | 0.143 |
Unique-Sellers | 0.128 |
Max-Sell-Price | 0.116 |
Total-Buy-Amount | 0.120 |
Rcv-Fdbk-Avg | 0.112 |
Gvn-Fdbk-Avg | 0.089 |
Max-Buy-Price | 0.076 |
Active-Days | 0.079 |
Max-Buy-Quantity | 0.070 |
Gvn-Neg-Fdbk | 0.059 |
Rcv-Neg-Fdbk | 0.047 |
Max-Sell-Quantity | 0.016 |
Birth-Year | 0.006 |
Trans-Bi-Degree | 0.002 |
5 THE SHILL BIDDER ECOSYSTEM
In this study, we utilize the provided shill bidders list to empirically analyze the characteristics of shill bidders and study the shill bidder ecosystem, i.e., the interactions of shill bidders with each other.
To analyze the shill bidders’ characteristics using the e-commerce dataset, we calculated the average and median values of all the numeric features defined in Section 3.1 for the users in the provided shill bidders list. Additionally, to better understand which features assist the supervised learning algorithms in distinguishing between shill bidders and benign users, we used the balanced training set (see Section 4.2) to calculate the feature importance using Weka's Information Gain feature selection algorithm. The results of the shill bidders’ characteristics analysis are presented in Tables 3 and 4.
Table 4: The ratio between the shill bidders’ and the random seller-users’ feature values.
Feature | Average Ratio | Median Ratio |
Buy-Trans-Num | 2.689 | 4.421 |
Sell-Trans-Num | 3.341 | 16.75 |
Unique-Sellers | 2.642 | 4.094 |
Unique-Buyers | 3.506 | 13.625 |
Bidir-Trans-Users | 1.412 | 2 |
Max-Buy-Price | 1.663 | 2.168 |
Min-Buy-Price | 0.035 | 0.966 |
Total-Buy-Quantity | 2.683 | 4.525 |
Total-Buy-Amount | 2.441 | 4.295 |
Min-Sell-Price | 0.063 | 0.227 |
Max-Sell-Quantity | 1.373 | 1 |
Total-Sell-Quantity | 3.36 | 16.875 |
Total-Sell-Amount | 3.583 | 9.874 |
Gvn-Fdbk-Num | 2.76 | 6.286 |
Rcv-Fdbk-Num | 2.964 | 6.463 |
Gvn-Unique-Fdbk | 2.921 | 6.054 |
Rcv-Unique-Fdbk | 3.112 | 6.222 |
Bidir-Fdbk-Users | 2.989 | 6.091 |
Gvn-Pos-Fdbk | 2.76 | 6.366 |
Gvn-Neg-Fdbk | 2.748 | inf |
Rcv-Pos-Fdbk | 2.786 | 5.702 |
Rcv-Neg-Fdbk | 2.04 | inf |
Gvn-Fdbk-RSum | 2.76 | 6.317 |
Rcv-Fdbk-RSum | 2.789 | 5.66 |
Active-Days | 1.387 | 1.545 |
From the shill bidder characteristics analysis results presented in Table 4, it can be observed that shill bidders behaved differently from the random seller-users in the following ways: First, on average, shill bidders are active for more days and perform far more selling and buying transactions than random sellers. These results may indicate that shill bidders are, in general, active users who perform many buying and selling transactions. Second, on average, shill bidders sell more products to unique buyers and buy more products from unique sellers than random seller-users. However, the shill bidders’ minimum buying and selling prices were, on average, much lower than those of random seller-users. We believe these results stem from the shill bidders’ attempts to maximize their profits and minimize their losses when buying or selling feedback. Third, on average, shill bidders received more negative feedback than random seller-users. We assume this indicates that, in the end, many shill bidders utilize their accumulated positive feedback to mislead other users on the platform, who in return give the shill bidders negative feedback. Lastly, on average, shill bidders gave more negative feedback than random seller-users. We believe this may indicate that shill bidders are also being utilized to perform Sybil attacks [28]. We hope to verify these assumptions in a future study. Additionally, we discovered that the shill bidders had 798 unique State feature values, while the randomly selected seller-users had 1,435 unique State feature values. Furthermore, the most common State feature value among the shill bidders was the “default” value, which appeared in the details of 134,979 shill bidders, whereas it appeared in the details of only 46,805 randomly selected seller-users.
From the features’ Information Gain scores presented in Table 3, it can be observed that the Min-Sell-Price, Sell-Trans-Num, and Total-Sell-Quantity features received the highest Information Gain scores. We believe these features scored highest due to the shill bidders’ behavioral patterns: in many cases, shill bidders attempted to reduce their losses by selling cheap products across many transactions, collecting a great deal of positive feedback while spending as little money as possible.
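The Information Gain ranking above can be illustrated with a short sketch. The labels and discretized feature values below are hypothetical, and the score is computed directly from its entropy-based definition rather than with the exact toolchain used in the paper.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """IG(feature) = H(labels) - sum_v P(feature = v) * H(labels | feature = v)."""
    conditional = 0.0
    for v in np.unique(feature):
        mask = feature == v
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional

# Hypothetical users: 1 = shill bidder, 0 = random seller-user.
y = np.array([1, 1, 1, 0, 0, 0])
# A perfectly informative binary feature (e.g. "sells very cheap products").
cheap_seller = np.array([1, 1, 1, 0, 0, 0])
# A mostly uninformative binary feature.
active = np.array([1, 0, 1, 0, 1, 0])

ig_cheap = information_gain(cheap_seller, y)   # 1.0 bit
ig_active = information_gain(active, y)        # close to 0
```

A feature that separates the classes cleanly, such as a low minimum sell price in the paper's data, receives a high score; a feature shared by both classes receives a score near zero.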
To study the shill bidder ecosystem, we analyzed the feedback graph created by the shill bidders. The shill bidder feedback graph can assist us in understanding the “big picture” behind the shill bidders and their interactions, and even in understanding the shill bidders’ working methods. We defined the feedback graph as follows: given a list $\hat{V} \subseteq V$ of e-commerce users, we define $\hat{V}$’s feedback graph to be the weighted directed graph $H_{\hat{V}} := \langle \hat{V}, E_{\hat{V}} \rangle$, where a directed link $(u,v) \in E_{\hat{V}}$ exists if user $u \in \hat{V}$ gave feedback to user $v \in \hat{V}$, and its weight is defined to be the amount of feedback $u$ gave $v$.
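As a minimal sketch of this definition, the feedback graph's link weights can be obtained by counting feedback occurrences per ordered pair of users; the feedback log below is hypothetical.

```python
from collections import Counter

# Hypothetical feedback log: each (giver, receiver) pair is one feedback
# occurrence; repeated pairs accumulate into the link weight w(u, v).
feedback_log = [("u1", "u2"), ("u1", "u2"), ("u2", "u1"), ("u3", "u1")]

# Directed link weights of the feedback graph H_V: one entry per link.
edge_weights = Counter(feedback_log)   # {("u1","u2"): 2, ("u2","u1"): 1, ("u3","u1"): 1}
```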
Additionally, we used the igraph implementation of the Bron-Kerbosch algorithm [14] to find the maximal cliques in the graph8 and calculate their size distribution. By identifying cliques, we can identify groups of shill bidders who worked together. Lastly, we randomly sampled an equal number of seller-users and compared the properties of the feedback graph created by the randomly sampled seller-users with those of the feedback graph created by the shill bidders.
From the ecosystem analysis results, it can be observed that the shill bidder feedback graph is a relatively dense graph that spans 1,805,199 directed links between 156,769 shill bidders, with an average of 1.31 feedback occurrences per link (see Table 5). Additionally, 79.09% of the shill bidders are located in a single component. Furthermore, according to the maximal clique detection results, in contrast to the random user feedback graph, the shill bidder feedback graph contains many cliques (see Table 6). These results may indicate that many shill bidders work together and assist each other in receiving positive feedback. An alternative explanation is that shill bidders open several accounts and use them to boost their own reputation or to sell feedback to other users.
Table 5: Properties of the shill bidder and random user feedback graphs.

Property | Shill Bidders Feedback Graph | Random Users Feedback Graph
Number of Users | 187,224 | 187,224
Number of Feedback Between Users | 2,391,312 | 91,863
Number of Positive Feedback Between Users | 2,373,993 | 91,097
Number of Negative Feedback Between Users | 8,786 | 341
Number of Non-Isolated Users | 156,769 | 35,599
Number of Links | 1,805,199 | 67,383
Average Link Weight | 1.31 | 1.35
Max Link Weight | 914 | 219
Min Link Weight | -11 | -8
Number of Bidirectional Links | 1,675,411 | 59,692
Density | 7.35 · 10^-5 | 5.32 · 10^-5
Component Number | 8,123 | 9,309
Largest Component Size | 148,072 (79.09%) | 21,443 (11.45%)
Maximal Clique Size | 7 | 3
Maximal Clique Number | 895,844 | 37,080
Table 6: Maximal clique size distribution in the feedback graphs.

Clique Size | Random Feedback Graph | Shill Feedback Graph
3 | 260 | 66,892
4 | 0 | 3,945
5 | 0 | 803
6 | 0 | 173
7 | 0 | 23
6 CONCLUSIONS
According to the presented method, various user features are first extracted from user transactions (buying and selling) and feedback activities (giving and receiving). Second, supervised learning algorithms are utilized to train a model for classifying users into shill bidders and legitimate accounts. We evaluated the algorithms using a real, large-scale, anonymized e-commerce dataset. This dataset includes several billion buying and selling occurrences performed by several hundred million users, as well as several billion feedback occurrences they gave or received through the end of 2012. The dataset also includes a list of 187,224 users who were marked as shill bidders in the e-commerce platform's systems and were used as ground truth for training and testing our algorithms. Evaluation results of the presented method showed an area under the ROC curve (AUC) of up to 0.912 and a precision at 1,000 of 0.999 when the ratio between the shill bidders and the benign users was 1 to 10 (see Section 4.3).
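The evaluation metrics described above can be sketched with scikit-learn on synthetic data. The classifier choice, feature matrix, and labels below are illustrative stand-ins rather than the paper's actual pipeline, and the model is scored on its own training data purely to demonstrate how AUC and precision at k are computed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 11_000                      # 1:10 shill-to-benign ratio, as in the evaluation
y = np.zeros(n, dtype=int)
y[:1_000] = 1                   # 1 = shill bidder
X = rng.normal(size=(n, 8)) + y[:, None]   # synthetic, fairly separable features

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
scores = clf.predict_proba(X)[:, 1]

auc = roc_auc_score(y, scores)

# Precision at k: the fraction of true shill bidders among the k users
# with the highest classifier scores.
k = 1_000
top_k = np.argsort(scores)[::-1][:k]
precision_at_k = y[top_k].mean()
```

Precision at k matters here because an operator can only manually review a limited number of flagged accounts; it measures how trustworthy the top of the ranked list is.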
By analyzing the e-commerce dataset, we also empirically studied the characteristics of shill bidders and the shill bidder ecosystem. As a result of this analysis, we discovered that, on average, shill bidders were more active, performed more selling and buying transactions, and gave and received more feedback than randomly selected seller-users (see Table 4). Moreover, we discovered that shill bidders gave more negative feedback than randomly selected seller-users. These results may indicate that shill bidders can also be used to perform sybil attacks [28]. Such sybil attacks can target specific platform users, such as the shill bidders’ competitors, and aim to damage the targeted users’ reputations by giving them unjust negative feedback. Additionally, we discovered that the shill bidder feedback graph, which was constructed from all the feedback links between every two shill bidders (see Section 5), is a relatively dense graph with 1,805,199 links among 156,769 shill bidders (see Table 5). Furthermore, in the shill bidder feedback graph we identified 66,892 maximal cliques of 3 shill bidders and 23 cliques of 7 shill bidders (see Table 6). These results indicate that many shill users collaborate to increase their overall reputations. Moreover, these results may indicate the existence of e-commerce bots that perform automatic buying and selling transactions and submit feedback to each other and to other users.
We believe that these observations regarding the shill bidder ecosystem can assist in improving the detection of shill bidders in the following ways. First, we can utilize the shill bidder graph structure results to improve our shill bidder identification classifiers by extracting additional graph-structure-based features, such as the number of cliques a user is a member of and the number of shill bidders the user is connected to. Second, we can use the fact that many shill bidders are connected to each other (see Table 5) and apply our shill bidder identification classifier to users who are connected to several shill bidders, instead of applying the classifier to random sets of users chosen out of several hundred million e-commerce platform users. We believe that focusing the shill bidder identification classifier on these connected users can identify shill bidders with even higher precision. Lastly, we can utilize the results indicating that shill bidders tend to form relatively large cliques (see Table 6) and use various clique identification algorithms to identify large cliques in the feedback graph created from all the e-commerce users’ feedback interactions. We believe that large cliques in this feedback graph have a high likelihood of containing shill bidders. We hope to verify these three assumptions in our future research.
The study presented here offers many additional future research directions. One possible direction is to use Natural Language Processing algorithms to analyze the users’ content data, such as feedback comments and product information pages. Another is to utilize various clustering algorithms to identify shill bidders. A third is to construct classifiers that utilize features similar to those presented in Section 3.1 to identify other types of malicious users, such as fraudsters who sell fictitious products.
REFERENCES
- 2012. What is Shill Bidding? https://support.flippa.com/hc/en-us/articles/202469674-What-is-Shill-Bidding-. (Accessed on 10/01/2023).
- 2016. I Get Paid To Write Fake Reviews For Amazon. http://www.cracked.com/personal-experiences-2376-i-get-paid-to-write-fake-reviews-amazon.html. (Accessed on 10/06/2022).
- 2018. Facebook Will Ban Sellers of Shoddy Products. https://www.wsj.com/articles/facebook-will-ban-sellers-of-shoddy-products-1528794000. (Accessed on 10/06/2022).
- 2018. There Are 168 Million Active Buyers on eBay Right Now (INFOGRAPHIC) - Small Business Trends. https://smallbiztrends.com/2018/03/ebay-statistics-march-2018.html. (Accessed on 10/07/2022).
- 2019. Global Ecommerce 2019 - eMarketer Trends, Forecasts & Statistics. https://www.emarketer.com/content/global-ecommerce-2019. (Accessed on 03/03/2022).
- 2020. The True Cost of E-Commerce Fraud. https://blog.clear.sale/the-true-cost-of-e-commerce-fraud. (Accessed on 05/01/2023).
- 2021. Alibaba Group Announces March Quarter and Full Fiscal Year 2021 Results. https://www.sec.gov/Archives/edgar/data/1577552/000110465921065916/tm2116252d1_ex99-1.htm. (Accessed on 10/01/2023).
- 2021. Facebook - Financials - SEC Filings Details. https://investor.fb.com/financials/sec-filings-details/default.aspx?FilingId=15030787. (Accessed on 10/01/2023).
- 2022. Retail e-commerce sales worldwide from 2014 to 2026. https://www.statista.com/statistics/379046/worldwide-retail-e-commerce-sales/. (Accessed on 10/06/2022).
- 2022. Shill Bidding. https://www.nyccriminallawyer.com/fraud-charge/auction-fraud/shill-bidding/. (Accessed on 10/01/2023).
- Amazon. [n. d.]. Amazon Global Selling, Sell & Ship Products Internationally - Amazon. https://sell.amazon.com/global-selling.html. (Accessed on 10/01/2023).
- Yordanos Beyene, Michalis Faloutsos, Duen Horng Chau, and Christos Faloutsos. 2008. The eBay Graph: How do online auction users interact?. In INFOCOM Workshops 2008, IEEE. IEEE, 1–6.
- Richard Black. 1994. Fast CRC32 in software. ATM Document Collection 3 (1994).
- Coen Bron and Joep Kerbosch. 1973. Algorithm 457: finding all cliques of an undirected graph. Commun. ACM 16, 9 (1973), 575–577.
- Indranil Chakraborty and Georgia Kosmopoulou. 2004. Auctions with shill bidding. Economic Theory 24, 2 (2004), 271–287.
- Wen-Hsi Chang and Jau-Shien Chang. 2011. A novel two-stage phased modeling framework for early fraud detection in online auctions. Expert Systems with Applications 38, 9 (2011), 11244 – 11260. https://doi.org/10.1016/j.eswa.2011.02.172
- Duen Horng Chau and Christos Faloutsos. 2005. Fraud detection in electronic auction. In European Web Mining Forum at ECML/PKDD. 87–97.
- Gabor Csardi and Tamas Nepusz. 2006. The igraph software package for complex network research. InterJournal, Complex Systems 1695, 5 (2006).
- eBay. 2022. Shill bidding policy. http://pages.ebay.com/help/policies/seller-shill-bidding.html. (Accessed on 10/06/2022).
- Michael Fire, Dima Kagan, Aviad Elyashar, and Yuval Elovici. 2014. Friend or foe? Fake profile identification in online social networks. Social Network Analysis and Mining 4, 1 (2014), 194.
- Swati Ganguly and Samira Sadaoui. 2018. Online Detection of Shill Bidding Fraud Based on Machine Learning Techniques. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Springer, 303–314.
- M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten. 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter 11, 1 (2009), 10–18.
- Chuck Jones. 2013. Ecommerce Is Growing Nicely While Mcommerce Is On A Tear. Forbes (October 2013). http://www.forbes.com/sites/chuckjones/2013/10/02/ecommerce-is-growing-nicely-while-mcommerce-is-on-a-tear/ [Online; accessed 10/06/2022].
- Patric Kabus, Wesley W. Terpstra, Mariano Cilia, and Alejandro P. Buchmann. 2005. Addressing Cheating in Distributed MMOGs. In Proceedings of 4th ACM SIGCOMM Workshop on Network and System Support for Games (Hawthorne, NY) (NetGames ’05). ACM, New York, NY, USA, 1–6. https://doi.org/10.1145/1103599.1103607
- Parisa Kaghazgaran, James Caverlee, and Majid Alfifi. 2017. Behavioral Analysis of Review Fraud: Linking Malicious Crowdsourcing to Amazon and Beyond.. In ICWSM. 560–563.
- Robert J Kauffman and Charles A Wood. 2003. Running up the bid: detecting, predicting, and preventing reserve price shilling in online auctions. In Proceedings of the 5th international conference on Electronic commerce. ACM, 259–265.
- Jure Leskovec, Daniel Huttenlocher, and Jon Kleinberg. 2010. Predicting positive and negative links in online social networks. In Proceedings of the 19th international conference on World wide web. ACM, 641–650.
- B.N. Levine, C. Shields, and N.B. Margolin. 2006. A survey of solutions to the sybil attack. University of Massachusetts Amherst, Amherst, MA (2006).
- Joshua Lockhart. [n. d.]. http://www.makeuseof.com/tag/can-you-really-win-almost-any-ebay-auction-by-sniping/. [Online; accessed 10/06/2022].
- David Lucking-Reiley, Doug Bryan, Naghi Prasad, and Daniel Reeves. 2007. Pennies from Ebay: The Determinants of Price in Online Auctions. The Journal of Industrial Economics 55, 2 (2007), 223–233.
- Nazia Majadi, Jarrod Trevathan, and Neil Bergmann. 2016. Analysis on Bidding Behaviours for Detecting Shill Bidders in Online Auctions. In Computer and Information Technology (CIT), 2016 IEEE International Conference on. IEEE, 383–390.
- Jason Mccormick. 2012. 10 most expensive items ever listed on eBay - CBS News. https://www.cbsnews.com/media/10-most-expensive-items-ever-listed-on-ebay. (Accessed on 10/06/2022).
- Susan M Mudambi and David Schuff. 2010. What makes a helpful online review? A study of customer reviews on Amazon. com. MIS quarterly 34, 1 (2010), 185–200.
- Richard Harris Nicholas Confessore, Gabriel J.X. Dance and Mark Hansen. 2018. The Follower Factory - The New York Times. https://www.nytimes.com/interactive/2018/01/27/technology/social-media-bots.html. (Accessed on 10/06/2022).
- Shashank Pandit, Duen Horng Chau, Samuel Wang, and Christos Faloutsos. 2007. Netprobe: a fast and scalable system for fraud detection in online auction networks. In Proceedings of the 16th international conference on World Wide Web. ACM, 201–210.
- Md Sazzadur Rahman, Ting-Kai Huang, Harsha V Madhyastha, and Michalis Faloutsos. 2012. FRAppE: detecting malicious facebook applications. In Proceedings of the 8th international conference on Emerging networking experiments and technologies. ACM, 313–324.
- Paul Resnick, Ko Kuwabara, Richard Zeckhauser, and Eric Friedman. 2000. Reputation systems. Commun. ACM 43, 12 (2000), 45–48.
- Alvin E Roth and Axel Ockenfels. 2000. Last minute bidding and the rules for ending second-price auctions: Theory and evidence from a natural experiment on the Internet. Technical Report. National bureau of economic research.
- Statista. [n. d.]. Number of eBay's total active buyers from 1st quarter 2010 to 2nd quarter 2022. https://www.statista.com/statistics/242235/number-of-ebays-total-active-users/. (Accessed on 10/01/2023).
- Techopedia.com. [n. d.]. http://www.techopedia.com/definition/27959/auction-sniping/. [Online; accessed 10/06/2022].
- Jarrod Trevathan and Wayne Read. 2009. Detecting shill bidding in online English auctions. Handbook of research on social and organizational liabilities in information security (2009), 446–470.
- Emma Woollacott. 2017. Amazon's Fake Review Problem Is Now Worse Than Ever, Study Suggests. https://www.forbes.com/sites/emmawoollacott/2017/09/09/exclusive-amazons-fake-review-problem-is-now-worse-than-ever/. (Accessed on 10/1/2023).
- Yu Zhang, Jing Bian, and Weixiang Zhu. 2013. Trust fraud: A crucial challenge for China's e-commerce market. Electronic Commerce Research and Applications 12, 5 (2013), 299–308. https://doi.org/10.1016/j.elerap.2012.11.005
- Shi Zhou and Raúl J Mondragón. 2004. The rich-club phenomenon in the Internet topology. Communications Letters, IEEE 8, 3 (2004), 180–182.
FOOTNOTE
5 http://www.onlineauction.com/
6It is worth mentioning that some e-commerce platforms also provide sellers the opportunity to rank buyers. In this study, we treated feedback given from a seller to a buyer the same as feedback given from a buyer to a seller.
7An additional method for evaluating the classifier precision is to validate the classifiers results manually. However, due to privacy limitations, these types of methods were not available in this study.
8The igraph clique detection algorithm implementation we used treats the directed graph as an undirected graph.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
IDEAS 2023, May 05–07, 2023, Heraklion, Crete, Greece
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0744-5/23/05.
DOI: https://doi.org/10.1145/3589462.3589479