Table of Contents
- Reviews API is the backend infrastructure designed and created from scratch for our client's ecommerce website with the primary goal of improving the website's performance to handle up to 10,000 requests per second and replace the currently outdated system that could not handle large amounts of user traffic to the website.
- Node.js
- Express
- MySQL
- Hosted on separated AWS EC2 instances
- k6.io (development evironment stress testing)
- loader.io (production/cloud environment stress testing)
- Integrated with NewRelic to analyze service's average response time, error rate, and throughput.
- Clone this repo or fork to your own repo
- cd into the directory on your local machine
- Install dependencies
npm install
- Start server
npm start
- You are connected to a remote MySQL database hosted on AWS EC2 instance and can test out the API endpoints (see API documentation)
localhost:3000/reviews?product_id=1
- Run tests
cd load-tests
k6 run script.js
- Designed and selected two DBMS technologies (RDBMS and NoSQL DBMS) taking performances and complexity into consideration. Designed a primary database (MySQL) and secondary database (Mongo).
- Performed Extract, Transform, Load (ETL) process to transfer a large application dataset into MySQL.
- ~1 million records of Products
- ~5.5 million records of Reviews
- ~18 million records of Product's Characteristics Metadata
- ~2.5 million records of Reviews Photos
- Designed and created API logic, defined routes expected by the API, integrated server and database.
- Performed DBMS and Service API stress testing with k6 integrated with New Relic in development environment (local machine).
- Deployed DBMS and API onto separate EC2 instances (t2.micro).
- Performed stress testing with loader.io integrated with New Relic in production environment.
- Scaled the application to handle loads up to 10,000 requests per second (RPS).
Response time (latency)
: <2000ms under load.Throughput
: 1000 RPS on EC2.Error rate
: <1% under load.
(for more details see engineering-journal)
- One of the early challenges that I faced was transforming the given data set to fit into MySQL database schema. Since MySQL does not have built-in
BOOLEAN
type and instead allTRUE
andFALSE
value was converted into0
and stored asTINYINT
. I had to setTRUE
andFALSE
values to0
and1
before loading into the database.
-
I was able to achieve
1000 RPS
on development environment (local machine) with5-10ms
average response time stress testing with k6. However, when deployed onto AWS EC2 and stress testing with loader.io, I noticed my average response time jumped to~3500ms
for1000 RPS
.
- Replaced MySQL createConnection with createPool since it allows for more queries to be executed at the same time whereas createConnection only has a single connection and would affect performance.
- To lower my average response time and increase performance, I decided to use
NGINX
as a load balancer with the built-in Least Connection method and created 2 more servers on AWS in order to distribute the requests for evenly. With horizontal scaling, I was able to lower my response time from~3500ms
to~65ms
for1000 RPS
. - Horizontal scaling
~3500ms
->~500ms
for1000 RPS
and12% error rate
with 2 servers~500ms
->~65ms
for1000 RPS
and0% error rate
with 3 servers- After 3 servers, I noticed a diminishing returns in response time and concluded
~65ms
is the best I could get from horizontal scaling and that the next step would bevertical scaling
which costs money.
Returns a list of reviews for a particular product. This list does not include any reported reviews.
- GET
/api/reviews
Query Parameters:
product_id
Specifies the product for which to retrieve reviews.count
Specifies how many results per page to return. Default 5.sort
Changes the sort order of reviews to be based on "newest", "helpful", or "relevant"
Success Status Code: 200
Returns: JSON
{
"product": "2",
"page": 0,
"count": 5,
"results": [
{
"review_id": 5,
"rating": 3,
"summary": "I'm enjoying wearing these shades",
"recommend": false,
"response": null,
"body": "Comfortable and practical.",
"date": "2019-04-14T00:00:00.000Z",
"reviewer_name": "shortandsweeet",
"helpfulness": 5,
"photos": [
{
"id": 1,
"url": "urlplaceholder/review_5_photo_number_1.jpg"
},
{
"id": 2,
"url": "urlplaceholder/review_5_photo_number_2.jpg"
}
]
},
{
"review_id": 3,
"rating": 4,
"summary": "I am liking these glasses",
"recommend": false,
"response": "Glad you're enjoying the product!",
"body": "They are very dark. But that's good because I'm in very sunny spots",
"date": "2019-06-23T00:00:00.000Z",
"reviewer_name": "bigbrotherbenjamin",
"helpfulness": 5,
"photos": []
}
]
}
- GET
/api/reviews/meta
Success Status Code: 200
Returns: JSON
{
"product_id": "2",
"ratings": {
2: 1,
3: 1,
4: 2,
},
"recommended": {
0: 5
},
"characteristics": {
"Size": {
"id": 14,
"value": "4.0000"
},
"Width": {
"id": 15,
"value": "3.5000"
},
"Comfort": {
"id": 16,
"value": "4.0000"
},
}
- POST
/api/reviews
Body Parameters:
product_id
Required ID of the product to post the review forrating
Integer (1-5) indicating the review ratingsummary
Summary text of the reviewbody
Continued or full text of the reviewrecommend
Value indicating if the reviewer recommends the productname
Username for question askeremail
Email address for question askerphotos
Array of text urls that link to images to be showncharacteristics
Object of keys representing characteristic_id and values representing the review value for that characteristic.
Success Status Code: 201
Request Body: Expects JSON with any of the following keys (include only keys to be updated)
{
"product_id": 123,
"rating": 5,
"summary": "String",
"body": "String",
"recommend": true,
"name": "String",
"email": "String",
"photos": [
{
"id": 1,
"url": "urlplaceholder/review_5_photo_number_1.jpg"
},
{
"id": 2,
"url": "urlplaceholder/review_5_photo_number_2.jpg"
}
]
}
- PUT
/api/reviews/:review_id/helpful
Path Parameters:
review_id
Required ID of the review to update
Success Status Code: 204
- PUT
/api/reviews/:review_id/report
Path Parameters:
review_id
Required ID of the review to update
Success Status Code: 204
- Dockerize server for easier deployment.
- Implement Redis caching.
- Explore vertical scaling & other options.