Added a dashboard in Python! See real_estate_dashboard.py.
This script is designed to scrape real estate listings from Realtor.ca using Puppeteer. It connects to an existing browser session through a debugging port, enabling the reuse of local user data, cookies, and preferences. This approach eliminates the need to launch a new browser instance and allows manual pre-scraping interactions, such as logging in if required. Once connected, the script navigates to the provided Realtor.ca URL and dynamically determines the total number of pages to scrape by extracting the pagination count using an XPath query. Depending on the configuration, it can scrape either all available pages or limit itself to the first two.
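The connection and page-selection steps described above could look roughly like the sketch below. It is not the script's actual code: it assumes the browser was started with `--remote-debugging-port=9222`, and the names `pagesToScrape`, `connectAndOpen`, and the `scrapeAll` flag are hypothetical. `puppeteer.connect` with a `browserURL` is the standard Puppeteer call for attaching to a running browser.

```javascript
// Decide which page numbers to visit: all of them, or only the first two.
function pagesToScrape(totalPages, scrapeAll) {
  const last = scrapeAll ? totalPages : Math.min(2, totalPages);
  return Array.from({ length: last }, (_, i) => i + 1);
}

// Attach to a browser that is already running with remote debugging enabled.
// The port number is an assumption; use whichever port the browser was started with.
async function connectAndOpen(url, debugPort = 9222) {
  const puppeteer = require('puppeteer'); // loaded lazily; must be installed separately
  const browser = await puppeteer.connect({
    browserURL: `http://127.0.0.1:${debugPort}`,
    defaultViewport: null, // keep the existing window size
  });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  return { browser, page };
}
```

When finished, calling `browser.disconnect()` instead of `browser.close()` leaves the existing browser session running.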
The script opens each page in a new browser tab, with a short delay between tabs to avoid overloading the server. From each listing on a page it extracts the price, square footage, address, and a link to the property. To make the links usable, it builds absolute URLs by prepending the base domain (https://www.realtor.ca) to the relative href attributes found in the anchor tags. Each listing is stored as an object containing the price, square footage, street, city, province, and link. Finally, it keeps only properties with at least 1,800 square feet.
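As an illustration of the link-building and size filter, here is a minimal sketch. The helper names (`toAbsoluteUrl`, `parseSquareFeet`, `filterBySize`) and the shape of the square-footage text are assumptions, not taken from the script.

```javascript
const BASE_URL = 'https://www.realtor.ca';
const MIN_SQFT = 1800;

// Turn a relative href into an absolute link on the base domain.
function toAbsoluteUrl(href) {
  return new URL(href, BASE_URL).href;
}

// Pull the first number out of text like "2,150 sqft"; returns NaN if none is found.
function parseSquareFeet(text) {
  const match = String(text).replace(/,/g, '').match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : NaN;
}

// Keep only listings at or above the minimum square footage.
// Listings whose size cannot be parsed are dropped, because NaN >= n is false.
function filterBySize(listings, minSqft = MIN_SQFT) {
  return listings.filter((listing) => parseSquareFeet(listing.sqft) >= minSqft);
}
```

Resolving through the `URL` constructor rather than string concatenation also handles hrefs that are already absolute.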
Once all the data has been collected, the script formats it into a CSV file. It begins with a header row and escapes values appropriately for compatibility. The file is saved locally in the script’s directory with a unique filename to avoid overwriting previous results. If any issues occur during scraping, the script logs errors and continues to process the remaining listings or pages, ensuring robustness and continuity.
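The CSV step might be sketched as follows. This is one common way to do it (quoting in the style of RFC 4180 and a timestamped filename), not necessarily what the script does; the column keys and the `listings-` filename prefix are assumptions.

```javascript
const fs = require('fs');
const path = require('path');

const COLUMNS = ['price', 'sqft', 'street', 'city', 'province', 'link'];

// Quote a value if it contains a comma, a double quote, or a line break.
function escapeCsv(value) {
  const text = value == null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build the full CSV text: header row first, then one row per listing.
function toCsv(listings) {
  const rows = listings.map((l) => COLUMNS.map((c) => escapeCsv(l[c])).join(','));
  return [COLUMNS.join(','), ...rows].join('\n') + '\n';
}

// Timestamped filename so earlier results are not overwritten.
function uniqueFilename(now = new Date()) {
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return `listings-${stamp}.csv`;
}

// Write the file next to the script and return its path.
function saveCsv(listings, dir = __dirname) {
  const file = path.join(dir, uniqueFilename());
  fs.writeFileSync(file, toCsv(listings), 'utf8');
  return file;
}
```

Prices such as `$799,000` contain a comma, which is why the quoting step matters for this data.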
This scraper is particularly useful for real estate investors, agencies, or researchers looking to quickly gather property data for analysis. By connecting to an active browser session, it can handle websites that require authentication or specific user data. Additionally, the script's dynamic handling of pagination and filtering capabilities make it adaptable to a variety of use cases, such as analyzing market trends or gathering targeted property information. It provides a highly automated yet customizable solution for scraping and saving real estate listings locally.
Filter all available listings
POST https://api37.realtor.ca/Listing.svc/PropertySearch_Post
Get details about a specific listing
GET https://api37.realtor.ca/Listing.svc/PropertyDetails
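For readers who want to call these endpoints without the wrapper, the sketch below only builds the request pieces. It assumes the search endpoint accepts a form-encoded body and the details endpoint accepts a query string; neither is confirmed here. The helper names are hypothetical, and the default values mirror the wrapper defaults described in the options section.

```javascript
const SEARCH_URL = 'https://api37.realtor.ca/Listing.svc/PropertySearch_Post';
const DETAILS_URL = 'https://api37.realtor.ca/Listing.svc/PropertyDetails';

// Assumption: the search endpoint takes application/x-www-form-urlencoded data.
function buildSearchRequest(options) {
  const params = new URLSearchParams({
    CultureId: 1,
    ApplicationId: 37,
    PropertySearchTypeId: 1,
    ...options, // caller-supplied values override the defaults
  });
  return {
    url: SEARCH_URL,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: params.toString(),
    },
  };
}

// Assumption: the details endpoint reads its options from the query string.
function buildDetailsUrl(propertyId, referenceNumber) {
  const params = new URLSearchParams({
    CultureId: 1,
    ApplicationId: 37,
    PropertyId: propertyId,
    ReferenceNumber: referenceNumber,
  });
  return `${DETAILS_URL}?${params}`;
}
```

On Node 18 or later, the result of `buildSearchRequest` can be passed to the built-in `fetch(url, init)`.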
- realtor.post(options): returns a promise which will resolve with a JSON object containing the query results returned from realtor.ca.
- realtor.getPropertyDetails(options): returns a promise which will resolve with a JSON object containing the detailed information of a property. Passing PropertyId and ReferenceNumber (MLS number) as options is required. The PropertyId can be obtained from the listing URL.
- realtor.buildUrl: returns a URL with the query string constructed from the specified options.
- realtor.optionsFromUrl: returns options from a URL from realtor.ca (from the map view).

Note: The website link from buildUrl only allows specific price quantities. Any inconsistent PriceMin and PriceMax values will be rounded up to the next price level. Read the source for clarification.
const realtor = require('realtorca');

// Map view bounds and price range to search within
const opts = {
  LongitudeMin: -79.6758985519409,
  LongitudeMax: -79.6079635620117,
  LatitudeMin: 43.57601549736786,
  LatitudeMax: 43.602250137362276,
  PriceMin: 100000,
  PriceMax: 410000
};

// Build a realtor.ca map URL (PriceMax is rounded up to the next price level)
console.log(realtor.buildUrl(opts));
// https://www.realtor.ca/Residential/Map.aspx#LongitudeMin=-79.6758985519409&LongitudeMax=-79.6079635620117&LatitudeMin=43.57601549736786&LatitudeMax=43.602250137362276&PriceMin=100000&PriceMax=425000

// Parse options from a URL
console.log(realtor.optionsFromUrl("https://www.realtor.ca/Residential/Map.aspx#LongitudeMin=-79.6758985519409&LongitudeMax=-79.6079635620117&LatitudeMin=43.57601549736786&LatitudeMax=43.602250137362276&PriceMin=100000&PriceMax=425000"));

// Query the API
realtor.post(opts)
  .then(data => {
    // JSON response
  })
  .catch(err => {
    console.error(err);
  });
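The usage example does not cover getPropertyDetails, so here is a hedged sketch of preparing its options. It assumes the PropertyId is the first all-digit segment in the listing URL's path, which is not confirmed by this README. The helper names are hypothetical, and the URL and MLS number in the commented usage are placeholders.

```javascript
// Assumption: the PropertyId is the first path segment made up only of digits.
function propertyIdFromUrl(listingUrl) {
  const segments = new URL(listingUrl).pathname.split('/');
  const id = segments.find((segment) => /^\d+$/.test(segment));
  if (!id) {
    throw new Error(`No numeric PropertyId found in ${listingUrl}`);
  }
  return id;
}

// Both options are required by getPropertyDetails.
function detailsOptions(listingUrl, mlsNumber) {
  return {
    PropertyId: propertyIdFromUrl(listingUrl),
    ReferenceNumber: mlsNumber,
  };
}

// Usage (needs the realtorca package and network access; values are placeholders):
// const realtor = require('realtorca');
// realtor.getPropertyDetails(detailsOptions('https://www.realtor.ca/real-estate/12345678/1-example-st', 'X0000000'))
//   .then(details => console.log(details))
//   .catch(err => console.error(err));
```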
Most of the information was pulled from the DOM nodes on the website.
Most of the following options are optional. The first 3 listed are required, but the wrapper will provide a default if they aren't specified.
- CultureId: language identifier, 1 for English, 2 for French. Defaults to 1.
- ApplicationId: mandatory for some endpoints. Defaults to 37; their mobile app uses the value 37.
- PropertySearchTypeId: determines the type of property. Defaults to 1. Possible values:
  - 0: No Preference
  - 1: Residential
  - 2: Recreational
  - 3: Condo/Strata
  - 4: Agriculture
  - 5: Parking
  - 6: Vacant Land
  - 8: Multi Family
- HashCode: mandatory for some endpoints; their mobile app defaults it to 0.
Most useful options
- PriceMin: defaults to 0
- PriceMax
- LongitudeMin: bottom left longitude of the map viewport
- LatitudeMin: bottom left latitude of the map viewport
- LongitudeMax: top right longitude of the map viewport
- LatitudeMax: top right latitude of the map viewport
- TransactionTypeId: defaults to 2?
  - 1: For sale or rent
  - 2: For sale
  - 3: For rent
- StoreyRange: "min-max", e.g. "2-3"
- BedRange: "min-max". If min = max, it searches for the exact value. If it's 1-0, it means it's 1+. Maxes at 5.
- BathRange: "min-max"
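The "min-max" range strings can be produced with a small helper such as the sketch below. The name `rangeOption` is hypothetical. The "min-0" form for "min or more" is documented here for BedRange; whether every range option accepts it is an assumption. The helper does not enforce the BedRange maximum of 5.

```javascript
// Build a "min-max" range string.
// Omitting max yields "min-0", which the README describes as "min or more".
function rangeOption(min, max) {
  if (!Number.isInteger(min) || min < 0) {
    throw new RangeError('min must be a non-negative integer');
  }
  if (max === undefined) {
    return `${min}-0`;
  }
  if (!Number.isInteger(max) || max < min) {
    throw new RangeError('max must be an integer greater than or equal to min');
  }
  return `${min}-${max}`;
}

// Example: 3 or more bedrooms, exactly 2 bathrooms, 2 to 3 storeys.
const rangeExample = {
  BedRange: rangeOption(3),
  BathRange: rangeOption(2, 2),
  StoreyRange: rangeOption(2, 3),
};
```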
Others
- SortBy: how to sort (e.g. price, date, etc.)
- SortOrder: how to order the items after sorting them by a given field (e.g. ascending, descending)

Type | Sort |
---|---|
Low to High ($) | 1-A |
High to Low ($) | 1-D |
Date Posted: New to Old | 6-D |
Date Posted: Old to New | 6-A |
Open Houses First | 12-D |
More Photos First | 13-D |
Virtual Tour First | 11-D |
- organizationID: sort/search by organizationID of a group of realtors. The value of this field can be found using a URL such as https://www.realtor.ca/Residential/OfficeDetails.aspx?OrganizationId=271479, as pointed out by Froren.
- individualID: sort/search by agentID. Can be found using a URL such as https://www.realtor.ca/Agent/1914698/Gaetan-Kill-130---1152-Main... (in this case individualID = 1914698), as indicated by Kris.
- viewState: m, g, or 1. Seems irrelevant.
- Longitude: (optional) longitude of the current user's location
- Latitude: (optional) latitude of the current user's location
- ZoomLevel: not sure what this does
- CurrentPage: read somewhere that it maxes at 51
- RecordsPerPage: their mobile app uses 500 as the default value
- MaximumResults
- PropertyTypeGroupID: ???
- OwnershipTypeGroupId
  - 0: Any
  - 1: Freehold
  - 2: Condo/Strata
  - 3: Timeshare/Fractional
  - 4: Leasehold
- ViewTypeGroupId
  - 0: Any
  - 1: City
  - 2: Mountain
  - 3: Ocean
  - 4: Lake
  - 5: River
  - 6: Ravine
  - 7: Other
  - 8: All Water Views
- BuildingTypeId
  - 0: Any
  - 1: House
  - 2: Duplex
  - 3: Triplex
  - 5: Residential Commercial Mix
  - 6: Mobile Home
  - 12: Special Purpose
  - 14: Other
  - 16: Row/Townhouse
  - 17: Apartment
  - 19: Fourplex
  - 20: Garden Home
  - 27: Manufactured Home/Mobile
  - 28: Commercial Apartment
  - 29: Manufactured Home
- ConstructionStyleId
  - 0: Any
  - 1: Attached
  - 3: Detached
  - 5: Semi-detached
  - 7: Stacked
  - 9: Link
- UnitRange: how many units within a given building, similar to BathRange, such as 2-0 to denote 2 or more units
- AirCondition: 0 or 1, defaults to 0
- Pool: 0 or 1, defaults to 0
- Fireplace: 0 or 1, defaults to 0
- Garage: 0 or 1, defaults to 0
- Waterfront: 0 or 1, defaults to 0
- Acreage: 0 or 1, defaults to 0
- Keywords: search text
- ListingIds: comma-separated listing IDs to scope the search to
- ReferenceNumber: search using the MLS #; this is required for viewing a listing detail
- OpenHouse: 0 or 1, must be included if filtering by open house
- OpenHouseStartDate: MM/DD/YYYY
- OpenHouseEndDate: MM/DD/YYYY
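Since the open house dates must be in MM/DD/YYYY form, a small formatter avoids hand-built strings. The sketch below uses hypothetical helper names and formats the local calendar date of a JavaScript `Date`. It assumes the start and end dates are meant to be sent together with `OpenHouse: 1`.

```javascript
// Format a Date as MM/DD/YYYY using its local calendar date.
function formatOpenHouseDate(date) {
  const mm = String(date.getMonth() + 1).padStart(2, '0');
  const dd = String(date.getDate()).padStart(2, '0');
  return `${mm}/${dd}/${date.getFullYear()}`;
}

// Build the open house filter options for a date window.
function openHouseOptions(start, end) {
  if (end < start) {
    throw new RangeError('end date must not be before start date');
  }
  return {
    OpenHouse: 1,
    OpenHouseStartDate: formatOpenHouseDate(start),
    OpenHouseEndDate: formatOpenHouseDate(end),
  };
}
```

The returned object can be merged into the search options, for example `realtor.post({ ...opts, ...openHouseOptions(start, end) })`.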
Feel free to PR and fork.