In this section, we analyze temporal and seasonal energy usage by individual homes and perform cluster analysis to identify the different types of home load profiles in the city. Analyzing data at the level of an individual home can raise privacy concerns, such as occupancy detection. Therefore, we analyze energy consumption in aggregate across all buildings in the city. Since all meter data is anonymized, our analysis minimizes the privacy risks that may come with load disaggregation.
5.1 Temporal and Seasonal Analysis
We first analyze the distribution of electricity usage across residential and commercial customers. Figure
5(a) depicts the histogram of average electricity usage for residential customers. The figure shows a mean power consumption of 0.8 kW and the distribution shows a long tail where the 99\(^{\rm th}\) percentile of the consumption is 4.6 times the mean. The mean consumption is lower than the average usage of 1.24 kW reported for a typical US household [
2]. Figure
5(b) depicts the histogram of average electricity usage for commercial customers, which differs from residential consumption in a number of ways. First, the figure shows a mean power consumption of 1.4 kW, which is 1.75 times the residential mean. This is expected, since commercial customers typically have higher occupancy and additional energy needs for production. The figure also reveals that the highest-consuming commercial buildings draw up to 20 kW, which is 14.3
\(\times\) the average consumption. Figure
2(a) shows the aggregate energy demand of all customers for each hour of the day for the summer and winter seasons. In this study, months in which most days have an average daily temperature greater than 60°F are categorized as summer, and the rest are categorized as winter. We chose this threshold because the resulting summer and winter months coincide with the summer and winter seasons in the Northeastern United States. We included September among the summer months because it had a high number of days with an average temperature of 60°F or more. Thus, unless stated otherwise, winter days are the days in Jan-Apr and Oct-Dec 2019, whereas summer days are the days in May-Sep 2019.
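The season rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classify_months`, the reading of "most days" as a strict majority, and the sample temperatures are assumptions made here, not details taken from the study.

```python
from collections import defaultdict

SUMMER_THRESHOLD_F = 60.0  # threshold stated in the text


def classify_months(daily_temps):
    """Label each month as 'summer' or 'winter'.

    daily_temps: iterable of (month, avg_daily_temp_f) pairs, one per day.
    """
    warm_days = defaultdict(int)
    total_days = defaultdict(int)
    for month, temp_f in daily_temps:
        total_days[month] += 1
        if temp_f > SUMMER_THRESHOLD_F:
            warm_days[month] += 1
    # Assumption: "most days" means a strict majority of the month's days.
    return {
        month: "summer" if warm_days[month] > total_days[month] / 2 else "winter"
        for month in total_days
    }


# Made-up temperatures, for illustration only.
sample = (
    [(1, 30.0)] * 31                      # January: no warm days
    + [(7, 75.0)] * 25 + [(7, 58.0)] * 6  # July: 25 of 31 days are warm
    + [(10, 61.0)] * 10 + [(10, 50.0)] * 21  # October: 10 of 31 days are warm
)
seasons = classify_months(sample)
print(seasons)
```

With real data, the same function would take one pair per calendar day of 2019, and the resulting labels could then be compared against the May-Sep and Jan-Apr/Oct-Dec split used in the text.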
Figures
6(a) and
6(b) are heat maps showing electric usage from two individual homes. Here, the energy consumption for each home has been normalized using Min-Max scaling. In Figure
6(a), the electricity usage pattern reveals clear peaks during morning and evening hours over the entire year with somewhat higher usage during summer evenings. Figure
6(b), on the other hand, reveals higher usage in winter months than summer months. Figure
6(b) also reveals higher usage during the morning (around 7AM) throughout the year—presumably due to the need for hot water for showers. Prior work that has analyzed daily energy trends also revealed morning and evening peaks [
37]. However, these analyses do not capture seasonal patterns because the underlying data covers a shorter period of time.
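As a minimal sketch of the Min-Max scaling applied before plotting the heat maps, the snippet below scales a day-by-hour grid of readings for one home to the range 0 to 1. The grid layout, the function name `min_max_scale_grid`, and the choice to scale over the home's whole grid (rather than per day) are assumptions made for illustration.

```python
def min_max_scale_grid(grid):
    """Scale every reading of one home to [0, 1] using that home's own min and max.

    grid: list of rows (days), each a list of readings (hours).
    """
    flat = [value for row in grid for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    # A constant series has no range; map it to all zeros to avoid dividing by zero.
    return [[(value - lo) / span if span else 0.0 for value in row] for row in grid]


# Hypothetical readings in kW: two days, three time slots each.
grid = [
    [0.5, 1.5, 1.0],
    [0.5, 2.5, 1.0],
]
scaled = min_max_scale_grid(grid)
print(scaled)  # [[0.0, 0.5, 0.25], [0.0, 1.0, 0.25]]
```

Scaling each home separately makes the shape of the usage pattern comparable across homes, at the cost of hiding differences in absolute consumption.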
Figure
6(b) depicts a home in which electricity is used to provide heat during the winter. The figure shows higher electricity demand in winter for electric heating and also shows higher morning and evening peaks. It also reveals higher usage for a few days in August—presumably due to higher cooling demand. Prior work that has analyzed seasonal energy consumption of residential customers also found higher energy usage during summer and winter months [
30]. However, that work relied on coarse-grained data, whereas our analysis reveals these patterns at a much finer granularity.
In summary: (1) Energy usage at individual homes shows time-of-day effects, with morning and evening peaks, as well as seasonal effects. (2) Electricity demand is higher in summer and gas demand is higher in winter due to the use of electric ACs and gas heaters, respectively.
5.2 Load Profile Analysis
Having examined the temporal and seasonal influences on energy use, we next study how different customers use energy on a day-to-day basis in their homes. For this analysis, we only cluster data from residential homes. Our hypothesis is that the energy usage within a home is largely determined by the daily routines and activities of the household, and that, depending on the characteristics of residents and their routines, different groups of customers will exhibit similar usage patterns. For example, a home where everyone works during the day from 9AM to 5PM will have a different profile than a home with a retired person.
To validate this hypothesis, we perform customer segmentation analysis on the daily load profiles of homes across the entire customer base. Since we are primarily interested in the pattern rather than the magnitude of energy usage, we begin by normalizing the average daily load profile of each home to the range 0 to 1. We then apply k-means clustering to these load profiles. k-means is a widely used clustering technique that takes as input a set of instances (individual homes), their features (average energy consumption for each hour of the day), and the desired number of clusters, k. It uses an iterative approach to partition the data set into k groups such that the intra-cluster distance is small and the inter-cluster distance is large. For this experiment, we used the sum of squared distances between load profiles as the distance measure. Statistical techniques such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) are typically used to select the number of clusters. However, these are generic model selection criteria and may not work well in all domains. Thus, we decided to employ visual model selection. By running k-means for different values of k, we found k = 8 to be the best choice for our dataset, as it did not produce any outlier clusters.
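The clustering procedure described above can be sketched as follows. This is a small, self-contained illustration rather than the pipeline used in the study: the farthest-first seeding, the synthetic 24-hour profiles, and all function names are assumptions introduced here, and a library implementation would normally be preferred at city scale.

```python
import random


def normalize(profile):
    """Min-Max scale one home's average daily load profile to [0, 1]."""
    lo, hi = min(profile), max(profile)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two load profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(profiles, k):
    """Deterministic seeding (an assumption; the text does not name a scheme)."""
    centroids = [profiles[0]]
    while len(centroids) < k:
        centroids.append(
            max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def assign(profiles, centroids):
    """Index of the nearest centroid for every profile."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in profiles
    ]


def kmeans(profiles, k, max_iters=100):
    centroids = farthest_first(profiles, k)
    for _ in range(max_iters):
        labels = assign(profiles, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(
                [sum(col) / len(members) for col in zip(*members)]
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    labels = assign(profiles, centroids)
    # Sum of squared distances from each profile to its assigned centroid.
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(profiles, labels))
    return labels, centroids, inertia


def synthetic_home(peak_hours, rng):
    """Made-up hourly profile: a flat baseline plus a 1 kW bump at the peak hours."""
    return [rng.uniform(0.2, 0.3) + (1.0 if h in peak_hours else 0.0) for h in range(24)]


rng = random.Random(42)
homes = [synthetic_home({7, 19}, rng) for _ in range(20)]  # bimodal homes
homes += [synthetic_home({13}, rng) for _ in range(20)]    # midday-peak homes
profiles = [normalize(h) for h in homes]

# Run k-means for several values of k, as a stand-in for visual model selection.
inertia_by_k = {k: kmeans(profiles, k)[2] for k in (1, 2, 3)}
labels, centroids, inertia = kmeans(profiles, 2)
print(inertia_by_k)
print(labels)
```

In this toy example the two synthetic groups are well separated, so k = 2 recovers them exactly. On real load profiles the centroids for each candidate k would be plotted and inspected, which is the visual selection step the text refers to.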
Figure
7 shows the eight clusters (customer segments) that resulted from our analysis. The lightly shaded lines are the profiles of each home present in that cluster and the bold line represents the centroid of the load pattern within each cluster. Broadly, there are four clusters that are bimodal with two peak usage periods of varying degrees and four clusters that are unimodal with a single peak usage period over the course of the day.
Table
3 summarizes the key characteristics of the customer segments within each cluster, including cluster type, peaks observed, the number of homes, and their proportion in the dataset. As shown, 7,216 homes (53.6% of total) exhibit bimodal usage (clusters
a,b,c,d), 1,240 homes (9.2% of total) exhibit unimodal daytime peak usage (cluster
e), 2,291 homes (17.0% of total) exhibit unimodal evening peak usage (cluster
f), while 2,711 homes (20.1% of total) exhibit “nocturnal” usage (clusters
g,h).
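The proportions quoted above follow directly from the home counts. The short check below recomputes them, assuming (as the percentages suggest) that the four groups together make up the whole clustered residential dataset; the dictionary keys are labels chosen here for readability.

```python
# Home counts per group, as quoted in the text.
counts = {
    "bimodal (a-d)": 7216,
    "unimodal daytime (e)": 1240,
    "unimodal evening (f)": 2291,
    "nocturnal (g, h)": 2711,
}

# Assumption: the four groups cover every clustered home.
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)   # 13458
print(shares)  # 53.6, 9.2, 17.0 and 20.1 percent respectively

# Daytime and evening unimodal homes combined.
unimodal_day_or_evening = round(
    100 * (counts["unimodal daytime (e)"] + counts["unimodal evening (f)"]) / total, 1
)
print(unimodal_day_or_evening)  # 26.2
```

The rounded shares add up to 99.9% rather than 100% purely because of rounding to one decimal place.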
Figures
7(a)–
7(d) depict the four bimodal clusters. Figures
7(a) and
7(b) are homes with a small morning peak and a more prominent evening peak. These usually correspond to households with a working/school routine. Figure
7(c) is the opposite with a greater morning peak and a less prominent evening peak. Figure
7(d) depicts households with large morning and evening peaks. The nature of these peaks reflects appliance usage within homes at different times of day. For example, a taller morning peak reveals greater appliance use in the morning (e.g., use of laundry machines), while a taller evening peak reveals homes where more of these activities are performed in the evening. Figure
7(d) depicts a more uniform distribution of activities in the morning and evenings. These represent homes that are occupied during the day.
Figures
7(e)–
7(h) depict four clusters with unimodal usage characterized by a single peak. Figure
7(e) depicts households where energy usage peaks in mid-day—presumably due to occupancy during daytime hours. Figure
7(f) depicts homes where peak usage occurs during the evening, with different peaks reflecting when daily chores are performed. Figures
7(g) and
7(h) represent nocturnal homes where the off-peak period occurs in the late morning or mid-afternoon and peak usage occurs during night hours. Presumably, these homes represent occupants who come home late at night.
Prior work that has studied segmentation based on energy usage also revealed similar unimodal and bimodal peaking profiles, with households that exhibit an evening peak accounting for the largest group of customers [
31]. However, as with most other prior studies, the data used in this prior work is less granular (
\(\gt 10\times\)) than the data used in this work, and this enables segmentation at a much finer scale.
In summary: Our customer segmentation reveals how energy profiles correspond to daily routines, with 53.6% of homes exhibiting bimodal energy usage, 26.2% exhibiting unimodal daytime or evening usage, and 20.1% exhibiting nocturnal usage.
5.3 Peak Analysis
Next, we analyze the peak power consumption recorded by each meter. We define peak power as the maximum power recorded by a smart meter at any time during the year. The recorded power in the data contains a few instances of unusually high values, which we attribute to spurious meter readings. Therefore, to compute the peak, we take the 99.9th percentile reading across all data to eliminate these spurious reads, which may otherwise affect the peak analysis.
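A minimal sketch of this robust peak computation is shown below. It assumes the percentile is taken over one meter's readings and uses the nearest-rank percentile definition; both are assumptions, since the text does not specify either, and the readings are made up for illustration.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n) in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def robust_peak_kw(readings_kw, pct=99.9):
    """Peak power for one meter, ignoring the top (100 - pct)% of readings."""
    return percentile_nearest_rank(readings_kw, pct)


# Hypothetical meter: 9,998 ordinary readings plus two spurious spikes.
readings = [0.5 + 0.0005 * i for i in range(9998)] + [500.0, 750.0]

raw_peak_kw = max(readings)
peak_kw = robust_peak_kw(readings)
print(raw_peak_kw)  # 750.0, dominated by a spurious read
print(peak_kw)      # roughly 5.5, close to the largest ordinary reading
```

The trade-off is that a genuine but very brief peak in the top 0.1% of readings is discarded along with the spurious ones.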
Figures
8(a) and
8(b) depict the cumulative distribution function of peak power drawn by residential and commercial meters during the year, respectively. Figure
8(a) shows that the median peak power across residential meters is 5.4 kW. In comparison, this is approximately 7.7
\(\times\) the average usage depicted in Figure
5(a). The figure also shows that some meters draw up to
\(\approx\)88 kW at some point in time during the year, while the 95\(^{\rm th}\) percentile is 10.4 kW. On the other hand, Figure
8(b) shows that the median peak power across commercial meters is 3.5 kW, which is 5
\(\times\) the average usage depicted in Figure
5(b). The figure also shows that the 95\(^{\rm th}\) percentile is 20.0 kW, which is approximately 2\(\times\) the 95\(^{\rm th}\) percentile of residential peak power (10.4 kW).
To examine how far peak usage departs from average usage, we compute the peak-to-average ratio for each home over the year. Figure
8(c) plots the distribution of this ratio. The median peak-to-average ratio is 6.9, indicating that the peak usage is approximately 7
\(\times\) the average usage. Similar to the peak usage, this distribution also has a long tail, with some homes reaching peaks as high as 40
\(\times\) the average power consumption. As we will see later in Section
6.2, the peak-to-average ratio experiences a smoothing effect at the transformer level, i.e., when multiple meters are combined into one transformer, their respective peaks occur at different times, and this leads to a lower aggregate peak at the transformer level.
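The smoothing effect can be illustrated with a toy example. The two load series below are made up, the names are hypothetical, and the peak is taken as the plain maximum for simplicity (rather than the 99.9th percentile used above).

```python
def peak_to_average(series_kw):
    """Ratio of the maximum reading to the mean reading."""
    return max(series_kw) / (sum(series_kw) / len(series_kw))


# Two hypothetical homes whose peaks occur at different times.
home_a = [1.0, 5.0, 1.0, 1.0]
home_b = [1.0, 1.0, 1.0, 5.0]

# Load seen by a transformer serving both homes.
transformer = [a + b for a, b in zip(home_a, home_b)]

print(peak_to_average(home_a))       # 2.5
print(peak_to_average(home_b))       # 2.5
print(peak_to_average(transformer))  # 1.5

# The aggregate peak is lower than the sum of the individual peaks.
print(max(transformer), max(home_a) + max(home_b))  # 6.0 10.0
```

Because the two peaks do not coincide, the aggregate peak (6 kW) is well below the sum of the individual peaks (10 kW), which lowers the peak-to-average ratio at the transformer.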
In summary: Commercial meters experience higher peaks than residential meters. The median peak-to-average ratio is 6.9, indicating that peak usage is approximately 7\(\times\) the average usage.