Article

A Symmetric Kernel Smoothing Estimation of the Time-Varying Coefficient for Medical Costs

1
School of Mathematics, Jilin University, No. 2699 Qianjin Street, Changchun 130012, China
2
Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(4), 389; https://doi.org/10.3390/sym16040389
Submission received: 21 February 2024 / Revised: 16 March 2024 / Accepted: 22 March 2024 / Published: 26 March 2024
(This article belongs to the Special Issue Applications Based on Symmetry/Asymmetry in Functional Data Analysis)
Figure 1. True values at observed time points and estimated function of $\boldsymbol{\beta}(t)$ for scenario 1.
Figure 2. True values at observed time points and estimated function of $\boldsymbol{\beta}(t)$ for scenario 2.
Figure 3. Polynomial regression estimator of $\boldsymbol{\beta}(t)$ for scenario 1.
Figure 4. Polynomial regression estimator of $\boldsymbol{\beta}(t)$ for scenario 2.
Figure 5. True values at observed time points and estimated function of $\nu(t)$ and $\mu(t)$ for scenario 1.
Figure 6. True values at observed time points and estimated function of $\nu(t)$ and $\mu(t)$ for scenario 2.
Figure 7. Polynomial regression estimator of $\nu(t)$ and $\mu(t)$ for scenario 1.
Figure 8. Polynomial regression estimator of $\nu(t)$ and $\mu(t)$ for scenario 2.
Figure 9. Estimated function of $\beta(t)$.
Figure 10. Fitted points of cost based on current approach and the semi-parametric estimating approach.
Figure 11. Fitted points of cumulative cost based on current approach and the semi-parametric estimating approach.
Figure 12. Marginal survival compared with Kaplan–Meier estimator.

Abstract

In longitudinal studies, subjects are repeatedly observed at a set of distinct time points until the terminal event time. The time-varying coefficient model extends the parametric method and captures the dynamic trajectories of time-dependent covariate effects, thus enabling it to describe the potential relationship between the longitudinal variable and the observed time points. In this study, we propose a novel approach to the estimation of medical costs using a symmetric kernel smoothing method in the time-varying coefficient joint model. A smooth function of medical costs is derived by weighting the values of longitudinal data at all distinct observed time points via the combination of the kernel method and the inverse probability weighting method. For the simulation study, we first set up the true functions of time-varying coefficients; we then generated random samples for covariates and censored survival times. Subsequently, the longitudinal data of response variables could be produced. Further, numerical simulation experiments were conducted by using the proposed method and applying R code to the generated data. The estimated results for the parameters and non-parametric functions were compared with different settings. The numerical results illustrate that as the sample size increases, the bias and model-based standard errors decrease, and the performance improves with larger sample sizes. The estimates of functions in the model almost coincide with the true functions, as shown in the figures of the simulation study. Furthermore, the consistency of the obtained estimator is demonstrated via theoretical analysis, and a numerical simulation is performed to illustrate the performance of the proposed estimators. The proposed model is applied to a real-world data set acquired from a multicenter automatic defibrillator implantation trial (MADIT).

1. Introduction

In medical cost studies, researchers need to use appropriate methods to evaluate the average medical cost of a patient across their whole life. Due to censoring mechanisms, the survival function is generally not identifiable. Bang and Tsiatis [1] introduced a class of weighted estimators that appropriately account for censoring; although extensive simulation studies showed that the estimators perform well in finite samples, even with heavily censored data, the estimator is not efficient, and the computations are complex. Lin et al. [2] partitioned the entire time period of interest into a number of small intervals and estimated the average total cost to minimize the bias induced by censoring. Furthermore, the estimators were proven to be asymptotically normal. Huang and Lovato [3] formulated weighted log-rank statistics in a marked point process framework and developed the asymptotic theory. The above methods have been applied to estimate the cumulative mean function. However, in addition to censoring mechanisms, data often consist of longitudinal outcomes, time-dependent/independent covariates, event times, and censoring times. In longitudinal studies, subjects are repeatedly observed at a set of distinct time points until the terminal event time. The joint model [4] contains both a longitudinal sub-model, to express the effect on longitudinal measurements from time-dependent covariates, and a survival sub-model, to reveal the survival function association with the longitudinal part. Time-independent covariates are generally present in the survival sub-model due to the absence of repeated measurements.
Deng [5] considered a linear parametric regression model to describe longitudinal data and used the joint modeling technique and the inverse probability weighting method [6] to estimate the cumulative mean function. Although this method can handle time-dependent covariates and right-censored time-to-event data, it may still be too restrictive to capture the initial data. The main reason for this is that the effects of covariates on longitudinal outcomes are treated as time-independent constants.
Thus, some researchers have focused on non-parametric longitudinal sub-models (e.g., Hoover et al. [7]; Zhao et al. [8]; Li et al. [9]; Do et al. [10]). The recent literature offers numerous approaches to constructing non-parametric estimators, such as kernel, smoothing spline, regression spline, and wavelet-based methods. For instance, Eubank and Speckman [11] proposed a well-behaved non-parametric kernel regression model in a small-sample study with bias-corrected confidence bands and proved the asymptotic properties. Wu et al. [12] minimized the local square criterion and obtained the asymptotic distributions for kernel estimators. However, when the covariate dimension is too high, the smooth estimation of general multivariate non-parametric regression may require a large sample size, and the smoothing results may be difficult to interpret.
Several time-varying coefficient longitudinal models have been considered in recent studies. For instance, You et al. [13] considered the time-varying coefficient as a polynomial basis regression spline, proposed a mixed-effects model for multiple longitudinal outcomes using the local polynomial method, and provided tuning parameters and variable selection. However, this model can only be used in continuous longitudinal outcomes. Observed data tend to be discrete in many applications. Moreover, time-independent covariates were not considered in their study [13]. The joint model with kernel smoothing varying coefficients in the longitudinal sub-model can be used to estimate the cumulative mean function with discrete data. Therefore, in this study, we estimate the cumulative mean function with time-dependent/independent covariates using the kernel method in a joint time-varying coefficient model based on right/interval censoring history process data. In our method, the estimator of the mean state function is unbiased at time points where all subjects are observed. The reason for this is that, after smoothing the function with the kernel method, the values of the original sample points are not changed. Thus, we can utilize all the known information for estimation without data bias. Moreover, in our simulation study, the estimator of the cumulative mean state function is almost equal to the value of the preassigned function at any time point.
The remainder of the paper is organized as follows: Section 2 establishes the joint varying coefficient model and proposes the estimators of time-varying coefficients based on our method. Section 3 demonstrates the feasibility of this method through numerical simulations. Section 4 applies the proposed model to a real-world data set from a multicenter automatic defibrillator implantation trial (MADIT). Section 5 discusses the influence of bandwidth h selection and concludes this paper. Finally, the appendices provide the proofs of the main results.

2. Estimation for a Time-Varying Coefficient Model

It is assumed that history process data are right-censored. Let $T$ denote the terminal event time and $C$ the censoring time. Let $Y(t)$ denote the state process, which is related to the time-dependent covariate $X(t)$ and the time-independent covariate $W$. The state process satisfies $Y(t) = 0$ when $t > T$. For $i = 1, 2, \dots, m$, let $T_i$ and $C_i$ denote the true values of $T$ and $C$ for the $i$th subject. Further, $\delta_i = I(T_i \le C_i)$ denotes the censoring indicator. Assume that the censoring time is independent of the terminal event time and the state process. Let $y_i(t)$, the $q$-dimensional vector $x_i(t) = (x_{i1}(t), x_{i2}(t), \dots, x_{iq}(t))^T$, and the $p$-dimensional vector $w_i = (w_{i1}, \dots, w_{ip})^T$ be the observed histories of $Y(t)$, $X(t)$, and $W$ for the $i$th subject.
Now, the time-varying coefficient joint model can be formed as follows:
$$y_i(t) = x_i(t)^T \beta(t) + \epsilon_i(t), \qquad h_i(t) = h_0(t) \exp\{w_i \gamma + \alpha (y_i(t) - \epsilon_i(t))\}, \tag{1}$$
where $\beta(t)$ is the time-varying coefficient, $\alpha$ is the association coefficient linking the longitudinal outcome to the hazard of the event, $h_0(t)$ is the baseline hazard function, which is known in our model, and $\epsilon_i(t)$ is the random error with $\epsilon_i(t) \sim N(0, \sigma^2)$. Further, it is assumed that $\epsilon(t)$ is independent of the terminal event time $T$ conditional on $X(t)$ and $W$.

2.1. Estimation of Time-Varying Coefficient in Longitudinal Model

Note that the observations for the $i$th subject ($i = 1, 2, \dots, m$) are not continuous; they can be obtained only at certain times $t_{ij}$ ($j = 1, 2, \dots, n_i$). The state process $Y(t)$ can be observed until the event time $T_i^* = \min(T_i, C_i)$. Thus, let $y_i(t_{ij})$ be the observed state history. For the sake of illustration, assume $y_i(t_{ij}) = 0$ when $t > T^*$.
For convenience, define the sets $\Delta_i = \{t_{ij},\ j = 1, 2, \dots, n_i\}$ for $i = 1, 2, \dots, m$ and $\Delta = \{t_{(k)};\ k = 0, 1, 2, \dots, N\}$, where $0 = t_{(0)} < t_{(1)} < t_{(2)} < \dots < t_{(N)}$ are the distinct observed time points over all subjects and $N$ denotes the number of distinct observed time points.
The estimator $\hat\beta(t)$ of the time-varying coefficient $\beta(t)$ can be calculated by minimizing the following criterion:
$$L_N(t) = \sum_{k=1}^{N} W_{k,h}(t - t_{(k)}) \sum_{i=1}^{m} \left[ I\{T_i^* \ge t_{(k)}\} \{ y_i(t_{(k)}) - x_i(t_{(k)})^T \beta(t) \} \right]^2, \tag{2}$$
where
$$W_{k,h}(t - t_{(k)}) = \frac{K_h(t - t_{(k)})}{\sum_{k=1}^{N} K_h(t - t_{(k)})},$$
and $K_h(\cdot)$ is the kernel weight function. The bandwidth $h$, which is generally selected according to the observed time points, determines how close $t_{(k)}$ must be to $t$ for the observations at $t_{(k)}$ to receive weight. The major distinction between the estimation from Equation (2) and the estimation in Wu et al. [12] is the indicator $I\{T_i^* \ge t_{(k)}\}$, which accounts for the censoring mechanism and keeps observations that are incomplete due to censoring out of the criterion.
Remark 1.
We assume that $K_h(\cdot)$ is the Epanechnikov kernel function. The cross-validation method can be used to select the appropriate kernel function. The kernel function with the smallest induced error is considered the best. More details on kernel function selection can be found in [14].
Equivalently, we write Equation (2) in the matrix form:
$$L_N(t) = \sum_{k=1}^{N} \{Y_k - X_k \beta(t)\}^T \tilde{D}_k(t) \{Y_k - X_k \beta(t)\}, \tag{3}$$
where
$$Y_k = \big( y_1(t_{(k)}), \dots, y_m(t_{(k)}) \big)^T,$$
$$X_k = \begin{pmatrix} x_{11}(t_{(k)}) & \cdots & x_{1q}(t_{(k)}) \\ \vdots & \ddots & \vdots \\ x_{m1}(t_{(k)}) & \cdots & x_{mq}(t_{(k)}) \end{pmatrix},$$
and
$$\tilde{D}_k(t) = \mathrm{diag}\big( W_{k,h}(t - t_{(k)}) I\{T_1^* \ge t_{(k)}\}, \dots, W_{k,h}(t - t_{(k)}) I\{T_m^* \ge t_{(k)}\} \big).$$
It is assumed that $\sum_{k=1}^{N} X_k^T \tilde{D}_k(t) X_k$ is invertible. By minimizing Equation (3), $\hat\beta(t)$ can be expressed as the following $q$-dimensional column vector:
$$\hat\beta(t) = \Big\{ \sum_{k=1}^{N} X_k^T \tilde{D}_k(t) X_k \Big\}^{-1} \sum_{k=1}^{N} X_k^T \tilde{D}_k(t) Y_k. \tag{4}$$
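As a rough illustration of Equation (4), the following Python sketch evaluates the kernel-weighted least squares estimate at a single time $t$. The paper's own computations use R; the array layout, the function names `epanechnikov` and `beta_hat`, and the scaling $K_h(u) = K(u/h)/h$ are assumptions made for this example, not taken from the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) for |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def beta_hat(t, times, Y, X, T_star, h):
    """Kernel-weighted least squares estimate of beta(t), in the spirit of Equation (4).

    times  : (N,)       distinct observed time points t_(k)
    Y      : (N, m)     responses y_i(t_(k))
    X      : (N, m, q)  covariates x_i(t_(k))
    T_star : (m,)       observed times T_i^* = min(T_i, C_i)
    h      : bandwidth (assumed scaling: K_h(u) = K(u / h) / h)
    """
    k_vals = epanechnikov((t - times) / h) / h       # K_h(t - t_(k))
    weights = k_vals / k_vals.sum()                  # W_{k,h}(t - t_(k))
    q = X.shape[2]
    A = np.zeros((q, q))                             # sum_k X_k^T D_k(t) X_k
    b = np.zeros(q)                                  # sum_k X_k^T D_k(t) Y_k
    for k, w in enumerate(weights):
        if w == 0.0:
            continue                                 # time point outside the kernel window
        at_risk = (T_star >= times[k]).astype(float) # I{T_i^* >= t_(k)}
        d = w * at_risk                              # diagonal of D_k(t)
        A += X[k].T @ (d[:, None] * X[k])
        b += X[k].T @ (d * Y[k])
    return np.linalg.solve(A, b)
```

With unit-spaced time points and $h = 1$, the Epanechnikov weight is zero at the neighboring grid points, so the estimate at a grid point uses only the data observed at that point.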
The traditional method for bandwidth selection is K-fold cross-validation (CV), but time-varying coefficients complicate its calculation. The ‘leave-one-subject-out’ cross-validation proposed by Rice and Silverman [15] can be used in such scenarios. In this case, the kernel weight function is $K(t - t_{(k)}; h)$ and the estimator of the time-varying coefficient $\beta(t)$ is $\hat\beta(t; h)$. The cross-validation bandwidth is obtained by minimizing
$$CV(h) = \sum_{k=1}^{N} \sum_{i=1}^{m} E\left[ I\{T_i^* \ge t_{(k)}\} \{ y_i(t_{(k)}) - x_i(t_{(k)})^T \hat\beta^{(-i)}(t_{(k)}; h) \} \right]^2,$$
where $\hat\beta^{(-i)}(t_{(k)}; h)$ is the kernel estimator computed with all measurements except those of the $i$th subject.
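The leave-one-subject-out criterion can be sketched in Python as follows. This is a minimal illustration, not the authors' R implementation: the expectation in $CV(h)$ is replaced by the plain empirical sum, the Epanechnikov kernel is assumed, and the helper names (`_kernel_weights`, `_beta_hat`, `cv_score`, `select_bandwidth`) are hypothetical.

```python
import numpy as np

def _kernel_weights(t, times, h):
    """Normalized Epanechnikov weights W_{k,h}(t - t_(k))."""
    u = (t - times) / h
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return k / k.sum()

def _beta_hat(t, times, Y, X, at_risk, h):
    """Kernel-weighted least squares estimate of beta(t)."""
    w = _kernel_weights(t, times, h)
    d = w[:, None] * at_risk                         # (N, m) diagonal entries of D_k(t)
    A = np.einsum('kmi,km,kmj->ij', X, d, X)
    b = np.einsum('kmi,km,km->i', X, d, Y)
    return np.linalg.solve(A, b)

def cv_score(h, times, Y, X, T_star):
    """Empirical leave-one-subject-out squared prediction error for bandwidth h."""
    at_risk = (T_star[None, :] >= times[:, None]).astype(float)   # I{T_i^* >= t_(k)}
    m = Y.shape[1]
    score = 0.0
    for i in range(m):
        keep = np.arange(m) != i                     # drop every measurement of subject i
        for k, t_k in enumerate(times):
            if at_risk[k, i] == 0.0:
                continue                             # subject i is not observed at t_(k)
            b = _beta_hat(t_k, times, Y[:, keep], X[:, keep], at_risk[:, keep], h)
            score += (Y[k, i] - X[k, i] @ b) ** 2
    return score

def select_bandwidth(candidates, times, Y, X, T_star):
    """Return the candidate bandwidth with the smallest CV score, plus all scores."""
    scores = [cv_score(h, times, Y, X, T_star) for h in candidates]
    return candidates[int(np.argmin(scores))], scores
```

The double loop refits the estimator once per subject and time point, so this sketch is only practical for small $m$ and $N$.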

2.2. Estimation of Survival Model

Define $m_i(t) = y_i(t) - \epsilon_i(t) = x_i(t)^T \beta(t)$; then the estimate of $m_i(t)$ is
$$\hat m_i(t) = x_i(t)^T \hat\beta(t).$$
The data related to the survival sub-model consist of $\{ (x_i(t), w_i, T_i^*, \delta_i);\ i = 1, 2, \dots, m \}$.
For each given $t \in \mathbb{R}$,
$$h_i(t) = h_0(t) \exp\{ w_i \gamma + \alpha m_i(t) \}.$$
The Cox partial maximum likelihood function [4] is
$$L_p(\gamma, \alpha) = \prod_{i=1}^{m} \left[ \frac{\exp\{ w_i \gamma + \alpha m_i(T_i^*) \}}{\int_0^{T_i^*} \exp\{ w_i \gamma + \alpha m_i(s) \}\, ds} \right]^{\delta_i}. \tag{5}$$
The log-partial likelihood function for Equation (5) is
$$LL(\theta) = \log L_p(\gamma, \alpha) = \sum_{i=1}^{m} \delta_i \left[ w_i \gamma + \alpha m_i(T_i^*) - \log \int_0^{T_i^*} \exp\{ w_i \gamma + \alpha m_i(s) \}\, ds \right]. \tag{6}$$
Then, replacing $m_i(t)$ by the estimator $\hat m_i(t)$, the estimator $\hat\theta = (\hat\gamma, \hat\alpha)$ can be obtained by maximizing Equation (6).
We define the survival function of $T$ as:
$$S_T(t) = P(T > t).$$
The estimate of $S_T(t)$ can be obtained from the hazard function in the joint model:
$$\hat S_T(t) = \exp\left[ -\int_0^t h_0(s) \exp\{ w^T \hat\gamma + \hat\alpha \hat m(s) \}\, ds \right].$$
Theoretically, the estimate $\hat S_T(t)$ depends on the values of the covariates $x(s)$ for $s \in [0, t]$. Since $x(t)$ is generally not observed continuously, $\hat S_T(t)$ cannot be derived even at the observed time points; it can be calculated only if $x(t)$ is observed continuously. Thus, we replace $\hat S_T(t)$ with the Kaplan–Meier estimator [16]. Alternatively, by the law of large numbers, the survival function can also be estimated as follows:
$$\hat S(t) = \frac{1}{m} \sum_{i=1}^{m} I\{T_i \ge t\}.$$
Moreover, the estimate $\hat S^*(t)$ of the survival function $S^*(t) = P(T^* > t)$ of $T^*$ can be obtained in a similar way.

2.3. Estimation of Cumulative Mean State Function

Suppose $\nu(t)$ is the mean function of $Y(t)$, that is, given $X(t)$, $\nu(t) = E\{Y(t) \mid X(t)\}$. Because of the censoring, the proposed estimator $\hat\nu(t)$ for the mean state function $\nu(t)$ at any time is as follows:
$$\hat\nu(t) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i^* \ge t\}\, \hat Y_i(t)}{\hat S^*(t)}, \tag{8}$$
where $\hat Y_i(t) = x_i(t)^T \hat\beta(t)$ is the fitted value of $Y_i(t)$ at time point $t$.
Then, the cumulative mean function $\mu(t)$ for any time point $t$ can be obtained as:
$$\hat\mu(t) = \int_0^t \hat\nu(s)\, ds. \tag{9}$$
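Equations (8) and (9) can be illustrated with a short Python sketch evaluated on the grid of observed time points. It assumes that $\hat S^*(t)$ is the empirical proportion of subjects with $T_i^* \ge t$, that the integral is approximated by the trapezoidal rule, and that $\hat\nu$ is flat on $[0, t_{(1)}]$; the function names are hypothetical and the paper's own code is in R.

```python
import numpy as np

def mean_state(times, Y_fit, T_star):
    """nu_hat(t_(k)): inverse-probability-weighted mean of the fitted values.

    times  : (N,)    observed time points t_(k)
    Y_fit  : (N, m)  fitted values Y_hat_i(t_(k)) = x_i(t_(k))^T beta_hat(t_(k))
    T_star : (m,)    observed times T_i^* = min(T_i, C_i)
    """
    at_risk = (T_star[None, :] >= times[:, None]).astype(float)   # I{T_i^* >= t_(k)}
    s_star = at_risk.mean(axis=1)                                   # empirical S^*_hat(t_(k))
    num = (at_risk * Y_fit).mean(axis=1)
    # Where nobody is at risk the estimator is undefined; return 0 there.
    return np.divide(num, s_star, out=np.zeros_like(num), where=s_star > 0)

def cumulative_mean(times, nu):
    """mu_hat(t_(k)): trapezoidal approximation of the integral of nu_hat over [0, t_(k)]."""
    grid = np.concatenate(([0.0], times))
    vals = np.concatenate(([nu[0]], nu))      # assumption: nu_hat is flat on [0, t_(1)]
    steps = np.diff(grid) * 0.5 * (vals[1:] + vals[:-1])
    return np.cumsum(steps)
```

With this choice of $\hat S^*(t)$, $\hat\nu(t_{(k)})$ reduces to the average fitted value among the subjects still under observation at $t_{(k)}$.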

2.4. The Asymptotic Property of Estimators

Here, we discuss the asymptotic property of estimators. To rigorously define the statements, we introduce various notations. Let $R(t)$ denote the observed covariate processes, such as baseline information, study time, and so on, that is, $R(t) = \{x(t), w\}$. Then, let $\bar Z_R(t)$ denote the longitudinal covariate history prior to time $t$ and $\bar Z_Y(t)$ denote the response history prior to time $t$, that is, $\bar Z_R(t) = \{R(s) : s < t\}$ and $\bar Z_Y(t) = \{Y(s) : s < t\}$. Furthermore, we use $\|\cdot\|$ to denote the Euclidean norm in real space and $R'(t)$ to denote the derivative of $R(t)$ with respect to time $t$. It is assumed that all observed time points $t_{ij}$ are independent of each other and follow a distribution $F_T$ with density $f_T$.
Assumption 1.
$X(t)$ is Lipschitz-continuous with order $\lambda_0$, $|E(X_l^T X_l) - E(X_s^T X_s)| \le C_0 |t_{(l)} - t_{(s)}|^{\lambda_0}$ for any $t_{(l)}$ and $t_{(s)}$ in the support of $f_T$ and some $C_0 > 0$, and $\beta(t)$ and $f_T$ are Lipschitz-continuous with orders of $\lambda_1 > 0$ and $\lambda_2 > 0$, respectively.
Assumption 2.
$o(t) = \lim_{\Delta t \to 0} E\{\epsilon(t + \Delta t)\epsilon(t)\}$ and $w(t) = E\{\epsilon^2(t)\}$ are continuous.
Assumption 3.
$W_{k,h}(\cdot)$ is square-integrable, integrates to one, and satisfies $\int u W_{k,h}(u)\, du = 0$, $m_1 = \int u^2 W_{k,h}(u)\, du < \infty$, and $m_2 = \int W_{k,h}^2(u)\, du < \infty$, while $h \to 0$ and $Nh \to \infty$ as $N \to \infty$.
Assumption 4.
The bandwidth satisfies $h = N^{-1/5} M_0$ for some constant $M_0$.
Assumption 5.
$\lim_{n \to \infty} N^{-6/5} \sum_{i=1}^{n} n_i = U$ for some $0 \le U < \infty$.
Theorem 1.
Under Assumptions (1)–(5), the estimator $\hat\beta(\tau)$ defined in Equation (4) is asymptotically multivariate normal for any $\tau \in [0, T]$ as $m \to \infty$.
The asymptotic normality of $\hat\beta(\tau)$ at a fixed point $\tau \in \mathbb{R}$ is established in Appendix A.
Remark 2.
Most assumptions are regularity conditions similar to those in Wu et al. [12]. Assumptions 1 and 2 are rigorous statements under which $\sum_{k=1}^{N} X_k^T \tilde{D}_k X_k$ is positive-definite and invertible asymptotically. Assumption 3 ensures that $W_{k,h}(\cdot)$ has compact support on $\mathbb{R}$. Assumptions 4 and 5 are results from finite moments in Assumptions 1 and 2. Further discussion of the asymptotic risk for the kernel estimators can be found in [12].
Theorem 2.
Assume that, for $t \in (0, T)$, $E(y_i(t) \mid x_i(t)) = E(Y(t) \mid X(t))$. Then the estimator $\hat\nu(t)$ defined in Equation (8) is an unbiased estimator for any $t \in \Delta$.
The proof can be found in Appendix B.
Based on Zeng and Cai [17], the following assumptions are imposed on the joint model.
Assumption 6.
For any $t \in [0, T]$, the covariate process $R(t)$ is fully observed and, conditional on $\bar Z_R(t)$, $\bar Z_Y(t)$, and $T \ge t$, the distribution of $R(t)$ depends only on $\bar Z_R(t)$. $R(t)$ is continuously differentiable in $[0, T]$ and $\max_{t \in [0, T]} \|R'(t)\| < \infty$ with a probability of one.
Assumption 7.
The censoring time $C$ depends only on $\bar Z_R(t)$ and $R(t)$ for any $t < T$, conditional on $\bar Z_R(t)$, $\bar Z_Y(t)$, $R(t)$, and $T \ge t$.
Assumption 8.
$X^T X$ is of full rank with positive probability. Additionally, if there exists a constant vector $C_0$ satisfying $X(t)^T C_0 = g(t)$ for a deterministic function $g(t)$ for all $t \in [0, T]$ with a positive probability, then $C_0 = 0$ and $g(t) = 0$.
Assumption 9.
The true value of the parameter $\theta = (\sigma^2, \gamma^T, \alpha)$ satisfies $\|\theta\| \le Q_0$ and $\sigma_0^2 > Q_0^{-1}$ for a known positive constant $Q_0$.
Assumption 10.
The baseline hazard function $h_0(t)$ is bounded and positive in $[0, T]$.
Assumption 11.
There exists a positive constant $a > 0$ satisfying $S_T(T) \ge a$.
Remark 3.
Assumption 6 serves as a fundamental statement in joint models, indicating that the association between the history process and the survival time is due to observed covariate processes, such as baseline information, study time, and so on, denoted by $R(t)$. Assumption 7 means that there exist some appropriate measures such that the intensity function of $N^c(t)$ exists. Assumption 8 is the identifiability assumption in a linear mixed-effects model. Assumptions 9–11 imply that, conditional on $\bar Z_R(t)$ and $R(t)$, the probability of a subject surviving after time $\tau$ is at least some positive constant. Theorem 3.1 in Zeng and Cai [17] states the strong consistency of the maximum likelihood estimator. Further discussion of the assumptions can be found in [17].
Theorem 3.
Under Assumptions (1)–(11), the estimators $\hat\nu(t)$ defined in Equation (8) and $\hat\mu(t)$ defined in Equation (9) are consistent for any $t \in [0, T]$.
The proof can be found in Appendix C.

3. Simulation

In this section, some numerical results are presented. In our simulation, the joint model can be described as:
$$Y(t) = x(t)\beta(t) + \epsilon(t), \qquad h(t) = h_0(t) \exp\{ \gamma w + \alpha (Y(t) - \epsilon(t)) \},$$
where $Y(t)$ is the state function, $x(t) = (1, x_1(t), \dots, x_p(t))$ is the vector of covariates for the regression parameters, $\beta(t) = (\beta_0(t), \beta_1(t), \dots, \beta_p(t))^T$ is the vector of time-varying coefficients, $w$ is the covariate for the regression parameter $\gamma$, $\alpha$ is the association parameter, and $\epsilon(t) \sim N(0, 0.1)$.
The standard deviation (Std.dev) and the root mean square error (RMSE), with $\mathrm{RMSE} = \big[ \frac{1}{N} \sum_{k=1}^{N} ( \hat g(t_{(k)}) - g(t_{(k)}) )^2 \big]^{1/2}$, of the overall estimates calculated by R 4.3.1 [18] are used to assess the quality of the estimators. We summarize the steps in the following procedure:
  • Set the sample size $n$, the true function $\beta(t) = (\beta_0(t), \beta_1(t), \beta_2(t))$, the true values of the parameters $\alpha$ and $\gamma$, and the censoring rate $r$;
  • Generate a random sample $g_i \sim U[0, 1]$;
  • Derive the random sample $v_i$ of lifetimes from the hazard function $h(t)$;
  • Generate a random sample of censoring times $c_i \sim U[a, b]$;
  • Set $t_i = \min\{v_i, c_i\}$ and $\delta_i = I\{v_i \le c_i\}$;
  • Generate the random sample of the time-dependent covariates $x_1(s)$, $x_2(s)$, and the baseline hazard function $h_0(s)$;
  • For $s \in \{1, 2, \dots, t_i\}$, generate the response variables $y(s) = \beta_0 + \beta_1 x_1(s) + \beta_2 x_2(s)$.
Output the estimated functions $\hat\beta(t)$ and $\hat\mu(t)$, the estimated values of the parameters $\hat\alpha$ and $\hat\gamma$, the bias and the Std. Err. of the parameters $\alpha$ and $\gamma$, and the RMSE of the estimated functions $\hat\nu(t)$ and $\hat\mu(t)$.
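The data-generating steps above can be sketched for a single subject as follows. This is an illustrative Python version rather than the R code used in the paper. It borrows the covariate distributions of Scenario 1, draws the covariates first because the hazard depends on them, inverts the cumulative hazard on a unit-spaced grid, adds the model error $\epsilon(t)$ with variance 0.1 to the responses, and takes the coefficient function `beta` as a user-supplied argument. The function name, argument names, and return layout are all assumptions.

```python
import numpy as np

def simulate_subject(rng, beta, gamma, alpha, h0, cens_range, t_max, sigma2=0.1):
    """Simulate one subject on the unit-spaced grid 1, 2, ..., t_max.

    beta       : callable mapping a time grid (G,) to coefficients (G, 3)
    cens_range : (a, b) for the censoring time c ~ U[a, b]; requires b <= t_max
    """
    if cens_range[1] > t_max:
        raise ValueError("censoring range must lie within the time grid")
    grid = np.arange(1, t_max + 1, dtype=float)
    w = rng.uniform(0.0, 1.0)                               # time-independent covariate
    r = rng.binomial(1, 0.5)                                # r ~ Bernoulli(0.5)
    x = np.column_stack([np.ones_like(grid),
                         r * (1.5 * np.sin(grid) + 1.0),    # x_1(t)
                         np.full_like(grid, rng.uniform(1.0, 2.0))])  # x_2 ~ U[1, 2]
    m = np.sum(x * beta(grid), axis=1)                      # m(t) = x(t)^T beta(t)
    hazard = h0 * np.exp(gamma * w + alpha * m)
    cum_hazard = np.cumsum(hazard)                          # Riemann sum with unit steps
    g = rng.uniform()
    # Inverse transform: S(v) = g  <=>  H(v) = -log(g); take the first grid time reaching it.
    idx = np.searchsorted(cum_hazard, -np.log(g))
    v = grid[idx] if idx < grid.size else np.inf            # lifetime (inf: beyond the grid)
    c = rng.uniform(*cens_range)                            # censoring time
    t_obs = min(v, c)
    delta = int(v <= c)
    n_obs = int(np.sum(grid <= t_obs))                      # observed time points up to t_obs
    y = m[:n_obs] + rng.normal(0.0, np.sqrt(sigma2), n_obs) # responses with error
    return {"t_obs": t_obs, "delta": delta, "w": w, "x": x[:n_obs], "y": y}
```

The bounds $a$ and $b$ of the censoring distribution can then be tuned until the empirical proportion of subjects with $\delta_i = 0$ is close to the target censoring rate.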
We utilize packages of ‘MASS’, ‘splines’, ‘survival’, ‘nlme’, ‘JM’, ‘lattice’, ‘mvtnorm’, ‘tibble’ and ‘ggplot2’ in our simulation study.
Now, we consider the following scenarios:
Scenario 1: Set x ( t ) = ( 1 , x 1 , x 2 ) , x 1 = r ( 1.5 sin ( t ) + 1 ) where r B e r n o u l i ( 0.5 ) , x 2 U [ 1 , 2 ] , β ( t ) = ( β 0 , β 1 , β 2 ) T , β 0 ( t ) = 1.5 t , β 1 ( t ) = 1.2 t 0.5 , β 2 ( t ) = t 0.2 , w U [ 0 , 1 ] , γ = 1.0 , α = 0.25 , and h 0 ( t ) = h 0 .
Scenario 2: Set x ( t ) = ( 1 , x 1 , x 2 ) , x 1 = r log ( t ) where r N ( 1 , 0.5 ) , x 2 e x p ( 0.5 ) , β ( t ) = ( β 0 , β 1 , β 2 ) T , β 0 ( t ) = t , β 1 ( t ) = t 0.5 , β 2 ( t ) = sin ( t ) , w U [ 0 , 1 ] , γ = 1.0 , α = 0.15 , and h 0 ( t ) = h 0 .
The following is the form based on the joint model:
$$Y(t) = \beta_0(t) + \beta_1(t) x_1(t) + \beta_2(t) x_2 + \epsilon(t), \qquad h(t) = h_0(t) \exp\{ \gamma w + \alpha ( \beta_0(t) + \beta_1(t) x_1(t) + \beta_2(t) x_2 ) \}.$$
After 1000 replications in R, we obtain the results of the estimation at different sample sizes. In each scenario, we keep the censoring rate at about 25%.
In Scenario 1, we set the true values of parameters as α = 0.25 and γ = 1.0. In Scenario 2, we set the true values of parameters as α = 0.15 and γ = 1.0. From the proposed estimators given in Equations (8) and (9), fitted values of the state function $\nu(t)$ and the cumulative mean function $\mu(t)$ at any time point can be computed.
Table 1 and Table 2 present the estimates of the root mean square errors (RMSEs) of the state function $\nu(t)$ and the cumulative mean function $\mu(t)$ with different numbers of time points $N$ for the two scenarios. The RMSE is small when $N = 106$ because the estimators of $\nu(t)$ are unbiased at observed time points. Since the estimators are biased at other time points, the RMSE increases as $N$ increases.
Table 3 and Table 4 summarize the main findings for the fixed parameters. The results show that as the sample size increases, the bias and model-based standard errors decrease, which agrees with the empirical results reasonably well. The performance improves with larger sample size. Note that it is common for the Std. Err. of γ to be very large, sometimes reaching 0.2 in small samples for linear mixed-effects models (see Table 1 in [5]). In the semi-parametric model estimation based on polynomial regression, the Std. Err. of γ even reaches 0.3 (see Tables 1 and 2 in [19]). Compared with these methods, the Std. Err. of γ in our paper is smaller.
Figure 1 and Figure 2 present the proposed estimated functions of $\beta(t)$ as continuous curves and the true values of $\beta(t)$ at time points $t \in \Delta$ as series of points. Figure 3 and Figure 4 present the estimated functions of $\beta(t)$ using the local polynomial regression method. In Scenario 1, our method is not significantly superior to polynomial regression. However, in Scenario 2, the polynomial regression method does not fit $\beta_2(t)$ well, whereas our method yields an estimate $\hat\beta_2(t)$ close to the true value. This is because $\beta_2(t)$ is set up as a trigonometric function rather than a linear combination of power functions. It can be seen that while the local polynomial regression method is suitable for power basis functions, the kernel smoothing method fits the function better, even in such cases.
Furthermore, Figure 5 and Figure 6 show the true curves and fitted values of the state function $\nu(t)$ and the cumulative mean function $\mu(t)$. In each figure, the estimated functions are close to the values of the true functions. Figure 7 and Figure 8 show the results of the local polynomial regression method. As in the previous comparison, the proposed method works better.

4. An Application to MADIT Data

In this section, we validate the proposed estimator with a real data set from a multicenter automatic defibrillator implantation trial (MADIT). MADIT data contain 181 subjects (patients) from 36 centers in the USA that were observed at a total of 134,853 discrete time points. Of the 181 patients, 89 chose to implant cardiac defibrillators, while the other 92 did not. Throughout this section, we encode the ‘implanted’ group as ICD = 1 and the ‘not implanted’ group as ICD = 0. Since the decision whether to implant a cardiac defibrillator (ICD) did not directly induce any medical costs but did affect the expected survival time, we treat it as the time-independent covariate in the survival sub-model.
The observed patients have six types of costs that can be observed daily from the start to the death time or censoring time: Type 1: hospitalization and emergency department visits; Type 2: outpatient tests and procedures; Type 3: physician/specialist visits; Type 4: community services; Type 5: medical supplies; Type 6: medication. These costs directly drive medical costs, so they are treated as time-dependent covariates in the longitudinal process sub-model. It should be pointed out that the full set of daily observations contains too many data points for our R code to process. Therefore, we merge the data by combining 12 days into one time unit.
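The merging step can be sketched in Python as follows. The text does not say whether the 12 daily values are summed or averaged, nor how an incomplete final block is handled; this example assumes summation and pads the last block with zeros. The function name `merge_daily_costs` is hypothetical.

```python
import numpy as np

def merge_daily_costs(daily, days_per_unit=12):
    """Aggregate daily costs into blocks of `days_per_unit` days by summation.

    daily : (D,) or (D, n_types) array of daily costs for one patient.
    Returns an array with ceil(D / days_per_unit) rows, one per merged time unit.
    """
    daily = np.asarray(daily, dtype=float)
    n_days = daily.shape[0]
    n_units = -(-n_days // days_per_unit)                 # ceiling division
    padded = np.zeros((n_units * days_per_unit,) + daily.shape[1:])
    padded[:n_days] = daily                               # zero-pad the incomplete last block
    blocks = padded.reshape((n_units, days_per_unit) + daily.shape[1:])
    return blocks.sum(axis=1)
```

For a patient followed for 25 days, for example, this yields three time units: days 1 to 12, days 13 to 24, and a final unit containing day 25 only.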
These data also include the patients’ ID codes, observed survival times in days, and death indicators.
Now, in the data set, we encode the total 181 patients as follows:
  • ID code (from 1 to 181);
  • Treatment code (1 for ICD and 0 for conventional);
  • Observation of survival time;
  • Death indicator (1 for death, 0 for censored);
  • Merged medical costs of types 1–6.
To analyze this data set, we describe the model as follows:
$$Y_i(t) = \beta_0(t) + \beta_1(t) x_{1i}(t) + \beta_2(t) x_{2i}(t) + \beta_3(t) x_{3i}(t) + \beta_4(t) x_{4i}(t) + \beta_5(t) x_{5i}(t) + \beta_6(t) x_{6i}(t) + \epsilon_i(t),$$
$$h_i(t) = h_0(t) \exp\{ \gamma w_i + \alpha ( Y_i(t) - \epsilon_i(t) ) \},$$
where, for $r = 1, 2, \dots, 6$, $x_{ri}(t) = 1$ if type $= r$ and $x_{ri}(t) = 0$ otherwise, and $w_i = 1$ for the ICD group and $w_i = 0$ for the not implanted group. The estimated parameters $\hat\gamma$ and $\hat\alpha$ in the survival sub-model are obtained by the Cox partial maximum likelihood method and are asymptotically normal. The estimates of γ and α in this paper are obtained by calling ‘coxph()’ and setting ‘method = piecewise-PH-GH’ in the JM package. The Std. Err. and p-value are produced automatically.
In this case, $\beta(t)$ can be interpreted as the relationship between the natural logarithm of medical cost and the time unit. The fitted curve is illustrated in Figure 9. Time-varying parameters describe the effect of covariates on the state function over time more accurately. Consider $\beta_5(t)$, for example: after a certain time point, $x_5(t)$ no longer affects the state function, although it does in the early stage.
Table 5 shows the estimates of the survival sub-model. The association effect α is positive, which corresponds to reality, that is, patients in a serious condition require more medical attention. In other words, serious illness leads to higher medical costs, corresponding to lower survival rates. This supports the view that our model is reasonable. However, because the p-value of this result is large, the conclusion is not definitive and should be tested further using a larger sample in future research. The treatment effect γ is negative, which also corresponds to reality, that is, implanting an automatic defibrillator can reduce the risk of death. In other words, the ‘implanted’ group (ICD = 1) has a lower risk of death. This conclusion is supported by the small p-value of the result.
Table 6 presents the estimated values of the cumulative cost for 5 years and for the total treatment period. Comparing Sub-figure (a) with Sub-figure (b) in Figure 10 and Figure 11, we see that the fitted points for the mean medical cost based on the current approach better describe the detailed trajectories of medical costs. The fitted points of the cumulative mean medical cost based on the current approach exhibit a trend similar to Li’s [19] result; however, the result is more accurate and shows continuous change, which, in turn, leads to an improved understanding of real-world data.
Compared with parametric/semi-parametric estimation, our method suggests a smoother time-varying mean medical cost function, which is no longer a straight line with an observable change rate, but exhibits more fluctuations over time. In practical situations, government agencies or insurance companies would not know which treatment was selected by a patient. The estimated cumulative mean function provides them with a more accurate statistical decision recommendation for macro allocation. For example, we should consider the additional changes in medical costs because they would suddenly increase or decrease at distinct time points.
Moreover, the estimated marginal survival function is smoother than the Kaplan–Meier estimator. The result is similar to that of Li et al. [19], as shown in Figure 12. The solid line is the estimated survival rate, and the dotted lines represent the 95% confidence interval of the estimated survival rate.

5. Discussion

Generally, some estimation methods can lead to biased estimates in the model, and the time-varying coefficient model compensates for this shortcoming. In this study, we provide an estimation of the time-varying coefficient using the kernel function technique. When the covariates are continuous, a continuous state function can be obtained, from which the continuous cumulative mean state function can be derived. In summary, our method is more precise because the kernel function smooths the estimator; more importantly, the proposed estimator is unbiased at the originally observed time points, as Theorem 2 shows. In our numerical simulation and application analysis, we take the observation interval to be one time unit and choose a bandwidth of $h = 1$ in the kernel smoothing function. Notably, when the observation time interval is not 1 or the data involve interval censoring, an appropriate bandwidth must be chosen. Many studies on kernel functions have proposed methods for choosing the bandwidth. In general, too large a bandwidth distorts the data, and too small a bandwidth does not lead to a smooth continuous function. Moreover, as $h \to \infty$, $\beta(t)$ turns out to be a series of constants, that is, a time-independent parametric estimator.
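The bandwidth remarks above can be illustrated numerically. The following is a minimal sketch, assuming a single covariate, noise-free responses, and an Epanechnikov kernel; the kernel, the data, and the names `epanechnikov` and `beta_hat` are hypothetical choices for illustration and are not the simulation design used in this paper.

```python
def epanechnikov(u):
    """Symmetric kernel supported on (-1, 1); an assumed choice for this sketch."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def beta_hat(t0, times, xs, ys, h):
    """Kernel-weighted least squares for y = beta(t) * x with one covariate.

    times[k] is the k-th observation time; xs[k] and ys[k] hold the covariate
    and response values of all subjects observed at that time.
    """
    num = 0.0
    den = 0.0
    for t_k, x_k, y_k in zip(times, xs, ys):
        w = epanechnikov((t0 - t_k) / h) / h  # kernel weight of time point t_k
        for x, y in zip(x_k, y_k):
            num += w * x * y
            den += w * x * x
    return num / den


# Hypothetical noise-free data on a unit grid: three subjects per time point.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
beta_true = [1.0, 2.0, 4.0, 3.0, 1.5, 0.5]
xs = [[1.0, 2.0, 3.0] for _ in times]
ys = [[b * x for x in x_k] for b, x_k in zip(beta_true, xs)]

at_grid = [beta_hat(t, times, xs, ys, h=1.0) for t in times]     # h = 1, observed times
between = beta_hat(0.5, times, xs, ys, h=1.0)                     # h = 1, between two times
wide = [beta_hat(t, times, xs, ys, h=1000.0) for t in times]      # very large bandwidth

print(at_grid)
print(between)
print(wide)
```

With $h = 1$ on a unit grid, neighbouring time points receive zero weight under this kernel, so the estimate at each observed time uses only the data at that time and, here, recovers the true coefficient. Between two observed times the estimate interpolates, and with a very large bandwidth all time points are weighted almost equally, so the estimate is nearly the same constant at every time.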
In this study, we do not consider the random effect coefficients in the longitudinal data model, and the correlation that may exist among repeated observations within individuals is ignored. If we consider the random effect coefficients, we will not obtain Theorem 2, but we can still obtain Theorem 3. In future studies, random effects should be considered in longitudinal models, which may yield better results.
Furthermore, the estimation of the time-varying coefficient using the kernel function technique can be extended to multiple longitudinal outcomes, and the justification for the survival sub-model would need a multi-dimensional association coefficient. The methods proposed in [20] may be applicable.

Author Contributions

Conceptualization, S.L., D.D. and Y.H.; methodology, S.L. and D.D.; software, S.L. and D.D.; validation, D.D. and Y.H.; formal analysis, S.L.; investigation, S.L.; resources, D.D.; data curation, S.L. and D.D.; writing—original draft preparation, S.L.; writing—review and editing, D.D.; visualization, Y.H.; supervision, D.D. and Y.H.; project administration, D.D.; funding acquisition, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

Han’s work is partially supported by the National Natural Science Foundation of China (grant number 11871244); Deng’s work is partially supported by the Natural Sciences and Engineering Research Council of Canada.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. The Asymptotic Normality of $\hat{\beta}(\tau)$

For each $t_{(l)} \in \Delta$, denote
$$\beta(t_{(l)}) = \left(\beta_1(t_{(l)}), \beta_2(t_{(l)}), \ldots, \beta_q(t_{(l)})\right).$$
For all $r, z = 0, \ldots, q$, denote
$$b_{rz}(t_{(l)}) = E\left[x_{ir}(t_{(l)})\, I\{T_m^* \ge t_{(l)}\}\, x_{iz}(t_{(l)})\right];$$
$$u_r(t_{(l)}) = M_0^{3/2} \sum_{z=1}^{q} m_1 \left[\beta_z'(t_{(l)})\, b_{rz}'(t_{(l)})\, f(t_{(l)}) + \beta_z'(t_{(l)})\, b_{rz}(t_{(l)})\, f'(t_{(l)}) + \tfrac{1}{2}\, \beta_z''(t_{(l)})\, b_{rz}(t_{(l)})\, f(t_{(l)})\right];$$
$$G(t_{(l)}) = \left(f(t_{(l)})\right)^{-1} E\left[X_l^T X_l\right]^{-1}(t_{(l)})\, \left(u_0(t_{(l)}), \ldots, u_q(t_{(l)})\right);$$
$$V_{rz}(t_{(l)}) = m_2(t_{(l)})\, b_{rz}(t_{(l)})\, f(t_{(l)})\, m_2 + U M_0\, o_{\epsilon}(t_{(l)})\, b_{rz}(t_{(l)})\, f^{2}(t_{(l)});$$
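$$V(t_{(l)}) = \begin{pmatrix} D_{00}(t_{(l)}) & \cdots & D_{0q}(t_{(l)}) \\ \vdots & \ddots & \vdots \\ D_{q0}(t_{(l)}) & \cdots & D_{qq}(t_{(l)}) \end{pmatrix};$$
$$W(t_{(l)}) = \left(f(t_{(l)})\right)^{-2} E\left[X_l^T X_l\right]^{-1} V(t_{(l)})\, E\left[X_l^T X_l\right]^{-1}.$$
Then,
$$(mh)^{1/2} \left(\hat{\beta}(t_{(l)}) - \beta(t_{(l)})\right) \xrightarrow{d} N\left(G(t_{(l)}), W(t_{(l)})\right).$$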

Appendix B. Proof of Theorem 2

Proof of Theorem 2. 
For each $t_{(l)} \in \Delta$,
$$\hat{\beta}(t_{(l)}) = \left(\sum_{k=1}^{N} X_k^T \tilde{D}_k(t_{(l)}) X_k\right)^{-1} \sum_{k=1}^{N} X_k^T \tilde{D}_k(t_{(l)}) Y_k,$$
where
$$\tilde{D}_k(t_{(l)}) = \mathrm{diag}\left(W_{k,h}(t_{(l)} - t_{(k)})\, I\{T_1^* \ge t_{(l)}\}, \ldots, W_{k,h}(t_{(l)} - t_{(k)})\, I\{T_m^* \ge t_{(l)}\}\right).$$
When $t_{(l)} = t_{(k)}$,
$$\tilde{D}_l(t_{(l)}) = \mathrm{diag}\left(W_{k,h}(t_{(l)} - t_{(l)})\, I\{T_1^* \ge t_{(l)}\}, \ldots, W_{k,h}(t_{(l)} - t_{(l)})\, I\{T_m^* \ge t_{(l)}\}\right) = \mathrm{diag}\left(I\{T_1^* \ge t_{(l)}\}, \ldots, I\{T_m^* \ge t_{(l)}\}\right),$$
and when $t_{(l)} \ne t_{(k)}$,
$$\tilde{D}_k(t_{(l)}) = \mathrm{diag}\left(W_{k,h}(t_{(l)} - t_{(k)})\, I\{T_1^* \ge t_{(l)}\}, \ldots, W_{k,h}(t_{(l)} - t_{(k)})\, I\{T_m^* \ge t_{(l)}\}\right) = \mathrm{diag}(0, \ldots, 0).$$
Hence, $\hat{\beta}(t_{(l)})$ can be written as
$$\hat{\beta}(t_{(l)}) = \left(X_l^T \tilde{D}_l(t_{(l)}) X_l\right)^{-1} X_l^T \tilde{D}_l(t_{(l)}) Y_l = \left(\sum_{i=1}^{m} x_i^T(t_{(l)})\, \tilde{D}_l(t_{(l)})\, x_i(t_{(l)})\right)^{-1} \sum_{i=1}^{m} x_i^T(t_{(l)})\, \tilde{D}_l(t_{(l)})\, y_i(t_{(l)}).$$
Under the assumption that $\epsilon(t)$ is independent of $X(t)$ and $E\,\epsilon(t) = 0$, the Gauss–Markov theory gives
$$E\,\hat{\beta}(t_{(l)}) = \beta(t_{(l)}). \tag{A1}$$
Combining Equations (7) and (8), the estimator $\hat{\nu}(t_{(l)})$ of the mean state function at time point $t_{(l)}$ is
$$\hat{\nu}(t_{(l)}) = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, \hat{Y}_i(t_{(l)})}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}},$$
where $\hat{Y}_i(t_{(l)})$ is the fitted value of $Y_i(t_{(l)})$ at time point $t_{(l)}$ from the joint model:
$$\hat{Y}_i(t_{(l)}) = x_i(t_{(l)})\, \hat{\beta}(t_{(l)}).$$
By the assumption $E(y_i(t) \mid x_i(t)) = E(Y(t) \mid X(t))$, we have
$$\nu(t_{(l)}) = E\left(Y(t_{(l)}) \mid X(t_{(l)})\right) = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, E\left(Y(t_{(l)}) \mid X(t_{(l)})\right)}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}} = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, E\left(y_i(t_{(l)}) \mid x_i(t_{(l)})\right)}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}} = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, x_i(t_{(l)})\, \beta(t_{(l)})}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}}.$$
Thus, for any $t_{(l)} \in \Delta$, from Equation (A1), we have
$$E\,\hat{\nu}(t_{(l)}) - \nu(t_{(l)}) = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, E\left[x_i(t_{(l)})\, \hat{\beta}(t_{(l)})\right]}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}} - \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, x_i(t_{(l)})\, \beta(t_{(l)})}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}} = \frac{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}\, x_i(t_{(l)})}{\sum_{i=1}^{m} I\{T_i^* \ge t_{(l)}\}} \left(E\,\hat{\beta}(t_{(l)}) - \beta(t_{(l)})\right) = 0.$$
This completes the proof of Theorem 2.
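The unbiasedness statement can be checked by simulation. The following is a minimal Monte Carlo sketch at a single observed time point, assuming one fixed covariate per subject, a known at-risk set, and independent standard normal errors; all numerical settings and the name `simulate_mean_state_bias` are hypothetical and are not taken from the simulation study of this paper.

```python
import random


def simulate_mean_state_bias(reps=2000, seed=0):
    """Average of nu_hat - nu over repeated samples at one observed time point."""
    rng = random.Random(seed)
    beta = 2.0                                    # true coefficient at this time point
    x = [1.0 + i / 49.0 for i in range(50)]       # fixed covariates in [1, 2]
    at_risk = [i < 30 for i in range(50)]         # indicator I{T_i^* >= t_(l)}
    n_risk = sum(at_risk)
    nu_true = sum(xi * beta for xi, r in zip(x, at_risk) if r) / n_risk

    total = 0.0
    for _ in range(reps):
        # Responses with zero-mean errors that are independent of the covariates.
        y = [xi * beta + rng.gauss(0.0, 1.0) for xi in x]
        # Least squares restricted to the subjects still at risk.
        num = sum(xi * yi for xi, yi, r in zip(x, y, at_risk) if r)
        den = sum(xi * xi for xi, r in zip(x, at_risk) if r)
        beta_hat = num / den
        # Mean of the fitted values over the at-risk subjects.
        nu_hat = sum(xi * beta_hat for xi, r in zip(x, at_risk) if r) / n_risk
        total += nu_hat - nu_true
    return total / reps


average_bias = simulate_mean_state_bias()
print(f"average of nu_hat - nu over replications: {average_bias:.4f}")
```

If the estimator is unbiased, as Theorem 2 states, the reported average should be close to zero relative to the Monte Carlo error.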

Appendix C. Proof of Theorem 3

Proof of Theorem 3. 
From Equation (8), we have
$$\begin{aligned} \hat{\nu}(t) - \nu(t) &= \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{\hat{S}_T(t)}\, \hat{Y}_i(t) - E\,Y(t) \\ &= \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{\hat{S}_T(t)\, S_T(t)}\, \hat{Y}_i(t) \left[S_T(t) - \hat{S}_T(t)\right] + \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)} \left(\hat{Y}_i(t) - Y_i(t)\right) + \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)}\, Y_i(t) - E\,Y(t) \\ &= (I) + (II) + (III). \end{aligned}$$
Under the assumption that $\epsilon(t)$ is independent of $T$ and $C$ conditional on $X(t)$ and $W$, from the law of large numbers,
$$(III) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)}\, Y_i(t) - E\,Y(t) \to 0 \quad \text{a.s. as } m \to \infty.$$
For the second term $(II)$, we have
$$(II) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)} \left(\hat{Y}_i(t) - Y_i(t)\right) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)}\, x_i(t) \left(\hat{\beta}(t) - \beta(t)\right) + \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)} \left(-\epsilon_i(t)\right) \triangleq (IV) + (V).$$
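By Assumption (6), $\|x_i(t)\| \le C_0$ for some positive constant $C_0$. By Theorem 3.1 in Zeng and Cai [17] and Assumption (11), we have that
$$(IV) \le \frac{1}{m} \sum_{i=1}^{m} \left|\frac{I\{T_i \ge t\}}{S_T(t)}\right| \|x_i(t)\|\, \left\|\hat{\beta}(t) - \beta(t)\right\| \le \frac{C_0}{a^2} \left\|\hat{\beta}(t) - \beta(t)\right\| \quad \text{a.s. as } m \to \infty.$$
By Theorem 1 in Wu et al. [12] and Assumptions (1)–(5), we have that
$$(IV) \le \frac{C_0}{a^2} \left\|\hat{\beta}(t) - \beta(t)\right\| \to 0 \quad \text{a.s. as } m \to \infty.$$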
Also, from the law of large numbers,
$$(V) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{S_T(t)} \left(-\epsilon_i(t)\right) \to E\left[\frac{I\{T \ge t\}}{S_T(t)} \left(-\epsilon(t)\right)\right] = \frac{E\left(I\{T \ge t\}\right)}{S_T(t)}\, E\left(-\epsilon(t)\right) = 0 \quad \text{a.s. as } m \to \infty.$$
Similarly, by Assumption (9), Theorem 3.1 in Zeng and Cai [17], and Theorem 2 in Phadia and Ryzin [21], we have that
$$(I) = \frac{1}{m} \sum_{i=1}^{m} \frac{I\{T_i \ge t\}}{\hat{S}_T(t)\, S_T(t)}\, x_i(t)\, \hat{\beta}(t) \left[S_T(t) - \hat{S}_T(t)\right] \le \frac{1}{m} \sum_{i=1}^{m} \left|\frac{I\{T_i \ge t\}}{\hat{S}_T(t)\, S_T(t)}\right| \left\|x_i(t)\, \hat{\beta}(t)\right\| \left|S_T(t) - \hat{S}_T(t)\right| \le \frac{M_0 C_0^2}{a^4} \left|S_T(t) - \hat{S}_T(t)\right| \to 0 \quad \text{a.s. as } m \to \infty.$$
Now, $\hat{\nu}(\tau)$ converges to $\nu(\tau)$ almost uniformly in $\tau \in [0, T]$.
Then, based on the Egoroff theorem in Bartle [22], $\hat{\mu}(\tau)$ converges to $\mu(\tau)$ in probability. □
This completes the proof of Theorem 3.
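The law-of-large-numbers step used for the survival-weighted average can be illustrated by simulation. The following is a minimal sketch, assuming no censoring, exponential survival times with rate 1, and responses that are independent of the survival time with mean 3; these distributions and the name `weighted_mean_estimate` are hypothetical and serve only to show the weighting at work.

```python
import math
import random


def weighted_mean_estimate(m, t=0.5, seed=1):
    """Survival-weighted averages of Y over subjects with T_i >= t.

    Returns the average that uses the true survival probability S_T(t) and the
    plug-in version that uses the empirical survival fraction instead.
    """
    rng = random.Random(seed)
    surv_true = math.exp(-t)          # S_T(t) for T ~ Exp(1)
    sum_weighted = 0.0
    sum_y_at_risk = 0.0
    n_at_risk = 0
    for _ in range(m):
        surv_time = rng.expovariate(1.0)
        y = rng.gauss(3.0, 1.0)       # response independent of the survival time
        if surv_time >= t:
            sum_weighted += y / surv_true
            sum_y_at_risk += y
            n_at_risk += 1
    est_known_s = sum_weighted / m
    surv_hat = n_at_risk / m          # empirical survival fraction (no censoring)
    est_plugin = (sum_y_at_risk / m) / surv_hat if n_at_risk else float("nan")
    return est_known_s, est_plugin


for m in (1_000, 100_000):
    known, plugin = weighted_mean_estimate(m)
    print(f"m = {m:>7}: known S_T -> {known:.3f}, plug-in S_T -> {plugin:.3f}")
```

Both averages should approach the assumed mean of 3 as $m$ grows. Without censoring, the empirical survival fraction coincides with the Kaplan–Meier estimate, so the plug-in version reduces to the sample mean of the responses among the subjects still at risk.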

References

1. Bang, H.; Tsiatis, A. Estimating medical costs with censored data. Biometrika 2000, 87, 329–343.
2. Lin, D.Y.; Feuer, E.J.; Etzioni, R.; Wax, Y. Estimating medical costs from incomplete follow-up data. Biometrics 1997, 53, 419–434.
3. Huang, Y.; Lovato, L. Tests for lifetime utility or cost via calibrating survival time. Stat. Sin. 2002, 12, 707–723.
4. Rizopoulos, D. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R; CRC Press: Boca Raton, FL, USA, 2012.
5. Deng, D. Estimating the cumulative mean function for history process with time-dependent covariates and censoring mechanism: Estimating the cumulative mean function for history process. Stat. Med. 2016, 35, 4624–4636.
6. Korn, E.L. On estimating the distribution function for quality of life in cancer clinical trials. Biometrika 1993, 80, 535–542.
7. Hoover, D.R.; Rice, J.A.; Wu, C.O.; Yang, L. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika 1998, 85, 809–822.
8. Zhao, X.; Tong, X.; Sun, L. Joint analysis of longitudinal data with dependent observation times. Stat. Sin. 2012, 22, 317–336.
9. Li, C.; Xiao, L.; Luo, S. Joint model for survival and multivariate sparse functional data with application to a study of Alzheimer’s Disease. Biometrics 2022, 78, 435–447.
10. Do, H.; Nandi, S.; Putzel, P.; Smyth, P.; Zhong, J. A joint fairness model with applications to risk predictions for under-represented populations. Biometrics 2022, 79, 826–840.
11. Eubank, R.L.; Speckman, P.L. Confidence bands in nonparametric regression. J. Am. Stat. Assoc. 1993, 88, 1287–1301.
12. Wu, C.O.; Chiang, C.T.; Hoover, D.R. Asymptotic confidence regions for kernel smoothing of a varying-coefficient model with longitudinal data. J. Am. Stat. Assoc. 1998, 93, 1388–1402.
13. You, L.; Qiu, P. Joint modeling of multivariate nonparametric longitudinal data and survival data: A local smoothing approach. Stat. Med. 2021, 40, 6689–6706.
14. Silverman, B.W. Density Estimation: For Statistics and Data Analysis; Chapman & Hall: London, UK, 2018.
15. Rice, J.A.; Silverman, B.W. Estimating the Mean and Covariance Structure Nonparametrically When the Data are Curves. J. R. Stat. Soc. Ser. B Methodol. 1991, 53, 233–243.
16. Kaplan, E.L.; Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 1958, 53, 457–481.
17. Zeng, D.; Cai, J. Asymptotic results for maximum likelihood estimators in joint analysis of repeated measurements and survival time. Ann. Stat. 2005, 33, 2132–2163.
18. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023.
19. Li, S.; Deng, D.; Han, Y.; Zhang, D. Joint model for estimating the asymmetric distribution of medical costs based on a history process. Symmetry 2023, 15, 2130.
20. Kenyon, J.R. Analysis of Multivariate Survival Data. Technometrics 2002, 44, 86–87.
21. Phadia, E.G.; Ryzin, J.V. A note on convergence rates for the product limit estimator. Ann. Stat. 1980, 8, 673–678.
22. Bartle, R.G. The Elements of Integration and Lebesgue Measure; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2011.
Figure 1. True values at observed time points and estimated function of $\beta(t)$ for scenario 1.
Figure 2. True values at observed time points and estimated function of $\beta(t)$ for scenario 2.
Figure 3. Polynomial regression estimator of $\beta(t)$ for scenario 1.
Figure 4. Polynomial regression estimator of $\beta(t)$ for scenario 2.
Figure 5. True values at observed time points and estimated function of $v(t)$ and $\mu(t)$ for scenario 1.
Figure 6. True values at observed time points and estimated function of $v(t)$ and $\mu(t)$ for scenario 2.
Figure 7. Polynomial regression estimator of $v(t)$ and $\mu(t)$ for scenario 1.
Figure 8. Polynomial regression estimator of $v(t)$ and $\mu(t)$ for scenario 2.
Figure 9. Estimated function of $\beta(t)$.
Figure 10. Fitted points of cost based on current approach and the semi-parametric estimating approach.
Figure 11. Fitted points of cumulative cost based on current approach and the semi-parametric estimating approach.
Figure 12. Marginal survival compared with Kaplan–Meier estimator.
Table 1. The estimates of RMSE with different numbers of N for scenario 1.

| Parameter | N = 106 | N = 221 | N = 441 |
|---|---|---|---|
| $\nu(t)$ | 0.9019575 | 1.291602 | 1.762792 |
| $\mu(t)$ | 11.42673 | 15.1283 | 27.13671 |
Table 2. The estimates of RMSE with different numbers of N for scenario 2.

| Parameter | N = 106 | N = 221 | N = 441 |
|---|---|---|---|
| $\nu(t)$ | 1.830297 | 2.434687 | 3.582806 |
| $\mu(t)$ | 19.03169 | 45.26685 | 84.30626 |
Table 3. The estimate results of the event process for scenario 1.

| Parameter | True | Bias (n = 125) | Std. Err. (n = 125) | Bias (n = 250) | Std. Err. (n = 250) | Bias (n = 500) | Std. Err. (n = 500) |
|---|---|---|---|---|---|---|---|
| $\alpha$ | 0.25 | −0.0345 | 0.0300 | −0.0302 | 0.0212 | −0.0300 | 0.0154 |
| $\gamma$ | 1.00 | −0.0119 | 0.2590 | 0.0275 | 0.1730 | −0.0040 | 0.1260 |
Table 4. The estimate results of the event process for scenario 2.

| Parameter | True | Bias (n = 125) | Std. Err. (n = 125) | Bias (n = 250) | Std. Err. (n = 250) | Bias (n = 500) | Std. Err. (n = 500) |
|---|---|---|---|---|---|---|---|
| $\alpha$ | 0.15 | −0.0286 | 0.0311 | −0.0230 | 0.0188 | −0.0223 | 0.0116 |
| $\gamma$ | 1.00 | 0.0219 | 0.1170 | 0.0208 | 0.1160 | 0.0235 | 0.1120 |
Table 5. Estimate results of the event process.

| Parameter | Value | Std. Err. | p-Value |
|---|---|---|---|
| Treatment | −1.2219 | 0.3362 | 0.0003 |
| Association effect | 0.2519 | 0.2087 | 0.2274 |
Table 6. Estimated cumulative costs for 5 years and total period.

| Year 1 | Year 2 | Year 3 | Year 4 | Year 5 | Total |
|---|---|---|---|---|---|
| 21,661.83 | 45,403.10 | 67,981.31 | 101,983.30 | 134,749.20 | 135,412.00 |
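The cumulative values in Table 6 can be differenced to show how much cost accrues in each period. The following is a small sketch using only the figures reported in Table 6; treating the difference between the total and the year 5 value as the cost accrued after year 5 is an interpretation made for this illustration.

```python
# Cumulative mean costs as reported in Table 6.
cumulative = {
    "Year 1": 21661.83,
    "Year 2": 45403.10,
    "Year 3": 67981.31,
    "Year 4": 101983.30,
    "Year 5": 134749.20,
    "Total": 135412.00,
}

labels = list(cumulative)
values = list(cumulative.values())

# The first entry is the cost of year 1; later entries are successive differences.
increments = [values[0]] + [round(b - a, 2) for a, b in zip(values, values[1:])]

for label, inc in zip(labels, increments):
    name = "After year 5" if label == "Total" else label
    print(f"{name:>12}: {inc:>10.2f}")
```

The per-period amounts are not constant across years, which is consistent with the remark that the estimated mean cost function fluctuates over time rather than following a straight line.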
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, S.; Deng, D.; Han, Y. A Symmetric Kernel Smoothing Estimation of the Time-Varying Coefficient for Medical Costs. Symmetry 2024, 16, 389. https://doi.org/10.3390/sym16040389
