Decision Aids for Health Performance Individualized estimation of human core body temperature using noninvasive measurements

Individualized estimation of human core body temperature using noninvasive measurements. Appl Physiol 124: 1387–1402, 2018. rising core body temperature (T c ) during strenuous physical activity is a leading indicator of heat-injury risk. Hence, a system that can estimate T c in real time and provide early warning of an impending temperature rise may enable proactive interventions to reduce the risk of heat injuries. However, real-time ﬁeld assessment of T c requires impractical inva- sive technologies. To address this problem, we developed a mathematical model that describes the relationships between T c and noninvasive measurements of an individual’s physical activity, heart rate, and skin temperature, and two environmental variables (ambient temperature and relative humidity). A Kalman ﬁlter adapts the model parameters to each individual and provides real-time personalized T c estimates. Using data from three distinct studies, comprising 166 subjects who performed treadmill and cycle ergometer tasks under different experimental conditions, we assessed model performance via the root mean squared error (RMSE). The individualized model yielded an overall average RMSE of 0.33 (SD (cid:2) 0.18)°C, allowing us to reach the same conclusions in each study as those obtained using the T c measurements. Furthermore, for 22 unique subjects whose T c exceeded 38.5°C, a potential lower T c limit of clinical relevance, the average RMSE decreased to 0.25 (SD (cid:2) 0.20)°C. Importantly, these results remained robust in the presence of simulated real-world operational conditions, yielding no more than 16% worse RMSEs when measurements were missing (40%) or laden with added noise. Hence, the individualized model provides a practical means to develop an early warning system for reducing heat-injury risk. impractical invasive sensors, serving as a practical and effective surrogate for core temperature monitoring. factors, which include the subject’s hydration status, clothing, envi- ronmental condition, exercise intensity, ﬁtness, and core-to-skin temperature gradient (12, 15, 33, 42). Hence, here we used 38.5°C as a tradeoff between a low-enough value to support the analyses of a reasonable-size data set (22 subjects in 34 proﬁles) and a high-enough value that is clinically meaningful. subjects.


INTRODUCTION
Athletes, Armed Forces personnel, and industrial workers are at risk for heat illness when they perform intense physical activities in hot and humid conditions. Such exertional heat illness is the third leading cause of sudden death in sport, with rates of incidence on the rise among sport, military, and industrial populations (3,9,25,26,52). Every year, the United States (U.S.) military consistently reports~2,000 cases of heat injuries despite a continued focus on prevention (3), and over 9,200 American high school students are treated for exertional heat illness (9,52). Meanwhile, heat injuries cause 33 yearly deaths among U.S. industrial populations (25). During strenuous, goal-oriented physical activities, such as military operations or athletic competitions, humans may either overlook or fail to perceive subtle thermoregulatory changes that can lead to heat injuries (16).
An unregulated rise in core body temperature (T c ) is a leading indicator of heat-injury risk. Under normal conditions, the human thermoregulatory system maintains homeostasis at a T c around 37°C. During compensable exercise, T c can rise by a few degrees and return back to its homeostatic level postexercise (i.e., a regulated rise). However, during strenuous physical activity in hot and humid conditions, the thermoregulatory system may be unable to cope with the rate of heat production and thus fail to curb a rising T c . This could trigger a cascade of clinical responses, starting with mild degradation of physical and cognitive performance that progresses to heat exhaustion and then heat stroke and culminates in multiorgan dysfunction and potentially death (14,16,24,45,57).
A system that accurately measures T c , reliably predicts the onset of T c increase, and generates early warnings may enable proactive interventions that could potentially prevent or reduce the risk of heat injuries. However, obtaining invasive medicalgrade (i.e., gold-standard) T c measurements from the pulmonary artery is impractical in ambulatory settings (e.g., during military field training and athletic activities). For exercise monitoring, especially in indoor laboratory settings, rectal temperature is the accepted gold-standard measure of T c (8,23,29,40). However, the invasiveness of the temperature sensor and the relative discomfort it can cause for long-duration monitoring make rectal probes impractical for use in outdoor settings, such as those involving military training or field operations (13,44). Ingestible thermometer pills, which in the last decade have been used successfully in field settings (38), are considered a reliable means to measure T c . However, their cost and the practical difficulties in continually monitoring a large number of subjects for long-duration activities make them an unattractive option. Noninvasive methods to measure T c via axillary or tympanic temperatures have not performed as well as gold-standard rectal measurements during exercise, as reported by several previous studies (8,23,29,40,44). However, more recent studies suggest that, while they can provide adequate measurements, their accuracy depends on sensor location and type of tympanic device. These studies also argue for extensive validation before field use (21,58).
Recent advances in commercially available wearable devices (e.g., fitness-tracking wristwatches), which provide increasingly reliable noninvasive measurements of physiological variables, such as heart rate (HR), skin temperature (T s ), and physical activity (A c , as measured by a 3-axis accelerometer), allow for the development of computational algorithms that combine these data through mathematical models to provide individualized T c estimates in real time. Two modeling approaches have been proposed. The first approach relies on using first-principles models that describe heat production in the body in response to physical activity, heat transfer from the body core to the skin, and that from the skin to the environment via a series of macroscopic heat-balance equations (17)(18)(19)22). However, although some of these models are very detailed, they invariably include a large number of parameters, which presents challenges for adapting the model to an individual under different conditions. The second approach relies on data-driven models that use nonlinear functions derived from population-average data to relate noninvasive measurements of physiological variables to T c (5,49). Although promising, these models do not account for the large interindividual variability in response to heat stress [e.g., for acclimated vs. nonacclimated individuals (14)], especially at high T c values, which are most relevant for heat injuries but for which data to train such models are scarce (49). A parsimonious mathematical model that includes a limited number of parameters that can be adapted on the fly to learn an individual's heat-stress response could potentially overcome these challenges.
To achieve this goal, we first formulated a mathematical model that describes the relationships between T c and noninvasive measurements of A c , two physiological signals (HR and T s ), and two environmental variables [ambient temperature (T a ) and relative humidity (RH)]. We then hypothesized that this model would yield accurate personalized T c estimates if its parameters were continually adapted to each individual. To this end, we coupled the mathematical model to a Kalman filter (32), which automatically adapts the model parameters to each individual on the fly in real time, in response to the individual's physical activity and, in doing so, accounts for the individual's sex, fitness, hydration status, exercise intensity, acclimatization level, clothing, and environmental condition. This customizability of the model is based on the premise that physiological variables (HR and T s ) measured from an individual are reflective of subject-specific differences in the factors mentioned above. To evaluate the performance of the individualized model, we simulated real-time operation by allowing the model to automatically learn the physiological heat-stress response of 166 subjects exposed to different exertional and environmental conditions in three separate laboratory studies and then directly compared the estimated T c against the corresponding measurements. Finally, because we eventually intend to use the modelestimated T c as a surrogate for core temperature measurements, we performed the following two additional analyses: 1) we verified whether the model-estimated T c allowed us to reach the same findings as those obtained in the three studies using the measured T c data, and 2) we tested whether the model was robust to real-world operational conditions, such as nonavailable or unreliable measurements, by simulating these scenarios and evaluating the accuracy of the model-estimated individualized T c profiles.

METHODS
We used data from three distinct previously reported studies, comprising 166 subjects who performed treadmill and cycle ergometer tasks under different experimental and environmental conditions. Table 1 provides the demographic information for each of the three studies described below. Study 1. Sixty subjects (42 men and 18 women) were recruited from military and university communities to participate in a standardized heat-tolerance test (HTT) (41). The HTT is widely used in the military community to assess the ability of an individual to return to duty after previously suffering a heat injury (43). It involves a 2-h walk at 5 km/h on a treadmill set to a 2% grade in an environmental chamber set to T a ϭ 40°C and RH ϭ 40%. In the context of the HTT, an individual is deemed heat intolerant if T c exceeds 38.5°C and HR exceeds 150 beats/min, or when neither tends to plateau by the end of the protocol. The Israeli Defense Force has been using the HTT for over 30 yr to help screen military personnel for return to active duty (12a). The Institutional Review Board of the Uniformed Services University (Bethesda, MD) approved the study. Each subject, after being briefed on the purposes and procedures of the study, gave written informed consent before study participation. During the study, each subject reported to the Uniformed Services University's environmental chamber on two occasions. Upon arriving in the morning, the subject changed into shorts and athletic shoes (women additionally wore a sports bra). Subjects underwent the standard HTT on one testing day. On another testing day, subjects underwent an identical test except with a T a of 22°C (the order was randomized). After walking, subjects rested and cooled down for at least 20 min (or until T c Ͻ38.0°C) while physiological measurements continued. The average time between the two testing days was 4 days. During the HTT, subjects could drink water ad libitum (up to 1 l/h). T c was measured by a rectal thermometer inserted 10 cm beyond the anal sphincter (MEAS Temperature Probe; Measurement Specialties, Dayton, OH), T s was measured by a skin sensor at the chest (YSI 409B; YSI, Yellow Springs, OH), and HR was measured by a Polar HR monitor (Polar Team 2 Pro; Polar, Lake Success, NY). HR was monitored and recorded every second while T c and Ts were recorded throughout the test at 15-s intervals (41). For maintaining a consistent sampling period across the data sets from the three studies, we resampled the HR and T s signals to 1-min intervals by averaging the data. For our mathematical model, we identified the metabolic equivalent unit (MET: the ratio of oxygen consumed during a specific physical activity to that at rest) for a walking speed of 5 km/h from the Compendium of Physical Activities (1) and then inferred the Ac level to be moderate (MET~3-6), following a previous study (51).
Study 2. Similar to study 1, 96 subjects (all men) performed the standard HTT protocol (walking on a treadmill set to a 2% grade at 5.0 km/h for 120 min) at a Ta of 40°C and a RH of 40%. The Institutional Review Board of the Israel Defense Forces Medical Corps approved the use of the data for retrospective studies (12a, 34).
Throughout the duration of the walk, Tc was measured by a rectal thermometer inserted 10 cm beyond the anal sphincter (YSI 401; YSI), HR was measured by a Polar HR monitor (Polar Team 2 Pro; Polar) and stored at 1-min intervals by a heart watch (RS800cx watch; Polar Electro Oy, Kempele, Finland), and Ts was measured by a skin sensor at the chest (YSI 409B; YSI). Temperature data were continually monitored and recorded throughout the test at 1-min intervals. Similar to study 1, we inferred the Ac level to be moderate for a walking speed of 5 km/h. Study 3. Ten active and healthy men participated in this study, which was approved by the University of Otago's Human Ethics Committee in accordance with the Declaration of Helsinki. Subjects were given a concise explanation of all experimental procedures and potential risks. Subsequently, they gave their written informed consent (46).
The participants, who were local cyclists ranging from recreationally active individuals to regional multisport athletes, were not heat acclimated. Subjects completed the following four trials in counterbalanced order after random assignment: 1) precooling before exercise and no fan airflow during exercise (PCNF), 2) no precooling and no fan airflow (NCNF), 3) precooling with fan airflow (PCWF), and 4) no precooling with fan airflow (NCWF). Of the 10 subjects, the requisite data (HR, Ts, and Tc measurements) were available for all four trials for four subjects and for at least two of the four trials for the remaining six subjects. Each subject performed the trials at the same time of day, with trials separated by at least 7 days.
The test protocol started with the subject laying submerged chest deep in a custom-insulated bath for 1 h before exercising, in either thermoneutral water (35°C) or cool water (24°C; precooling was stopped when 1 h had elapsed or when Tc decreased by 0.5°C, whichever occurred earlier). Each subject began the cycling protocol on an electromagnetically braked cycle ergometer (Velotron version 1.5; RaceMate, Seattle, WA) exactly 10 min after exiting the bath; this period was required for drying off, changing, and relocating to a temperature-controlled environmental chamber (Ta ϭ 30°C; RH ϭ 50%). During trials requiring airflow, a large fan (655-mm-diameter blade; Imasu IMS International, Tsuen Wan, Hong Kong) was placed 1 meter in front of the subject. The fan height was adjusted to include airflow over the head, torso, arms, and upper legs, covering as much surface area as possible in the cycling position; the maximum average wind velocity at 1 meter was 4.8 m/s. During the cycling trials, the air exhaled by the subject was sampled breath by breath to calculate the rates of ventilation, oxygen uptake, and CO 2 production (Cortex Biophysik Metalyzer 3B, Leipzig, Germany). For our mathematical model, we computed METs from the rates of oxygen uptake and converted these to Ac levels (51). HR was monitored via a chest-strap device (Vantage NV; Polar Electro, Port Washington, NY). Ts was measured at 10 sites on the right side of the body via insulated skin thermistors affixed to the skin surface with adhesive tape (Type EU; Grant Instruments, Cambridge, UK), and the measurements were used to compute a mean value. T c was monitored via an esophageal thermistor while the subject was in the bath and via a rectal thermistor placed 10 cm past the anal sphincter while the subject exercised (Mon-a-therm 400; Mallinckrodt Medical, St. Louis, MO). The measured data (HR, T c, Ts, oxygen uptake, and CO2 production) were logged at 1-min intervals (Grant 1200 series Squirrel data logger; Grant Instruments).
Individualized model. The proposed model uses an individual's noninvasive measurements of A c, HR, and Ts, as well as two environmental variables, Ta and RH, to estimate the individual's Tc in real time. The model includes two elements, a mathematical model and a Kalman filter (32), which together provide real-time individualized estimates of T c via the following three steps (Fig. 1): 1) first, in step 1, the mathematical model uses the measured (or computed) activity Ac and environmental variables Ta and RH as inputs to estimate values of HR and Ts (i.e., to compute the state variables HR andTs); 2) next, in step 2, the system computes the errors between the measured HR and Ts and the estimated values HR andTs; and 3) finally, in step 3, to reflect the individual's physiological response, the Kalman filter considers these errors to correct the state variables and update the mathematical model parameters, which are then used in the subject-adapted model to provideT c, an improved estimate of the individual's Tc. We refer to the combination of the mathematical model and the Kalman filter as the "individualized" model. We have provided details of the Kalman filter algorithm in APPENDIX A.
Mathematical model. The mathematical model consists of the following two submodels: a phenomenological component that relates A c to HR and a first-principles macroscopic energy-balance component (17)(18)(19) that relates metabolic activity (represented by HR) to Ts and Tc.
The structure of the phenomenological model was motivated by the observation that an increase in A c leads to a rapid increase in HR, which subsequently decays exponentially when Ac decreases. This relationship was mathematically represented by the following equation: where ⌬HR denotes the change in HR from a resting state HR0 (i.e., ⌬HR ϭ HR -HR0), ␣1 denotes the rate constant for HR, and ␤ represents the gain in HR resulting from physical activity. Here, we set HR0 as the mean of the measured HR during the initial 10 min of data (~80 beats/min) obtained under light activity levels. Following a previous study (51), we quantized human activities into specific MET values, which were further classified into five activity levels as follows: A c ϭ 0 for rest (MET ϭ 1), Ac ϭ 1 for light activity (MET 1-3), Ac ϭ 2 for moderate activity (MET~3-6), Ac ϭ 3 for high activity (MET~6 -9), and Ac ϭ 4 for very high activity (MET Ն 9). In Eq. 1, we raised Ac to the fourth power to ensure good separation of HR at different activity levels because we noticed during data analysis that moderate activity (Ac ϭ 2) leads to a greater increase in HR compared with light activity (Ac ϭ 1).
The first-principles component of the model consists of the coreand skin-temperature compartments: where ⌬T c ϭ Tc -Tc0 and ⌬Ts ϭ Ts -Ts0, with Tc0 and Ts0 denoting the initial core and skin temperatures, respectively, Ps denotes the vapor pressure of water for Ts, and Pa represents the vapor pressure of water at a given Ta and RH, as calculated by the perceived heat index.
Here, we set Tc0 to 37°C and Ts0 as the mean of the measured Ts during the initial 10 min of data collection. We used the equation proposed by Rothfusz (50) to compute the heat index (for a given T a and RH) and computed the vapor pressure of water via the Antoine equation (56): .63, and C ϭ 233.43, with T ϭ Ts for P ϭ Ps and T ϭ heat index for P ϭ Pa (both in units of°C). In Eq. 2, ␣2 denotes the thermoregulatory rate constant of Tc, ␥1 denotes the rate of heat gain due to metabolic activity (HR), S(⌬HR) is the standard sigmoid function to reduce the heat gain to zero when HR decreases below the resting state (i.e., when ⌬HR becomes negative), and ␥ 2 denotes the rate of heat transfer from the core to the skin compartment. In Eq. 3, ␣3 denotes the rate of convective heat transfer from the skin compartment to the environment, and ␣ 4 denotes the rate of heat loss to the environment due to sweat evaporation. Thus, the mathematical model consists of three states (⌬HR, ⌬Tc, and ⌬Ts, corresponding to Eqs. 1, 2, and 3, respectively) and seven parameters (␣1, ␤, ␣2, ␥1, ␥2, ␣3, and ␣4). Initial values of model parameters. We used published data from four studies to estimate the parameters ␣1 and ␤, which relate Ac to HR via Eq. 1 (4, 20, 37, 53). The studies involved a total of 184 subjects performing treadmill runs or cycle ergometer tasks at differ-ent speeds, with A c ranging from zero (rest) to four (very high). We obtained the physiological ranges of ␣1 and ␤ by performing a least-squares fit of Eq. 1 to the data from these studies. We fixed ␣2 at a constant value because it represents the rate of self-regulated changes in Tc, which is invariant across individuals (28,36). We obtained the physiological ranges of the other four parameters (␥1, ␥2, ␣3, and ␣4) from published values of human tissue and heat transfer properties from experimental studies and the human numbers database (2,18,27,30,31,48,54). We computed the means and SDs of the six parameters from the physiological ranges thus obtained and used these values to initialize the individualized model.
Evaluation of model performance. We evaluated the ability of the individualized model to learn a subject's heat-stress response under different environmental and experimental conditions by simulating real-time estimates of each subject's Tc (i.e.,Tc) and comparing them against the corresponding experimental measurements. To assess the accuracy ofTc, we computed the square root of the mean squared differences betweenTc and the gold-standard measurements of Tc [i.e., the root mean squared error (RMSE)] throughout the duration of the experiment for each of the 166 subjects under each experimental condition. Note that Tc measurements are only used in this study to assess the model estimatesTc. In addition, to assess the accuracy of the model for Tc values that are elevated enough to be clinically meaningful while still being able to analyze a reasonable fraction of the data, we computed the RMSE betweenTc and Tc for measurements Ͼ38.5°C. Across the three studies, 22 unique subjects had Tc values exceeding 38.5°C in at least one of the experimental conditions (34 individual profiles in total). Previous studies suggest that a rising T c may lead to a cascade of responses, starting with degradation of aerobic performance and reduced cognitive function at around 38.2°C and beyond (14,24,45) that progresses to heat exhaustion (33,42,57) and to heat stroke (16,57). Importantly, there is no specific T c threshold that delineates the transition from one state to the next. Rather, a subject's response to an increasing T c depends on several Step 1 (dark gray), the mathematical model uses the activity profile and environmental variables to estimate the HR and Ts.
Step 2 (medium gray), the system computes the errors between the model-estimated HR and Ts and the corresponding measurements.
Step 3 (light gray), the Kalman filter uses the errors to update six model parameters (␣1, ␤, ␥1, ␥2, ␣3, and ␣4) and provide individualized real-time core temperature estimates. We fixed ␣2, the thermoregulatory rate constant of Tc, to a constant value. Parameter definitions: ␣1, the rate constant of HR; ␤, the gain in HR in response to Ac; ␥1, the heat gain resulting from metabolic activity; ␥2, the rate of heat transfer from the core to the skin compartment; ␣3, the rate of heat transfer from the skin to the environment via convection; and ␣4, the rate of heat transfer resulting from sweat evaporation. factors, which include the subject's hydration status, clothing, environmental condition, exercise intensity, fitness, and core-to-skin temperature gradient (12,15,33,42). Hence, here we used 38.5°C as a tradeoff between a low-enough value to support the analyses of a reasonable-size data set (22 subjects in 34 profiles) and a high-enough value that is clinically meaningful.

RESULTS
Evaluation of model performance. Table 2 shows the average RMSE at all T c values and the average RMSE for T c Ͼ38.5°C, across each experimental condition for the three studies. The individualized model (Table 2, model with T s ) yielded an average error of 0.33 (SD ϭ 0.18)°C. Importantly, for the 22 subjects whose T c exceeded 38.5°C (34 individual profiles; in some cases, the same subject exceeded the threshold in two or more experimental conditions), the average error decreased to 0.25 (SD ϭ 0.20)°C. In study 3, there were two subjects whose T c exceeded 39°C in three or more experimental conditions. For these subjects (3 and 7, seven time profiles), the model yielded an average RMSE of 0.22 (SD ϭ 0.03)°C for T c Ͼ39.0°C. In addition, we assessed the individualized model's ability to reproduce the magnitude of the measured peak T c (max T c , see APPENDIX B for details) and found that, on average, the absolute difference between max T c andT c at the same time point was 0.27 (SD ϭ 0.20)°C across all subjects in the three studies.
Ability of the individualized model to learn the thermoregulatory response of the same subject under different experimental conditions. A novel feature of the individualized model is its ability to adapt the model parameters to an individual subject and, as such, explicitly account for subject-specific variations in thermoregulatory responses under different environmental or experimental conditions. Figure 2 shows an example of the model's ability to estimate a subject's T c (study 1, subject 1) for the same activity (treadmill task) performed at two different T a values, 22°C and 40°C. Figure 2A shows the measured A c levels that drove the individualized model to track the measured HR and T s (Fig. 2, B and C, respectively) via the Kalman filter and provides real-time T c estimates (Fig. 2D). Changes in HR lagged behind those in HR, both at the begin-ning and end of the moderate activity (A c ϭ 2) period (up tõ 15 min and after~125 min; Fig. 2B). At the beginning, the delay occurred because of model initialization (the initial 10-min period during which HR was already increasing). This was subsequently followed by a small delay (~5 min) introduced by the causal noise-rejecting filter applied to HR before the Kalman filter algorithm became engaged (see APPENDIX A). At the end of the moderate activity period, the noise-rejecting filter alone caused the delay in HR. As expected, the T c values in the 40°C condition were higher than those in the 22°C condition, an effect that the model captured. The individualized model yielded errors of 0.16°C and 0.24°C for the 22°C and 40°C conditions, respectively. Figure 3 shows the performance of the model in predicting the T c of one subject from study 2 (subject 3) at T a ϭ 40°C and RH ϭ 40%. As in study 1, despite a minor lag in HR in the beginning (the initial 15 min), the model generally captured the rise in T c for the 14 subjects whose T c exceeded 38.5°C in study 2 [average error ϭ 0.24 (SD ϭ 0.19)°C; Table 2]. Figure 4 shows an example of the model's ability to estimate a subject's T c (study 3, subject 3) for the same activity (cycle ergometer task) performed under four different experimental conditions in an environmental chamber maintained at a T a of 30°C and a RH of 50%. In the precooling conditions (PCNF and PCWF), the model overestimated T c up to 60 min into the task (Fig. 4D). This discrepancy reflects the observation that, whereas HR increased rapidly (by~70 beats/min in 10 min), rectal T c measurements increased more slowly. In these conditions, the rectal measurements may have lagged (35,40) and underestimated the true T c because of the influence of the volume of cutaneous blood cooled by water (or other cooling device) circulating in the veins of the legs during cycling (39). Nonetheless, at later time points when T c values were high, which are important in predicting the risk of heat injuries, the model showed good accuracy across the four experimental conditions.
Adaptation of model parameters. To describe how the model adapts to different heat-stress conditions in the same individual, we examined how the adjustable parameters changed as a  For the two models, entries indicate mean root mean squared error (RMSE) values with SD in parentheses. RMSE (°C) is shown throughout the duration of the experiment and during episodes where core body temperature (Tc) Ͼ 8.5°C. Columns labeled Noisy HR show the effects of adding noise to the HR measurements on model performance. Ts, skin temperature; HR, heart rate; NCNF, no precooling and no fan airflow during exercise; NCWF, no precooling but with fan airflow during exercise; PCNF, precooling and no fan airflow during exercise; PCWF, precooling but with fan airflow during exercise. *Study 1 was conducted under two different ambient temperature conditions, whereas study 2 was conducted at 40°C. In both studies, the relative humidity was set to 40%. †Study 3 was conducted at an ambient temperature of 30°C at a relative humidity of 50%. function of time. Figure 5 shows the dynamical changes in the six adjustable parameters of the individualized model as it adapted to the subject in Fig. 2 under two conditions (T a ϭ 22 and 40°C).
Before interpreting the figure, we note that not all model parameters are equally sensitive to changes in T a . Parameters ␣ 1 and ␤ (Eq. 1), which denote the rate constant of and gain in HR, respectively, are affected by changes in HR due to A c and T a . Parameters ␥ 1 and ␥ 2 (Eq. 2), which denote the rates of heat gain resulting from metabolic activity and heat transfer to the skin compartment, respectively, are affected by changes in T s due to T a . However, the effects of T a on T c via these two parameters (␥ 1 and ␥ 2 ) are minimal because the skin and deeper tissues effectively insulate the body core  from variations in environmental conditions (2,16). The parameter ␣ 3 (Eq. 3), which denotes the rate of heat loss from the skin to the environment via convection, is the only parameter that is directly affected by changes in T a and hence is the most sensitive. We also note that, depending on the condition, the model parameters require between 30 and 45 min to converge and adapt to an individual's response to heat stress.
Upon initiation of the HTT, A c changed from light to moderate (i.e., it changed from 1 to 2; Fig. 5A), causing the HR to rise (Fig. 2B). In both temperature conditions, ␣ 1 , the rate constant of the HR, initially decreased up to~25 min into the simulation to a considerably lower value than the initial value (Fig. 5B). Subsequently, ␣ 1 settled at this value for the 22°C condition (where the HR settled to~110 beats/min) while continuing to decrease further in the 40°C condition (where the HR continued to increase up to~150 beats/min). In both conditions, ␣ 1 constrained the rate of change in HR as it increased (Fig. 2B). Parameter ␤, the gain in HR due to A c , decreased initially because HR remained constant during the initial 10 min of model initialization. Subsequently, ␤ increased nominally (Fig. 5C), contributing to the rise in HR. The sharp increases in ␣ 1 and ␤ at the end of the moderate activity (A c ϭ 2) period contributed to the decrease in the HR (after~135 min). Thus, these two parameters are highly sensitive to A c -induced variations in HR.
Parameter ␥ 1 , the rate of heat gain because of metabolic activity, decreased as T c increased, effectively dampening the rise of T c in response to a rise in HR (Fig. 5D). Parameter ␥ 2 , the rate of heat transfer to the skin compartment, varied as a function of the difference between T s and T c ; it rapidly decreased as T c increased, whereas T s remained almost constant in both conditions, because heat was not dissipated to the skin (Figs. 2, C and D, and 5E).
Parameter ␣ 3 , the rate of heat loss from the skin to the environment via convection, decreased more at 40°C than at 22°C (Fig. 5F). This was expected because, at T a ϭ 22°C, the difference between T s and T a was large (~10°C), increasing convective heat loss. In contrast, at T a ϭ 40°C, the gradient was small (~4°C), effectively reducing convective heat loss compared with the 22°C condition (Fig. 2C). Parameter ␣ 4 , the rate of evaporative heat loss from the skin to the environment, decreased rapidly (within~10 min) and remained low in both conditions (Fig. 5G). This was expected because ␣ 4 typically remains low unless there is considerable wetting of the skin and the environment is conducive to sweat evaporation.
Individualized model estimates corroborate study findings. Apart from evaluating model performance (detailed in the sections above), we assessed whetherT c would lead to the same conclusions as those obtained from the measured T c in each of the three studies.
The two major findings of study 1 (41) were that 1) a subject's heat tolerance can only be assessed at high T a levels (T a ϭ 40°C) and 2) heat-intolerant subjects have a significantly higher T c than do heat-tolerant subjects at a T a of 40°C. To evaluate their findings, the authors computed two metrics from the measured data: max T c and ⌬T c (the difference between max T c and the T c value achieved 1 h after commencing the treadmill task). Both metrics were lower at 22°C than at 40°C (finding 1, Tables 3-5). Furthermore, both metrics were lower for heat-tolerant subjects than for heat-intolerant subjects at 40°C (finding 2, Tables 3-5). Importantly, the same findings were obtained whenT c (Tables 3-5), instead of the measured T c , was used to compute the two metrics.
The finding in study 2 matched the second finding from study 1, that heat-intolerant subjects have significantly higher T c than heat-tolerant subjects at a T a of 40°C (12a, 34). Indeed, Tables 3-5 shows this finding to be true for both T c andT c (study 2).
The authors of study 3 concluded that precooling slowed the rate of rise in T c and increased exercise duration (46). We sought to test directly whether the individualized model could reproduce this effect by computing the time it takes for each subject's T c to exceed the heat-injury threshold of 38.5°C. Precooling increased the time to T c Ͼ38.5°C [Tables 3-5 Table 2, also see Figs. [2][3][4][5].
To further investigate how the individualized model would estimate T c when encountering operational conditions in realworld scenarios, we provided it with noise-laden or partially missing HR measurements and evaluated its performance (Table 2). To assess performance for noise-laden HR measurements, we used the procedure detailed in APPENDIX D to simulate 100 random realizations by adding noise to the original HR data for each subject in each experimental condition. Briefly, to add realistic noise, we determined the error characteristics of the Samsung GearS3 smartwatch (Samsung Electronics America, Ridgefield Park, NJ), using 83 h of in-house HR measurements obtained while subjects performed different activities and comparing them against those obtained with a goldstandard chest-strap device (Polar H7; Polar Electro Oy). We then estimated the subject's T c for each of the 100 noise-laden HR profiles and computed 100 RMSE values by comparing the estimates against the measured T c values. We then computed the average RMSE for that subject and repeated this procedure for each subject in each experimental condition across the three studies. Noise-laden HR data degraded model performance by only 16% for the 22 subjects whose T c exceeded 38 For missing HR data, we generated 100 random realizations for one subject in one experimental condition by randomly removing a fixed fraction of the total number of HR values. We varied the missing fraction from 10 to 50%, computed the average RMSE for each fraction of missing HR data, and computed the percentage increase in the RMSE relative to the RMSE obtained when using the true HR data (without missing values). We repeated this procedure for the 22 unique subjects (34 individual profiles across the different experimental conditions) whose T c exceeded 38.5°C in any of the three studies. We then plotted the percentage increase in RMSE for all T c values (Fig. 6) and the percentage increase in RMSE for T c Ͼ38.5°C (Fig. 6) as a function of the percentage of missing HR data. Even with 40% of the data missing, the RMSE for T c Ͼ38.5°C increased, on average, by only 13% (or 0.03°C).

DISCUSSION
In this study, we developed an individualized model that uses activity data (A c ) and noninvasive measurements of HR and T s , along with two environmental variables (T a and RH), to provide real-time estimates of a subject's T c . We demonstrated that the model estimated T c with good accuracy across 166 subjects subjected to different exertional and environmental conditions [average RMSE ϭ 0.33 (SD ϭ 0.18)°C]. Furthermore, the estimation accuracy increased for T c values exceeding 38.5°C [average RMSE ϭ 0.25 (SD ϭ 0.20)°C], a potential lower T c limit of clinical relevance, when the ability to make accurate estimates starts to become important. We also showed that the model-estimated T c (i.e.,T c ) corroborated the findings inferred from the corresponding measurements in each of the three studies, supporting the hypothesis thatT c could serve as a surrogate for invasive T c measurements. Importantly, these results remained robust in the presence of simu- Fig. 6. Effect of missing HR data on model performance. Using data from the 22 unique subjects (34 profiles from the 3 studies) whose Tc exceeded 38.5°C, we generated 100 random realizations of missing HR data for each fraction (from 10 to 50%) of the total number of HR samples for each of the 34 profiles. We computed the RMSE between the measured and estimated Tc values for each random realization and missing HR fraction in each of the 34 profiles. We then computed and plotted the percentage increase in the RMSE as a function of the fraction of missing HR data for each of the 34 profiles. Values are medians and interquartile ranges. lated real-world operational conditions, such as nonavailability of T s measurements (no difference in performance), random noise added to the HR data (16% worse RMSE), or partially missing HR data (13% worse RMSE with 40% of the data missing). A novel feature of the individualized model is that it adapts the model parameters to each subject and, as such, directly accounts for subject-specific variations in thermoregulatory responses (e.g., because of acclimation or fitness level) and responses to exogenous factors (e.g., clothing and environmental conditions). This is made possible by the two elements of the model (mathematical equations and the Kalman filter). The mathematical model equations provide the physiological relationships linking T c and the noninvasively measured variables HR and T s . In turn, the adjustable parameters in the equations provide the degrees of freedom to customize the model to an individual based on the measured HR and T s . This is achieved through the Kalman filter. As it attempts to minimize the errors between the measured and the model-estimated values of HR and T s after each measurement, it changes the equation parameters. In doing so, it gradually adapts the model parameters, individualizing the model to best represent the measured data and, hence, improving the estimates of T c .
Here, we specifically demonstrated that the model can learn the variations in the thermoregulatory responses of the same subject under two different environmental conditions (22 and 40°C) by reducing ␣ 3 (the rate of convective heat loss from the skin to the environment) more at 40°C than at 22°C. Similarly, the model adapted to subjects performing the same activity under four different experimental conditions in which the subjects' baseline T c (via precooling) or skin heat loss (via fan airflow) was varied during the task. The model accounted for these variations by adjusting ␥ 2 (the rate of heat transfer from the core to the skin compartment to partially account for precooling) and ␣ 4 (the rate of evaporative heat loss to partially account for fan airflow), in addition to activity-related adjustments to other parameters (␣ 1 , ␤, and ␥ 1 ; see Figs. E1, E and G, in APPENDIX E). This strongly suggests that adaptation of model parameters is necessary to improveT c and thereby highlights the potential limitations of existing data-driven algorithms, which do not provide for parameter updates (5,49), and of detailed higher-order thermoregulatory models in which a large number of parameters makes model adaptation difficult (17)(18)(19)22).
Further analyses of the data suggest that the model can adapt to individual differences, regardless of the factors involved (e.g., sex, age). Using data from study 1, we performed eight pairwise comparisons across three factors: sex (study 1 included 42 men and 18 women), environment (T a ϭ 22 or 40°C), and heat tolerance condition (6 men and 8 women were heat intolerant). Our analyses revealed that the average RMSE difference across these comparisons was Ͻ0.06°C (this value was achieved for heat-intolerant women vs. heat-intolerant men at T a ϭ 40°C). We also evaluated the effect of age on model performance by dividing the subjects from all three studies into two age groups, using a cutoff of 30 yr. The average difference in RMSE (0.02°C) was not statistically significant (Wilcoxon rank-sum test, P ϭ 0.98). These analyses indicate the broad applicability of the individualized model in spite of interindividual differences. Interestingly, the model without T s providedT c values that were indistinguishable from those provided by the model that used T s measurements. This was achieved even though, in the former model,T s values deviated considerably from their corresponding measurements (Figs. 2C-4C), suggesting that large errors inT s did not translate into large errors inT c . This can be attributed to the fact that, in the studies investigated here, the average gradient between the measured T c andT s was similar for the two variants of the model [model with T s : 3.30 (SD ϭ 2.63)°C; model without T s : 2.94 (SD ϭ 1.23)°C]. As a result, the converged values of ␥ 2 (the rate of heat transfer between the core and skin compartments) differed, on average across all subjects and experimental conditions, between the two model variants by only 16 (SD ϭ 13)%. In certain scenarios, such as in cold conditions when the measured T s is likely to be lower, the gradient between T c and T s could be greater. In this case, the model without T s would then generate an erroneous estimate of ␥ 2 , which in turn would reduce its ability to accurately estimate T c . Under such scenarios, we expect that T s measurements would provide additional information to correct the ␥ 2 estimate and improve the accuracy ofT c . We emphasize that this limitation does not apply to the full model but only to the model that does not use T s measurements.
The data sets we used were limited to those from experiments in which healthy subjects performed two types of activity (walking on a treadmill and cycling) in environmental conditions varying between 20 and 40°C at 40% or 50% relative humidity. Under real-world conditions, however, men and women differing in many factors [e.g., age, body composition, fitness level, presence or absence of an illness that can induce heat injury (11,55,57), acclimation level, clothing, and hydration status] perform free-ranging activities of varying intensity. Studies that systematically vary several of the aforementioned factors and include subjects with T c values well within the zone of clinical relevance could further serve to validate our model. Nevertheless, our basic premise is that the physiological variables (HR and T s ) measured from individuals are reflective of subject-specific differences in the aforementioned factors. For example, an elevated HR may reflect a change in an individual's metabolic activity in response to illness, whereas a change in T s indicates some change in environmental condition. Similarly, A c reflects the intensity level of physical activity. Thus, the model can estimate T c in different scenarios because the inputs capture subject-specific and environmental factor differences and because the model automatically accounts for these differences by minimizing the errors between the measured and model-estimated values of HR and T s . Indeed, here we demonstrated that the model accurately estimates the same individual's T c under different environmental or experimental conditions and is robust to real-world operational issues. Importantly, we showed that the model performs similarly despite interindividual differences in attributes, such as age and sex.
Conclusion. We used a Kalman filter algorithm that relies on noninvasive measurements of physiological signals and environmental variables to adapt a mathematical model to provide real-time subject-specific individualized T c estimates. The individualized model provided accurate T c estimates even under real-world operational conditions, such as when measurements were unavailable, only partially available, or unreliable. Importantly, we demonstrated that the model-estimated individ-ualized T c corroborated the findings from the three studies analyzed here, suggesting that it can serve as a surrogate for measurements obtained by invasive T c sensors and thereby allow for continuous monitoring of an individual's T c for an extended period. We are currently integrating the individualized model into a hardware/software system to provide early warning of an impending heat injury at an individual level. Specifically, we are combining physiological responses, measured by a fitness-tracking wristwatch and wirelessly transmitted to a smartphone that houses the individualized model, with a previously developed algorithm to predict T c 20 min ahead of the current time (36). In conclusion, if the individualized model provides accurate T c estimates, the early warning system's ability to predict an impending rise in T c with sufficient lead time should allow users to proactively intervene and reduce the risk of heat injuries. T a P a ͔ T . We set ϭ 1/2 to ensure nonnegative parameter values. We then computed the Jacobian of f(x,u) (47) and discretized the results to obtain a linear time-varying model (10). The discrete model had state equations xkϩ1 ϭ Fk xk ϩ Gk ukϩ1 and output equations yk ϭ C xk ϩ y0, where Fk and Gk denote the discrete linearized state-transition and input matrices, respectively, obtained at each time index k, C denotes the matrix that outputs the estimated ⌬HR and ⌬T s signals, and y0 ϭ [HR0 Ts0] T represents the vector of the mean values of HR and Ts from the initial 10 min of data that were used to initialize the model.
The flowchart in Fig. A1 depicts the initialization of, and sequence of steps involved in, the Kalman filter algorithm. As described in METHODS, we initialized the algorithm with parameter values (which is part of the initial state vector x 0) obtained from previously pub-lished studies and their corresponding variances 2 (which is part of the initial state-error covariance matrix P0). In addition, we used HR and Ts measurements obtained from the first 10 min of data to estimate the noise variances (which comprise matrix R) and the systemic uncertainty between estimated and measured values of HR and T s ( 2 in matrix Q). Using each subject's first 10 min of data, we estimated 2 . To do so, we first filtered the measured HR and Ts in real time, using a second-order causal low-pass Butterworth filter with a cut-off frequency of 3.3 mHz, to reject noise while preserving the frequency band that overlaps with the measured T c. We then used a 5-min moving average of the measured Ac, the measured Ta, and Pa (computed from the measured Ta and RH) to drive the mathematical model and computed the error between the filtered HR and Ts data and the corresponding model estimates. This error, which is linearly related to the systemic uncertainty in the model states, allowed us to estimate 2 by solving the resulting linear least-squares problem (7).
After initialization, the Kalman filter algorithm proceeded in the following manner. At each 15-s discrete time interval, the algorithm used a 5-min moving average of the measured (or computed) A c, the measured Ta, the Pa at the present time index k ϩ 1, and the model parameters obtained up to time index k to estimate the model states xkϩ1|k (Fig. A1, estimation step). In the estimation step, the algorithm also estimated the state-error covariance matrix Pkϩ1|k, using the model parameters up to time index k and the process noise matrix Q. Subsequently, the algorithm computed the Kalman gain Kkϩ1. Next, the algorithm used the error ekϩ1 between the filtered measurements (ykϩ1) and the estimated HR and Ts (Cxkϩ1|k) to update the Tc estimate (an element of the state vector xkϩ1|kϩ1), the model parameters, and the state-error covariance matrix Pkϩ1|kϩ1 (Fig. A1, update step). In situations where HR or Ts measurements were temporarily unavailable (i.e., missing measurements), we assumed that the noise characteristics of the error at time point k ϩ 1 would not be drastically different from the previous time point k and, hence, set e kϩ1 to ek in the update step. The algorithm repeated this procedure for each time step until the end of the time series data.

APPENDIX B: PEAK TC STATISTICS
To assess the individualized model's ability to estimate the timing and magnitude of the measured peak T c in each profile, we computed the following three statistics: 1) the absolute time difference between the measured and model-estimated peak T c, 2) the absolute difference . We initialized the algorithm with , 2 , the HR and Ts measurement noise variances (matrix R), and the process noise variance 2 estimated from the first 10 min of data for each subject. After initialization, the algorithm proceeded by using ukϩ1 to drive the mathematical model to estimate HR and Ts (estimation step). Next, by scaling the error ekϩ1 between the filtered measurements and the model-estimated HR and Ts by the Kalman gain Kkϩ1, the algorithm updated the model parameters and the Tc estimates at each time index (update step) until the end of the measured time series data. between the magnitude of the measured peak Tc and the model estimate at the same time point, and 3) the absolute difference between the magnitudes of the measured and model-estimated values of peak T c. Table B1 reports the values of these statistics for all subjects in each of the three studies.

APPENDIX C: MODIFIED KALMAN FILTER ALGORITHM TO HANDLE UNAVAILABLE TS
The Kalman filter algorithm described above cannot be directly applied to the case where Ts is not measured. This is because the model parameters, which depend on Ts measurements (␥1, ␥2, ␣3, and ␣4, see Eqs. 2 and 3 in Mathematical model in METHODS), cannot be adapted to the individual without such measurements. Hence, for running the model without T s measurements, we modified the update step of the Kalman algorithm by setting ekϩ1 ϭTskϩ1 -Tsk, i.e., we used the difference inTs between the present time point k ϩ 1 and the previous time point k as a surrogate for ekϩ1. This enabled the model parameters to be adapted to the individual despite the absence of Ts measurements.

APPENDIX D: SIMULATING NOISE-LADEN HR MEASUREMENTS
To simulate noise added to the HR data, we first obtained the error characteristics of the Gear S3 smartwatch (Samsung Electronics America) by comparing the HR measurements of the watch against those of a gold-standard chest-strap device (Polar H7; Polar Electro Oy) under different physical activities for a total of 83 h of recordings sampled at 15-s intervals. We binned the HR measurements from the chest-strap device into 10-beat intervals ranging from 70 to 190 beats/min, with each bin comprising HR values within 5 beats/min of the bin center (e.g., the bin centered at 70 beats/min included data ranging from 65 to 75 beats/min). We then computed the error distribution for each HR bin. Thus, we obtained 13 error distributions, one for each of the 13 HR bins (70, 80, . . . , 190, all in units of beats/min). Figure D1 shows the error characteristics of the Gear S3 smartwatch.
To add noise to the HR data, we started with the time profile of the measured HR for a given subject. Next, for each HR value from the time profile, we identified the corresponding HR bin, randomly Reported values are means with SD in parentheses. The absolute time difference between the measured Tc peak (max Tc) and the model-estimated Tc peak (maxTc), the absolute error between max Tc andTc at the same time point, and the absolute error between max Tc and maxTc are shown. *Study 1 was conducted under two different Ta conditions, whereas study 2 was conducted at 40°C. In both studies, the relative humidity was set to 40%. †Study 3 was conducted at a Ta of 30°C at a relative humidity of 50%. sampled an error value from the cumulative distribution function of the particular error distribution corresponding to that bin, and added that value to the HR data. For example, if the true HR value was 83 beats/min, we selected the error distribution for the HR bin centered at 80 beats/min, randomly sampled an error value from the cumulative distribution function of that error distribution (e.g., 3 beats/min), and added that value to the true HR value to obtain the noise-added value of 86 beats/min. We repeated this procedure for all HR values throughout the time profile to obtain one noise-added HR realization. For each time profile for a given subject, we performed 100 such realizations. Figure E1 shows the time profiles of the six model parameters for the subject shown in Fig. 3 (study 3, subject 3). Apart from the activity-related changes that affect all parameters, changes in the experimental condition affect parameters ␥ 2 and ␣4. Specifically, parameter ␥2, the rate of heat transfer from the core to the skin compartment, responds to the difference between Tc and Ts (see Mathematical model, Eqs. 2 and 3 in METHODS) and, hence, is affected slightly differently between precooling (T c -Ts~6.5°C initially) and no-cooling (Tc -Ts~7.0°C initially; see Fig. 3, C and D, in RESULTS) conditions. Figure E1E shows that ␥ 2 is slightly lower in the PCNF condition (~30 min from the beginning when the first Tc measurement is shown in Fig. 4) than in the NCNF condition (~15 min from the beginning when the first Tc measurement is shown in Fig. 4). Note, however, that in the other two conditions (PCWF and NCWF) there is no difference in the value of ␥2. Thus, although precooling affects ␥2, the effect is masked by variation in Ts between experimental conditions.

APPENDIX E: TIME PROFILES OF THE SIX ADJUSTABLE MODEL PARAMETERS FOR A SUBJECT FROM STUDY 3
The rate of sweat evaporation, ␣4, depends on the difference between the vapor pressures of water in the skin and the environment (Ps -Pa, see Eq. 3 in Mathematical model in METHODS). Although both factors are indirectly affected by airflow, the difference Ps -Pa was very low in these experimental conditions (because they depend on Ts and Ta, both of which were around 30°C). Hence, we expected only a slightly larger value of ␣4 in the fan airflow conditions (PCWF and NCWF) compared with the other two conditions, but the value was higher only in the PCWF condition (Fig. E1G). Figure F1 shows a scatterplot comparing the estimated Tc against the corresponding measurements across all 166 subjects and experi- mental conditions (42,026 data points from 247 time profiles). The Pearson's correlation coefficient between the measured and estimated T c was 0.72, indicating good agreement across the entire temperature range. Note that temporal information critical for continuously monitoring Tc during physical activity is lost in such a plot. Furthermore, the plot masks information about the ability of the model to be individualized to a subject and is only presented here to indicate overall agreement between T c estimates and Tc measurements.

␣1
Rate constant for the heart rate signal (h Ϫ1 ) ␣2 Thermoregulatory rate constant for the core body temperature signal (h Ϫ1 ) ␣3 Rate of convective heat transfer from the skin compartment to the environment (h Ϫ1 ) ␣4 Rate of heat loss to the environment due to sweat evaporation (h Ϫ1 ϫ°C/mmHg) Ac Activity level ranging from 0 (low) to 4 (high) ␤ Gain in heart rate due to physical activity (beats·min Ϫ1 ·h Ϫ1 ) C