About the Stage III data
(Last modified: 2/28/02)
The purpose of this write-up is to provide the DMIP participants with
some sense as to how Stage III precipitation estimates were produced, what
the known error sources and characteristics are, and what may be expected
when using Stage III data as precipitation forcing in hydrologic models.
The main ingredients of Stage III data are the Digital Precipitation
Array (DPA) products, operational hourly rain gauge data, and interactive
quality control by the Hydrometeorological Analysis and Service (HAS) forecasters
at the River Forecast Center (RFC). The DPA products, sometimes referred
to as the Hourly Digital Precipitation (HDP) products, are generated by
the Precipitation Processing Subsystem (PPS), which is one of many automatic
algorithms in the WSR-88D Radar Product Generator (RPG). For a description
of PPS, the reader is referred to Fulton et al. (1998). Even though it
has "precipitation" as its first name, PPS is designed to estimate rainfall
and rainfall only. As such, its products are of highly suspect quality
in times and areas of snowfall.
The DPA products are radar-only estimates of hourly accumulation of
rainfall on an approximately 4x4 km^2 rectilinear grid. This
grid, referred to as HRAP (Hydrologic Rainfall Analysis Project), is based
on the polar stereographic projection. It is a subset of the Limited Fine
Mesh (LFM) grid used by the Nested Grid Model (NGM) at the NWS National
Centers for Environmental Prediction (NCEP). For further details of this
mapping, the reader is referred to Greene and Hudlow (1982) and Reed and
Maidment (1999).
The accuracy of the DPA products is affected mostly by the following
factors: 1) how well the radar can see precipitation near the surface given
the sampling geometry of the radar beams and the reflectivity morphology
of the precipitating cloud, 2) how accurately the microphysical parameters
of the precipitation system are known (Z-R, hail cap, etc.), 3) how accurate
the radar hardware calibration is, and 4) various sampling errors in the
radar measurement of returned power (how many pulses per sampling volume,
how many scans per hour, beam width, etc.).
The first, known as the vertical profile of reflectivity (VPR) effect,
can introduce a factor of two (or lower) overestimation (where the radar
beam intercepts the bright band layer) and a factor of ten (or higher)
underestimation at far ranges of the radar (where the radar beam samples
ice particles rather than liquid precipitation) in well-developed stratiform
precipitation in the cool season. The following rule of thumb may be useful
in assessing the presence and spatial extent of the VPR effect in WSR-88D
precipitation estimation. The axis of the lowest radar beam (approximately
0.5-degree elevation angle) reaches the altitudes of 1, 2, 3, 4, 5 km at ranges
of approximately 60, 120, 160, 200, 230 km, respectively. Hence, if
the freezing level is at 2 km above the ground, one may expect bright band
enhancement at and around the range of 120 km (resulting in overestimation
of rainfall if the Z-R parameters are applicable to the surface rainfall,
which very often is not the case) and radar sampling of ice particles beyond
that range (resulting in severe underestimation of rainfall if the Z-R
parameters are applicable to the surface rainfall). Note that, at Oklahoma
City, the climatological freezing level is at or below 2 km in the months
of February and March, and at or below 3 km through May (Smith et al. 1997).
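The rule of thumb above can be checked roughly with the standard beam-height
formula. The following is a minimal sketch, assuming the common 4/3 effective
earth radius model for refraction and an antenna at ground level; neither
assumption comes from this write-up, so the numbers it prints will be close to,
but not identical with, the rounded values quoted above.

```python
import math

EARTH_RADIUS_KM = 6371.0
K_E = 4.0 / 3.0  # assumed effective-earth-radius factor (standard refraction)


def beam_height_km(range_km, elev_deg=0.5, antenna_height_km=0.0):
    """Height of the beam axis above ground at a given slant range."""
    a = K_E * EARTH_RADIUS_KM
    theta = math.radians(elev_deg)
    return (math.sqrt(range_km ** 2 + a ** 2
                      + 2.0 * range_km * a * math.sin(theta))
            - a + antenna_height_km)


# Ranges quoted in the rule of thumb
for r in (60, 120, 160, 200, 230):
    print(f"{r:4d} km range -> beam axis at {beam_height_km(r):.1f} km")
```

Comparing the printed heights against the local freezing level gives a quick
first guess at the range where bright band contamination may be expected.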
One of the more important changes in the production of DPA, related
to the sampling geometry of the radar beams, occurred in the spring of
1996 when bi-scan maximization (see Fulton et al. 1998 for details) in
PPS was essentially disabled. What that means is that DPAs after the spring
of 1996 suffer less from bright band contamination and are less range-dependent.
The net effect of this change on the overall quality of Stage III data
over the DMIP basins, however, is less clear because bi-scan maximization
tended to compensate, to an extent, for radar underestimation of rainfall
due to nonuniform vertical profile of reflectivity (Seo et al. 2000) and
inaccurate Z-R parameters. It is difficult to pinpoint the exact timing
of this change in the Stage III product (which is based on DPAs from many
sites: see below) because each radar is operated independently and hence
the timing of the change varies from site to site. For a summary of radar-only
and radar-gage evaluation of DPA products prior to the disabling of bi-scan
maximization, the reader is referred to Smith et al. (1996). For similar
analyses based on the DPA products since the disabling of bi-scan maximization,
the reader is referred to Smith et al. (1997).
As for the microphysical parameters, the Z-R relationship is the most
important. Initially, only the "convective" Z-R parameters were used: Z=300R^1.4.
Though they work well for deep convective precipitation systems, the convective
parameters underestimate rainfall, often severely, for other types of storms. In
1997, the "tropical" Z-R parameters, Z=250R^1.2, were added to
be used for hurricanes, tropical storms, small scale deep-saturated storms
fed by tropical oceanic moisture, etc. In December of 1999, the "stratiform"
Z-R parameters were also added to be used for general stratiform events
(Z=200R^1.6) and for winter stratiform events at sites east (Z=130R^2.0)
and west (Z=75R^2.0) of the continental divide. (The use of the
stratiform parameters does not intersect the DMIP simulation period, and
hence is not of direct interest here.) Loosely speaking, the tropical Z-R
produces about a factor of two more rainfall than the convective. It is
not known, however, whether there are any specific events through 1997
that have been identified as "tropical" based on post analysis.
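To get a feel for the difference between the convective and tropical
parameters, one can invert the Z-R power law for a few reflectivity values.
This is only a sketch: it ignores the hail cap and every other PPS processing
step, and the function name is made up for illustration. The ratio between the
two rain rates depends on the reflectivity, so "a factor of two" should be read
loosely.

```python
def rain_rate_mm_per_h(dbz, a, b):
    """Invert Z = a * R**b for R (mm/h), with reflectivity given in dBZ."""
    z_linear = 10.0 ** (dbz / 10.0)  # reflectivity factor in mm^6 m^-3
    return (z_linear / a) ** (1.0 / b)


# (a, b) pairs as quoted in the text above
ZR = {"convective": (300.0, 1.4), "tropical": (250.0, 1.2)}

for dbz in (30, 40, 50):
    conv = rain_rate_mm_per_h(dbz, *ZR["convective"])
    trop = rain_rate_mm_per_h(dbz, *ZR["tropical"])
    print(f"{dbz} dBZ: convective {conv:.1f} mm/h, "
          f"tropical {trop:.1f} mm/h, ratio {trop / conv:.2f}")
```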
Lack of radar calibration also had an effect on the quality of Stage
III data in the study area. It is known, for example, that KTLX (Twin Lakes,
OK) was biased low (or "cold" in the NEXRAD lingo) in the early years (at
least through 1995), resulting in rather significant underestimation, up
to a factor of two, of rainfall (Smith et al. 1996, Seo et al. 1999).
Whereas the errors described above affect many bins over a relatively
large area in more or less the same ways, the effects of sampling errors
are much more random and can vary from one HRAP bin to the next. Operational
experience with Stage III data is limited to the lumped models, for which
the effect of the sampling errors tends to average out. The effect of the
sampling errors in distributed modeling is largely unknown.
Another important source of error in the DPA product, which has been
fully identified only recently, is strictly computational. Due to the CPU
and RAM limitations in the "legacy" Radar Product Generator (RPG), PPS
uses two-byte integer (I*2) arithmetic (rather than four-byte, I*4). Inconsistencies were found in the
arithmetic that resulted in truncation, as opposed to rounding-off, of
rainfall amounts. The net effect of this bug (which was mostly fixed
in 2001) is minimal for most rainfall events. For long-lasting stratiform
events, however, the total loss of rainfall (due to not counting very small
amounts) can be rather significant (see http://hsp.nws.noaa.gov/oh/hrl/papers/2001mou/Mou01_PDF.html).
Also, it is estimated that this error is a large contributing factor to
the conditional bias seen in the DPA products, i.e., the smaller the rainfall
estimate in the DPA product is, the larger the bias (on the low side) relative
to the gauge rainfall (Seo et al. 1996).
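The effect of truncating, rather than rounding, many small increments can be
illustrated with a toy accumulation. This is not the PPS code: the storage unit
(0.1 mm), the increment values, and the function name are all hypothetical and
chosen only to show why light, long-lasting rain loses the most.

```python
def accumulate(increments, quantum, mode):
    """Sum increments after quantizing each one to a multiple of `quantum`.

    All values are non-negative integers in the same unit
    (here: hundredths of a millimetre, a made-up choice).
    """
    total = 0
    for inc in increments:
        if mode == "truncate":
            steps = inc // quantum                   # drop the remainder
        else:
            steps = (inc + quantum // 2) // quantum  # round half up
        total += steps * quantum
    return total


# Hypothetical light stratiform event: 48 small increments
increments = [3, 6, 9, 12] * 12  # hundredths of a millimetre
quantum = 10                     # hypothetical storage unit of 0.1 mm

print("true total:", sum(increments))
print("truncated: ", accumulate(increments, quantum, "truncate"))
print("rounded:   ", accumulate(increments, quantum, "round"))
```

With truncation, every increment smaller than the storage unit is lost
entirely, which is consistent with the low bias being largest for the smallest
rainfall amounts.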
Once the DPAs were transmitted to the RFC, they were fed into Stage
II. Stage II consisted basically of three algorithms: mean field bias adjustment,
gauge-only analysis and radar-gauge analysis. Stage II was run on a radar
site-by-radar site basis before its products were mosaicked in Stage III.
This practice, favored by the first designers of the system for computational
and programmatic reasons (Hudlow 1988), has significant drawbacks, as will
be explained shortly. Of the Stage II algorithms, the mean field bias adjustment
had by far the biggest quantitative impact. Note that, in terms of the
scatter plot between the gauge and the matching radar rainfall estimates,
the mean field bias adjustment has the effect of pulling the line of scatter
closer to the diagonal, and hence can greatly impact the catchment-wide
volume of water being estimated (see also Steiner et al. 1999).
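The basic idea of a mean field bias can be sketched as a single multiplicative
factor computed from gauge-radar pairs and applied to the whole radar field.
The operational algorithms cited in this write-up are more elaborate
estimators; the plain ratio of sums below is a simplification assumed here for
illustration only, and the function names are hypothetical.

```python
def mean_field_bias(gauge_mm, radar_mm):
    """Ratio of summed gauge rainfall to summed radar rainfall at the gauges."""
    radar_sum = sum(radar_mm)
    if radar_sum <= 0.0:
        return 1.0  # no radar rainfall at the gauges: leave the field unchanged
    return sum(gauge_mm) / radar_sum


def adjust_field(radar_field, bias):
    """Apply one multiplicative factor to every bin of a 2-D radar field."""
    return [[bias * value for value in row] for row in radar_field]


# Hypothetical hourly pairs: radar sees half of what the gauges report
gauges = [10.0, 20.0, 6.0]
radar_at_gauges = [5.0, 10.0, 3.0]
bias = mean_field_bias(gauges, radar_at_gauges)
print("bias:", bias)
print(adjust_field([[1.0, 2.0], [0.0, 4.0]], bias))
```

Because the same factor multiplies every bin, the adjustment changes the
catchment-wide volume but cannot reduce the bin-to-bin scatter.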
For mean field bias adjustment, the algorithm of Smith and Krajewski
(1991) was initially used. Operational experience, however, indicated that
the algorithm tended to significantly undercorrect, so that the mean field
bias-adjusted rainfall estimates remained significantly too low. After a period
of redevelopment (Seo et al. 1997, Anagnostou et al. 1998), the algorithm
was replaced by that of Seo et al. (1997) in the late spring/early summer of 1997.
Initially, gauge-only and radar-gauge analyses were carried out, respectively,
by the reciprocal (or inverse) distance-squared method and by a scheme
that performed linear weighted averaging, at each bin, of the gauge analysis
estimate and the matching raw radar rainfall estimate. Based on operational
experience, these algorithms were replaced in the spring of 1996 by the
kriging-like and cokriging-like algorithms, respectively, of Seo (1996a, 1996b).
It is important to note that the quality (or lack thereof) of gauge-only
analysis figures prominently in the quality of the Stage III data, particularly
in the early days of NEXRAD when not all WSR-88Ds were in place and DPAs
were frequently missing due primarily to communication problems.
If the DPA was missing for a WSR-88D umbrella, the gauge-only
analysis field was used for that site in the Stage III mosaicking process.
Gauge-only analysis at the hourly scale, however, is sensitive to the
number of available real-time hourly gauge reports as well as their spatial
configuration (which varies from analysis time to analysis time). Note,
for example, that, if there are no gauge data over a large area, the gauge-only
analysis algorithm has no choice, in the absence of radar data, but to
assume no rainfall. As such, depending on the density of the real-time
hourly gauge network in the area, the gauge-only analysis could significantly
underestimate rainfall, due primarily to lack of detection of precipitation
(i.e. by sparse gauges). An opposite problem of sorts also existed with the initial
gauge-only analysis algorithm (used up to the spring of 1996), which could
produce significant overestimates because the assumed radius of influence
was too large. It is also worth noting that not all gauges that are available
now were available in the earlier years of NEXRAD. As such, the quality
of Stage II/III products suffered not only from frequently missing DPAs
but also from fewer rain gauge data to work with.
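Both failure modes of the initial gauge-only analysis can be seen in a small
sketch of inverse distance-squared interpolation. The function below is an
assumed, simplified form (a hard cutoff at a radius of influence, coordinates
in arbitrary grid units), not the operational code.

```python
def idw_gauge_only(x, y, gauges, radius):
    """Inverse distance-squared estimate at (x, y).

    gauges: list of (gx, gy, value) tuples.
    Returns 0.0 when no gauge lies within `radius`, mirroring the
    "assume no rainfall" behaviour described above.
    """
    num = 0.0
    den = 0.0
    for gx, gy, value in gauges:
        d2 = (gx - x) ** 2 + (gy - y) ** 2
        if d2 > radius ** 2:
            continue          # gauge is outside the radius of influence
        if d2 == 0.0:
            return value      # the bin coincides with a gauge
        weight = 1.0 / d2
        num += weight * value
        den += weight
    return num / den if den > 0.0 else 0.0


one_gauge = [(0.0, 0.0, 10.0)]
# Large radius: a single wet gauge spreads its full value far away
print(idw_gauge_only(3.0, 4.0, one_gauge, radius=10.0))
# Small radius: no gauge in reach, so the analysis reports no rain
print(idw_gauge_only(3.0, 4.0, one_gauge, radius=4.0))
```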
Radar-gauge analysis does not have as great a quantitative impact as
the mean field bias adjustment. In the context of the scatter plot of gauge
and matching radar data, the role of radar-gauge merging is primarily to
reduce the scatter (as opposed to pulling the line of scatter toward the
45-degree line). Because the merging algorithm assumes (see Seo 1996b for details)
that the mean field bias-adjusted radar rainfall estimates are bias-free
(not only in the amount given that radar successfully detected rainfall,
but also in the radar detection of rainfall), the quality of radar-gauge
analysis depends directly on the quality of mean field bias adjustment.
This has two large consequences. The first is that, in the early years
of NEXRAD when the initial mean field bias adjustment algorithm tended
to significantly undercorrect the bias, the radar-gauge estimates (and hence
the Stage III data, which are the mosaic of Stage II data) also tended
to underestimate. The other is that, at far ranges from the radar where
beam overshooting occurs (i.e. where the radar beam overshoots the cloud top,
thus failing to detect precipitation) particularly for low-topped precipitation
systems in the cool season, the radar-gauge estimates necessarily tend
to underestimate.
In the summer of 1996, ABRFC implemented a local bias adjustment algorithm
called Process 1 (P1). P1 calculates HRAP bin-specific ratios of gauge-to-radar
rainfall at gauge locations, and performs spatial interpolation of the
ratios based on triangulation of gauge locations (Young et al. 2000, Seo
and Breidenbach 2002). Because it relies exclusively on the data from the
current hour for the adjustment, P1 is susceptible to sampling (both in
space and time) errors, but works well in relatively uniform widespread
cool-season precipitation (for which Stage III is known to perform poorly
due, in large part, to the "adjust-and-mosaic," as opposed to "mosaic-and-adjust,"
strategy of data processing: see below). The general practice at ABRFC
has been that Stage III is preferred in the warm season and P1 in the cool
season. For a comparative analysis of Stage III and P1 products, the reader
is referred to Young et al. (2000).
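The idea of interpolating gauge-to-radar ratios over a triangulation can be
sketched for a single triangle of gauges using barycentric weights. This is an
assumed reading of "interpolation based on triangulation", not the P1 source
code; the function names and the sample values are hypothetical, and gauges
with zero radar rainfall would need special handling that is omitted here.

```python
def barycentric_weights(p, a, b, c):
    """Weights of point p with respect to triangle vertices a, b, c."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def local_bias(p, triangle):
    """Interpolate gauge/radar ratios to point p inside one gauge triangle.

    triangle: three (location, gauge_mm, radar_mm) tuples with radar_mm > 0.
    """
    locations = [loc for loc, _, _ in triangle]
    ratios = [gauge / radar for _, gauge, radar in triangle]
    weights = barycentric_weights(p, *locations)
    return sum(w * q for w, q in zip(weights, ratios))


triangle = [((0.0, 0.0), 5.0, 5.0),   # ratio 1.0
            ((4.0, 0.0), 8.0, 4.0),   # ratio 2.0
            ((0.0, 4.0), 6.0, 2.0)]   # ratio 3.0
print(local_bias((2.0, 0.0), triangle))  # halfway between the first two gauges
```

Because each ratio comes from a single gauge and a single hour, one bad gauge
report or one poorly sampled radar bin changes the adjustment over the whole
triangle, which is the sampling sensitivity noted above.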
Once Stage II is run for all WSR-88D sites in the RFC service area and
vicinity, the Stage II products (typically the radar-gauge merging estimates)
are mosaicked in Stage III. The sole mosaicking rule used initially in
Stage III was simple arithmetic averaging: average all estimates from all
radars in the coverage overlap, and that is your Stage III estimate. This
scheme often had grievous consequences in that it made good estimates from
a close-in radar bad by mixing them with poor estimates at far ranges from
another radar. To ameliorate the situation within the constraints of the
software, another mosaicking rule was added in the early summer of 1997,
which took the maximum among all estimates in the coverage overlap as the
best estimate.
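The two mosaicking rules can be written down for a single HRAP bin as follows.
This is a minimal sketch with a hypothetical function name; `None` is used here
to mark a radar that has no estimate for the bin.

```python
def mosaic(estimates, rule="average"):
    """Combine per-radar estimates for one HRAP bin.

    estimates: list of values (mm), with None where a radar has no coverage.
    rule: "average" (the initial Stage III rule) or "maximum" (added later).
    """
    valid = [value for value in estimates if value is not None]
    if not valid:
        return None
    if rule == "average":
        return sum(valid) / len(valid)
    if rule == "maximum":
        return max(valid)
    raise ValueError(f"unknown mosaicking rule: {rule}")


# A close-in radar sees 12 mm; a distant radar overshoots and sees only 2 mm
bin_estimates = [12.0, 2.0, None]
print(mosaic(bin_estimates, "average"))
print(mosaic(bin_estimates, "maximum"))
```

The example shows how averaging degrades the close-in estimate, while taking
the maximum keeps it. The maximum rule, however, will also keep any spurious
high value from either radar.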
Because of the variety of the sources of error in radar-based/-aided
precipitation estimation, the Hydrometeorological Analysis and Service
(HAS) forecasters play a critical role in improving the quality and accuracy
of Stage III data. The primary tool used for this man-machine interaction
was the Stage III Graphical User Interface (GUI). P1 has its own GUI with
which the forecaster could also "create snow." In the initial Stage III
GUI, the forecasters could, for example, add/remove gauge observations
or ignore radar data, and rerun the analysis algorithms. In the spring
of 1999, the "draw-in precipitation" function was added to Stage III, which
allowed manual addition and subtraction of precipitation amounts over the
user-specified area.
The role of the HAS forecasters is particularly important in quality-controlling
rain gauge data. Real-time hourly rain gauge data are subject to all kinds
of errors (see, e.g., Steiner et al. 1999), and it is well known that an
alarmingly large fraction of all observations that come in to the RFC is
unusable. Also, because the majority of the gauges are not heated in the
winter (and hence purposefully blocked out by the RFC), gauge-aided precipitation
estimates from Stage II in winter may not necessarily be much of an improvement
from the radar-only DPA estimates.
Based on the several years of operational experience with Stage II/III,
much of the software was overhauled in 2000 and redeveloped into the Multisensor
Precipitation Estimator (MPE). At ABRFC, MPE has been running since March
2001 (along with Stage II/III and P1). Because the historical Stage III
data used for the current phase of DMIP do not intersect the MPE era, a
detailed description of MPE is not of direct interest here. We only list
the key features of MPE: "mosaic-and-adjust" (rather than "adjust-and-mosaic")
strategy, "rational" mosaicking based on data-driven delineation of the
effective coverage of the radar (Breidenbach et al. 1999), improved mean
field bias adjustment (Seo and Breidenbach 2001), ingestion of satellite
data-derived precipitation estimate (Fortune et al. 2002), implementation
of local bias correction (Seo and Breidenbach 2002), and ordinary (as opposed
to simple) kriging and cokriging-like gauge-only and radar-gauge analysis
algorithms to improve unbiasedness.
Because the quantitative use of Stage III data at the RFC has been limited
to lumped models for rather large basins, by far the biggest problem at
the RFC with the Stage III data has been the systematic (mostly on the
low side) biases, particularly in the earlier days of NEXRAD (say from
1993 through mid-1997). Indeed, many of the recent changes and improvements
have been primarily to reduce systematic biases in the Stage III data.
What this means, in terms of error statistics, is that the efforts thus
far have been geared more toward reducing the mean error (ME) and conditional
(on the precipitation amount) ME at the basin scale, rather than reducing
the root mean square error (RMSE) at the HRAP scale. Note that unbiasedness
matters particularly acutely at the RFCs, where the hydrologic model is
run in a continuous mode, and hence even a relatively small bias in precipitation
forcing can result, after some duration, in unrealistic drying up of the
model soil moisture.
Because all hourly rain gauge data that were available in real time
had already been used in the generation of Stage III data, it is difficult
to assess, at the hourly scale, the accuracy of the Stage III product via
independent validation. Independent validation at the daily scale, nevertheless,
should be possible based on daily observations that were not reported in
real time. Such an evaluation of the Stage III data, however, is beyond
the scope of this phase of DMIP, and will be explored as a future endeavor.
To assess systematic biases in Stage III data, a number of studies have
been carried out (Johnson et al. 1999, Stellman et al. 2000, Wang et al.
2000) to compare the Stage III data-derived mean areal precipitation estimate
(referred to as "MAPX") with the rain gauge data-derived mean areal precipitation
estimate (referred to as "MAP") on a long-term scale. The general finding
for the Illinois basins is that, overall, MAPX is about 5 to 10 percent
lower than MAP, and, as one would expect, the magnitude of the bias varies from
basin to basin, from season to season, and from period to period (particularly
those associated with use of Stage III only and both P1 and Stage III).
On the event scale, the general experience with Stage III data is that,
in certain periods (particularly in the early years), Stage III data are subject
to much larger biases, which may potentially render some comparisons essentially
meaningless: for some events, the error bound in the streamflow simulation
due to the error in the precipitation data may be larger than the inter-model
differences in streamflow simulation due to the model errors.
Because of the wide spectrum of error sources and algorithm changes,
it is difficult to identify, at the event scale, which errors and algorithm
changes may be affecting the accuracy of the Stage III estimates and how.
Even if they could be identified and their effects qualitatively assessed,
it is not possible, without independent validation using high-quality rain
gauge data, to quantify the magnitude of the errors in the Stage III data.
It is possible, however, to gain some sense of event-specific volumetric
bias that may be present in the Stage III data based on the streamflow
observations. For example, one may run the hydrologic model of choice many
times using different adjustment factors to the Stage III data until the
resulting simulated hydrograph is reasonably close, at least in the volumetric
sense, to the observed. Obviously, the resulting bias estimate, representing
the bias in the Stage III data aggregated at the space and time scales
of the basin and the basin response, respectively, is subject to model
errors and uncertainties in the initial conditions, and hence must be interpreted
with due caution (much more so in the model warm-up period). Nevertheless, in
the absence of any direct evidence (in the form of high-quality rain gauge
data), such inference may be the only way to get a glimpse of the magnitude of
the first-order errors in the Stage III data at the event scale of temporal
aggregation.
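The trial-and-error procedure described above can be automated with a simple
search over the adjustment factor. In the sketch below, the "model" is a toy
stand-in (rain above a constant loss becomes runoff) assumed purely for
illustration; in practice it would be replaced by the participant's own
hydrologic model, and the search assumes that simulated volume increases with
the factor.

```python
def simulated_volume(precip_mm, factor, loss_mm_per_step=1.0):
    """Toy stand-in for a hydrologic model: runoff is rain above a fixed loss."""
    return sum(max(0.0, factor * p - loss_mm_per_step) for p in precip_mm)


def estimate_bias_factor(precip_mm, observed_volume_mm,
                         model=simulated_volume, lo=0.1, hi=10.0, tol=1e-6):
    """Bisection on the precipitation adjustment factor.

    Finds the factor for which the simulated event volume matches the
    observed one, assuming volume grows monotonically with the factor.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model(precip_mm, mid) < observed_volume_mm:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical event: hourly basin-average Stage III rainfall (mm)
stage3 = [0.0, 2.0, 5.0, 3.0, 0.0]
observed = 12.0  # hypothetical observed event runoff volume (mm)
print(f"estimated factor: {estimate_bias_factor(stage3, observed):.2f}")
```

A factor above 1 suggests that the precipitation forcing is too low for that
event, subject to the caveats about model error and initial conditions stated
above.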
Such an exercise, based on the Sacramento model-unit hydrograph combination
in the lumped mode, was carried out for TIFM7, WTTO2 and BLUO2 in the context
of variational assimilation, which produces bias estimates in precipitation
forcing as a by-product (see Seo et al. 2002 for details). The event-specific
bias estimates ranged from 0.86 to 2.14 for TIFM7, 0.83 to 1.39 for WTTO2,
and 0.85 to 1.68 for BLUO2. It is also seen that, for TIFM7, the Stage
III data in the first year or so are of highly suspect quality and should be
treated with skepticism, and that, for BLUO2, a consistent and significant low
bias exists in the Stage III data well into 1996.
Because many of the error sources are tied to the sampling geometry of radar
(and to that of gauges to some extent), very often, visualizing
Stage III data (say, at the temporal scale of aggregation of a day)
over the entire domain offers very good clues as to the kinds of
errors that the Stage III data may be subject to. As such, the DMIP
participants are encouraged to visually examine the Stage III data
(e.g., at http://www.abrfc.noaa.gov/archive) associated with significant
flood events for signs of artifacts and anomalies.
Obviously, the event-specific bias estimates described above (even if
they are in the ball park) shed little light on the magnitude of error
at a finer scale (say, at the HRAP and hourly scales). The hope is that,
given that unbiasedness at a larger scale is a necessary condition for
that at a smaller scale, such estimates may offer some guidance as to how
much stock one may put in the model calibration and/or intercomparison
results at a smaller scale.
In summary, due to a variety of error sources (sampling-geometrical,
reflectivity-morphological, microphysical, sampling by sparse rain gauges,
algorithm changes, etc.), the Stage III data are subject to systematic
errors that may vary over various time scales (a storm scale, an intra-storm
scale, seasonal, etc.). As such, care must be exercised in accepting and
interpreting the model simulation results. The participants are also strongly
encouraged to visually examine the Stage III data and to perform, e.g.,
sensitivity analysis to help gauge the magnitude of error that may be present
in the Stage III data.
REFERENCES
Anagnostou, E. N., W. F. Krajewski, D.-J. Seo, and E. R. Johnson, 1998:
Mean-field rainfall bias studies for WSR-88D. J. Hydrol. Eng., 3(3), 149-159.
Breidenbach, J. P., D.-J. Seo, P. Tilles, and K. Roy, 1999: Accounting
for radar beam blockage patterns in radar-derived precipitation mosaics
for River Forecast Centers, Preprints, 15th Conf. on IIPS, Amer.
Meteorol. Soc., 5.22, Dallas, TX.
Fortune, M. A., J. P. Breidenbach, and D.-J. Seo, 2002: Integration
of bias corrected, satellite-based estimates of precipitation into AWIPS
at River Forecast Centers, Preprints, Int. Symp. on AWIPS, Amer. Meteorol.
Soc., J7.4, Orlando, FL.
Fulton, R. A., J. P. Breidenbach, D.-J. Seo, D. A. Miller, and T. O'Bannon, 1998: WSR-88D
rainfall algorithm. Wea. Forecasting, 13, 377-395.
Greene, D. R. and M. D. Hudlow, 1982: Hydrometeorologic grid mapping
procedures. AWRA Int. Symp. on Hydrometeor. June 13-17, Denver, CO. (available
upon request from NWS/HL)
Hudlow, M. D., 1988: Technological development in real-time operational
hydrologic forecasting in the United States. J. Hydrol., 102, 69-92.
Johnson, D., M. Smith, V. Koren, and B. Finnerty, 1999: Comparing mean
areal precipitation estimates from NEXRAD and rain gauge networks. J. Hydrol.
Eng., 4(2), 117-124.
Reed, S. M., and D. R. Maidment, 1999: Coordinate transformations for
using NEXRAD data in GIS-based hydrologic modeling. J. Hydrol. Eng., 4,
174-183.
Seo, D.-J., and J. P. Breidenbach, 2002: Real-time correction of spatially
nonuniform bias in radar rainfall data using rain gauge measurements. To
appear in J. Hydrometeor.
Seo, D.-J., R. A. Fulton, and J. P. Breidenbach, 1997: Final report
for Interagency MOU among the NEXRAD Program, WSR-88D OSF and NWS/OH/HRL,
NWS/OH/HRL, Silver Spring, MD. (Available upon request from NWS/HL)
Seo, D.-J., 1998b: Real-time estimation of rainfall fields using radar
rainfall and rain gauge data. J. Hydrol., 208, 37-52.
Seo, D.-J., J. P. Breidenbach, and E. R. Johnson, 1999: Real-time estimation
of mean field bias in radar rainfall data. J. Hydrol., 131-147.
Seo, D.-J., V. Koren, and N. Cajina, 2002: Real-time variational assimilation
of hydrologic and hydrometeorological data into operational hydrologic
forecasting. Submitted to J. Hydrometeor. (available upon request from
NWS/HL)
Seo, D.-J., J. P. Breidenbach, R. A. Fulton, D. A. Miller, and T. O'Bannon,
2000: Real-time adjustment of range-dependent bias in WSR-88D rainfall
data due to nonuniform vertical profile of reflectivity. J. Hydrometeor.,
1(3), 222-240.
Smith, J. A., and W. F. Krajewski, 1991: Estimation of the mean field
bias of radar rainfall estimates. J. Appl. Meteor., 30, 397-412.
Smith, J. A., D.-J. Seo, M. L. Baeck, and M. D. Hudlow, 1996: An intercomparison
study of NEXRAD precipitation estimates. Water Resour. Res., 32, 2035-2045.
Smith, J. A., M. L. Baeck, and M. Steiner, 1997: Hydrometeorological
assessment of the NEXRAD rainfall algorithms. Final report to NOAA/NWS/OH/HRL,
Dept. of Civil Eng. and Oper. Res., Princeton Univ., Princeton, NJ. (available
upon request from NWS/HL)
Steiner, M. J., J. A. Smith, S. J. Burges, C. V. Alonso, and R. W. Darden,
1999: Effect of bias adjustment and rain gauge data quality control on
radar rainfall estimation. Water Resour. Res., 35, 2487-2503.
Stellman, K. M., H. E. Fuelberg, R. Garza, and M. Mullusky, An examination
of radar- and rain gauge-derived mean areal precipitation over Georgia
watersheds, Weather and Forecasting, 16(1), 133-144.
Wang, D., M. B. Smith, Z. Zhang, S. Reed, and V. Koren, 2000: Statistical
comparison of mean areal precipitation estimates from WSR-88D, operational
and historical gauge networks. Preprints, 15th Conf. on Hydrol.,
Amer. Meteor. Soc., Long Beach, CA, 107-110.
Young, C. B., A. A. Bradley, W. F. Krajewski and A. Kruger, 2000: Evaluating
NEXRAD multisensor precipitation estimates for operational hydrologic forecasting.
J. Hydrometeor., 1, 241-254.