dwww Home | Manual pages | Find package

r.fill.stats(1grass)        GRASS GIS User's Manual       r.fill.stats(1grass)

NAME
       r.fill.stats   -  Rapidly fills ’no data’ cells (NULLs) of a raster map
       with interpolated values (IDW).

KEYWORDS
       raster, surface, interpolation, IDW, no-data filling

SYNOPSIS
       r.fill.stats
       r.fill.stats --help
       r.fill.stats [-mkwus] input=name output=name  [uncertainty=name]   dis-
       tance=value  mode=name   [minimum=value]   [maximum=value]  power=value
       cells=value  [--overwrite]  [--help]  [--verbose]  [--quiet]  [--ui]

   Flags:
       -m
           Interpret distance as map units, not number of cells

       -k
           Keep (preserve) original cell values
           By default original values are smoothed

       -w
           Just print the spatial weights matrix

       -u
           Just print estimated memory usage

       -s
           Single precision floating point output

       --overwrite
           Allow output files to overwrite existing files

       --help
           Print usage summary

       --verbose
           Verbose module output

       --quiet
           Quiet module output

       --ui
           Force launching GUI dialog

   Parameters:
       input=name [required]
           Raster map with data gaps to fill

       output=name [required]
           Name of result output map

       uncertainty=name
           Name of uncertainty output map

       distance=value [required]
           Distance threshold (default: in cells) for interpolation
           Default: 3

       mode=name [required]
           Statistic for interpolated cell values
           Options: wmean, mean, median, mode
           Default: wmean

       minimum=value
           Minimum input data value to include in interpolation

       maximum=value
           Maximum input data value to include in interpolation

       power=value [required]
           Power coefficient for IDW interpolation
           Default: 2.0

       cells=value [required]
           Minimum number of data cells within search radius
           Default: 8

DESCRIPTION
       r.fill.stats is a module for fast gap filling and  interpolation  (with
       smoothing) of dense raster data.

       The  r.fill.stats  module is capable of quickly filling small data gaps
       in large and high-resolution raster maps. It’s primary  purpose  is  to
       improve  high-resolution,  rasterized sensor data (such as Lidar data).
       As a rule of thumb, there should be at least  as  many  data  cells  as
       there  are  "no data" (NULL) cells in the input raster map. Several in-
       terpolation modes are available. By default, all values  of  the  input
       raster  map  will  be  replaced with locally interpolated values in the
       output raster map. This is the equivalent of running a low-pass smooth-
       ing  filter  on  the interpolated data and is often desirable, owing to
       noisy nature of high-resolution sensor data. With dense data and  small
       gaps,  this  should  result in only slight deviations from the original
       data and produce smooth output. Alternatively, setting the -k flag will
       disable the low-pass filter and preserve the original cell values.

       Available gap filling modes:

           •   spatially weighted mean (default)

           •   simple mean

           •   simple median

           •   simple mode

       The  spatially  weighted  mean  is  equivalent  to  an Inverse Distance
       Weighting (IDW; see also r.surf.idw) interpolation. The simple mean  is
       equivalent  to  a simple low-pass filter.  Median and mode replacements
       can also be achieved using r.neighbors.

       Note that r.fill.stats is highly optimized for fast processing of large
       raster  datasets  with  a  small interpolation distance threshold (i.e.
       closing small gaps). As a trade-off for speed and a small  memory  foot
       print,  some  spatial  accuracy is lost due to the rounding of the dis-
       tance threshold to the closest approximation in input raster cells  and
       the use of a matrix of precomputed weights at cell resolution (see fur-
       ther down for details). Note also that processing  time  will  increase
       exponentially with higher distance settings. Cells outside the distance
       threshold will not be interpolated, preserving the edges of the  origi-
       nal data extent.

       This  module is not well suited for interpolating sparse input data and
       closing large gaps. For such  purposes  more  appropriate  methods  are
       available, such as v.surf.idw or v.surf.rst.

       Applications  where the properties of r.fill.stats are advantageous in-
       clude the processing of high resolution, close range scanning  and  re-
       mote  sensing  datasets.  Such datasets commonly feature densely spaced
       measurements that have some gaps  after  rasterization,  due  to  blind
       spots,  instrument  failures, and misalignments between the GIS’ raster
       cell grids and the original  measurement  locations.  In  these  cases,
       r.fill.stats  should  typically  be  run using the "weighted mean" (de-
       fault) method and with a small distance setting (the default is to  use
       a search radius of just three cells).

       The  images below show a gradiometer dataset with gaps and its interpo-
       lated equivalent, produced using the spatially weighted  mean  operator
       (mode="wmean").

       In  addition, r.fill.stats can be useful for raster map generalization.
       Often, this involves removing small clumps  of  categorized  cells  and
       then  filling  the resulting data gaps without "inventing" any new val-
       ues. In such cases, the "mode"  or  "median"  interpolators  should  be
       used.

   Usage
       The  most  critical  user-provided  settings  are the interpolation/gap
       filling  method  (mode)  and  the  maximum  distance,   up   to   which
       r.fill.stats  will scan for cells with values (distance).  The distance
       can be expressed as a number of cells (default) or in the current GRASS
       location’s  units  (if  the -m flag is given). The latter are typically
       meters, but can be any other units of a planar coordinate system.

       Note that proper handling of geodetic coordinates  (lat/lon)  and  dis-
       tances  is  currently  not  implemented. For lat/lon data, the distance
       should therefore be specified in cells and usage of r.fill.stats should
       be restricted to small areas to avoid large inaccuracies that may arise
       from the different extents of cells along the latitudinal and  longitu-
       dinal axes.

       Distances  specified in map units (-m flag) will be approximated as ac-
       curately as the current region’s cell resolution settings  allow.   The
       program will warn if the distance cannot be expressed as whole cells at
       the current region’s resolution. In such case, the number of  cells  in
       the search window will be rounded up. Due to the rounding effect intro-
       duced by using cells as spatial units, the actual maximum distance con-
       sidered  by  the interpolation may be up to half a cell diagonal larger
       than the one specified by the user.

       The interpolator type "wmean" uses  a  precomputed  matrix  of  spatial
       weights  to  speed up computation. This matrix can be examined (printed
       to the screen) before running the  interpolation,  by  setting  the  -w
       flag.  In  mode "wmean", the power option has the usual meaning in IDW:
       higher values mean that cell values in the neighborhood lose their  in-
       fluence  on  the  cell  to be interpolated more rapidly with increasing
       distance from the latter. Another way of explaining this effect  is  to
       state  that larger "power" settings result in more localized interpola-
       tion, smaller ones in more globalized interpolation.  The default  set-
       ting is power=2.0.

       The  interpolators  "mean", "median" and "mode" are calculated from all
       cell values within the search radius. No spatial weighting  is  applied
       for  these  methods.  The  "mode"  of  the input data may not always be
       unique. In such case, the mode will be  the  smallest  value  with  the
       highest frequency.

       Often,  input  data will contain spurious extreme measurements (spikes,
       outliers, noise) caused by the limits of device  sensitivity,  hardware
       defects,  environmental  influences, etc. If the normal, valid range of
       input data is known beforehand, then the minimum  and  maximum  options
       can  be  used  to  exclude  those input cells that have values below or
       above that range, respectively. This  will  prevent  the  influence  of
       spikes and outliers from spreading through the interpolation.

       Unless the -k (keep) flag is given, data cells of the input map will be
       replaced with interpolated values instead of  preserving  them  in  the
       output. In modes "wmean" and "mean", this results in a smoothing effect
       that includes the original data (see below)!

       Besides the result of the interpolation/gap filling,  a  second  output
       map  can  be  specified  via the uncertainty option. The cell values in
       this map represent a simple measure of how  much  uncertainty  was  in-
       volved in interpolating each cell value of the primary output map, with
       "0" being the lowest and "1" being the theoretic  highest  uncertainty.
       Uncertainty  is  measured  by  summing up all cells in the neighborhood
       (defined by the search radius distance) that contain a value in the in-
       put  map,  multiplied  by their weights, and dividing the result by the
       sum of all weights in the neighborhood.   For  mode=wmean,  this  means
       that  interpolated output cells that were computed from many nearby in-
       put cells have very low uncertainty  and  vice  versa.  For  all  other
       modes,  all weights in the neighborhood are constant "1" and the uncer-
       tainty measure is a simple measure of how many input  data  cells  were
       present in the search window.

   Smoothing
       The  r.fill.stats  module  uses  the  interpolated values to adjust the
       original values and create a smooth surface, which is akin to running a
       low-pass  filter on the data. Since most high-resolution sensor data is
       noisy, this is normally a desired effect and results in output that  is
       more  suitable  for  deriving secondary products, such as slope, aspect
       and curvature maps. Larger settings for the  search  radius  (distance)
       will  result in a stronger smoothing. In practice, some experimentation
       with different settings for distance and power  might  be  required  to
       achieve  good  results. In some cases (e.g. when dealing with low-noise
       or classified data), it might be desirable to turn off  data  smoothing
       by  setting the -k (keep) flag. This will ensure that the original cell
       data is copied through to the result map without alteration.

        Effect of smoothing the original data: The top row shows a  gap-filled
       surface  computed from a rasterized Lidar point cloud (using mode=wmean
       and power=2), and the derived  slope,  aspect,  and  profile  curvature
       maps.  The  smoothing  effect is clearly visible.  The bottom row shows
       the effect of setting the -k flag: Preserving the original cell  values
       in  the interpolated output produces and unsmoothed, noisy surface, and
       likewise noisy derivative maps.  The effect can be seen in  the  illus-
       tration  above: Slope, aspect, and profile curvature are computed using
       the r.slope.aspect module, which uses a window  (kernel)  for  computa-
       tions that considers only the immediate neighborhood of each cell. When
       performed on noisy data, such local operations result in equally  noisy
       derivatives  if the original data is preserved (by setting the -k flag)
       and no smoothing is performed.

       (Note that the effects of noisy data can also be avoided by using  mod-
       ules  that are not restricted to minimal kernel sizes. For example, as-
       pect and other  morphometric  parameters  can  be  computed  using  the
       r.param.scale  module  which operates with variable-size cell neighbor-
       hoods.)

   Spatial weighting scheme
       The key to getting good gap filling results is to understand  the  spa-
       tial weighting scheme used in mode "wmean". The weights are precomputed
       and assigned per cell within the search window centered on the location
       at  which  to  interpolate/gap fill all cells within the user-specified
       distance.

       The illustration below shows  the  precomputed  weights  matrix  for  a
       search distance of four cells from the center cell:
       000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
       000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
       000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
       000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
       000.09 000.22 000.42 000.68 001.00 000.68 000.42 000.22 000.09
       000.07 000.19 000.37 000.56 000.68 000.56 000.37 000.19 000.07
       000.04 000.13 000.25 000.37 000.42 000.37 000.25 000.13 000.04
       000.01 000.06 000.13 000.19 000.22 000.19 000.13 000.06 000.01
       000.00 000.01 000.04 000.07 000.09 000.07 000.04 000.01 000.00
       Note  that  the weights in such a small window drop rapidly for the de-
       fault setting of power=2.

       If the distance is given in map units (flag -m), then the search window
       can  be  modeled  more  accurately  as a circle. The illustration below
       shows the precomputed weights for a distance in map units that  is  ap-
       proximately equivalent to four cells from the center cell:
       ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......
       ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
       ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
       000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
       000.00 000.09 000.28 000.58 001.00 000.58 000.28 000.09 000.00
       000.00 000.07 000.22 000.44 000.58 000.44 000.22 000.07 000.00
       ...... 000.02 000.11 000.22 000.28 000.22 000.11 000.02 ......
       ...... 000.00 000.02 000.06 000.09 000.06 000.02 000.00 ......
       ...... ...... ...... 000.00 000.00 000.00 ...... ...... ......

       When  using  a  small  search radius, cells must also be set to a small
       value. Otherwise, there may not be enough cells with  data  within  the
       search radius to support interpolation.

NOTES
       The  straight-line metric used for converting distances in map units to
       cell numbers is only adequate for planar coordinate systems. Using this
       module with lat/lon input data will likely give inaccurate results, es-
       pecially when interpolating over large geographical areas.

       If the distance is set to a relatively  large  value,  processing  time
       will  quickly approach and eventually exceed that of point-based inter-
       polation modules such as v.surf.rst.

       This module can handle cells with different X and Y resolutions.   How-
       ever,  note  that  the weight matrix will be skewed in such cases, with
       higher weights occurring close to the center and along  the  axis  with
       the  higher resolution. This is because weights on the lower resolution
       axis are  less  accurately  calculated.  The  skewing  effect  will  be
       stronger  if  the  difference  between  the  X and Y axis resolution is
       greater and a larger "power" setting is chosen. This  property  of  the
       weights  matrix  directly reflects the higher information density along
       the higher resolution axis.

       Note on printing the weights matrix (using the  -w  flag):  the  matrix
       cannot be printed if it is too large.

       The  memory  estimate  provided  by the -u flag is a lower limit on the
       amount of RAM that will be needed.

       If the -s flag is set, floating point type output will be  saved  as  a
       "single  precision"  raster map, saving ~50% disk space compared to the
       default "double precision" output.

EXAMPLES
   Gap-filling of a dataset using spatially weighted mean (IDW)
       Gap-fill a dataset using spatially weighted mean (IDW)  and  a  maximum
       search  radius  of  3.0  map units; also produce uncertainty estimation
       map:
       r.fill.stats input=measurements output=result dist=3.0 -m mode=wmean uncertainty=uncert_map
       Run a fast low-pass filter (replacement all cells with  mean  value  of
       neighboring cells) on the input map:
       r.fill.stats input=measurements output=result dist=10 mode=mean
       Fill data gaps in a categorized raster map; preserve existing data:
       r.fill.stats input=categories output=result dist=100 -m mode=mode -k

   Lidar point cloud example
       Inspect  the  point density and determine the extent of the point cloud
       using the r.in.lidar module:
       r.in.lidar -e input=points.las output=density method=n resolution=5 class_filter=2
       Based on the result, set computational region extent and desired  reso-
       lution:
       g.region -pa raster=density res=1
       Import the point cloud as raster using binning:
       r.in.lidar input=points.las output=ground_raw method=mean class_filter=2
       Check that there are more non-NULL cells than NULL ("no data") cells:
       r.univar map=ground_raw
       total null and non-null cells: 2340900
       total null cells: 639184
       ...
       Fill in the NULL cells using the default 3-cell search radius:
       r.fill.stats input=ground output=ground_filled uncertainty=uncertainty distance=3 mode=wmean power=2.0 cells=8

         Binning of Lidar and resulting ground surface with filled gaps.  Note
       the remaining NULL cells  (white)  in  the  resulting  ground  surface.
       These are areas with a lack of cells with values in close proximity.

   Outlier removal and gap-filling of SRTM elevation data
       In  this  example,  the SRTM elevation map in the North Carolina sample
       dataset location is filtered for outlier elevation values; missing pix-
       els are then re-interpolated to obtain a complete elevation map:
       g.region raster=elev_srtm_30m -p
       d.mon wx0
       d.histogram elev_srtm_30m
       # remove SRTM outliers, i.e. SRTM below 50m (esp. lakes), leading to no data areas
       r.mapcalc "elev_srtm_30m_filt = if(elev_srtm_30m < 50.0, null(), elev_srtm_30m)"
       d.histogram elev_srtm_30m_filt
       d.rast elev_srtm_30m_filt
       # using the IDW method to fill these holes in DEM without low-pass filter
       # increase distance to gap-fill larger holes
       r.fill.stats -k input=elev_srtm_30m_filt output=elev_srtm_30m_idw distance=100
       d.histogram elev_srtm_30m_idw
       d.rast elev_srtm_30m_idw
       r.mapcalc "diff_orig_idw = elev_srtm_30m - elev_srtm_30m_idw"
       r.colors diff_orig_idw color=differences
       r.univar -e diff_orig_idw
       d.erase
       d.rast diff_orig_idw
       d.legend diff_orig_idw

SEE ALSO
        r.fillnulls, r.neighbors, r.surf.idw, v.surf.idw, v.surf.rst

       Inverse Distance Weighting in Wikipedia

AUTHOR
       Benjamin Ducke

SOURCE CODE
       Available at: r.fill.stats source code (history)

       Accessed: unknown

       Main  index  | Raster index | Topics index | Keywords index | Graphical
       index | Full index

       © 2003-2022 GRASS Development Team, GRASS GIS 7.8.7 Reference Manual

GRASS 7.8.7                                               r.fill.stats(1grass)

Generated by dwww version 1.14 on Sun Apr 20 03:32:09 CEST 2025.