# Thoughts on weighing observations

Suppose you have a set of observations (measurements) and want to assess how well they fall into an ideal target range. Here are a few thoughts on how to go beyond the most obvious measure: percentage of “in-range values”.

Suppose our observations lie in $(X, \rho)$, a metric space, and our target range is a subset $S \subset X$ closed in $X.$ For each observation $x \in X$ we want to calculate its weight $\omega(x) \in \lbrack 0, 1 \rbrack.$ The total weight of a finite, non-empty set of observations $\Xi$ will then naturally be the mean of the individual weights:

$$\omega(\Xi) := \frac{1}{|\Xi|} \sum_{x \in \Xi} \omega(x), \quad \Xi \subset X.$$
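As a minimal sketch of this averaging in Python (the names `total_weight` and `in_unit_interval` are mine and purely illustrative, as is the toy target range $\lbrack 0, 1 \rbrack$):

```python
from typing import Callable, Iterable


def total_weight(observations: Iterable[float],
                 weight: Callable[[float], float]) -> float:
    """Mean of the individual weights, i.e. omega(Xi) from the formula above."""
    xs = list(observations)
    if not xs:
        # The formula divides by |Xi|, so an empty sample has no total weight.
        raise ValueError("total weight is undefined for an empty sample")
    return sum(weight(x) for x in xs) / len(xs)


# Hypothetical weight function: 1 inside [0, 1], 0 elsewhere.
def in_unit_interval(x: float) -> float:
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


# Two of the four made-up observations are in range.
print(total_weight([0.2, 0.5, 1.5, 3.0], in_unit_interval))  # 0.5
```

Any function mapping an observation to $\lbrack 0, 1 \rbrack$ can be passed as `weight`, so the same helper works for every weighting function discussed here.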

## Properties of $\omega$ weights

Feel free to skip to examples of specific weighting functions if you are not interested in an abstract discussion of their possible properties.

Note on notation. For $x \in X, A \subset X$ let $\rho(x, A) := \inf \{ \rho(x, y): y \in A \}$ be the distance of $x$ from the set $A.$ Let $\overline{A}$ denote the closure of $A \subset X$ in $X,$ let $\mathrm{int}\; A$ denote the interior of $A$ and let $\partial A := \overline{A} \setminus \mathrm{int}\; A$ be the boundary of $A.$

Let us consider some properties we would expect this $\omega: X \to \lbrack 0, 1 \rbrack$ to have. First, I regard these as essential:

(i) $\forall s \in S: \omega(s) = 1.$ All observations within the target range have maximum weight.

(ii) $\forall x, y \in X: \rho(x, S) = \rho(y, S) \Rightarrow \omega(x) = \omega(y).$ $\omega(x)$ is in fact a function of the distance of $x$ from $S.$ We could be thinking about a real function $\omega_{\rho}: \lbrack 0, \infty ) \to \lbrack 0, 1 \rbrack,$ $\rho(x, S) \mapsto \omega_{\rho}(\rho(x, S))$ instead of the abstract $\omega: X \to \lbrack 0, 1 \rbrack.$

(iii) $\forall x, y \in X: \rho(x, S) \lt \rho(y, S) \Rightarrow \omega(x) \ge \omega(y).$ Observations farther from the target never have larger weight.

(iv) $\omega(x) \to 0$ as $\rho(x, S) \to \infty.$ The weight of outliers approaches zero.

Some properties are rather desirable:

(v) $\omega \in C^0(X).$ $\omega$ is continuous on $X.$ Prevents sudden drops in weight.

(vi) $\omega \in C^0(\partial S),$ meaning $\omega$ is continuous at every point of $\partial S.$ Weaker form of (v). Guarantees a gradual drop-off at least around the boundary of the target. Obviously (v) $\Rightarrow$ (vi).

(vii) $\omega \in C^1(X).$ $\omega$ is continuously differentiable on $X$ (this presumes $X$ has enough structure for derivatives to make sense, e.g. $X = \mathbb{R}^n$). Guarantees smooth weight transitions. Obviously (vii) $\Rightarrow$ (v).

(viii) $\omega \in C^1(\partial S),$ meaning $\omega$ is differentiable at every point of $\partial S.$ Weaker form of (vii). Guarantees smooth weight transition at least on the boundary of the target. Obviously (vii) $\Rightarrow$ (viii).

And finally, some properties are application dependent.

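(ix) $\forall x \in X: \omega(x) \gt 0.$ All weights are positive.

(x) $\omega_{\rho}$ is (strictly) concave on the interval of distances where it is positive. Decrease in weight speeds up with growing distance from the target. Concavity on all of $\lbrack 0, \infty )$ is incompatible with (i) and (iv): a concave function with $\omega_{\rho}(0) = 1$ that drops below $1$ anywhere would eventually have to become negative.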
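Some of these properties can be spot-checked numerically. The following is a rough Python sketch under my own assumptions: the candidate is given as a function of distance (so (ii) holds by construction), the helper name `check_properties`, the grid size and the cut-off `max_dist` are arbitrary choices, and a finite grid can only falsify a property, never prove it.

```python
import math
from typing import Callable, Dict


def check_properties(w_rho: Callable[[float], float],
                     max_dist: float = 50.0,
                     steps: int = 5001) -> Dict[str, bool]:
    """Spot-check a weight, given as a function of distance, on an even grid."""
    ds = [max_dist * k / (steps - 1) for k in range(steps)]
    ws = [w_rho(d) for d in ds]
    return {
        # (i): distance 0 means the observation is on target.
        "i_full_weight_on_target": ws[0] == 1.0,
        # (iii): weights never increase with distance.
        "iii_non_increasing": all(a >= b for a, b in zip(ws, ws[1:])),
        # (ix): no grid point receives zero weight.
        "ix_strictly_positive": all(w > 0.0 for w in ws),
        # The codomain requirement omega(x) in [0, 1].
        "range_in_unit_interval": all(0.0 <= w <= 1.0 for w in ws),
    }


# Two illustrative candidates: an exponential decay and a linear ramp.
exponential = check_properties(lambda d: math.exp(-0.5 * d))
ramp = check_properties(lambda d: max(1.0 - d, 0.0))

print(all(exponential.values()))     # True
print(ramp["ix_strictly_positive"])  # False: the ramp reaches zero
```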

## Examples of $\omega$ functions

I used the following setup for the example plots. This target range just happens to be specific to my application.

$$X = \mathbb{R}, \\ \rho = \rho_e \quad\mathrm{(Euclidean)}, \\ S = \lbrack 3.6, 7.8 \rbrack.$$
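For this setup the distance from the target and the diameter of the target (the largest distance between two points of $S,$ used by the polynomial weight) have simple closed forms. Spelled out from the definitions, with nothing assumed beyond $S$ being this interval:

$$\rho(x, S) = \max \{ 3.6 - x,\; 0,\; x - 7.8 \}, \qquad \mathrm{diam}\; S = 7.8 - 3.6 = 4.2.$$

For instance, $\rho(7.9, S) = 0.1,$ $\rho(2, S) = 1.6$ and $\rho(5, S) = 0.$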

### Discrete weight (in-range indicator)

$$\omega_d(x) := \begin{cases} 1, & x \in S, \\ 0, & x \not\in S. \end{cases}$$

The most basic binary indicator. It has the essential properties, but that is all. Because it is discontinuous at $\partial S,$ it is too sensitive. For example, an observation of 7.9, a mere 1.3% above the upper bound of 7.8, is immediately discarded with zero weight.
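The flip side of this sensitivity is that the indicator cannot tell a near miss from a wild outlier. A small Python sketch on the example target range; the two samples are made up for illustration and the function names are mine:

```python
LOW, HIGH = 3.6, 7.8  # target range S from the setup above


def discrete_weight(x: float) -> float:
    """In-range indicator: 1 inside S, 0 outside."""
    return 1.0 if LOW <= x <= HIGH else 0.0


def total_weight(xs):
    """Mean weight of a non-empty sample."""
    return sum(discrete_weight(x) for x in xs) / len(xs)


near_misses = [3.5, 5.0, 7.9, 8.0]   # out-of-range values barely miss S
far_misses = [0.1, 5.0, 15.0, 40.0]  # out-of-range values are far away

# Both samples get the same score, although the first is clearly "better".
print(total_weight(near_misses), total_weight(far_misses))  # 0.25 0.25
```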

### Polynomial weight

$$\omega_{P(\alpha, \beta)}(x) := \begin{cases} 1, & x \in S, \\ \max \left\lbrace 1 - \left( \frac{\rho(x, S)}{\beta \mathrm{diam}\; S} \right)^{\alpha}, 0 \right\rbrace, & x \not\in S, \end{cases}$$

where $\alpha, \beta \gt 0$ and $\mathrm{diam}\; S := \sup \{ \rho(x, y): x, y \in S \}$ is the diameter of $S.$

This looks complicated but is not. The distance of $x$ from $S$ is scaled by a $\beta$-multiple of the “size” of $S$ and raised to the power of $\alpha.$ The result is subtracted from one, and negative values are clipped to zero.

This weight function has the properties (i)–(vi), and also (viii) when $\alpha \gt 1$ (for $\alpha \le 1$ it is not differentiable at $\partial S$). It lacks (vii) because it is not differentiable where $\rho(x, S) = \beta \mathrm{diam}\; S,$ the point at which the polynomial hits the ground. As a function of distance it is strictly concave for $\rho(x, S) \lt \beta \mathrm{diam}\; S$ when $\alpha \gt 1,$ but it does not have property (ix) because it does assign zero weights.

The parameters $\alpha$ and $\beta$ will depend on your application and requirements. For example, let $\alpha = 2$ and $\beta = \frac{1}{2}$ (weight is zero for points farther than one half of the target’s diameter):

Or $\alpha = 3$ and $\beta = 1:$

Instead of scaling by multiples of $\mathrm{diam}\; S$ we could perhaps scale by $\beta \sqrt{\strut\mathrm{Var}\;\Xi}$ or a $\beta$-quantile for a sample of observations $\Xi \subset X.$
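To get a feel for the two parameter choices above, here is a minimal Python sketch of the polynomial weight on the example target range. The function names are mine; the two cases of the definition collapse into one expression because the distance is zero inside $S.$

```python
LOW, HIGH = 3.6, 7.8  # target range S from the setup above


def distance_to_target(x: float) -> float:
    """Euclidean distance of x from the interval [LOW, HIGH]."""
    return max(LOW - x, 0.0, x - HIGH)


def polynomial_weight(x: float, alpha: float, beta: float) -> float:
    """Polynomial weight with exponent alpha and cut-off at beta * diam S."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    scaled = distance_to_target(x) / (beta * (HIGH - LOW))
    # Inside S the distance is 0, so the weight is 1; far away it is clipped to 0.
    return max(1.0 - scaled ** alpha, 0.0)


# alpha = 2, beta = 1/2: the weight reaches zero at distance 2.1 from S.
print(round(polynomial_weight(7.9, 2, 0.5), 4))   # 0.9977
print(round(polynomial_weight(8.85, 2, 0.5), 4))  # 0.75
print(polynomial_weight(10.0, 2, 0.5))            # 0.0

# alpha = 3, beta = 1: the weight reaches zero at distance 4.2 from S.
print(round(polynomial_weight(9.9, 3, 1), 4))     # 0.875
```

Compared with the in-range indicator, the observation 7.9 now keeps almost all of its weight instead of dropping to zero.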

### Exponential weight

$$\omega_{\exp(\alpha)}(x) := \begin{cases} 1, & x \in S, \\ \exp \left( -\alpha\rho(x, S) \right), & x \not\in S, \end{cases}$$

where $\alpha \gt 0.$ This function is interesting because it is strictly positive, so it has property (ix). It is continuous everywhere and differentiable everywhere except on $\partial S.$ The sharp drop-off around $\partial S$ and its convexity as a function of distance might be problematic. $\alpha$ controls the speed of convergence to zero.

For example, $\alpha = \frac{1}{2}:$
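A minimal Python sketch of this weight on the example target range, evaluated for $\alpha = \frac{1}{2}$ (the function name is mine; since $\exp(0) = 1,$ the two cases of the definition collapse into one expression):

```python
import math

LOW, HIGH = 3.6, 7.8  # target range S from the setup above


def exponential_weight(x: float, alpha: float) -> float:
    """Exponential weight: exp(-alpha * distance of x from [LOW, HIGH])."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    distance = max(LOW - x, 0.0, x - HIGH)
    return math.exp(-alpha * distance)


print(exponential_weight(5.0, 0.5))             # 1.0 (inside S)
print(round(exponential_weight(7.9, 0.5), 4))   # 0.9512
print(round(exponential_weight(9.8, 0.5), 4))   # 0.3679
print(round(exponential_weight(20.0, 0.5), 4))  # 0.0022
```

The weight halves every $\ln 2 / \alpha$ units of distance, roughly 1.39 for $\alpha = \frac{1}{2},$ which may help when picking $\alpha$ for a given application.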

Much more could be written and explored in this area; these are just a few methods I have been playing with in a certain project. Other weighting functions can be considered and there is always the question of choosing parameters (they can either be fixed a priori or based on stochastic properties of the observation sample).

October 1, MMXIII — Mathematics.