Noise Insensitive Background Signal Subtraction: Background Correction for Raman Spectroscopy



Figure 1. Raman spectra of sclera: repetitions of a single sample at a single point.

Throughout my research career I have had to work with difficult samples in which the raw signal obtained from the sample is dominated by a broad, non-Raman background. Traditionally this is corrected by acquiring each signal to a high quality and performing a background estimation on that signal; the estimated background is subtracted from the raw data to leave baseline-corrected data (1,2). However, it is not always realistic to accumulate signals of sufficiently high quality to reliably correct the background on each one. In high-volume or high-speed mapping experiments, a small increase in acquisition time can cause a very large increase in the total accumulation time. In kinetic experiments the acquisition time is fixed by the required time resolution. In clinical in-vivo analysis a considerable range of limitations comes into play, including low maximum permitted laser powers and patient comfort. I was tasked with developing the use of Raman spectroscopy for the detection of ocular protein modifications, with the ultimate aim of developing an in-vivo diagnostic. This requires very restrictive laser powers and short accumulation times, with the consequence that the analysis must make use of very poor quality signals. Figure 1 shows some raw data from donor samples; the raw signal is clearly dominated by a broad non-Raman background, and in some samples the background was even higher, accounting for 99% of the signal. I have developed a method of correcting the non-Raman background that can consistently and reliably correct the baseline of the data, as shown in Figure 2 (3). Details of this approach can be read in full in that publication; here I will discuss its implications.

Figure 2. Background-corrected Raman spectra of sclera using traditional correction and Noise Insensitive Background Signal Subtraction (NIBSS).

        Noise Insensitive Background Signal Subtraction (NIBSS) is not a new method of estimating the uninformative background signal, rather it is a new method of applying existing estimation methods. Put simply, NIBSS estimates the background on the results of a multivariate analysis of a dataset rather than the traditional approach of estimating it on the individual signals. This has crucial implications.
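Reference (3) gives the full algorithm; as a rough illustration of the paradigm shift only (not the published implementation), the sketch below estimates the background on the SVD loadings of the whole dataset and reconstructs each signal's background from those few estimates. The function names and the simple iterative-polynomial estimator are my own placeholder choices; any preferred background estimator could be substituted, since NIBSS only changes where the estimator is applied.

```python
import numpy as np

def estimate_baseline(y, order=4, n_iter=50):
    """Crude iterative polynomial baseline: repeatedly fit a polynomial
    and clip the signal down to the fit so that peaks are progressively
    excluded from the estimate."""
    x = np.linspace(-1.0, 1.0, y.size)   # scaled x keeps the fit stable
    work = y.copy()
    for _ in range(n_iter):
        base = np.polyval(np.polyfit(x, work, order), x)
        work = np.minimum(work, base)    # clip peaks above the fit
    return base

def nibss(spectra, n_components=3):
    """Estimate backgrounds on SVD loadings instead of on each signal.
    spectra: (n_signals, n_channels) array. Returns corrected data."""
    U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
    # Resolve the SVD sign ambiguity so each loading points "up"
    signs = np.where(Vt.sum(axis=1) < 0, -1.0, 1.0)
    Vt = Vt * signs[:, None]
    U = U * signs[None, :]
    # One baseline estimate per retained loading: a handful of
    # estimates, however many signals the dataset contains.
    bases = np.array([estimate_baseline(Vt[k]) for k in range(n_components)])
    # Reconstruct the background in the original signal space
    background = (U[:, :n_components] * s[:n_components]) @ bases
    return spectra - background
```

Note that the number of baseline estimates is set by `n_components` (the complexity of the dataset), not by the number of rows in `spectra`.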

1) The background shape is estimated a limited number of times, determined by the complexity of the dataset rather than by the number of signals recorded.

2) The accuracy of the background correction improves as the dataset grows, in contrast to the conventional paradigm, whereby the accuracy is solely dependent on the quality of each individual signal.

 

No. Signals | Traditional Paradigm (No. Estimates / Relative Accuracy) | NIBSS (No. Estimates / Relative Accuracy)
100         | 100 / 1                                                  | 10 / 10
10,000      | 10,000 / 1                                               | 10 / 100
1,000,000   | 1,000,000 / 1                                            | 10 / 1000

Table 1. Impact of dataset size on the accuracy of background correction using the traditional per-signal estimation paradigm compared with NIBSS.
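The "relative accuracy" column for NIBSS in Table 1 reflects ordinary noise averaging: pooling N signals before estimating a shared shape reduces the noise on the pooled estimate by a factor of sqrt(N). A quick numerical check of that scaling (synthetic unit-variance noise, nothing from the real dataset):

```python
import numpy as np

# Empirical check of the sqrt(N) scaling behind Table 1: pooling N
# unit-noise signals shrinks the noise on the pooled estimate by sqrt(N).
rng = np.random.default_rng(1)
gains = {}
for n in (100, 10_000):
    # 1000 repeat "experiments", each pooling n noisy signals
    pooled = rng.normal(0.0, 1.0, (1000, n)).mean(axis=1)
    gains[n] = 1.0 / pooled.std()   # accuracy gain vs a single signal
print({n: round(g, 1) for n, g in gains.items()})  # roughly 10 and 100
```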

The consequences of these facts are:

1) Estimates are where uncertainty and variability enter the processing, so fewer estimates mean less induced variation.

2) The background correction of low quality signals can be dramatically improved by gathering more data rather than by improving the quality of the individual signals. This becomes critical in situations where power or time limitations prevent the acquisition of good quality signals. As alluded to above, the application I am developing fits this scenario precisely: lasers are used to measure from patients' eyes, so we are very limited in the maximum laser power and in the amount of time a patient can reasonably be expected to accept a laser being shone into their eye. By gathering a larger dataset of low quality signals it is possible to perform the background correction with high accuracy and reproducibility.

Figure 3. Prediction variance for heme predicted from Raman spectra of sclera processed using traditional and NIBSS correction.

Figure 3 shows the variation in the prediction of heme content from unnormalized data corrected using the traditional per-signal approach and the equivalent unnormalized data corrected using NIBSS. Unnormalized data are displayed so that the effects of the normalisation step do not confound the trends discussed. The most striking difference between the two approaches, apart from the size of the scaling factor for the best-fit trend (5.5 times higher for traditionally corrected data), is the power of the trendline. The prediction variance would be expected to be directly proportional to the noise level of the signal, i.e. inversely proportional to the signal-to-noise ratio (a power of -1). The traditionally corrected data, however, vary with a power of -0.83, suggesting that the prediction variance rises faster than the noise level in the signal. In contrast, the power of the best-fit line for NIBSS is -0.99, not significantly different from -1, demonstrating that the reproducibility of the background correction is insensitive to noise: the prediction variance responds only to the shot noise of the signal, not to any variation induced by the processing.
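The quoted powers come from fitting the trend of prediction variance against SNR on a log-log scale, where a power law becomes a straight line whose slope is the power. A minimal sketch of such a fit, using illustrative numbers (not the published data) for a perfectly shot-noise-limited process:

```python
import numpy as np

def power_law_exponent(snr, variance):
    """Slope of log(variance) against log(SNR), i.e. the power of the trend."""
    slope, _ = np.polyfit(np.log(snr), np.log(variance), 1)
    return slope

# Illustrative numbers only: a shot-noise-limited process has
# variance proportional to 1/SNR, so the fitted power should be -1.
snr = np.array([0.8, 1.0, 2.0, 5.0, 10.0, 21.0, 50.0])
variance = 4.0 / snr
print(round(power_law_exponent(snr, variance), 2))   # -> -1.0
```

A fitted power shallower than -1 (such as -0.83) or steeper than -1 (such as -1.6) signals that the processing is contributing variance beyond shot noise.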

Figure 4. Prediction variance for heme based on Raman spectra of sclera processed using traditional and SVD-based signal processing paradigms.

Background correction is a subtractive process, while normalisation involves dividing the signal by a normalisation factor. This means that normalisation has the potential to induce multiplicative effects on the prediction variance. Figure 4 compares the prediction variance against signal-to-noise ratio for the Raman data processed using the traditional approaches and the SVD-based approaches (NIBSS and SVD-based normalisation). Upon normalisation, the prediction variance is decreased by a factor of 2 for the low quality signals (SNR 0.8) normalised using an SVD-based approach after NIBSS. The power of the trend is now slightly higher than -1, suggesting a slight interaction between the shot noise and the SVD-based normalisation, but the overall magnitude of the prediction variance is decreased substantially throughout the range studied (SNR down to 0.8). In contrast, the power for traditionally processed data (linear interpolation background correction and band area normalisation) is much steeper than -1, at -1.6. The prediction variance is greater for these traditionally corrected signals throughout the range studied, but the difference becomes considerably more magnified at lower SNR. Indeed, the ratio of the variances from the two methods follows an almost square relationship with SNR (power 1.82), demonstrating that the variance induced by the normalisation step multiplies with the variance induced by the baseline correction. NIBSS induces no additional variation (the induced variance factor is 1), in contrast to the traditional per-signal correction approach, so there is no induced variation for the normalisation step to magnify.
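One way to realise an "SVD-based normalisation" (a sketch of the general idea; the published method may differ in detail) is to divide each signal by its score on the first singular vector, which pools information from every channel, instead of by the area of a single, noisy reference band:

```python
import numpy as np

def band_area_norm(spectra, lo, hi):
    """Traditional normalisation: divide each signal by the area of a
    chosen reference band (channels lo:hi)."""
    areas = spectra[:, lo:hi].sum(axis=1, keepdims=True)
    return spectra / areas

def svd_norm(spectra):
    """SVD-based normalisation: divide each signal by its score on the
    first singular vector, pooling information from every channel
    rather than relying on one band."""
    _, _, Vt = np.linalg.svd(spectra, full_matrices=False)
    scores = spectra @ Vt[0]   # projection onto component 1
    return spectra / scores[:, None]
```

Because the normalisation factor itself is estimated from the whole signal, its own noise contribution is smaller, which is why dividing by it multiplies in less variance.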

SNR | Relative Time/Power per Signal | Relative Variance | Repetitions for CI95% = 3% (Traditional) | Repetitions for CI95% = 3% (SVD-based)
1   | 1     | 17 | 17,000 | 61
10  | 100   | 3  | 10     | 1
21  | 441   | 2  | 1      | 1
50  | 2,500 | 1  | 1      | 1

Table 2. Effort required to achieve a 95% confidence interval of just 3% of the mean prediction for signals processed in the traditional per-signal manner and by the SVD-based approach.

 

In terms of implications for experimental methodology the results are striking. Table 2 lays out the effort required to achieve a benchmark confidence in the data of 3%, a figure widely considered an ideal target for laboratory analysis. A confidence interval depends on the standard deviation (times 1.96 for 95% intervals) divided by the square root of the number of samples measured. Consequently, any difference in reproducibility between two methods is magnified considerably (to the power of 2), so the effort required to achieve parity also grows with the square of the difference. With very poor quality signals (SNR = 1) the relative variance of traditionally applied background correction compared with NIBSS is a factor of 17. Just 61 signals are required to be 95% sure the true mean is within 3% of the reported prediction when using NIBSS. To be as confident about traditionally processed data the experimenter would need to acquire 17,000 spectra, 290 times more effort than for NIBSS-processed data. An alternative would be to improve the signal quality to reduce the number of replicates required (not always an option for experimental, kinetic, safety or comfort reasons). For a single measurement to produce a prediction within 3% of the true value requires a signal-to-noise ratio of just 10 for NIBSS but 21 for traditionally processed data, requiring 4.4 times more effort to achieve the same reliability per signal. It is not until the signal-to-noise ratio reaches 50 that the traditional and NIBSS approaches give approximately the same level of prediction variance.
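The repetition counts follow from the standard confidence-interval formula: the number of repetitions n must satisfy 1.96·σ/√n ≤ E, where E is the acceptable error, so n = (1.96·σ/E)². A short sketch with hypothetical relative standard deviations (illustrative values, not those of the study):

```python
import math

def reps_for_ci(rel_std, target=0.03, z=1.96):
    """Repetitions needed so the 95% CI half-width (z * sigma / sqrt(n))
    falls within `target`, with both expressed as fractions of the mean."""
    return math.ceil((z * rel_std / target) ** 2)

# Hypothetical relative standard deviations, for illustration only.
print(reps_for_ci(0.05))   # -> 11
print(reps_for_ci(0.10))   # -> 43 (doubling sigma quadruples the effort)
```

The squared dependence on σ is exactly why a modest reproducibility advantage per signal translates into an enormous saving in total repetitions.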

Because NIBSS reduces the effort required to achieve a given level of accuracy, these gains can be exploited in a number of ways. My original aim was to use the efficiency gains to make a cutting-edge application feasible, and the use of NIBSS has brought it within a clinically feasible realm. The measurements can be carried out using a power (1 mW) below the maximum permissible exposure (1.4 mW) and still achieve a reliable prediction in under 15 s (few patients would accept laser probing for several minutes, even at near-infrared wavelengths).

 

Parameter       | Traditional | NIBSS
Acquisition     | Long        | Short
Power           | High        | Low
Throughput      | Low         | High
Accuracy        | Low         | High
Detection Limit | High        | Low
Sampling Area   | Small       | Large
Instrument Cost | High        | Low

Table 3. How the gains in reliability from employing NIBSS can be translated to experimental or procedural gains.

Alternatively, the gains could be used to improve any of the parameters listed in Table 3 for existing or more easily achievable protocols. NIBSS could enable higher-speed mapping, more sensitive detection limits, more robust and complete sampling of specimens, and more accurate analyses; shorter acquisitions allow higher throughput, and lower power is essential in clinical settings and in the analysis of high explosives. It is hard to imagine an application that could not exploit the NIBSS approach in some way.

Anyone interested in using the algorithm can read about the prototype here.

 

1. Beattie, J. R., Bell, S. E. J., Borgaard, C., Fearon, A. and Moss, B. W. Prediction of adipose tissue composition using Raman spectroscopy: average properties and individual fatty acids. Lipids (2006) 41, 287-294.

2. Beattie, R. J., Bell, S. J., Farmer, L. J., Moss, B. W. and Desmond, P. D. Preliminary investigation of the application of Raman spectroscopy to the prediction of the sensory quality of beef silverside. Meat Science (2004) 66, 903-913.

3. Beattie, J. R. Optimising reproducibility in low quality signals without smoothing; an alternative paradigm for signal processing. J Raman Spec (2011) 42, 1419-1427.

4. Beattie, J. R. and McGarvey, J. J. Estimation of signal backgrounds on multivariate loadings improves model generation in face of complex variation in backgrounds and constituents. J Raman Spec (2013) 44, 329-338.

5. Beattie, J. R. Multivariate analysis for the processing of signal. OGST (2014), DOI: 10.2516/ogst/2013185.


   
