accsum

accsum Commit Details

Date: 2011-05-30 23:16:17
Author: Michael Baudin
Commit: 24
Parents: 23
Message: * Fixed accsum_fdcs. * Fixed help page of accsum_fdcs for ordering of input data. * Created overview and quick start.
Changes:
M  help/en_US/accsum_dcs.xml
M  changelog.txt
M  demos/ademo.sce
M  macros/accsum_dcs.sci
M  src/c/accsum.c
M  help/en_US/accsum_fdcs.xml
M  src/c/builder_c.sce
M  readme.txt
M  help/en_US/pseudomacros/accsum_fdcs.sci
M  help/en_US/accsum_0overview.xml
M  etc/accsum.start

File differences

macros/accsum_dcs.sci
// Description
// A Doubly Self Compensated Sum algorithm.
// Uses accsum_fasttwosum.
// The input data x must be ordered in decreasing magnitude.
// To do this, we may use the accsum_order
// function with order=5.
//
// Examples
// [s,e] = accsum_dcs ( [2 1] ) // 3
// [s,e] = accsum_dcs ( [1 2] ) // 3
// x = accsum_wilkinson(10); size(x,"*")
// s = sum(x)
// x = accsum_order ( x , 5 );
// [s,e] = accsum_dcs ( x )
// [s,e] = accsum_dcs ( x )
//
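For reference, the doubly compensated summation recurrence documented above (Priest's algorithm, as presented in Higham's book) can be sketched in Scilab as follows. This is an illustrative sketch under a hypothetical name, not the module's actual implementation:
// Sketch of doubly compensated summation (Priest). It assumes the entries
// of x are already ordered in decreasing magnitude, as required above.
function s = dcs_sketch ( x )
    n = size(x,"*");
    s = x(1);
    c = 0;
    for k = 2 : n
        y = c + x(k);
        u = x(k) - (y - c);   // rounding error of c + x(k)
        t = y + s;
        v = y - (t - s);      // rounding error of y + s
        z = u + v;
        s = t + z;
        c = z - (s - t);      // correction carried to the next step
    end
endfunction
Each addition and its recovery line form a fast-two-sum step; the decreasing-magnitude ordering is what justifies those recoveries.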
changelog.txt
* Used consistently apifun.
* Created missing .ref
* Created higham example.
-- Michael Baudin <michael.baudin@scilab.org> March 2011
* Fixed accsum_fdcs.
* Fixed help page of accsum_fdcs for ordering of input data.
* Created overview and quick start.
demos/ademo.sce
//#8 -0.09179687500000000 +0.30664062500000000 +0.39257812500000000 +0.36132812500000000
// The order #5, combined with the simply (or doubly) compensated summation algorithm, gives the
// correct answer.
// With double-double implementation, the result is: 0.3579858392477035522460937500000
// Conclusions:
src/c/builder_c.sce
// Copyright (C) 2010 - Michael Baudin
// Copyright (C) 2010 - 2011 - Michael Baudin
//
// This file must be used under the terms of the CeCILL.
// This source file is licensed as described in the file COPYING, which
ldflags = "";
if MSDOS then
cflags = "-DWIN32 -DACCSUM_EXPORTS";
// Configure the floating point unit so as to use
// SSE2 units, potentially instead of x87 registers.
cflags = "-DWIN32 -DACCSUM_EXPORTS /arch:SSE2";
libs = [
];
else
// Configure the floating point unit so as to use
// SSE units, instead of registers.
// The x86 registers may use extended precision.
// http://gcc.gnu.org/wiki/FloatingPointMath
include1 = src_dir;
cflags = "-I"""+include1+"""";
cflags = "-mfpmath=sse -msse2 -I"""+include1+"""";
libs = [
];
end
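These flags matter because the compensated algorithms recover rounding errors exactly only when each addition is rounded once to binary64; the x87 registers mentioned above may carry extended precision, which can spoil that recovery, hence the switch to SSE arithmetic. A small illustration of the underlying error-free transformation, written as a Scilab sketch with values chosen purely for illustration:
// The error-free transformation behind fast-two-sum: with |a| >= |b|,
// e recovers exactly the part of b lost when rounding a + b.
a = 1;
b = 2^-60;
s = a + b          // displays 1: the binary64 rounding of a + b
e = b - (s - a)    // displays 2^-60 (about 8.67e-19): the lost part of b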
src/c/accsum.c
#include "accsum.h"
#include <math.h>
// References
//
// "Stability and numerical accuracy of algorithms", Nicolas Higham
double tmp;
double z;
if ( abs(a) < abs(b) )
if ( fabs(a) < fabs(b) )
{
// Switch a and b
tmp = a;
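The change above replaces C's abs(), which takes an integer argument so the doubles are truncated before being compared, with fabs(), the floating-point absolute value. A minimal Scilab sketch of the fast-two-sum step that this C code performs (in Scilab, abs already operates on doubles); the name fasttwosum_sketch is hypothetical, the module's actual routine is accsum_fasttwosum:
// Sketch of fast-two-sum: return the rounded sum s and the exact rounding
// error e, after swapping so that |a| >= |b|.
function [s, e] = fasttwosum_sketch ( a , b )
    if abs(a) < abs(b) then
        tmp = a;
        a = b;
        b = tmp;
    end
    s = a + b;
    e = b - (s - a);
endfunction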
help/en_US/accsum_dcs.xml
<para>
A Doubly Self Compensated Sum algorithm.
Uses accsum_fasttwosum.
The input data x must be ordered in decreasing magnitude.
To do this, we may use the accsum_order
function with order=5.
</para>
<para>
</para>
[s,e] = accsum_dcs ( [1 2] ) // 3
x = accsum_wilkinson(10); size(x,"*")
s = sum(x)
x = accsum_order ( x , 5 );
[s,e] = accsum_dcs ( x )
[s,e] = accsum_dcs ( x )
help/en_US/pseudomacros/accsum_fdcs.sci
// A Doubly Self Compensated Sum algorithm.
// Uses the fasttwosum algorithm.
// This is a fast implementation, based on compiled source code.
// The input data x must be ordered in decreasing magnitude.
// To do this, we may use the accsum_order
// function with order=5.
//
// Examples
// s = accsum_fdcs ( [2 1] ) // 3
// s = accsum_fdcs ( [1 2] ) // 3
// x = accsum_wilkinson(10); size(x,"*")
// s = sum(x)
// x = accsum_order ( x , 5 );
// s = accsum_fdcs ( x )
// s = accsum_fdcs ( x )
//
help/en_US/accsum_fdcs.xml
A Doubly Self Compensated Sum algorithm.
Uses the fasttwosum algorithm.
This is a fast implementation, based on compiled source code.
The input data x must be ordered in decreasing magnitude.
To do this, we may use the accsum_order
function with order=5.
</para>
<para>
</para>
s = accsum_fdcs ( [1 2] ) // 3
x = accsum_wilkinson(10); size(x,"*")
s = sum(x)
x = accsum_order ( x , 5 );
s = accsum_fdcs ( x )
s = accsum_fdcs ( x )
help/en_US/accsum_0overview.xml
<title>Purpose</title>
<para>
This is a DRAFT.
</para>
The goal of this toolbox is to provide accurate algorithms to
compute sums.
We consider here sums of a given dataset x, that is, we
consider s = x(1)+x(2)+...+x(n).
</para>
<para>
In 1993, the Book "Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods"
was published.
Associated source codes in the Matlab language were provided at http://www.netlib.org/templates/.
From 2000 to 2011, these functions were ported to Scilab 4 by the Aladin Group (IRISA-INRIA).
Starting in 2010, Michael Baudin upgraded this module to Scilab 5.
These algorithms may be required to manage datasets which
are ill-conditioned with respect to the sum function, which
happens when the data vary widely in magnitude and in
sign.
Hence, these datasets are very sensitive to small changes in the
input.
In this case, the "sum" function of Scilab is not appropriate
and may produce results which have only a small number of significant
digits, or no significant digit at all.
Users may consider the condnb module and the condnb_sumcond
function to compute the condition number of a particular sum.
See http://atoms.scilab.org/toolboxes/condnb for details.
</para>
<para>
The module is mainly based on the book "Accuracy and Stability
of Numerical Algorithms" by Nicholas J. Higham.
</para>
<para>
In order to test the algorithms on practical datasets,
we include a dataset which was created by Yun Helen He and Chris H.Q. Ding.
This dataset is provided in the "demos" directory, in the "etaana.dat" file.
</para>
<para>
"The SSH variable is a two-dimensional see surface volume (integrated
sea surface area times sea surface height) distributed among multiple
processors.
At each time step, the global summation of the sea surface
volume of each model grid is needed in order to calculate the
average sea surface height.
The absolute value of the data itself is very large (in the order of
10^10 to 10^15), with different signs, while the result of the
global summation is only of order of 1.
Running the model in double precision with different numbers of
processors generates very different global summations, ranging from
-100 to 100, making the simulation results totally meaningless.
[...]
The 2D array is dimensioned as ssh(120,64), with a total of 7860
double precision numbers."
</para>
<para>
Other datasets are provided in this module, based on examples
created by Wilkinson, Higham and Priest.
</para>
<para>
The toolbox is based on macros and compiled source code.
</para>
</refsection>
<refsection>
<title>Authors</title>
<title>Quick start</title>
<para>
1993 - Richard Barrett, Michael Berry, Tony F. Chan, James Demmel, June M. Donato, Jack Dongarra, Victor Eijkhout, Roldan Pozo, Charles Romine, and Henk Van der Vorst
</para>
The <literal>accsum_fdcs</literal> function
provides a doubly self compensated sum algorithm.
The data must be ordered in decreasing magnitude.
To do this, we use the <literal>accsum_order</literal>
function with <literal>order=5</literal>.
This function is based on compiled source code, so that it is
fast enough, even for relatively large datasets.
</para>
<programlisting role="example">
<![CDATA[
path = accsum_getpath ( );
filename=fullfile(path,"demos","etaana.dat");
x=fscanfMat(filename);
e = 0.357985839247703552;
x = accsum_order ( x , 5 );
s = accsum_fdcs ( x )
abs(s-e)/e
]]>
</programlisting>
<para>
1993 - Univ. of Tennessee and Oak Ridge National Laboratory
</para>
The <literal>accsum_fcompsum</literal> function uses
a compensated summation algorithm.
It does not require an ordered input and requires
fewer floating point operations.
In the current case, the simply compensated summation
algorithm does not perform sufficiently well.
</para>
<programlisting role="example">
<![CDATA[
x=fscanfMat(filename);
e = 0.357985839247703552;
s = accsum_fcompsum ( x )
abs(s-e)/e
]]>
</programlisting>
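For reference, simply compensated (Kahan) summation, the algorithm that accsum_fcompsum is described as using, can be sketched in Scilab as follows; this is an illustrative sketch under a hypothetical name, not the module's compiled implementation:
// Sketch of simply compensated (Kahan) summation: no ordering of x is
// required, and only a few extra operations per term are needed.
function s = compsum_sketch ( x )
    s = 0;
    c = 0;                    // running compensation for lost low-order bits
    for k = 1 : size(x,"*")
        y = x(k) - c;
        t = s + y;
        c = (t - s) - y;      // what was lost when adding y to s
        s = t;
    end
endfunction
Compared with the doubly compensated variant, this trades some accuracy on severely ill-conditioned data for simplicity and the absence of an ordering step.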
</refsection>
<refsection>
<title>Authors</title>
<para>
2000 - 2001 - INRIA - Aladin Group
2011 - DIGITEO - Michael Baudin
</para>
<para>
2010 - 2011 - DIGITEO - Michael Baudin
</para>
</refsection>
<refsection>
<para>
"On properties of floating point arithmetics: numerical stability and the cost of accurate computations", Douglas Priest, 1992
</para>
<para>
"Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications". Yun He and Chris H.Q. Ding. Journal of Supercomputing, Vol.18, Issue 3, 259-277, March 2001. Also Proceedings of International Conference on Supercomputing (ICS'00), May 2000, 225-234.
</para>
</refsection>
</refentry>
etc/accsum.start
// ====================================================================
// A Welcome message.
mprintf("\tType ""demo_gui()"" and search for ""%s"" for Demonstrations.\n",modulename);
mprintf("\tType ""help accsum_overview"" for quick start.\n");
//mprintf("\tType ""demo_gui()"" and search for ""%s"" for Demonstrations.\n",modulename);
// =============================================================================
// Create variables at level #0
readme.txt
The goal of this toolbox is to provide accurate algorithms to
compute sums.
We consider here sums of a given dataset x, that is, we
consider s = x(1)+x(2)+...+x(n).
The toolbox is based on macros.
These algorithms may be required to manage datasets which
are ill-conditioned with respect to the sum function, which
happens when the data vary widely in magnitude and in
sign.
Hence, these datasets are very sensitive to small changes in the
input.
In this case, the "sum" function of Scilab is not appropriate
and may produce results which have only a small number of significant
digits, or no significant digit at all.
Users may consider the condnb module and the condnb_sumcond
function to compute the condition number of a particular sum.
See http://atoms.scilab.org/toolboxes/condnb for details.
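To see what ill-conditioning means here, consider a small illustrative example (values chosen for illustration only): the terms are huge and of mixed sign while the exact sum is 3.14159, and plain binary64 summation loses essentially every significant digit.
// An ill-conditioned sum: the exact result is 3.14159, but binary64
// summation returns 4 because the small term is absorbed by 1e16.
x = [1e16 3.14159 -1e16];
sum(x)    // displays 4: not a single correct digit of 3.14159 remains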
The flagship of this module is the accsum_fdcs function, which
provides a doubly self compensated sum algorithm.
This function is based on compiled source code, so that it is
fast enough, even for relatively large datasets.
The data must be ordered in decreasing magnitude.
To do this, we may use the accsum_order function with order=5.
The module is mainly based on the book "Accuracy and Stability
of Numerical Algorithms" by Nicholas J. Higham.
The toolbox is based on macros and compiled source code.
Type "help accsum_overview" for quick start.
Features
--------
* This module depends on the helptbx module.
* This module depends on the apifun module.
TODO
----
* Implement pairwise summation (see the sketch after this list).
* Implement optimal summation.
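As a reference for the first TODO item, pairwise summation can be sketched in Scilab as follows (illustrative sketch, hypothetical name): the data are split in halves, each half is summed recursively, and the two partial sums are added, so the rounding-error bound grows roughly like log(n) instead of n.
// Sketch of pairwise (recursive) summation.
function s = pairwise_sketch ( x )
    n = size(x,"*");
    if n <= 2 then
        s = sum(x);
    else
        m = floor(n/2);
        s = pairwise_sketch(x(1:m)) + pairwise_sketch(x(m+1:n));
    end
endfunction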
Author
------
* https://hpcrd.lbl.gov/SCG/ocean/NRS/ECMWF/img14.htm
* https://hpcrd.lbl.gov/SCG/ocean/NRS/SCSsum.F
* "On properties of floating point arithmetics: numerical stability and the cost of accurate computations", Douglas Priest, 1992
* "Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications". Yun He and Chris H.Q. Ding. Journal of Supercomputing, Vol.18, Issue 3, 259-277, March 2001. Also Proceedings of International Conference on Supercomputing (ICS'00), May 2000, 225-234.


Revision: 24