Stop Foolish Rounding

This page serves as the nexus for a community effort to stop foolish rounding of monetary values in business applications. Only a community will do because the roots of the problem run deep.


Founding Rant

I read a sophisticated email list where some of the best minds in software testing meet. Alan Jorgensen recently threw some dirt at software developers' feet on that list when he blamed them for the industry's failure to solve problems at their root cause:

I expect that there will be reluctance from the software development people for such methods are not consistent with the current software development culture. The idea of root cause analysis, for instance, is certainly alien to that culture. If your management team has been grown from software development, it will be a very hard sell.

I chose to defend programmers by citing an example where Jorgensen's for-instance, root cause analysis, was so often foiled by non-programmers that the real cause of many bugs has been taken off the table:

Root cause analysis, ha, ha, ha.
Allow me a brief rant. Consider the program:
a + b
This is a very failure prone program because it fails silently in many useful cases.
Root cause analysis would suggest that one should modify + to simply compute the right answer in all useful cases. This has been done by programmers over and over but these implementations of + are judged "odd ball" by management who would rather run a strict java or c# shop out of some weird distortion of the "prudent man" rule. (The prudent man rule says that you can't be accused of being reckless with your investor's money if you are doing the same thing everyone else does.)
I wrote financial software once and traced the root cause of difficult bugs to the implementation of + for data of types Date and Money. Fortunately I was programming in Smalltalk where the implementation of + was accessible. I took several days to correct these deficiencies after which my life, and that of my customers, became much better.
You might be thinking that there is nothing wrong with + and that Ward is just being cranky. I then say to you, you have not gone far enough in your root cause analysis.
And I say to business, who has selected and paid for most of today's computing infrastructure, you are fools for having funded 50 years of software and not yet gotten a + that works well for time or money.
I once advised an international company on how to implement a useful + for money. The developers loved it. The customers loved it. But the database didn't like it much at all. I met one of the developers a few years later. He asked, "do you have any idea how hard it is to persist that money abstraction to the database?" Yes, I had to admit that I knew it would be trouble but I also knew that they would get through it somehow once they got hooked on getting right answers in all of their money calculations.
(Aside: The database problem comes from the fact that the sum of a+b can take twice the storage of either a or b when a and b are international currency. This requires either a variable size storage mechanism or preallocation of space for hundreds of currencies. Databases favor neither solution. Again business has been fooled by the database vendors, not the programmers.)
I saw recently where the IEEE was proposing a standard for "decimal floating point" under the misguided belief that this would alleviate rounding errors. They are fools, for they attempted to do this within fixed sized storage, the one "feature" of floating a "point".
(Aside: The program a / b introduces additional potential errors which are again more correctly solved by variable allocation of storage than by "floating" a "point", decimal or otherwise.)
I have intentionally stopped one sentence short of offering a complete solution in each of these cases. This is so that you can print this email and give it to the next Six Sigma guy that comes around as a test. Ask him to explain what I am talking about. If he gets it, ask him why he isn't hounding the vendors into fixing + instead of bothering you. If he doesn't get it, ask him how his methods are going to work on large programs when they can't find the bug in a + b.
(Aside: I heard that my financial software written for DOS is still in use today managing trillions of dollars. I also heard that the current owner/vendor was trying to meet customer demand by porting it to an industry standard database rather than the "odd ball" one I wrote for them. This porting effort was not going well. No surprise.)
You may need to read this post several times to get all the good advice that I've hidden between the sentences. I thank you for your attention. Best regards. -- Ward

Now I find myself in the position of having said, here is a problem and your solution doesn't work, without offering a solution myself. Thus this page. It may remain a public rant, like a blog post, or, it might seed the organization that takes on this persistent problem in computer software. I hope the latter. -- Ward 11:37, 8 September 2007 (PDT)


Call to Action

Um, what are we supposed to do about this?

Let's change both programming languages and database systems to have good representations for money. We may find that the shortest path to this goal is through the promotion of dynamic representations such as those Ward exploited in Smalltalk, and then using this capability to deliver good implementations of money.

Don't we have this already? Strangely no, not in the mainline languages and databases.

How do we proceed? Let's start by following this script for organization formation:

  • Individual has some passion for a subject, but not the energy to pursue it.
  • Individual rants somewhere, then converts that rant to an organizational seed at AboutUs.
  • Google finds the seed and introduces the page name into the internet's vocabulary.
  • Individual spreads the rant around in many circles, suggests folks just google for it.
  • A loose community forms around the page, with diverse ideas as to the way forward.
  • The page, and mechanisms like Consensus Polling, tightens the community.
  • The community acts yielding some positive result that wouldn't have happened any other way.


The Way Forward

We are not yet in agreement as to how to proceed. Here are some alternatives.

Methodology

Alan Jorgensen describes how Six Sigma might be used as a means of avoiding foolishness. He does not necessarily recommend this method of quality improvement as applied to software development.

Reference Implementation

Ward Cunningham and David Frech have an implementation of Rational Money in progress. It is available as an open source git repository. Use the following command to retrieve it.

git clone git://c2.com/money


Clarifications

The original rant only alludes to the breadth of the problem. Here is some correspondence and explanations.

Fungibility

The storage space of the sum is the storage space of the less valued currency plus one digit. That does not imply that it needs twice as much storage. The programmer will normally assign a longer variable and that might be eventually one with a double space consumption. Especially in Smalltalk that would not be necessary if a special currency operation is programmed. -- Hans Hartmann

This would only be true if the quantities were fungible, which is not the case with mixed currencies. See Wikipedia on Fungibility. -- Ward

The plus one digit answer assumes something that I think Ward didn't assume, which is that one currency is converted into the other. However, doing that brings on a whole host of other problems, starting with: "Why did you pick that specific conversion rate and not another one?", and continuing with "How do you fix it if some big stakeholder decides that they don't like the rate you picked?" All in all, I think I'd rather deal with the problems of doubling the field length. Those problems are much more tractable. -- Pat McGee

I understand your argument about fungibility, however I would not speak of an operator "+" in the meaning of addition. It would resort to adding apples and pears. -- Hans

But apples and pears can be added and the result is fruit. The "+" operator works for me. See Ward's complex arithmetic example below. This is the same issue as currency. As I understand the meaning of Fungibility, fruit is fungible, apples and pears are not. -- Alan Jorgensen

I considered the problem very similar to the one of complex numbers where mathematicians do agree on the definition of +. In particular, if A = 5, and B = 6i, then A+B = (5, 6i). Likewise, if A = 100 USD, and B = 200 YEN, then A+B = (100 USD, 200 YEN). I had several hundred dimensions instead of complex number's two. This led me to choose a variable sized representation. -- Ward
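The vector view of + described above can be sketched in a few lines of Python. This is only an illustration and not the Smalltalk code Ward wrote: the class name Money, its dictionary layout, and the use of Fraction for exact amounts are all assumptions made for the example.

```python
from fractions import Fraction


class Money:
    """Hypothetical money value: one exact amount per currency code."""

    def __init__(self, amounts=None):
        # Convert every amount to an exact rational, then drop zero
        # entries so that equal values have equal representations.
        exact = {c: Fraction(a) for c, a in (amounts or {}).items()}
        self.amounts = {c: a for c, a in exact.items() if a != 0}

    def __add__(self, other):
        # Add component by component; currencies are never converted.
        total = dict(self.amounts)
        for currency, amount in other.amounts.items():
            total[currency] = total.get(currency, 0) + amount
        return Money(total)

    def __eq__(self, other):
        return self.amounts == other.amounts

    def __repr__(self):
        parts = (f"{a} {c}" for c, a in sorted(self.amounts.items()))
        return " + ".join(parts) or "0"


a = Money({"USD": 100})
b = Money({"YEN": 200})
print(a + b)  # 100 USD + 200 YEN
```

The size of the result grows with the number of distinct currencies involved, which is the variable sized representation the comment refers to.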

Floating Point Precision

Awesome post. I only recently found out about machine epsilon (some things we learn WAY too late in life). I thought you might be interested in this paper: Mindless (pdf) -- Michael Bolton

Prof. W. Kahan's paper explains many dozens of problems that can arise with floating point computations, and how all but the rarest attempts to compensate only make matters worse. One of the few successful techniques is to repeat a calculation at a uniformly higher resolution.

Much foolishness comes from trying to make numbers that are very close print as if they are exact.
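A small Python illustration of the two points above, offered as a sketch rather than as an example from Kahan's paper. The function name compute and the chosen precisions (8 and 30 digits) are assumptions picked for the demonstration.

```python
import sys
from decimal import Decimal, localcontext

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
total = 0.1 + 0.2
print(total == 0.3)            # False
print(f"{total:.2f}")          # 0.30, printed as if it were exact
print(sys.float_info.epsilon)  # machine epsilon for Python floats


def compute(precision):
    """Run the same calculation at a chosen number of decimal digits."""
    with localcontext() as ctx:
        ctx.prec = precision
        return (Decimal(1) / Decimal(3)) * Decimal(3)


# Repeating the calculation at a higher resolution exposes the
# digits of the low-resolution answer that cannot be trusted.
low, high = compute(8), compute(30)
print(low)
print(high)
```

When the two runs disagree, only the digits they share deserve any trust.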

Representing Currency in a Data Base

Oh, and on another issue, I think the proper instantiation of a money object must look something like a sparse matrix: ($1.00 and 400 yen) + (2 Euros and 49 Francs) = ($1.00 and 400 yen and 2 Euros and 49 Francs). I should think a data base could handle this nicely. It is just recognizing that money is not a nice clean numeric field. And "+" is a mess, but doable.
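One way a relational data base might hold such a sparse value is one row per currency, sketched here with Python's sqlite3 module. The table name money_entry, its columns, and the helper functions save, load and add are hypothetical. Storing a numerator and a denominator is an assumption made to keep amounts exact, and SQLite's 64-bit integers would limit how large they can grow.

```python
import sqlite3
from fractions import Fraction

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE money_entry (
           money_id    INTEGER NOT NULL,
           currency    TEXT    NOT NULL,
           numerator   INTEGER NOT NULL,
           denominator INTEGER NOT NULL,
           PRIMARY KEY (money_id, currency)
       )"""
)


def save(money_id, amounts):
    """Store one row per currency present in the money value."""
    rows = []
    for currency, amount in amounts.items():
        exact = Fraction(amount)
        rows.append((money_id, currency, exact.numerator, exact.denominator))
    conn.executemany("INSERT INTO money_entry VALUES (?, ?, ?, ?)", rows)


def load(money_id):
    """Rebuild the sparse currency-to-amount mapping from its rows."""
    cur = conn.execute(
        "SELECT currency, numerator, denominator "
        "FROM money_entry WHERE money_id = ?",
        (money_id,),
    )
    return {c: Fraction(n, d) for c, n, d in cur}


def add(x, y):
    """Add two sparse money values currency by currency."""
    total = dict(x)
    for currency, amount in y.items():
        total[currency] = total.get(currency, 0) + amount
    return total


save(1, {"USD": "1.00", "JPY": 400})
save(2, {"EUR": 2, "FRF": 49})
save(3, add(load(1), load(2)))
print(load(3))  # four entries, one per currency
```

The sum occupies four rows where each operand occupied two, so the storage cost shows up as extra rows rather than as a wider field.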

I'm not sure as a programmer that I would recognize this requirement at the onset of a project. So by what mechanism would I find out that currency objects are non-trivial?

In engineering practice, this is avoided by having a "design specification". (At least this was so in the most successful engineering environment that I worked in.) This document was reviewed by any and all stakeholders. Everyone was invited and if you did not participate, and something you knew of was not covered, woe be unto you! But then you could be held accountable for the project design results if you "signed off" on it. I have a lot more "rant" on this that I will get to. A statement like "Currency is represented by a scaled integer in the range 0 .. $4,000,000.000" would be in this document. Imagine the comments this would elicit from stakeholders!

And again, if you are selling fixes for defects (as opposed to quality), defects are a financial resource and why would you want to fix something that so handily produces revenue? (And more on this rant later, as well.)

Alan Jorgensen 19:32, 13 September 2007 (PDT)

Floating Point Requirements Error

One of the reasons that this problem exists (a + b gets the wrong answer) is that the initial requirements for the design of floating point were simply incomplete. The missing element is "precision".

There needs to be a precision field in floating point representation that identifies the number of bits of precision still represented by the mantissa. This field should have a special value when the floating point value represented is an integer (i.e., there are as many bits of precision as are available). Otherwise the value should be the number of half-bits of precision still available. An operation requiring rounding should subtract one from this field. Normalization affects this field.

This doesn't solve the problems of adding currency objects (an n-dimensional space), but it would surely do a lot for "unstable" matrix calculations.

Alan Jorgensen 01:15, 13 September 2007 (PDT)
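The precision field proposed above can be imitated loosely in software. The sketch below is a simplification and not the proposal itself: the class name Tracked is hypothetical, it counts whole bits rather than half-bits, it omits the special integer value and normalization, and it detects rounding by comparing against exact rational arithmetic.

```python
import sys
from fractions import Fraction

MANTISSA_BITS = sys.float_info.mant_dig  # 53 for IEEE double precision


class Tracked:
    """Hypothetical float that carries a count of trustworthy bits."""

    def __init__(self, value, precision=MANTISSA_BITS):
        self.value = float(value)
        self.precision = precision

    def __add__(self, other):
        rounded = self.value + other.value
        # The exact sum of the two stored values, as a rational number.
        exact = Fraction(self.value) + Fraction(other.value)
        # The result is no more precise than the weaker operand.
        precision = min(self.precision, other.precision)
        # Subtract one whenever the hardware had to round.
        if Fraction(rounded) != exact:
            precision = max(precision - 1, 0)
        return Tracked(rounded, precision)


exact_sum = Tracked(0.5) + Tracked(0.25)
rounded_sum = Tracked(0.1) + Tracked(0.2)
print(exact_sum.precision)    # 53
print(rounded_sum.precision)  # 52
```

A long chain of additions would see the count fall step by step, which is the warning signal the proposal asks for.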

Comparison of Floating Point Formats

Kahan's Mindless Rounding paper is entertaining but I could not reproduce the errors he sees with my Excel 2007. Has anyone tried out his examples on versions of Excel?

Also, about the new IEEE standard, Intel has a decimal simulation library out that can be downloaded from their site[1].

Anyone interested in doing some nitty-gritty performance/accuracy measurements on Java/C#/new 754r?

-- ted 10:20, 6 November 2007 (PST)

Disbursing Pennies

Martin Fowler[2] proposes that the 'divide' method return an array for situations where pennies must add up:

"Multiplication is very straightforward ... But division is not, as we have to take care of errant pennies. We'll do that by returning an array of monies, such that the sum of the array is equal to the original amount, and the original amount is distributed fairly between the elements of the array. Fairly in this sense means those at the beginning get the extra pennies."

-- Jim Tyhurst, Ph.D. Tyhurst Technology Group LLC
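The rule in the quotation can be sketched as follows. This is not Fowler's code: the function name divide and the choice to work in whole cents are assumptions, and the sketch assumes a non-negative amount.

```python
def divide(amount_cents, parts):
    """Split an amount into parts that sum exactly to the original.

    The first `extra` parts each receive one additional cent.
    """
    base, extra = divmod(amount_cents, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


shares = divide(1000, 3)
print(shares)       # [334, 333, 333]
print(sum(shares))  # 1000
```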

Fowler's approach assumes that the computer can be no more precise than the representations possible in real life using coins. This only makes sense if the divide is associated with an actual disbursement of payments, which is hardly the only application of division. For example, computation of ownership share, when monies are entering and leaving a fund, requires many divisions following one another. -- Ward
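To see why chained divisions matter, compare rounding to the cent after every step with carrying an exact rational through the whole chain. The function names and the divisors 3 and 7 are arbitrary choices made for illustration, and Python's Fraction stands in for whatever exact representation an implementation might use.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from fractions import Fraction

CENT = Decimal("0.01")


def chain_rounded(amount):
    """Divide by 3, then by 7, rounding to the cent each time."""
    step1 = (amount / 3).quantize(CENT, rounding=ROUND_HALF_EVEN)
    step2 = (step1 / 7).quantize(CENT, rounding=ROUND_HALF_EVEN)
    return step2 * 21  # multiplying back should restore the amount


def chain_exact(amount):
    """The same chain with exact rational arithmetic, no rounding."""
    return Fraction(amount) / 3 / 7 * 21


start = Decimal("100.00")
print(chain_rounded(start))  # 99.96
print(chain_exact(start))    # 100
```

Four cents vanish after only two divisions. A fund that recomputes shares on every deposit and withdrawal would compound the loss.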

Resources

Money in Real Life

Money as Represented in the Computer


