Cube root of a real number #214
Comments
I've already got an implementation ready based upon the NSWC Mathematical Library version and using Fypp for templating of different real kinds. |
Thanks.
I don't usually use complex numbers or cube roots. However, to be in agreement with the intrinsic function |
Personally, I also don't foresee a use for complex cube roots. It should, however, be possible to have the same interface for real and complex cube roots: interface cbrt
function cbrt_sp(x) result(cbrt)
real(sp), intent(in) :: x
real(sp) :: cbrt
end function
function cbrt_complex_sp(x) result(cbrt)
complex(sp), intent(in) :: x
complex(sp) :: cbrt(3)
end function
end interface
A better option might be to follow the IMSL library CBRT(X) function:
Lahey/Fujitsu Fortran also provides a CBRT function, which returns a single number for real or complex variables. In a thread on the Intel Fortran Forum, @sblionel shows the compiler does in fact call a special cube root routine for function cbrt_v1(x) result(cbrt)
real, intent(in) :: x
real :: cbrt
if (x >= 0.0) then
cbrt = x**(1.0/3.0)
else
cbrt = -((-x)**(1.0/3.0))
end if
end function Digging back further, the behavior of the
Also the gfortran mailing list contains an interesting discussion on cbrt: @certik Do you have an opinion on this one? My understanding is that the goal of providing a CBRT function is to be "mathematically" correct, in the sense that it can also accept negative arguments. The open question, however, is whether the behavior of CBRT can differ from X**(1./3.) in terms of representation, accuracy, or speed. |
See my comment on precisely this issue (and the subsequent discussion there): symengine/symengine#1644 (comment) We should do whatever is the most consistent and document it well. It could be that we need two cbrt functions for the two most common conventions. |
How about an optional argument to decide which branch you want?
|
As for negative real arguments, I am inclined to have Upshot: this is what we learn in school before getting introduced to complex numbers. It's what most people would expect when working with reals only. Drawback: introduces some "gotcha" potential for people who mix reals and complex numbers without reading the docs. |
With an elemental procedure, you could pass an array of |
The c++ And if you want the complex number, you can just use |
Or better yet, cast it to complex first. I have in mind:
|
This is also the behavior of Octave cbrt |
MATLAB on the other hand does not provide a
|
It seems Matlab's |
Let's keep the ball rolling. The way I see it now, there are essentially two choices:
Comments:
y = cbrt(cmplx(x))
y = cmplx(x)**(1._sp/3)
y = cbrt(z)
y = z**(1._sp/3)
r = cmplx(-1.,sqrt(3.))/2.
j = cmplx(0.,1.)
z = -8 + 0*j
y1 = cbrt(z) ! 1.0000 + 1.7321i
y2 = y1 * r ! -2.0000 - 0.0000i
y3 = y1 * conjg(r) ! 1.0000 - 1.7321i
write(*,'(A)') "real to real"
write(*,fmtr) "cbrt( 8._sp) = ", cbrt(8._sp)
write(*,fmtr) "cbrt(-8._sp) = ", cbrt(-8._sp)
write(*,'(/,A)') "real to complex"
write(*,fmtc) "cbrt(8._sp,k=0) = ", cbrt(8._sp,k=0)
write(*,fmtc) "cbrt(8._sp,k=1) = ", cbrt(8._sp,k=1)
write(*,fmtc) "cbrt(8._sp,k=2) = ", cbrt(8._sp,k=2)
write(*,'(/,A)') "complex to complex"
z = cmplx(-8._sp)
write(*,fmtc) "z = ", z
write(*,fmtc) "cbrt(z) = ", cbrt(z)
write(*,fmtc) "cbrt(z,k=0) = ",cbrt(z,k=0)
write(*,fmtc) "cbrt(z,k=1) = ",cbrt(z,k=1)
write(*,fmtc) "cbrt(z,k=2) = ",cbrt(z,k=2)
write(*,fmtc) "cbrt(z,k=3) = ",cbrt(z,k=3) produces the following output:
|
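A minimal sketch of a generic interface consistent with the calls shown above (real to real, real to complex with a branch index `k`, and complex to complex with an optional `k`). The module name `cbrt_sketch`, the specific procedure names, and the convention that `k` is reduced modulo 3 with `k = 0` giving the principal root are assumptions made for illustration, not the implementation discussed in the thread.

```fortran
module cbrt_sketch
  implicit none
  private
  public :: cbrt

  integer, parameter :: sp = kind(1.0)  ! assumption: sp is the default real kind

  interface cbrt
    module procedure cbrt_real, cbrt_real_branch, cbrt_cmplx
  end interface

contains

  ! Real argument, real result: the real root, also for negative x.
  elemental function cbrt_real(x) result(y)
    real(sp), intent(in) :: x
    real(sp) :: y
    y = sign(abs(x)**(1.0_sp/3.0_sp), x)
  end function

  ! Real argument plus a mandatory branch index: complex result.
  elemental function cbrt_real_branch(x, k) result(y)
    real(sp), intent(in) :: x
    integer, intent(in) :: k
    complex(sp) :: y
    y = cbrt_cmplx(cmplx(x, 0.0_sp, kind=sp), k)
  end function

  ! Complex argument: k = 0 (the default) is the principal root,
  ! k = 1 and k = 2 rotate it by 2*pi/3 and 4*pi/3.
  elemental function cbrt_cmplx(z, k) result(y)
    complex(sp), intent(in) :: z
    integer, intent(in), optional :: k
    complex(sp) :: y
    real(sp), parameter :: twopi = 6.283185307179586_sp
    real(sp) :: r, theta
    integer :: branch

    branch = 0
    if (present(k)) branch = modulo(k, 3)  ! assumed: out-of-range k wraps around

    if (abs(z) == 0.0_sp) then
      y = z
    else
      r = abs(z)**(1.0_sp/3.0_sp)
      theta = (atan2(aimag(z), real(z)) + twopi*real(branch, sp))/3.0_sp
      y = cmplx(r*cos(theta), r*sin(theta), kind=sp)
    end if
  end function

end module cbrt_sketch
```

Under these assumptions `cbrt(-8.0)` resolves to the real version, while `cbrt(-8.0, k=1)` and `cbrt(cmplx(-8.0, 0.0))` return complex values. The three specifics are distinguishable because they differ in the type of the first argument or in the number of non-optional arguments.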
There is a
|
Thanks @vmagnin, we should also have tests for the special cases. |
@ivan-pi module functions
use iso_fortran_env, only: dp=>real64
implicit none
contains
pure real(dp) function cbrt(x)
real(dp), intent(in) :: x
cbrt = sign(abs(x)**(1.0_dp / 3.0_dp), x)
end function
end module functions
program main
use functions
real(dp) :: x
x = 27.0_dp
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
x = -27.0_dp
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
x = -0.0_dp
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
x = 0.0_dp
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
! Infinity:
x = 1.0_dp/x
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
! NaN:
x = sqrt(-x)
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
! -Infinity:
x = 0.0_dp
x = -1.0_dp/x
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
end program main
which yields the following results in agreement with the Java specifications:
$ gfortran essai_cbrt.f90 && ./a.out
27.000000000000000 3.0000000000000000 27.000000000000000
-27.000000000000000 -3.0000000000000000 -27.000000000000000
-0.0000000000000000 -0.0000000000000000 -0.0000000000000000
0.0000000000000000 0.0000000000000000 0.0000000000000000
Infinity Infinity Infinity
NaN NaN NaN
-Infinity -Infinity -Infinity
But I don't know if it would be the fastest implementation. We call three functions in each case: |
That looks good. My naive implementation would be:
pure real function cbrt(x)
real, intent(in) :: x
if (x >= 0.) then
cbrt = x**(1./3)
else
cbrt = -((-x)**(1./3))
end if
end function
Unfortunately, for the value zero it does not preserve the sign:
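The difference for negative zero can be checked with a small sketch such as the following. The module and function names (`cbrt_signed_zero`, `cbrt_branch`, `cbrt_sign`, `is_negative`) are made up for illustration, and the behavior assumes a processor that distinguishes -0.0 from +0.0, as IEEE 754 platforms normally do.

```fortran
module cbrt_signed_zero
  implicit none
contains

  ! Branching version: -0.0 satisfies x >= 0., so the positive branch
  ! is taken and the sign of the zero is lost.
  elemental real function cbrt_branch(x)
    real, intent(in) :: x
    if (x >= 0.) then
      cbrt_branch = x**(1./3)
    else
      cbrt_branch = -((-x)**(1./3))
    end if
  end function

  ! sign/abs version: sign() transfers the sign of x to the magnitude,
  ! including the sign of a negative zero.
  elemental real function cbrt_sign(x)
    real, intent(in) :: x
    cbrt_sign = sign(abs(x)**(1.0/3.0), x)
  end function

  ! True when x carries a negative sign, which also holds for -0.0.
  elemental logical function is_negative(x)
    real, intent(in) :: x
    is_negative = sign(1.0, x) < 0.0
  end function

end module cbrt_signed_zero
```

With `x = -0.0`, `is_negative(cbrt_branch(x))` is expected to be false and `is_negative(cbrt_sign(x))` true, while both functions agree for nonzero arguments.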
Concerning speed, I've prepared a small benchmark and the difference is not that large. I've used Fypp to create some simple benchmarking macros:
#:def NTIC(n=1000)
#:global BENCHMARK_NREPS
#:set BENCHMARK_NREPS = n
block
use, intrinsic :: iso_fortran_env, only: int64, dp => real64
integer(int64) :: benchmark_tic, benchmark_toc, benchmark_count_rate
integer(int64) :: benchmark_i
real(dp) :: benchmark_elapsed
call system_clock(benchmark_tic,benchmark_count_rate)
do benchmark_i = 1, ${BENCHMARK_NREPS}$
#:enddef
#:def NTOC(*args)
#:global BENCHMARK_NREPS
end do
call system_clock(benchmark_toc)
benchmark_elapsed = real(benchmark_toc - benchmark_tic, dp)/real(benchmark_count_rate, dp)
benchmark_elapsed = benchmark_elapsed/${BENCHMARK_NREPS}$
#:if len(args) > 0
${args[0]}$ = benchmark_elapsed
#:else
write(*,*) "Average time is ",benchmark_elapsed," seconds."
#:endif
end block
#:del BENCHMARK_NREPS
#:enddef
module cbrt_mod
implicit none
public
contains
elemental real function cbrt1(x)
real, intent(in) :: x
if (x >= 0.) then
cbrt1 = x**(1./3)
else
cbrt1 = -((-x)**(1./3))
end if
end function
elemental real function cbrt2(x)
real, intent(in) :: x
cbrt2 = sign(abs(x)**(1.0 / 3.0), x)
end function
end module
program main
use cbrt_mod
implicit none
integer, parameter :: n = 1000000
real :: x(n), y(n), z(n)
call random_number(x)
@:NTIC(1000)
y = cbrt1(x)
@:NTOC()
@:NTIC(1000)
z = cbrt2(x)
@:NTOC()
! We need to print something, otherwise the compiler
! seems to skip the calculation completely...
print *, maxval(abs(y-z)), sum(abs(y-z))
end program
Output:
|
With the Intel Fortran compiler there is practically no difference:
Edit: I realized I was only sampling positive values... If I add an extra line with
Your sign/abs/** version looks like the |
Very counter-intuitive... We don't know exactly what the compilers do. With -O3 there is probably inlining in cbrt2(). Considering only ifort, the timing of my cbrt2() does not change with negative values, as expected. But why is your cbrt1() 5x slower? There is a jump to the negative case and two sign changes, but 5x slower seems unreasonable... The gfortran behavior therefore seems OK, but ifort? It is also amazing that in most cases ifort gives 5x faster code than gfortran for such simple calculations. Does ifort force some kind of parallelism inside the processor (SSE vectorization)? Perhaps it could be interesting to add |
Precision loss with very large and very small values:
x = 1d300
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
x = 1d-300
print *, x, cbrt(x), cbrt(x)*cbrt(x)*cbrt(x)
|
Which version of Playing around on godbolt.org (https://godbolt.org/z/ZLaUcW) shows
Apart from any clever compiler optimisations, I would expect the |
@LKedward
|
It also reveals the limits of benchmarking: one implementation could be better with some compilers and some CPUs but not with others (Intel, AMD, ARM...). And worse, one implementation could be better with deterministic algorithms, and another with Monte Carlo algorithms... Branch prediction mechanisms not only introduce security problems but also make benchmarking very delicate. |
Nice findings! Indeed, apart from different processors and compiler settings, there are also more subtle issues with benchmarking related to noise, measurement statistics, and how the process interacts with the operating system. The README of the BenchmarkTools.jl Julia package contains some information. As @certik has said in a few earlier issues, and as I have now come to understand, it is important that we agree on an intuitive API and provide a reference implementation with the correct behavior. Optimized implementations for different platforms will hopefully come later as more users or even hardware vendors get involved. |
Interestingly, it looks like the
Yep, this is a good point.
I agree, optimization isn't the focus for |
It implies that if a "naive" implementation is used at first, its limits should be clearly stated in the source code and documentation. |
As for implementation, it seems good enough to pass |
As far as I can understand, the C version already works correctly for negative numbers. This leaves us to figure out our own version for complex roots. |
@ivan Jose Pulido Sanchez <[email protected]>
I tried your code with this test code:
do i = 1,100
call random_number( x )
x = x * 10.0_dp**i
x3 = cbrt(x)
write(*,*) i, x, abs(x - x3**3), abs(x - x3**3)/x
enddo
and the relative error was either zero or at most 3.1e-16 over the whole
range. I guess that this shows that the function is sufficiently accurate.
(Note: this restores the original value instead of comparing two different
ways of calculating the cube root)
Regards,
Arjen
On Fri, 10 Jul 2020 at 10:50, Ivan <[email protected]> wrote:
… As far as I can understand, the C version already works correctly for
negative numbers.
(As a side note: I've tried porting the C version to Fortran:
https://gist.github.com/ivan-pi/5cf86ba198bc497331fba3d3a1a07c59 with
promising results.)
This leaves us to figure out our own version for complex roots.
|
Thanks @arjenmarkus for the test! I reran it with the original C libm version by the side:
I get some small differences in the last places:
If I output as hexadecimals, I can indeed see some differences do remain, meaning my port is not a perfect match to the C one available on my platform. |
I pretty much doubt you can get closer: the Fortran and C compilers are
likely to use slightly different ordering of the machine instructions. I
tried with gfortran and Intel Fortran and they also gave slightly different
results, but always in the order of the last few bytes.
BTW, gfortran complained about overflow in some of the constants, so I had
to use -fno-range-check to compile the code. Not a showstopper, but still
it might be a complication.
Regards,
Arjen
On Fri, 10 Jul 2020 at 13:26, Ivan <[email protected]> wrote:
… Thanks @arjenmarkus <https://github.com/arjenmarkus> for the test! I
reran it with the original C libm version by the side:
interface
pure function c_cbrt(x) bind(c,name="cbrt")
use iso_c_binding, only: c_double
real(c_double), value :: x
real(c_double) :: c_cbrt
end function
end interface
do i = 1, 100
call random_number(x)
x = x*10._dp**i
x3 = cbrt(x)
cx3 = c_cbrt(x)
write(*,*) i, x, abs(x - x3**3)/x, abs(x - cx3**3)/x
end do
I get some small differences in the last places:
76 4.0638919578662523E+075 1.9770924779982508E-016 1.9770924779982508E-016
77 7.6742391032182145E+076 1.6751503544737001E-016 3.3503007089474002E-016
78 7.0211150593663916E+076 0.0000000000000000 0.0000000000000000
79 5.1811233562026912E+078 1.5879804862696962E-016 1.5879804862696962E-016
80 3.6709083775467760E+079 1.7930216590378397E-016 1.7930216590378397E-016
81 1.2689247125165195E+080 0.0000000000000000 4.1496666677608811E-016
82 1.9609377628799389E+081 4.2964052674018527E-016 4.2964052674018527E-016
83 6.0802906214348435E+082 0.0000000000000000 0.0000000000000000
84 5.4375757115883662E+083 1.9832328300050751E-016 3.9664656600101501E-016
85 6.0508530767420035E+084 2.8515592178725947E-016 1.4257796089362973E-016
86 5.3970844933448210E+085 1.2787916059682132E-016 1.2787916059682132E-016
87 3.7468781425583570E+086 1.4735993185149220E-016 1.4735993185149220E-016
88 9.1656756575841826E+087 1.9276779266309714E-016 1.9276779266309714E-016
89 6.2174751629903578E+088 2.2733949308498420E-016 2.2733949308498420E-016
90 8.4055194736970552E+089 0.0000000000000000 0.0000000000000000
91 8.1811243675954609E+090 2.2114947934287768E-016 2.2114947934287768E-016
92 2.1849450318528865E+091 1.6561070122654541E-016 3.3122140245309081E-016
93 5.0864521082672201E+092 1.1382402387030692E-016 4.5529609548122767E-016
94 6.6274040704877999E+093 0.0000000000000000 2.7954737753913594E-016
95 2.1064887821671668E+094 1.7590157075425002E-016 1.7590157075425002E-016
96 1.6334791213177604E+094 2.2683772368054151E-016 2.2683772368054151E-016
97 5.3161169157241797E+096 1.7843264361368490E-016 1.7843264361368490E-016
98 6.4011877353899199E+097 1.1854909860403442E-016 3.5564729581210328E-016
99 4.0327657707019941E+098 1.5053788475170072E-016 4.5161365425510220E-016
100 4.7072761867901112E+099 0.0000000000000000 4.1269490362120304E-016
If I output as hexadecimals, I can indeed see some differences do remain,
meaning my port is not a perfect match to the C one available on my
platform.
|
I have learned from @kargl that the |
Just found this interesting implementation of a cube root from Takuya Ooura. The usage license is the following:
The code in ! cubic root function in double precision
!
function dcbrt(x)
implicit real*8 (a - h, o - z)
dimension c(0 : 23)
parameter (
& p2pow16 = 65536.0d0,
& p2pow48 = 281474976710656.0d0)
parameter (
& p2powm16 = 1 / p2pow16,
& p2powm48 = 1 / p2pow48)
data c /
& 1.5319394088521d-3, -1.8843445653409d-2,
& 1.0170534986000d-1, -3.1702448761286d-1,
& 6.3520892642253d-1, -8.8106985991189d-1,
& 1.0517503764540d0, 4.2674123235580d-1,
& 1.5079083659190d-5, -3.7095709111375d-4,
& 4.0043972242353d-3, -2.4964114079723d-2,
& 1.0003913718511d-1, -2.7751961573273d-1,
& 6.6256121926465d-1, 5.3766026150315d-1,
& 1.4842542902609d-7, -7.3027601203435d-6,
& 1.5766326109233d-4, -1.9658008013138d-3,
& 1.5755176844105d-2, -8.7413201405100d-2,
& 4.1738741349777d-1, 6.7740948115980d-1 /
if (x .eq. 0) then
dcbrt = 0
return
end if
if (x .gt. 0) then
w = x
y = 0.5d0
else
w = -x
y = -0.5d0
end if
if (w .gt. 8) then
do while (w .gt. p2pow48)
w = w * p2powm48
y = y * p2pow16
end do
do while (w .gt. 8)
w = w * 0.125d0
y = y * 2
end do
else if (w .lt. 1) then
do while (w .lt. p2powm48)
w = w * p2pow48
y = y * p2powm16
end do
do while (w .lt. 1)
w = w * 8
y = y * 0.5d0
end do
end if
if (w .lt. 2) then
k = 0
else if (w .lt. 4) then
k = 8
else
k = 16
end if
u = ((((((c(k) * w + c(k + 1)) * w +
& c(k + 2)) * w + c(k + 3)) * w +
& c(k + 4)) * w + c(k + 5)) * w +
& c(k + 6)) * w + c(k + 7)
dcbrt = y * (u + 3 * u * w / (w + 2 * u * u * u))
end
!
I haven't tested the accuracy, speed, or behavior for special values. If someone can decipher the algorithm I'd be interested to read the explanation. |
Sure: it seems it's a rational function approximation (the last two lines), the |
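One possible reading of the `dcbrt` listing, offered as an interpretation rather than as the author's documentation: the loops scale |x| by powers of 8 into w in [1, 8] while accumulating the matching power of 2 in y; a degree-7 polynomial, with one coefficient set for each of [1, 2), [2, 4) and [4, 8], gives a first estimate u ≈ w^(1/3) (the first coefficient set sums to approximately 1 at w = 1); and the last line is a rational correction equal to twice one Halley step for f(u) = u^3 - w. The factor 2 is compensated by the initial value y = ±0.5.

```latex
\begin{aligned}
u_{\mathrm{Halley}} &= u - \frac{2 f f'}{2 f'^2 - f f''}
  = u - \frac{u\,(u^3 - w)}{2u^3 + w}
  = \frac{u\,(u^3 + 2w)}{2u^3 + w},
  \qquad f = u^3 - w,\; f' = 3u^2,\; f'' = 6u, \\[4pt]
u + \frac{3uw}{w + 2u^3} &= \frac{u\,(2u^3 + 4w)}{2u^3 + w}
  = 2\,u_{\mathrm{Halley}}.
\end{aligned}
```

Halley's iteration converges cubically, which would explain why a single step after the polynomial estimate might be enough for double precision. The actual accuracy would still need to be measured.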
Related to #150 (non-special mathematical functions)
cbrt - Cube root of a real number
Description
Returns the cube root of the real number \(x\), that is, a number \(y\) such that \(y^3 = x\).
Syntax
y = cbrt(x)
Arguments
x: A real number \(x\).
Return value
Returns the value \(\sqrt[3]{x}\); the result is of type real and has the same kind as x.
Example
As seen from the discussion on Discourse, this function is semantically more accurate than writing x**(1./3.), which only works for positive real numbers and returns NaN otherwise.
A possible extension would be to allow complex arguments; the return value would then be an array with 3 elements for the 3 cube roots (if the number is real and non-zero, there is one real root and a conjugate pair of complex roots; a complex non-zero value will have three distinct cube roots).
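As an illustration of the "same kind as x" requirement, here is a hand-written sketch of what the Fypp templating mentioned in the thread might expand to for two real kinds. The module name `cbrt_kinds` and the specific names `cbrt_sp` and `cbrt_dp` are assumptions, and the body is the simple sign/abs formulation from the discussion rather than a tuned implementation.

```fortran
module cbrt_kinds
  use, intrinsic :: iso_fortran_env, only: sp => real32, dp => real64
  implicit none
  private
  public :: cbrt, sp, dp

  ! One specific procedure per real kind behind a single generic name.
  interface cbrt
    module procedure cbrt_sp, cbrt_dp
  end interface

contains

  ! Single precision: the result has the same kind as the argument.
  elemental function cbrt_sp(x) result(y)
    real(sp), intent(in) :: x
    real(sp) :: y
    y = sign(abs(x)**(1.0_sp/3.0_sp), x)
  end function

  ! Double precision: identical body, only the kind differs.
  elemental function cbrt_dp(x) result(y)
    real(dp), intent(in) :: x
    real(dp) :: y
    y = sign(abs(x)**(1.0_dp/3.0_dp), x)
  end function

end module cbrt_kinds
```

Because the procedures are elemental, `cbrt([-8.0_dp, 0.0_dp, 8.0_dp])` works on arrays as well, and a negative argument gives the real root instead of the NaN produced by `x**(1./3.)`.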