Drotmg failure on Ubuntu 14.04 #484

Closed
btracey opened this issue Dec 30, 2014 · 19 comments

@btracey
Contributor

btracey commented Dec 30, 2014

In the gonum project, we have an implementation of the float64 BLAS in Go and also allow linking against a C-based BLAS. We have written our own test suite, which can be seen at https://github.com/gonum/blas/blob/master/testblas/level1double.go . The tests for Drotmg start at line 1315 of the current file. The tests pass with the OS X Accelerate framework, pass for me using OpenBLAS on OS X, and pass with OpenBLAS on the Travis build linked to GitHub. However, OpenBLAS fails our tests on Ubuntu 14.04, returning unexpected results for the following cases:

{
Name: "AbsQ1_LT_AbsQU__D2_Pos",
P: &blas.DrotmParams{
Flag: blas.Diagonal,
H: [4]float64{5.0 / 12, 0, 0, 0.625},
},
D1: 2,
D2: 3,
X1: 5,
Y1: 8,
Rd1: 2.3801652892561984,
Rd2: 1.586776859504132,
Rx1: 121.0 / 12,
},

{
Name: "D1=D2_X1=X2",
P: &blas.DrotmParams{
Flag: blas.Diagonal,
H: [4]float64{1, 0, 0, 1},
},
D1: 2,
D2: 2,
X1: 8,
Y1: 8,
Rd1: 1,
Rd2: 1,
Rx1: 16,
},

These are the Diagonal cases with a non-zero first element. Please see gonum/blas#59 for our discussion.

@xianyi
Collaborator

xianyi commented Dec 31, 2014

@btracey , thank you for the report. I will try to reproduce this error on my machine.

By the way, what is your CPU? Is your Ubuntu install 32-bit or 64-bit?

xianyi added the Bug label Dec 31, 2014
xianyi self-assigned this Dec 31, 2014

@kortschak
Contributor

First core info from /proc/cpuinfo:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 430  @ 2.27GHz
stepping    : 2
microcode   : 0x9
cpu MHz     : 1199.000
cache size  : 3072 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips    : 4521.72
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual

From the linked issue:

$ uname -pr
3.13.0-43-generic x86_64
$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
<snip>

@xianyi
Collaborator

xianyi commented Jan 4, 2015

@kortschak @btracey ,

I cannot reproduce this error on my Ubuntu 64-bit machine.

Here is my test code:
https://gist.github.com/xianyi/42d235556da86e3a0984

The outputs for case 1:

2.380165
1.586777
10.083333
param[0]=1.000000
param[1]=0.416667
param[2]=0.000000
param[3]=0.000000
param[4]=0.625000

The outputs for case 2:

1.000000
1.000000
16.000000
param[0]=1.000000
param[1]=1.000000
param[2]=0.000000
param[3]=0.000000
param[4]=1.000000

@kortschak
Contributor

On my machine:

~ $ cat test_drotmg.c 
#include <stdio.h>
#include "cblas.h"

void cblas_drotmg (double *d1, double *d2, double *x1, const double y1, double *param);

void drotmg_ (double *d1, double *d2, double *x1, const double* y1, double *param);

int main()
{
  int i;
  double d1,d2,x1,y1;
  double param[5];

  /*
  d1=2;
  d2=3;
  x1=5;
  y1=8;
  */

  d1=2;
  d2=2;
  x1=8;
  y1=8;

  //  cblas_drotmg(&d1, &d2, &x1, y1, param);
  drotmg_(&d1, &d2, &x1, &y1, param);

  printf("%lf\n", d1);
  printf("%lf\n", d2);
  printf("%lf\n", x1);

  for(i=0; i<5; i++) {
    printf("param[%d]=%lf\n", i, param[i]);
  }
  return 0;
}
~ $ gcc test_drotmg.c -o drot -L~/Development/OpenBLAS -lopenblas
~ $ ./drot 
1.000000
1.000000
16.000000
param[0]=1.000000
param[1]=0.000000
param[2]=1.000000
param[3]=0.000000
param[4]=1.000000

@xianyi
Collaborator

xianyi commented Jan 4, 2015

How did you build OpenBLAS?

@kortschak
Contributor

Just make (however, the failure is also seen with the Ubuntu deb 0.2.8-6ubuntu1).

@xianyi
Collaborator

xianyi commented Jan 4, 2015

Could you upload config_last.h and Makefile.last in OpenBLAS to gist.github.com?

@kortschak
Contributor

No Makefile.last, but there is Makefile.conf_last.

https://gist.github.com/kortschak/3cc591a2b74a14d42fed

@xianyi
Collaborator

xianyi commented Jan 4, 2015

These files are fine. I have no idea about this error.

Could you debug it yourself? Build OpenBLAS with make DEBUG=1.

drotmg only calls the function in OpenBLAS/interface/rotmg.c.

@kortschak
Contributor

OK, I have sorted this out. It is not an OpenBLAS issue; it is the Ubuntu build.

The deb-installed library was shadowing the library built from source, so I was seeing the failure in both cases. Building more carefully and uninstalling the deb gives me the correct result.

Thanks and apologies.

@xianyi
Collaborator

xianyi commented Jan 5, 2015

Please don't hesitate to file an issue if you have any questions.

xianyi closed this as completed Jan 5, 2015

@juliantaylor

When was this fixed in OpenBLAS?
Ubuntu 14.04 has 0.2.8 with patches for #294, #304, #333, and #340.

@kortschak
Contributor

I'll do a bisection tonight and let you know.

@juliantaylor

seems to be 692b14c

@kortschak
Contributor

Confirmed fixed by 692b14c.

Thank you for saving me that time, @juliantaylor.

@juliantaylor

I can't find a test associated with it. Please add the case posted here to the unit tests.

kortschak added a commit to kortschak/OpenBLAS that referenced this issue Jan 6, 2015
Test requested in issue OpenMathLib#484.

Run tests by applying the following change and then make:

	diff --git a/Makefile.rule b/Makefile.rule
	index bea1fe1..9852ff3 100644
	--- a/Makefile.rule
	+++ b/Makefile.rule
	@@ -140,7 +140,7 @@ NO_AFFINITY = 1

	-# UTEST_CHECK = 1
	+UTEST_CHECK = 1
@kortschak
Contributor

@juliantaylor that test has been merged.

@juliantaylor

Thanks. I think it would be nicer if the tests were run by a standard make check instead of a hidden variable in a makefile. That would save me the trouble of filing bugs in all distributions to tell the maintainers how to run the tests.
OpenBLAS needs as many people as possible to run the tests, given the large amount of code that runs only on certain CPUs.

@kortschak
Contributor

Perhaps file an issue for that. I'm not a collaborator on this project; I just found the bug through our project. BTW, make UTEST_CHECK=1 will run those tests. I had missed that when I was writing the commit message.
