Handling of AbstractIrrational #50

Closed
hhaensel wants to merge 8 commits into JuliaString:master from hhaensel:master

@hhaensel hhaensel commented Dec 12, 2019

Currently Format.jl fails to handle the printout of Irrationals such as pi.

This can be circumvented by defining a new DEFAULTFORMATTER for AbstractIrrational.
I wondered what a good standard print format might be:

  • numeric representation, ('f')
  • symbolic representation, which would be a right-aligned string

In this PR I propose to introduce a new type and class character 'v' for "variable" which is per default right-aligned, so that the result of pyfmt(FormatSpec("2v"), pi) (EDIT: as well as fmt(pi,2)) is " π"

Together with the StringLiterals.jl package this would make it possible to write f"\%(pi, 10)" to print a right-aligned symbol.

I understand that the introduction of a new character type breaks with the standard Python format, so I am a bit unsure, whether this is a good idea, or how it could be done better. So I'd be happy to receive some feedback here.

EDIT: two more comments ...

  • I want to give some more reasoning for my idea:
    At the REPL print(pi) results in π, whereas print(1pi) results in 3.141592653589793. This behaviour would then be identical.
  • The 'v' type would be also quite useful for any other symbolic printout, e.g.:
using Format, SymPy
@vars a  # (a,)
default_spec!(Sym, 'v')
fmt(1.234a,10)
"   1.234*a"
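The difference between print(pi) and print(1pi) described above can be checked with Base alone. The sketch below is only an illustration: the right-aligned result proposed for 'v' is emulated with `lpad`, which is an assumption made here and not a Format.jl call.

```julia
# An irrational constant keeps its symbolic name when converted to a string...
symbolic = string(pi)          # "π"
# ...but any arithmetic promotes it to Float64 first.
numeric = string(1pi)          # "3.141592653589793"

# Right-align the symbol in a field of width 2, emulated with Base.lpad
# (a stand-in for the proposed 'v' behaviour, not Format.jl API).
aligned = lpad(symbolic, 2)    # " π"
```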


coveralls commented Dec 12, 2019

Coverage Status

Coverage increased (+0.2%) to 95.826% when pulling 34feb56 on hhaensel:master into c4d2f61 on JuliaString:master.


codecov-io commented Dec 12, 2019

Codecov Report

Merging #50 into master will decrease coverage by 0.39%.
The diff coverage is 97.14%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #50     +/-   ##
=========================================
- Coverage   96.52%   96.12%   -0.4%     
=========================================
  Files           6        6             
  Lines         489      516     +27     
=========================================
+ Hits          472      496     +24     
- Misses         17       20      +3

| Impacted Files | Coverage Δ |
|---|---|
| src/fmt.jl | 94.44% <100%> (-4.05%) ⬇️ |
| src/fmtspec.jl | 95.38% <100%> (+1.83%) ⬆️ |
| src/fmtcore.jl | 93.63% <94.44%> (+0.1%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ScottPJones
Member

I agree that Irrationals should be handled, however, I don't believe that there should be a type for formatting displays of irrational.
Rather, if an irrational is passed to one of the numeric formats, it should be displayed as the value, and to the s string format, as the symbolic name.
The default would be either s or one of the numeric forms such as f or g.
The other formatted output functions such as @printf, Formatting.fmt all work with irrationals, printing the symbolic name with s, or the numeric value with f.
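As a small illustration of the behaviour described in this comment, the sketch below uses only the Printf standard library (not Format.jl): the string conversion yields the symbolic name, and a numeric conversion yields the value. The variable names are chosen here for illustration.

```julia
using Printf

# The string conversion prints the symbolic name of the constant.
as_symbol = @sprintf("%s", pi)     # "π"
# A numeric conversion prints the value, here with 4 decimals.
as_value = @sprintf("%.4f", pi)    # "3.1416"
```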

@hhaensel
Author

Thanks for your comments. I understand that introducing a new character is a bit strange.
The basic idea is still to make it possible to format also composed numeric types, such as complex numbers, physical constants, etc.
I have pushed the idea a bit, currently without including test routines. But fmt() now works for complex numbers.
I would be happy to receive feedback whether my idea looks strange to you or not.
The concept is to define the output routines for numbers such that each type can have a fmt_Number() routine, to which the correct formatting routine is passed as an argument (see fmtcore.jl).

It also works for PhysicalConstants 😄
P.S.: I have removed the 'v' character and replaced it by calling fmt_default!() but then reset! won't work for these types...

@hhaensel
Author

Just to give a small code example for PhysicalConstants:

using Format
using PhysicalConstants.CODATA2018, Unitful

import Format.default_spec
default_spec!(Unitful.AbstractQuantity, 'f')
default_spec(::Type{<:Unitful.AbstractQuantity}) = Format.DEFAULT_FORMATTERS[Unitful.AbstractQuantity]

function Format.fmt_Number(x::Unitful.AbstractQuantity, f::Function)
    io=IOBuffer()
    print(io, f(x.val))
    if !Unitful.isunitless(unit(x))
        Unitful.has_unit_spacing(unit(x)) && print(io," ")
        show(io, unit(x))
    end
    String(take!(io))
end

c_0 = SpeedOfLightInVacuum;
fmt(c_0, 30, 1)        #  "            299792458.0 m s^-1"
pyfmt("30e", c_0)   #  "           2.997925e+08 m s^-1"

@hhaensel
Author

Just a thought: It might be a good idea to introduce 'S' for symbols instead of 'v' for variables. Alternatively, one could also opt for 'N' for Number.
Anyhow, I added some tests and did some code beautifying.

@hhaensel
Author

By the way, rational numbers come for free (and are also part of the tests), as they are Numbers

@ScottPJones
Member

I still don't believe that it's a good idea to add any characters, for the format strings that are meant to be C or Python compatible.
I think using s for symbolic/string output, or one of d,f,e for numeric output, is sufficient to get whatever output one wants, and doesn't potentially cause future incompatibility problems.

Member

@ScottPJones ScottPJones left a comment

Why do you need the Union{} here?
I don't think that the dict should try to use a union as the key.

@hhaensel
Author

The Union{} is needed in order to cover cases like Complex{Integer} or Rational which are all of type UnionAll.
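The point about UnionAll can be checked in plain Julia. The sketch below is not the PR's code: the tables `defaults` and `narrow` are hypothetical stand-ins for a default-formatter dictionary. It shows why a Dict keyed by DataType cannot hold Complex{T} where T<:Integer, and also that a lookup by a concrete type such as Complex{Int} does not find the UnionAll key, so a dispatch step is still needed.

```julia
# Unparameterized abstract types are plain DataTypes...
integer_is_datatype = Integer isa DataType
# ...but parametric families are UnionAll.
rational_is_unionall = Rational isa UnionAll

ComplexInteger = Complex{T} where T<:Integer
covers_int8 = Complex{Int8} <: ComplexInteger
covers_float = Complex{Float64} <: ComplexInteger

# Hypothetical default table that accepts both kinds of keys.
defaults = Dict{Union{DataType,UnionAll},Char}(Integer => 'd', ComplexInteger => 'd')

# A table keyed by DataType alone rejects the UnionAll key.
narrow = Dict{DataType,Char}(Integer => 'd')
rejected = try
    narrow[ComplexInteger] = 'd'
    false
catch
    true
end

# Keys are matched by equality, not by subtyping.
found_concrete = haskey(defaults, Complex{Int})
```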

@hhaensel
Author

I am ok with removing 'S'. Do you have a good idea for resetting right-aligned strings?

Member

@ScottPJones ScottPJones left a comment

Can you remove the addition of S, the Union to the Dict, the separate handling of Complex?
You can add just Number and Complex as 's'.

@@ -29,7 +29,10 @@ default_spec!(::Type{T}, ::Type{K}) where {T,K} =
(DEFAULT_FORMATTERS[T] = DEFAULT_FORMATTERS[K]; nothing)

# seed it with some basic default formatters
-for (t, c) in [(Integer,'d'), (AbstractFloat,'f'), (AbstractChar,'c'), (AbstractString,'s')]
+ComplexInteger = Complex{T} where T<:Integer
Member

Why do you want a subtype here? Why not just all anything of type Complex?

Author

I wanted automatic Integer formatting for all Complex{Integer} types in one go; that's why I changed the dict type to Union{DataType,UnionAll}.
So I can write fmt_default!(Complex{T} where T<:Integer) and also default_spec!(Complex{T} where T<:Integer, 'f').
Otherwise we would need to do this for all subtypes individually.
The same holds for any other package that has a similar type system without an abstract supertype.

Member

What this means is that you have to set up defaults for pretty much any arbitrary Complex number, which I don't think is desirable behavior.

src/fmt.jl Outdated
ComplexInteger = Complex{T} where T<:Integer
ComplexFloat = Complex{T} where T<:AbstractFloat
for (t, c) in [(Integer,'d'), (AbstractFloat,'f'), (AbstractChar,'c'), (AbstractString,'s'),
(ComplexInteger, 'd'), (ComplexFloat, 'f'), (Number,'S'), (AbstractIrrational,'S')]
Member

You don't need both Number and AbstractIrrational, since AbstractIrrational is a subtype of Number.
You could simply add (Number, 's'). For complex, just add (Complex, 's'), and have a special _srepr function for any complex number.

Author

I introduced both because I wanted to allow the user to set different default formats. E.g. in many cases users will want to display a floating point number for AbstractIrrationals.
This can be done by calling fmt_default!(AbstractIrrational, 'f'). Other Number types won't be affected and still display as strings.
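Both positions in this exchange rest on how Julia dispatch resolves subtypes of Number. The sketch below is a simplified model, not Format.jl's code: the table `defaults` and the function `spec` are hypothetical names. A single Number method already covers irrationals, and a more specific AbstractIrrational entry changes only that family.

```julia
# Hypothetical stand-in for Format's DEFAULT_FORMATTERS table.
defaults = Dict{DataType,Char}(Number => 's', Integer => 'd', AbstractFloat => 'f')

# One lookup method per abstract type; dispatch picks the most specific match.
spec(::Type{<:Number}) = defaults[Number]
spec(::Type{<:Integer}) = defaults[Integer]
spec(::Type{<:AbstractFloat}) = defaults[AbstractFloat]
# Separate entry for irrationals, falling back to the Number default when unset.
spec(::Type{<:AbstractIrrational}) = get(defaults, AbstractIrrational, defaults[Number])

irr_before = spec(typeof(pi))          # 's': same as the Number default
defaults[AbstractIrrational] = 'f'     # user overrides irrationals only
irr_after = spec(typeof(pi))           # 'f'

# Other Number types are unaffected by the override.
others = (spec(Rational{Int}), spec(Complex{Int}), spec(Int), spec(Float32))
```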

default_spec(::Type{<:AbstractFloat}) = DEFAULT_FORMATTERS[AbstractFloat]
default_spec(::Type{<:AbstractString}) = DEFAULT_FORMATTERS[AbstractString]
default_spec(::Type{<:AbstractChar}) = DEFAULT_FORMATTERS[AbstractChar]
default_spec(::Type{<:AbstractIrrational}) = DEFAULT_FORMATTERS[AbstractIrrational]
Member

Only the one for Number is needed.

@@ -9,25 +9,26 @@
# width ::= <integer>
# prec ::= <integer>
# type ::= 'b' | 'c' | 'd' | 'e' | 'E' | 'f' | 'F' | 'g' | 'G' |
-# 'n' | 'o' | 'x' | 'X' | 's'
+# 'n' | 'o' | 'x' | 'X' | 's' | 'S'
Member

I really don't want any non-standard characters added, 's' works fine for outputting symbols already.


function printfmt(io::IO, fs::FormatSpec, x)
cls = fs.cls
ty = fs.typ
if cls == 'i'
-ix = Integer(x)
+ix = x
Member

You've introduced type instability here, and it's better to just get an exception if the format class isn't valid for the given type.

ty == 'd' || ty == 'n' ? _pfmt_i(io, fs, ix, _Dec()) :
ty == 'x' ? _pfmt_i(io, fs, ix, _Hex()) :
ty == 'X' ? _pfmt_i(io, fs, ix, _HEX()) :
ty == 'o' ? _pfmt_i(io, fs, ix, _Oct()) :
_pfmt_i(io, fs, ix, _Bin())
elseif cls == 'f'
-fx = float(x)
+fx = x
Member

Same as above, better for this to get an exception, instead of slowing this down by introducing type instability, and covering up what might be useful information for the programmer.

Author

@hhaensel hhaensel Dec 17, 2019

I am not so sure whether we should care about type stability here. (I am definitely not an expert here!)
But packages like PhysicalConstants redefine float() to work on unitful quantities. So float(x) ends up to be something else than a subtype of AbstractFloat, e.g. float(c_0) is 2.99792458e8 m s^-1.
So if we keep fx = float(x) , fx will be type unstable anyhow.
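The two sides of this exchange can be illustrated with Base types only. In the sketch below (variable names are chosen here), Integer(x) either returns an Integer or throws, which is the exception behaviour the reviewer prefers, while float(x) is not guaranteed to return an AbstractFloat even for Base's own Complex, which is the author's point about unitful quantities.

```julia
# Integer(x) returns an Integer for integral values...
converted = Integer(3.0)

# ...and throws for anything else, so a wrong format class surfaces as an error.
threw_inexact = try
    Integer(1.5)
    false
catch err
    err isa InexactError
end

# float(x) keeps the "shape" of the number: a Complex stays Complex.
z = float(1 + 2im)
z_is_complex = z isa Complex{Float64}
z_is_float = z isa AbstractFloat

# A Rational, in contrast, does become a plain Float64.
r_is_float = float(1//3) isa Float64
```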

Well, I think I found a way. I am still not so familiar with Types...
still struggling to formulate something similar like Complex{<:Integer} if I remove the Union{}

Why do you need to separate out different types of complex numbers for defaults?
For things like complex numbers, you'll need to be careful about how width / precision is handled,
so that it doesn't just get treated like a string after the initial conversion to a string (which it looks like your code might be doing), which might end up having critical information truncated.

I think, you misinterpreted my code. I have added new less specific methods, e.g. _pfmt_f(io, x::Number, ...), which are called if the currently existing method does not cover the type, i.e. if x is not a subtype of AbstractFloat in case of _pfmt_f(). The method calls the routine _pfmt_Number_f() which then calls fmt_Number(). (This method needs to be defined by the user, if he wants to add a new number type. For Complex types I have already defined it.)

The last argument of that method is an anonymous formatting routine for the specific type ('i', 'e' or 'f') where the width of the FormatSpec is set to -1. This should be mainly a copy of the Number's show() method where the value is replaced by f(value). The result type of the function needs to be a String which is then passed to _pfmt_s() with the original FormatSpec, which then takes into account the width of the resulting string.

Member

The result type of the function needs to be a String which is then passed to _pfmt_s() with the original FormatSpec, which then takes into account the width of the resulting string.

That's the problem, because depending on the length, it may truncate important information.

Author

I can't see this: f"\{1.3f}(1+2im)" results in "1.000 + 2.000im"
_pfmt_s() only sets the minimal width.
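A minimal sketch of the two-step scheme discussed in this thread, under stated assumptions: `fmt_complex` and `three_decimals` are hypothetical helpers written with Printf and Base, not the PR's fmt_Number() or _pfmt_s(). Each component is formatted without padding, and the width is then applied once to the whole string as a minimum, so a width that is too small does not cut anything off.

```julia
using Printf

# Hypothetical helper (not Format.jl API): `f` formats one real component
# without padding, mirroring the "width = -1" callback described above.
function fmt_complex(z::Complex, f::Function)
    re, imv = real(z), imag(z)
    op = imv < 0 ? "-" : "+"
    string(f(re), " ", op, " ", f(abs(imv)), "im")
end

# Component formatter: fixed-point with three decimals.
three_decimals = x -> @sprintf("%.3f", float(x))

s = fmt_complex(1 + 2im, three_decimals)   # "1.000 + 2.000im"
wide = lpad(s, 20)      # padded on the left to 20 characters
narrow = lpad(s, 5)     # width is only a minimum, nothing is cut off
```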

@hhaensel
Author

hhaensel commented Dec 16, 2019 via email

@hhaensel
Author

hhaensel commented Dec 16, 2019 via email

@ScottPJones
Member

Well, I think I found a way. I am still not so familiar with Types...
still struggling to formulate something similar like Complex{<:Integer} if I remove the Union{}

Why do you need to separate out different types of complex numbers for defaults?
For things like complex numbers, you'll need to be careful about how width / precision is handled,
so that it doesn't just get treated like a string after the initial conversion to a string (which it looks like your code might be doing), which might end up having critical information truncated.

@@ -29,7 +29,10 @@ default_spec!(::Type{T}, ::Type{K}) where {T,K} =
(DEFAULT_FORMATTERS[T] = DEFAULT_FORMATTERS[K]; nothing)

# seed it with some basic default formatters
-for (t, c) in [(Integer,'d'), (AbstractFloat,'f'), (AbstractChar,'c'), (AbstractString,'s')]
+ComplexInteger = Complex{T} where T<:Integer
Member

What this means is that you have to set up defaults for pretty much any arbitrary Complex number, which I don't think is desirable behavior.

@hhaensel
Author

hhaensel commented Dec 16, 2019

I think it is strongly desirable to have a common format for all Complex{Intxx} types. It is just the analogous case to having a common format for all Integer types. The same is true for all Complex{Floatxx} types.
So if you don't like the Union{} so much, why not choose the internal Type{T} where T type?

@ScottPJones
Member

I've added a PR #51 to address some of your issues, without adding new format characters.
It adds a lot more generality to the defaults for types, by allowing any keyword arguments to be used (and preserved for use by reset!), which allows for making numbers right-aligned.

I'd like to collaborate with you, to make sure that the functionality you want is added to Format (and the formatted strings), and make sure that your contributions are noted in the git log as well.

What things still need to be done, in your opinion, once PR #51 is merged?
I agree that handling Complex numbers in a better way is needed, as well as other numeric types such as the ones you mentioned with units, but it is better to make changes in separate PRs, instead of putting too much in one single PR.

@ScottPJones
Member

One issue for formatted display of numbers like complex or the ones with units, is how the width and precision values should be used. For strings, the precision gives the maximum length, which should not be used in that fashion for numbers. The width needs to apply to the entire string, not each number separately (as in a Complex number)

@hhaensel
Author

Pleasure to work with you! Your PR covers a lot of things I wouldn't have written that fast.

One issue for formatted display of numbers like complex or the ones with units, is how the width and precision values should be used.

  • In my code I use the same semantics as for real numbers, just that the same formatting is used for both real and imaginary part.
    Complex{<:Integer} are formatted as Integers, so prec is not used. width is used, but for the whole resulting string.
    Complex{<:AbstractFloat} are formatted as Floats, so prec is used, whereas width is still not used.
    Complex{Rational} is formatted as Rationals in each of the parts.
    If I force a formatting, such as 'f' or 'e', fx = float(x) converts to the respective float type which is then appropriately formatted. This only works, if we give up type stability for fx. Otherwise we would have to check the resulting type manually, but I thought this would be the strength of Julia 😉.

For strings, the precision gives the maximum length, which should not be used in that fashion for numbers. The width needs to apply to the entire string, not each number separately (as in a Complex number)

  • If I execute fmt("Hello World!", 5), I still receive "Hello World!" and not "Hello" ... 🤔

@hhaensel
Author

I will close this PR and build a new one on top of #51

@hhaensel hhaensel closed this Dec 17, 2019
@hhaensel hhaensel mentioned this pull request Dec 17, 2019
@ScottPJones
Member

ScottPJones commented Dec 18, 2019

If I execute fmt("Hello World!", 5), I still receive "Hello World!" and not "Hello"

This is why I was explaining that there are different formatting methods, which are inconsistent,
i.e. format, fmt, cfmt, and pyfmt.
pyfmt uses a Python-like format string (it's missing '%' and 'g'/'G').
fmt uses the Python-like formatting code, using the default setup (this was part of Tom Breloff's PR#10 that was never merged into Formatting.jl, which is why I made the fork years ago).
cfmt generates handlers using the built-in Julia @printf.

For example:

julia> @printf "%.5s" "Hello World!"
Hello
julia> v = "Hello World!" ; f"\%.5s(v)"
"Hello"

The Python style formatters ignore precision for strings, unlike the C style ones, i.e.:

julia> f"\{.5s}(v)"
"Hello World!"
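The C-style semantics referred to here can be reproduced with the Printf standard library alone. The sketch below (variable names are chosen here) contrasts precision on strings, which caps the length, with width, which is only a minimum, and with precision on floats, which counts decimals.

```julia
using Printf

v = "Hello World!"

# Precision on a string caps its length.
truncated = @sprintf("%.5s", v)     # "Hello"
# Width alone is only a minimum, so nothing is cut off.
min_width = @sprintf("%5s", v)      # "Hello World!"
# A larger width right-aligns the string in the field.
padded = @sprintf("%15s", v)
# For floats, precision means the number of decimals instead.
number = @sprintf("%8.3f", 2.5)     # "   2.500"
```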

@hhaensel
Author

Oh wow, that is indeed an important piece of information that I was missing. That makes things more complicated.
I will nevertheless slow down a bit due to personal reasons.
