os.open doesn't accept Union[bytes, Text] #1943

euresti · 2016-07-26T23:48:25Z

The following code:

from typing import Text, Union
import os

def test(path):
    # type: (Union[bytes, Text]) -> None
    f = os.open(path, os.O_RDONLY)

Passes with --py2 but fails in python3 as follows:

error: Type argument 1 of "open" has incompatible value "Union[bytes, str]"

@JukkaL

The text was updated successfully, but these errors were encountered:

gvanrossum · 2016-07-26T23:53:13Z

FWIW if we change os.open() to take Union[str, bytes] the error goes away, but I'm sort of thinking that passing that union to something expecting AnyStr should work...

JukkaL · 2016-07-27T09:49:38Z

Yes, Union[str, bytes] should probably be sometimes valid when AnyStr is expected (but it isn't right now). Example:

def f(x: AnyStr) -> AnyStr:
    return x

def g(x: Union[str, bytes]) -> None:
    f(x)  # should be fine, return type Union[str, bytes]

However, List[Union[str, bytes]] should not be compatible with List[AnyStr] because the latter is uniform but the prior could have a mix of str and bytes. Also, this shouldn't be valid, following similar reasoning:

def f2(x: AnyStr, y: AnyStr) -> AnyStr:
    return x + y

def g2(x: Union[str, bytes], y: Union[str, bytes]) -> None:
    f2(x, y)  # error, could be a mix of str and bytes

Here's how we could support these (and other things not involving type variables, such as overloads):

If caller has arguments with union types, expand all the combinations of the union types into non-union types and test the call with each of these expansions. The call is valid if all combinations type check okay.

The call to f2 in the second example would be checked 4 times:

f2(<str>, <str>)
f2(<str>, <bytes>)
f2(<bytes>, <str>)
f2(<bytes>, <bytes>)

The obvious problem with this is that this could be really slow if there are many union types in a call. An implementation could plausibly be clever and only expand those unions where it seems like a useful thing to do.

gnprice · 2016-08-04T21:25:52Z

I think the right fix to @euresti's original issue is that the type of os.open's parameter should not be AnyStr -- it should be Union[bytes, str]. In general, AnyStr doesn't provide any additional value when it appears only once in a type signature, and it introduces complexity because its semantics are somewhat complicated.

It's possible we should also handle the interaction of AnyStr with Union differently in general. I think that ought to be a separate issue if we want to pursue it further.

gnprice · 2016-08-04T21:28:45Z

Closing in favor of the typeshed issue -- @JukkaL , I haven't fully absorbed your comment, but if you think there's more to discuss please do make an issue for it.

Thanks for reporting this, David!

gvanrossum · 2016-08-04T22:49:12Z

This issue wasn't actually closed, and I don't think it should be just yet. I still think that a function with exactly one AnyStr parameter (not occurring in a container) should be supported -- as @matthiaskramm mentioned it is a pretty convenient notation. That said I agree we should just fix os.open() -- and all other os functions that have just one string argument and that don't return a string.

gnprice · 2016-08-11T22:09:23Z

I don't want to ban using AnyStr just once in a function signature, but I think the semantics as currently written in PEP 484 are pretty complicated (details in that typeshed thread). Do you think what we're currently doing differs from those semantics? Or is there another fix you think we can make here?

JukkaL · 2017-01-18T18:25:25Z

Increasing priority to normal since issues with union types are frequently causing trouble for users. This may also affect code that uses --strict-optional.

This commit adds support for very basic and simple union math when calling overloaded functions, resolving python#4576. One thing led to another, and this ended up accidentally fixing or touching on several different overload-related issues. In particular, I believe this pull request: 1. Fixes the bug (?) where calling overloaded functions can sometimes silently infer a return type of 'Any' 2. Changes the semantics of how mypy handles overlapping functions, which I believe is currently under discussion in python/typing#253 Although this change is functional and mergable, I was planning on polishing it more -- adding more tests, fleshing out the union math behavior, etc. However, I think these are sort of big changes and wanted to check in and make sure this pull request is actually welcome/is a good idea. If not, let me know, and I'd be happy to abandon it. --- Details on specific changes made: 1. The new algorithm works by modifying checkexpr.overload_call_targets to return all possible matches, rather then just one. We start by trying the first matching signature. If there was some error, we (conservatively) attempt to union all of the matching signatures together and repeat the typechecking process. If it doesn't seem like it's possible to combine the matching signatures in a sound way, we end and just output the errors we obtained from typechecking the first match. The "signature-unioning" code is currently deliberately very conservative. I figured it was better to start small and attempt to handle only basic cases like python#1943 and relax the restrictions later as needed. For more details on this algorithm, see the comments in checkexpr.union_overload_matches. 2. This change incidentally resolves any bugs related to how calling an overloaded function can sometimes silently infer a return type of Any. Previously, if a function call caused an overload to be less precise then a previous one, we gave up and returned a silent Any. This change removes this case altogether and only infers Any if either (a) the caller arguments explicitly contains Any or (b) if there was some error. For example, see python#3295 and python#1322 -- I believe this pull request touches on and maybe resolves (??) those two issues. 3. As a result, I needed to fix a few parts of mypy that were relying on this "silently infer Any" behavior -- see the changes in checker.py and semanal.py. Both files were using expressions of the form `zip(*iterable)`, which ended up having a type of `Any` under the old algorithm. The new algorithm will instead infer `Iterable[Tuple[Any, ...]]` which actually matches the stubs in typeshed. 4. These changes cause the attr stubs in `test-data/unit/lib-stub` to no longer work. It seems that the stubs both here and in typeshed were both also falling prey to the 'silently infer Any' bug: code like `a = attr.ib()` typechecked not because they matched the signature of any of the overloads, but because that particular call caused one or more overloads to overlap, which made mypy give up and infer Any. I couldn't find a clean way of fixing the stubs to infer the correct thing under this new behavior, so just gave up and removed the overloads altogether. I think this is fine though -- it seems like the attrs plugin infers the correct type for us anyways, regardless of what the stubs say. If this pull request is accepted, I plan on submitting a similar pull request to the stubs in typeshed. 4. This pull request also probably touches on python/typing#253. We still require the overloads to be written from the most narrow to general and disallow overlapping signatures. However, if a *call* now causes overlaps, we try the "union" algorithm described above and default to selecting the first matching overload instead of giving up.

This commit adds support for very basic and simple union math when calling overloaded functions, resolving python#4576. As a side effect, this change also fixes a bug where calling overloaded functions can sometimes silently infer a return type of 'Any' and slightly modifies the semantics of how mypy handles overlaps in overloaded functions. Details on specific changes made: 1. The new algorithm works by modifying checkexpr.overload_call_targets to return all possible matches, rather then just one. We start by trying the first matching signature. If there was some error, we (conservatively) attempt to union all of the matching signatures together and repeat the typechecking process. If it doesn't seem like it's possible to combine the matching signatures in a sound way, we end and just output the errors we obtained from typechecking the first match. The "signature-unioning" code is currently deliberately very conservative. I figured it was better to start small and attempt to handle only basic cases like python#1943 and relax the restrictions later as needed. For more details on this algorithm, see the comments in checkexpr.union_overload_matches. 2. This change incidentally resolves any bugs related to how calling an overloaded function can sometimes silently infer a return type of Any. Previously, if a function call caused an overload to be less precise then a previous one, we gave up and returned a silent Any. This change removes this case altogether and only infers Any if either (a) the caller arguments explicitly contains Any or (b) if there was some error. For example, see python#3295 and python#1322 -- I believe this pull request touches on and maybe resolves (??) those two issues. 3. As a result, this caused a few errors in mypy where code was relying on this "silently infer Any" behavior -- see the changes in checker.py and semanal.py. Both files were using expressions of the form `zip(*iterable)`, which ended up having a type of `Any` under the old algorithm. The new algorithm will instead infer `Iterable[Tuple[Any, ...]]` which actually matches the stubs in typeshed. 4. Many of the attrs tests were also relying on the same behavior. Specifically, these changes cause the attr stubs in `test-data/unit/lib-stub` to no longer work. It seemed that expressions of the form `a = attr.ib()` were evaluated to 'Any' not because of a stub, but because of the 'silent Any' bug. I couldn't find a clean way of fixing the stubs to infer the correct thing under this new behavior, so just gave up and removed the overloads altogether. I think this is fine though -- it seems like the attrs plugin infers the correct type for us anyways, regardless of what the stubs say. If this pull request is accepted, I plan on submitting a similar pull request to the stubs in typeshed. 4. This pull request also probably touches on python/typing#253. We still require the overloads to be written from the most narrow to general and disallow overlapping signatures. However, if a *call* now causes overlaps, we try the "union" algorithm described above and default to selecting the first matching overload instead of giving up.

ilevkivskyi · 2018-05-17T21:18:33Z

Currently the original example passes in both Python 2 and Python 3 modes. So I am closing this since we have another issue for tracking union maths.

gnprice mentioned this issue Aug 4, 2016

os.open should not have AnyStr in its type python/typeshed#439

Closed

gnprice added this to the Undetermined priority milestone Aug 11, 2016

gnprice added the feature label Aug 11, 2016

Michael0x2a added Bytes-Unicode and removed feature labels Aug 11, 2016

gnprice added the feature label Aug 11, 2016

gvanrossum added priority-2-low and removed Bytes-Unicode labels Jan 6, 2017

JukkaL added topic-union-types priority-1-normal and removed priority-2-low labels Jan 18, 2017

gvanrossum removed this from the Undetermined priority milestone Mar 29, 2017

JukkaL mentioned this issue Apr 26, 2017

Warn about unconstrained typevars #3254

Closed

JukkaL mentioned this issue Jun 16, 2017

Unions of collectively valid argument types not accepted by @overload groups #3556

Closed

Michael0x2a mentioned this issue Apr 2, 2018

Adds support for basic union math with overloads #4842

Closed

Michael0x2a mentioned this issue May 15, 2018

Clarify what @overload means python/typing#253

Closed

ilevkivskyi closed this as completed May 17, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

os.open doesn't accept Union[bytes, Text] #1943

os.open doesn't accept Union[bytes, Text] #1943

euresti commented Jul 26, 2016

gvanrossum commented Jul 26, 2016

JukkaL commented Jul 27, 2016

gnprice commented Aug 4, 2016

gnprice commented Aug 4, 2016

gvanrossum commented Aug 4, 2016

gnprice commented Aug 11, 2016

JukkaL commented Jan 18, 2017

ilevkivskyi commented May 17, 2018

os.open doesn't accept Union[bytes, Text] #1943

os.open doesn't accept Union[bytes, Text] #1943

Comments

euresti commented Jul 26, 2016

gvanrossum commented Jul 26, 2016

JukkaL commented Jul 27, 2016

gnprice commented Aug 4, 2016

gnprice commented Aug 4, 2016

gvanrossum commented Aug 4, 2016

gnprice commented Aug 11, 2016

JukkaL commented Jan 18, 2017

ilevkivskyi commented May 17, 2018