Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.open doesn't accept Union[bytes, Text] #1943

Closed
euresti opened this issue Jul 26, 2016 · 8 comments
Closed

os.open doesn't accept Union[bytes, Text] #1943

euresti opened this issue Jul 26, 2016 · 8 comments

Comments

@euresti
Copy link
Contributor

euresti commented Jul 26, 2016

The following code:

from typing import Text, Union
import os

def test(path):
    # type: (Union[bytes, Text]) -> None
    f = os.open(path, os.O_RDONLY)

Passes with --py2 but fails in python3 as follows:

error: Type argument 1 of "open" has incompatible value "Union[bytes, str]"

@JukkaL

@gvanrossum
Copy link
Member

FWIW if we change os.open() to take Union[str, bytes] the error goes away, but I'm sort of thinking that passing that union to something expecting AnyStr should work...

@JukkaL
Copy link
Collaborator

JukkaL commented Jul 27, 2016

Yes, Union[str, bytes] should probably be sometimes valid when AnyStr is expected (but it isn't right now). Example:

def f(x: AnyStr) -> AnyStr:
    return x

def g(x: Union[str, bytes]) -> None:
    f(x)  # should be fine, return type Union[str, bytes]

However, List[Union[str, bytes]] should not be compatible with List[AnyStr] because the latter is uniform but the prior could have a mix of str and bytes. Also, this shouldn't be valid, following similar reasoning:

def f2(x: AnyStr, y: AnyStr) -> AnyStr:
    return x + y

def g2(x: Union[str, bytes], y: Union[str, bytes]) -> None:
    f2(x, y)  # error, could be a mix of str and bytes

Here's how we could support these (and other things not involving type variables, such as overloads):

  • If caller has arguments with union types, expand all the combinations of the union types into non-union types and test the call with each of these expansions. The call is valid if all combinations type check okay.

The call to f2 in the second example would be checked 4 times:

  • f2(<str>, <str>)
  • f2(<str>, <bytes>)
  • f2(<bytes>, <str>)
  • f2(<bytes>, <bytes>)

The obvious problem with this is that this could be really slow if there are many union types in a call. An implementation could plausibly be clever and only expand those unions where it seems like a useful thing to do.

@gnprice
Copy link
Collaborator

gnprice commented Aug 4, 2016

I think the right fix to @euresti's original issue is that the type of os.open's parameter should not be AnyStr -- it should be Union[bytes, str]. In general, AnyStr doesn't provide any additional value when it appears only once in a type signature, and it introduces complexity because its semantics are somewhat complicated.

It's possible we should also handle the interaction of AnyStr with Union differently in general. I think that ought to be a separate issue if we want to pursue it further.

@gnprice
Copy link
Collaborator

gnprice commented Aug 4, 2016

Closing in favor of the typeshed issue -- @JukkaL , I haven't fully absorbed your comment, but if you think there's more to discuss please do make an issue for it.

Thanks for reporting this, David!

@gvanrossum
Copy link
Member

This issue wasn't actually closed, and I don't think it should be just yet. I still think that a function with exactly one AnyStr parameter (not occurring in a container) should be supported -- as @matthiaskramm mentioned it is a pretty convenient notation. That said I agree we should just fix os.open() -- and all other os functions that have just one string argument and that don't return a string.

@gnprice
Copy link
Collaborator

gnprice commented Aug 11, 2016

I don't want to ban using AnyStr just once in a function signature, but I think the semantics as currently written in PEP 484 are pretty complicated (details in that typeshed thread). Do you think what we're currently doing differs from those semantics? Or is there another fix you think we can make here?

@JukkaL
Copy link
Collaborator

JukkaL commented Jan 18, 2017

Increasing priority to normal since issues with union types are frequently causing trouble for users. This may also affect code that uses --strict-optional.

@gvanrossum gvanrossum removed this from the Undetermined priority milestone Mar 29, 2017
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

One thing led to another, and this ended up accidentally fixing or
touching on several different overload-related issues. In particular,
I believe this pull request:

1.  Fixes the bug (?) where calling overloaded functions can sometimes
    silently infer a return type of 'Any'

2.  Changes the semantics of how mypy handles overlapping functions,
    which I believe is currently under discussion in python/typing#253

Although this change is functional and mergable, I was planning on
polishing it more -- adding more tests, fleshing out the union math
behavior, etc.

However, I think these are sort of big changes and wanted to check in
and make sure this pull request is actually welcome/is a good idea.
If not, let me know, and I'd be happy to abandon it.

---

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, I needed to fix a few parts of mypy that were
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  These changes cause the attr stubs in `test-data/unit/lib-stub` to
    no longer work. It seems that the stubs both here and in typeshed
    were both also falling prey to the 'silently infer Any' bug: code
    like `a = attr.ib()` typechecked not because they matched the
    signature of any of the overloads, but because that particular call
    caused one or more overloads to overlap, which made mypy give up and
    infer Any.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

One thing led to another, and this ended up accidentally fixing or
touching on several different overload-related issues. In particular,
I believe this pull request:

1.  Fixes the bug (?) where calling overloaded functions can sometimes
    silently infer a return type of 'Any'

2.  Changes the semantics of how mypy handles overlapping functions,
    which I believe is currently under discussion in python/typing#253

Although this change is functional and mergable, I was planning on
polishing it more -- adding more tests, fleshing out the union math
behavior, etc.

However, I think these are sort of big changes and wanted to check in
and make sure this pull request is actually welcome/is a good idea.
If not, let me know, and I'd be happy to abandon it.

---

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, I needed to fix a few parts of mypy that were
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  These changes cause the attr stubs in `test-data/unit/lib-stub` to
    no longer work. It seems that the stubs both here and in typeshed
    were both also falling prey to the 'silently infer Any' bug: code
    like `a = attr.ib()` typechecked not because they matched the
    signature of any of the overloads, but because that particular call
    caused one or more overloads to overlap, which made mypy give up and
    infer Any.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 11, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 23, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
@ilevkivskyi
Copy link
Member

Currently the original example passes in both Python 2 and Python 3 modes. So I am closing this since we have another issue for tracking union maths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants