This bug, the HN thread and how hard it is to fix (https://bugs.python.org/issue...

deanCommie · on Jan 14, 2020

An extremely obscure and rare scenario that exposes a flaw with a convenient tool does not negate the value of the tool in all other cases.

Are you also going to throw out all use of automatic garbage collection in languages because of a bug caused by a GC pause.

Should we stop bothering with encryption because people discover weaknesses and attack vectors?

Come on, man.

> Also, KeyboardInterrupt should not be an exception. It's a signal, not an error or exceptional situation that code wants to report to the caller.

OP doesn't mention anything about KeyboardInterrupt, but I'll bite.

Some languages like C# make "Events" a 1st party citizen, making it really easy to write non-sequential code and handling such events.

Most don't, so when people want to GUARANTEE a caller handles a particular signal, they use the Exception tool because it posesses the necessary qualities.

"Using Exceptions as flow control" is a well-known anti-pattern but even anti-patterns have their use in exceptional circumstances.

> ("randomly" as in: why is "open file" throwing an exception when file doesn't exist but "find substring" returns -1 when substring doesn't exist?)

I'll concede this one point. In-memory utility methods shouldn't throw exceptions when a useful error code would be just as good - it keeps code cleaner without trycatch blocks. NumberFormatException is my bane in Java.

sicromoft · on Jan 15, 2020

Terminology nit: Flow control is not the same as control flow (which is sometimes referred to as flow of control... yeah it's confusing).

https://en.wikipedia.org/wiki/Flow_control_(data)

https://en.wikipedia.org/wiki/Control_flow

CaptainMarvel · on Jan 14, 2020

Exceptions as flow-control is normal in Python. It’s used all over the place, e.g. generators with StopIteration.

rumanator · on Jan 14, 2020

> Exceptions as flow-control is normal in Python.

Perhaps it's high time that problem was solved?

larkost · on Jan 14, 2020

This isn't a problem, it is a design decision that you evidently don't like. There is a difference.

rumanator · on Jan 15, 2020

> This isn't a problem

It actually is. Abusing exceptions as the normal control flow is a crude anti-pattern. Obviously the normal flow of control is not exceptional behavior. Therefore, abusing exceptions and exception-handling mechanisms to implement the happy path is simply wrong at many levels, from conceptual to practical, and in the end actually cause problems and hard to find buga such as the problems described in this bug report.

minitech · on Jan 15, 2020

As a fan of Python, I agree that it’s a problem, and probably one of Python’s worst design decisions. It doesn’t achieve anything over sentinels, and results in lots of potential for hidden bugs (especially in the case of __getattr__, and before PEP 479). Comparison overloading even does use NotImplemented instead of an exception.

maweki · on Jan 14, 2020

Especially on StopIteration this is a good way to keep the interface small when lacking static typing.

A generator in Java has the has_next() and next() method and this can be statically checked and you're basically only allowed to call next is hasnext is true.

In python the generator only has to provide the next-method and raises an exception when it's exhausted. I don't find this particularly unclean.

masklinn · on Jan 15, 2020

> NumberFormatException is my bane in Java.

Surely that’s because of java? conversions normally throw in python and that’s no issue. So does indexing. So does str.index.

Dylan16807 · on Jan 15, 2020

> OP doesn't mention anything about KeyboardInterrupt

Please go check on the original post.

ken · on Jan 15, 2020

> a pretty good argument against exceptions

Every time there’s a subtle bug with 7 contributing factors, people pick the one which is caused by a design decision they happen to hate, and place all of the blame squarely on that one.

joshuamorton · on Jan 14, 2020

> ("randomly" as in: why is "open file" throwing an exception when file doesn't exist but "find substring" returns -1 when substring doesn't exist?)

Because there can be lots of ways for `open` to throw an exception (encoding issues, not found, permissions, etc.). The underlying open syscall has 39 error codes. Python adds a few more failure modes atop that. I'd much rather deal with named error conditions than deal with return values -1 through -50.

Find substring can only fail in one way.

incompatible · on Jan 15, 2020

Even C can map the returns values -1 through -50 to symbolic constants, typically defined in preprocessor headers.

joshuamorton · on Jan 15, 2020

Absolutely, but dealing with categories of errors is much harder with unstructured error codes.

winstonewert · on Jan 15, 2020

My question is: what is your alternative to exceptions, and how would they would have made the error easier to spot? Its not obvious that the error would have been easier to spot if it had been written using Go's `!= nil` pattern or Rust's `?` pattern.

My experience is that while its true that its easy to have bugs in implicit exception handling logic, its just as easy, maybe easier, to have bugs in the explicit boilerplate its replacing.

imron · on Jan 15, 2020

> Its not obvious that the error would have been easier to spot if it had been written using Go's `!= nil` pattern or Rust's `?` pattern.

I like Rust's `?` operator (and Swift's `guard` statement also), but it's worth noting that the bug wouldn't have existed in Rust in the first place because the file object wouldn't have outlived the scope it was defined in.

masklinn · on Jan 15, 2020

Also and really because it’s quite hard to go and replace implementation details with code which behaves completely differently and implements an unrelated contract entirely.

Dylan16807 · on Jan 15, 2020

> how would they would have made the error easier to spot?

There was a try block with two function calls, and it seems like the author assumed only the first function call could throw.

With return-based errors, the error handling wouldn't implicitly be shared between the two functions. The author would write the handling for the first function, and then they would notice the second function could return an error state too. Then they might write a correct handler, or at least we should expect them to say "this can't happen, so panic". This bug would have been trivial to fix if it panicked instead of corrupting fds.

winstonewert · on Jan 15, 2020

What makes you think that "the author assumed only the first function call could throw?"

Programmers in exception languages never think (or at least should never think) "X cannot throw" because it is never true. Any function has the potential of throwing an exception. This is even more true in Python than in other exception throwing languages.

No, the problem here is that the author has realized that they need special cleanup logic in the case that a file fails to be opened, but they applied that logic too broadly.

Dylan16807 · on Jan 15, 2020

So for reference the second line is:

  return _TemporaryFileWrapper(file, name, delete)

> What makes you think that "the author assumed only the first function call could throw?"

Because it's almost true, and the code cleanup is clearly designed around just the first function.

_TemporaryFileWrapper does nothing in its __init__ except copy the four parameters into the object. There's no reason to anticipate a problem.

But thinking about it more precisely, there's an even worse issue.

Even if neither function has an error, if you get a KeyboardInterrupt between the object creation and the actual return, everything gets corrupted.

People aware of the bug, trying to fix this code, suggested a nested try/catch, and I think that would still explode if the timing was exactly wrong.

Because really, who thinks about a return statement throwing? It's an unintuitive problem that can only happen with this style of exceptions.

It's probably even worse than that. If an exception can hit right after the file object is made then you get a similar problem...

Is there anything in Python that restricts the timing of KeyboardException or is this entire adventure on extremely treacherous territory?

winstonewert · on Jan 16, 2020

That's precisely the point I was making: there is no python code which is guaranteed not to throw. If the original author thought that the line of code in question wouldn't throw, they were were wrong and fundamentally misunderstood Python.

Dylan16807 · on Jan 16, 2020

Well the code wouldn't have caused any errors if it only caught normal exceptions, at least.

> That's precisely the point I was making: there is no python code which is guaranteed not to throw.

Well the first thing you said is "what is your alternative to exceptions, and how would they would have made the error easier to spot? Its not obvious that the error would have been easier to spot if it had been written using Go's `!= nil` pattern or Rust's `?` pattern."

In those languages, you can't have an error state trigger between lines of code, or inside a line of code that does basic things like assigning to a local variable or returning.

It won't happen in C++ either (with certain assumptions).

Python is a language that's supposed to be straightforward, and very few coders are going to understand exactly how hostile exceptions can be. So while these assumptions shouldn't have been made, the blame goes toward the design of the language. An alternative system would be better.

winstonewert · on Jan 16, 2020

> Well the code wouldn't have caused any errors if it only caught normal exceptions, at least.

Uhh... it was a normal exception that triggered the bug in question.

> Well the first thing you said is

Yes, that's the first thing I said. But the point I'm making here is that your initial response to that is probably wrong: "the author assumed only the first function call could throw" That's just not how you think in Python. (Or at least it is not how you should think.)

My point is this: the bug wasn't assuming that the second function wouldn't fail. The bug was assuming that the same cleanup code was appropriate in the case of either failure. That same bug could easily exist with return codes as it does in exceptions.

> Python is a language that's supposed to be straightforward

In fairness to Python, this is an atypical case. The problem arises because this is low level code dealing directly with fds. Most of the time, code will deal with higher level objects which will properly clean up after themselves regardless of what crazy stuff you do with them.

Dylan16807 · on Jan 16, 2020

> Uhh... it was a normal exception that triggered the bug in question.

That doesn't count because it was an invasive monkeypatch that took a function that could not possibly cause an error and made it throw.

> My point is this: the bug wasn't assuming that the second function wouldn't fail. The bug was assuming that the same cleanup code was appropriate in the case of either failure.

Maybe. Hard to know for sure.

> In fairness to Python, this is an atypical case.

I dunno, pretty much any code that uses a catch block to undo things has to be very carefully written. Even if you're wrapping things in higher level objects, you still have ugly scenarios where you're mutating a data structure and have to unwind the changes halfway through. I bet tons of that code is written under the assumption that there will be no exceptions.

I guess if we consider "trying to catch KeyboardInterrupt at any level without immediate exit" as a weird barely-supported case, then things are fine.

winstonewert · on Jan 16, 2020

> That doesn't count because it was an invasive monkeypatch that took a function that could not possibly cause an error and made it throw.

The original claim was based on that possibility, its pretty goofy to declare it doesn't count.

> I dunno, pretty much any code that uses a catch block to undo things has to be very carefully written. Even if you're wrapping things in higher level objects, you still have ugly scenarios where you're mutating a data structure and have to unwind the changes halfway through. I bet tons of that code is written under the assumption that there will be no exceptions.

Is it any different for return codes than for exceptions? I'd say that any code which attempts to undo change in the case of an error has to be carefully written. Every special case on the ordinary code path, has to have matching special cases on the error paths. In fact, I've never seen code in any language that does such a thing. And I wouldn't trust if it I did, since its extremely difficult to get right for all cases, and the errors paths almost certainly hasn't been well tested.

> I guess if we consider "trying to catch KeyboardInterrupt at any level without immediate exit" as a weird barely-supported case, then things are fine.

Well, trying to continue running after KeyboardInterrupt is dubious, the user just asked the program to terminate.

mixmastamyk · on Jan 15, 2020

A non-sequitur me thinks, as the code doesn't look simple and correct. File needs to be closed on error if it exists, fd if not. i.e. The broken version looks broken.

While there is lots of meta-complexity around this, the bug itself is relatively simple. Make sure files are closed properly on error.

Also find -1 is a wrapper from C, probably kept for compatibility reasons.

simion314 · on Jan 14, 2020

I do not see how exceptions caused the bug, seems the cause is a bug in a GC language d that is wrapping a low level code/resource and not releasing it properly.

masklinn · on Jan 15, 2020

In a langage without exceptions the file wrapper being trivial it would not have returned any sort of error condition and thus would not have been checked for it.

It is an issue of exceptions that try clauses can easily be overly broad in the amount of code they cover.

simion314 · on Jan 15, 2020

I still do not understand, I did not see where the exception caused the issue in this case, you can have bugs with error codes that are not checked and that blow up much later so I am confused by your vague response (maybe you prefer checked excpetion so you don't ignore problems, I prefer those because they force correctness over developer comfort).

masklinn · on Jan 15, 2020

> I did not see where the exception caused the issue in this case

    try:
        file = _io.open(fd, mode, buffering=buffering,
                        newline=newline, encoding=encoding, errors=errors)

        return _TemporaryFileWrapper(file, name, delete)
    except BaseException:
        _os.unlink(name)
        _os.close(fd)
        raise

The except clause was built under the assumption that only opening the fd would fail, and thus the cleanup it performs is in those terms, especially the part about closing the file descriptor and ignoring the file object (since that's assumed not to possibly exist).

However due to the mock it's the wrapper creation which fails. This means the cleanup runs, closes the fd, but leaves the file object (created by io.open) around and assuming it owns an fd, the file object later on gets collected by the GC and closes the fd a second time, breaking anyone unlucky enough to have gotten its reuse.

This issue would not happen if the wrapper were created outside the `try` statement (though the file would be leaking to be eventually closed by the GC which is not great either)

simion314 · on Jan 15, 2020

Thanks for spending the time explaining the problematic code.

rocqua · on Jan 15, 2020

You could get a very similar issue in C++ where a destructor closes an FD that was already erroneously closed by an earlier call to a member function due to an exception.

The issue here is a bug in resource management. The specific bug was caused by an exception that shouldn't be caught still being caught. Which erroneously messed with resource management.

simion314 · on Jan 15, 2020

I did not see that in the snippets I will try to find the section, was the bug in the code that was catching the exception or it was thrown by mistake. I agree that is a resource management issue and you often make this kind of bugs when you try to reuse objects (for performance reasons most of the time)

thaumasiotes · on Jan 15, 2020

> why is "open file" throwing an exception when file doesn't exist but "find substring" returns -1 when substring doesn't exist?

Well, first, there's more to it than that. "Open file" will throw an exception when you open the file to read and it doesn't exist. If you open the file to write and it doesn't exist, you won't get an exception because that's expected behavior. Instead, the file will be automatically created behind the scenes. The complementary scenario, where you're opening a file to write and the file already exists, is more likely to be a bug.

Exceptions have semantic qualities and depend on what you're trying to do. Trying to read a file that doesn't exist is obviously an error state because it is inherently impossible to do. The same goes for detecting the location of a substring within a string that doesn't contain it. But examining a string to see whether it contains a particular substring has no such problem. You can always test a string for the inclusion of some other string. So the judgment of whether "find substring" should throw an exception or not depends on whether you think the question being asked when someone invokes "find substring" is "is substring in there?" (answer of "no" is not exceptional) or "I know substring is in there; where is it?" (answer of "it isn't in there, you doofus" is exceptional). Both questions are common; this isn't really a call the language should be making.

The same problem is more frequently discussed in the context of hash tables. If I look up a key in a hash table and I get a null value, is that because there is no value stored for that key, or is it because null is the value stored for that key? People would like accessing the table to be convenient while distinguishing these two cases, and python doesn't allow that. You can choose from some different behaviors:

    table[key]     # Absence of the key is exceptional.
    
    table.get(key) # Absence of the key will yield None...
                   # but this is indistinguishable from
                   # a key with the stored value None.
    
    key in table   # This directly tests whether key is
                   # present, but won't tell you the stored value.

The true solution is for lookups to return multiple values, one to tell you whether the key was present and the other to tell you what its value was, if anything. But this is rare[1] because people don't want to handle multiple return values from every hash lookup.

But you can store any value, even null, in a hash table -- that's their thing. Finding substrings doesn't have this problem; a legal result must be a valid index into the string. We can solve the how-do-we-want-the-hash-table-to-behave problem perfectly by returning an invalid index when the substring isn't present, and... that's what we do.

[1] Common Lisp addresses this by allowing functions to return multiple values (distinct from a single value with multiple subvalues) -- it's possible to treat the return value of a multiple-valued function as a single value, in which case you get only the first value. This is great for the hash table example, but in general it makes this easier while making it much more annoying to handle the multiple values when you do want them all.

depressedpanda · on Jan 15, 2020

> The same problem is more frequently discussed in the context of hash tables. If I look up a key in a hash table and I get a null value, is that because there is no value stored for that key, or is it because null is the value stored for that key? People would like accessing the table to be convenient while distinguishing these two cases, and python doesn't allow that. You can choose from some different behaviors:

Or, you can create a second null-like type in the language, name it 'undefined' perhaps, and return that when the key doesn't exist.

Of course, that may be considered kicking the can down the road as now you will have to trust developers to be competent enough to never store 'undefined' in the hash table, which I suppose is the reason you didn't mention JavaScript's solution to the problem.

masklinn · on Jan 15, 2020

The actual fix is to have a completely separate wrapper type (usually an option type) which signals in a type-safe and unambiguous manner that the key was missing.

Then you don’t have the issues of in-band signalling.

thaumasiotes · on Jan 15, 2020

This isn't really different from the multiple values approach. For example, python could have table accesses return (None, True) for a None value that was present, (None, False) for a value that wasn't present, (5, True) for a 5 value that was present...

Is that multiple values? (Yes.) Or a wrapper type? (Also yes, it's a tuple, a collection of values.) This works, but it's unpopular because it makes every table access more difficult.

masklinn · on Jan 15, 2020

> This isn't really different from the multiple values approach.

If you ignore most of what I said then it’s not.

> Is that multiple values? (Yes.) Or a wrapper type? (Also yes, it's a tuple, a collection of values.)

It’s neither type-safe nor unambiguous.

> This works

For low value of works. Which I guess is all you can ask for in a dynamically typed langage (eg Erlang uses this pattern to fairly good effect, but it has very good support for it).

But it should not be confused with an actual solution to the issue. It can make things less bad (again given good support for this pattern which Python does not have), not actually good.

thaumasiotes · on Jan 15, 2020

> It’s neither type-safe nor unambiguous.

How'd you get here? It's exactly as type-safe as the unwrapped hash table, which is admittedly not especially type-safe, and it's fully unambiguous.

masklinn · on Jan 15, 2020

> It's exactly as type-safe as the unwrapped hash table, which is admittedly not especially type-safe

Yes.

> and it's fully unambiguous.

No, you have any random value flagged as not present (in fact you can forget about or ignore the flag entirely), and no clue if that means anything.

thaumasiotes · on Jan 15, 2020

Can you provide an example of what you mean by "ambiguity"?

thaumasiotes · on Jan 15, 2020

> Or, you can create a second null-like type in the language, name it 'undefined' perhaps, and return that when the key doesn't exist.

No, this isn't a solution. This is like terminating C strings with 0 characters, or terminating TSV fields with tabs. It sounds like it makes sense, until you ask "what if the terminator occurs within the data?"

Eventually someone's going to need to store undefined. It's no different than null. If it were a competence issue (which it isn't), we'd be saying that storing null in the hash table was a mistake, not something to work around with null2.

Izkata · on Jan 15, 2020

Javascript's solution is actually obj.hasOwnProperty(key) and the "in" operator ("in" checks the inheritance tree, hasOwnProperty doesn't).

Checking for undefined is better known/easily discoverable but hasn't been the right way to do it since at least IE 5.5 / Firefox 1.

masklinn · on Jan 15, 2020

> We can solve the how-do-we-want-the-hash-table-to-behave problem perfectly by returning an invalid index when the substring isn't present, and... that's what we do.

Negative indices are valid in python. Also str.index raises.

And product types are not a good way to signal exclusive conditions.

excpointer · on Jan 14, 2020

Popcorn time...