{"type":"rich","version":"1.0","title":"Achow101 [ARCHIVE] wrote","author_name":"Achow101 [ARCHIVE] (npub1wh…d26cj)","author_url":"https://yabu.me/npub1wh7lmdsh2r0ygnp39pk7k5a7mll5x5w44pwn6ekvdvmwjhazr5rqxd26cj","provider_name":"njump","provider_url":"https://yabu.me","html":"📅 Original date posted:2018-06-29\n📝 Original message:Hi,\n\nI do not think that protobuf is the way to go for this. Not only is it another dependency\nwhich many wallets do not want to add (e.g. Armory has not added BIP 70 support because\nof its dependency on protobuf), but it is a more drastic change than the currently proposed\nchanges. The point of this email thread isn't to rewrite and design a new BIP (which is effectively\nwhat is currently going on). The point is to modify and improve the current one. In particular,\nwe do not want such drastic changes that people who have already implemented the current\nBIP would have to effectively rewrite everything from scratch again.\n\nI believe that this discussion has become bikeshedding and is really no longer constructive. Neither\nof us are going to convince the other to use or not use protobuf. ASeeing how no one else\nhas really participated in this discussion about protobuf and key uniqueness, I do not think\nthat these suggested changes are really necessary nor useful to others. It boils down to personal preference\nrather than technical merit. As such, I have opened a PR to the BIPs repo (https://github.com/bitcoin/bips/pull/694)\nwhich contains the changes that I proposed in an earlier email.\n\nAdditionally, because there have been no objections to the currently proposed changes, I propose\nto move the BIP from Draft to Proposed status.\n\nAndrew\n\n\n​​\n\n‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐\n\nOn June 29, 2018 2:53 AM, matejcik via bitcoin-dev \u003cbitcoin-dev at lists.linuxfoundation.org\u003e wrote:\n\n\u003e ​​\n\u003e \n\u003e Short version:\n\u003e \n\u003e -   I propose that conflicting \"values\" for the same \"key\" are considered\n\u003e     \n\u003e     invalid.\n\u003e     \n\u003e -   Let's not optimize for invalid data.\n\u003e -   Given that, there's an open question on how to handle invalid data\n\u003e     \n\u003e     when encountered\n\u003e     \n\u003e     In general, I don't think it's possible to enforce correctness at the\n\u003e     \n\u003e     format level. You still need application level checks - and that calls\n\u003e     \n\u003e     into question what we gain by trying to do this on the format level.\n\u003e     \n\u003e     Long version:\n\u003e     \n\u003e     Let's look at this from a different angle.\n\u003e     \n\u003e     There are roughly two possible \"modes\" for the format with regard to\n\u003e     \n\u003e     possibly-conflicting data. Call them \"permissive\" and \"restrictive\".\n\u003e     \n\u003e     The spec says:\n\u003e     \n\u003e     \"\"\"\n\u003e     \n\u003e     Keys within each scope should never be duplicated; all keys in the\n\u003e     \n\u003e     format are unique. PSBTs containing duplicate keys are invalid. However\n\u003e     \n\u003e     implementors will still need to handle events where keys are duplicated\n\u003e     \n\u003e     when combining transactions with duplicated fields. In this event, the\n\u003e     \n\u003e     software may choose whichever value it wishes.\n\u003e     \n\u003e     \"\"\"\n\u003e     \n\u003e     The last sentence of this paragraph sets the mode to permissive:\n\u003e     \n\u003e     duplicate values are pretty much OK. If you see them, just pick one.\n\u003e     \n\u003e     You seem to argue that Combiners, in particular simple ones that don't\n\u003e     \n\u003e     understand field semantics, should merge keys permissively, but\n\u003e     \n\u003e     deduplicate values restrictively.\n\u003e     \n\u003e     IOW: if you receive two different values for the same key, just pick\n\u003e     \n\u003e     whichever, but $deity forbid you include both!\n\u003e     \n\u003e     This choice doesn't make sense to me.\n\u003e     \n\u003e     What would make sense is fully restrictive mode: receiving two\n\u003e     \n\u003e     different values for the same key is a fatal condition with no recovery.\n\u003e     \n\u003e     If you have a non-deterministic scheme, put a differentiator in the key.\n\u003e     \n\u003e     Or all the data, for that matter.\n\u003e     \n\u003e     (Incidentally, this puts key-aware and keyless Combiners on the same\n\u003e     \n\u003e     footing. As long as all participants uphold the protocol, different\n\u003e     \n\u003e     value = different key = different full record.)\n\u003e     \n\u003e     Given that, it's nice to have the Combiner perform the task of detecting\n\u003e     \n\u003e     this and failing. But not at all necessary. As the quoted paragraph\n\u003e     \n\u003e     correctly notes, consumers still need to handle PSBTs with duplicate keys.\n\u003e     \n\u003e     (In this context, your implied permissive/restrictive Combiner is\n\u003e     \n\u003e     optimized for dealing with invalid data. That seems like a wrong\n\u003e     \n\u003e     optimization.)\n\u003e     \n\u003e     A reasonable point to decide is whether the handling at the consumer\n\u003e     \n\u003e     should be permissive or restrictive. Personally I'm OK with either. I'd\n\u003e     \n\u003e     go with the following change:\n\u003e     \n\u003e     \"\"\"\n\u003e     \n\u003e     In this event, the software MAY reject the transaction as invalid. If it\n\u003e     \n\u003e     decides to accept it, it MUST choose the last value encountered.\n\u003e     \n\u003e     \"\"\"\n\u003e     \n\u003e     (deterministic way of choosing, instead of \"whichever you like\")\n\u003e     \n\u003e     We could also drop the first part, explicitly allowing consumers to\n\u003e     \n\u003e     pick, and simplifying the Combiner algorithm to `sort -u`.\n\u003e     \n\u003e     Note that this sort of \"picking\" will probably be implicit. I'd expect\n\u003e     \n\u003e     the consumer to look like this:\n\u003e     \n\u003e \n\u003e     for key, value in parse(nextRecord()):\n\u003e       data[key] = value\n\u003e     \n\u003e \n\u003e Or we could drop the second part and switch MAY to MUST, for a fully\n\u003e \n\u003e restrictive mode - which, funnily enough, still lets the Combiner work\n\u003e \n\u003e as `sort -u`.\n\u003e \n\u003e To see why, remember that distinct values for the same key are not\n\u003e \n\u003e allowed in fully restrictive mode. If a Combiner encounters two\n\u003e \n\u003e conflicting values F(1) and F(2), it should fail -- but if it doesn't,\n\u003e \n\u003e it includes both and the same failure WILL happen on the fully\n\u003e \n\u003e restrictive consumer.\n\u003e \n\u003e This was (or is) my point of confusion re Combiners: the permissive key\n\u003e \n\u003e -   restrictive value mode of operation doesn't seem to help subsequent\n\u003e     \n\u003e     consumers in any way.\n\u003e     \n\u003e     Now, for the fully restrictive consumer, the key-value model is indeed\n\u003e     \n\u003e     advantageous (and this is the only scenario that I can imagine in which\n\u003e     \n\u003e     it is advantageous), because you can catch key duplication on the parser\n\u003e     \n\u003e     level.\n\u003e     \n\u003e     But as it turns out, it's not enough. Consider the following records:\n\u003e     \n\u003e     key(\u003cPSBT_IN_REDEEM_SCRIPT\u003e + abcde), value(\u003csome redeem script\u003e)\n\u003e     \n\u003e \n\u003e and:\n\u003e \n\u003e key(\u003cPSBT_IN_REDEEM_SCRIPT\u003e + fghij), value(\u003csome other redeem script\u003e)\n\u003e \n\u003e A purely syntactic Combiner simply can't handle this case. The\n\u003e \n\u003e restrictive consumer needs to know whether the key is supposed to be\n\u003e \n\u003e repeating or not.\n\u003e \n\u003e We could fix this, e.g., by saying that repeating types must have high\n\u003e \n\u003e bit set and non-repeating must not. We also don't have to, because the\n\u003e \n\u003e worst failure here is that a consumer passes an invalid record to a\n\u003e \n\u003e subsequent one and the failure happens one step later.\n\u003e \n\u003e At this point it seems weird to be concerned about the \"unique key\"\n\u003e \n\u003e correctness, which is a very small subset of possibly invalid inputs. As\n\u003e \n\u003e a strict safety measure, I'd instead propose that a consumer MUST NOT\n\u003e \n\u003e operate on inputs or outputs, unless it understand ALL included fields -\n\u003e \n\u003e IOW, if you're signing a particular input, all fields in said input are\n\u003e \n\u003e mandatory. This prevents a situation where a simple Signer processes an\n\u003e \n\u003e input incorrectly based on incomplete set of fields, while still\n\u003e \n\u003e allowing Signers with different capabilities within the same PSBT.\n\u003e \n\u003e (The question here is whether to have either a flag or a reserved range\n\u003e \n\u003e for \"optional fields\" that can be safely ignored by consumers that don't\n\u003e \n\u003e understand them, but provide data for consumers who do.)\n\u003e \n\u003e \u003e \u003e To repeat and restate my central question: Why is it important,\n\u003e \u003e \u003e \n\u003e \u003e \u003e that an agent which doesn't understand a particular field\n\u003e \u003e \u003e \n\u003e \u003e \u003e structure, can nevertheless make decisions about its inclusion or\n\u003e \u003e \u003e \n\u003e \u003e \u003e omission from the result (based on a repeated prefix)?\n\u003e \u003e \n\u003e \u003e Again, because otherwise you may need a separate Combiner for each\n\u003e \u003e \n\u003e \u003e type of script involved. That would be unfortunate, and is very\n\u003e \u003e \n\u003e \u003e easily avoided.\n\u003e \n\u003e This is still confusing to me, and I would really like to get to the\n\u003e \n\u003e same page on this particular thing, because a lot of the debate hinges\n\u003e \n\u003e on it. I think I covered most of it above, but there are still pieces to\n\u003e \n\u003e clarify.\n\u003e \n\u003e As I understand it, the Combiner role (actually all the roles) is mostly\n\u003e \n\u003e an algorithm, with the implication that it can be performed\n\u003e \n\u003e independently by a separate agent, say a network node.\n\u003e \n\u003e So there's two types of Combiners:\n\u003e \n\u003e a) Combiner as a part of an intelligent consumer -- the usual scenario\n\u003e \n\u003e is a Creator/Combiner/Finalizer/Extractor being one participant, and\n\u003e \n\u003e Updater/Signers as other participants.\n\u003e \n\u003e In this case, the discussion of \"simple Combiners\" is actually talking\n\u003e \n\u003e about intelligent Combiners which don't understand new fields and must\n\u003e \n\u003e correctly pass them on. I argue that this can safely be done without\n\u003e \n\u003e loss of any important properties.\n\u003e \n\u003e b) Combiner as a separate service, with no understanding of semantics.\n\u003e \n\u003e Although parts of the debate seem to assume this scenario, I don't think\n\u003e \n\u003e it's worth considering. Again, do you have an usecase in mind for it?\n\u003e \n\u003e You also insist on enforcing a limited form of correctness on the\n\u003e \n\u003e Combiner level, but that is not worth it IMHO, as discussed above.\n\u003e \n\u003e Or am I missing something else?\n\u003e \n\u003e \u003e Perhaps you want to avoid signing with keys that are already signed\n\u003e \u003e \n\u003e \u003e with? If you need to derive all the keys before even knowing what\n\u003e \u003e \n\u003e \u003e was already signed with, you've already performed 80% of the work.\n\u003e \n\u003e This wouldn't concern me at all, honestly. If the user sends an already\n\u003e \n\u003e signed PSBT to the same signer, IMHO it is OK to sign again; the\n\u003e \n\u003e slowdown is a fault of the user/workflow. You could argue that signing\n\u003e \n\u003e again is the valid response. Perhaps the Signer should even \"consume\"\n\u003e \n\u003e its keys and not pass them on after producing a signature? That seems\n\u003e \n\u003e like a sensible rule.\n\u003e \n\u003e \u003e To your point: proto v2 afaik has no way to declare \"whole record\n\u003e \u003e \n\u003e \u003e uniqueness\", so either you drop that (which I think is unacceptable\n\u003e \u003e \n\u003e \u003e -   see the copy/sign/combine argument above), or you deal with it in\n\u003e \u003e     \n\u003e \u003e     your application code.\n\u003e \u003e     \n\u003e \n\u003e Yes. My argument is that \"whole record uniqueness\" isn't in fact an\n\u003e \n\u003e important property, because you need application-level checks anyway.\n\u003e \n\u003e Additionally, protobuf provides awareness of which fields are repeated\n\u003e \n\u003e and which aren't, and implicitly implements the \"pick last\" resolution\n\u003e \n\u003e strategy for duplicates.\n\u003e \n\u003e The simplest possible protobuf-based Combiner will:\n\u003e \n\u003e -   assume all fields are repeating\n\u003e -   concatenate and parse\n\u003e -   deduplicate and reserialize.\n\u003e     \n\u003e     More knowledgeable Combiner will intelligently handle non-repeating\n\u003e     \n\u003e     fields, but still has to assume that unknown fields are repeating and\n\u003e     \n\u003e     use the above algorithm.\n\u003e     \n\u003e     For \"pick last\" strategy, a consumer can simply parse the message and\n\u003e     \n\u003e     perform appropriate application-level checks.\n\u003e     \n\u003e     For \"hard-fail\" strategy, it must parse all fields as repeating and\n\u003e     \n\u003e     check that there's only one of those that are supposed to be unique.\n\u003e     \n\u003e     This is admittedly more work, and yes, protobuf is not perfectly suited\n\u003e     \n\u003e     for this task.\n\u003e     \n\u003e     But:\n\u003e     \n\u003e     One, this work must be done by hand anyway, if we go with a custom\n\u003e     \n\u003e     hand-parsed format. There is a protobuf implementation for every\n\u003e     \n\u003e     conceivable platform, we'll never have the same amount of BIP174 parsing\n\u003e     \n\u003e     code.\n\u003e     \n\u003e     (And if you're hand-writing a parser in order to avoid the dependency,\n\u003e     \n\u003e     you can modify it to do the checks at parser level. Note that this is\n\u003e     \n\u003e     not breaking the format! The modifed parser will consume well-formed\n\u003e     \n\u003e     protobuf and reject that which is valid protobuf but invalid bip174 - a\n\u003e     \n\u003e     correct behavior for a bip174 parser.)\n\u003e     \n\u003e     Two, it is my opinion that this is worth it in order to have a standard,\n\u003e     \n\u003e     well described, well studied and widely implemented format.\n\u003e     \n\u003e     Aside: I ha that there is no advantage to a record-set based\n\u003e     \n\u003e     custom format by itself, so IMHO the choice is between protobuf vs\n\u003e     \n\u003e     a custom key-value format. Additionally, it's even possible to implement\n\u003e     \n\u003e     a hand-parsable key-value format in terms of protobuf -- again, arguing\n\u003e     \n\u003e     that \"standardness\" of protobuf is valuable in itself.\n\u003e     \n\u003e     regards\n\u003e     \n\u003e     m.\n\u003e     \n\u003e \n\u003e bitcoin-dev mailing list\n\u003e \n\u003e bitcoin-dev at lists.linuxfoundation.org\n\u003e \n\u003e https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev"}