<oembed><type>rich</type><version>1.0</version><title>Achow101 [ARCHIVE] wrote</title><author_name>Achow101 [ARCHIVE] (npub1wh…d26cj)</author_name><author_url>https://yabu.me/npub1wh7lmdsh2r0ygnp39pk7k5a7mll5x5w44pwn6ekvdvmwjhazr5rqxd26cj</author_url><provider_name>njump</provider_name><provider_url>https://yabu.me</provider_url><html>📅 Original date posted:2018-06-29&#xA;📝 Original message:Hi,&#xA;&#xA;I do not think that protobuf is the way to go for this. Not only is it another dependency&#xA;which many wallets do not want to add (e.g. Armory has not added BIP 70 support because&#xA;of its dependency on protobuf), but it is a more drastic change than the currently proposed&#xA;changes. The point of this email thread isn&#39;t to rewrite and design a new BIP (which is effectively&#xA;what is currently going on). The point is to modify and improve the current one. In particular,&#xA;we do not want such drastic changes that people who have already implemented the current&#xA;BIP would have to effectively rewrite everything from scratch again.&#xA;&#xA;I believe that this discussion has become bikeshedding and is really no longer constructive. Neither&#xA;of us is going to convince the other to use or not use protobuf. Seeing how no one else&#xA;has really participated in this discussion about protobuf and key uniqueness, I do not think&#xA;that these suggested changes are really necessary or useful to others. It boils down to personal preference&#xA;rather than technical merit. 
As such, I have opened a PR to the BIPs repo (https://github.com/bitcoin/bips/pull/694)&#xA;which contains the changes that I proposed in an earlier email.&#xA;&#xA;Additionally, because there have been no objections to the currently proposed changes, I propose&#xA;to move the BIP from Draft to Proposed status.&#xA;&#xA;Andrew&#xA;&#xA;&#xA;​​&#xA;&#xA;‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐&#xA;&#xA;On June 29, 2018 2:53 AM, matejcik via bitcoin-dev &lt;bitcoin-dev at lists.linuxfoundation.org&gt; wrote:&#xA;&#xA;&gt; ​​&#xA;&gt; &#xA;&gt; Short version:&#xA;&gt; &#xA;&gt; -   I propose that conflicting &#34;values&#34; for the same &#34;key&#34; are considered&#xA;&gt;     &#xA;&gt;     invalid.&#xA;&gt;     &#xA;&gt; -   Let&#39;s not optimize for invalid data.&#xA;&gt; -   Given that, there&#39;s an open question on how to handle invalid data&#xA;&gt;     &#xA;&gt;     when encountered&#xA;&gt;     &#xA;&gt;     In general, I don&#39;t think it&#39;s possible to enforce correctness at the&#xA;&gt;     &#xA;&gt;     format level. You still need application level checks - and that calls&#xA;&gt;     &#xA;&gt;     into question what we gain by trying to do this on the format level.&#xA;&gt;     &#xA;&gt;     Long version:&#xA;&gt;     &#xA;&gt;     Let&#39;s look at this from a different angle.&#xA;&gt;     &#xA;&gt;     There are roughly two possible &#34;modes&#34; for the format with regard to&#xA;&gt;     &#xA;&gt;     possibly-conflicting data. Call them &#34;permissive&#34; and &#34;restrictive&#34;.&#xA;&gt;     &#xA;&gt;     The spec says:&#xA;&gt;     &#xA;&gt;     &#34;&#34;&#34;&#xA;&gt;     &#xA;&gt;     Keys within each scope should never be duplicated; all keys in the&#xA;&gt;     &#xA;&gt;     format are unique. PSBTs containing duplicate keys are invalid. However&#xA;&gt;     &#xA;&gt;     implementors will still need to handle events where keys are duplicated&#xA;&gt;     &#xA;&gt;     when combining transactions with duplicated fields. 
In this event, the&#xA;&gt;     &#xA;&gt;     software may choose whichever value it wishes.&#xA;&gt;     &#xA;&gt;     &#34;&#34;&#34;&#xA;&gt;     &#xA;&gt;     The last sentence of this paragraph sets the mode to permissive:&#xA;&gt;     &#xA;&gt;     duplicate values are pretty much OK. If you see them, just pick one.&#xA;&gt;     &#xA;&gt;     You seem to argue that Combiners, in particular simple ones that don&#39;t&#xA;&gt;     &#xA;&gt;     understand field semantics, should merge keys permissively, but&#xA;&gt;     &#xA;&gt;     deduplicate values restrictively.&#xA;&gt;     &#xA;&gt;     IOW: if you receive two different values for the same key, just pick&#xA;&gt;     &#xA;&gt;     whichever, but $deity forbid you include both!&#xA;&gt;     &#xA;&gt;     This choice doesn&#39;t make sense to me.&#xA;&gt;     &#xA;&gt;     What would make sense is fully restrictive mode: receiving two&#xA;&gt;     &#xA;&gt;     different values for the same key is a fatal condition with no recovery.&#xA;&gt;     &#xA;&gt;     If you have a non-deterministic scheme, put a differentiator in the key.&#xA;&gt;     &#xA;&gt;     Or all the data, for that matter.&#xA;&gt;     &#xA;&gt;     (Incidentally, this puts key-aware and keyless Combiners on the same&#xA;&gt;     &#xA;&gt;     footing. As long as all participants uphold the protocol, different&#xA;&gt;     &#xA;&gt;     value = different key = different full record.)&#xA;&gt;     &#xA;&gt;     Given that, it&#39;s nice to have the Combiner perform the task of detecting&#xA;&gt;     &#xA;&gt;     this and failing. But not at all necessary. As the quoted paragraph&#xA;&gt;     &#xA;&gt;     correctly notes, consumers still need to handle PSBTs with duplicate keys.&#xA;&gt;     &#xA;&gt;     (In this context, your implied permissive/restrictive Combiner is&#xA;&gt;     &#xA;&gt;     optimized for dealing with invalid data. 
That seems like a wrong&#xA;&gt;     &#xA;&gt;     optimization.)&#xA;&gt;     &#xA;&gt;     A reasonable point to decide is whether the handling at the consumer&#xA;&gt;     &#xA;&gt;     should be permissive or restrictive. Personally I&#39;m OK with either. I&#39;d&#xA;&gt;     &#xA;&gt;     go with the following change:&#xA;&gt;     &#xA;&gt;     &#34;&#34;&#34;&#xA;&gt;     &#xA;&gt;     In this event, the software MAY reject the transaction as invalid. If it&#xA;&gt;     &#xA;&gt;     decides to accept it, it MUST choose the last value encountered.&#xA;&gt;     &#xA;&gt;     &#34;&#34;&#34;&#xA;&gt;     &#xA;&gt;     (deterministic way of choosing, instead of &#34;whichever you like&#34;)&#xA;&gt;     &#xA;&gt;     We could also drop the first part, explicitly allowing consumers to&#xA;&gt;     &#xA;&gt;     pick, and simplifying the Combiner algorithm to `sort -u`.&#xA;&gt;     &#xA;&gt;     Note that this sort of &#34;picking&#34; will probably be implicit. I&#39;d expect&#xA;&gt;     &#xA;&gt;     the consumer to look like this:&#xA;&gt;     &#xA;&gt; &#xA;&gt;     for key, value in parse(nextRecord()):&#xA;&gt;       data[key] = value&#xA;&gt;     &#xA;&gt; &#xA;&gt; Or we could drop the second part and switch MAY to MUST, for a fully&#xA;&gt; &#xA;&gt; restrictive mode - which, funnily enough, still lets the Combiner work&#xA;&gt; &#xA;&gt; as `sort -u`.&#xA;&gt; &#xA;&gt; To see why, remember that distinct values for the same key are not&#xA;&gt; &#xA;&gt; allowed in fully restrictive mode. 
If a Combiner encounters two&#xA;&gt; &#xA;&gt; conflicting values F(1) and F(2), it should fail -- but if it doesn&#39;t,&#xA;&gt; &#xA;&gt; it includes both and the same failure WILL happen on the fully&#xA;&gt; &#xA;&gt; restrictive consumer.&#xA;&gt; &#xA;&gt; This was (or is) my point of confusion re Combiners: the permissive key&#xA;&gt; &#xA;&gt; -   restrictive value mode of operation doesn&#39;t seem to help subsequent&#xA;&gt;     &#xA;&gt;     consumers in any way.&#xA;&gt;     &#xA;&gt;     Now, for the fully restrictive consumer, the key-value model is indeed&#xA;&gt;     &#xA;&gt;     advantageous (and this is the only scenario that I can imagine in which&#xA;&gt;     &#xA;&gt;     it is advantageous), because you can catch key duplication on the parser&#xA;&gt;     &#xA;&gt;     level.&#xA;&gt;     &#xA;&gt;     But as it turns out, it&#39;s not enough. Consider the following records:&#xA;&gt;     &#xA;&gt;     key(&lt;PSBT_IN_REDEEM_SCRIPT&gt; + abcde), value(&lt;some redeem script&gt;)&#xA;&gt;     &#xA;&gt; &#xA;&gt; and:&#xA;&gt; &#xA;&gt; key(&lt;PSBT_IN_REDEEM_SCRIPT&gt; + fghij), value(&lt;some other redeem script&gt;)&#xA;&gt; &#xA;&gt; A purely syntactic Combiner simply can&#39;t handle this case. The&#xA;&gt; &#xA;&gt; restrictive consumer needs to know whether the key is supposed to be&#xA;&gt; &#xA;&gt; repeating or not.&#xA;&gt; &#xA;&gt; We could fix this, e.g., by saying that repeating types must have high&#xA;&gt; &#xA;&gt; bit set and non-repeating must not. We also don&#39;t have to, because the&#xA;&gt; &#xA;&gt; worst failure here is that a consumer passes an invalid record to a&#xA;&gt; &#xA;&gt; subsequent one and the failure happens one step later.&#xA;&gt; &#xA;&gt; At this point it seems weird to be concerned about the &#34;unique key&#34;&#xA;&gt; &#xA;&gt; correctness, which is a very small subset of possibly invalid inputs. 
As&#xA;&gt; &#xA;&gt; a strict safety measure, I&#39;d instead propose that a consumer MUST NOT&#xA;&gt; &#xA;&gt; operate on inputs or outputs, unless it understands ALL included fields -&#xA;&gt; &#xA;&gt; IOW, if you&#39;re signing a particular input, all fields in said input are&#xA;&gt; &#xA;&gt; mandatory. This prevents a situation where a simple Signer processes an&#xA;&gt; &#xA;&gt; input incorrectly based on an incomplete set of fields, while still&#xA;&gt; &#xA;&gt; allowing Signers with different capabilities within the same PSBT.&#xA;&gt; &#xA;&gt; (The question here is whether to have either a flag or a reserved range&#xA;&gt; &#xA;&gt; for &#34;optional fields&#34; that can be safely ignored by consumers that don&#39;t&#xA;&gt; &#xA;&gt; understand them, but provide data for consumers who do.)&#xA;&gt; &#xA;&gt; &gt; &gt; To repeat and restate my central question: Why is it important,&#xA;&gt; &gt; &gt; &#xA;&gt; &gt; &gt; that an agent which doesn&#39;t understand a particular field&#xA;&gt; &gt; &gt; &#xA;&gt; &gt; &gt; structure, can nevertheless make decisions about its inclusion or&#xA;&gt; &gt; &gt; &#xA;&gt; &gt; &gt; omission from the result (based on a repeated prefix)?&#xA;&gt; &gt; &#xA;&gt; &gt; Again, because otherwise you may need a separate Combiner for each&#xA;&gt; &gt; &#xA;&gt; &gt; type of script involved. That would be unfortunate, and is very&#xA;&gt; &gt; &#xA;&gt; &gt; easily avoided.&#xA;&gt; &#xA;&gt; This is still confusing to me, and I would really like to get to the&#xA;&gt; &#xA;&gt; same page on this particular thing, because a lot of the debate hinges&#xA;&gt; &#xA;&gt; on it. 
I think I covered most of it above, but there are still pieces to&#xA;&gt; &#xA;&gt; clarify.&#xA;&gt; &#xA;&gt; As I understand it, the Combiner role (actually all the roles) is mostly&#xA;&gt; &#xA;&gt; an algorithm, with the implication that it can be performed&#xA;&gt; &#xA;&gt; independently by a separate agent, say a network node.&#xA;&gt; &#xA;&gt; So there are two types of Combiners:&#xA;&gt; &#xA;&gt; a) Combiner as a part of an intelligent consumer -- the usual scenario&#xA;&gt; &#xA;&gt; is a Creator/Combiner/Finalizer/Extractor being one participant, and&#xA;&gt; &#xA;&gt; Updater/Signers as other participants.&#xA;&gt; &#xA;&gt; In this case, the discussion of &#34;simple Combiners&#34; is actually talking&#xA;&gt; &#xA;&gt; about intelligent Combiners which don&#39;t understand new fields and must&#xA;&gt; &#xA;&gt; correctly pass them on. I argue that this can safely be done without&#xA;&gt; &#xA;&gt; loss of any important properties.&#xA;&gt; &#xA;&gt; b) Combiner as a separate service, with no understanding of semantics.&#xA;&gt; &#xA;&gt; Although parts of the debate seem to assume this scenario, I don&#39;t think&#xA;&gt; &#xA;&gt; it&#39;s worth considering. Again, do you have a use case in mind for it?&#xA;&gt; &#xA;&gt; You also insist on enforcing a limited form of correctness on the&#xA;&gt; &#xA;&gt; Combiner level, but that is not worth it IMHO, as discussed above.&#xA;&gt; &#xA;&gt; Or am I missing something else?&#xA;&gt; &#xA;&gt; &gt; Perhaps you want to avoid signing with keys that are already signed&#xA;&gt; &gt; &#xA;&gt; &gt; with? If you need to derive all the keys before even knowing what&#xA;&gt; &gt; &#xA;&gt; &gt; was already signed with, you&#39;ve already performed 80% of the work.&#xA;&gt; &#xA;&gt; This wouldn&#39;t concern me at all, honestly. If the user sends an already&#xA;&gt; &#xA;&gt; signed PSBT to the same signer, IMHO it is OK to sign again; the&#xA;&gt; &#xA;&gt; slowdown is a fault of the user/workflow. 
You could argue that signing&#xA;&gt; &#xA;&gt; again is the valid response. Perhaps the Signer should even &#34;consume&#34;&#xA;&gt; &#xA;&gt; its keys and not pass them on after producing a signature? That seems&#xA;&gt; &#xA;&gt; like a sensible rule.&#xA;&gt; &#xA;&gt; &gt; To your point: proto v2 afaik has no way to declare &#34;whole record&#xA;&gt; &gt; &#xA;&gt; &gt; uniqueness&#34;, so either you drop that (which I think is unacceptable&#xA;&gt; &gt; &#xA;&gt; &gt; -   see the copy/sign/combine argument above), or you deal with it in&#xA;&gt; &gt;     &#xA;&gt; &gt;     your application code.&#xA;&gt; &gt;     &#xA;&gt; &#xA;&gt; Yes. My argument is that &#34;whole record uniqueness&#34; isn&#39;t in fact an&#xA;&gt; &#xA;&gt; important property, because you need application-level checks anyway.&#xA;&gt; &#xA;&gt; Additionally, protobuf provides awareness of which fields are repeated&#xA;&gt; &#xA;&gt; and which aren&#39;t, and implicitly implements the &#34;pick last&#34; resolution&#xA;&gt; &#xA;&gt; strategy for duplicates.&#xA;&gt; &#xA;&gt; The simplest possible protobuf-based Combiner will:&#xA;&gt; &#xA;&gt; -   assume all fields are repeating&#xA;&gt; -   concatenate and parse&#xA;&gt; -   deduplicate and reserialize.&#xA;&gt;     &#xA;&gt;     More knowledgeable Combiner will intelligently handle non-repeating&#xA;&gt;     &#xA;&gt;     fields, but still has to assume that unknown fields are repeating and&#xA;&gt;     &#xA;&gt;     use the above algorithm.&#xA;&gt;     &#xA;&gt;     For &#34;pick last&#34; strategy, a consumer can simply parse the message and&#xA;&gt;     &#xA;&gt;     perform appropriate application-level checks.&#xA;&gt;     &#xA;&gt;     For &#34;hard-fail&#34; strategy, it must parse all fields as repeating and&#xA;&gt;     &#xA;&gt;     check that there&#39;s only one of those that are supposed to be unique.&#xA;&gt;     &#xA;&gt;     This is admittedly more work, and yes, protobuf is not perfectly suited&#xA;&gt;     
&#xA;&gt;     for this task.&#xA;&gt;     &#xA;&gt;     But:&#xA;&gt;     &#xA;&gt;     One, this work must be done by hand anyway, if we go with a custom&#xA;&gt;     &#xA;&gt;     hand-parsed format. There is a protobuf implementation for every&#xA;&gt;     &#xA;&gt;     conceivable platform; we&#39;ll never have the same amount of BIP174 parsing&#xA;&gt;     &#xA;&gt;     code.&#xA;&gt;     &#xA;&gt;     (And if you&#39;re hand-writing a parser in order to avoid the dependency,&#xA;&gt;     &#xA;&gt;     you can modify it to do the checks at parser level. Note that this is&#xA;&gt;     &#xA;&gt;     not breaking the format! The modified parser will consume well-formed&#xA;&gt;     &#xA;&gt;     protobuf and reject that which is valid protobuf but invalid bip174 - a&#xA;&gt;     &#xA;&gt;     correct behavior for a bip174 parser.)&#xA;&gt;     &#xA;&gt;     Two, it is my opinion that this is worth it in order to have a standard,&#xA;&gt;     &#xA;&gt;     well described, well studied and widely implemented format.&#xA;&gt;     &#xA;&gt;     Aside: I ha that there is no advantage to a record-set based&#xA;&gt;     &#xA;&gt;     custom format by itself, so IMHO the choice is between protobuf vs&#xA;&gt;     &#xA;&gt;     a custom key-value format. Additionally, it&#39;s even possible to implement&#xA;&gt;     &#xA;&gt;     a hand-parsable key-value format in terms of protobuf -- again, arguing&#xA;&gt;     &#xA;&gt;     that &#34;standardness&#34; of protobuf is valuable in itself.&#xA;&gt;     &#xA;&gt;     regards&#xA;&gt;     &#xA;&gt;     m.&#xA;&gt;     &#xA;&gt; &#xA;&gt; bitcoin-dev mailing list&#xA;&gt; &#xA;&gt; bitcoin-dev at lists.linuxfoundation.org&#xA;&gt; &#xA;&gt; https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev</html></oembed>