19:10:38 <sipa> #startmeeting
19:10:38 <lightningbot> Meeting started Fri Mar 29 19:10:38 2019 UTC. The chair is sipa. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:10:38 <lightningbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:10:42 <sipa> yow, topics?
19:10:52 <jnewbery> suggested topic: rebroadcast behaviour
19:10:55 <instagibbs> hmm calendar is off by a week, here
19:11:29 <jnewbery> instagibbs: it was delayed by a week during FC and didn't revert
19:12:05 <achow101> suggested topic: non-hardened derivation paths
19:12:12 <sipa> #topic rebroadcast behaviour
19:12:51 <instagibbs> I see we finally have rebroadcast tests :)
19:13:01 <jnewbery> I've been looking at wallet rebroadcasts. The current behaviour is: set a timer for random (0, 30). When it pops, rebroadcast all unconfirmed wallet txs. Set timer again.
19:13:25 <sipa> but only if there has been a new block since the previous rebroadcast?
19:13:29 <jnewbery> (it's not quite as bad as that because each peer has a bloom filter so we won't actually rebroadcast until it's been purged from there)
19:13:35 <jnewbery> ah yes, as long as there's been a block
19:13:44 <jnewbery> I have a couple of suggested changes
19:14:03 <jnewbery> 1. separate the scheduling, so each tx is on its own timer, instead of sending them all at once
19:14:28 <jnewbery> 2. have the wallet remember some number of random txs it sees from the mempool, and add those to its rebroadcast list as decoys
19:14:43 <jnewbery> wanted to get some concept feedback on those before I started coding up
19:15:16 <sipa> so (1) is problematic if there are interdependent unconfirmed transactions in your wallet
19:15:31 <sipa> as you may end up sending a child to a different peer than the parent
19:15:35 <sipa> gmaxwell: opinions? ^
19:16:01 <jnewbery> in (1), I'd think we'd still send each tx to all peers
19:16:08 <sipa> oh!
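Editor's sketch of the ordering concern sipa raises about interdependent unconfirmed transactions: if a batch of wallet txs is announced together, parents can be placed ahead of their children, with higher-feerate txs preferred among those that are ready. This is an illustration only. The function name `broadcast_order` and the `txid -> (feerate, parents)` layout are hypothetical and are not Bitcoin Core's actual data structures or comparator.

```python
import heapq

def broadcast_order(txs):
    """Order txids so every parent precedes its children.

    `txs` maps txid -> (feerate, set of parent txids). Among txs whose
    in-batch parents have all been emitted, the highest feerate goes first.
    (Hypothetical layout, for illustration only.)
    """
    # Only parents that are themselves in the batch constrain the order.
    pending = {t: {p for p in parents if p in txs}
               for t, (_, parents) in txs.items()}
    children = {t: [] for t in txs}
    for t, parents in pending.items():
        for p in parents:
            children[p].append(t)

    # Min-heap on negated feerate, so the highest feerate pops first.
    ready = [(-txs[t][0], t) for t, ps in pending.items() if not ps]
    heapq.heapify(ready)

    order = []
    while ready:
        _, txid = heapq.heappop(ready)
        order.append(txid)
        for child in children[txid]:
            pending[child].discard(txid)
            if not pending[child]:
                heapq.heappush(ready, (-txs[child][0], child))
    return order

batch = {
    "child": (50.0, {"parent"}),
    "parent": (1.0, set()),
    "other": (10.0, set()),
}
print(broadcast_order(batch))
```

With one timer per transaction, as in suggestion (1), this ordering only helps for txs that happen to be queued in the same batch, which is the gap sipa points at.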
19:16:18 <jnewbery> just not all at the same time
19:16:28 <instagibbs> trickle rather than flood
19:16:38 <sipa> right, but you may send the child before the parent
19:16:58 <sipa> (they get sorted in the broadcast buffer before actual sending, but that's just a few-seconds timer)
19:17:29 <sipa> i don't know whether the rebroadcast mechanism in the wallet currently actually cares about that
19:17:39 <jnewbery> I think it doesn't, but I'd have to recheck
19:17:55 <sipa> yeah, but if it sends all unconfirmed txn, they'll get sorted before broadcast
19:18:05 <jnewbery> oh, where does that happen?
19:18:30 <sipa> sendmessages
19:19:29 <sipa> there's a poisson distributed per-peer timer (but simultaneous for all inbound connections) that flushes a buffer of to-be-broadcast txn
19:20:00 <jnewbery> yeah, I see it now
19:20:05 <sipa> and it sorted first by dependency order and then lexicographically
19:20:09 <sipa> *sorts
19:20:37 <sipa> so if you announce txids within a few seconds of each other, there's a high chance they'll end up being sorted correctly before actual broadcast
19:20:44 <jnewbery> // Topologically and fee-rate sort the inventory we send for privacy and priority reasons.
19:20:51 <sipa> yeah, that
19:21:00 <sipa> oh yes, feerate; not lexicographically
19:21:10 <gmaxwell> (1) could consider only the deepest ancestors and announce all the parents at once.
19:21:19 <gmaxwell> ignoring the details (1) sounds like a very important improvement.
19:22:29 <jnewbery> is the wallet aware of dependencies in its transactions?
19:22:33 <sipa> yes
19:22:49 <gmaxwell> er only the deepest children.
19:24:08 <gmaxwell> (2) I don't have a problem with but I think the benefit is a bit dubious, like random txn by itself very likely provides no privacy improvement, because only the real sender or recipient will send txn for all of a single address/linked address consistently.
It's sometimes hard to tell where the line is between snake oil that just adds complexity to the codebase for no protection against an
19:24:09 <gmaxwell> even slightly incompetent attacker, vs fuzzing things up in a way that provides incremental privacy even though far from perfect
19:24:18 <sipa> jnewbery: mapTxSpends
19:24:27 <gmaxwell> I would think though that matching random addresses would be a lot better than random txn.
19:25:16 <gmaxwell> I think a (3) avoiding pointless retransmissions would be more helpful. But I would happily review an implementation of (2), reservations aside.
19:25:31 <jnewbery> Do we know what percentage of txs are re-used addresses? And what number of those re-used addresses have multiple unconfirmed txs at the same time?
19:25:59 <gmaxwell> jnewbery: 'most' (I expect sipa will provide some numbers) But it's not just a same time question:
19:26:04 <sipa> jnewbery: also, poisson distributed rebroadcast events are probably more private than uniform distributed ones
19:26:24 <gmaxwell> what I think it should do is per node pick a random value (e.g. the addr man randomizer) and use that to consistently select some small portion of addresses to rebroadcast.
19:26:42 <gmaxwell> so that over months of time we're consistently rebroadcasting the same other addresses.
19:27:12 <gmaxwell> Yeah uniform leaks information that poisson doesn't, poisson is the distribution you get from memoryless processes.
19:27:53 <jnewbery> if you're talking about months of time, then presumably you'd need to save this to disk
19:28:08 <sipa> the addrman randomizer is stored on disk
19:28:20 <gmaxwell> Thats why I specified that one.
19:28:37 <jnewbery> So you're saying this behaviour should live in the node rather than the wallet?
19:28:47 <sipa> wallet is easier i think
19:29:01 <gmaxwell> also if you wipe out your peers.dat because you're trying to avoid addrman based fingerprinting then you'll also wipe rebroadcast fingerprinting.
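Editor's sketch of the uniform versus poisson point made by sipa and gmaxwell: with a uniform timer, the longer an observer has waited, the sooner the next event must come, whereas an exponentially distributed wait (the gap between events of a Poisson process) is memoryless. The helper names, the 900 and 1800 second figures, and the choice of seconds as the unit are all assumptions for illustration; the log does not give the timer's units.

```python
import random

def uniform_delay(max_seconds, rng):
    """Wait drawn uniformly between 0 and max_seconds."""
    return rng.uniform(0.0, max_seconds)

def poisson_delay(mean_seconds, rng):
    """Exponentially distributed wait: the gap between Poisson-process events."""
    return rng.expovariate(1.0 / mean_seconds)

def mean_remaining_wait(samples, already_waited):
    """Average extra wait, given nothing has happened for `already_waited` seconds."""
    remaining = [s - already_waited for s in samples if s > already_waited]
    return sum(remaining) / len(remaining)

rng = random.Random(2019)
exp_waits = [poisson_delay(900.0, rng) for _ in range(50000)]
uni_waits = [uniform_delay(1800.0, rng) for _ in range(50000)]

# Exponential: expected remaining wait stays near the mean however long we have waited.
print(mean_remaining_wait(exp_waits, 0.0), mean_remaining_wait(exp_waits, 900.0))
# Uniform: expected remaining wait shrinks as the deadline approaches.
print(mean_remaining_wait(uni_waits, 0.0), mean_remaining_wait(uni_waits, 1500.0))
```

The shrinking remaining wait in the uniform case is the kind of information an observer could use to guess when the last rebroadcast happened.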
19:29:01 <jnewbery> the wallet isn't aware of peers
19:29:06 <gmaxwell> I think nodes without wallets should do this too.
19:29:07 <sipa> but it would possibly result in identifiable behavior that can be detected if a wallet file moves to another node
19:29:23 <gmaxwell> because there are many walletless nodes that would provide cover...
19:29:41 <gmaxwell> maybe thats not realistic for engineering reasons. ::shrugs:: I'm just talking spherical cows here.
19:30:16 <gmaxwell> In any case, if it isn't consistent like this, it's really obvious to me how attackers will filter it out. Monitor, and look for dispositive failures to rebroadcast. The real source won't have them, the fake sources will.
19:30:19 <jnewbery> I think the implementation can be quite minimal if the wallet just keeps its own list of txs and doesn't try to distinguish behaviour between peers
19:30:53 <gmaxwell> and given someone that already has network wide monitoring that filtering is probably only a dozen lines of code/query.
19:31:13 <jnewbery> are you saying genuine wallet rebroadcasts should also be to only one peer?
19:31:40 <jnewbery> because if it's to all peers that's different behaviour than decoy rebroadcasts and would be easy to spot
19:32:07 <gmaxwell> no, both should be to all peers.
19:32:33 <jnewbery> then I've misunderstood your 'per node' comment above
19:33:05 <gmaxwell> jnewbery: my node being different from yours.
19:33:18 <jnewbery> ok
19:33:19 <gmaxwell> I prefer to rebroadcast 1apple you prefer to rebroadcast 1spatula.
19:33:25 <jnewbery> yep
19:33:34 <gmaxwell> and if I restart I still prefer 1apple.
19:33:54 <gmaxwell> so an attacker can't tell if I like 1apple's addresses because of my random number or because I am 1apple.
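Editor's sketch of gmaxwell's "1apple vs 1spatula" idea: each node holds a persistent secret value and uses it to pick, consistently across restarts, a small fixed subset of addresses whose transactions it favours for rebroadcast. The function name `is_decoy_address`, the `node_salt` argument (standing in for something like the on-disk addrman randomizer) and the `one_in` ratio are hypothetical; nothing here is taken from Bitcoin Core.

```python
import hashlib
import hmac

def is_decoy_address(node_salt: bytes, address: str, one_in: int = 1000) -> bool:
    """Deterministically select roughly 1/one_in of all addresses for this node.

    The same (salt, address) pair always gives the same answer, so the choice
    survives restarts as long as the salt is persisted. A different salt
    selects an unrelated subset.
    """
    digest = hmac.new(node_salt, address.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0

my_salt = b"\x01" * 32      # hypothetical persisted per-node secret
your_salt = b"\x02" * 32
addresses = [f"addr{i}" for i in range(4000)]   # placeholder strings, not real addresses

mine = {a for a in addresses if is_decoy_address(my_salt, a, one_in=4)}
yours = {a for a in addresses if is_decoy_address(your_salt, a, one_in=4)}
print(len(mine), len(yours), len(mine & yours))
```

Because the subset depends only on the salt and the address, an observer who sees a node repeatedly rebroadcasting for one address cannot tell from that alone whether the node owns it or merely drew it.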
19:34:16 <bitcoin-git> [bitcoin] promag opened pull request #15700: rfc: Synchronize validation interface registration (master...2019-03-sync-validation-interface-registration) https://github.com/bitcoin/bitcoin/pull/15700
19:34:34 <sipa> so what if the address you pick for rebroadcasts suddenly gets a ton of transactions
19:34:52 <gmaxwell> sipa: well you're going to relay them on the network regardless.
19:35:00 <gmaxwell> So I don't think thats actually a major cost?
19:35:41 <sipa> ah right, if it's per node they'd just be broadcast from the mempool directly; no need to copy/store them elsewhere
19:36:10 <sipa> though you do need to keep track of txn that spend outputs assigned to the addresses you've chosen
19:36:33 <sipa> but that could just be a boolean per-tx i guess
19:37:34 <gmaxwell> right. all these considerations are why I'm a little dubious. I think the simplest version does little to nothing. And I'm not sure of the bound of the complexity on an effort to really be indistinguishable. I mean one way would be to effectively importaddress on random addresses you see with a flag to hide them in the wallet but implemented that way would have too many DOS potentials.
19:38:06 <gmaxwell> it would, however, have a pretty strong guarantee that you'd treat them the same, making them very hard to distinguish.
19:38:18 <gmaxwell> though it would also only work if you have a wallet.
19:38:49 <jnewbery> I think just selecting random txs is surely better than little to nothing
19:39:04 <sipa> another approach is having the rebroadcast mechanism be purely in the node, and have it mark some subset of transactions... but then have the wallet forcibly set that mark on its own transactions too
19:39:20 <sipa> but the wallet doesn't do the rebroadcasting
19:40:22 <gmaxwell> jnewbery: I think any existing deanon attackers can filter out random tx rebroadcasts fairly reliably with a line of code, maybe it's still worth doing.
We have other paper thin privacy mechanisms (e.g. most of the node anti-fingerprinting).
19:41:08 <jnewbery> but if they're random, some of them will be for addresses that aren't re-used. How would the attacker filter those out?
19:42:52 <gmaxwell> jnewbery: A couple ways: for rebroadcast purposes every coin that isn't lost is reused, since senders and recipients both rebroadcast. Also even when addresses aren't reused, they're co-used in inputs with other transactions. So broadcasting of the complete address cluster is also a fairly strong heuristic.
19:43:28 <gmaxwell> I think you've convinced me that in at least some cases it would be indistinguishable.
19:43:46 <jnewbery> Many txs don't get rebroadcast at all and are confirmed in the next block
19:43:53 <gmaxwell> e.g. one side of send/receive doesn't need any rebroadcast, no reuse, no useful information from clustering.
19:44:28 <jnewbery> Right. An attacker can't tell if you would have rebroadcast
19:44:59 <gmaxwell> (thats also, aside, a reason rebroadcasts should be poisson timed)
19:45:03 <sipa> maybe we should move to achow101's topic?
19:45:32 <jnewbery> I don't think I need anything else on this for now. Thanks for the input!
19:45:56 <achow101> I've started working on native descriptor wallets and I just wanted some opinions on some parts of the design.
19:46:05 <sipa> cool
19:46:27 <sipa> #topic non-hardened derivation paths
19:46:51 <achow101> right now I have the wallet make 6 descriptors, in 3 pairs. each pair has an internal and external descriptor, with one pair for each of the address types
19:47:14 <sipa> makes sense
19:47:20 <achow101> it uses derivation paths that we currently use in the wallet, but I think it would be better if we used bip 44/49/84
19:47:51 <achow101> thoughts?
it would mean that we use non-hardened derivation paths
19:47:53 <sipa> i think using non-hardened derivation paths is fine if there is no way to export the individual private keys
19:48:29 <gmaxwell> I think it's necessary to disable key export on anything non-hardened.
19:48:35 <achow101> as it is now, we need to have a new seed for each address type or we'll end up reusing keys accidentally
19:48:56 <achow101> I think that the only export that will be possible is exporting the entire private descriptor itself
19:48:57 <sipa> achow101: you would certainly use distinct paths for each of the address types
19:49:03 <sipa> achow101: agree
19:49:10 <gmaxwell> Also, I don't think we should be defaulting to using non-hardened unless we're actually making real use of its abilities (like to cogenerate addresses with an external signer).
19:49:13 <achow101> i'll be disabling dumpprivkey and all of the imports for descriptor wallets
19:50:08 <sipa> gmaxwell: it also has the advantage of not needing to decrypt the wallet to generate more addresses
19:50:26 <gmaxwell> so generate 100,000 addresses the first go. :)
19:50:44 <gmaxwell> it's a lot of fragility to save a megabyte of file size. :P
19:50:46 <achow101> gmaxwell: it has the advantage of making the wallet file extremely small
19:51:16 <gmaxwell> the wallet file grows from transactions regardless. if you're actually using addresses it'll get big.
19:51:50 <sipa> anyway, i think it certainly makes sense to support nonhardened derivation
19:51:58 <gmaxwell> Sure.
19:52:05 <achow101> right now the thing I don't like is that it needs 3 different seeds, one for each address type
19:52:28 <sipa> achow101: that shouldn't be needed; you can use distinct derivation paths even when using hardened?
19:52:46 <gmaxwell> as far as making wallets small, that I think is mostly interesting for backups, and for that it might be best to have tools/rpcs that strip wallets for backup purposes.
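Editor's sketch of why sipa and gmaxwell tie non-hardened derivation to disabling private key export. Under BIP32 non-hardened derivation, the child private key is the parent private key plus a tweak that anyone holding the parent public key and chain code can compute, so one exported child private key plus the xpub reveals the parent private key. The code below is a toy pure-Python secp256k1 written for illustration (not constant-time, skips BIP32's rare invalid-tweak checks, needs Python 3.8+ for `pow(x, -1, p)`); the key and chain code values are made up.

```python
import hashlib
import hmac

P = 2**256 - 2**32 - 977   # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def point_mul(k, pt=G):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def ser_pub(pt):
    """33-byte compressed public key encoding."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")

def tweak(chain_code, parent_pub, index):
    """Left half of HMAC-SHA512(chain_code, serP(parent_pub) || ser32(index))."""
    data = ser_pub(parent_pub) + index.to_bytes(4, "big")
    return int.from_bytes(hmac.new(chain_code, data, hashlib.sha512).digest()[:32], "big")

def derive_child_priv(parent_priv, chain_code, index):
    """Non-hardened private derivation (index < 2**31)."""
    return (parent_priv + tweak(chain_code, point_mul(parent_priv), index)) % N

def recover_parent_priv(child_priv, parent_pub, chain_code, index):
    """What an attacker with the xpub and one leaked child key can compute."""
    return (child_priv - tweak(chain_code, parent_pub, index)) % N

parent_priv = 0x1234567890ABCDEF1234567890ABCDEF   # made-up secret
chain_code = bytes(range(32))                      # made-up chain code
child_priv = derive_child_priv(parent_priv, chain_code, 5)
leaked = recover_parent_priv(child_priv, point_mul(parent_priv), chain_code, 5)
print(leaked == parent_priv)
```

Hardened derivation feeds the parent private key into the HMAC instead of the public key, so the tweak cannot be computed from public data and this subtraction is unavailable.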
19:52:52 <achow101> sipa: yes, but I don't really like having to introduce yet another set of derivation paths that people have to consider
19:53:00 <achow101> vs. using existing standards
19:53:22 <sipa> achow101: that's why you encode it as a descriptor; it's universal :)
19:53:41 <sipa> but sure- i agree at a high level it's unnecessary to introduce new standards when existing ones exist
19:54:01 <sipa> the question is whether hardened or unhardened is more desirable - i don't know, and it may depend on the situation
19:54:07 <gmaxwell> This is just broken.
19:54:23 <achow101> ?
19:54:47 <gmaxwell> 'existing standards' were made by people considering entirely different use cases and deployment environments (hardware wallets)
19:55:17 <gmaxwell> and with a different focus on security (a lot closer to 'who cares')
19:55:32 <sipa> well when we'd use a nonhardened derivation (for external reasons) it makes sense to follow those standards
19:55:46 <sipa> the question remains whether or not to use hardened or not
19:56:00 <gmaxwell> sipa: it might, if it is otherwise fine and would actually result in some kind of useful compatibility.
19:56:33 <gmaxwell> like we'll never be compatible with electrum (say) by using the same path, simply because we generate parallel native and compatibility addresses.
19:56:54 <achow101> gmaxwell: the plan for descriptor wallets is to not generate them in parallel
19:57:15 <achow101> each address type will have its own derivation path base
19:57:48 <sipa> yeah, so your bitcoin core wallet will correspond to a union of several things that are (possibly) compatible with an electrum wallet individually
19:58:55 <gmaxwell> a wallet where you can recover part of the funds and the rest are just invisibly lost isn't compatible though, if anything it's a liability.
19:59:13 <sipa> right
20:00:42 <achow101> anyways, it seems like the preference is to use hardened
20:01:11 <gmaxwell> Generally thats my strong preference, except in cases where there is a 'application layer' gain from otherwise.
20:01:43 <gmaxwell> Not just some 'its easier to write this way' or 'wallet file is somewhat smaller for unspecified benefits'
20:02:43 <sipa> seems we're out of time
20:03:05 <gmaxwell> I really regret ever coming up with public derivation and wish I could take it back. The original application I had for it is still unavailable to people (so your web server could securely generate fresh addresses for you) in practice... and it gets applied *everywhere* simply because reusing a key is simpler to implement.
20:03:26 <achow101> well that's all I had for today. there was something else I wanted to discuss, but I don't remember what that was
20:03:40 <achow101> so clearly it wasn't very important
20:04:19 <gmaxwell> .. and it's resulted in funds loss, and it also results in expectations we won't be able to support for post ECC keys.
20:04:46 <sipa> #endmeeting