12015-10-24T00:25:03 *** belcher has quit IRC
22015-10-24T00:26:14 *** CodeShark has joined #bitcoin-core-dev
32015-10-24T00:36:52 <phantomcircuit> gmaxwell, *substantial*
42015-10-24T00:42:30 *** danielsocials has joined #bitcoin-core-dev
52015-10-24T01:03:18 *** CodeShark has quit IRC
62015-10-24T01:13:52 *** danielsocials has quit IRC
72015-10-24T01:21:25 *** danielsocials has joined #bitcoin-core-dev
82015-10-24T01:34:03 <phantomcircuit> uh
92015-10-24T01:34:12 <phantomcircuit> yeah we definitely have some kind of memory issue in getblocktemplate
102015-10-24T01:34:16 <phantomcircuit> it's trivial to recreate
112015-10-24T01:34:38 <phantomcircuit> while true;do bitcoin-cli getblocktemplate > /dev/null;done
122015-10-24T01:34:40 <phantomcircuit> is enough
132015-10-24T01:42:51 <phantomcircuit> actually no this is expected
142015-10-24T01:43:18 <phantomcircuit> 4GB dbcache + 1.3GB mempool + 800 MB of ?
152015-10-24T01:49:53 <phantomcircuit> BlueMatt, you have code to walk the cache for stuff nothing depends on?
162015-10-24T01:51:52 <BlueMatt> nope
172015-10-24T01:51:54 <BlueMatt> its not hard, though
182015-10-24T02:01:14 *** danielsocials has quit IRC
192015-10-24T02:01:33 <sipa> there is no "depends"
202015-10-24T02:01:38 <sipa> it's a cache
212015-10-24T02:01:46 <BlueMatt> sipa: you know what he meant
222015-10-24T02:04:04 <sipa> it's expected that gbt increases the cache size, as it will pull in all dependencies
232015-10-24T02:04:57 <BlueMatt> sipa: indeed, not sure whether what phantomcircuit is seeing is unexpected or not, but the graphs in the ml related to https://github.com/bitcoin/bitcoin/issues/6876 are not expected at all
242015-10-24T02:06:07 <sipa> well i have no idea what the gbt code is all doing
252015-10-24T02:09:53 *** Dyanisus has quit IRC
262015-10-24T02:14:52 <phantomcircuit> BlueMatt, what im seeing is 6.3GB of ram being used with dbcache=4096
272015-10-24T02:15:07 <BlueMatt> phantomcircuit: in dbcache? that sounds about right
282015-10-24T02:15:07 <phantomcircuit> and calling gettxoutsetinfo (which calls FlushStateToDisk) not changing that
292015-10-24T02:15:12 <BlueMatt> oh
302015-10-24T02:15:13 <BlueMatt> hmm
312015-10-24T02:20:37 <sipa> GBT copies a significant portion of the cache into its own view
322015-10-24T02:20:53 <sipa> so it can pretty much double the memory usage
332015-10-24T02:22:07 <phantomcircuit> sipa, yeah but that view is destroyed at the end of the CreateNewBlock call
342015-10-24T02:22:42 <sipa> does change your res
352015-10-24T02:22:51 <sipa> *doesn't
362015-10-24T02:22:57 <phantomcircuit> right because of memory allocation stuff
372015-10-24T02:23:04 <sipa> fragmentation etc
382015-10-24T02:23:16 *** d_t has quit IRC
392015-10-24T02:27:50 <phantomcircuit> sipa, hmm boost::unordered_map uses buckets to make expanding easier
402015-10-24T02:35:25 <phantomcircuit> im not sure that's it though since memory usage just jumped 700MB more
412015-10-24T02:35:37 <phantomcircuit> i cant imagine that's from fragmentation alone
422015-10-24T02:36:51 <sipa> well if there is 800 MB chaonstatr depended on by the mempool, then GBT will add another 800 MB
432015-10-24T02:37:03 <sipa> fragmentation just makes it not able to release it back
442015-10-24T02:37:33 <phantomcircuit> chainstate?
452015-10-24T02:38:30 <sipa> yes
462015-10-24T02:38:38 <sipa> typing in a driving car on a small keyboard
472015-10-24T02:42:21 <phantomcircuit> oh duh right they're separate
482015-10-24T02:45:13 *** fanquake has joined #bitcoin-core-dev
492015-10-24T03:02:08 *** Dyanisus has joined #bitcoin-core-dev
502015-10-24T03:04:20 *** Ylbam has quit IRC
512015-10-24T03:08:34 *** Dyanisus has left #bitcoin-core-dev
522015-10-24T03:28:19 *** fanquake has quit IRC
532015-10-24T03:28:20 *** danielsocials has joined #bitcoin-core-dev
542015-10-24T03:30:26 *** Guest6474 has quit IRC
552015-10-24T03:30:29 *** maaku__ has joined #bitcoin-core-dev
562015-10-24T03:33:53 <morcos> the problem with GBT is it can create that much memory usage per each of the RPC threads
572015-10-24T03:34:23 *** danielsocials has quit IRC
582015-10-24T03:34:33 <morcos> each thread allocates the memory in a separate arena, and even though the objects are destroyed at the end of the call, there tends to be enough fragmentation that the memory isn't entirely free
592015-10-24T03:36:04 <morcos> in addition, if your chainstate expands during an RPC call (such as due to GBT) enough to cause a rehash of the unordered map
602015-10-24T03:36:21 <morcos> then this also will be allocated in a new arena, and possibly all the old chainstate won't be cleaned up
612015-10-24T03:38:04 <morcos> phantomcircuit: sipa: ^ not sure if you followed the earlier conversation i had with wumpus and gmaxwell about this
622015-10-24T03:42:01 *** jgarzik_ has joined #bitcoin-core-dev
632015-10-24T03:42:18 *** jgarzik has quit IRC
642015-10-24T03:45:28 *** d_t has joined #bitcoin-core-dev
652015-10-24T04:00:40 *** gribble has quit IRC
662015-10-24T04:08:42 *** gribble has joined #bitcoin-core-dev
672015-10-24T04:19:15 *** maaku has quit IRC
682015-10-24T04:23:19 *** jgarzik_ is now known as jgarzik
692015-10-24T04:23:27 *** jgarzik has joined #bitcoin-core-dev
702015-10-24T04:27:11 *** Luke-Jr has quit IRC
712015-10-24T04:30:57 *** Luke-Jr has joined #bitcoin-core-dev
722015-10-24T04:32:35 *** danielsocials has joined #bitcoin-core-dev
732015-10-24T04:55:40 *** PaulCape_ has joined #bitcoin-core-dev
742015-10-24T04:58:54 *** PaulCapestany has quit IRC
752015-10-24T05:20:43 *** danielsocials has quit IRC
762015-10-24T05:58:38 *** PaulCape_ has quit IRC
772015-10-24T06:00:27 *** PaulCapestany has joined #bitcoin-core-dev
782015-10-24T06:01:29 <cfields> gmaxwell: for backlog, i added in Tor's RESOLVE extension to SOCKS5 so that we can query all seeds up front without making actual node connections. Not sure if there's any real benefit, but it was trivial to add.
792015-10-24T06:04:45 <wumpus> cfields: nice
802015-10-24T06:33:30 *** d_t has quit IRC
812015-10-24T07:11:58 *** danielsocials has joined #bitcoin-core-dev
822015-10-24T07:16:09 *** danielsocials_ has joined #bitcoin-core-dev
832015-10-24T07:17:30 *** danielsocials has quit IRC
842015-10-24T07:27:10 *** GAit has quit IRC
852015-10-24T07:40:48 *** ParadoxSpiral_ has joined #bitcoin-core-dev
862015-10-24T07:43:46 *** ParadoxSpiral has quit IRC
872015-10-24T07:58:28 <jgarzik> jcorgan, btcdrak: 2015_sqlite branch now works, passes tests
882015-10-24T07:58:42 <jgarzik> not yet performance-tuned
892015-10-24T08:00:40 <wumpus> great
902015-10-24T08:00:46 * wumpus switches to sqlite
912015-10-24T08:06:54 <jgarzik> no apparent transaction size limit. "I am able to insert 10 million rows in a single transaction"
922015-10-24T08:07:30 <jgarzik> BEGIN...COMMIT maps easily to DBWrapper's batches
932015-10-24T08:21:57 <wumpus> that's good news
942015-10-24T08:22:57 <wumpus> one question: will using the sqlite branch blow my old leveldb database, or can they exist side by side?
952015-10-24T08:23:34 <wumpus> will just create a new datadir to be sure
962015-10-24T08:25:29 <jgarzik> should be able to exist side by side
972015-10-24T08:25:36 <jgarzik> sqlite database is a file named "db"
982015-10-24T08:26:01 <jgarzik> but don't trust me - be certain and create a new datadir :)
992015-10-24T08:27:35 <jgarzik> 1) Here are all the tweaks for an sqlite database: https://www.sqlite.org/pragma.html
1002015-10-24T08:28:01 <jgarzik> 2) git pull the latest 2015_sqlite branch, it includes some key performance tweaks in dbwrapper.cpp (grep for PRAGMA)
1012015-10-24T08:28:52 * jgarzik heads back to bed *poof*
1022015-10-24T08:30:41 <btcdrak> jgarzik: I'll take a look
1032015-10-24T08:30:51 <GitHub65> [bitcoin] giacecco opened pull request #6885: Instructions on how to make the Homebrew OpenSSL headers visible... (master...master) https://github.com/bitcoin/bitcoin/pull/6885
1042015-10-24T08:31:56 *** Ylbam has joined #bitcoin-core-dev
1052015-10-24T08:35:58 *** danielsocials_ has quit IRC
1062015-10-24T08:45:03 *** Thireus has joined #bitcoin-core-dev
1072015-10-24T12:47:07 <wumpus> 2015-10-24 12:45:29 UpdateTip: new best=000000000000000bf325356179fb8876fe40e250c9e31082242f70f89ecbcd0b height=240923 log2_work=70.308629 tx=1
1082015-10-24T12:47:07 <wumpus> 9292749 date=2013-06-11 12:51:49 progress=0.093959 cache=65.5MiB(34687tx)
1092015-10-24T12:47:07 <wumpus> 2015-10-24 12:46:10 UpdateTip: new best=00000000000000ecca86ba925835b0909eeba33fd90ae9858c01e088d4d13bcf height=240924 log2_work=70.308695 tx=1
1102015-10-24T12:47:07 <wumpus> 9293107 date=2013-06-11 12:59:51 progress=0.093961 cache=2.0MiB(0tx)
1112015-10-24T12:48:04 <wumpus> it seems like the flushing takes quite a long time with sqlite (40 seconds in this case), haven't done a direct comparison with leveldb though
1122015-10-24T12:48:31 <wumpus> but it blazes past N blocks, then hangs noticeably on the flush, longer than I remember
1132015-10-24T12:50:54 *** belcher has joined #bitcoin-core-dev
1142015-10-24T13:00:59 *** moli has joined #bitcoin-core-dev
1152015-10-24T13:03:15 *** molly has quit IRC
1162015-10-24T13:04:21 <wumpus> at first glance it seems to do a lot of calls to fsync()
1172015-10-24T13:04:43 <wumpus> (unscientifically tested by running in gdb and breaking+backtracing a few times)
1182015-10-24T13:35:59 *** danielsocials has joined #bitcoin-core-dev
1192015-10-24T13:41:06 <wumpus> yes it's fsyncing - removing the fsync calls (obviously a stupid idea in itself) flush time is down to <10 seconds. I understand calling fsync, but is it doing so for every single statement?
1202015-10-24T14:06:09 *** danielsocials has quit IRC
1212015-10-24T14:15:10 <jcorgan> wumpus: confirming same behavior here
1222015-10-24T14:19:43 <wumpus> "PRAGMA synchronous=0" works too (make sure it's executed every time the database is opened) instead of commenting out fsyncs - though won't be very resistant to crashes https://www.sqlite.org/pragma.html#pragma_synchronous
1232015-10-24T14:20:18 <jcorgan> heh, isn't the whole reason we're testing this is to improve crash resistance? :-)
1242015-10-24T14:21:23 <wumpus> yeah...
1252015-10-24T14:22:04 <jcorgan> once this reindexes i'll bet steady-state performance won't really be any different
1262015-10-24T14:22:05 <wumpus> so sqlite's idea of crash resistance seems to be 'fsync after every file operation'. Or maybe it's after some buffer fills that can be increased, I don't know.
1272015-10-24T14:22:28 <wumpus> jcorgan: I'll bet the same
1282015-10-24T14:23:28 <jcorgan> hmm, reindexing is actually speeding up as it goes along
1292015-10-24T14:25:40 <jcorgan> eyeballing it it is about 150-200 blocks/sec, with peak disk writes of about 70-80 MB/sec
1302015-10-24T14:26:33 <wumpus> jgarzik: the pragma tweaks in sql_db_init are not persistent - they need to be done every time the db opens, not only when it is created
1312015-10-24T14:28:12 *** danielsocials has joined #bitcoin-core-dev
1322015-10-24T14:31:35 <jgarzik> wumpus, very odd - I would think page size is persistent
1332015-10-24T14:32:12 <wumpus> this sounds interesting: https://www.sqlite.org/wal.html
1342015-10-24T14:32:29 <jgarzik> nod
1352015-10-24T14:32:42 <jgarzik> Plenty of tips (sounds like you already figured out some) in https://www.sqlite.org/cvstrac/wiki?p=PerformanceTuning
1362015-10-24T14:33:00 <jgarzik> you can play around with logging types
1372015-10-24T14:33:10 <wumpus> jgarzik: probably the database is created with one page size, but e.g. cache size isn't remembered
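The point about persistence can be checked with a small experiment. The following is a minimal sketch using Python's standard `sqlite3` module rather than the C API used in dbwrapper.cpp; the `kv` table, the file name and the chosen values are assumptions made for illustration only. It creates a database with a non-default page size and some tuning, reopens it without tuning, then reopens it again with tuning applied.

```python
import os
import sqlite3
import tempfile

# Per-connection tuning (illustrative values, not those of the 2015_sqlite branch).
TUNING = (
    "PRAGMA cache_size=-65536",  # a negative value means a size in KiB
    "PRAGMA synchronous=0",      # no fsync: fast, but not crash-safe
)

def apply_tuning(conn):
    for stmt in TUNING:
        conn.execute(stmt).fetchall()

def read_pragmas(conn):
    names = ("page_size", "cache_size", "synchronous")
    return {n: conn.execute("PRAGMA " + n).fetchone()[0] for n in names}

def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "db")

        # First open: create the database file.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("PRAGMA page_size=8192")  # must be set before the first write
        apply_tuning(conn)
        conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB)")
        created = read_pragmas(conn)
        conn.close()

        # Second open: no tuning applied, just read the settings back.
        conn = sqlite3.connect(path, isolation_level=None)
        reopened = read_pragmas(conn)
        conn.close()

        # Third open: tuning applied again, as it has to be on every open.
        conn = sqlite3.connect(path, isolation_level=None)
        apply_tuning(conn)
        retuned = read_pragmas(conn)
        conn.close()
    return created, reopened, retuned

if __name__ == "__main__":
    for label, values in zip(("created", "reopened", "retuned"), demo()):
        print(label, values)
```

In this sketch the page size survives the reopen because it is stored in the database file, while the cache size falls back to the library default until the PRAGMA is issued again.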
1382015-10-24T14:37:41 <wumpus> good news, PRAGMA journal_mode=WAL; seems to solve the excessive-fsync problem too
1392015-10-24T14:39:07 <wumpus> "WAL works best with smaller transactions. WAL does not work well for very large transactions. For transactions larger than about 100 megabytes, traditional rollback journal modes will likely be faster" that's strange - we certainly have transactions that big
1402015-10-24T14:39:10 <jgarzik> In C++, is there a one-line way to convert an int to a std::string ?
1412015-10-24T14:39:43 <jgarzik> i.e. "foo" + numstr(22)
1422015-10-24T14:39:56 <wumpus> we have itostr in utilstrencodings.h
1432015-10-24T14:40:04 <wumpus> and i64tostr
1442015-10-24T14:40:13 <jgarzik> ok, perfect
1452015-10-24T14:42:49 *** fkhan has quit IRC
1462015-10-24T14:44:45 <jgarzik> wumpus, pushed the db init fix to 2015_sqlite (where params like cache_size are configured every time, but 'create table' happens only once)
1472015-10-24T14:45:08 <jgarzik> wumpus, WAL should be good for us
1482015-10-24T14:45:19 <jcorgan> is that in your update?
1492015-10-24T14:45:30 <jgarzik> jcorgan, WAL? no
1502015-10-24T14:47:41 <jgarzik> jcorgan, it's pretty obvious where to add configuration lines at the top of dbwrapper.cpp, so it's straightforward
1512015-10-24T14:47:52 <jgarzik> Another todo item is making the cache size configurable
1522015-10-24T14:48:35 <jgarzik> (via runtime GetArg, I mean)
1532015-10-24T14:48:49 *** fkhan has joined #bitcoin-core-dev
1542015-10-24T14:50:04 <jgarzik> 70-80 MB/sec is pretty darned good
1552015-10-24T14:50:20 <jcorgan> btrfs on hardware raid 10 :-)
1562015-10-24T14:50:22 <jgarzik> I would be interested in the wall clock reindex time
1572015-10-24T14:50:29 <jgarzik> of master vs sqlite
1582015-10-24T14:51:12 <jcorgan> i'm restarting with WAL, i'll time it
1592015-10-24T14:51:20 <jcorgan> btrfs snapshots are the bomb
1602015-10-24T14:52:39 <wumpus> with WAL, PRAGMA wal_autocheckpoint has a lot of influence on the number of fsyncs
1612015-10-24T14:53:27 * jgarzik doesn't see a need for fsync inside a batch
1622015-10-24T14:54:10 <jgarzik> as long as post-crash we see a consistent picture, we can lose the WAL-in-progress
1632015-10-24T14:54:39 <wumpus> well it *looks* to me that it's this: during a batch it writes to the journal, and it fsyncs the journal. I agree though.
1642015-10-24T14:54:53 <jgarzik> changes are batched to std::vector<> internally, and then flooded to the db in a rapid BEGIN..INSERT*..COMMIT sequence.
1652015-10-24T14:55:00 <wumpus> no need to checkpoint *ever* during a batch
1662015-10-24T14:55:05 <jgarzik> correct
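The batching pattern described here (collect changes in memory, then apply them in one BEGIN..COMMIT) can be sketched roughly as follows. This is an illustration in Python, not the code from the 2015_sqlite branch; the `Batch` class, the `write_batch` helper and the `kv` table are hypothetical names.

```python
import sqlite3

class Batch:
    """Collects puts and erases in memory, like a std::vector of pending changes."""
    def __init__(self):
        self.ops = []

    def put(self, key, value):
        self.ops.append(("put", key, value))

    def erase(self, key):
        self.ops.append(("erase", key, None))

def open_db(path=":memory:"):
    # isolation_level=None: the module issues no implicit BEGIN, we do it ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)")
    return conn

def write_batch(conn, batch):
    """Apply every queued change inside a single transaction."""
    conn.execute("BEGIN")
    try:
        for op, key, value in batch.ops:
            if op == "put":
                conn.execute(
                    "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
                )
            else:
                conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # leave the database at its previous state
        raise

if __name__ == "__main__":
    db = open_db()
    b = Batch()
    b.put(b"c1", b"x")
    b.put(b"c2", b"y")
    b.erase(b"c1")
    write_batch(db, b)
    print(db.execute("SELECT k, v FROM kv").fetchall())
```

Because the whole batch is one transaction, a crash or error partway through leaves the previously committed state visible, which is the consistency property being asked for above.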
1672015-10-24T14:56:26 <sipa> cache size is controlled by -dbcache already
1682015-10-24T14:56:51 <sipa> part of it is assigned to pcoinsTip cache, part to the database layer itself
1692015-10-24T14:57:07 <sipa> using a totally arbitrary formula
1702015-10-24T14:57:10 <jgarzik> sipa, the specific need is for sqlite to move from static constant stored within a constant "foo" string to GetArg configured with that
1712015-10-24T14:58:13 <jgarzik> sipa, Line 23 of https://github.com/jgarzik/bitcoin/blob/4d2e72900de85a1e2ffbc9470df05794242b82b9/src/dbwrapper.cpp#L23
1722015-10-24T15:04:20 <sipa> oh, yes!
1732015-10-24T15:06:01 <sipa> i'm just saying, no need for a new GetArg, there is already logic for this in init.cpp
1742015-10-24T15:07:50 <jgarzik> yep
1752015-10-24T15:07:53 *** Thireus has quit IRC
1762015-10-24T15:10:24 <sipa> note that the chainstate only really has a durability requirement when pruning
1772015-10-24T15:10:37 <sipa> otherwise, any old but consistent state is acceptable
1782015-10-24T15:11:00 <sipa> though blockindex flushes are hard requirements
1792015-10-24T15:12:22 *** Thireus has joined #bitcoin-core-dev
1802015-10-24T15:13:52 <jgarzik> I would like to separate things out into multiple tables
1812015-10-24T15:14:05 <jgarzik> (as sipa mentioned days ago)
1822015-10-24T15:15:36 <sipa> i'm surprised you didn't need to yet
1832015-10-24T15:15:44 <sipa> can transactions span multiple tables?
1842015-10-24T15:16:07 <jgarzik> yes
1852015-10-24T15:16:21 <sipa> in that, you probably want to split it
1862015-10-24T15:23:19 <jgarzik> wumpus, One concern with the current implementation is the 'ORDER BY' - a sort - in the CDBIterator class. Once fully sync'd to current bitcoin block height, failing to store in an always-sorted container may create lumpy bitcoind behavior whenever CDBIterator is used... maybe.
1872015-10-24T15:23:31 <jgarzik> Needs testing to disprove hypothesis.
1882015-10-24T15:24:10 <jgarzik> possibly either the sort is fast enough or smart enough that this is not noticed
1892015-10-24T15:24:30 <wumpus> yes, depends on the kind of index
1902015-10-24T15:25:40 <wumpus> I suppose the default is a sorted index
1912015-10-24T15:26:44 <wumpus> something like a hash index on the key would indeed break the assumption that CDBIterator always returns the results in order
1922015-10-24T15:27:11 <jgarzik> well not break - slow down
1932015-10-24T15:27:45 <jgarzik> 'ORDER BY' provides the ordering guarantee in case the underlying db does not, in this implementation
1942015-10-24T15:27:49 <wumpus> yeah... but an in-database sort of a whole table really isn't pretty
1952015-10-24T15:27:54 <jgarzik> nod
1962015-10-24T15:28:14 <wumpus> I guess it's best to just use a sorted index for now, unless it proves to be a bottleneck
1972015-10-24T15:28:15 <jgarzik> it's a partial sort, starting at the base key
1982015-10-24T15:28:25 <jgarzik> not a whole-table sort
1992015-10-24T15:28:40 <wumpus> don't we only use iterators over the whole table?
2002015-10-24T15:28:54 <jgarzik> not needed
2012015-10-24T15:29:08 <jgarzik> code has one pattern: seek(key) then next() next() next()
2022015-10-24T15:29:11 <wumpus> then again - without an ordered index, the >= criteria on the key will also cause a full scan
2032015-10-24T15:29:50 <jgarzik> yep that is actually an option - assuming results can be unordered - drop 'ORDER BY' and simply scan
2042015-10-24T15:30:48 * jgarzik wonders how to tune index types
2052015-10-24T15:30:51 <wumpus> yeah, that's the same as the database does internally. I suggest just sticking with a sorted index, unless it turns out to be really a problem, this is unwanted ugliness
2062015-10-24T15:31:39 <jcorgan> wumpus, what did you end up with for the list of pragmas
2072015-10-24T15:32:26 <jcorgan> WAL isn't making any difference for me
2082015-10-24T15:33:36 <wumpus> good question though: do any of the CDBIterator clients require the records to be in a defined order?
2092015-10-24T15:35:20 <jgarzik> https://www.sqlite.org/withoutrowid.html
2102015-10-24T15:35:23 <wumpus> looks like they don't: their only requirement is that the keys are in a certain range, because the prefix defines what 'table' they are in
2112015-10-24T15:36:06 <jgarzik> wumpus, as of now they are within a certain -start- range; end is not excised
2122015-10-24T15:36:15 <wumpus> jcorgan: PRAGMA wal_autocheckpoint=0
2132015-10-24T15:36:32 <wumpus> jgarzik: the end is too - by breaking out of the iterator as soon as prefix no longer matches
2142015-10-24T15:36:36 <jcorgan> got it
2152015-10-24T15:36:46 <wumpus> jgarzik: sure, that could be part of the API
2162015-10-24T15:36:55 <jgarzik> wumpus, nod - thus sort is required - otherwise iteration ends prematurely
2172015-10-24T15:37:23 <jgarzik> it works if you can do ">= start_key" and "< next_prefix" and know what next_prefix is
2182015-10-24T15:37:24 <wumpus> my point is that the sort would not be required when the records would be binned in a different way, for example in different tables, instead of using a prefix to distinguish them
2192015-10-24T15:37:49 <wumpus> they don't rely on the fact that keys are sorted
2202015-10-24T15:37:54 <jgarzik> wumpus, agreed there - hence <jgarzik> I would like to separate things out into multiple tables
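The ">= start_key and < next_prefix" idea can be sketched like this. It is an illustration only: `next_prefix`, `iter_prefix`, `make_db` and the `kv` table are hypothetical names, and the one-byte prefixes in the usage example are just sample values. It relies on SQLite comparing BLOB values bytewise, so the range bounds select exactly the keys that start with the prefix, whatever order the rows come back in.

```python
import sqlite3

def next_prefix(prefix):
    """Smallest byte string greater than every key starting with `prefix`.

    Returns None when no such bound exists (empty prefix or all 0xff bytes).
    """
    p = bytearray(prefix)
    while p and p[-1] == 0xFF:
        p.pop()          # a trailing 0xff cannot be incremented, drop it
    if not p:
        return None
    p[-1] += 1
    return bytes(p)

def iter_prefix(conn, prefix):
    """Yield (key, value) rows whose key starts with `prefix`, in no promised order."""
    upper = next_prefix(prefix)
    if upper is None:
        cur = conn.execute("SELECT k, v FROM kv WHERE k >= ?", (prefix,))
    else:
        cur = conn.execute(
            "SELECT k, v FROM kv WHERE k >= ? AND k < ?", (prefix, upper)
        )
    yield from cur

def make_db(keys):
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB)")
    conn.executemany("INSERT INTO kv VALUES (?, ?)", [(k, b"") for k in keys])
    return conn

if __name__ == "__main__":
    db = make_db([b"b\x01", b"c\x01", b"c\x02", b"t\x01"])
    print(sorted(k for k, _ in iter_prefix(db, b"c")))
```

With an explicit upper bound the caller no longer has to stop at the first non-matching key, so correctness does not depend on the rows arriving sorted. Splitting the record types into separate tables, as suggested above, would remove the need for the bound altogether.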
2212015-10-24T15:38:32 <wumpus> in the case of the UTXO database this doesn't matter that much because it *almost* only contains COINS entries
2222015-10-24T15:38:44 <wumpus> for the blockdb on the other hand, it can contain these txindex entries...
2232015-10-24T15:39:19 <jgarzik> wumpus, what is your total diff versus 2015_sqlite, WRT pragmas? can you paste that?
2242015-10-24T15:39:20 <wumpus> (and a lot of them, the number of block entries is negligible in comparison)
2252015-10-24T15:39:26 <jgarzik> I want to put that in 2015_sqlite
2262015-10-24T15:40:11 <wumpus> + "PRAGMA wal_autocheckpoint=0",
2272015-10-24T15:40:11 <wumpus> + "PRAGMA journal_mode=WAL",
2282015-10-24T15:40:51 <jcorgan> isn't that supposed to be schema.journal_mode ?
2292015-10-24T15:41:01 <jgarzik> global is fine
2302015-10-24T15:41:04 <wumpus> I prefer setting the global option
2312015-10-24T15:41:20 <jcorgan> ok
2322015-10-24T15:41:21 <wumpus> (it's possible to set it per schema)
2332015-10-24T15:41:29 <wumpus> (but do we even define those?)
2342015-10-24T15:42:04 <jgarzik> no need
2352015-10-24T15:42:28 <wumpus> using schema.journal mode *literally* fails, that was my first try too :)
2362015-10-24T15:42:57 <jgarzik> wumpus, well, of course it will fail, first time the table does not exist
2372015-10-24T15:43:31 <jgarzik> stmts can be reorders but ... setting the global is best
2382015-10-24T15:43:35 <wumpus> but setting the global one works, so yeah...
2392015-10-24T15:43:36 <jgarzik> reordered
2402015-10-24T15:44:02 <jgarzik> OK, pushed out WAL & improved errors to 2015_sqlite
2412015-10-24T15:44:16 <jgarzik> going to task switch, poke me if there are other updates for the branch
2422015-10-24T15:44:45 <jcorgan> i'll time the total reindex with the new stuff
2432015-10-24T15:46:43 <jcorgan> huge difference with the pragmas
2442015-10-24T15:49:00 <jgarzik> Thanks. If someone is so motivated, timing with different page sizes (1024, 4096, 8192) can be useful. Page size goes all the way up to 64k.
2452015-10-24T15:50:38 <wumpus> wouldn't the optimal page size depend on the hardware as well? would be good to do some benchmarks, for example do a full reindex with various sets of parameters, but I don't have a system I can use for controlled tests without background noise
2462015-10-24T15:51:16 <wumpus> -stopafterblockimport is a great option for timed, batched reindexes, though
2472015-10-24T15:53:22 <jcorgan> i suspect it would be very filesystem and storage configuration dependent
2482015-10-24T15:56:51 <wumpus> btw it could be that we need to call sqlite3_wal_checkpoint_v2 at some point (eg, when flushing) with autocheckpointing disabled
2492015-10-24T15:57:47 *** danielsocials has quit IRC
2502015-10-24T15:58:26 <wumpus> "Writers sync the WAL on every transaction commit if PRAGMA synchronous is set to FULL but omit this sync if PRAGMA synchronous is set to NORMAL." or maybe not. Pragma synchronous is FULL by default. I'm not sure how the checkpoints come into it.
2512015-10-24T15:58:46 <wumpus> syncing on transaction commit sounds sane
2522015-10-24T15:59:00 <jcorgan> yes
2532015-10-24T16:04:19 *** CodeShark has joined #bitcoin-core-dev
2542015-10-24T16:11:04 *** paveljanik has joined #bitcoin-core-dev
2552015-10-24T16:11:38 <wumpus> after running a while, bitcoin-shutoff is taking a long time in sqlite3_close - possibly due to the lack of checkpoints
2562015-10-24T16:14:28 <wumpus> ... -rw------- 1 39G Oct 24 18:06 db-wal probably
2572015-10-24T16:16:35 <wumpus> finished now
2582015-10-24T16:16:42 <sipa> ugh
2592015-10-24T16:17:32 <jgarzik> "Avoiding Excessively Large WAL Files" is a big section in https://www.sqlite.org/wal.html :)
2602015-10-24T16:18:45 <jgarzik> RE optimal page size - ideally it is an "I/O unit" which is filesystem block size - but large dbs might gain from clustering. Also, modern filesystems are kernel-page-cache based, and so an I/O unit in practice is often 4096 or 8192.
2612015-10-24T16:19:52 <wumpus> jgarzik: yes - probably setting the page size to anything less than 4096 is not going to be useful
2622015-10-24T16:20:10 <wumpus> (at least not on linux)
2632015-10-24T16:23:37 <wumpus> adding "rrc = sqlite3_wal_checkpoint_v2(psql, NULL, SQLITE_CHECKPOINT_PASSIVE, NULL, NULL);" at the end of WriteBatch indeed stops the wal file from growing... but, makes the flush slow (~30 seconds) again :/
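The WAL growth and the effect of a manual checkpoint can be reproduced on a small scale. This is a rough sketch using Python's `sqlite3` module, which does not expose sqlite3_wal_checkpoint_v2 directly, so it issues the equivalent `PRAGMA wal_checkpoint(TRUNCATE)` instead of the PASSIVE C call quoted above. The table, batch sizes and file name are arbitrary assumptions.

```python
import os
import sqlite3
import tempfile

def wal_size(path):
    """Size in bytes of the write-ahead log next to `path`, or 0 if absent."""
    wal = path + "-wal"
    return os.path.getsize(wal) if os.path.exists(wal) else 0

def demo(batches=3, rows_per_batch=1000):
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "db")
        conn = sqlite3.connect(path, isolation_level=None)
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("PRAGMA wal_autocheckpoint=0").fetchall()  # never checkpoint automatically
        conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB)")

        sizes = []
        for _ in range(batches):
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)",
                ((os.urandom(32), os.urandom(64)) for _ in range(rows_per_batch)),
            )
            conn.execute("COMMIT")
            sizes.append(wal_size(path))  # grows with every commit

        # Fold the WAL back into the main database file and truncate it.
        busy = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]
        after = wal_size(path)
        conn.close()
    return mode, sizes, busy, after

if __name__ == "__main__":
    mode, sizes, busy, after = demo()
    print("journal mode:", mode)
    print("WAL size after each batch:", sizes)
    print("checkpoint busy flag:", busy, "WAL size after checkpoint:", after)
```

With autocheckpointing disabled the WAL keeps growing until something checkpoints it, which is consistent with the large db-wal file and the slow sqlite3_close seen earlier. Where to place the checkpoint so that it does not stall the flush is the open question in the discussion, and this sketch does not answer it.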
2642015-10-24T16:25:01 * jgarzik reverted 2015_sqlite back to default autocheckpointing behavior... plenty of room for further research & experiments
2652015-10-24T16:25:09 <jgarzik> I think there is a background WAL checkpoint mode
2662015-10-24T16:25:15 <wumpus> so I'm not sure when to do these checkpoints
2672015-10-24T16:25:33 <wumpus> the default autocheckpointing does a checkpoint every 1000 statements or so, that is probably even worse
2682015-10-24T16:26:11 <jgarzik> nod - it's a stable base for further research, not an endpoint
2692015-10-24T16:26:30 <jgarzik> the main goal is to not-wait for checkpoint, not not-checkpoint
2702015-10-24T16:26:50 <wumpus> yes but we want checkpoints only to happen on commit, not inbetween
2712015-10-24T16:27:34 <wumpus> I think the default does so, not sure though... didn't expect manual calls to sqlite3_wal_checkpoint_v2 to be so slow even when done once per transaction
2722015-10-24T16:28:22 <wumpus> doing it in the background would be nice
2732015-10-24T16:28:57 <wumpus> but then "Checkpointing needs to lock the entire database, so all other readers and writes would have to be blocked. (A passive checkpoint just aborts.)"
2742015-10-24T16:30:05 <wumpus> so I'm not sure how much doing the checkpointing in a thread would help in practice. If you do PASSIVE checkpoints they'll never happen at all, and the other modes do waiting in one way or another
2752015-10-24T16:30:22 <jcorgan> whouda thunk that getting database operations right would be as hard as getting crypto right? :-)
2762015-10-24T16:30:48 <wumpus> well I'm not sure it's as hard as getting crypto right :) but yes it's not trivial
2772015-10-24T16:31:23 <wumpus> but seeing all of this it seems leveldb isn't so bad at all
2782015-10-24T16:32:15 <jgarzik> switching major database system types, one expects some analysis and bumps in understanding the new system
2792015-10-24T16:32:38 <sipa> the only way the wal is cleared is with a checkpoint?
2802015-10-24T16:32:47 <wumpus> yes, and this is only an experiment
2812015-10-24T16:33:03 <wumpus> sipa: yes, 'checkpoint' seems to be the operation 'incorporate the WAL into the database'
2822015-10-24T16:33:18 <sipa> and that operation needs a full exclusive lock on the database?
2832015-10-24T16:33:23 <wumpus> yep
2842015-10-24T16:33:28 <sipa> that's unusable
2852015-10-24T16:34:15 <jgarzik> The underlying system calls are range locks, so I wonder
2862015-10-24T16:34:29 <jgarzik> non-WAL updates the db with range locks, I'm pretty sure
2872015-10-24T16:34:52 <sipa> range locks are useless on a db ordered by random indexes
2882015-10-24T16:35:44 <wumpus> yes there's always the normal journal, though non-WAL does an fsync per statement, it seems, its performance was really bad here (and all the fsyncs were making the rest of the system slow, too)
2892015-10-24T16:36:21 <jgarzik> not necessarily - modern schemes write index/data updates to new pages, allowing older pages to be read in parallel (presenting older, committed view of the data)
2902015-10-24T16:36:33 <jgarzik> so you can be writing new while reading old
2912015-10-24T16:37:03 <jgarzik> update-in-place is avoided these days, as it does not produce crash-consistent behavior
2922015-10-24T16:37:13 <sipa> yes, sure
2932015-10-24T16:37:40 <sipa> but you want the background-written log to be incorporated into the real db without locking the whole thing down
2942015-10-24T16:37:47 <sipa> leveldb does that
2952015-10-24T16:38:42 <jgarzik> nod - and that's perfectly doable here with range locks - you update the new index, then a quick "flick of a switch" jumps from old consistent state to new consistent state, the latter of which was written in parallel in the background
2962015-10-24T16:39:39 <sipa> every set of wal will touch database entries all over the place
2972015-10-24T16:40:02 <sipa> every non-trivial set of log entries will need to lock the whole db
2982015-10-24T16:40:03 <jgarzik> same is true for every batch, every database system
2992015-10-24T16:40:52 <jgarzik> again not true - read the above - you don't need to lock the whole db in theory - I can't speak for sqlite but in database circles the solution is well known here.
3002015-10-24T16:41:03 <sipa> oh of course
3012015-10-24T16:41:16 <sipa> i'm just observing sqlite's behaviour
3022015-10-24T16:41:26 <sipa> leveldb solves it by having different levels :)
3032015-10-24T16:41:45 <sipa> and no guarantee in which level particular data is to be found
3042015-10-24T16:41:57 <sipa> so changes can "ripple up"
3052015-10-24T16:42:48 <wumpus> just read https://www.sqlite.org/c3ref/wal_checkpoint_v2.html - from what I understand from it, checkpointing requires an exclusive lock and needs to wait for all readers and writers to finish. If there is a way to work around that it'd be nice, but theory isn't of much help here :)
3062015-10-24T16:43:36 <jgarzik> wumpus, that's tuned by wal_checkpoint=[passive/full/restart/truncate] a bit
3072015-10-24T16:44:17 <wumpus> that's mentioned in the link I gave, yes. It looks like that determines who gets to wait (or cancel) for whom, not whether the operation requires an exclusive lock
3082015-10-24T16:45:01 <sipa> does issuing a passive checkpoint that gets interrupted make progress?
3092015-10-24T16:45:17 <wumpus> it can
3102015-10-24T16:56:30 <wumpus> doing e.g. SQLITE_CHECKPOINT_FULL or _TRUNCATE from a thread would make sense I suppose - it blocks all writers and some readers: apparently only those readers not reading from the most recent database snapshot
3112015-10-24T16:56:49 <wumpus> at least it waits for all readers of non-recent database snapshots to complete
3122015-10-24T16:58:15 <wumpus> (which makes sense as the old version is effectively discarded)
3132015-10-24T16:59:49 <wumpus> blocking writers is bad during initial sync, but not so much during steady-state where not many updates happen
3142015-10-24T17:16:21 <jcorgan> fyi, this reindex is with txindex=1
3152015-10-24T17:16:57 <sipa> that likely hurts performance
3162015-10-24T17:17:03 <sipa> txindex writes are not batched
3172015-10-24T17:17:50 <sipa> all other database writes are
3182015-10-24T17:18:30 <jcorgan> yeah, it's at height 180K and has slowed down dramatically
3192015-10-24T17:19:24 <jgarzik> yeah txindex=1 will slow things down & is not representative
3202015-10-24T17:19:45 <sipa> i consider txindex something you need for debugging/diagnostics, not for production use
3212015-10-24T17:19:50 <wumpus> agreed
3222015-10-24T17:19:50 <jcorgan> let me redo it then
3232015-10-24T17:20:24 <sipa> that said, it shouldn't gratuitously kill performance if there is an easy way around it
3242015-10-24T17:21:56 <sipa> i also doubt that anything you notice at 180k is due to that
3252015-10-24T17:22:45 <jcorgan> i don't know that it started at 180k, just that was where it was when i checked in on it
3262015-10-24T17:23:06 <wumpus> don't you mean 280k?
3272015-10-24T17:23:24 <wumpus> (that's where it is here, and I started about the same time)
3282015-10-24T17:23:42 <jcorgan> no, 180k
3292015-10-24T17:23:48 <wumpus> ok
3302015-10-24T17:24:47 <wumpus> unbatched? hmm, doing a sqlite transaction per bitcoin transaction is certainly going to kill sqlite performance
3312015-10-24T17:25:01 <jcorgan> restarted fresh with txindex=0, no difference in speed at the start, but we'll see how it slows down
3322015-10-24T17:25:12 <wumpus> there's almost no transactions at the start :)
3332015-10-24T17:25:18 <jcorgan> right :)
3342015-10-24T17:26:01 <jcorgan> it's about 200 blocks/sec
3352015-10-24T17:26:39 <CodeShark> I don't even count the first 100k blocks :p
3362015-10-24T17:30:26 <CodeShark> so are we benchmarking the SQLite stuff?
3372015-10-24T17:30:53 <jcorgan> "benchmarking" would be generous
3382015-10-24T17:31:05 <jcorgan> more like, wow, it actually works
3392015-10-24T17:31:17 <wumpus> CodeShark: if you want: jgarzik repository, 2015_sqlite branch
3402015-10-24T17:31:36 <CodeShark> jcorgan: why wouldn't it work? :)
3412015-10-24T17:32:11 <wumpus> it does actually work quite well :) although no one has tested yet if it is more resilient to crashes/poweroffs than leveldb, esp. on windows
3422015-10-24T17:32:15 <jcorgan> um, with txindex=0, it just hit 120k in the last 10 minutes
3432015-10-24T17:32:35 <CodeShark> sqlite is a pretty solid piece of software (albeit I've never tried using it for something that really taxes it)
3442015-10-24T17:32:55 <sipa> jcorgan: that sounds very slow
3452015-10-24T17:33:15 <wumpus> this is with autocheckpointing reenabled?
3462015-10-24T17:33:56 <jcorgan> i don't think so
3472015-10-24T17:34:19 <wumpus> well it won't get faster than that
3482015-10-24T17:34:38 <jcorgan> wal, autocheckpoint=0
3492015-10-24T17:35:14 <jcorgan> it just hit 180k
3502015-10-24T17:35:27 <wumpus> doesn't sound too bad to me
3512015-10-24T17:35:49 <jcorgan> still at about 160 blocks/sec
3522015-10-24T17:39:33 <jgarzik> Ideally you want a database system that ensures consistency, permits uninterrupted reads and writes, and optimizes in the background (i.e. "optimize" means move journalled data to final db position, updates indices, etc.)
3532015-10-24T17:39:49 <jgarzik> app shouldn't have to checkpoint
3542015-10-24T17:40:15 <sipa> yes
3552015-10-24T17:40:40 <jcorgan> wumpus: should i have made a further change in the pragmas?
3562015-10-24T17:41:58 <jcorgan> also, what was that stop after import option, and can it be set in the .conf file?
3572015-10-24T17:42:20 <jgarzik> reads locklessly read consistent data always, and writes do not stomp on that
3582015-10-24T17:43:45 <wumpus> jgarzik: they do in sqlite, the checkpoint is just administrative, it doesn't affect what readers read
3592015-10-24T17:45:04 <wumpus> at the time checkpoint is called, new readers will already be reading the new state, by the time it can start, old readers will be finished reading the old state, by definition
3602015-10-24T17:47:16 <wumpus> jcorgan: well with autocheckpoint=0 it effectively appends all changes to the db-wal file instead of incorporating them
3612015-10-24T17:47:56 <wumpus> this is a matter of where the data is stored, it doesn't affect how clients perceive the database
3622015-10-24T17:48:20 <sipa> but reading data from the wal file instead of the actual db must be slower?
3632015-10-24T17:48:39 <jgarzik> there is impact if some operations are blocked by other operations
3642015-10-24T17:48:39 <wumpus> could be
3652015-10-24T17:49:10 <wumpus> there can certainly be performance impact, just not functionality/consistency impact, that's all I'm saying...
3662015-10-24T17:49:33 <jgarzik> nod
3672015-10-24T17:49:37 <jcorgan> ok, restarted with txindex=0, wal, and removed the autocheckpoint=0
3682015-10-24T17:50:09 <wumpus> db-wal for chainstate should stay at about 440M then
3692015-10-24T17:50:46 <wumpus> (1-transaction-sized)
3702015-10-24T17:51:14 <wumpus> hmm too big for one transaction.. no I don't know :)
3712015-10-24T17:53:28 <wumpus> at least it stays relatively constant around 440mb and doesn't become GB's big anymore, to really make it truncate one'd probably need the SQLITE_CHECKPOINT_TRUNCATE
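[editor's note] A minimal sketch of the WAL settings under discussion, using Python's standard sqlite3 module. `PRAGMA wal_checkpoint(TRUNCATE)` is the pragma counterpart of the SQLITE_CHECKPOINT_TRUNCATE mode wumpus mentions. The helper names are hypothetical, and this is not the code from the 2015_sqlite branch.

```python
import os
import sqlite3

def open_wal_db(path, autocheckpoint_pages):
    # Autocommit mode, so no implicit transaction is left open.
    conn = sqlite3.connect(path, isolation_level=None)
    # Switch to write-ahead logging; the pragma returns the resulting mode.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # 0 disables automatic checkpoints, so changes accumulate in the -wal file.
    conn.execute(f"PRAGMA wal_autocheckpoint={int(autocheckpoint_pages)}").fetchone()
    return conn, mode

def wal_size(path):
    # Size in bytes of the write-ahead log that sits next to the database file.
    wal = path + "-wal"
    return os.path.getsize(wal) if os.path.exists(wal) else 0

def checkpoint_truncate(conn):
    # Moves WAL content into the main database file and truncates the WAL.
    # Returns (busy, wal_frames, checkpointed_frames); busy == 0 means it completed.
    return conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
```

With automatic checkpointing re-enabled (a non-zero page count), SQLite checkpoints on its own once the WAL passes that many pages, but it does not shrink the file by itself; an explicit truncating checkpoint like the one above is one way to do that.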
3722015-10-24T17:58:24 <jgarzik> jcorgan, current 2015_sqlite uses autocheckpoint=10000
3732015-10-24T17:58:34 <jgarzik> not that I would suggest re-starting yet again :)
3742015-10-24T17:59:09 <jcorgan> i have it scripted :)
3752015-10-24T18:19:42 <jcorgan> hmm
3762015-10-24T18:19:58 <jcorgan> -stopafterblockimport shut it down at 183842
3772015-10-24T18:21:08 <btcdrak> looks like i am missing the sqlite party
3782015-10-24T18:21:49 *** randy-waterhouse has joined #bitcoin-core-dev
3792015-10-24T18:26:24 *** maaku has joined #bitcoin-core-dev
3802015-10-24T18:26:47 *** maaku is now known as Guest6602
3812015-10-24T18:27:21 *** Guest6602 is now known as maaku
3822015-10-24T18:29:21 <CodeShark> it gets really slow once it passes 230k
3832015-10-24T18:44:53 *** maaku has left #bitcoin-core-dev
3842015-10-24T18:58:34 *** Thireus has quit IRC
3852015-10-24T18:58:48 *** Thireus has joined #bitcoin-core-dev
3862015-10-24T18:59:52 <jcorgan> something strange is happening
3872015-10-24T19:00:30 <jcorgan> the node stops reindexing at 183k, then starts up and asks connections for blocks at this point
3882015-10-24T19:10:58 *** Thireus1 has joined #bitcoin-core-dev
3892015-10-24T19:11:23 *** MarcoFalke has joined #bitcoin-core-dev
3902015-10-24T19:12:06 <jcorgan> ok, that's weird
3912015-10-24T19:12:30 <jcorgan> somehow i had restored a snapshot of bitcoin dir that only had 183k blocks in it
3922015-10-24T19:14:02 <jcorgan> i cloned a new snapshot of the original and it is working fine with that
3932015-10-24T19:14:15 *** Thireus has quit IRC
3942015-10-24T19:14:48 *** Thireus1 has quit IRC
3952015-10-24T19:14:58 *** Thireus has joined #bitcoin-core-dev
3962015-10-24T19:16:03 *** Thireus has quit IRC
3972015-10-24T19:16:15 *** Thireus has joined #bitcoin-core-dev
3982015-10-24T19:18:45 *** Thireus1 has joined #bitcoin-core-dev
3992015-10-24T19:21:59 *** Thireus has quit IRC
4002015-10-24T19:43:13 *** paveljanik has quit IRC
4012015-10-24T19:46:05 <btcdrak> I rebased #6312 and #6564 (BIP68+CSV) into one branch for the people who asked to review both PRs in one changeset https://github.com/bitcoin/bitcoin/compare/master...btcdrak:sequenceandcsv
4022015-10-24T19:57:09 *** malte has quit IRC
4032015-10-24T19:57:41 *** malte has joined #bitcoin-core-dev
4042015-10-24T20:00:42 *** randy-waterhouse has quit IRC
4052015-10-24T20:10:42 *** MarcoFalke has quit IRC
4062015-10-24T20:10:58 *** MarcoFalke has joined #bitcoin-core-dev
4072015-10-24T20:14:48 *** MarcoFalke has quit IRC
4082015-10-24T20:15:50 *** MarcoFalke has joined #bitcoin-core-dev
4092015-10-24T20:26:43 *** MarcoFalke has quit IRC
4102015-10-24T20:34:10 *** fkhan has quit IRC
4112015-10-24T20:46:38 *** fkhan has joined #bitcoin-core-dev
4122015-10-24T20:52:33 *** d_t has joined #bitcoin-core-dev
4132015-10-24T21:02:47 *** Thireus1 has quit IRC
4142015-10-24T21:02:58 *** Thireus has joined #bitcoin-core-dev
4152015-10-24T21:07:08 *** Thireus1 has joined #bitcoin-core-dev
4162015-10-24T21:07:48 *** Thireus1 has joined #bitcoin-core-dev
4172015-10-24T21:08:32 *** Thireus has quit IRC
4182015-10-24T21:11:17 *** Thireus has joined #bitcoin-core-dev
4192015-10-24T21:12:02 *** Thireus1 has quit IRC
4202015-10-24T21:12:33 *** d_t has quit IRC
4212015-10-24T21:12:43 *** Thireus has quit IRC
4222015-10-24T21:12:53 *** Thireus has joined #bitcoin-core-dev
4232015-10-24T21:14:47 *** Thireus1 has joined #bitcoin-core-dev
4242015-10-24T21:17:19 *** Thireus has quit IRC
4252015-10-24T21:24:51 *** Thireus1 has quit IRC
4262015-10-24T21:25:01 *** Thireus has joined #bitcoin-core-dev
4272015-10-24T21:28:49 *** Thireus has quit IRC
4282015-10-24T21:28:50 *** Thireus1 has joined #bitcoin-core-dev
4292015-10-24T21:30:16 *** Thireus has joined #bitcoin-core-dev
4302015-10-24T21:30:19 *** Thireus1 has quit IRC
4312015-10-24T21:44:25 *** rcutmore has joined #bitcoin-core-dev
4322015-10-24T21:59:11 *** gmaxwell is now known as gmaxweIl
4332015-10-24T21:59:47 *** gmaxweIl is now known as gmaxwell
4342015-10-24T22:05:39 *** nkuttler has joined #bitcoin-core-dev
4352015-10-24T22:10:33 *** d_t has joined #bitcoin-core-dev
4362015-10-24T22:10:41 *** droark has joined #bitcoin-core-dev
4372015-10-24T22:16:01 *** d_t has quit IRC
4382015-10-24T22:46:06 *** ParadoxSpiral_ has quit IRC
4392015-10-24T22:46:47 *** d_t has joined #bitcoin-core-dev
4402015-10-24T22:57:43 *** droark has quit IRC
4412015-10-24T22:59:39 *** droark has joined #bitcoin-core-dev
4422015-10-24T23:07:36 *** d_t has quit IRC
4432015-10-24T23:57:59 *** dgenr8 has quit IRC
4442015-10-24T23:58:13 *** dgenr8 has joined #bitcoin-core-dev