14:06:20 #startmeeting
14:06:20 Meeting started Mon Jun 6 14:06:20 2011 UTC. The chair is galderz. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:06:20 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:06:56 right, who goes first?
14:07:11 I'll go
14:07:15 mine is simple
14:07:18 #topic vblagoje
14:07:53 i believe that ISPN-83 is not reproducible any more, i just wanted to resolve ISPN-1153 before committing a small fix for ISPN-83. I confirmed ISPN-83 not being reproducible with mlinhard and his original test that raised it
14:07:55 jira [ISPN-83] Remove dependency on JGroups FLUSH [Reopened (Unresolved) Task, Critical, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-83
14:07:56 jira [ISPN-1153] Validate relationship between transport related timeouts [Open (Unresolved) Task, Major, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-1153
14:08:14 also as far as move to confluence goes
14:08:28 I talked to Vlastimil and Mark
14:08:49 and they confirmed my fears that this move has to be a manual move of 99 of our documents
14:08:53 so that sucks
14:09:09 Vlastimil is still looking around for some plugin to do this automatically
14:09:33 so let's wait for this week and his report back, otherwise this is going to be a manual move
14:09:44 vblagoje: so their migration tool didnt work for us?
14:10:07 i don't think they have a plugin for sbs content
14:10:23 they told me to wait and they'll report back
14:11:45 from my two week list given by manik I also had a task that I was supposed to work on with mmarkus but we didn't get there yet i think
14:11:49 vblagoje: ok, let me know as you hear more from them
14:11:56 so that's it from me
14:11:59 ok manik
14:12:28 vblagoje: so you're still working on helping mmarkus benchmark stuff around JTA?
14:13:02 yeah that is the task i had in mind
14:13:03 manik: I'm almost there writing radargun support for tx
14:13:06 manik
14:13:34 and after that I'll just run the benchmark on ATL cluster
14:13:49 mmarkus: so you are doing it?
14:14:12 manik: yes, as I am
14:15:16 manik: shall I take it from here?
14:15:36 what are you doing this week vblagoje?
14:15:49 mmarkus: I thought you had a lot more on your plate. :)
14:16:06 mmarkus: see if vblagoje can take this off you
14:16:34 not sure galderz, have to look at the task pipeline
14:17:35 manik wanted me to look at ISPN-1065 but this is a more massive undertaking
14:17:36 jira [ISPN-1065] Use a better mechanism to parse config files [Open (Unresolved) Enhancement, Major, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-1065
14:19:02 ok mmarkus go now
14:19:41 #topic mmarkus
14:19:54 one thing would be the design around locking improvements
14:19:56 http://community.jboss.org/wiki/PossibleLockingImprovements
14:20:15 created JIRAs for each locking improvement
14:20:28 and added design proposal for each one
14:20:28 https://issues.jboss.org/browse/ISPN-1131
14:20:29 jira [ISPN-1131] Locking optimization: acquire (write) locks at prepare time [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1131
14:20:50 https://issues.jboss.org/browse/ISPN-1132
14:20:51 jira [ISPN-1132] Locking optimization: reorder lock acquisition to avoid deadlocks [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1132
14:20:58 https://issues.jboss.org/browse/ISPN-1137
14:20:59 jira [ISPN-1137] Locking optimization: only lock main data owner (dist only) [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1137
14:21:26 I also think that "5.
Replicated Keys & Values, non-replicated Locks" is a duplication of ISPN-1137
14:21:27 jira [ISPN-1137] Locking optimization: only lock main data owner (dist only) [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1137
14:21:39 waiting for sannegrinovero to confirm that
14:22:05 ah right..
14:22:29 also got some nice feedback around them, especially around ISPN-1132
14:22:30 jira [ISPN-1132] Locking optimization: reorder lock acquisition to avoid deadlocks [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1132
14:22:50 sorry my mind is on a totally different frequency today :)
14:23:03 mmarkus: ok, cool.
14:23:07 sannegrinovero: deadlock? :-) np
14:23:24 I think these might have a very important effect on the throughput
14:23:26 mmarkus: about benchmarking transactions, I presume you will also blog about this after you are done?
14:23:41 yes I'm risking a mind deadlock :/
14:24:05 also quite serious change in the architecture, so I'd like to take it with Jonathan before writing the code
14:24:40 mmarkus: yes, but we aren't going to be writing any code around this for now, remember? ;) we still need to release 5.0
14:24:42 manik: I sent you an email re: a face2face with jonathan
14:24:49 yes from last jonathan's email it looked like he was worried we were relaxing consistency, but I thought we were actually talking about introducing more consistency
14:25:18 mmarkus sannegrinovero I'll chime in on that email thread soon, but lets take that offline
14:25:26 for now
14:25:27 sannegrinovero: I think his comment was around improvement #5 (which imo should not be discussed)
14:26:08 mmarkus: any thoughts on documenting/blogging about different tx setups?
14:26:13 manik: I know.
just that from experience these need to be planned ahead :-)
14:26:37 manik: yes, once I'm done with the benchmarking that's what I'll do
14:26:54 mmarkus: agreed, but as I said lets take that offline
14:27:19 I did some work on that, then switched to working on regular JIRAs as galderz pointed out we still have about 25 open
14:27:35 mmarkus: also, any progress on figuring out why some tests are ignored by maven?
14:27:40 ISPN-1108 ?
14:27:41 jira [ISPN-1108] Test(s) ignored by maven [Open (Unresolved) Bug, Blocker, Mircea Markus] https://issues.jboss.org/browse/ISPN-1108
14:27:52 not really
14:28:10 Could you prioritise that?
14:28:12 I think it would be better if vblagoje can take that away from me instead of tx benchmarking
14:28:18 Ok
14:28:25 if vblagoje will accept it
14:28:26 :)
14:28:38 vblagoje: WDYT?
14:29:18 ( I will take silence as a yes ;) )
14:29:37 mmarkus: so then you are full steam ahead on tx benchmarking and blogging about it?
14:29:56 also had a code walk through with dberindei around the new rebalancing code and tx failover
14:30:34 some interesting findings. I'll raise an issue we found and dberindei also has some things to look at
14:31:12 manik: yes
14:31:30 mmarkus: good stuff.
14:31:38 galderz: you next?
14:31:44 sure
14:31:48 #topic galder
14:32:07 last week was a short one for me with thursday/friday bank holidays
14:32:29 so, last week I focused on fixing as much CR4 stuff as I could
14:32:54 i fixed a fair few and went over other people's list seeing what could be done in 1/2 days
14:33:15 manik: cloudbees is up and running, i just need to document what i did to configure it...etc
14:33:28 but I think we're ready to close the previous hudson installation
14:33:48 me and dberindei have also worked at times on fixing as many of the testsuite failures
14:33:57 and we're down to less than 20 in the entire suite
14:33:58 galderz: awesome
14:34:05 think core is down to 6 failures
14:34:09 Oh?
14:34:21 I thought apart from core the other modules were stable
14:34:27 which other modules have failures?
14:34:39 well, there's a bit of everything
14:34:46 cassandra module not running at all
14:34:57 i've pinged tristan and sannegrinovero said he'd bug him
14:35:17 the other modules have some random failures which so far have been hard to track down
14:35:55 Ok.
14:36:08 apart from that, mmarkus helped release CR4
14:36:15 Let me know when you think I can shut down hudson.infinispan.org and redirect the URL to cloudbees
14:36:20 had some issues with uploading to nexus
14:36:27 Yes I saw the emails
14:36:29 manik: i think we can do that now
14:36:34 ah
14:36:48 apart from that went over a month's worth of user forum posts
14:37:07 galderz: great! Need to make sure we don't drop the ball there. :)
14:37:23 some interesting stuff discovered, such as that user that got frustrated with our clustering examples which did not appear to work: repl enabled but state transfer disabled
14:37:37 Yes I saw the JIRA
14:37:40 manik: for sure! keeping a closer eye these coming weeks with final not too far
14:37:59 that's about it - a lot of small things that had to be done
14:38:00 ah
14:38:04 Lets chat on EDG related stuff offline, but any updates on Hibernate 2LC performance?
14:38:11 Stuff that Andy has been doing, etc?
14:38:12 manik:
14:38:14 no updates
14:38:30 Ok. I'll assume that no news is good news :)
14:38:42 someone has reported a mem leak today though, looking into it today or tomorrow
14:38:53 galderz: if you're done, sannegrinovero feel like going next?
14:39:00 galderz: ok
14:39:02 actually, trying to get the 2LC testsuite to run with synchronization but have a failure
14:39:14 for this week: more of the same i think, manik let's discuss offline
14:39:33 galderz: ok, pls take the sync failure up with mmarkus offline
14:39:50 galderz: treat that as a prio since the AS7 releases will depend on it
14:39:58 sync failure?
14:40:10 "actually, trying to get the 2LC testsuite to run with synchronization but have a failure"
14:40:17 ah yeah :)
14:40:19 :)
14:40:34 sure, going through logs now
14:40:36 #topic sannegrinovero
14:40:43 ok,
14:41:14 so I think nobody knows it as Israel keeps writing me in private; I think he feels uncomfortable in asking many trivial questions
14:41:35 so, I finally got another preview of ISPN-200 to read
14:41:35 jira [ISPN-200] Distributed queries [Open (Unresolved) Feature Request, Major, Sanne Grinovero] https://issues.jboss.org/browse/ISPN-200
14:42:05 this is a major rewrite of his first experiments, as I had a lot of feedback for him and he needed some help after the Query module rewrite
14:42:54 apart from that, I'm inspecting a forum report about the Lucene Directory having issues with locks when it's used with passivation (LIRS)
14:43:18 this is a Hibernate Search user, and I'm preparing a test about it.
14:43:38 I guess that will link back to our discussions on locking.
14:43:46 sannegrinovero: ok.
14:43:57 sannegrinovero: how's Israel's patch looking?
14:44:12 it's not ready yet, but the concepts are there.
14:44:21 right direction?
14:44:21 sadly, we're back to the point that Sorting is an issue, again
14:44:58 so I don't know yet, I hope it's some small bug, but I have to review all the design back again as it's totally different.
14:45:27 I guess it will take a couple of days of mine, and no less than two weeks from him.
14:46:18 for the rest, I've been helping in chats with the cloud-tm people about integrating OGM in TorqueBox; should be less this week as emmanuel is back, so I'll spend all my time on the HQL parser
14:46:41 that's it for me
14:49:30 sannegrinovero: ok, thanks
14:49:35 dberindei: next?
14:49:41 ok
14:50:10 I worked mainly on ISPN-1123 with galderz and on ISPN-1106
14:50:11 I see you've done a lot more refactoring around ISPN-1000
14:50:11 jira [ISPN-1123] Stabilise test suite [Open (Unresolved) Task, Blocker, Dan Berindei] https://issues.jboss.org/browse/ISPN-1123
14:50:12 jira [ISPN-1106] Rehashing into a running cluster causes shared processing lock contention [Resolved (Done) Bug, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1106
14:50:13 jira [ISPN-1000] PUSH based rehashing [Open (Unresolved) Feature Request, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1000
14:50:50 manik: yes, I've committed it for 1106 but it's related to 1000 of course
14:51:09 dberindei: I presume it's all looking more stable now?
14:52:01 ISPN-1106 had a test program attached, it's running without any error now and I'm thinking of adding it to the test suite as a stress test
14:52:03 jira [ISPN-1106] Rehashing into a running cluster causes shared processing lock contention [Resolved (Done) Bug, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1106
14:52:19 dberindei: ok, cool.
14:52:20 I was also able to re-enable OngoingTxAndJoinTest
14:52:29 excellent
14:52:49 since that is a pretty good stress test in itself with pretty unfortunate timings. :)
14:53:04 dberindei: how about overall test suite stability?
14:53:08 but there are still some issues related to the fact that transactions are not blocked while applying state
14:53:47 manik: I'm afraid I'm still seeing test failures from time to time in the rehash tests
14:54:05 maybe one test failing per run, but not always the same
14:54:14 dberindei: but this is only with the rehashing? The rest of the test suite is more or less stable?
14:55:08 morning!
14:55:24 manik: I don't have any other failures, but I've been working only on rehashing and I haven't set a goal to re-enable all the disabled tests
14:55:48 I'm playing around with a ruby client for hotrod, and I have a question about the entry_version value returned from getWithVersion
14:55:53 http://community.jboss.org/wiki/HotRodProtocol#entry_version_8_bytes
14:56:04 should this be unpacked as a 64-bit unsigned int?
14:57:32 dberindei: ok.
14:57:41 manik: I'm still going through the rehashing + transactions code with mmarkus, after that I'll comb through the disabled tests again
14:57:41 dberindei: did you get a chance to benchmark pete's vnode impl?
14:57:52 lanceball: it's a signed 64 bit int - a java long
14:57:58 manik: nope, sorry
14:58:00 galderz: thanks!
14:58:22 ok. Can that be next on your plate, alongside ongoing test suite stabilisation?
14:58:37 sure manik
14:58:43 thanks
14:59:02 ah, also dberindei and sannegrinovero - did you get a chance to look at Alkpone's off-heap container?
14:59:23 Hello :)
14:59:29 manik, no, sorry.
14:59:33 hi Alkpone :)
14:59:33 Alkpone: hi :)
14:59:50 Adium reacts when you write my name in IRC
14:59:54 quite good :)
15:00:02 manik: me neither manik
15:00:05 about that, I'm mostly wondering how you think it should be integrated in the build, test and release process ?
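[Editor's note on the entry_version exchange at 14:55:48 to 14:57:52] The answer given in the log is that the 8-byte entry_version is a signed 64-bit integer, i.e. a Java long. A minimal sketch of what that means for a client decoding the field follows. The class and method names (EntryVersionDemo, decodeEntryVersion) are hypothetical, and big-endian (network) byte order is an assumption that is not stated in the log; check it against the Hot Rod protocol page linked above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of any Hot Rod client library.
final class EntryVersionDemo {

    // Decode the 8-byte entry_version field as a signed 64-bit value (a Java long).
    // Assumption: the bytes arrive in big-endian (network) order.
    static long decodeEntryVersion(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        byte[] one = {0, 0, 0, 0, 0, 0, 0, 1};
        byte[] allOnes = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF,
                          (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(decodeEntryVersion(one));     // prints 1
        System.out.println(decodeEntryVersion(allOnes)); // prints -1 (signed), not 2^64 - 1
    }
}
```

The second call shows why the signed/unsigned distinction matters: the same eight bytes read as an unsigned integer would be 18446744073709551615, so a client that unpacks the field as unsigned would disagree with the Java side for any version with the top bit set.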
15:00:06 Feel free to contact me when you have time to look at my code
15:00:09 still pretty alpha
15:00:30 alban.seurat@me.com / albanseurat.com
15:01:05 Alkpone, careful with your email here, it's a public and logged list ;)
15:01:05 sannegrinovero: I haven't thought too hard about that yet - something we'd need to discuss
15:01:19 oh s***
15:01:34 manik, that would be a good first step, so I know how to build and start it :)
15:01:43 normally
15:01:45 Alkpone can help with that
15:01:52 it works on Linux by just taking up my github
15:02:00 it's a fork
15:02:15 and you need to have GCC toolchain installed on your linux box
15:02:22 (indeed, there is native code)
15:02:56 it's integrated into maven build
15:03:19 using the gnu toolchain maven plugin
15:03:29 I've been pretty busy lately
15:03:38 so i didn't have time to work on that
15:03:40 ah, didn't notice it was integrated with maven. that's pretty nice.
15:03:43 but it's still in my todo list
15:04:00 ah, ok
15:04:07 no no the integration is there
15:04:13 but i need to continue to work on the task
15:04:19 to improve it
15:04:21 you can build it
15:04:35 I'm still missing the iterator around the DataManager
15:04:46 Not simple to do JNI with iterator
15:05:06 but the build is "automatic"
15:05:13 Alkpone: have you thought about using NIO's ByteBuffer.allocateDirect() instead of native code?
15:05:14 you take my github
15:05:23 Yes I have
15:05:48 We have discuss this approach with manik
15:05:53 +ed
15:06:09 but it seems more hack than using plain native code
15:06:26 I've even tried to use JAVA Mmap API which is pretty ughy
15:06:31 (File.mapFile something)
15:06:38 ugly
15:07:34 Alkpone, we're not avoiding MMap because it's ugly? In Lucene's code, it's the fastest implementation available.
15:07:47 but of course, the requirements are very different there
15:07:50 I'm using it ;)
15:07:55 but in the native code
15:07:56 ah, cool
15:08:09 the Java API on top of it is just a mess
15:08:15 i'm using the C mmap API
15:08:30 you can't ask for plain memory with mmap in the JAVA API
15:08:35 it HAS to be a file
15:08:52 and for performance, we can't have a file for the cache
15:09:12 it will just be horrible every time the OS syncs the mmap with the associated file
15:09:34 using the C mmap i can allocate pure memory using mmap (instead of sbrk/malloc)
15:09:59 Am i clear enough ?
15:12:57 manik: are we finished in terms of updates?
15:15:58 galderz: yes
15:16:08 #endmeeting
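
[Editor's note on the off-heap discussion at 15:05:13 to 15:09:34] The alternative raised at 15:05:13, NIO's ByteBuffer.allocateDirect(), can be illustrated with a minimal sketch. It is not Alkpone's container and says nothing about how that code works; the class and method names (OffHeapSketch, storeOffHeap, readBack) are hypothetical. A direct buffer asks the JVM for memory that may live outside the garbage-collected heap and needs no backing file, whereas the Java memory-mapping call (FileChannel.map) maps a region of a file, which is the limitation described at 15:08:30.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of storing a value in a direct (off-heap) buffer.
final class OffHeapSketch {

    // Copy a value into a freshly allocated direct buffer and prepare it for reading.
    static ByteBuffer storeOffHeap(byte[] value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(value.length);
        buf.put(value);
        buf.flip(); // position = 0, limit = value.length
        return buf;
    }

    // Copy the stored bytes back onto the heap without disturbing the stored buffer.
    static byte[] readBack(ByteBuffer stored) {
        byte[] out = new byte[stored.remaining()];
        stored.duplicate().get(out); // duplicate() has its own position and limit
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer stored = storeOffHeap("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(stored.isDirect()); // prints true
        System.out.println(new String(readBack(stored), StandardCharsets.UTF_8)); // prints hello
    }
}
```

A sketch like this leaves out everything that makes a real off-heap container hard (freeing memory deterministically, indexing entries, iteration), which is consistent with the concern in the log that this route felt more like a hack than plain native code.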