14:06:20 <galderz> #startmeeting
14:06:20 <jbott> Meeting started Mon Jun 6 14:06:20 2011 UTC. The chair is galderz. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:06:20 <jbott> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:06:56 <galderz> right, who goes first?
14:07:11 <vblagoje> I'll go
14:07:15 <vblagoje> mine is simple
14:07:18 <galderz> #topic vblagoje
14:07:53 <vblagoje> i believe that ISPN-83 is not reproducible any more, i just wanted to resolve ISPN-1153 before committing a small fix for ISPN-83. I confirmed ISPN-83 not being reproducible with mlinhard and his original test that raised it
14:07:55 <jbossbot> jira [ISPN-83] Remove dependency on JGroups FLUSH [Reopened (Unresolved) Task, Critical, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-83
14:07:56 <jbossbot> jira [ISPN-1153] Validate relationship between transport related timeouts [Open (Unresolved) Task, Major, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-1153
14:08:14 <vblagoje> also as far as move to confluence goes
14:08:28 <vblagoje> I talked to Vlastimil and Mark
14:08:49 <vblagoje> and they confirmed my fears that this move has to be manual move of 99 of our documents
14:08:53 <vblagoje> so that sucks
14:09:09 <vblagoje> Vlastimil is still looking around for some plugin to do this automatically
14:09:33 <vblagoje> so lets wait for this week and his report back, otherwise this is going to be a manual move
14:09:44 <manik> vblagoje: so their migration tool didnt work for us?
14:10:07 <vblagoje> i don't think they have a plugin for sbs content
14:10:23 <vblagoje> they told me to wait and they'll report back
14:11:45 <vblagoje> from my two week list given by manik I also had a task that I was supposed to work on with mmarkus but we didn't get there yet i think
14:11:49 <manik> vblagoje: ok, let me know as you hear more from them
14:11:56 <vblagoje> so that's it from me
14:11:59 <vblagoje> ok manik
14:12:28 <manik> vblagoje: so you're still working on helping mmarkus benchmark stuff around JTA?
14:13:02 <vblagoje> yeah that is the task i had in mind
14:13:03 <mmarkus> manik: I'm almost there writing radargun support for tx
14:13:06 <vblagoje> manik
14:13:34 <mmarkus> and after that I'll just run the benchmark on ATL cluster
14:13:49 <manik> mmarkus: so you are doing it?
14:14:12 <mmarkus> manik: yes, as I am
14:15:16 <mmarkus> manik: shall I take it from here?
14:15:36 <galderz> what are you doing this week vblagoje?
14:15:49 <manik> mmarkus: I thought you had a lot more on your plate. :)
14:16:06 <manik> mmarkus: see if vblagoje can take this off you
14:16:34 <vblagoje> not sure galderz, have to look at the task pipeline
14:17:35 <vblagoje> manik wanted me to look at ISPN-1065 but this is a more massive undertaking
14:17:36 <jbossbot> jira [ISPN-1065] Use a better mechanism to parse config files [Open (Unresolved) Enhancement, Major, Vladimir Blagojevic] https://issues.jboss.org/browse/ISPN-1065
14:19:02 <vblagoje> ok mmarkus go now
14:19:41 <galderz> #topic mmarkus
14:19:54 <mmarkus> one thing would be the design around locking improvements
14:19:56 <mmarkus> http://community.jboss.org/wiki/PossibleLockingImprovements
14:20:15 <mmarkus> created JIRAs for each locking improvement
14:20:28 <mmarkus> and added design proposal for each one
14:20:28 <mmarkus> https://issues.jboss.org/browse/ISPN-1131
14:20:29 <jbossbot> jira [ISPN-1131] Locking optimization: acquire (write) locks at prepare time [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1131
14:20:50 <mmarkus> https://issues.jboss.org/browse/ISPN-1132
14:20:51 <jbossbot> jira [ISPN-1132] Locking optimization: reorder lock acquisition to avoid deadlocks [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1132
14:20:58 <mmarkus> https://issues.jboss.org/browse/ISPN-1137
14:20:59 <jbossbot> jira [ISPN-1137] Locking optimization: only lock main data owner (dist only) [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1137
14:21:26 <mmarkus> I also think that "5. Replicated Keys & Values, non-replicated Locks" is a duplication of ISPN-1137
14:21:27 <jbossbot> jira [ISPN-1137] Locking optimization: only lock main data owner (dist only) [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1137
14:21:39 <mmarkus> waiting for sannegrinovero to confirm that
14:22:05 <sannegrinovero> ah right..
14:22:29 <mmarkus> also got some nice feedback around them, especially around ISPN-1132
14:22:30 <jbossbot> jira [ISPN-1132] Locking optimization: reorder lock acquisition to avoid deadlocks [Open (Unresolved) Feature Request, Major, Mircea Markus] https://issues.jboss.org/browse/ISPN-1132
14:22:50 <sannegrinovero> sorry my mind is on a totally different frequency today :)
14:23:03 <manik> mmarkus: ok, cool.
14:23:07 <mmarkus> sannegrinovero: deadlock? :-) np
14:23:24 <mmarkus> I think these might have a very important effect on the throughput
14:23:26 <manik> mmarkus: about benchmarking transactions, I presume you will also blog about this after you are done?
14:23:41 <sannegrinovero> yes I'm risking a mind deadlock :/
14:24:05 <mmarkus> also quite serious change in the architecture, so I'd like to take it with Jonathan before writing the code
14:24:40 <manik> mmarkus: yes, but we aren't going to be writing any code around this for now, remember? ;) we still need to release 5.0
14:24:42 <mmarkus> manik: I sent you an email re: a face2face with jonathan
14:24:49 <sannegrinovero> yes from last jonathan's email it looked like he was worried we were relaxing consistency, but I thought we were actually talking about introducing more consistency
14:25:18 <manik> mmarkus sannegrinovero I'll chime in on that email thread soon, but lets take that offline
14:25:26 <manik> for now
14:25:27 <mmarkus> sannegrinovero: I think his comment was around improvement #5 (which imo should not be discussed)
14:26:08 <manik> mmarkus: any thoughts on documenting/blogging about different tx setups?
14:26:13 <mmarkus> manik: I know. just that from experience these need to be planned ahead :-)
14:26:37 <mmarkus> manik: yes, once I'll be done with the benchmarking that's what I'll do
14:26:54 <manik> mmarkus: agreed, but as I said lets take that offline
14:27:19 <mmarkus> I did some work on that, then switched working on regular JIRAs as galderz pointed out we still have about 25 opened
14:27:35 <manik> mmarkus: also, any progress on figuring out why some tests are ignored by maven?
14:27:40 <manik> ISPN-1108 ?
14:27:41 <jbossbot> jira [ISPN-1108] Test(s) ignored by maven [Open (Unresolved) Bug, Blocker, Mircea Markus] https://issues.jboss.org/browse/ISPN-1108
14:27:52 <mmarkus> not really
14:28:10 <manik> Could you prioritise that?
14:28:12 <mmarkus> I think it would be better if vblagoje can take that away from me instead of tx benchmarking
14:28:18 <manik> Ok
14:28:25 <manik> if vblagoje will accept it
14:28:26 <manik> :)
14:28:38 <manik> vblagoje: WDYT?
14:29:18 <manik> ( I will take silence as a yes ;) )
14:29:37 <manik> mmarkus: so then you are full steam ahead on tx benchmarking and blogging about it?
14:29:56 <mmarkus> also had a code walk through with dberindei around the new rebalancing code and tx failover
14:30:34 <mmarkus> some interesting findings. I'll raise an issue we found and dberindei also has some things to look at
14:31:12 <mmarkus> manik: yes
14:31:30 <manik> mmarkus: good stuff.
14:31:38 <manik> galderz: you next?
14:31:44 <galderz> sure
14:31:48 <galderz> #topic galder
14:32:07 <galderz> last week was a short one for me with thursday/friday bank holidays
14:32:29 <galderz> so, last week I focused on fixing as much CR4 stuff as I could
14:32:54 <galderz> i fixed a fair few and went over other people's list seeing what could be done in 1/2 days
14:33:15 <galderz> manik: cloudbees is up and running, i just need to document what i did to configure it...etc
14:33:28 <galderz> but I think we're ready to close the previous hudson installation
14:33:48 <galderz> me and dberindei have also worked at times on fixing as many of the testsuite failures
14:33:57 <galderz> and we're down to less than 20 in the entire suite
14:33:58 <manik> galderz: awesome
14:34:05 <galderz> think core is down to 6 failures
14:34:09 <manik> Oh?
14:34:21 <manik> I thought apart from core the other modules were stable
14:34:27 <manik> which other modules have failures?
14:34:39 <galderz> well, there's a bit of everything
14:34:46 <galderz> cassandra module not running at all
14:34:57 <galderz> i've pinged tristan and sannegrinovero said he'd bug him
14:35:17 <galderz> the other modules have some random failures which so far have been hard to track down
14:35:55 <manik> Ok.
14:36:08 <galderz> apart from that, mmarkus helped release CR4
14:36:15 <manik> Let me know when you think I can shut down hudson.infinispan.org and redirect the URL to cloudbees
14:36:20 <galderz> had some issues with uploading to nexus
14:36:27 <manik> Yes I saw the emails
14:36:29 <galderz> manik: i think we can do that now
14:36:34 <galderz> ah
14:36:48 <galderz> apart from that went over a month's worth of user forum posts
14:37:07 <manik> galderz: great! Need to make sure we don't drop the ball there. :)
14:37:23 <galderz> some interesting stuff discovered, such as that user that got frustrated with our clustering examples which did not appear to work: repl enabled but state transfer disabled
14:37:37 <manik> Yes I saw the JIRA
14:37:40 <galderz> manik: for sure! keeping a closer eye this coming weeks with final not too far
14:37:59 <galderz> that's about it - a lot of small things that had to be done
14:38:00 <galderz> ah
14:38:04 <manik> Lets chat on EDG related stuff offline, but any updates on Hibernate 2LC performance?
14:38:11 <manik> Stuff that Andy has been doing, etc?
14:38:12 <galderz> manik:
14:38:14 <galderz> no updates
14:38:30 <manik> Ok. I'll assume that no news is good news :)
14:38:42 <galderz> someone has reported a mem leak today though, looking into it today or tomorrow
14:38:53 <manik> galderz: if you're done, sannegrinovero feel like going next?
14:39:00 <manik> galderz: ok
14:39:02 <galderz> actually, trying to get the 2LC testsuite to run with synchronization but have a failure
14:39:14 <galderz> for this week: more of the same i think, manik let's discuss offline
14:39:33 <manik> galderz: ok, pls take the sync failure up with mmarkus offline
14:39:50 <manik> galderz: treat that as a prio since the AS7 releases will depend on it
14:39:58 <galderz> sync failure?
14:40:10 <manik> "actually, trying to get the 2LC testsuite to run with synchronization but have a failure"
14:40:17 <galderz> ah yeah :)
14:40:19 <manik> :)
14:40:34 <galderz> sure, going through logs now
14:40:36 <galderz> #topic sannegrinovero
14:40:43 <sannegrinovero> ok,
14:41:14 <sannegrinovero> so I think nobody knows it as Israel keeps writing me in private; I think he feels uncomfortable in asking many trivial questions
14:41:35 <sannegrinovero> so, I finally got another preview of ISPN-200 to read
14:41:35 <jbossbot> jira [ISPN-200] Distributed queries [Open (Unresolved) Feature Request, Major, Sanne Grinovero] https://issues.jboss.org/browse/ISPN-200
14:42:05 <sannegrinovero> this is a major rewrite of his first experiments, as I had a lot of feedback to him and he needed some help after the Query module rewrite
14:42:54 <sannegrinovero> apart from that, I'm inspecting a forum report about the Lucene Directory having issues with locks when it's used with passivation (LIRS)
14:43:18 <sannegrinovero> this is a Hibernate Search user, and I'm preparing a test about it.
14:43:38 <sannegrinovero> I guess that will link back to our discussions on locking.
14:43:46 <manik> sannegrinovero: ok.
14:43:57 <manik> sannegrinovero: how's Israel's patch looking?
14:44:12 <sannegrinovero> it's not ready yet, but the concepts are there.
14:44:21 <manik> right direction?
14:44:21 <sannegrinovero> sadly, we're back to the point that Sorting is an issue, again
14:44:58 <sannegrinovero> so I don't know yet, I hope it's some small bug, but I have to review all the design back again as it's totally different.
14:45:27 <sannegrinovero> I guess it will take a couple of days of mine, and no less than two weeks from him.
14:46:18 <sannegrinovero> for the rest, I've been helping in chats with the cloud-tm people about integrating OGM in TorqueBox; should be less this week as emmanuel is back, so I'll spend all my time on the HQL parser
14:46:41 <sannegrinovero> that's it for me
14:49:30 <manik> sannegrinovero: ok, thanks
14:49:35 <manik> dberindei: next?
14:49:41 <dberindei> ok
14:50:10 <dberindei> I worked mainly on ISPN-1123 with galderz and on ISPN-1106
14:50:11 <manik> I see you've done a lot more refactoring around ISPN-1000
14:50:11 <jbossbot> jira [ISPN-1123] Stabilise test suite [Open (Unresolved) Task, Blocker, Dan Berindei] https://issues.jboss.org/browse/ISPN-1123
14:50:12 <jbossbot> jira [ISPN-1106] Rehashing into a running cluster causes shared processing lock contention [Resolved (Done) Bug, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1106
14:50:13 <jbossbot> jira [ISPN-1000] PUSH based rehashing [Open (Unresolved) Feature Request, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1000
14:50:50 <dberindei> manik: yes, I've committed it for 1106 but it's related to 1000 of course
14:51:09 <manik> dberindei: I presume it's all looking more stable now?
14:52:01 <dberindei> ISPN-1106 had a test program attached, it's running without any error now and I'm thinking of adding it to the test suite as a stress test
14:52:03 <jbossbot> jira [ISPN-1106] Rehashing into a running cluster causes shared processing lock contention [Resolved (Done) Bug, Major, Dan Berindei] https://issues.jboss.org/browse/ISPN-1106
14:52:19 <manik> dberindei: ok, cool.
14:52:20 <dberindei> I was also able to re-enable OngoingTxAndJoinTest
14:52:29 <manik> excellent
14:52:49 <manik> since that is a pretty good stress test in itself with pretty unfortunate timings. :)
14:53:04 <manik> dberindei: how about overall test suite stability?
14:53:08 <dberindei> but there are still some issues related to the fact that transactions are not blocked while applying state
14:53:47 <dberindei> manik: I'm afraid I'm still seeing test failures from time to time in the rehash tests
14:54:05 <dberindei> maybe one test failing per run, but not always the same
14:54:14 <manik> dberindei: but this is only with the rehashing? The rest of the test suite is more or less stable?
14:55:08 <lanceball> morning!
14:55:24 <dberindei> manik: I don't have any other failures, but I've been working only on rehashing and I haven't set a goal to re-enable all the disabled tests
14:55:48 <lanceball> I'm playing around with a ruby client for hotrod, and I have a question about the entry_version value returned from getWithVersion
14:55:53 <lanceball> http://community.jboss.org/wiki/HotRodProtocol#entry_version_8_bytes
14:56:04 <lanceball> should this be unpacked as a 64-bit unsigned int?
14:57:32 <manik> dberindei: ok.
14:57:41 <dberindei> manik: I'm still going through the rehashing + transactions code with mmarkus, after that I'll comb through the disabled tests again
14:57:41 <manik> dberindei: did you get a chance to benchmark pete's vnode impl?
14:57:52 <galderz> lanceball: it's a signed 64 bit int - a java long
14:57:58 <dberindei> manik: nope, sorry
14:58:00 <lanceball> galderz: thanks!
14:58:22 <manik> ok. Can that be next on your plate, alongside ongoing test suite stabilisation?
14:58:37 <dberindei> sure manik
14:58:43 <manik> thanks
14:59:02 <manik> ah, also dberindei and sannegrinovero - did you get a chance to look at Alkpone's off-heap container?
14:59:23 <Alkpone> Hello :)
14:59:29 <sannegrinovero> manik, no, sorry.
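[Editor's sketch on the entry_version exchange above (14:55:48 to 14:58:00). galderz's answer is that the 8-byte entry_version is a signed 64-bit integer, i.e. a Java long. A minimal illustration of what that means when decoding the raw bytes might look like the following. The class and method names are hypothetical, and big-endian byte order is an assumption (it is simply ByteBuffer's default); neither comes from the log.]

```java
import java.nio.ByteBuffer;

public class EntryVersionDecode {

    // Hypothetical helper: interpret 8 raw bytes as a signed 64-bit value.
    // Assumption: the bytes arrive in big-endian order (ByteBuffer's default).
    static long decodeEntryVersion(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("entry_version must be 8 bytes");
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    public static void main(String[] args) {
        byte[] one = {0, 0, 0, 0, 0, 0, 0, 1};
        byte[] allOnes = {-1, -1, -1, -1, -1, -1, -1, -1};

        // Small values look the same whether read as signed or unsigned.
        System.out.println(decodeEntryVersion(one));

        // With the top bit set, the signed reading is negative; an unsigned
        // reading of the same bytes would be a large positive number instead.
        System.out.println(decodeEntryVersion(allOnes));
    }
}
```

[A client in another language would need the equivalent signed 64-bit unpack rather than an unsigned one, otherwise versions with the top bit set would not compare equal to what a Java client sees.]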
14:59:33 <sannegrinovero> hi Alkpone :)
14:59:33 <manik> Alkpone: hi :)
14:59:50 <Alkpone> Adium react when you write my name in IRC
14:59:54 <Alkpone> quite good :)
15:00:02 <dberindei> manik: me neither manik
15:00:05 <sannegrinovero> about that, I'm mostly wondering how you think it should be integrated in the build, test and release process ?
15:00:06 <Alkpone> Feel free to contact me when you will have time to look at my code
15:00:09 <Alkpone> still pretty alpha
15:00:30 <Alkpone> alban.seurat@me.com / albanseurat.com
15:01:05 <sannegrinovero> Alkpone, careful with your email here, it's a public and logged list ;)
15:01:05 <manik> sannegrinovero: I haven't thought too hard about that yet - something we'd need to discuss
15:01:19 <Alkpone> oh s***
15:01:34 <sannegrinovero> manik, that would be a good first step, so I know how to build and start it :)
15:01:43 <Alkpone> normally
15:01:45 <manik> Alkpone can help with that
15:01:52 <Alkpone> it works on Linux by just taking up my github
15:02:00 <Alkpone> it's a fork
15:02:15 <Alkpone> and you need to have GCC toolchain installed on your linux box
15:02:22 <Alkpone> (indeed, there is native code)
15:02:56 <Alkpone> it's integrated into maven build
15:03:19 <Alkpone> using the gnu toolchain maven plugin
15:03:29 <Alkpone> I've been pretty busy lately
15:03:38 <Alkpone> so i didn't have time to work on that
15:03:40 <sannegrinovero> ah, didn't notice it was integrated with maven. that's pretty nice.
15:03:43 <Alkpone> but it's still in my todo list
15:04:00 <sannegrinovero> ah, ok
15:04:07 <Alkpone> no no the integration is there
15:04:13 <Alkpone> but i need to continue to work on the task
15:04:19 <Alkpone> to improve it
15:04:21 <Alkpone> you can build it
15:04:35 <Alkpone> I'm still missing the iterator around the DataManager
15:04:46 <Alkpone> Not simple to do JNI with iterator
15:05:06 <Alkpone> but the build is "automatic"
15:05:13 <dberindei> Alkpone: have you thought about using NIO's Bytebuffer.allocateDirect() instead of native code?
15:05:14 <Alkpone> you take my github
15:05:23 <Alkpone> Yes I have
15:05:48 <Alkpone> We have discuss this approach with manik
15:05:53 <Alkpone> +ed
15:06:09 <Alkpone> but it seems more hack than using plain native code
15:06:26 <Alkpone> I've even tried to use JAVA Mmap API which is pretty ughy
15:06:31 <Alkpone> (File.mapFile something)
15:06:38 <Alkpone> ugly
15:07:34 <sannegrinovero> Alkpone, we're not avoiding MMap because it's ugly? In Lucene's code, it's the fastest implementation available.
15:07:47 <sannegrinovero> but of course, the requirements are very different there
15:07:50 <Alkpone> I'm using it ;)
15:07:55 <Alkpone> but in the native code
15:07:56 <sannegrinovero> ah, cool
15:08:09 <Alkpone> the Java API on top of it is just a mess
15:08:15 <Alkpone> i'm using the C mmap API
15:08:30 <Alkpone> you can't ask for a memory with mmap in the JAVA API
15:08:35 <Alkpone> it HAS to be a file
15:08:52 <Alkpone> and for performance, we can't have a file for the cache
15:09:12 <Alkpone> it will just be horrible every time the OS syncs the mmap with the associated file
15:09:34 <Alkpone> using the C mmap i can allocate pure memory using mmap (instead of sbrk/malloc)
15:09:59 <Alkpone> Am i clear enough ?
15:12:57 <galderz> manik: are we finished in terms of updates?
15:15:58 <manik> galderz: yes
15:16:08 <galderz> #endmeeting
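
[Editor's sketch on the off-heap discussion above (15:05:13 to 15:09:34). dberindei asks about NIO's ByteBuffer.allocateDirect() as an alternative to native code. For readers unfamiliar with it, a minimal illustration of keeping a value outside the Java heap with a direct buffer might look like the following. This is not Alkpone's container, which per the log uses native code and the C mmap API; the class and method names here are hypothetical.]

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OffHeapSketch {

    // Hypothetical helper: copy a value into a direct (off-heap) buffer.
    static ByteBuffer storeOffHeap(byte[] value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(value.length);
        buf.put(value);
        buf.flip(); // switch from writing to reading: position 0, limit = length
        return buf;
    }

    // Hypothetical helper: copy the stored bytes back onto the heap.
    static byte[] readBack(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        // duplicate() shares the content but has its own position, so the
        // stored buffer can be read again later.
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] value = "cached-value".getBytes(StandardCharsets.UTF_8);
        ByteBuffer stored = storeOffHeap(value);

        System.out.println(stored.isDirect());
        System.out.println(new String(readBack(stored), StandardCharsets.UTF_8));
    }
}
```

[The trade-off raised in the log still applies: a direct buffer avoids JNI, while the file-backed mapping route in Java (FileChannel.map) requires a file, which is what Alkpone wants to avoid for a cache.]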