15:02:10 #startmeeting
15:02:10 Meeting started Tue May 31 15:02:10 2011 UTC. The chair is pilhuhn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:10 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:02:19 #topic Possible additions to RHQ to better support AS7 (and others)
15:02:35 Hello everyone, thanks for joining this meeting.
15:02:35 I want to basically discuss the child pages of
15:02:35 #link http://rhq-project.org/display/RHQ/RHQ_AS7
15:02:35 starting with an overview at
15:02:35 #link http://rhq-project.org/display/RHQ/Current+setup
15:02:54 I am showing here Domain mode, but most of what I am presenting applies
15:02:55 to standalone instances as well. Of course for standalone AS7 instances
15:02:55 there are no server groups that may span multiple hosts.
15:03:36 Standalone mode is so to say a sub-set of domain mode, where the domain controller is the only host controller, which is collapsed into the as server itself
15:04:14 The resource hierarchy in domain mode has some stuff at / and then the subsystems below the profile, while in standalone mode the subsystems are also at /
15:05:43 The first screen shot on http://rhq-project.org/display/RHQ/Current+setup shows how on the dependent host "pintsize", which has no domain controller, the server-group and socket binding group point to settings made on the domain controller
15:06:33 So if a user goes to this 'server-five' and wants to see the definition of 'standard-sockets', it must be possible to easily get to this definition.
15:06:37 "dependent host" -- you mean platform right?
15:06:50 "host" and "platform" have different meanings in this conversation as I understand it
15:06:59 (host is an AS7 thing, "pintsize" here is a RHQ platform)
15:07:00 I am currently using the term more in the as7 sense
15:08:40 In the left hand tree you see the domain controller and an entry "hosts" below it with "pintsize" and "snert" -- those two are hosts in the AS7 sense, where the names are defined on AS7. Actually, they can exist and be discovered in the Domain Controller without the platform that this "host pintsize" is on having an agent
15:09:20 ah. ok, I see "pintsize" as a child node under hosts. I get it.
15:09:21 So what I want to express here is that we need a notion of linking as described on
15:09:27 #link http://rhq-project.org/display/RHQ/Needed+-+Relations
15:09:58 where the admin can click on such a property item and end up in the correct (i.e. defining) tree node for "main-server-group"
15:11:16 #idea allow properties/traits to be hyperlinks to point to other resources
15:11:36 i think that this "click on a property" is a good start, but this whole thing just calls for a custom UI, imho...
15:12:05 lkrejci That is what Heiko Braun is doing for the AS7-console :)
15:12:41 There is another thing related to properties, which is shown on the second screenshot of http://rhq-project.org/display/RHQ/Current+setup
15:12:46 #idea bring back perspectives :)
15:12:46 yeah, but we're going to need it, too, won't we? I mean managing hundreds of servers needs to be even slicker ;)
15:12:49 pilhuhn: do we have an etherpad for questions?
15:13:09 ccrouch are there public ones?
15:14:09 pilhuhn: http://piratepad.net/JYzMyK4IYh
15:14:22 http://etherpad.org/public-sites/
15:14:23 http://piratepad.net/Ih4kQsoFae
15:14:27 :)
15:14:45 what could go wrong with something called piratepad :-)
15:14:56 lkrejci: wins
15:15:09 When creating a new managed host for example (but this applies to other sections of the management tree as well), the user has to provide values. In the managed-as example those are name, host, servergroup, socket-binding-group. The latter three need to contain values that come from different places in the AS7 domain model
15:15:36 #agreed to use http://piratepad.net/JYzMyK4IYh as etherpad for questions
15:18:23 Even if create new managed-as were a create-child operation on e.g. the server group, the other parameters would still be needed at create time. And they need to be correct, otherwise the creation fails, so we need to provide them from our internal data or by querying the Domain controller
15:18:54 #idea Have "dependent values" as also described on http://rhq-project.org/display/RHQ/Needed+-+dependent+properties
15:20:04 Jdobies wrote something like this to go to the database for users and roles almost two years ago. Perhaps some ideas can be recycled
15:20:29 #idea look at JDobie's code for filling properties from external sources
15:21:02 asantos I guess those are especially needed for user friendliness
15:22:02 Ok, next topic
15:22:09 #topic admin down
15:22:22 (and yes, this is a bit of a pet of mine)
15:23:06 I think we all agree it would be a useful concept - anyone that is annoyed at seeing our disabled wireless network adapter in the list of downed resources knows this :)
15:23:22 it's the implementation difficulty that caused us to punt on this in the past.
15:23:23 As7 domain mode allows creating managed as servers without starting them. In the screen shots, server-three and server-six are such servers. They can be set up, and only fired up if load requires it
15:24:07 (10:18:23 AM) pilhuhn: Even if create new managed-as were a create-child operation on e.g. the server group, the other parameters would still be needed at create time. And they need to be correct, otherwise the creation fails, so we need to provide them from our internal data or by querying the Domain controller
15:24:07 how does that work under the covers
15:24:14 pilhuhn: agreed
15:24:46 does it move bits around? Or does it expect them to be on the destination host already?
15:25:06 #agreed on implementing dependent-values for drop-downs etc.
15:25:42 ccrouch Bits need to be present there. I can only fire new servers for boxes with running host controllers
15:25:50 This is not provisioning.
15:26:02 ok, just checking
15:26:09 ccrouch It moved bits around for deployments, which are done on server-group level
15:26:22 s/moved/moves/
15:26:51 I think we need to address this admin down sooner or later. It will not get easier when we wait longer
15:27:22 and yes, mazz network adapters are the other big resource type that desperately needs it
15:27:43 re "admin down", the requirement for this has always existed, I don't see that AS7 changes anything. I expect people to run/not run the same % of servers as they do today
15:28:06 pilhuhn: +1, people could ignore it with network adapters, but I think not having admin_down state for AS7 would not be tolerable
15:28:22 lkrejci +1
15:28:52 lkrejci: how is this any different from people starting and stopping AS5 instances?
15:29:03 ccrouch I think the situation in domain mode is different - this is much more lightweight
15:29:19 .oO( I wonder if dmlloyd would want to chime in )
15:29:22 ccrouch: from what i understood from the notes on etherpad, there is functionality to bring up servers on demand automatically
15:29:59 what was the question?
15:30:02 lkrejci yes. This is just an operation on the domain controller
15:30:27 pilhuhn: so it doesn't happen automagically?
15:30:31 dmlloyd About people having managed as defined, but not running in domain mode and quick firing
15:30:50 lkrejci you mean e.g. when load increases? No
15:30:52 pilhuhn: could the avail state be unknown as opposed to down?
15:32:01 jshaughn that is not a valid AvailabilityType -- and I see it as abuse. Because we know the state
15:33:08 So I guess we postpone this until people are annoyed enough :-)
15:33:39 Related to creation of new managed servers:
15:33:52 #topic Autoimport new managed servers
15:34:01 See http://rhq-project.org/display/RHQ/Needed+-+Auto+Import+of+managed+servers
15:34:07 i guess I was wondering whether something could be down if it's never been up. just a subtle semantic
15:34:58 In AS7 it is possible to create and start new servers by just adding them to a server-group on a specific host.
15:34:58 After this is done, the plugin container logic will (hopefully (1)) trigger a runtime scan to detect this new server.
15:34:58 Now the new server sits in the Autodiscovery portlet for the user to import (2). This means for the user that he needs to go to the AD-portlet, import it and then go back to where he was before.
15:35:31 jshaughn: we could have that even today - avail state going from unknown straight to down. it's a difference between not knowing and knowing.
15:35:38 #help Mazz I guess that is your field of experience
15:35:56 lkrejci yes - and we know the server state
15:36:36 yep
15:37:05 pilhuhn: IIRC, we have a very old BZ already in place asking for a feature to support auto-commit
15:37:09 autoImport could be useful in other contexts as well, particularly in cloud deployments
15:37:13 #action Rethink admin-down and investigate what needs to change
15:38:19 The wiki at http://rhq-project.org/display/RHQ/Needed+-+Auto+Import+of+managed+servers lists that this would be indicated via a new property on the plugin descriptor's type
15:38:26 jsanda good point
15:38:30 i'm not sure autoImport should be driven by plugin descriptors... wouldn't it be better to have some kind of rules configured server-side?
15:38:41 I can't find it, but point is, this is something that we knew would be a nice to have. at least, I remember talking about it :}
15:38:45 mmm rule engines
15:39:19 also the opposite would need to be considered
15:39:26 lkrejci: impl issues aside (rules engine or not), I think the discussion before was user-driven decisions
15:39:41 #idea auto-uninventory if such a managed-as is removed from the domain by the admin
15:39:47 in other words, the user would tell the UI somehow that "I want to auto-import all servers of THIS type" or "all servers on this platform should be autoimported"
15:39:53 #action pilhuhn add auto-uninventory to that wiki-page
15:39:53 pilhuhn: what about extending dyna-groups - basically if a new resource comes in and it would fall into a dyna-group, it would be auto-imported
15:39:56 fyi, wrt rules - If I recall correctly, the requirement to manually import resources was one of the bigger issues the drools team had as well.
15:41:04 lkrejci I like that - but don't dynagroups only work on resources that are committed?
15:41:22 pilhuhn: that's why i said "extending" :)
15:41:29 ah ok :)
15:41:53 #idea Extending dynagroups to detect resources to be auto-imported
15:41:59 I think
15:42:16 #agreed We need auto-import for certain kinds of resources
15:42:16 any changes here would need to be done carefully, given any agent of the right version can connect to a server. What gets imported is an important administrative decision
15:42:31 I don't agree :-)
15:42:37 #notagreed
15:42:38 :-)
15:43:00 ccrouch: exactly. we can't have rogue agents connect and be able to abuse the "auto-import" feature. we need a human admin to tell us it is ok.
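The user-driven auto-import rules floated in this part of the discussion ("all servers of THIS type", "all servers on this platform", and only once a human admin has said it is ok) could be sketched roughly as below. This is a minimal Python sketch, not RHQ code: `Discovered`, `AutoImportPolicy`, the `committed_platforms` check and the type name "Managed AS7 Server" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Discovered:
    """A newly discovered resource waiting in the discovery queue (hypothetical model)."""
    name: str
    resource_type: str
    platform: str


@dataclass
class AutoImportPolicy:
    """Admin-configured opt-ins. Both sets start empty, so the default is 'import nothing'."""
    types: set = field(default_factory=set)
    platforms: set = field(default_factory=set)

    def should_auto_import(self, res: Discovered, committed_platforms: set) -> bool:
        # Assumption: a rogue agent shows up as a platform no admin has committed,
        # so nothing below an uncommitted platform is ever auto-imported.
        if res.platform not in committed_platforms:
            return False
        # Otherwise import only if an admin opted in for this type or this platform.
        return res.resource_type in self.types or res.platform in self.platforms


policy = AutoImportPolicy(types={"Managed AS7 Server"})
committed = {"pintsize"}
new_server = Discovered("server-seven", "Managed AS7 Server", "pintsize")
rogue = Discovered("server-x", "Managed AS7 Server", "unknown-box")

print(policy.should_auto_import(new_server, committed))
print(policy.should_auto_import(rogue, committed))
```

Keeping the rule sets empty by default means importing stays an explicit administrative decision, which is the concern raised about agents that connect uninvited.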
15:43:00 but mazz's suggestion: "all servers on this platform should be autoimported"
15:43:03 sounds closer
15:43:43 given an admin must have imported the platform at some point
15:43:49 Or the special kinds of resources denoted by the plugin descriptor (and perhaps with an override in system templates)
15:44:23 ccrouch: but more generally speaking what if the resource is a platform as in the case of the cloud?
15:44:25 pilhuhn: definitely something to consider - have a descriptor say "this child can be auto-committed because it is needed by its parent" or something to that effect
15:44:30 So, no auto-import for platforms
15:44:38 I mean - we already have this today in the world of SERVICES. we auto-commit them all the time
15:44:46 but the admin had to have imported the platform first
15:45:00 #agreed no auto-import of platforms
15:45:13 in a cloud deployment, platforms come and go
15:45:26 cloud is a separate discussion :)
15:45:43 mazz but jsanda has a good point here
15:45:46 but auto-import is certainly relevant
15:45:47 yup
15:45:59 and your pseudo-platforms might be useful here.
15:46:01 with extended auto-groups, this becomes a configurable thing :)
15:46:08 mazz I was coming to that :)
15:46:18 ie. whether to auto-import or not
15:46:20 the point being the functionality is more generally applicable than just a particular resource type
15:46:47 .oO( Like Admin-down :-)
15:46:57 +1 to making it configurable
15:47:22 #action investigate more on ways to do auto-import
15:47:30 Let's go to
15:47:41 #topic DynaGroup definitions
15:48:20 As you can see here http://rhq-project.org/display/RHQ/Current+setup users would like to group servers of a server-group together
15:48:38 #idea put hardcoded group definitions in the server
15:49:35 or we can add them to the plugin descriptor like in http://rhq-project.org/display/RHQ/Needed+-+DynaGroup+push so that additional / changed groups would only require a plugin re-delivery and not a full blown server delivery
15:50:08 anytime I hear "put hardcoded definitions in the server" - my ears perk up and I immediately become skeptical of the proposed idea
15:50:14 -1 if these would get auto-created, I guess this is one of those "important administrative decisions"
15:50:15 what do you mean "hardcode definitions"?
15:50:36 Of course priority here is lower, as the group definition(s) can also be written in the docs and be done with it
15:51:03 what about having an "auto-discovery queue" for groups as well? like plugin suggests creating these dynagroups..
15:51:12 hardcoded like those we already have for as4 clusters and the like
15:51:21 I see. in the dyna-group editor
15:51:50 that's less of a concern for me - that's just some hints to the user in some ui editor
15:52:41 It is called "Saved expression" in the DG-editor
15:52:48 it is an interesting idea - have a plugin provide some metadata to suggest proposed group defs
15:53:07 that, actually would not be hard to do - that metadata can be persisted with the types, for example
15:53:23 and the UI, when it gets the resource type defs, can get these group proposals to show in the ui editor.
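The "auto-discovery queue for groups" idea, where a plugin only suggests group definitions and nothing is created until an admin accepts, might look something like the following. This is a rough Python sketch under assumptions: `GroupProposal` and `ProposalQueue` are hypothetical names, and the expression is treated as an opaque string rather than real DynaGroup syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupProposal:
    """A group definition suggested by a plugin (hypothetical model)."""
    plugin: str
    name: str
    expression: str  # opaque here; the real DynaGroup expression syntax is not modelled


class ProposalQueue:
    """Holds plugin suggestions until an admin accepts them; nothing is auto-created."""

    def __init__(self):
        self._pending = {}
        self.accepted = []

    def register(self, proposal: GroupProposal) -> None:
        # Keyed by (plugin, name), so re-delivering a plugin replaces its old
        # suggestion instead of piling up duplicates.
        self._pending[(proposal.plugin, proposal.name)] = proposal

    def pending(self):
        return sorted(self._pending.values(), key=lambda p: (p.plugin, p.name))

    def accept(self, plugin: str, name: str) -> GroupProposal:
        # The explicit admin decision: only now does the proposal become a group definition.
        proposal = self._pending.pop((plugin, name))
        self.accepted.append(proposal)
        return proposal


queue = ProposalQueue()
queue.register(GroupProposal("as7", "Servers per server-group", "<expression v1>"))
queue.register(GroupProposal("as7", "Servers per server-group", "<expression v2>"))
print(len(queue.pending()))
print(queue.accept("as7", "Servers per server-group").expression)
```

Because the proposals travel with the plugin, a changed definition only needs a plugin re-delivery, while the accept step keeps group creation an administrative decision.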
15:53:33 but I agree this is lower priority
15:54:00 But still if we put them in the server and then need to change the plugin code, we also need to deliver a new server version. This is not a type agnostic server
15:54:16 So we postpone this for now?
15:54:31 the annoying part of that is there will be no connection between the Dynagroup generated groups and the HostControllers>ServerGroups>MyServerGroup, so you click on MyServerGroup expecting to do a group operation and you can't
15:54:52 pilhuhn: i think you already talked about this linking above/on the wiki?
15:55:02 ccrouch This is yet another issue - which also leads back to the first topic of this discussion
15:55:38 #action pilhuhn Investigate more about the semantic of linking between items in the RHQ model
15:56:35 #agreed Dynagroup push has low priority
15:56:55 i gotta jump on another call now, but a couple of impressions:
15:56:56 a) there are lots of features we can add around usability
15:56:56 b) similar to the provisioning work, I think we need to get something functional out there for people to try out ASAP
15:56:56 that way we can get a clearer picture of what are the most critical issues
15:57:11 yes
15:57:30 #topic Hide default config on resource creation http://rhq-project.org/display/RHQ/Needed+-+Hide+default+template+for+resource-config+when+creating+a+child
15:57:39 c) there is obviously a bunch of non-rhq infrastructure work that's needed too, e.g. support for all the various subsystems (including metrics, ops, configuration etc)
15:58:00 which pilhuhn has been working on too
15:58:21 ccrouch metrics is within the subsystems, but AS7 is only slowly providing them, as they have other priorities right now
15:58:38 pilhuhn: that's fine, as long as we can consume as they come
15:59:33 I will definitely
15:59:49 #help on those configuration related things
15:59:58 for me the big takeaway would be: let's not try to do everything, let's try to do the *most* critical stuff and then promptly after AS7 GA's get a release out that we can get further prioritization on
16:00:28 cheers pilhuhn, /me switches calls
16:00:37 Yes, we can't do everything
16:00:40 ok ccrouch
16:01:51 pilhuhn: are logging and patching on your radar?
16:02:18 patching not yet, I don't know that AS7 has that yet
16:02:44 and logging - I've partially implemented their logging subsystem interface - but you probably mean something else?
16:03:18 perhaps - doesn't as7 audit operations?
16:03:50 mazz, ips, jshaughn Is it possible to do child resource creation by not requiring to upload a file, but by using an existing one?
16:04:03 asantos I think I heard that term - I need to check
16:04:21 #action pilhuhn Check with as7 about audit logging of as7 operations
16:04:28 not currently. but it wouldn't be hard to add support for that
16:04:35 an existing file that exists in our repositories? or an existing one on the agent? we MIGHT support the former, but not the latter
16:05:17 iirc, i think we might already support it for updates of an existing eg ear/war resource
16:05:22 mazz A file that I have already uploaded once and which lives on the as7 DC in /deployments.
16:05:27 #topic deployment handling
16:05:33 no
16:05:43 Now I have three server groups: test, integration, production
16:06:07 I want to take that deployment to test, then integration then production
16:06:19 those are simple operations within the domain model
16:06:44 #action pilhuhn write a wiki page about this deployment handling (or find the existing one)
16:07:50 #idea simulate with operations on the deployments
16:08:20 Not natural, but that allows us to postpone changing the "create configuration" code
16:08:24 pilhuhn: we could do something in the plugin desc like:
16:08:51 ips nice
16:08:54 then the gui would know not to prompt for a file when the user goes to create a new war
16:09:14 #help ips on figuring out how this might work and be implemented
16:09:17 i think this can be generic enough to just give people the choice every time...
16:09:29 lkrejci yes.
16:09:33 and when the facet create call is made, the plugin would receive null for the input stream
16:09:48 lkrejci people need to have both options
16:09:55 yep
16:10:27 and then based on the package create config, the plugin would make the call to the DC to promote the appropriate app
16:10:29 i mean it is quite common that stuff exists or is reachable from the agent machines, so there is not always the need for uploading the stuff manually
16:11:16 lkrejci Right AS can also utilize urls and such. But in the above case, AS7 already knows the content (as it was uploaded previously)
16:11:35 #action ips helps with figuring out how this can be done
16:11:37 ah, I see
16:11:41 last topic
16:11:49 #topic Pseudo-Platforms
16:12:06 http://rhq-project.org/display/RHQ/Needed+-+pseudo+platforms
16:12:50 As the domain controller knows all managed as servers, RHQ can detect them even without an agent
16:13:02 for the local host with the DC the scenario is easy
16:13:30 How do we handle other managed AS (like AS7..9) in the diagram on the above page?
16:13:50 Put them in the tree of the platform where the DC is running?
16:14:05 This is easy, but gives the admin a wrong impression
16:14:30 Detection on the other platform is not possible, as no agent is present
16:14:51 Letting them fall below the table is no option either, asantos?
16:14:56 apart from the pseudo-platform support, wouldn't there also be a feature difference for the ASes under that pseudo-platform. I mean are we going to route every possible communication through the domain/host controllers or is there something that we are going to get directly from the ASes themselves?
16:15:42 there is stuff (config changes) that always needs to go through the DC
16:15:59 Other stuff like monitoring can go through the local HC
16:16:16 (which is the DC on the DC box)
16:16:29 so if the HC is under a pseudo-platform, we won't have access to that information/functionality?
16:16:34 There we come back to linking :)
16:17:06 lkrejci I think it can be still obtained via DC->HC->managed as. Let me quickly check
16:17:49 this is probably the biggest change being proposed. as I mentioned earlier, it was something we recognized we needed a few years ago (in the context of Virtual Machines) and as jsanda mentioned, it's an important concept when discussing how we support cloud.
16:17:50 the issue can probably be boiled down to "agentless platforms".. for which we have a wiki page already: http://rhq-project.org/display/RHQ/Design-Agentless+Management
16:19:01 Link added to the wiki page
16:19:36 lkrejci I currently have some issues with my setup, but I recall this is doable for the connection to the DC
16:20:07 #idea Require to have an agent on each platform with a HC for now
16:21:31 it makes our lives easier but renders RHQ less usable than the admin console of AS
16:21:58 yep
16:22:09 which it is going to be anyway, because we won't have the UI tailored for AS needs :)
16:22:10 But then admins are used to running agents on managed nodes
16:22:31 yep, might not be that much of a problem
16:22:39 lkrejci RHQ has other strengths.
16:22:57 #agreed Require to have an agent on each platform with a HC for now
16:23:24 #action Investigate more what it takes to support pseudo-platforms and agentless management
16:23:33 pilhuhn: i for the most part agree w/ joe's thoughts on the bottom of mazz's link
16:24:20 add ability for an agent to discover one or more agentless platforms (and descendant resources) in addition to its own local platform
16:25:06 Man, that almost reads like Shakespeare :-)
16:25:39 Ok, I am done
16:25:42 but there would be some challenges such as what if a user later decides to install an agent on an agentless platform
16:25:59 yep
16:26:14 would we seamlessly transform the existing platform resource into a full fledged agent-backed platform?
16:27:13 I think that could be doable - probably easier than the other way around
16:27:29 also would it make sense for the as7 plugin to be in charge of discovering the agentless platforms, or would it be better for the platform plugin (or agent/PC itself) to be in charge of it?
16:28:19 I would see it as a base service - not plugin specific. But the plugin would signal the PC that it needs a platform
16:28:19 the latter so that other plugins besides the as7 plugin could also potentially discover servers running on the agentless platform
16:29:12 I think we used to have the strong platform concept as once upon a time there was an idea to have licenses with platform count
16:29:15 eg - postgres plugin could discover a postgres server on the agentless platform which granted might not support all facets
16:29:22 yep
16:30:09 also as4 via a remote connection. The pseudo-platform would then be a "more natural" place for it, than within the platform of the agent
16:30:54 #action pilhuhn Add this stream of thoughts to the above wiki page
16:31:10 it would also be cool to have an operation on a pseudo-platform (or some other means in the gui) to install an agent to that platform and promote it to an agent-backed platform
16:31:38 hmm... so this basically means duplicating all the resource types with their agentless counterparts (where it makes sense of course) and then have some matching logic in case the agentless becomes "agentful"
16:32:05 lkrejci why do you mean duplicating?
16:32:17 i know we already have the remote agent install page under Administration, but i'm thinking something more integrated with the existing agentless platform Resource
16:32:35 pilhuhn: because agentless will have fewer features and will use different access "protocol"
16:32:37 I think there would be a new Platform "Pseudo" next to Linux, OS X and so on
16:33:08 right, but the servers and services would be the same restypes used on real platforms
16:33:25 ips yes
16:33:43 but the discovery and resource components would differ
16:34:04 and currently you can already manually add an as4 on a platform by giving its JNP url. This can already be remote
16:34:09 actually it would be pretty cool if we could somehow detect the OS remotely and then use the existing platform restypes :)
16:34:11 also their plugin configs could differ because of the different ways we "connect" to them
16:34:11 lkrejci only for the Pseudo-P
16:35:07 pilhuhn: i am able to discover the apache remotely but i will not have info about its server-root and httpd.conf locations, hence i will not be able to use the same discovery component
16:35:08 lkrejci: yeah, that's true, and it might not even be possible to detect a reskey for certain types, eg - if the key is the install dir
16:35:09 side note: we have a lsof plugin :)
16:35:09 #idea detect remote OS and use correct platform resource type (with some isPseudo flag)
16:35:15 I think this can be used to detect remote platforms
16:35:37 ips: yep, that's why i was talking about the "matching" logic - because even the res keys might not be the same
16:35:40 I wrote this with greg's prodding a year or so ago. I don't know if it has any relevance. but I bring it up just because I don't think people know about it
16:36:01 i hate to pour cold water on this topic, but what about saying "if you want to manage an AS7 instance you need to deploy an agent on to its platform"
16:36:27 ccrouch: that's the only agreement we have so far :)
16:36:36 lkrejci: the matching logic scares me though. would be nice to use the same restypes even if it meant some tweaking of plugin configs or even res configs
16:36:39 ccrouch we did have that above already :-)
16:36:43 res configs -> res keys
16:37:10 pilhuhn: lkrejci: ok we're good then right :-)? No more pseudo platforms
16:37:25 ccrouch: we're developers here ;)
16:37:36 ccrouch I think it is still good to have this stream of thoughts
16:37:39 we like problems :)
16:37:46 lkrejci: uh huh... well how about developing some apache tests ;-)
16:37:47 lkrejci: another option would be to add an optional remoteKey and remotePluginConfig to the ResourceType entity
16:37:54 :D
16:37:56 This is a good one - and I like to have this "on tape"
16:38:12 :-)
16:38:34 But I will call meeting end now
16:38:35 1
16:38:37 2
16:38:42 and
16:38:44 3
16:38:50 i'm not going to try to dam the stream of consciousness :-)
16:38:56 Ok, thanks everyone for your time
16:39:02 #endmeeting
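
The "matching logic" worry from the pseudo-platform topic (an agentless resource may lack the key an agent would use, e.g. an install dir) and the suggestion of an optional remoteKey can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `Resource` fields, the `promote` function, and the use of a host:port string as the remote key are hypothetical and do not come from RHQ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    resource_type: str
    key: Optional[str]   # local resource key, e.g. an install dir; unknown while agentless
    remote_key: str      # hypothetical key that can be detected remotely, e.g. host:port
    agent_backed: bool = False


def promote(agentless: List[Resource], discovered: List[Resource]) -> List[Resource]:
    """Merge what a newly installed agent discovered into the agentless inventory."""
    by_remote = {(r.resource_type, r.remote_key): r for r in agentless}
    merged = []
    for found in discovered:
        existing = by_remote.pop((found.resource_type, found.remote_key), None)
        if existing is None:
            merged.append(found)          # genuinely new resource
        else:
            # Same resource seen both ways: keep the inventory entry (and its
            # history) and fill in the key that only an agent can see.
            existing.key = found.key
            existing.agent_backed = True
            merged.append(existing)
    # Anything the agent did not rediscover stays agentless.
    merged.extend(by_remote.values())
    return merged


inventory = [Resource("Postgres Server", None, "pintsize:5432")]
found = [
    Resource("Postgres Server", "/var/lib/pgsql", "pintsize:5432", agent_backed=True),
    Resource("Apache HTTP Server", "/etc/httpd", "pintsize:80", agent_backed=True),
]
result = promote(inventory, found)
print([(r.resource_type, r.key, r.agent_backed) for r in result])
```

Matching on the same resource type plus a remotely detectable key avoids duplicating the types for agentless use, at the price of requiring such a key for every type that should survive promotion.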