Slurm in Guacamole

alipawsey
Does anyone know of any Slurm integration possibilities with Guacamole?
Is there any third-party application? Could an Airflow plug-in perhaps be implemented in Guacamole?


Re: Slurm in Guacamole

vnick
On Mon, Dec 2, 2019 at 22:33 Ali Zamani <[hidden email]> wrote:
> Does anyone know of any Slurm integration possibilities with Guacamole?
> Is there any third-party application? Could an Airflow plug-in perhaps be implemented in Guacamole?

Can you describe in more detail what you're trying to accomplish, and what you envision using Guacamole for?

-Nick

Re: Slurm in Guacamole

alipawsey
Hi Nick,

The main reason that I am thinking of Slurm (or any session/workload
manager) is to manage the live/established sessions. I am trying to deploy
Guacamole on HPC. I have lots of nodes/instances, and it is a bit odd to
show them all in the users' dashboard to choose between. In addition, it's
hard to manage the load balance on each node. One scenario is to show
particular nodes to a group of users; another is to have a workload manager
plugin deployed on Guacamole to allocate a non-busy node to each new
user/request. The second solution is more professional!

I have tried FastX, which automatically allocates non-busy nodes to users
and keeps the load balanced; however, if a user is running an
application/simulation, it keeps running after the session (which is a kind
of VNC connection) is killed, and that node stays busy.

In other words, it would be nice to have a script/app that kills the
sessions automatically and terminates all jobs running on the instance
after the user logs out or reaches the wall time. It should also be able to
find the least busy instance and allocate it to a user by showing it in
his/her dashboard.

Please let me know if you have any advice/solution, or whether it is
somehow possible to add a workload manager to Guacamole.

Thanks.



--
Sent from: http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Slurm in Guacamole

vnick
On Wed, Dec 4, 2019 at 5:47 AM alipawsey <[hidden email]> wrote:
> Hi Nick,
>
> The main reason that I am thinking of Slurm (or any session/workload
> manager) is to manage the live/established sessions. I am trying to deploy
> Guacamole on HPC. I have lots of nodes/instances, and it is a bit odd to
> show them all in the users' dashboard to choose between. In addition, it's
> hard to manage the load balance on each node. One scenario is to show
> particular nodes to a group of users; another is to have a workload manager
> plugin deployed on Guacamole to allocate a non-busy node to each new
> user/request. The second solution is more professional!

For Guacamole Client, you should be able to deploy several Tomcat instances, pointed at a single database, and then put a load balancer in front of them, and balance the front-end connections as you so desire.  The metric used for the front-end balancing can vary based on what the load balancer can do - some load balancers will only do it based on number of connections, some allow for feedback loops from the balanced hosts to monitor load and such on the system, but you have several options.  The one caveat, here, is that Guacamole Client currently lacks the ability to share active connection information among multiple deployed instances, so if you are using any limits on concurrent connections you may not see expected results.  Other than that, as long as the load balancer persists the client connection to the correct back-end server and doesn't shuffle things around you should be fine.

For Guacamole Server (guacd), you have a couple of different options:
- Deploy one guacd instance per Guacamole Client instance, and point each Guacamole Client at its "own" guacd instance.  This could live on the same system as Guacamole Client, or could be a different system that is then configured in each guacamole.properties file.
- Deploy several guacd instances behind a load balancer, and then point all of the Guacamole Client configs at the single load balancer.  The caution, here, is that, if you do this, you'll want to make sure that there is some level of persistence for the requests coming from Guacamole Client to the load balancer and then through to the back-end guacd system so that the load balancer doesn't continuously try to switch packets for a given connection to different guacd instances, which will result in pretty immediate and severe problems with connections.
- You can also override the guacd system that is used on a per-connection basis, which would allow you to spread out load based on connections, if you so desired.
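
The persistence requirement in the options above can be illustrated with a small Python sketch. It is not how any particular load balancer works internally; it only shows the property that matters, namely that the same client key always maps to the same back end. The host names and the `pick_backend` helper are hypothetical.

```python
import hashlib
from typing import Sequence

# Hypothetical guacd back ends; replace with real host names.
GUACD_BACKENDS = ["guacd1.example.org", "guacd2.example.org", "guacd3.example.org"]


def pick_backend(client_key: str, backends: Sequence[str]) -> str:
    """Map a client key (for example a source IP) to one back end, deterministically."""
    if not backends:
        raise ValueError("at least one back end is required")
    # Hash the key so that the choice depends only on the key and the list,
    # never on arrival order or on earlier requests.
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Every packet or request carrying the same key lands on the same guacd.
chosen = pick_backend("10.0.0.5", GUACD_BACKENDS)
assert chosen == pick_backend("10.0.0.5", GUACD_BACKENDS)
```

A plain hash like this reshuffles clients whenever the list of back ends changes, which is one reason real load balancers usually keep an explicit table of active connections instead.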
 

> I have tried FastX, which automatically allocates non-busy nodes to users
> and keeps the load balanced; however, if a user is running an
> application/simulation, it keeps running after the session (which is a kind
> of VNC connection) is killed, and that node stays busy.
>
> In other words, it would be nice to have a script/app that kills the
> sessions automatically and terminates all jobs running on the instance
> after the user logs out or reaches the wall time. It should also be able to
> find the least busy instance and allocate it to a user by showing it in
> his/her dashboard.

I guess I fail to see what value something like Slurm or FastX would have over the possibilities I mentioned above for load balancing?  I'm also not sure why you'd have to worry about manually terminating user sessions - if you go with the methods of load balancing that I mentioned above, the Guacamole components will take care of cleaning up the connections, and you shouldn't have to do any manual cleanup.  Furthermore, the limitations of load balancing that exist within the Guacamole application (lack of sharing of active sessions between nodes and the need for the load balancer to keep track of the connections between clients and Guacamole Client, and between Guacamole Client and guacd instances) would also exist within a load manager.

Is there something I'm missing on the value of such a system with Guacamole that isn't already possible, here??

-Nick

Re: Slurm in Guacamole

alipawsey
vnick wrote
> For Guacamole Client, you should be able to deploy several Tomcat
> instances, pointed at a single database, and then put a load balancer in
> front of them, and balance the front-end connections as you so desire.
> The metric used for the front-end balancing can vary based on what the
> load balancer can do - some load balancers will only do it based on number
> of connections, some allow for feedback loops from the balanced hosts to
> monitor load and such on the system, but you have several options.  The one
> caveat, here, is that Guacamole Client currently lacks the ability to share
> active connection information among multiple deployed instances, so if you
> are using any limits on concurrent connections you may not see expected
> results.  Other than that, as long as the load balancer persists the client
> connection to the correct back-end server and doesn't shuffle things around
> you should be fine.
>
> For Guacamole Server (guacd), you have a couple of different options:
> - Deploy one guacd instance per Guacamole Client instance, and point each
> Guacamole Client at its "own" guacd instance.  This could live on the same
> system as Guacamole Client, or could be a different system that is then
> configured in each guacamole.properties file.
> - Deploy several guacd instances behind a load balancer, and then point all
> of the Guacamole Client configs at the single load balancer.  The caution,
> here, is that, if you do this, you'll want to make sure that there is some
> level of persistence for the requests coming from Guacamole Client to the
> load balancer and then through to the back-end guacd system so that the
> load balancer doesn't continuously try to switch packets for a given
> connection to different guacd instances, which will result in pretty
> immediate and severe problems with connections.
> - You can also override the guacd system that is used on a per-connection
> basis, which would allow you to spread out load based on connections, if
> you so desired.

Let's distinguish two kinds of balancing: a "Connection Balancer", which
looks at the number of connections per node, and a "Workload Balancer",
which redirects a new connection to the node with the lowest workload,
meaning the lowest CPU/RAM usage.
The first case is already possible with Guacamole connection groups, as
described here:
https://guacamole.apache.org/doc/gug/administration.html#connection-group-management

Could the load balancer you mentioned manage the workload balancing as well?
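
To make the distinction concrete, here is a minimal Python sketch of the two selection policies. The `NodeStatus` fields, the equal CPU/RAM weighting and the sample numbers are all invented for illustration; how the metrics would actually be collected is left open.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NodeStatus:
    name: str
    connections: int  # active remote desktop connections on the node
    cpu_load: float   # assumed metric: load average divided by core count
    mem_used: float   # assumed metric: fraction of RAM in use, 0.0 to 1.0


def fewest_connections(nodes: Sequence[NodeStatus]) -> NodeStatus:
    """'Connection balancer': pick the node with the fewest connections."""
    return min(nodes, key=lambda n: n.connections)


def lowest_workload(nodes: Sequence[NodeStatus], cpu_weight: float = 0.5) -> NodeStatus:
    """'Workload balancer': pick the node with the lowest combined CPU/RAM score."""
    return min(
        nodes,
        key=lambda n: cpu_weight * n.cpu_load + (1.0 - cpu_weight) * n.mem_used,
    )


# Invented sample: node1 has one connection running a heavy simulation,
# node2 has three mostly idle desktops.
nodes = [
    NodeStatus("node1", connections=1, cpu_load=0.95, mem_used=0.80),
    NodeStatus("node2", connections=3, cpu_load=0.10, mem_used=0.20),
]
print(fewest_connections(nodes).name)  # node1
print(lowest_workload(nodes).name)     # node2
```

The two policies disagree on this sample, which is exactly the case where counting connections alone sends a new user to a busy node.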


> I guess I fail to see what value something like Slurm or FastX would have
> over the possibilities I mentioned above for load balancing?  I'm also not
> sure why you'd have to worry about manually terminating user sessions - if
> you go with the methods of load balancing that I mentioned above, the
> Guacamole components will take care of cleaning up the connections, and you
> shouldn't have to do any manual cleanup.  Furthermore, the limitations of
> load balancing that exist within the Guacamole application (lack of sharing
> of active sessions between nodes and the need for the load balancer to keep
> track of the connections between clients and Guacamole Client, and between
> Guacamole Client and guacd instances) would also exist within a load
> manager.
>
> Is there something I'm missing on the value of such a system with Guacamole
> that isn't already possible, here??

Slurm has the capability to free resources. Is there any plugin for
Guacamole that does a similar thing once a session is killed? Even when a
session is killed manually, only the connection is terminated, and a job
could still be running on the instance, keeping the resource busy.






Re: Slurm in Guacamole

vnick
> Let's distinguish two kinds of balancing: a "Connection Balancer", which
> looks at the number of connections per node, and a "Workload Balancer",
> which redirects a new connection to the node with the lowest workload,
> meaning the lowest CPU/RAM usage.
> The first case is already possible with Guacamole connection groups, as
> described here:
> https://guacamole.apache.org/doc/gug/administration.html#connection-group-management
>
> Could the load balancer you mentioned manage the workload balancing as well?


Yes, I agree that a distinction should be made, here, and I understand the difference.  There are load balancers available that can query the back-end servers for performance data and adjust priorities based on the relative load of the system.  For example, I believe HAProxy has a way to do this - through custom scripting, you can execute scripts that would query performance data on remote systems (via SSH or WMIC, etc.), and then adjust the priority based on that data.  I've done this in the past - it may not be the cleanest thing, but it works.  I think Keepalived had some similar hooks that could do such things, and maybe even IPVS.  I would imagine that some of the commercial load balancers - Citrix Netscaler, F5, etc. - have some similar capabilities to hook into backend servers and adjust priority based on load.  All of these are *Load Balancers* and not *Workload Managers* - that is, they will make decisions about where to put a connection at the time of the connection and then persist that connection until it is completed - they will not attempt to move connections from one system to another dynamically/real-time.  However, for 99% of use-cases out there this is sufficient.
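
As one possible shape for the "query the back end and adjust priority" idea, here is a minimal Python sketch of a small agent that each node could run. It assumes HAProxy's agent-check mechanism, where the balancer opens a TCP connection to the agent and reads back a weight such as `75%`; the post above does not say which mechanism was actually used. The port number and the load-to-weight formula are arbitrary choices, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os
import socketserver


def load_to_weight(load1: float, cpus: int) -> int:
    """Turn a 1-minute load average into a weight percentage from 1 to 100."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    busy = min(max(load1 / cpus, 0.0), 1.0)
    # Never report 0: this sketch assumes a zero weight would take the
    # server out of rotation entirely, which is not the intent here.
    return max(1, round(100 * (1.0 - busy)))


class AgentHandler(socketserver.BaseRequestHandler):
    """Answer each incoming connection with the current weight, then close."""

    def handle(self) -> None:
        load1 = os.getloadavg()[0]
        weight = load_to_weight(load1, os.cpu_count() or 1)
        self.request.sendall(f"{weight}%\n".encode("ascii"))


def serve(port: int = 9999) -> None:
    """Run the agent until interrupted (port 9999 is an arbitrary example)."""
    with socketserver.TCPServer(("0.0.0.0", port), AgentHandler) as server:
        server.serve_forever()
```

Calling `serve()` on each node and pointing the balancer's agent check at that port would let an idle node report a high weight and a saturated node a low one. This still only influences where new connections go; existing connections stay where they are.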
 
It also would be possible to write some additional load balancing integration for Guacamole to query and adjust priorities within a connection group based on load.  I think I started to do this at one point but never completed it, but it is very doable.

> Slurm has the capability to free resources. Is there any plugin for
> Guacamole that does a similar thing once a session is killed? Even when a
> session is killed manually, only the connection is terminated, and a job
> could still be running on the instance, keeping the resource busy.



Ah, I think I get what you're driving at, here.  You're talking about freeing up resources on the remote systems to which a user is connected.  So, if a user connects, through Guacamole to a host called "VNC1" over the VNC protocol, and then disconnects, they could have left something running on the VNC1 host that needs to be "cleaned up" in order to free the resources for other users who would log on to that system, correct?

There's no capability to do this currently in Guacamole, for a couple of reasons.  First, the remote systems are not "aware" of Guacamole at all.  To the remote systems it could be Guacamole, or vncviewer, or Microsoft Remote Desktop Client, or openssh connecting, and it doesn't have any way to tie running processes on the remote host to a session within Guacamole.  This makes the "clean up" job at least a little more difficult - you have to come up with some way, whether through a workload manager or a Guacamole plugin, to tie the Guacamole session to the login on the remote host.

It is probably possible to do this with an extension, at least to some extent.  You could write an extension that had credentials to log in to the remote systems and run something on them, and either intercept the tunnel close event or decorate an existing extension and work with some of the existing methods there to perform an action on the remote system when the tunnel closes.  However, there are a few things you have to think pretty critically about:
- How do you tie the Guacamole session to the processes running on the remote system?  While many of them might be in a process tree with a top-level parent PID, what if a user starts something in the background with nohup (on Linux, for example) such that the process has a parent PID of 1?  How do you determine that the process "belongs" to the Guacamole session?  You could say, well, I'll do it by username - but what if you allow the user to log in multiple times?  How do you determine which of their logins launched a process?  I don't think these are insurmountable problems, but they are things to consider.
- If you allow users to disconnect from and reconnect to sessions, how do you distinguish within that Guacamole tunnel close event when the user intends to disconnect and reconnect versus when they are really trying to log off?  This is where just using the idle timers available within the protocols themselves, on the remote system, would be beneficial, as the remote O/S is already keeping track of that information and can make a determination on its own when a connection has been idle for too long and clean things up.
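
The first question above can be made concrete with a toy Python example. The process table is invented, and `Proc` and `descendants` are hypothetical helpers; the point is only that walking the process tree misses a reparented job, while matching by username catches processes from an unrelated login.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Proc:
    pid: int
    ppid: int
    user: str
    name: str


def descendants(procs: List[Proc], root_pid: int) -> Set[int]:
    """Return root_pid plus every PID reachable from it through parent links."""
    children: Dict[int, List[int]] = {}
    for p in procs:
        children.setdefault(p.ppid, []).append(p.pid)
    found: Set[int] = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in found:
            continue
        found.add(pid)
        stack.extend(children.get(pid, []))
    return found


# Invented process table: one user, two separate logins.
PROCS = [
    Proc(100, 1, "alice", "Xvnc"),          # login A, reached through Guacamole
    Proc(101, 100, "alice", "xterm"),
    Proc(102, 101, "alice", "simulation"),
    Proc(200, 1, "alice", "sshd-session"),  # login B, unrelated to Guacamole
    Proc(201, 200, "alice", "bash"),
    Proc(300, 1, "alice", "solver"),        # started with nohup in login A, now under PID 1
]

tree = descendants(PROCS, 100)
by_user = {p.pid for p in PROCS if p.user == "alice"}

print(sorted(tree))     # [100, 101, 102]: the nohup'ed solver (300) is missed
print(sorted(by_user))  # [100, 101, 102, 200, 201, 300]: login B is hit as well
```

Neither rule identifies "the processes of this Guacamole session" on its own, so some extra marker is needed to tie the two together.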

To be clear, I think these are issues that have to be addressed no matter which way you go - Guacamole Extension or "workload manager."  And, as such, I don't see where integration with a workload manager provides any value over either just writing the extension in Guacamole or allowing the remote systems to use the built-in capabilities to handle idle connections.

Or maybe I'm way off in the weeds here and not understanding what you're saying, still??

-Nick

Re: Slurm in Guacamole

alipawsey
vnick wrote

> I believe HAProxy has a way to do this - through custom scripting, you can
> execute scripts that would query performance data on remote systems (via
> SSH or WMIC, etc.), and then adjust the priority based on that data.  I've
> done this in the past - it may not be the cleanest thing, but it works.  I
> think Keepalived had some similar hooks that could do such things, and
> maybe even IPVS.  I would imagine that some of the commercial load
> balancers - Citrix Netscaler, F5, etc. - have some similar capabilities to
> hook into backend servers and adjust priority based on load.  All of these
> are *Load Balancers* and not *Workload Managers* - that is, they will make
> decisions about where to put a connection at the time of the connection and
> then persist that connection until it is completed - they will not attempt
> to move connections from one system to another dynamically/real-time.
> However, for 99% of use-cases out there this is sufficient.
>
> It also would be possible to write some additional load balancing
> integration for Guacamole to query and adjust priorities within a
> connection group based on load.  I think I started to do this at one point
> but never completed it, but it is very doable.

I agree too. In my experience, Keepalived wasn't the best option, as it had
problems with zombie nodes; it doesn't realise that a node is frozen and
messes things up. I have a very custom node status page that tunnels
through ssh using the "nc" command, so if a connection cannot be established
it shows the node as offline, whereas Keepalived was showing it as online.
I was thinking of a custom Guacamole plugin that connects through, e.g.,
the Nagios API to check the system resources. In this case we can redirect
a new connection to the least loaded instance. I'll post an update here if
I manage to do it. I think moving connections across nodes is not needed
here; just having load balancing at connection establishment time would be
sufficient.
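
A check in the spirit of the "nc" test described above could look like the following Python sketch. It is not the actual status page script; `node_is_responsive` is a hypothetical helper, and the default of port 22 assumes the node runs an SSH server that sends its identification banner as soon as a client connects. Requiring a banner, and not just an open port, is what separates a frozen node from a healthy one.

```python
import socket


def node_is_responsive(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True only if the node accepts a TCP connection AND sends data in time."""
    try:
        # The timeout passed here also applies to the recv() call below.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # A frozen node may still complete the TCP handshake in the kernel
            # while the service itself never answers, so wait for real bytes.
            return len(conn.recv(64)) > 0
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Nodes that fail this check can be left out before any load comparison is made, so a frozen node never looks like the least busy one.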



vnick wrote

> Ah, I think I get what you're driving at, here.  You're talking about
> freeing up resources on the remote systems to which a user is connected.
> So, if a user connects, through Guacamole to a host called "VNC1" over the
> VNC protocol, and then disconnects, they could have left something running
> on the VNC1 host that needs to be "cleaned up" in order to free the
> resources for other users who would log on to that system, correct?
>
> There's no capability to do this currently in Guacamole, for a couple of
> reasons.  First, the remote systems are not "aware" of Guacamole at all.
> To the remote systems it could be Guacamole, or vncviewer, or Microsoft
> Remote Desktop Client, or openssh connecting, and it doesn't have any way
> to tie running processes on the remote host to a session within Guacamole.
> This makes the "clean up" job at least a little more difficult - you have
> to come up with some way, whether through a workload manager or a Guacamole
> plugin, to tie the Guacamole session to the login on the remote host.

This is absolutely right, and that's why I am looking into a Slurm kind of
technique. And yes, the reason is that there is no Guacamole-dependent part
on the destination, and this is the exact difference between Guacamole and
FastX. However, the only advantage of FastX was load balancing, and it
still has problems with freeing up resources after the session is killed.


vnick wrote
> It is probably possible to do this with an extension, at least to some
> extent.  You could write an extension that had credentials to log in to the
> remote systems and run something on them, and either intercept the tunnel
> close event or decorate an existing extension and work with some of the
> existing methods there to perform an action on the remote system when the
> tunnel closes.

At the moment I can put a script on each instance to kill the user session
(force log off) after X hours of wall time in case the session is left
disconnected; users who need more time can request an exemption. Another
idea is integrating Guacamole with a booking system (which I did) to
allocate the node the user has booked and terminate their session at the
end of the booked time slot. However, it would be nice to be able to free
up resources automatically when an admin kills a session. I was thinking of
doing it via ssh tunnelling, executing a killer script on the destination
node (for that particular user) tied to the kill event in Guacamole.
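
The wall-time rule and the ssh-based cleanup could be sketched in Python as follows. This is only an illustration: both function names are hypothetical, the hook that would call them from Guacamole is not shown, and using `pkill -TERM -u <user>` as the "killer script" is an assumption. The command is built but never executed here.

```python
import re
from datetime import datetime, timedelta
from typing import List

# Conservative patterns so that nothing unexpected reaches the ssh command line.
_USERNAME = re.compile(r"[a-z_][a-z0-9_-]{0,31}")
_HOSTNAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?")


def wall_time_exceeded(disconnected_at: datetime, now: datetime, limit: timedelta) -> bool:
    """True once a session has been left disconnected for at least `limit`."""
    return now - disconnected_at >= limit


def build_cleanup_command(host: str, user: str) -> List[str]:
    """Build (but do not run) an ssh command that signals all of `user`'s processes."""
    if not _HOSTNAME.fullmatch(host):
        raise ValueError(f"unexpected host name: {host!r}")
    if not _USERNAME.fullmatch(user):
        raise ValueError(f"unexpected user name: {user!r}")
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, "pkill", "-TERM", "-u", user]


# Example: a session left disconnected at 08:00 with a 4-hour limit.
left_at = datetime(2019, 12, 4, 8, 0)
if wall_time_exceeded(left_at, datetime(2019, 12, 4, 13, 0), timedelta(hours=4)):
    command = build_cleanup_command("node07", "alice")
    # subprocess.run(command, check=True) would execute it on a real system.
```

Killing by username ends every login that user has on the node, which is the ambiguity raised earlier in the thread, so this only fits setups where each user has a single session per node.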

Regarding the critical points you mentioned: yes, they should be considered
as well in order to deliver a complete solution.





