Tuesday 7 April 2020

Re: RFC: Ubuntu HA resource-agents supportability

On Tue, Mar 31, 2020 at 1:09 AM Rafael David Tinoco
<rafaeldtinoco@ubuntu.com> wrote:
>
> Hello,
>
> As many of you know, I'm currently revamping the Ubuntu High Availability
> packages.
>
> For 20.04, the packages considered HA (or HA-related) are:
>
> - Core packages:
>
> - libqb
> - kronosnet
> - corosync
> - pacemaker
> - resource-agents
> - fence-agents
> - crmsh
> - cluster-glue
> - drbd-utils
> - dlm
> - gfs2-utils
>
> - "Deprecated" packages:
>
> - heartbeat
> - keepalived
> - ocfs2-tools
>
> - Not in "main" packages:
>
> - pcs (will likely replace crmsh in near future)
> - csync2
> - corosync-qdevice
> - fence-virt
> - sbd
> - booth
>
> - Related packages:
>
> - multipath-tools
> - open-iscsi
> - sg3-utils
> - targetcli-fb
> - tgt (we're trying to deprecate in favor of LIO)
> - lvm2
>
> For now, until Beta Freeze, we've been trying to catch up with the latest
> upstream releases. From now on, I'm going through the existing open bugs and
> addressing them with the latest upstream fixes plus any other fix needed
> (done for kronosnet, with an FFE opened, and for corosync, with fixes about
> to be merged).
>
> The next step is to document in the Server Guide all supported scenarios for
> HA-related packages. The intention is to describe the exact set of scenarios
> that we know are good for correct behavior of the clustering software AND
> the scenarios we cannot support.
>
> OBS: This includes whether or not an odd number of nodes/votes is needed,
> whether or not proper fencing mechanisms are required (and which fencing
> mechanisms to support) AND, finally, which *resource agents* to support.
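
On the odd number of nodes/votes point, a minimal sketch of the arithmetic, assuming plain majority voting with one vote per node and no tie-breaker (no two_node mode, no qdevice). The function names are hypothetical and only illustrate the reasoning; this is not how corosync implements it.

```python
def quorum(total_votes: int) -> int:
    """Votes needed for a strict majority of total_votes."""
    return total_votes // 2 + 1


def survives_even_split(total_votes: int) -> bool:
    """True if the larger side of the most even partition keeps quorum.

    Assumption: one vote per node and no external tie-breaker.
    """
    larger_side = total_votes - total_votes // 2
    return larger_side >= quorum(total_votes)


if __name__ == "__main__":
    for votes in (2, 3, 4, 5):
        print(votes, quorum(votes), survives_even_split(votes))
```

Under these assumptions a cluster with an even vote count can split into two halves where neither side holds a majority, which is why the odd/even question belongs in the documented scenarios.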
>
> I'll probably ask for other feedback soon but, for the moment, I'm asking
> for comments on the list of resource agents below. I tried to split them up
> and explain what the resources are used for and whether they are supported
> in Ubuntu or not (or whether the related managed service is in [main] or in
> [universe]).
>
> So please take some time to provide feedback on this list, including whether
> we should move resources from one category to another. *NOTE* that I'm not
> giving the "fence agents" list yet. That will be another list.
>
> I'm particularly interested in feedback from @jamespage and @ddstreet, as
> they probably have good intel about resource usage, BUT anyone is welcome to
> provide comments!
>
> Thank you very much in advance!

I added a few comments below, otherwise all the categories look
reasonable to me, thanks!

>
> #### RFC: Ubuntu HA resource-agents supportability
>
> #
> ## FULLY SUPPORTED (managed service is likely in main or is important enough)
> #
>
> # trivial agents
>
> Delay - test resource for introducing delay
MailTo - sends email to a sysadmin whenever a takeover occurs
ClusterMon - periodically runs crm_mon and writes its output to an HTML page
HealthCPU - measures CPU idling and updates #health-cpu attr
HealthIOWait - measures CPU iowait and updates #health-iowait attr
HealthSMART - checks SMART status of drives and updates #health-smart attr
>
> # services
>
> apache - apache web server instance
> dovecot - dovecot IMAP/POP3 server instance
> dhcpd - chrooted ISC dhcp server instance
> mysql - MySQL database instance
> mysql-proxy - MySQL proxy instance
> named - bind/named server instance
> nfsnotify - nfs sm-notify reboot notifications daemon
> nfsserver - nfs server resource
> nginx - Nginx web/proxy server instance
> postfix - postfix mail server instance
> rabbitmq-cluster - cloned rabbitmq cluster instance
> remote - pacemaker remote resource agent
> rsyncd - rsyncd instance
> rsyslog - rsyslogd instance
> slapd - stand-alone LDAP daemon instance
> Squid - squid proxy server instance
> vsftpd - vsftpd server instance
>
> # storage
>
> Raid1 - software RAID (MD) devices on shared storage
> iscsi - local iscsi initiator and its conns to targets
> iSCSILogicalUnit - iSCSI logical units
iSCSITarget - iSCSI target export agent (implementation: tgt, lio)
LVM - LVM volume as an HA resource
LVM-activate - LVM activation/deact work for a given VG (lvmlockd+LVM-activate OR clvm+LVM-activate)
> Filesystem - filesystem on a shared storage medium
> symlink - symbolic link
> ZFS - ZFS pools import/export
>
> # locking & reservations
>
> controld - distributed lock manager for clustered FSs
> clvm - clvmd daemon (cluster logical vol manager)

Was clvm dropped from lvm2?
https://launchpad.net/ubuntu/+source/lvm2/2.03.02-2ubuntu1
I haven't used clustered lvm myself; maybe it was just rolled into lvm2.

> lvmlockd - agent manages the lvmlockd daemon.
> mpathpersist - SCSI persistent reservations on mpath devs
> sg_persist - master/slave resource for SCSI3 reservations
>
> # networking
>
> Route - network routes
> iface-bridge - bridge network interfaces
> iface-vlan - vlan network interfaces
> IPaddr2 - virtual IPv4 and IPv6 addresses
> ipsec - ipsec tunnels for VIPs
> IPsrcaddr - preferred source address modification
> IPv6addr - IPv6 aliases
> conntrackd - conntrackd instance
> SendArp - send gratuitous ARP for IP address
> VIPArip - virtual IP address through RIP2
> ifspeed - monitor action runs -> updates CIB with if speed
>
> # virtualization
>
VirtualDomain - manages virtual domains through libvirt (virtual machine only)
>
> # containers
>
> docker - docker container resource agent

as cpaelzer said, docker itself shouldn't be in the fully supported list.

> lxc - allows LXC containers to be managed by the cluster

presumably, this includes lxc and lxd?

>
> #
> ## BEST EFFORT SUPPORT (managed service is likely in universe or is interesting)
> #
>
> # trivial agents
>
> anything - generic agent to manage virtually *anything*
Dummy - testing dummy resource agent (template for RA writers)
> AudibleAlarm - audible beeps at interval
> Stateful - example agent that supports two states
> WinPopup - sends a SMB notification msg (popup) to a host
>
> # services
>
> asterisk - asterisk PBX
CTDB - clustered samba (needs an underlying clustered filesystem)
> dnsupdate - ip take-over via dynamic dns updates
> exportfs - nfs exports (not the nfs server)

wouldn't this be fully supported?

> fio - fio instance
> galera - galera instance
> garbd - galera arbitrator instance
> jboss - JBoss application server instance
> jira - JIRA server instance
> kamailio - kamailio SIP proxy/registrar instance
> mariadb - MariaDB master/slave instance
> nagios - nagios instance
ovsmonitor - clone resource to monitor network bonds on diff nodes
> pgagent - pgagent instance
> pgsql - pgsql database instance

shouldn't this be in fully supported?

also Brett (I added to cc) brought up that resource-agents-paf might
be worth considering supporting:
https://launchpad.net/ubuntu/+source/resource-agents-paf

> pound - pound reverse proxy load-bal server instance
> proftpd - proftpd instance
> Pure-FTPd - pure-ftpd instance
redis - redis server instance (supports master/slave replicas)
> syslog-ng - syslog-ng instance
> tomcat - tomcat servlet environment instance
> varnish - varnish instance
>
> # storage
>
> AoEtarget - ata over ethernet
>
> # networking
>
> IPaddr - virtual IPv4 addresses
> ocf:pacemaker:ping - records in CIB number of nodes host can connect to
> portblock - temporarily block/unblock access to tcp/udp ports
>
> # openstack
>
> openstack-cinder-volume - attach cinder vol to an instance (os-info <->)
> openstack-floating-ip - move a floating IP from an instance to another

I would expect both these to be in the fully supported category?

>
> # registration (CIB)
>
> lxd-info - nr of lxd containers running in CIB
> machine-info - records various node attributes in CIB
> NodeUtilization - cpu, host mem, hypervisor mem etc... into CIB
> openstack-info - records attributes of a node into CIB
> SysInfo - records various node attributes into CIB
> SystemHealth - monitors health of system using IPMI
> attribute - sets node attr one way when started and vice-versa
>
> #
> ## COMMUNITY SUPPORT (bugs opened here will be forwarded to upstream directly)
> #
>
> # services
>
SphinxSearchDaemon - sphinx search daemon
> Xinetd - start/stop services managed by xinetd
> zabbixserver - zabbix server instance
>
> # storage
>
o2cb - oracle cluster filesystem userspace daemon (oracle)
> sfex - excl access to shared storage using SF-EX
>
> # virtualization
>
aliyun-vpc-move-ip - move ip within a vpc of the aliyun ecs (alibaba)
> awseip - manages aws elastic IP address (aws)
> awsvip - manages aws secondary private ip addresses (aws)
> aws-vpc-move-ip - move ip within a vpc of the aws ec2 (aws)
> aws-vpc-route53 - update route53 vpc record for aws ec2 (aws)
> azure-events - monitor for scheduled events for azure vm (azure)
azure-lb - answers azure load balancer health probe req (azure)
> gcp-vpc-move-ip - floating ip address within a GCP VPC (google)
> ManageVE - openVZ virtual environment (virtuozzo)
> minio - minio server instance
> podman - creates/launches podman containers
> rkt - creates/launches container based on supplied image
>
> #
> ## UNSUPPORTED (Ubuntu does not support it)
> #
>
> db2 - manages IBM DB2 LUW databases (IBM)
eDir88 - Novell eDirectory directory server instance (novell)
> ICP - ICP vortex clustered host drive (intel)
> ids - IBM informix dynamic server (IDS) (IBM)
> SAPDatabase - SAP database (of any type) instance agent (SAP)
> SAPInstance - SAP application server instances agent (SAP)
ServeRAID - enables/disables shared serveRAID merge groups (IBM)
ManageRAID - raid devices (/etc/conf.d/HB-ManageRAID)
oraasm - oracle asm agent, uses ohasd for asm disk grp (oracle)
> oracle - oracle database instance (oracle)
> oralsnr - oracle TNS listener (oracle)
> sybaseASE - sybase ASE failover instance (Sybase)
> vdo-vol - https://bugs.launchpad.net/ubuntu/+bug/1869825
> WAS - websphere application server instance (IBM)
> WAS6 - websphere application server instance (IBM)
> Xen - xen unprivileged domains

as cpaelzer mentioned, Xen should probably move up to the 'best
effort' section; this was just moved out of main in focal.

>
> #
> ## DEPRECATED (do not use)
> #
>
> Evmsd - clustered evms vol mgmt (evms is not maintained)
> EvmsSCC - clustered evms vol mgmt (evms is not maintained)
LinuxSCSI - enables/disables scsi devs through kernel scsi hotplug
scsi2reservation - SCSI-2 reservation agent (depends on "scsi_reserve")
> ocf:heartbeat:pingd - monitors connectivity to specific hosts
> ocf:pacemaker:pingd - replaced by pacemaker:ping (this is broken)
> vmware - control vmware server 2.0 virtual machines (2009)
>
>
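
To keep the inline suggestions above in one place, here is a rough sketch that applies them to the tiers from the quoted RFC. The tier strings, the "undecided" label, and the `apply_moves` helper are hypothetical names used only for illustration. The destination for docker is not stated in this thread, so it is left as None, and the exportfs and pgsql moves were raised as questions rather than decisions.

```python
from typing import Dict, Optional

# Tiers as proposed in the quoted RFC, limited to the agents commented on.
proposed: Dict[str, str] = {
    "docker": "fully supported",
    "exportfs": "best effort",
    "pgsql": "best effort",
    "openstack-cinder-volume": "best effort",
    "openstack-floating-ip": "best effort",
    "Xen": "unsupported",
}

# Suggested destination per agent; None means "move out, target not stated".
suggested: Dict[str, Optional[str]] = {
    "docker": None,
    "exportfs": "fully supported",  # raised as a question
    "pgsql": "fully supported",  # raised as a question
    "openstack-cinder-volume": "fully supported",
    "openstack-floating-ip": "fully supported",
    "Xen": "best effort",
}


def apply_moves(current: Dict[str, str],
                moves: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return a copy of current with each suggested move applied."""
    result = dict(current)
    for agent, target in moves.items():
        if agent not in result:
            raise KeyError(f"unknown agent: {agent}")
        # Hypothetical placeholder tier when no destination was given.
        result[agent] = target if target is not None else "undecided"
    return result


revised = apply_moves(proposed, suggested)

if __name__ == "__main__":
    for agent, tier in sorted(revised.items()):
        print(f"{agent}: {proposed[agent]} -> {tier}")
```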

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com