Internal Environment issue

This page collects resolutions that anyone can use to recover an OpenStack environment and keep the system up.

Because this environment is not monitored to production standards, problems such as high disk usage can be alerted in the Slack channels without being acted on in a timely manner.

Watchdog Internal Slack Channel
Prometheus Internal Slack Channel

Troubleshooting

Check disk space first. Typically start with the IDMS load balancer host, then work through Apps1, Apps2, DB1 and DB2.

ssh scotty@hostlb1.err
sudo su -
cd /
du -hs * | sort -h

Example output 
1.2G	run
3.2G	root
3.3G	home
4.0G	usr
5.0G	swapfile
14G	var

RabbitMQ Space resolution - Internal Env only

NOTE: This will wipe all data, so apply with care and only on the Internal environment.

RabbitMQ (RMQ) data is stored in /var/lib/rabbitmq, so the 14G shown for var in the example output above is where that space is held.

As this is an internal environment, we can free space by removing the persistent store:

/var/lib/rabbitmq/mnesia/HOSTHERE/msg_stores/vhosts/UUIDFOLDER/msg_store_persistent

Find the largest store folder and delete all the files in it.
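The find-and-delete step can be sketched as two small shell functions. Both names (largest_vhost_store, clear_persistent_store) are hypothetical, and the dry-run default is an added safety choice rather than something this page prescribes. The sketch also assumes the broker is stopped before any files are removed, which the steps above do not state.

```shell
# Hypothetical helper: print the largest vhost folder under a vhosts directory.
largest_vhost_store() {
    vhosts_dir="$1"
    # du -sk prints the size in KiB for each folder; keep the largest one
    du -sk "$vhosts_dir"/*/ 2>/dev/null | sort -n | tail -n 1 | cut -f2-
}

# Hypothetical helper: remove the files in msg_store_persistent for one vhost folder.
# Without --delete as the second argument it only lists what would be removed.
# Assumption: RabbitMQ is stopped first; the page itself does not say so.
clear_persistent_store() {
    store="${1%/}/msg_store_persistent"
    if [ ! -d "$store" ]; then
        echo "no persistent store at $store" >&2
        return 1
    fi
    if [ "${2:-}" = "--delete" ]; then
        find "$store" -type f -exec rm -f {} +
    else
        find "$store" -type f -print
    fi
}
```

A typical run, as root, would pass /var/lib/rabbitmq/mnesia/HOSTHERE/msg_stores/vhosts to largest_vhost_store (with HOSTHERE replaced by the real host folder), review the dry-run listing from clear_persistent_store, and only then repeat the call with --delete.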

CAS / You do not have permission to access this.

Once the usual user profile checks have passed (username, password, account active), the CAS log is a useful place to start: logs/grails/cas.log

Look for an entry like the following:

[org.jasig.cas.CentralAuthenticationServiceImpl] - ServiceManagement: Unauthorized Service Access. Service [http://qascoapps1.err:8081/ReportingManager/shiro-cas] is not found in service registry.
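Rather than reading the log by eye, the affected service URLs can be pulled out with grep and sed. This is a rough sketch that assumes the log line format matches the entry quoted above; unregistered_services is a hypothetical helper name.

```shell
# Hypothetical helper: list each service URL that CAS reported as missing
# from the service registry, one per line, without duplicates.
unregistered_services() {
    log_file="$1"
    grep 'Unauthorized Service Access' "$log_file" \
        | sed -n 's/.*Service \[\([^]]*\)] is not found in service registry.*/\1/p' \
        | sort -u
}
```

Run against the log from this section, for example unregistered_services logs/grails/cas.log, it would print http://qascoapps1.err:8081/ReportingManager/shiro-cas for the entry quoted above.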

Verify that the host in the service URL resolves, for example with a simple ping qascoapps1.err

If the host fails to resolve, CAS authentication cannot succeed, and the failure points to a DNS issue.
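The resolution check can also be scripted so that it works directly from the URL in the log entry. This is a sketch under assumptions: host_from_url and check_resolves are hypothetical names, and getent is assumed to be available on the host (it is on glibc-based Linux systems).

```shell
# Hypothetical helper: reduce a service URL to its bare host name.
host_from_url() {
    # strip the scheme, then the path, then the port
    printf '%s\n' "$1" \
        | sed -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' -e 's|/.*$||' -e 's|:[0-9]*$||'
}

# Hypothetical helper: report whether a host name resolves through the system resolver.
check_resolves() {
    if getent hosts "$1" >/dev/null; then
        echo "$1 resolves"
    else
        echo "$1 does not resolve; check DNS" >&2
        return 1
    fi
}
```

For the log entry above, host_from_url http://qascoapps1.err:8081/ReportingManager/shiro-cas prints qascoapps1.err, which can then be passed to check_resolves.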

resolution_area/openstack.1645203799.txt.gz · Last modified: 2022/02/18 17:03 by 10.91.120.28