User Tools

Site Tools


Writing /app/www/public/data/meta/toolsandtechnologies/universal_query_runner.meta failed
toolsandtechnologies:universal_query_runner

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
toolsandtechnologies:universal_query_runner [2021/03/09 13:25] wflahertytoolsandtechnologies:universal_query_runner [2021/06/25 10:09] (current) – external edit 127.0.0.1
Line 1: Line 1:
 +====== Universal Query Runner ======
 +
 +As the name suggests, this application allows you to run a query across all servers or to a filtered few. The query is run on all servers, that aren’t filtered out, at the same time and then the data is returned together.
 +
 +===== How to use it =====
 +
 +==== Basics ====
 +
 +{{start_screen.png|Start View}} In the description there is a large button with a link to the output. It will lead to a google sheets document where you will find the output. You will have to look at the name of the tabs to find your output. It will be named by your name followed by the name you give for a ‘Sheet Name’
 +
 +=== Running ===
 +
 +To run a query, it is recommended to have a query already tested and running on one of the servers. Use something like DBeaver or Datagrip to test it out and then when you are happy with it, clean the query up and run it here. Do not include comments in the query as the query gets put on just one line.
 +
 +You are expected to limit your own query.
 +
 +{{run_screen.png|Build View}} This is the build view.
 +
 +== Query ==
 +
 +Take the query to have written and paste it into the Query box. It should support all characters that would work in a standard query but avoid leaving comments.
 +
 +== Permission ==
 +
 +The permission field controls if a user can only read or if they can write. Writing hasn’t been tested considering it has massive damage potential. What the Permission section does is disallow anything like drop, update, delete, insert, etc from being in the query unless the permission says ''%%write%%''. For safety, this option is removed from the drop-down but can be readded if we ever really needed it.
 +
 +== Environment ==
 +
 +Select an environment for this query specifically. Helpful for if you want to run a different query or check on a specific environment. This option overrides all others and will be the only environment the query will be run on. It is exclusive not inclusive.
 +
 +== Customer and Type ==
 +
 +These two can be used together or just as one. For Customer, it filters out anything that isn’t for that customer. For ATC the servers would be NVQA, Captain America, QAATC and PRODATC.
 +
 +As for the Type. Options are Both/All, Prod or QA. QA will only be the QA envs and Prod will only be Production envs.
 +
 +== Sheet Name ==
 +
 +This is the name of the worksheet you want the query to output to in the sheets document. It is important to remember that your worksheet will be named with your name and the name you have given the sheet. This prevents you from overwriting others. You can still change the name after you generated it.
 +
 +If a sheet doesn’t exist at build time, it will make a new sheet.
 +
 +==== Warnings ====
 +
 +  * Sheets will with the same name are designed to be overwritten. If you out to Results with one query and then run another that outputs to Results, the second one will clear and then write to the Results tab.
 +  * Your sheet is likely going to be placed on the far right of all the tabs.
 +  * There are no hard limits programmed in for the query. You are expected to put a limit you would like into the query.
 +
 +==== Troubleshooting ====
 +
 +=== I have two sheets with the same name ===
 +
 +There is a good chance an extra space is included somewhere
 +
 +=== My sheet is missing ===
 +
 +It’s probably the furthest right tab.
 +
 +=== Multiline output is messed up ===
 +
 +It’s true. Work in progress.
 +
 +==== Managing the data ====
 +
 +As you might have noticed, a column is added to the output of each query with the name of the server that the result row has come from. This was added to make it easier to post process the data. With the column added, you can quickly make a Pivot Table, a chart or a filter for the data to have it represented neatly.
 +
 +===== How it works =====
 +
 +This project is composed of 3 sections all held together with Jenkins.
 +
 +In the repository, there is a Jenkinsfile, which Jenkins reads and runs the configuration from.
 +
 +{{processes.png| Processes View}}
 +
 +==== Section 1: Ansible ====
 +
 +Ansible runs first and checks the env-configuration repo for certain variables and fields that are needed for accessing the databases. The credentials requested to decrypts this information is in PWSafe and Jenkins has a credential id for each environment.
 +
 +The Ansible calls on a job that fills out a properties file template and then renames the file. Each environment get its own properties file for the second section to use.
 +
 +==== Section 2.1: Bash to Groovy ====
 +
 +This section is run in parallel for all envs at the same time, so that it is very fast.
 +
 +Using the properties file and the Jenkins variables, a bash script then goes over some boolean logic to decide if the groovy script can run. If Customer and Type are left as all and an Environment is selected then all will be ignored except that environment. If Environment is left at ‘all’, then Customer and Type can be used to filter which envs run the query.
 +
 +==== Section 2.2: Groovy ====
 +
 +If the bash script passes and the Groovy script runs it will receive a few arguments. This is so the output can be printed to a CSV file with a distinct name to be concatenated later.
 +
 +Firstly the groovy script will initialize a few variables before checking if the query is allowed. Any type of write, update or delete is not permitted with the Permission set as ‘read’. They can only be used if the permission is set to ‘write’. This is highly risky and not recommended.
 +
 +The code is fairly easy to understand as its all in small methods.
 +
 +==== Section 3: Python ====
 +
 +Firstly, a bash script will concatenate the CSV files that the Groovy script created. Secondly, the bash script will then continue to build the Python VEnv and use it.
 +
 +The python script then runs were it reads the sql_results and organizes the data. It then pushes it to a google sheet document. If the worksheet name doesn’t exist it will create it and if it does exist it will clear it and overwrite the data on it.
 +
 +===== Development =====
 +
 +Feel free to go nuts. Just remember to change the branches in the Jenkinsfile and in the job configuration for the jenkins job.
 +