1) Foreword *
2) What is PDQ? *
3) Settings that affect PDQ. *
How to: PDQ
My first introduction to PDQ came at about the same time as my first exposure to Informix. I had been hired by SAIC to work on an Informix project that had been without a DBA for a period of time. They had recently moved their Decision Support database over to a larger server to try to improve some very poor query times. The system was supposed to start migrating users in my first week. However, disaster struck immediately: the new system ran its stress testing queries even slower than the old system!
So, tempers were high, as everyone wondered: how could this new system, with double the processors and quadruple the memory be running slower than the old system?
It was my job to find out. It was my job to get the schedule back on track. Me, the Oracle guy, who had been living in the country for less than 48 hours and had first connected to an Informix 7.3 database earlier that day!
Fortunately, I had a large number of textbooks (and other resources). I systematically went through every single setting in the onconfig file, comparing it to the old system, and to the current hardware, to see what we could do. I found a whole bunch of settings I figured could be tweaked, but nothing major, until I came across a large block of memory allocated to something called PDQ that was otherwise not configured. Being an Oracle guy, I had never heard of it.
After some frantic reading, this PDQ thing seemed to be just the break I was looking for. I activated the PDQ, setting it for 10 concurrent queries, and set the PDQPRIORITY environment variable equal to 100 in the stress testers’ login script. That night, the tester reran the tests: average DSS query time dropped from over 3 hours to less than 20 minutes. Within a few days of fine-tuning, we were well within the technical specifications of 12 minutes average query time.
The client was amazed, and everyone wanted to know exactly
how this happened. This article arose from all the research, documentation
and testing involved with an extended period of performance tuning a large
data warehouse, and getting my certification. I certainly hope you enjoy
PDQ stands for Parallel Data Query. It is an Informix Engine capability designed to allow a DBA to designate and manage memory on systems with many large queries: specifically, Decision Support Systems (DSS systems). When using PDQ, the DBA designates a chunk of memory as being available only to PDQ queries, then designates how much each query is allowed to have. Finally, when a query requests use of the PDQ resources, it is granted a chunk of memory from PDQ instead of using the memory in the buffers pool.
Used properly, PDQ can help manage memory usage, maximize
multi-processing capabilities and help to reduce disk I/O. It can also
allow a DBA greater control over the amount of resources allocated to a
single query, helping to prevent a small number of large queries from disrupting
other database operations.
3.1) Onconfig Settings
Once the DBA has decided to use PDQ, the first step is set up the proper settings in the onconfig file.
The following onconfig settings are used to set how much memory is available to the Virtual Memory segment. Since memory allocated to PDQ must come from the Virtual Segment, this must be planned first.
SHMVIRTSIZE – the initial amount of memory allocated to the Virtual Memory Segment of Shared Memory. A good formula for setting memory is (SHMVIRTSIZE + BUFFERS) *1.1 + other system memory required = Total System Memory.
SHMADD – the size of shared virtual segments that Informix will attempt to add to the virtual portion of shared memory as required.
SHMTOTAL – the most amount of memory that Informix will attempt to add to the virtual portion of shared memory. If (SHMTOTAL + BUFFERS) *1.1 + other system memory required > Total System Memory, then you risk crashing the server if Informix ever requests more memory than is available.
The following settings are used to set the Memory Grant Manager, which manages PDQ activities:
PDQPRIORITY – a percentage value that determines the default PDQPRIORITY to be requested by any query. This setting no longer works after 7.1.
MAX_PDQPRIORITY – a percentage value representing the maximum amount of PDQPRIORITY that the MGM will grant to any query. Actual PDQPRIORITY of a query is (MAX_PDQPRIORITY*PDQPRIORITY). For example, a query requesting 50% of PDQ, with a
MAX_PDQPRIORITY set to 50%, would result in a query with a PDQPRIORITY of 25.
DS_TOTAL_MEMORY – the total amount of memory, in bytes, that is allocated to PDQ queries. Must be a percentage of SHMVIRTSIZE, usually set to 90% of SHMVIRTSIZE on a system optimized for DSS queries.
DS_MAX_QUERIES – The maximum number of PDQ queries that can be executed concurrently.
DS_MAX_SCANS – The maximum number of scan threads available for PDQ queries.
The following is a setting that has an indirect effect on PDQ:
RA_PAGES – RA_PAGES / 8 = the light scan buffer. This
setting also affects other aspects of an Informix Instance, so any changes
made to this setting should be small, then observe results, then make further
3.2) Setting PDQPRIORITY in a query
To set PDQPRIORITY for a specific transaction, include the following SQL line at the beginning of your SQL statement:
SET PDQPRIORITY [number between 0-100].
Any user can set PDQPRIORITY for their queries in this manner.
This setting will override all other PDQPRIORITY settings.
3.3) Setting a default PDQPRIORITY for a user
To set a default PDQPRIORITY for a user, set an environment variable PDQPRIORITY equal to a value between 0-100 in their .login script, or as a default variable for all users. In HP-UX, this would appear in their .login script as the following line:
setenv PDQPRIORITY 100
Note that this setting can be overridden by setting a PDQPRIORITY as part of a query.
3.4) Setting PDQPRIORITY in the informix.rc file
PDQPRIORITY can be set to a default value for all queries by creating a file called informix.rc in the $INFORMIXDIR/etc directory. Include in it the following line:
PDQPRIORITY [number between 0-100]
Note that this can be overridden by setting a PDQPRIORITY as part of a query.
4.1) Multiprocessors and Physical Design
If you have a multiprocessor system, it is best to fragment your tables across multiple devices (drives and controllers) in a number divisible by NUMCPUVPS. This way, you can use the multi-threading aspect of PDQ Queries to run multiple active threads, effectively getting true multi-tasking and fully utilizing the I/O from each device.
4.2) Full DSS Systems
When tuning a system that is running almost exclusively DSS type queries, it is best to allocate as much memory to PDQ as possible. In this case, maximize the amount of memory allocated in SHMVIRTSIZE, (make DS_TOTAL_MEMORY 90% of SHMVIRTSIZE) and at the same time minimize the amount of memory allocated to the resident segment of shared memory by keeping the BUFFERS variable low.
PDQ queries do not use memory from the BUFFERS, they only use memory from the Virtual Shared Memory segments. However, if you are loading data using the High Performance Loader (HPL), it does use the memory allocated to the BUFFERS. There are two ways of handling this:
When tuning a system that is running almost exclusively OLTP type queries, it is best to allocate as much memory to BUFFERS as possible, and avoid using the PDQ. Max BUFFERS and minimize SHMVIRTSIZE and DS_TOTAL_MEMORY.
There are times when you may want to allocate some memory to PDQ:
A mixed DSS and OLTP system will require significant planning and observation. Memory must be allocated to both BUFFERS and SHMVIRTSIZE. The DBA must address how much should each get:
For a graphical description of how the multithreading and I/O management take place, please refer to Appendix B.
5.1) Multi Threading
First, consider how a "normal" query works: One active thread at a time is used to scan, join, sort and return the result to user. One thread does not begin until the previous thread terminates.
The same large query, using PDQ. First, scans, joins and sorts can all be conducted in parallel, breaking each task up into subtasks. The following functions can be done in parallel using PDQ: Scans, Inserts, Joins, Sorts, Aggregates and Groups. This can give some advantage on a single processor server, kicking off a Join Thread while waiting on a Scan thread to complete. However, it is especially powerful on multiprocessor servers, especially when you also use Table Fragmentation – more on this later.
5.2) Multitasking and I/O
When using PDQ, the DBA can fragment tables to take advantage of the multithreading capabilities and significantly increase I/O to dbspaces. To do so, a DBA first sets NUMCPUVPS to a number equal to the number of CPUs to be used by Informix, lets pretend there are four CPUs. Then, fragment your large table across a number of dbspaces that are equal or a multiple of that number. Make certain that each of these dbspaces consist of one chunk, each chunk residing on a different disk, preferably on different controllers. See Appendix B for a complete picture.
Now, when a query is run against this table, instead of one AIO or KAIO thread accessing the table, you have FOUR threads accessing that table concurrently, multiplying your I/O available to access your data. Since I/O is a very common bottleneck on systems, this can result in a HUGE power boost to your system.
5.3) Managing Memory
When not using PDQ, the data retrieved during a transaction is placed in the BUFFER cache. Information stored in the BUFFERS must be kept as long as it is in use, then for a time afterwards based on the LRU settings. However, if a very large transaction fills the BUFFERS, and other transactions are still active, then the BUFFERS will have to be swapped out to disk. This is called a foreground write, and it is the most expensive disk operation of them all!
The same query using PDQ will use only the memory allocated to it by the MGM (Memory Grant Manager), thus will not impact other queries. This memory exists in the Virtual Portion of Shared Memory, and is known as the Big Buffers. The Big Buffers differ from normal Buffers in that they are used for PDQ and are have an 8K page size instead of 2k or 4k. Thus the Big Buffers can retrieve more information per operation. Queries using PDQ also use Light Scans and will not cause foreground writes.
So, you’ve heard all this great stuff about PDQ, set your PDQ setttings, and started running those pesky DSS queries again. Now, how do you make sure your PDQ settings are actually helping your users? Simple, you monitor the Memory Grant Manager!
The Memory Grant Manager (MGM) is the part of the Informix engine that grants PDQ memory to transactions. It only grants PDQ memory to queries that request it, and it never gives any more than the query asks for.
To access the MGM, use the command onstat –g mgm.
The following will describe the output of the onstat –g mgm command that
can be seen in Appendix A. The output is from an active system.
6.1) The PDQ Settings
The first section of the output shows the current PDQ
settings. These could be the same as they were set in the onconfig file,
or they could reflect changes made on the fly, using the onmode –D, -Q,
–S and –M commands to change the MAX_PDQPRIORITY, DS_MAX_QUERIES, DS_MAX_SCANS
and DS_TOTAL_MEMORY, respectively.
6.2) PDQ Resource Allocation
The next few sections display the current allocation of PDQ resources:
Queries: Displays the number of Active queries, the Number of Ready Queries (queries that have parsed, but are waiting to be allocated PDQ Resources) and the Maximum number of queries that can be run through the MGM at any given time.
Memory (in KB): Displays the Total amount of memory given to PDQ, the amount of Free memory (memory not given to queries) and the Quantum.
What is a Quantum? A quantum is equal to DS_TOTAL_MEMORY / DS_MAX_QUERIES. Memory is given to queries requesting PDQ in quantums. For example, here a quantum is 5% of DS_TOTAL_MEMORY. If a query requests PDQPRIORITY of 10, it will be allocated 2 quantums of memory. Any query requesting PDQPRIORITY will be given quantums up to, but not to exceed, the amount requested: for example, if a query were to run with PDQPRIORITY of 12, it would still only be given 2 quantums.
Scans: Displays the Total number of scan threads available to PDQ, the number of Free (unused) threads, and the scan Quantum (DS_MAX_SCANS / DS_MAX_SCANS, thus always 1).
6.3) Load Control
The section titled Load Control shows why queries that are in the Ready queue have been "gated". A gated query is one that cannot currently be run because a resource the query requires is not available. Queries can be gated for the following reasons:
Memory: there are insufficient free quantums of memory to meet the PDQ request of the query. If this is a common problem, you may need to make a decision: is it more important that queries run quickly once they are active, or that every query starts processing right away. If you need queries to start processing right away, increase DS_MAX_QUERIES to reduce quantum size, and reduce the amount of PDQPRIORITY queries ask for (through you developers or through MAX_PDQPRIORITY). More memory is always better too!
Scans: there are insufficient scan threads available to service the amount requested of the queries. If this is the case, increase DS_MAX_SCANS. The overhead cost of maintaining extra scan threads is negligible.
Priority: these queries have been gated because other queries with higher PDQ have taken over the resources that otherwise would be allocated to them. Unfortunately, the author has never seen this happen, so can’t comment too much on it.
Max Queries: the query has been gated because the number of active queries is equal to DS_MAX_QUERIES. If you are frequently seeing this, you should try raising DS_MAX_QUERIES. If you are running a system where all queries have the same PDQPRIORITY, then try offsetting your MAX_PDQPRIORITY and DS_MAX_QUERIES. For example, instead of using a textbook 10 and 10 setting, set MAX_PDQPRIORITY to 10 and DS_MAX_QUERIES to 20. Thus, each query requiring memory will get 2 quantums of memory instead of 1, but most importantly, if an 11th query wants to run, and one or more query currently active is not using any memory, then the 11th query can run!
So, how exactly does this work then? Well, some queries running through PDQ do not necessarily need any memory. For example, any straight table scan does not need any memory. By offsetting your MAX_PDQPRIORITY and your DS_MAX_QUERIES, you can prevent queries from being gated when there are sufficient resources available to run them.
Reinit: When you make a change to you PDQ settings on the fly (using the various onmode commands), it is called a reinit. A reinit cannot complete when there are queries active in the MGM. If a query is active and a request for reinit takes place, all further queries are gated until all active queries complete. Then the reinit takes place, and any queries gated due to the reinit start to run. Lesson: ALWAYS be certain that there is no activity in the PDQ before you try to change settings. There is no way to undo a request for reinit. A reinit run while a large query is currently active could kill all user activity for hours.
6.4) Active Queries
This section gives details on active queries, including:
Session: the session id. For further information on any query, use onstat –g ses (or sql) then the session id.
Query: the query identifier given to this query (good for when a single session runs multiple queries)
Priority: the final PDQPRIORITY of the query. This is equal to the requested PDQPRIORITY * MAX_PDQPRIORITY.
Thread: thread identifier
Memory: amount of memory used/allocated. Note how two active queries have no memory allocated, while several queries are using much less than is allocated. The amount used is a "high-water mark" of the most memory that was used at any time during the execution of the query. Furthermore, notice that memory allocated seems to bear little resemblance the amount of memory in each granted quantum. This is because a quantum is measured in KB, while the memory in the Active Queries is measured in Big Buffer Pages. Big Buffer pages, discussed previously, are 8 KB in size. Thus, each query has two quantums allocated, for 152992 KB total memory, divided by 8 for 19124 Big Buffer Pages.
Scans: amount of scan threads being used / allocated.
Gate: not used in Active queries.
6.5) Ready Queries
Ready queries line up just the same as active queries. However, their Gate column shows which gate (1-5) they are stuck at while waiting to enter the active queue.
6.6) Free Resources
This section gives a bit of a historical / statistical perspective to your PDQ queries since the last reboot / reinit of the MGM. The averages tend to be pretty useless (especially if your system isn’t used 24 hours a day), but the minimum # shows you the least amount of resources that have been available at any time.
This final section displays a historical / statistical view of queries that have been active or gated since the last reinit of the MGM. Once again, the averages tend to be pretty useless.
Maximum: the maximum number of queries that have been active or gated at any given time since the last reinit of the MGM. Notice here that the maximum # of queries is 11, despite the fact that all queries have been running with PDQPRIORITY of 10.
Total #: the total number of queries that have been active or gated at any given time since the last reinit of the MGM. If you see the total number of Ready Queries constantly increasing, you may need to do some tuning, or get some more memory, or set your MAX_PDQPRIORITY lower.
Remember, any user can influence their PDQ request simply by issuing a SET PDQPRIORITY command during their transaction. There is no way to override this, but if your users must have direct SQL access to the database, be sure to set a MAX_PDQPRIORITY value.
Another good reason to be careful who knows what is that
as soon as your users know that some people can be given higher priority
than others, Col. Hapablab will want higher priority for all his queries
than Prt. Smith.
7.2) Offset your PDQPRIORITY and your MAX_DS_QUERIES
I can’t stress this enough. Believe me, there is nothing more frustrating than watching a number of queries being gated just because the max queries limit has been hit, when there is plenty of memory available.
I recommend setting your PDQPRIORITY and MAX_DS_QUERIES
so that PDQPRIORITY * (MAX_DS_QUERIES/2) = 100.
7.3) Look out for scan thread saturation
There is a limit to how much multiprocessing the server
can do. That limit can quickly be reached when you have a limited amount
of CPUs and a large number of queries against fragmented tables. To monitor
how many threads are sitting around, ready, but waiting for a CPU, use
the onstat –g rea command. Try to keep this output as low as possible.
If it is constantly high, you may need to add CPUs (and CPU VPs), or refragment
7.4) Remove indexes
Have you added PDQ and experienced worse results? Are
you using an index? The very first line in the Informix Performance Tuning
Guide states "remove all indexes". So, run your queries with your indexes
intact. Drop your indexes. Update Statistics. Re-run your queries. You
may be very surprised.
7.5) Update Statistics
Properly updating statistics is the number one key to
performance on any system. This is also true when using PDQ: the optimizer
cannot make a good decision.
7.6) In your drive for performance, have a destination in mind
The more you observe and tweak your system, the better
and better the performance you can get out of it. However, keep a goal
in mind, and remember: there is a point when you get all the low hanging
fruit, and there is a diminishing return on your time and effort. Knowing
when your system is good enough and your client is happy is the key to
making performance-tuning fun.
Informix Dynamic Server Version 7.30.FC5 -- On-Line -- Up 50 days 14:26:53 -- 1807384 Kbytes Memory Grant Manager (MGM) -------------------------- MAX_PDQPRIORITY: 10 DS_MAX_QUERIES: 20 DS_MAX_SCANS: 90 DS_TOTAL_MEMORY: 1530000 KB Queries: Active Ready Maximum 7 0 20 Memory: Total Free Quantum (KB) 1530000 765040 76496 Scans: Total Free Quantum 90 68 1 Load Control: (Memory) (Scans) (Priority) (Max Queries) (Reinit) Gate 1 Gate 2 Gate 3 Gate 4 Gate 5 (Queue Length) 0 0 0 0 0 Active Queries: --------------- Session Query Priority Thread Memory Scans Gate 9657 20a4e0050 10 20b3be708 0/0 1/1 - 9660 20a674050 10 20a2260d8 14798/19124 0/1 - 9661 20dd42050 10 209f21198 19122/19124 0/6 - 9666 217420050 10 209f75740 16173/19124 0/1 - 9671 215ab3f20 10 209d42708 19124/19124 1/1 - 9673 222fc2050 10 209faf5c8 0/0 6/6 - 9674 223700050 10 20cdd9150 9573/19124 6/6 - Ready Queries: None Free Resource Average # Minimum # -------------- --------------- --------- Memory 144802.4 +- 0.0 10 Scans 83.4 +- 5.5 54 Queries Average # Maximum # Total # -------------- --------------- --------- ------- Active 3.3 +- 2.0 11 3039 Ready 0.0 +- 0.0 0 0 Resource/Lock Cycle Prevention count: 0
The following diagram demonstrates how PDQ can enable
multitasking and increase your I/O when running a query against a table.
I learned a lot about PDQ (and other things Informix!) from the following books. I highly recommend them all, but I especially recommend Carlton Doe’s book to DBAs who are new to Informix.
Carlton Doe, INFORMIX-OnLine Dynamic Server Handbook
INFORMIX-OnLine Dynamic Server Performance Guide
INFORMIX-OnLine Dynamic Server Administrator’s Guide
INFORMIX-OnLine Dynamic Server Performance Training Manual
INFORMIX-OnLine Dynamic Server Administration Training Manual
Special thanks also to Bob Carts for introducing me to Informix, and for all his help with the information and for doing his best to keep me motivated.